StringType. It closely follows the Pandas pandas.Series.str API.
Example:
UDFs
capitalize() udf
Return string with its first character capitalized and the rest lowercased.
Equivalent to str.capitalize().
Signature:
casefold() udf
Return a casefolded copy of string.
Equivalent to str.casefold().
Signature:
center() udf
Return a centered string of length width.
Equivalent to str.center().
Signature:
width(Int): Total width of the resulting string.
fillchar(String): Character used for padding.
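Since center() is documented as equivalent to str.center(), the padding behaviour can be illustrated with plain Python. This is only a sketch of the underlying string method, not a call to the UDF itself; the variable names are illustrative.

```python
# Plain-Python illustration of the str.center() behaviour that center() is documented to mirror.
centered = "abc".center(7, "*")   # pad to a total width of 7 using "*"
unchanged = "abcdefgh".center(3)  # width smaller than the string: returned as-is

print(centered)   # **abc**
print(unchanged)  # abcdefgh
```

When width does not exceed the length of the string, str.center() returns the original string rather than truncating it.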
contains() udf
Test if string contains a substring.
Signature:
substr(String): string literal or regular expression
case(Bool): if False, ignore case
contains_re() udf
Test if string contains a regular expression pattern.
Signature:
pattern(String): regular expression pattern
flags(Int): flags for the re module
count() udf
Count occurrences of pattern or regex.
Signature:
pattern(String): string literal or regular expression
flags(Int): flags for the re module
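One way to picture counting with a pattern and flags is a small helper built on the standard re module. The helper name count_matches is hypothetical, and the assumption that matches are counted without overlap is taken from re.findall(), not from the UDF's implementation.

```python
import re

def count_matches(s: str, pattern: str, flags: int = 0) -> int:
    """Hypothetical helper: count non-overlapping regex matches of pattern in s."""
    return len(re.findall(pattern, s, flags))

plain = count_matches("banana", "an")                       # 2
ignore_case = count_matches("Banana BANANA", "b", re.IGNORECASE)  # 2
# "." is a regex metacharacter, so escape it to count literal dots.
any_char = count_matches("a.b.c", ".")                      # 5
literal_dot = count_matches("a.b.c", re.escape("."))        # 2
```

If the pattern is meant as a string literal, escaping it with re.escape() avoids surprises from metacharacters.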
endswith() udf
Return True if the string ends with the specified suffix, otherwise return False.
Equivalent to str.endswith().
Signature:
substr(String): string literal
fill() udf
Wraps the single paragraph in string, and returns a single string containing the wrapped paragraph.
Equivalent to textwrap.fill().
Signature:
width(Int): Maximum line width.
kwargs(Any): Additional keyword arguments to pass to textwrap.fill().
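The wrapping behaviour can be seen by calling textwrap.fill() directly, which fill() is documented to mirror. The sample text and width are arbitrary choices for illustration.

```python
import textwrap

text = "The quick brown fox jumps over the lazy dog"

# fill() returns ONE string; wrapped lines are separated by newline characters.
filled = textwrap.fill(text, width=15)

print(filled)
# The quick brown
# fox jumps over
# the lazy dog
```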
find() udf
Return the lowest index in string where substr is found within the slice s[start:end].
Equivalent to str.find().
Signature:
substr(String): substring to search for
start(Int): slice start
end(Optional[Int]): slice end
findall() udf
Find all occurrences of a regular expression pattern in string.
Equivalent to re.findall().
Signature:
pattern(String): regular expression pattern
flags(Int): flags for the re module
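As a sketch of the documented equivalence, the following calls re.findall() directly. The inputs are made up for illustration, and the flags argument is passed the same way the flags parameter above is described.

```python
import re

# All runs of digits, in order of appearance.
numbers = re.findall(r"\d+", "order 12, item 345, qty 6")

# With re.IGNORECASE, [a-z] also matches uppercase letters.
words = re.findall(r"[a-z]+", "Foo BAR baz", re.IGNORECASE)

print(numbers)  # ['12', '345', '6']
print(words)    # ['Foo', 'BAR', 'baz']
```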
format() udf
Perform string formatting.
Equivalent to str.format().
Signature:
fullmatch() udf
Determine if string fully matches a regular expression.
Equivalent to re.fullmatch().
Signature:
pattern(String): regular expression pattern
case(Bool): if False, ignore case
flags(Int): flags for the re module
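The difference between a full match and a match anchored only at the start is easy to miss. The sketch below uses re.fullmatch() and re.match() from the standard library; treating case=False as re.IGNORECASE is an assumption made here for illustration, not something the parameter list states.

```python
import re

pattern = r"\d{3}"

full_exact = re.fullmatch(pattern, "123") is not None    # whole string is three digits
full_longer = re.fullmatch(pattern, "1234") is not None  # trailing "4" prevents a full match
prefix_only = re.match(pattern, "1234") is not None      # match() only anchors at the start

# Assumption: ignoring case corresponds to passing re.IGNORECASE.
ignore_case = re.fullmatch(r"abc", "ABC", re.IGNORECASE) is not None

print(full_exact, full_longer, prefix_only, ignore_case)  # True False True True
```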
index() udf
Return the lowest index in string where substr is found within the slice [start:end].
Raises ValueError if substr is not found.
Equivalent to str.index().
Signature:
substr(String): substring to search for
start(Int): slice start
end(Optional[Int]): slice end
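The practical difference between find() and index() is how a missing substring is reported. The following plain-Python sketch uses str.find() and str.index(), which the two UDFs are documented to mirror; the sample string is arbitrary.

```python
s = "hello world"

first_o = s.find("o")       # lowest index of "o"
later_o = s.find("o", 5)    # search restricted to s[5:]
missing = s.find("z")       # find() signals "not found" with -1

try:
    s.index("z")            # index() raises instead of returning -1
    raised = False
except ValueError:
    raised = True

print(first_o, later_o, missing, raised)  # 4 7 -1 True
```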
isalnum() udf
Return True if all characters in the string are alphanumeric and there is at least one character, False
otherwise.
Equivalent to [str.isalnum()](https://docs.python.org/3/library/stdtypes.html#str.isalnum).
Signature:
isalpha() udf
Return True if all characters in the string are alphabetic and there is at least one character, False otherwise.
Equivalent to str.isalpha().
Signature:
isascii() udf
Return True if the string is empty or all characters in the string are ASCII, False otherwise.
Equivalent to str.isascii().
Signature:
isdecimal() udf
Return True if all characters in the string are decimal characters and there is at least one character, False
otherwise.
Equivalent to str.isdecimal().
Signature:
isdigit() udf
Return True if all characters in the string are digits and there is at least one character, False otherwise.
Equivalent to str.isdigit().
Signature:
isidentifier() udf
Return True if the string is a valid identifier according to the language definition, False otherwise.
Equivalent to str.isidentifier().
Signature:
islower() udf
Return True if all cased characters in the string are lowercase and there is at least one cased character,
False otherwise.
Equivalent to str.islower().
Signature:
isnumeric() udf
Return True if all characters in the string are numeric characters and there is at least one character, False otherwise.
Equivalent to str.isnumeric().
Signature:
isspace() udf
Return True if there are only whitespace characters in the string and there is at least one character,
False otherwise.
Equivalent to str.isspace().
Signature:
istitle() udf
Return True if the string is a titlecased string and there is at least one character, False otherwise.
Equivalent to str.istitle().
Signature:
isupper() udf
Return True if all cased characters in the string are uppercase and there is at least one cased character,
False otherwise.
Equivalent to str.isupper().
Signature:
join() udf
Return a string which is the concatenation of the strings in elements.
Equivalent to str.join().
Signature:
len() udf
Return the number of characters in the string.
Equivalent to len(str).
Signature:
ljust() udf
Return the string left-justified in a string of length width.
Equivalent to str.ljust().
Signature:
width(Int): Minimum width of resulting string; additional characters will be filled with character defined in fillchar.
fillchar(String): Additional character for filling.
lower() udf
Return a copy of the string with all the cased characters converted to lowercase.
Equivalent to str.lower().
Signature:
lstrip() udf
Return a copy of the string with leading characters removed. The chars argument is a string specifying the set of
characters to be removed. If omitted or None, whitespace characters are removed.
Equivalent to str.lstrip().
Signature:
chars(Optional[String]): The set of characters to be removed.
match() udf
Determine if string starts with a match of a regular expression.
Signature:
pattern(String): regular expression pattern
case(Bool): if False, ignore case
flags(Int): flags for the re module
normalize() udf
Return the Unicode normal form.
Equivalent to unicodedata.normalize().
Signature:
form(String): Unicode normal form ('NFC', 'NFKC', 'NFD', 'NFKD')
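The effect of the different normal forms is easiest to see on concrete code points. This sketch calls unicodedata.normalize() directly, which normalize() is documented to mirror; the sample characters are chosen only for illustration.

```python
import unicodedata

decomposed = "e\u0301"  # "e" followed by a combining acute accent (two code points)

nfc = unicodedata.normalize("NFC", decomposed)   # composes into the single code point U+00E9
nfd = unicodedata.normalize("NFD", "\u00e9")     # decomposes back into two code points
nfkc = unicodedata.normalize("NFKC", "\ufb01")   # compatibility form: the "fi" ligature becomes "fi"

print(len(decomposed), len(nfc), len(nfd), nfkc)  # 2 1 2 fi
```

Normalizing to a single form before comparing strings avoids mismatches between visually identical text.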
pad() udf
Pad string up to width.
Signature:
width(Int): Minimum width of resulting string; additional characters will be filled with character defined in fillchar.
side(String): Side from which to fill resulting string ('left', 'right', 'both')
fillchar(String): Additional character for filling
partition() udf
Splits string at the first occurrence of sep, and returns 3 elements containing the part before the
separator, the separator itself, and the part after the separator. If the separator is not found, return 3 elements containing string itself, followed by two empty strings.
Signature:
removeprefix() udf
Remove prefix. If the prefix is not present, returns string.
Signature:
removesuffix() udf
Remove suffix. If the suffix is not present, returns string.
Signature:
repeat() udf
Repeat string n times.
Signature:
replace() udf
Replace occurrences of substr with repl.
Equivalent to str.replace().
Signature:
substr(String): string literal
repl(String): replacement string
n(Optional[Int]): number of replacements to make (if None, replace all occurrences)
replace_re() udf
Replace occurrences of a regular expression pattern with repl.
Equivalent to re.sub().
Signature:
pattern(String): regular expression pattern
repl(String): replacement string
n(Optional[Int]): number of replacements to make (if None, replace all occurrences)
flags(Int): flags for the re module
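The parameters above map naturally onto re.sub(). The helper below is a hypothetical sketch of that mapping, not the UDF's implementation; in particular, translating n=None to count=0 is an assumption based on how re.sub() expresses "replace all".

```python
import re
from typing import Optional

def replace_re_sketch(s: str, pattern: str, repl: str,
                      n: Optional[int] = None, flags: int = 0) -> str:
    """Hypothetical helper mapping the documented parameters onto re.sub()."""
    # re.sub() treats count=0 as "replace every occurrence".
    count = 0 if n is None else n
    return re.sub(pattern, repl, s, count=count, flags=flags)

replace_all = replace_re_sketch("a1b22c333", r"\d+", "#")
replace_first = replace_re_sketch("a1b22c333", r"\d+", "#", n=1)

print(replace_all)    # a#b#c#
print(replace_first)  # a#b22c333
```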
reverse() udf
Return a reversed copy of the string.
Equivalent to str[::-1].
Signature:
rfind() udf
Return the highest index where substr is found, such that substr is contained within [start:end].
Equivalent to str.rfind().
Signature:
substr(String): substring to search for
start(Optional[Int]): slice start
end(Optional[Int]): slice end
rindex() udf
Return the highest index where substr is found, such that substr is contained within [start:end].
Raises ValueError if substr is not found.
Equivalent to str.rindex().
Signature:
rjust() udf
Return the string right-justified in a string of length width.
Equivalent to str.rjust().
Signature:
width(Int): Minimum width of resulting string.
fillchar(String): Additional character for filling.
rpartition() udf
This method splits string at the last occurrence of sep, and returns a list containing the part before the
separator, the separator itself, and the part after the separator.
Signature:
rstrip() udf
Return a copy of string with trailing characters removed.
Equivalent to str.rstrip().
Signature:
chars(Optional[String]): The set of characters to be removed. If omitted or None, whitespace characters are removed.
slice() udf
Return a slice.
Signature:
start(Optional[Int]): slice start
stop(Optional[Int]): slice end
step(Optional[Int]): slice step
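The start, stop and step parameters correspond to the three parts of a Python slice. The sketch below assumes the UDF follows ordinary Python slicing semantics, with None meaning "use the default"; this is an assumption, since no equivalence is stated above.

```python
s = "abcdefgh"

# slice(start, stop, step) is the object form of s[start:stop:step].
every_other = s[slice(1, 6, 2)]        # indices 1, 3, 5
head = s[slice(None, 3, None)]         # same as s[:3]
backwards = s[slice(None, None, -1)]   # same as s[::-1]

print(every_other, head, backwards)  # bdf abc hgfedcba
```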
slice_replace() udf
Replace a positional slice with another value.
Signature:
start(Optional[Int]): slice start
stop(Optional[Int]): slice end
repl(Optional[String]): replacement value
startswith() udf
Return True if string starts with substr, otherwise return False.
Equivalent to str.startswith().
Signature:
substr(String): string literal
strip() udf
Return a copy of string with leading and trailing characters removed.
Equivalent to str.strip().
Signature:
chars(Optional[String]): The set of characters to be removed. If omitted or None, whitespace characters are removed.
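A common source of confusion is that chars is a set of characters, not a prefix or suffix. The plain-Python sketch below uses str.strip(), str.lstrip() and str.rstrip(), which the corresponding UDFs are documented to mirror.

```python
# With no argument, whitespace is removed from both ends.
trimmed = "  hi  ".strip()

# chars is treated as a SET: any leading/trailing "x" or "y" is removed, in any order.
both_ends = "xxyhixyx".strip("xy")

# The same rule applies to the one-sided variants.
left_only = "www.example.com".lstrip("w.")
right_only = "file.txt".rstrip(".tx")

print(trimmed, both_ends, left_only, right_only)  # hi hi example.com file
```

To remove an exact prefix or suffix rather than a set of characters, removeprefix() and removesuffix() are the better fit.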
swapcase() udf
Return a copy of string with uppercase characters converted to lowercase and vice versa.
Equivalent to str.swapcase().
Signature:
title() udf
Return a titlecased version of string, i.e. words start with uppercase characters, all remaining cased characters
are lowercase.
Equivalent to str.title().
Signature:
upper() udf
Return a copy of string converted to uppercase.
Equivalent to str.upper().
Signature:
wrap() udf
Wraps the single paragraph in string so every line is at most width characters long.
Returns a list of output lines, without final newlines.
Equivalent to textwrap.wrap().
Signature:
width(Int): Maximum line width.
kwargs(Any): Additional keyword arguments to pass to textwrap.wrap().
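In the Python standard library, returning a list of lines is the behaviour of textwrap.wrap(), while textwrap.fill() joins those lines into one string. The sketch below shows the relationship between the two; the sample text and width are arbitrary.

```python
import textwrap

text = "The quick brown fox jumps over the lazy dog"

# wrap() returns a list of lines, none longer than width and none ending in a newline.
lines = textwrap.wrap(text, width=15)

# fill() produces the same lines joined by newline characters.
same_as_fill = "\n".join(lines) == textwrap.fill(text, width=15)

print(lines)         # ['The quick brown', 'fox jumps over', 'the lazy dog']
print(same_as_fill)  # True
```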