String functions and operators
String operators
Operator | Expression & Description | Example |
---|---|---|
| | expression1 | expression2 [ | expression ] ... Concatenates two or more expressions. | 'Abcde' | 1 | 23 → Abcde123 |
^@ | string ^@ substring Returns true ( t ) if string starts with substring. This operator is equivalent to the starts_with () function. | 'abcdef' ^@ 'abc' → t |
String functions
ascii
This function returns the Unicode code point of the first character of the input_string. If the string is empty, it returns NULL
.
bit_length
This function returns the number of bits in the input string, which is 8 times the octet_length
.
btrim
This function is equal to trim (BOTH)
. It removes the specified characters from both the beginning and end of the input string.
char_length
, character_length
, length
These functions return the number of characters in the input string.
chr
This function returns the character with the Unicode code point equivalent to the input integer value provided.
concat
This function concatenates the arguments. NULL arguments are ignored.
concat_ws
This function concatenates the arguments with a separator. The first argument is used as the separator and should not be NULL. Other NULL arguments are ignored.
convert_from
This function converts a string to the database encoding. The original encoding is specified by src_encoding name
. The string must be valid in this encoding.
For this function, only encoding UTF8 is supported. RisingWave uses UTF8 encoding to store text, so this function primarily serves as a type conversion operation.
convert_to
This function converts a string to the encoding specified by dest_encoding name
and returns a byte array.
For this function, only encoding UTF8 is supported. RisingWave uses UTF8 encoding to store text, so this function primarily serves as a type conversion operation.
decode
This function decodes the text data in the input string into binary data. Supported formats for the encoded input string include base64
, hex
, and escape
.
encode
This function encodes the binary data in bytea into its textual representation. Supported encoding formats include base64
, hex
, and escape
.
format
This function produces output formatted according to a format string, in a style similar to the C function sprintf
.
The format_string specifies how the output_string should be formatted. It consists of text and format specifiers. Text is copied directly to the output string. Format specifiers are placeholders for the arguments to be inserted into the output string. The number of format specifiers should be equal to or less than the number of arguments.
The syntax of the format specifier:
type is the type of format conversion to use to generate the output of the format specifier.
The allowed values for type are:
s
: Formats the argument value as a string. NULL is treated as an empty string.I
: Treats the argument value as an SQL identifier.
Please note that format_string and format_arg can be variables.
For example, the following query works fine in RisingWave.
More examples
initcap
This function capitalizes the first letter of each word in the input string and converts the remaining characters to lowercase.
lower
This function converts the string to all lowercase.
left
This function returns the first input integer characters in the input string. If the input integer is negative, the last input_integer characters are removed from the output string.
lpad
This function pads the input string on the left with spaces until it reaches the specified input integer length. If the input string is longer than the input integer length, it is truncated to the specified length. Providing the optional padding string replaces the spaces with the padding string.
ltrim
This function is equal to trim (LEADING)
. It removes the specified characters from the beginning of the input string.
octet_length
This function returns the number of bytes in the string.
overlay
This function replaces a substring in the input string with a substring, starting at a specified position and with an optional length. If the length is omitted, its value is the length of the substring.
position
This function returns the starting index of the specified substring within the input string, or zero if it is not present.
quote_literal(string text)
Returns the given string properly quoted, so that a string can be safely used as a string literal in an SQL statement. This involves doubling any embedded single-quotes and backslashes. Note that if the input string is null, the function quote_literal
returns null. In such cases, the function quote_nullable is often a better choice. Note that the quotes are part of the output string.
quote_literal(value anyelement)
Converts the given value to text and then quotes it as a literal to be safely used in an SQL statement, similar to quote_literal(string text). This involves doubling any embedded single-quotes and backslashes to ensure their proper representation.
quote_nullable(string text)
Returns the given string properly quoted, so that a string can be safely used as a string literal in an SQL statement. Returns NULL if the input string is null. This involves doubling any embedded single-quotes and backslashes. When the argument is null, this function is usually more suitable than the quote_literal
function.
regexp_count
Returns the number of times a POSIX regular expressions pattern appears in input_string. Back reference, positive, negative lookahead, and positive, negative lookbehind are supported. Optional flags include i
, which stands for case-insensitive matching, and c
, which represents case-sensitive matching.
regexp_match
Returns a string array of captured substring(s) resulting from the first match of a POSIX regular expression pattern to a string. If there is no match, the result is NULL. Back reference, positive, negative lookahead, and positive, negative lookbehind are supported. Optional flags include i
, which stands for case-insensitive matching, and c
, which represents case-sensitive matching.
regexp_matches
Returns a set of string arrays of captured substring(s) resulting from matching a POSIX regular expression pattern to a string. Returns all matches by default. Back reference, positive, negative lookahead, and positive, negative lookbehind are supported. Optional flags include i
, which stands for case-insensitive matching, and c
, which represents case-sensitive matching.
regexp_replace
Replaces the substring that is either the first match or, optionally, the N’th match to the POSIX regular expression pattern in the input_string, starting from the character index specified by the optional start_integer.
Back reference, positive, negative lookahead, and positive, negative lookbehind are supported.
Optional flags can modify the matching behavior:
- The
g
flag indicates that all occurrences of the pattern in the input string should be replaced. If not used, only the first occurrence is replaced. - The
i
flag enables case-insensitive matching. - The
c
flag enables case-sensitive matching.
Note: If start_integer is used, flags is not permitted unless N_integer is also specified.
regexp_split_to_array
This function splits a string into an array of substrings based on a regular expression pattern.
repeat
Repeats input_string specific times. Null is returned when times_int is zero, negative, or null.
replace
Replaces all occurrences of substring from_string in input_string with substring to_string.
reverse
Returns the input_string with its characters in the reverse order.
right
Returns the last input_integer characters in the input_string. If input_integer is negative, the first |input_integer| characters are removed from output_string.
rpad
Pads the input string on the right with spaces until it reaches the specified length. If the string is longer than the specified length, it is truncated to the specified length. Providing the optional padding string replaces the spaces with the padding string.
rtrim
Equals to trim (TRAILING)
.
split_part
Splits the input string at occurrences of the delimiter string and returns the n’th field (counting from one), or when n is negative, returns the |n|‘th-from-last field. When n is zero, returns an ‘InvalidParameterValue’ error. When the input delimiter string is an empty string, returns the input string if querying the first or last field. Otherwise, returns an empty string.
starts_with
Returns true if the input string starts with the specified prefix string, otherwise returns false.
substr
/substring
Extracts the substring from input_string starting at position start_int and extending for count_int characters.
to_ascii
Returns the input string with non-ASCII characters replaced by their closest ASCII equivalents.
to_hex
Converts input_int or input_bigint to its hexadecimal representation as a string.
translate
Replaces each character in the input_string that matches a character in the from_string with the corresponding character in the to_string.
trim
Trims the longest contiguous substring of characters from the beginning, end, or both ends (BOTH by default) of input_string that contains only the characters specified in characters (which defaults to whitespace if not specified).
There are two syntax variants.
upper
Converts the string to all uppercase.
LIKE
pattern matching expressions
The LIKE
expression returns true if the string matches the supplied pattern. The NOT LIKE
expression returns false if LIKE
returns true. By using ILIKE
instead of LIKE
, the matching becomes case-insensitive.
Alternatively, you can use the operators ~~
and ~~*
as equivalents to LIKE
and ILIKE
, respectively. Similarly, the operators !~~
and !~~*
equal to NOT LIKE
and NOT ILIKE
.
Wildcards
- An underscore
_
in a pattern matches any single character. - A percent sign
%
matches any sequence of zero or more characters.
If the pattern does not contain _
or %
, then the pattern only represents the string itself. For example, the pattern ‘apple’ matches only the string ‘apple’. In that case, LIKE
acts like the equals operator =
.
Escape
To match a literal underscore or percent sign without matching other characters, the respective character in pattern must be preceded by the escape character \
. To match the escape character itself, write two escape characters: \\
.
You can use ESCAPE ''
to disable the escape mechanism, but specifying a custom escape character using the ESCAPE
clause is not supported.
Examples
SIMILAR TO
pattern matching expressions
The SIMILAR TO
expression returns true if the string matches the supplied pattern. The NOT SIMILAR TO
expression returns false if SIMILAR TO
returns true. The matching is case-sensitive.
Metacharacter
Operator | Description |
---|---|
% | Matches any sequence of zero or more characters. |
_ | Matches any single character. |
| | Denotes alternation (either of two alternatives). |
* | Repeats the previous item zero or more times. |
+ | Repeats the previous item one or more times. |
? | Repeats the previous item zero or one time. |
Repeats the previous item exactly m times. | |
{m,} | Repeats the previous item m or more times. |
{m,n} | Repeats the previous item at least m and not more than n times. |
() | Parentheses group items into a single logical item. |
[…] | A bracket expression specifies a character class. |
Escape
To match a metacharacter literally, use the escape character \
before the respective character in the pattern. To match the escape character itself, write two escape characters: \\
.
You can use ESCAPE ''
to disable the escape mechanism. The ESCAPE
clause supports specifying a custom escape character, which must be either empty or a single character.