Description

The Regex class implements methods for compiling and executing regular expressions.

The base class of Regex is Object.

Reference

Compile Flags
Flags used to change regex compilation options.
Execute Flags
Flags used to change regex execution options.


Constructor

Regex.new(String pattern, Number flags)

Constructs a new Regex object from the regular expression pattern, compiled subject to the compile flags given in flags.


Methods

Array find(String subject, Number offset, Number flags)
Searches for the first match of the regexp pattern in subject, starting from offset, subject to flags. Returns the start point and end point of the match together with its captures.
Array match(String subject, Number offset, Number flags)
Searches for the first match of the regexp pattern in subject, starting from offset, subject to flags. Returns only the captures of the match.
Iterator matches(String subject, Number flags)
Returns an iterator for repeated matching of the pattern in the string subject, subject to flags.
Array replace(String subject, Object replace, Number control, Number flags)
Searches for all matches of the pattern in the string subject and replaces them according to the parameters replace and control.
Iterator split(String subject, Number flags)
Returns an iterator that splits the string subject into sections at the positions where the pattern matches.
String to_string()
Returns the regular expression string.


find

Array find(String subject,
           Number offset,
           Number flags)

Searches for the first match of the regexp pattern in subject, starting from offset, subject to flags.

On success, returns an array containing the following elements.

  • The start point of the match (Number).
  • The end point of the match (Number).
  • An array of substring matches ("captures"), in the order they appear in the pattern. false is returned for sub-patterns that did not participate in the match.

Each captured match is represented as an array containing the following values.

  • The captured substring.
  • Offset from start of string to start of substring.
  • The length of the captured substring.

Returns nil on failure.

The following example demonstrates usage of this method.

regex = System.Text.Regex.new("(code)(\\*)", System.Text.Regex.ICASE)
res = regex.find("*CODE*CODE*CODE*")
print(res)

Output:

{1,6,{{CODE,1,4},{*,5,1}}}
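
To make the shape of this result concrete, the following Python sketch models it with the standard re module. It is an illustration only, not the Regex class itself: find_like and capture_entry are hypothetical helpers, and the zero-based offsets with an exclusive end point are inferred from the output above rather than stated by the reference.

```python
import re

def capture_entry(m, i):
    """Return [substring, offset, length] for group i, or False if it did not participate."""
    if m.group(i) is None:
        return False
    return [m.group(i), m.start(i), m.end(i) - m.start(i)]

def find_like(pattern, subject, offset=0, flags=0):
    """Hypothetical model of find: [start, end, captures] on success, None on failure."""
    m = re.compile(pattern, flags).search(subject, offset)
    if m is None:
        return None
    captures = [capture_entry(m, i) for i in range(1, m.re.groups + 1)]
    return [m.start(), m.end(), captures]

result = find_like(r"(code)(\*)", "*CODE*CODE*CODE*", flags=re.IGNORECASE)
print(result)
```

The printed list has the same three elements as the output above: start point, end point, and one entry per capture.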


match

Array match(String subject,
            Number offset,
            Number flags)

Searches for the first match of the regexp pattern in subject, starting from offset, subject to flags.

On success, returns an array of all substring matches ("captures"), in the order they appear in the pattern. false is returned for sub-patterns that did not participate in the match. If the pattern specified no captures, then the whole matched substring is returned.

Each captured match is represented as an array containing the following values.

  • The captured substring.
  • Offset from start of string to start of substring.
  • The length of the captured substring.

Returns nil on failure.

The following example demonstrates usage of this method.

regex = System.Text.Regex.new("(code)(\\*)", System.Text.Regex.ICASE)
res = regex.match("*CODE*CODE*CODE*")
print(res)

Output:

{{CODE,1,4},{*,5,1}}


matches

Iterator matches(String subject,
                 Number flags)

This function is intended for use with the iterator construct. It returns an iterator for repeated matching of the pattern in the string subject, subject to execution flags.

On every iteration (that is, on every match), the iterator returns all captures in the order they appear in the pattern (or the entire match if the pattern specified no captures). The iteration continues until the subject fails to match.

Each captured match is represented as an array containing the following values.

  • The captured substring.
  • Offset from start of string to start of substring.
  • The length of the captured substring.

Returns nil on failure.

The following example demonstrates usage of this method.

regex = System.Text.Regex.new("\\b(\\w+)\\b")
iter = regex.matches("This is one sentence.")

while (iter.has_next()) {
    print(iter.next())
}

Output:

{{This,0,4}}
{{is,5,2}}
{{one,8,3}}
{{sentence,12,8}}
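
The iteration can also be modeled as a Python generator over the standard re module. This is a sketch under assumptions, not the Regex class: matches_like is a hypothetical name, and the fallback to the entire match when the pattern has no captures follows the description above.

```python
import re

def matches_like(pattern, subject, flags=0):
    """Hypothetical model of matches: yield a list of capture entries per match."""
    regex = re.compile(pattern, flags)
    # With no capture groups, report the entire match (group 0) instead.
    indexes = range(1, regex.groups + 1) if regex.groups else [0]
    for m in regex.finditer(subject):
        yield [
            [m.group(i), m.start(i), m.end(i) - m.start(i)]
            if m.group(i) is not None else False
            for i in indexes
        ]

for captures in matches_like(r"\b(\w+)\b", "This is one sentence."):
    print(captures)
```

Each printed line corresponds to one line of the output above, and a subject without any match produces no iterations at all.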


replace

Array replace(String subject,
              Object replace,
              Number control,
              Number flags)

This function searches for all matches of the pattern in the string subject and replaces them according to the parameters replace and control.

The parameter replace can be either a String, a Method or a HashMap. On each match made, it is converted into a value replace_out that may be used for the replacement.

On success, returns an array containing the following elements.

  • The subject string with the substitutions made (String).
  • Number of matches found (Number).
  • Number of substitutions made (Number).

Returns nil on failure.

replace_out is generated differently depending on the type of replace:

1. If replace is a String, then it is treated as a template for substitution, where the %X occurrences in replace are handled in a special way, depending on the value of the character X:

  • if X represents a digit, then each %X occurrence is substituted by the value of the X-th submatch (capture), with the following cases handled specially:
    • each %0 is substituted by the entire match
    • if the pattern contains no captures, then each %1 is substituted by the entire match
    • any other %X where X is greater than the number of captures in the pattern will generate an error ("invalid capture index")
    • if the pattern does contain a capture with number X but that capture didn't participate in the match, then %X is substituted by an empty string
  • if X is any non-digit character then %X is substituted by X

2. If replace is a Method, then it is called on each match with the submatches passed as parameters (if there are no submatches, then the entire match is passed as the only parameter). replace_out is the return value of the replace call, and is interpreted as follows:

  • if it is a string or a number (coerced to a string), then the replacement value is that string;
  • if it is nil or false, then no replacement is done.

3. If replace is a HashMap, then replace_out is replace[m1], where m1 is the first submatch (or the entire match if there are no submatches), following the same rules as for the return value of the replace call, described in the previous item.
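
The three cases above can be sketched in Python as follows. This is a model of the described rules built on the standard re module, not the Regex class: expand_template and replace_out_for are hypothetical names, a Python callable stands in for a Method and a dict for a HashMap, None stands in for nil, and keeping a trailing lone % unchanged is an assumption the text does not cover.

```python
import re

DIGITS = "0123456789"

def expand_template(template, m):
    """Expand %X sequences in a string template against match object m."""
    out, i = [], 0
    while i < len(template):
        ch = template[i]
        if ch != "%" or i + 1 == len(template):
            out.append(ch)  # ordinary character; a trailing lone % is kept (assumption)
            i += 1
            continue
        x = template[i + 1]
        i += 2
        if x not in DIGITS:
            out.append(x)                      # %X with a non-digit X gives X
        elif x == "0" or (x == "1" and m.re.groups == 0):
            out.append(m.group(0))             # entire match
        elif int(x) > m.re.groups:
            raise ValueError("invalid capture index")
        else:
            out.append(m.group(int(x)) or "")  # non-participating capture gives ""
    return "".join(out)

def replace_out_for(replace, m):
    """Return the replacement string for match m, or None for "no replacement"."""
    if isinstance(replace, str):
        return expand_template(replace, m)
    groups = m.groups()
    if callable(replace):
        # Submatches are passed as parameters, or the entire match if there are none.
        value = replace(*groups) if groups else replace(m.group(0))
    else:
        # Mapping lookup keyed by the first submatch, or by the entire match.
        value = replace.get(groups[0] if groups else m.group(0))
    if value is None or value is False:
        return None
    return str(value)  # strings pass through, numbers are coerced

m = re.search(r"(\w+)=(\d+)", "width=10")
print(replace_out_for("%2 is the %1 (%0), 100%%", m))
print(replace_out_for(lambda key, value: int(value) * 2, m))
print(replace_out_for({"width": "w"}, m))
```

Return values of other types are not described by the text and are not handled specially in this sketch.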

replace behaves differently depending on the type of control:

1. If control is a number then it is treated as the maximum number of matches to search for (an omitted or nil value means an unlimited number of matches). On each match, the replacement value is the replace_out string (see above).

2. If control is a Method, then it is called on each match, after replace_out is produced (so if replace is a Method, it is called before control).

control receives three arguments and returns two values. Its arguments are:

  • The start offset of the match (Number)
  • The end offset of the match (Number)
  • replace_out

The type of its first return value controls the replacement produced by replace for the current match:

  • true -- replace or don't replace, according to replace_out;
  • nil/false -- don't replace;
  • a string (or a number coerced to a string) -- replace by that string.

The type of its second return value controls the behavior of replace after the current match is handled:

  • nil/false -- no changes: control will be called on the next match;
  • true -- search for an unlimited number of matches; control will not be called again;
  • a number -- maximum number of matches to search for, beginning from the next match; control will not be called again.
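
The effect of the control parameter can be sketched in Python as follows. This is a model of the rules above using the standard re module, not the Regex class: replace_like is a hypothetical name, and replace_out_for is a hypothetical hook that returns the replacement string for a match, or None when no replacement is to be done.

```python
import re

def replace_like(pattern, subject, replace_out_for, control=None):
    """Hypothetical model of replace; returns [result, matches_found, substitutions_made]."""
    is_number = isinstance(control, int) and not isinstance(control, bool)
    limit = control if is_number else None        # None means unlimited
    callback = control if callable(control) else None
    pieces, pos, found, made = [], 0, 0, 0
    for m in re.finditer(pattern, subject):
        if limit is not None and found >= limit:
            break
        found += 1
        replacement = replace_out_for(m)
        if callback is not None:
            first, second = callback(m.start(), m.end(), replacement)
            if first is None or first is False:
                replacement = None                # don't replace
            elif first is not True:
                replacement = str(first)          # replace by that string
            # first is True: keep replace_out as computed
            if second is True:
                callback, limit = None, None      # unlimited, no further calls
            elif second is not None and second is not False:
                callback, limit = None, found + second  # cap the following matches
        pieces.append(subject[pos:m.start()])
        pieces.append(m.group(0) if replacement is None else replacement)
        if replacement is not None:
            made += 1
        pos = m.end()
    pieces.append(subject[pos:])
    return ["".join(pieces), found, made]

hash_out = lambda m: "#"
print(replace_like(r"\d", "a1b2c3d4", hash_out))
print(replace_like(r"\d", "a1b2c3d4", hash_out, control=2))
print(replace_like(r"\d", "a1b2c3d4", hash_out, control=lambda s, e, out: (False, 1)))
```

In the last call, control skips the first match and then allows exactly one more match, so only the second digit is replaced and control is called once.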


split

Iterator split(String subject,
               Number flags)

This function splits the string subject into parts (sections). The regular expression pattern represents the separators between the sections.

The method returns an iterator for repeated matching of the pattern in the string subject, subject to flags.

On every iteration pass, the iterator returns:

  • A subject section (can be an empty string), followed by
  • All captures in the order they appear in the pattern (or the entire match if the pattern specified no captures). If there is no match (this can occur only in the last iteration), then nothing is returned after the subject section.

The iteration continues until the end of the subject. Unlike matches, there will always be at least one iteration pass, even if there are no matches in the subject.

Returns nil on failure.

The following example demonstrates usage of this method.

regex = System.Text.Regex.new("code", System.Text.Regex.ICASE)
iter = regex.split("**CODE^^CODE$$CODE%%CODE")

while (iter.has_next()) {
    print(iter.next())
}

Output:

{**,CODE,2,4}
{^^,CODE,8,4}
{$$,CODE,14,4}
{%%,CODE,20,4}
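
The iteration described for this method can be modeled in Python with the standard re module. This is a sketch under assumptions, not the Regex class: split_like is a hypothetical name, each pass yields a (section, captures) pair instead of the flat array printed above, captures is None when no separator follows the section, and stopping once the end of the subject is reached (without a trailing empty section) is chosen to agree with the output above. Separator matches are assumed to be non-empty.

```python
import re

def split_like(pattern, subject, flags=0):
    """Hypothetical model of split: yield (section, captures) on every pass."""
    regex = re.compile(pattern, flags)
    # With no capture groups, report the entire separator match (group 0).
    indexes = range(1, regex.groups + 1) if regex.groups else [0]
    pos = 0
    first_pass = True
    while first_pass or pos < len(subject):
        first_pass = False
        m = regex.search(subject, pos)
        if m is None:
            yield subject[pos:], None   # last section, nothing follows it
            return
        if m.end() == m.start():
            raise ValueError("empty separator matches are not modeled")
        captures = [
            [m.group(i), m.start(i), m.end(i) - m.start(i)]
            if m.group(i) is not None else False
            for i in indexes
        ]
        yield subject[pos:m.start()], captures
        pos = m.end()

for section, captures in split_like("code", "**CODE^^CODE$$CODE%%CODE", re.IGNORECASE):
    print(section, captures)
```

A subject without any separator, including the empty string, still produces exactly one pass, which matches the "at least one iteration pass" rule above.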


to_string

String to_string()

This function returns the regular expression string.