Regular expressions are patterns that can be searched for within a text string, instead of searching for an exact match to a known piece of text. They are much more versatile than plain text searches for find-and-replace operations, which makes them useful for parsing, filtering, and similar tasks.

Some example regular expressions are:

| Pattern | Code | Meaning |
|---|---|---|
| B.*D | regex("B.*D") | Find B, followed by any number of characters (including none), followed by a D. |
| [0-3] | regex(@"[0-3]") | Find any digit from 0 to 3 |
| foobar | regex("foo | |
| \d+ | regex(@"\d+","g") | Find all sequences of digits |
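
As a rough illustration of how the patterns in the table might be used, here is a minimal sketch. The proc name and sample strings are hypothetical, and it assumes that Find() returns the position of the match (0 when there is none) and that the matched text is available in the datum's match var.

```dm
// Hypothetical demo proc; the name and the sample text are assumptions.
/proc/regex_intro_demo()
    // regex() builds a /regex datum from a pattern and optional flags.
    var/regex/bd = regex("B.*D")
    var/regex/low_digit = regex(@"[0-3]")

    // Assumed: Find() gives the match position, or 0 if nothing matched.
    var/pos = bd.Find("ABCDE")
    if(pos)
        world.log << "B.*D matched '[bd.match]' at position [pos]"

    // "789" contains no digit from 0 to 3, so this Find() should fail.
    if(!low_digit.Find("789"))
        world.log << "no digit from 0 to 3 in 789"
```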

These are some of the patterns you can use. To use any of the operators as a literal character, escape it with a backslash.

It is highly recommended that you use raw strings like @"..." for your regular expression patterns, because with a regular DM string you have to escape all backslash \ and open bracket [ characters, which makes your regular expression much harder to read. It’s easier to write @"[\d]\n" than "\[\\d]\\n".
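
The difference between the two spellings can be sketched as follows. The proc name and sample text are hypothetical; the example only assumes the escaping rules described above.

```dm
// Hypothetical demo proc; the name and the sample text are assumptions.
/proc/raw_string_demo()
    // Raw string: backslashes and brackets are taken literally.
    var/raw_pattern = @"[0-3]\d+"
    // Regular string: [ has to be written \[ and \ has to be written \\.
    var/escaped_pattern = "\[0-3]\\d+"

    // Both spellings should yield the same pattern text.
    if(raw_pattern == escaped_pattern)
        world.log << "both spellings produce the same pattern text"

    var/regex/R = regex(raw_pattern)
    if(R.Find("room 25"))
        world.log << "matched [R.match]"
```

The raw form stays close to how the pattern would be written in any other regular expression reference, which is the main reason to prefer it.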

| Pattern | Matches |
|---|---|
| ab | |
| . | Any character (except a line break) |
| ^ | Beginning of text; or of a line if the m flag is used |
| $ | End of text; or of a line if the m flag is used |
| \A | Beginning of text |
| \Z | End of text |
| [chars] | Any character between the brackets. Ranges can be specified with a hyphen, like 0-9. Character classes like \d and \s can also be used (see below). |
| [^chars] | Any character NOT matching the ones between the brackets. |
| \b | Word break |
| \B | Word non-break |
| (pattern) | Capturing group: the pattern must match, and its contents will be captured in the group list. |
| (?:pattern) | Non-capturing group: Match the pattern, but do not capture its contents. |
| \1 through \9 | Backreference; \N is whatever was captured in the Nth capturing group. |

Modifiers

Modifiers are “greedy” by default, looking for the longest match possible. When a modifier follows a word, it applies only to the last character; for example, ab+ matches a followed by one or more b.

| Pattern | Matches |
|---|---|
| a* | Match a zero or more times |
| a+ | Match a one or more times |
| a? | Match a zero or one time |
| a{n} | Match a, exactly n times |
| a{n,} | Match a, n or more times |
| a{n,m} | Match a, n to m times |
| modifier? | Make the previous modifier non-greedy (match as little as possible) |

Escape codes and character classes

| Pattern | Matches |
|---|---|
| \xNN | Escape code for a single character, where NN is its hexadecimal ASCII value |
| \uNNNN | Escape code for a single 16-bit Unicode character, where NNNN is its hexadecimal value |
| \UNNNNNN | Escape code for a single 21-bit Unicode character, where NNNNNN is its hexadecimal value |
| \d | Any digit 0 through 9 |
| \D | Any character except a digit or line break |
| \l | Any letter A through Z, case-insensitive |
| \L | Any character except a letter or line break |
| \w | Any identifier character: digits, letters, or underscore |
| \W | Any character except an identifier character or line break |
| \s | Any space character |
| \S | Any character except a space or line break |

Assertions

| Pattern | Matches |
|---|---|
| (?=pattern) | Look-ahead: Require this pattern to come next, but don’t include it in the match |
| (?!pattern) | Look-ahead: Require this pattern NOT to come next |
| (?<=pattern) | Look-behind: Require this pattern to come before, but don’t include it in the match (must be a fixed byte length) |
| (?<!pattern) | Look-behind: Require this pattern NOT to come before (must be a fixed byte length) |
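
A few of the constructs listed above can be sketched together as follows. The proc name and sample strings are hypothetical, the matched text is assumed to be exposed in the datum's match var, and the trailing comments show the matches these patterns would be expected to produce rather than verified output.

```dm
// Hypothetical demo proc; the name and the sample text are assumptions.
/proc/regex_syntax_demo()
    // Greedy vs. non-greedy: .* takes as much as it can, .*? as little.
    var/regex/greedy = regex("a.*b")
    var/regex/lazy = regex("a.*?b")
    if(greedy.Find("aXbXb"))
        world.log << "greedy: [greedy.match]"   // expected: aXbXb
    if(lazy.Find("aXbXb"))
        world.log << "lazy: [lazy.match]"       // expected: aXb

    // Backreference: \1 must repeat whatever group 1 captured.
    var/regex/doubled = regex(@"(\l)\1")
    if(doubled.Find("balloon"))
        world.log << "doubled letter: [doubled.match]"   // expected: ll

    // Look-ahead: the digits must be followed by px,
    // but px itself is not part of the match.
    var/regex/pixels = regex(@"\d+(?=px)")
    if(pixels.Find("width 120px"))
        world.log << "pixels: [pixels.match]"   // expected: 120
```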

The optional flags can be any combination of these:

| Flag | Meaning |
|---|---|
| i | Case-insensitive matching |
| g | Global: In Find() subsequent calls will start where this left off, and in Replace() all matches are replaced. |
| m | Multi-line: ^ and $ refer to the beginning and end of a line, respectively. |
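
The effect of each flag can be sketched as follows. The proc name and sample text are hypothetical, and the loops assume that Find() returns 0 once no further match is found.

```dm
// Hypothetical demo proc; the name and the sample text are assumptions.
/proc/regex_flags_demo()
    var/haystack = "a1 b22 c333"

    // "g": each Find() on the same text resumes after the previous match.
    var/regex/digits = regex(@"\d+", "g")
    while(digits.Find(haystack))
        world.log << "digits: [digits.match]"

    // "i": letter case is ignored.
    var/regex/foo = regex("foo", "i")
    if(foo.Find("FOO fighters"))
        world.log << "case-insensitive hit: [foo.match]"

    // "gm": flags can be combined; with "m", ^ also matches at line starts.
    var/lines = "alpha one\nbeta two"
    var/regex/first_word = regex(@"^\l+", "gm")
    while(first_word.Find(lines))
        world.log << "line starts with: [first_word.match]"
```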

After calling Find() on a /regex datum, the datum’s group var will contain a list, if applicable, of any sub-patterns found with the () parentheses operator. For instance, searching the string "123" for 1(\d)(\d) will match "123", and the group var will be list("2","3"). Groups can also be used in replacement expressions; see the Replace() proc for more details.
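
The "123" example above could be written out like this. It is a minimal sketch: the proc name is hypothetical, and it assumes Find() returns a non-zero position on success.

```dm
// Hypothetical demo proc; the name is an assumption.
/proc/regex_group_demo()
    var/regex/R = regex(@"1(\d)(\d)")
    if(R.Find("123"))
        // Per the description above, the group var should be list("2","3").
        var/list/groups = R.group
        for(var/i in 1 to groups.len)
            world.log << "group [i]: [groups[i]]"
```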

See also