. |
matches an arbitrary character, but not a newline
unless it is a single-line match (see m//s). |
(...) |
groups a series of pattern elements to a single element. |
^ |
matches the beginning of the target. In multiline mode
(see m//m) also matches after every newline character. |
$ |
matches the end of the target, or before a newline at the end.
In multiline mode also matches before every newline character. |
[ ... ] |
denotes a class of characters to match.
[^ ... ] negates the class. |
( ... | ... | ... ) |
matches one of the alternatives. |
(?# TEXT ) |
Comment. |
(?: REGEXP ) |
Like (REGEXP), but groups without capturing, so no back-reference is made. |
(?= REGEXP ) |
Zero width positive look-ahead assertion. |
(?! REGEXP ) |
Zero width negative look-ahead assertion. |
(? MODIFIER ) |
Embedded pattern-match modifier. MODIFIER can be one or more of
i, m, s, or x. |
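A minimal sketch of the elements above, for illustration only. The sample strings and variable names are assumptions, not part of the reference.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# Without /m, ^ matches only at the beginning of the target.
my @plain = $text =~ /^(\w+)/g;        # ("first")

# With /m, ^ also matches after every newline character.
my @multi = $text =~ /^(\w+)/mg;       # ("first", "second")

# . does not match a newline unless /s is given.
my $dot_plain  = $text =~ /first.*second/  ? 1 : 0;    # 0
my $dot_single = $text =~ /first.*second/s ? 1 : 0;    # 1

# (?: ... ) groups the alternatives without capturing,
# so the only capture is the word that follows.
my ($word) = "a blue sky" =~ /(?:red|blue) (\w+)/;     # "sky"

# (?= ... ) asserts what follows without consuming it;
# $-[0] is the offset where the match started.
my $ahead = "foobar foobaz" =~ /foo(?=baz)/ ? $-[0] : -1;   # 7

# (?! ... ) fails when the sub-pattern would match next.
my $not_bar = "foobar" =~ /foo(?!bar)/ ? 1 : 0;        # 0

# (?i) turns on case-insensitive matching from inside the pattern.
my $embedded = "HELLO" =~ /(?i)hello/ ? 1 : 0;         # 1

print "plain: @plain\nmulti: @multi\nword: $word\nahead: $ahead\n";
```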
Quantified subpatterns match as many times as possible.
When followed by a ? they match the minimum number of times.
These are the quantifiers:
|
+ |
matches the preceding pattern element one or more times. |
? |
matches zero or one times. |
* |
matches zero or more times. |
{N,M} |
denotes the minimum N and maximum M match count.
{N} means exactly N times;
{N,} means at least N times. |
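The difference between maximal and minimal matching can be sketched as follows. The sample strings are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $html = "<b>bold</b> and <i>italic</i>";

# Greedy: .+ takes as much as possible, so the match runs
# from the first "<" to the last ">".
my ($greedy) = $html =~ /<(.+)>/;      # "b>bold</b> and <i>italic</i"

# Minimal: .+? stops at the first ">" that lets the match succeed.
my ($minimal) = $html =~ /<(.+?)>/;    # "b"

# {N,M}: at least 2 and at most 4 digits; greedy takes 4.
my ($most) = "123456" =~ /(\d{2,4})/;  # "1234"

# With a trailing ?, the same quantifier settles for the minimum.
my ($least) = "123456" =~ /(\d{2,4}?)/;    # "12"

# {N}: exactly 3 times.
my ($exact) = "aaaaa" =~ /(a{3})/;     # "aaa"

print "greedy: $greedy\nminimal: $minimal\n";
print "most: $most, least: $least, exact: $exact\n";
```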
A \ escapes any special meaning
of the following character if non-alphanumeric, but it turns most alphanumeric characters
into something special:
|
\w |
matches an alphanumeric character, including _;
\W matches a non-alphanumeric character. |
\s |
matches whitespace, \S matches non-whitespace. |
\d |
matches a digit, \D matches a non-digit. |
\A |
matches the beginning of the string, \Z matches the end (or before a newline at the end). |
\b |
matches word boundaries, \B matches non-boundaries. |
\G |
matches where the previous m//g search left off. |
\n, \r, \f, \t |
etc. have their usual meaning. |
\w, \s and \d |
may be used within character classes;
\b denotes backspace in this context. |
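A short sketch of the escapes above. The sample data is assumed. The small tokenizer loop also relies on the /c modifier, which is not listed in this section: it keeps the search position when a m//g match fails.

```perl
use strict;
use warnings;

my $line = "id_42 costs 19 euro";

# \w includes _, so "id_42" is a single word.
my @words = $line =~ /(\w+)/g;         # ("id_42", "costs", "19", "euro")

# \b needs a word/non-word transition; there is none inside "id_42".
my @nums = $line =~ /\b(\d+)\b/g;      # ("19")

# \d and \s inside a character class.
my ($run) = "a1 2b" =~ /([\d\s]+)/;    # "1 2"

# Inside a class, \b is a backspace character, not a word boundary.
my $bs = "x\by" =~ /x[\b]y/ ? 1 : 0;   # 1

# \Z also matches before a newline at the end of the string.
my $at_end = "one\ntwo\n" =~ /two\Z/ ? 1 : 0;   # 1

# \G anchors each attempt where the previous m//g match left off.
my $input = "12ab34";
my @tokens;
while (1) {
    if    ($input =~ /\G(\d+)/gc)    { push @tokens, "num:$1" }
    elsif ($input =~ /\G([a-z]+)/gc) { push @tokens, "word:$1" }
    else                             { last }
}
# @tokens is ("num:12", "word:ab", "num:34")

print "words: @words\nnums: @nums\ntokens: @tokens\n";
```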
Back-references:
|
\1 ... \9 |
refer to previously matched subexpressions, grouped with (),
from inside the pattern itself. |
\10 |
and up can also be used if the pattern contains that many subexpressions. |
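Back-references can be sketched as follows. The strings and variable names are illustrative assumptions.

```perl
use strict;
use warnings;

# \1 must match the same text that the first group matched.
my ($dup) = "this is is a test" =~ /\b(\w+) \1\b/;     # "is"

# The closing quote must be the same character as the opening one.
my ($quote, $body) = q{say "hi" or 'bye'} =~ /(["'])(.*?)\1/;
# $quote is '"', $body is "hi"

# In a substitution, \1 is used inside the pattern and $1 in the replacement.
my $fixed = "this is is a test";
$fixed =~ s/\b(\w+) \1\b/$1/;          # "this is a test"

print "dup: $dup\nbody: $body\nfixed: $fixed\n";
```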