RE-Build reference

`RegExp` builders

A builder is the object you obtain when building a regular expression. Builders are augmented with members and methods for extending the regex further, but they are effectively immutable: every call that extends a builder returns a new builder instance.

Properties

All the following properties are read-only.

Type	Name	Description
`RegExp`	`regex`	The regular expression defined by the builder. It is compiled the first time the property is requested, then cached
string	`source`	The source of the underlying regular expression. Used to compile it
string	`flags`	A string comprising the regex's flags. It may include one or more of the letters `"g"`, `"m"`, `"i"`, `"u"` or `"y"`
boolean	`global`	The regex's `global` flag
boolean	`ignoreCase`	The regex's `ignoreCase` flag
boolean	`multiline`	The regex's `multiline` flag
boolean	`unicode`	The regex's `unicode` flag
boolean	`sticky`	The regex's `sticky` flag
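
The two behaviours described above (extension returns a new builder, and `regex` is compiled lazily and then cached) can be sketched as follows. `TinyBuilder` is a hypothetical stand-in written for illustration; it is not RE-Build's implementation and shares none of its vocabulary.

```javascript
// Hypothetical stand-in for a builder; NOT RE-Build's actual code.
class TinyBuilder {
  constructor(source = "", flags = "") {
    this.source = source;
    this.flags = flags;
    this._regex = null; // filled in on first access to `regex`
  }

  // Extending never mutates the receiver: a new instance is returned.
  then(fragment) {
    return new TinyBuilder(this.source + fragment, this.flags);
  }

  // Compile on first request, then reuse the cached RegExp.
  get regex() {
    if (this._regex === null) {
      this._regex = new RegExp(this.source, this.flags);
    }
    return this._regex;
  }
}

const digits = new TinyBuilder("\\d+");
const decimal = digits.then("\\.\\d+");

console.log(digits.source);                 // the original builder is unchanged
console.log(decimal.source);
console.log(digits.regex === digits.regex); // same cached object each time
```

Because the original builder is never modified, a partial pattern such as `digits` can safely be reused as the starting point for several different regexes.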

Methods

Returns	Name	Description
`RegExp`	`toRegExp()`	Basically, returns the `regex` property
`RegExp`	`valueOf()`	Same as `toRegExp()`
string	`toString()`	Returns a string representation
boolean	`test(string)`	Uses the underlying regex to test a string. Short for `.regex.test(...)`
array	`exec(string)`	Executes the underlying regex on a string. Short for `.regex.exec(...)`
string	`replace(string, string/function)`	Uses the underlying regex to perform a regex-based replacement. Short for `string.replace(regex, ...)`
array	`split(string)`	Uses the underlying regex to perform a regex-based string split. Short for `string.split(regex)`
number	`search(string)`	Uses the underlying regex to perform a string search. Short for `string.search(regex)`
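
For orientation, these are the native calls that the shorthand methods in the table are described as standing for. The sketch uses plain `RegExp` objects only, so it runs without the library; the sample text and variable names are made up for illustration.

```javascript
const numbers = /\d+/;      // plays the role of builder.regex
const allNumbers = /\d+/g;  // same pattern with the global flag
const text = "a1b22c333";

const found = numbers.test(text);             // cf. builder.test(text)
const first = numbers.exec(text);             // cf. builder.exec(text)
const masked = text.replace(allNumbers, "#"); // cf. builder.replace(text, "#")
const parts = text.split(numbers);            // cf. builder.split(text)
const index = text.search(numbers);           // cf. builder.search(text)

console.log(found, first[0], masked, parts, index);
```

Note that for `replace`, `split` and `search` the string is the receiver in the native call, whereas the table lists it as the first argument of the builder method.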

Building a regex

Regex building begins from the RE object returned by the module. You obtain a builder every time you use "words" like digit, then and such. Some of these words act like functions (like atLeast and codePoint), some like properties (like digit and theEnd), and some work as both.

In this last case, if the word is not used as a function, additional words are expected to obtain a builder:

var foo = RE.matching.digit.then.alphaNumeric;

Many words that can (or must) be used as functions accept a variable number of arguments, which can be strings, regular expressions, or builders; all of them are appended to the source. Strings are backslash-escaped, while for regular expressions and builders the source property is appended unescaped:

var amount = RE.oneOrMore.digit.then(".").then.digit.then.digit,
    currency = /[$€£]/;

var builder = RE.matching.theStart
                .then("Total: ", amount, currency)
                .then.theEnd;
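
To make the escaping rule concrete, here is a library-free sketch of what the example above is expected to amount to. `escapeLiteral` is a hypothetical helper, and the assembled source is an assumption based on the description (strings escaped, regex sources appended as-is), not output captured from RE-Build.

```javascript
// Hypothetical helper: backslash-escape regex metacharacters in a literal string.
function escapeLiteral(text) {
  return text.replace(/[\\^$.*+?()[\]{}|\/]/g, "\\$&");
}

// "." is a string argument, so it is escaped to \.
const amountSource = "\\d+" + escapeLiteral(".") + "\\d\\d";

// A regex argument contributes its source unescaped.
const currency = /[$€£]/;

const total = new RegExp(
  "^" + escapeLiteral("Total: ") + amountSource + currency.source + "$"
);

console.log(total.source);
console.log(total.test("Total: 12.50€"));
console.log(total.test("Total: 12x50€")); // the escaped dot only matches "."
```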

Other words that work as functions only usually accept other types of arguments.

Flags

The flags of a builder (and its underlying regular expression) can be set using words starting from the RE object. After one of these words, another flag word or matching must follow, with the exception of withFlags, which must be followed by matching only.

globally

Set the global flag on.
anyCase

Set the ignoreCase flag on.
fullText

Set the multiline flag on.
stickily

Set the sticky flag on.
withUnicode

Set the unicode flag on.
withFlags(flags)

Set multiple flags. flags is expected to be a string containing letters in the set "g", "m", "i" and "y".
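
As a reminder of what these flag words ultimately control, the following uses a native `RegExp` with the corresponding flag letters. The mapping in the comments restates the descriptions above; the pattern itself is arbitrary.

```javascript
// "g" ~ globally, "i" ~ anyCase, "m" ~ fullText
const flagged = new RegExp("^total", "gim");

console.log(flagged.global);     // set by "g"
console.log(flagged.ignoreCase); // set by "i"
console.log(flagged.multiline);  // set by "m"
console.log(flagged.sticky);     // "y" was not requested
console.log(flagged.flags);

// With "m", ^ also matches at the start of each line; "i" ignores case.
const hits = "Total: 1\ntotal: 2".match(flagged);
console.log(hits.length);
```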

Conjunctions

Conjunctions append additional blocks to the current source. They can follow any open or set block.

then

Appends a block to the current source.
or

Adds an alternative block (prefixed by the pipe | character in regular expressions).
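
Since or maps to the pipe character, it is worth keeping in mind that `|` binds more loosely than a sequence of blocks. The native regexes below illustrate this; they are hand-written, not generated by the library.

```javascript
// Alternation splits the WHOLE pattern: (^cat) or (dog$).
const loose = /^cat|dog$/;

// Wrapping the alternatives in a non-capturing group confines the "|".
const grouped = /^(?:cat|dog)$/;

console.log(loose.test("catfish"));   // starts with "cat", so it matches
console.log(grouped.test("catfish")); // the whole string is neither word
console.log(grouped.test("dog"));
```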

Open and set blocks

These words can be used in both "open" sequences or inside character sets. They can be used after conjunction words, or a quantifier, or the matching word, or the RE object itself, or the and word joining blocks in character sets.

digit / not.digit

A digit character (\d) or its negation (\D).
alphaNumeric / not.alphaNumeric

An alphanumeric character or the underscore (\w) or its negation (\W).
whiteSpace / not.whiteSpace

A whitespace character (\s) or its negation (\S).
cReturn \r
newLine \n
tab \t
vTab \v
formFeed \f
null \0
slash \/
backslash \\
ascii(code)

An ASCII escape sequence (\xhh). code must be an integer between 0 and 255. It is then converted to two hexadecimal digits in the sequence.
codePoint(code, ...)

A Unicode escape sequence (\uhhhh, or \u{hhhhh} when the unicode flag is set and the code lies outside the Basic Multilingual Plane). code must be an integer between 0 and 1114111 (0x10ffff), otherwise a RangeError is thrown; it can also be a string, whose code points are converted to the corresponding Unicode escape sequences. Keep in mind that, when the unicode flag is not set, code points from astral planes are encoded as the corresponding surrogate pairs (e.g. "🍰" becomes "\ud83c\udf70"). It is your responsibility to wrap the pairs in a group if needed or, where that is not possible (for example, in a character range), to use an adequate regex structure.
control(letter)

A control sequence (\cx). letter must be a string of a single letter. It is then converted to uppercase in the sequence.
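
The conversions described for ascii, codePoint and control can be sketched in plain JavaScript as below. The three function names are hypothetical, and the `RangeError` in `asciiEscape` is an assumption (the text above only states the valid range); this is an illustration of the rules, not the library's code.

```javascript
// Hypothetical helpers mirroring the documented conversions.
function asciiEscape(code) {
  if (!Number.isInteger(code) || code < 0 || code > 255) {
    throw new RangeError("code must be an integer between 0 and 255"); // assumed behaviour
  }
  return "\\x" + code.toString(16).padStart(2, "0");
}

function codePointEscape(code, unicodeFlag = false) {
  if (!Number.isInteger(code) || code < 0 || code > 0x10ffff) {
    throw new RangeError("code must be an integer between 0 and 0x10ffff");
  }
  const hex4 = (n) => n.toString(16).padStart(4, "0");
  if (code <= 0xffff) return "\\u" + hex4(code);          // Basic Multilingual Plane
  if (unicodeFlag) return "\\u{" + code.toString(16) + "}";
  const offset = code - 0x10000;                          // split into a surrogate pair
  return "\\u" + hex4(0xd800 + (offset >> 10)) +
         "\\u" + hex4(0xdc00 + (offset & 0x3ff));
}

function controlEscape(letter) {
  return "\\c" + letter.toUpperCase();
}

const cake = codePointEscape(0x1f370); // "🍰" without the unicode flag

// Why the pair may need a group: a quantifier only applies to the last code unit.
const bare = new RegExp("^" + cake + "+$");
const wrapped = new RegExp("^(?:" + cake + ")+$");
console.log(cake, bare.test("🍰🍰"), wrapped.test("🍰🍰"));
```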

Open-only blocks

These words can be used in open block sequences only (which means, not inside character sets). They can be used after conjunction words, or a quantifier, or the matching word, or the RE object itself.

anyChar

The universal character (.).
theStart / theEnd

The string-start and string-end boundaries (^ and $, respectively).
wordBoundary / not.wordBoundary

A word boundary (\b) or its negation (\B).
oneOf / not.oneOf

Appends a character set ([...] or [^...], respectively). See the paragraph about character sets.
group(...)

Non-capturing group - (?:...). Used as functions only. Arguments can be strings, regexes or builders.
capture(...)

Capturing group - (...). Used as functions only. Arguments can be strings, regexes or builders.
reference(number)

Group backreference (\number). number should be a positive integer.
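
The regex constructs named in this section behave as follows in native JavaScript. The patterns are written by hand to match the syntax given in parentheses above; the sample strings are arbitrary.

```javascript
const repeated = /^(?:ab)+$/;  // group(...): groups without capturing
const doubled = /(\w)\1/;      // capture(...) followed by reference(1)
const wholeWord = /\bcat\b/;   // wordBoundary on both sides

const match = doubled.exec("balloon");

console.log(repeated.test("ababab"));
console.log(match[0], match[1]);        // matched text and captured character
console.log(wholeWord.test("a cat"));
console.log(wholeWord.test("concat"));  // no boundary before "cat" here
```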

Character sets

Character sets are introduced by the oneOf word, and may include one or more blocks separated by the and word (e.g.: RE.oneOf.digit.and("abcdef")).

These words can be used in character sets only:

range(start, end)

Adds a character interval to the character set ([...start-end...]). start and end are expected to be single-character strings defining the boundaries of the character range; alternatively, they can be builders that define a single character or a character class usable in character ranges (which include: ascii, unicode, control, newLine, cReturn, tab, vTab, formFeed, null).
backspace

The backspace character, \b (U+0008). Not to be confused with the word boundary, which can be used as an "open" block only.
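
A few native character sets illustrate the points above. The comments naming builder expressions describe the assumed shape of the result, not verified library output.

```javascript
const hexListed = /^[\dabcdef]+$/; // assumed shape of RE.oneOf.digit.and("abcdef")
const hexRanged = /^[\da-f]+$/;    // the same set expressed with a range a-f
const notDigit = /[^\d]/;          // negated set, cf. not.oneOf

// Inside a set, \b is the backspace character (U+0008)...
const backspaceInSet = /[\b]/;
// ...while outside a set it is the word boundary.
const boundary = /\b/;

console.log(hexListed.test("c0ffee"), hexRanged.test("c0ffee"));
console.log(notDigit.test("2024"));
console.log(backspaceInSet.test("\u0008"), boundary.test("\u0008"));
```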

Quantifiers

Quantifiers can follow conjunction words, or the matching word, or the RE object itself, and can precede any "open" block, with the exception of wordBoundary, not.wordBoundary, theStart and theEnd.

They can be prefixed by lazily to define a lazy quantifier, instead of a greedy one.

Quantifiers can be used as functions, and accept strings, regexes or builders as arguments. A convenient group wrap will be used if necessary:

var foo = RE.oneOrMore("a");   // /a+/
var bar = RE.oneOrMore("abc"); // /(?:abc)+/
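
The reason for the group wrap, and the effect of a lazy quantifier, can be checked with native regexes. The patterns are hand-written equivalents of the sources shown in the comments above, with anchors added so the difference is visible.

```javascript
// Without a group, "+" applies to the last character only.
const ungrouped = /^abc+$/;
const grouped = /^(?:abc)+$/;

console.log(ungrouped.test("abccc"), ungrouped.test("abcabc"));
console.log(grouped.test("abcabc"));

// Greedy takes as much as possible; lazy (quantifier followed by "?") as little.
const greedy = "<a><b>".match(/<.+>/)[0];
const lazy = "<a><b>".match(/<.+?>/)[0];
console.log(greedy, lazy);
```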

anyAmountOf *
oneOrMore +
noneOrOne ?
atLeast(n)

n must be a non-negative integer. If n is 0, a * is produced; if n is 1, then + is produced; else, the quantifier is {n,}.
atMost(n)

n must be a non-negative integer. If n is 1, then ? is produced; else, the quantifier is {,n}.
exactly(n)

n must be a non-negative integer. If n is 1, then no quantifier is defined; else, the quantifier is {n}.
between(n, m)

n and m must be non-negative integers. If the values are adequate, the produced quantifier can be one of the above; otherwise, the quantifier is {n,m}.
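
The simplification rules stated for atLeast and exactly can be written down as a small sketch. The function names and the error handling are hypothetical; the code only restates the rules above and deliberately leaves out atMost and between.

```javascript
// Hypothetical helpers restating the documented rules; not the library's code.
function assertCount(n) {
  if (!Number.isInteger(n) || n < 0) {
    throw new RangeError("n must be a non-negative integer"); // assumed behaviour
  }
}

function atLeastQuantifier(n) {
  assertCount(n);
  if (n === 0) return "*";
  if (n === 1) return "+";
  return "{" + n + ",}";
}

function exactlyQuantifier(n) {
  assertCount(n);
  return n === 1 ? "" : "{" + n + "}";
}

const twoOrMoreDigits = new RegExp("^\\d" + atLeastQuantifier(2) + "$");
console.log(atLeastQuantifier(0), atLeastQuantifier(1), atLeastQuantifier(3));
console.log(exactlyQuantifier(1) === "", exactlyQuantifier(4));
console.log(twoOrMoreDigits.test("7"), twoOrMoreDigits.test("77"));
```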

Look-aheads

followedBy(...) / not.followedBy(...)

Appends a look-ahead ((?=...) or (?!...), respectively). Used as functions only. Arguments can be strings, regexes or builders.

Can follow any open block, or the matching word, or the RE object itself, or the or conjunction.
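
A short native illustration of the two look-ahead forms: the look-ahead constrains what comes next without becoming part of the match. The patterns are hand-written, not produced by the library.

```javascript
const beforeBar = /foo(?=bar)/;    // cf. followedBy("bar")
const notBeforeBar = /foo(?!bar)/; // cf. not.followedBy("bar")

// The text inside the look-ahead is checked but not consumed.
const matched = beforeBar.exec("foobar")[0];

console.log(matched);
console.log(beforeBar.test("foobaz"));
console.log(notBeforeBar.test("foobar"), notBeforeBar.test("foobaz"));
```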
