JS: RegExp Syntax

By Xah Lee.

Regular Expression (aka regex, regexp) syntax has 2 parts: the pattern and the flags.

Syntax

Regex objects are created in 2 ways:

/pattern/flags
Literal expression. This is convenient.
RegExp(patternStr, flagsStr)
Constructor call. This is more general, and can be used to build a regex from a string at run time. [see RegExp Constructor]
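
A minimal sketch of the two forms side by side. The names fromLiteral, fromString, and word are made up for illustration. Note that inside a string, each regex backslash must be doubled.

```javascript
// Literal form: the pattern is fixed when the code is written.
const fromLiteral = /\bcat\b/g;

// Constructor form: the pattern is assembled from strings at run time.
// Inside a string literal, each regex backslash is written as \\.
const word = "cat"; // hypothetical input, for example from a user
const fromString = RegExp("\\b" + word + "\\b", "g");

console.log(fromLiteral.source === fromString.source);
console.log(fromLiteral.flags === fromString.flags);
console.log("cat cats cat".match(fromString).length === 2);
// all true
```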

[see RegExp Tutorial]

RegExp Flags

[see RegExp Flags]

RegExp Pattern Syntax

Character Class

.
Any character except newline characters: {\n, \r, \u2028, \u2029}.
If the dotAll flag s is used, it also matches newline characters.
const txt = `B3
yes?`;
const rgx = /.+/g;
console.log(...txt.matchAll(rgx));
// [ "B3" ] [ "yes?" ]
const txt = `B3
yes?`;
const rgx = /.+/gs;
console.log(...txt.matchAll(rgx));
// [ "B3\nyes?" ]
[]
Any one of the characters between the brackets. The brackets can also contain a character class such as \w.
console.log(
  ..."cat cet cit cot cut".matchAll("c[aou]t"),
);
// [ "cat" ] [ "cot" ] [ "cut" ]
console.log(..."fire-brand y42".matchAll(/[-\w]+/g));
// [ "fire-brand" ] [ "y42" ]
[^]
any character that is not one of the characters in the brackets.
\w
any ASCII letter A to Z or a to z, any digit 0 to 9, or low line _.
console.log(..."x1 and x2".matchAll(/(\w+)/g));
// [ "x1", "x1" ] [ "and", "and" ] [ "x2", "x2" ]
// does not match unicode letter
console.log(..."♥ x_2 y_2 α β 甲".matchAll(/\w+/gu));
// [ "x_2" ] [ "y_2" ]
\W
any character that is not \w.
\d
any ASCII digit 0 to 9. Example: "xyz123".search(/\d/) returns 3.
\D
any character that's not \d.
\s
any whitespace character, including space, tab, and line break characters.
\S
any character that is not \s.
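
The negated forms are easiest to see next to their counterparts. A small sketch with made-up sample strings:

```javascript
// [^...] matches any character not listed in the brackets
const digitsOnly = "2024-01-15".replace(/[^0-9]/g, "");
console.log(digitsOnly === "20240115");

// \W is the complement of \w
const wordChars = "a_b c!".replace(/\W/g, "");
console.log(wordChars === "a_bc");

// \D is the complement of \d
const noLetters = "xyz123".replace(/\D/g, "");
console.log(noLetters === "123");

// \s matches space, tab, and newline
const parts = "a \t\n b".split(/\s+/);
console.log(parts.join(",") === "a,b");

// \S finds the first character that is not whitespace
const firstInk = "  hi".search(/\S/);
console.log(firstInk === 2);
// all true
```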

Boundaries

^
beginning of string. If RegExp Flag m (multiline) is set, also match beginning of each line.
$
end of string. If RegExp Flag m (multiline) is set, also match end of each line.
\b
word boundary. For literal backspace, use [\b].
console.log("cats cat".search(/cat/) === 0);
console.log("cats cat".search(/\bcat\b/) === 5);
// all true
\B
not a word boundary. Matches at any position where \b does not match.
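
A short sketch of how the multiline flag m changes ^ and $, plus one \B example. The g flag here only asks for all matches; it is m that makes ^ and $ work per line. The sample text is made up.

```javascript
const lines = "one\ntwo\nthree";

// without m, ^ matches only at the start of the whole string
const noM = lines.match(/^\w+/g);
console.log(noM.join(",") === "one");

// with m, ^ also matches right after each line break
const withM = lines.match(/^\w+/gm);
console.log(withM.join(",") === "one,two,three");

// with m, $ also matches right before each line break
const lineEnds = lines.replace(/$/gm, ";");
console.log(lineEnds === "one;\ntwo;\nthree;");

// \B: "cat" must be followed by another word character
const notAtBoundary = "cats cat".search(/cat\B/);
console.log(notAtBoundary === 0);
// all true
```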

Repetition

*
Match previous pattern 0 or more times. Same as {0,}.
?
Match previous pattern 0 or 1 time. Same as {0,1}.
+
Match previous pattern 1 or more times. Same as {1,}.
console.log("278".search(/\d+/) === 0); // true
{n}
Match previous pattern exactly n times.
console.log("eat".search(/e{2}/) === -1);
console.log("feet".search(/e{2}/)=== 1);
// all true
{n,}
Match previous pattern n or more times.
{n,m}
Match previous pattern at least n times and at most m times (inclusive). Do not put a space after the comma, else it is matched as literal text.

Note: these quantifiers are greedy, meaning they match as much as possible. For the non-greedy version, add a ? after them, such as *? or +? or {n,m}?.
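
A minimal sketch of greedy versus non-greedy matching, and of the counted forms. The sample strings are made up.

```javascript
const html = "<b>one</b> <b>two</b>";

// greedy: .+ runs as far as it can, up to the last </b>
const greedy = html.match(/<b>.+<\/b>/)[0];
console.log(greedy === "<b>one</b> <b>two</b>");

// non-greedy: .+? stops at the first </b>
const lazy = html.match(/<b>.+?<\/b>/)[0];
console.log(lazy === "<b>one</b>");

// {n,} takes n or more, {n,m} takes at most m
const atLeastTwo = "aaaa".match(/a{2,}/)[0];
const twoToThree = "aaaa".match(/a{2,3}/)[0];
const twoToThreeLazy = "aaaa".match(/a{2,3}?/)[0];
console.log(atLeastTwo === "aaaa");
console.log(twoToThree === "aaa");
console.log(twoToThreeLazy === "aa");

// ? makes the previous character optional
const spellings = "color colour".match(/colou?r/g);
console.log(spellings.length === 2);
// all true
```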

Alternate and Conditions

x|y
Alternate. Match either x or y.
console.log("wildfire".search(/water|fire/) === 4); // true
x(?=y)
Look ahead assertion. Match x only if x is followed by y. The y is not part of the match.
// replace all ab by abc, only if ab is followed by comma or period
console.log("abc, ab, ab.".replace(/ab(?=[\.,])/g, "abc") === "abc, abc, abc."); // true
x(?!y)
Negative look ahead assertion. Match x only if x is not followed by y.
(?<=y)x
(JS2018) Look behind assertion.
Match only if y comes before x.
console.log(
  "sometimes somehow".replace(/(?<=some)how/, "one") === "sometimes someone",
); // true
(?<!y)x
(JS2018) Negative Look behind assertion.
Match only if y does not come before x.
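
A small sketch of the two negative assertions, using made-up sample strings. In the second example, \d is included in the look behind so that the match cannot start in the middle of "100".

```javascript
// x(?!y): change "ab" only if it is NOT followed by comma or period
const negAhead = "abc, ab, ab.".replace(/ab(?![.,])/g, "AB");
console.log(negAhead === "ABc, ab, ab.");

// (?<!y)x: match a number only if it is NOT preceded by a dollar sign
const negBehind = "$100 and 200".match(/(?<![$\d])\d+/g);
console.log(negBehind.length === 1 && negBehind[0] === "200");
// all true
```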

Capture Group, Back Reference

()
Capture. A captured group can later be referenced in the regex by \n where n is a digit. \1 is the first captured group. In a replacement string, use $n.
console.log(/(\d{4}).+(\d{4})/.exec("born 1899, died 1960"));
// [ "1899, died 1960", "1899", "1960" ]
console.log(
  "born 1899, died 1960".replace(/.+(\d{4}).+(\d{4})/, "$1 to $2") ===
    "1899 to 1960",
); // true
(?<name>)
(JS2018) Named capture group. The group can be referred to by \k<name> in regex or $<name> in replacement string.
// match text where width and height are the same
console.log(
  /width="(?<w>\d+)" height="\k<w>"/.exec('width="300" height="300"'),
);
// [ 'width="300" height="300"', "300" ]
console.log(
  "lived from 1899 to 1960".replace(
    /.+(?<born>\d{4}).+(?<died>\d{4})/,
    "$<born> - $<died>",
  ),
);
// 1899 - 1960
(?:)
Non-capturing group. Groups a pattern for priority (precedence), but does not capture.
\n
Back reference. Matches the same text as the nth captured group, where n is a digit. \1 is the first captured group.
\k<name>
Refer to named capture group name.
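
A brief sketch of the non-capturing group and the numbered back reference. The sample strings are made up.

```javascript
// (?:...) groups the alternation without capturing,
// so the year is still captured group 1
const yearMatch = /(?:born|died) (\d{4})/.exec("died 1960");
console.log(yearMatch.length === 2);
console.log(yearMatch[1] === "1960");

// \1 matches the same text as group 1: find a doubled word
const dup = /\b(\w+) \1\b/.exec("this is is a test");
console.log(dup[0] === "is is");
console.log(dup.index === 5);
// all true
```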

Unicode Property

\p{PropertyValue}
match a char that has the property value PropertyValue (a general category such as Letter). Requires RegExp flag u. [see RegExp Unicode Property]
\p{PropertyName=PropertyValue}
match a char whose PropertyName is PropertyValue
\p{BinaryPropertyName}
match a char that has the binary property BinaryPropertyName, such as Uppercase.
\P{...}
(Note, uppercase P) Negation of \p{...}.
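
A minimal sketch of each form, assuming a JavaScript engine with JS2018 Unicode property support. All of them need the u flag. The sample strings are made up.

```javascript
// \p{PropertyValue}: general category Letter
const letters = "♥ x_2 α 甲".match(/\p{Letter}+/gu);
console.log(letters.join(",") === "x,α,甲");

// \p{PropertyName=PropertyValue}: script is Greek
const greek = "x α β 甲".match(/\p{Script=Greek}/gu);
console.log(greek.join("") === "αβ");

// \p{BinaryPropertyName}: binary property Uppercase
const upper = "aBc".match(/\p{Uppercase}/u)[0];
console.log(upper === "B");

// \P{...}: anything that is NOT a letter
const lettersOnly = "a1b2".replace(/\P{Letter}/gu, "");
console.log(lettersOnly === "ab");
// all true
```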

Escapes for Literal Characters

\0
the NUL character (ASCII 0) [see ASCII Table]
\t
horizontal tab (common tab char)
\n
line feed (unix newline char)
\v
vertical tab (rarely used)
\f
form feed (often used in emacs as code section break)
\r
carriage return (used in Mac OS Classic as newline)
\xxx
a character with hex code xx. It must be exactly 2 hex digits. For example, /\x61/ matches the letter “a” (ASCII code 97, hex 61)
\uxxxx
a Unicode character with hex code xxxx. It must be exactly 4 hex digits. Add 0 in front if the code is shorter. For example, /\u03b1/ matches “α” (codepoint 945, hex 3b1).
console.log("α".search(/\u03b1/) === 0);
console.log("α".search(/α/) === 0);
// all true
\cX
an ASCII control character, where X is a letter A to Z. For example, /\cJ/ matches the unix newline \n. [see ASCII Table]
[\b]
a backspace.
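
A quick sketch that checks several of these escapes. The last line uses the \u{...} form, which is not covered above: with RegExp flag u, it accepts a codepoint of more than 4 hex digits.

```javascript
const tabIndex = "a\tb".search(/\t/);
console.log(tabIndex === 1);

const hexA = /\x61/.test("a");
const ctrlJ = /\cJ/.test("\n");
const backspace = /[\b]/.test("\b"); // in a string, "\b" is the backspace char
const nul = /\0/.test("\0");
console.log(hexA, ctrlJ, backspace, nul);

// \u{...} needs the u flag
const astral = /\u{1F600}/u.test("😀");
console.log(astral);
// all true
```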