JS: RegExp Syntax

By Xah Lee.

This page covers regex pattern syntax.

(For basic examples of how to use regex, see RegExp Tutorial)

Regex pattern has 2 parts:

RegExp Flags

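Regex flags are specified in 2 ways: after the closing slash of a regex literal, as in /pattern/flags, or as the second argument of the constructor, as in new RegExp(pattern, flags).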

Regex flag changes the meaning of the pattern, or the behavior of the regex function.
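As a minimal sketch, flags can be written after the closing slash of a regex literal, or passed as the second argument to the RegExp constructor. The variable names here are only for illustration.

```javascript
// flags after the closing slash of a regex literal
const re1 = /t/i;

// flags as the second argument of the RegExp constructor
const re2 = new RegExp("t", "i");

// both regex objects carry the same flag
console.log(re1.flags); // "i"
console.log(re2.flags); // "i"

// and both behave the same
const pos1 = "WATER".search(re1);
const pos2 = "WATER".search(re2);
console.log(pos1, pos2); // 2 2
```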

g
“global”. Find all matches. Don't stop after first found.
i
ignore case
m
“multiline”; make ^ and $ also match at the beginning/end of each line (not just the beginning/end of the whole string).
u
(ES2015) “unicode”. Treat the string as a sequence of Unicode characters. If your string contains a character whose codepoint is ≥ 2^16, you should set this flag. [see Character, Code Unit, Codepoint]
y
(ES2015) “sticky”; the match must start exactly at the index given by the regex object's lastIndex property. The pattern ^ still matches only at the beginning of the string (or of a line, if flag m is set).

Example:

// check if the string contains t
console.log( "WATER".search( /t/ ) === -1);

// ignore case
console.log( "WATER".search( /t/i ) === 2 );

// return value is the start position of match, or -1 if not found
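The other flags can be sketched the same way. This is a small illustration, not an exhaustive test; the sample strings and variable names are made up for the example.

```javascript
// g: find all matches, not just the first
const allDigits = "a1b2".match(/\d/g);
console.log(allDigits); // [ '1', '2' ]

// m: ^ also matches right after a newline
const withoutM = /^b/.test("a\nb");
const withM = /^b/m.test("a\nb");
console.log(withoutM, withM); // false true

// u: a char with codepoint ≥ 2^16 counts as one character
const smile = "\u{1F600}"; // stored as 2 code units
const dotNoU = /^.$/.test(smile);
const dotWithU = /^.$/u.test(smile);
console.log(dotNoU, dotWithU); // false true

// y: the match must start exactly at lastIndex
const sticky = /b/y;
sticky.lastIndex = 1;
const stickyAt1 = sticky.test("abc");
sticky.lastIndex = 0;
const stickyAt0 = sticky.test("abc");
console.log(stickyAt1, stickyAt0); // true false
```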

Special Escapes for Literal Characters

\0
the NUL character (ASCII 0)
\t
horizontal tab (common tab char)
\n
line feed (unix newline char)
\v
vertical tab (rarely used)
\f
form feed (often used in emacs as code section break)
\r
carriage return (used in Mac OS Classic as newline)
\xxx
a character with the 2-digit hex code xx. For example, /\x61/ matches the letter “a” (ASCII code 97, hex 61)
\uxxxx
a Unicode character with hex code xxxx. It must be 4 digits. Add 0 in front if not. For example, /\u03b1/ matches “α” (codepoint 945, hex 3b1)
\cX
an ASCII control character, where X is a letter A to Z. For example, /\cJ/ matches the unix newline \n.
[\b]
a backspace.

[see ASCII Characters ␀ ␣ ¶]

// example of RegExp matching unicode by hex codepoint

console.log( "alpha α".search(/\u03b1/)); // 6

console.log( "alpha α".search(/α/)); // 6
// literal Unicode char is ok too
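The other escapes in the list can be checked in the same manner. A short sketch, with sample strings chosen only for illustration:

```javascript
// \xhh: character by 2-digit hex code
const hexA = /\x61/.test("a");
console.log(hexA); // true

// \cJ is control-J, the line feed character
const ctrlJ = /\cJ/.test("\n");
console.log(ctrlJ); // true

// \t matches a tab
const tabPos = "a\tb".search(/\t/);
console.log(tabPos); // 1

// [\b] matches a backspace character (U+0008)
const backspace = /[\b]/.test("\u0008");
console.log(backspace); // true

// \0 matches the NUL character
const nul = /\0/.test("\u0000");
console.log(nul); // true
```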

Character Sets, Character Classes

[…]
any one of the characters between the brackets.
[^…]
any char that's not one of the characters in the brackets.
.
any char, except newline characters: {\n, \r, \u2028, \u2029}.
\w
any ASCII letter (upper or lower), digit, or low line _.
\W
any character that is not \w.
\d
any ASCII digit 0 to 9.
\D
any character that's not \d.
\s
any Unicode whitespace character.
\S
any character that is not \s.
// check if string contains a digit

console.log (
 "xyz 123".search( /\d/ )
); // 4
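A few more character class checks, as a sketch. The sample string and variable names are assumptions made for this example.

```javascript
const text = "hello world_42";

// [aeiou]: position of the first vowel
const firstVowel = text.search(/[aeiou]/);
console.log(firstVowel); // 1

// [^a-z]: position of the first char that is not a lowercase letter
const firstNonLower = text.search(/[^a-z]/);
console.log(firstNonLower); // 5

// \w+ : runs of letters, digits, low line
const words = text.match(/\w+/g);
console.log(words); // [ 'hello', 'world_42' ]

// \s : position of the first whitespace
const firstSpace = text.search(/\s/);
console.log(firstSpace); // 5

// the dot does not match a newline
const dotNewline = /./.test("\n");
console.log(dotNewline); // false

// \w is ASCII only, it does not match Greek alpha
const wordAlpha = /\w/.test("α");
console.log(wordAlpha); // false
```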

Boundaries

^
beginning of string. If flag m is set, also match beginning of each line.
$
end of string. If flag m is set, also match end of each line.
\b
word boundary. For literal backspace, use [\b]
\B
Not word boundary.
// example of regex with boundary check

console.log ( "something".search( /thing/ ) ); // 4

// check for “thing” if it's a word by itself
console.log ( "something".search( /\bthing\b/ ) ); // -1
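Here is a small sketch of how ^ and $ behave with and without flag m, and of \B. The strings are made up for illustration.

```javascript
const lines = "one\ntwo";

// without flag m, ^ and $ match only at the start and end of the whole string
const wholeOnly = lines.match(/^\w+$/g);
console.log(wholeOnly); // null

// with flag m, ^ and $ also match at the start and end of each line
const perLine = lines.match(/^\w+$/gm);
console.log(perLine); // [ 'one', 'two' ]

// \B: match “thing” only when it does not start at a word boundary
const inside = "something".search(/\Bthing/);
const standalone = "a thing".search(/\Bthing/);
console.log(inside, standalone); // 4 -1
```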

Repetition

*
Match previous pattern 0 or more times. Same as {0,}.
?
Match previous pattern 0 or 1 time. Same as {0,1}.
+
Match previous pattern 1 or more times. Same as {1,}.
{n}
Match previous pattern exactly n times.
{n,}
Match previous pattern n or more times.
{n,m}
Match previous pattern at least n times and at most m times (inclusive).

Note: these are greedy. They match as many repetitions as possible. For the non-greedy version, add a ? after them, for example *? or +? or {n,m}?.

// example of regex repetition pattern

const str = "is 278";

// check if contains 1 or more digits
console.log ( str.search( /\d+/ ) ); // 3

// check if contains 4 consecutive digits
console.log ( str.search( /\d{4}/ ) ); // -1
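The difference between greedy and non-greedy repetition can be sketched as follows. The sample strings are assumptions for the example.

```javascript
const html = "<a><b>";

// greedy: .+ takes as much as possible
const greedy = html.match(/<.+>/)[0];
console.log(greedy); // "<a><b>"

// non-greedy: .+? takes as little as possible
const lazy = html.match(/<.+?>/)[0];
console.log(lazy); // "<a>"

// {n,m} is greedy too, {n,m}? is not
const ranged = "aaaa".match(/a{2,3}/)[0];
const rangedLazy = "aaaa".match(/a{2,3}?/)[0];
console.log(ranged, rangedLazy); // "aaa" "aa"
```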

Alternate and Conditions

x|y
Alternate. Match either x or y
x(?=y)
Match only if x is followed by y
x(?!y)
Match only if x is not followed by y
// check if the string contains “water” or “fire”

const str = "some fire";

console.log( str.search( /water|fire/ ) ); // 5
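The lookahead forms x(?=y) and x(?!y) can be sketched like this. Note that the y part is checked but is not included in the match. The sample string is made up for illustration.

```javascript
const sample = "foobar foobaz";

// foo only if followed by bar
const followed = sample.search(/foo(?=bar)/);
console.log(followed); // 0

// foo only if not followed by bar
const notFollowed = sample.search(/foo(?!bar)/);
console.log(notFollowed); // 7

// the lookahead part is not part of the matched text
const matched = sample.match(/foo(?=bar)/)[0];
console.log(matched); // "foo"
```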

Capture Group, Back Reference

()
Capture. Captured group can be later referenced by \n where n is a digit. \1 is the first captured group.
(?:…)
Group, but don't capture.
\n
Back reference to the nth captured group, where n is a digit. \1 is the first captured group.
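A short sketch of capture groups, a back reference, and a non-capturing group, using String.prototype.match. The sample strings are assumptions for the example.

```javascript
// capture groups: each (…) is saved in the match result
const date = "2024-01-15".match(/(\d+)-(\d+)-(\d+)/);
console.log(date[1], date[2], date[3]); // "2024" "01" "15"

// back reference: \1 must repeat what group 1 captured
const doubled = "hello".match(/(\w)\1/);
console.log(doubled[0], doubled[1]); // "ll" "l"

// non-capturing group: groups for repetition, saves nothing
const grouped = "abab".match(/(?:ab)+/);
console.log(grouped[0], grouped.length); // "abab" 1
```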

Example String.prototype.match

RegExp Unicode Property

\p{PropertyValue}
match a char that has property PropertyValue
\p{PropertyName=PropertyValue}
match a char whose PropertyName is PropertyValue
\p{BinaryPropertyName}
match a char that has the binary property BinaryPropertyName
\P{...}
(Note, uppercase P) Negation of \p{...}.
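A minimal sketch of Unicode property escapes. Note that \p{…} and \P{…} are only recognized when flag u is set. The property names used here (L, Script=Greek, Uppercase) are picked just for illustration.

```javascript
// \p{L}: any char whose general category is Letter
const letters = "aα1".match(/\p{L}/gu);
console.log(letters); // [ 'a', 'α' ]

// \P{L}: any char that is not a Letter
const nonLetters = "aα1".match(/\P{L}/gu);
console.log(nonLetters); // [ '1' ]

// \p{PropertyName=PropertyValue}
const greekAlpha = /\p{Script=Greek}/u.test("α");
const greekA = /\p{Script=Greek}/u.test("a");
console.log(greekAlpha, greekA); // true false

// \p{BinaryPropertyName}
const upper = /\p{Uppercase}/u.test("A");
console.log(upper); // true
```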

Example RegExp Unicode Property
