JavaScript: RegExp Syntax

By Xah Lee.

RegExp Flags

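Regex flags are used in regex literals of the form /pattern/flags and in the flags argument of the RegExp constructor.

〔►see JavaScript: RegExp Constructor〕

A regex flag changes the meaning of the pattern, or the behavior of the function that uses the regex.

〔►see JavaScript: String Methods for RegExp〕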

Regex Flags
flag    meaning
g       “global”. Find all matches. Don't stop after the first one found.
i       Ignore case.
m       “multiline”. Make ^ and $ also match at the beginning and end of each line, not just at the beginning and end of the whole string.
u       (ES2015) “unicode”. Treat the string as a sequence of Unicode code points. If your string contains a character whose code point is ≥ 2^16, you should set this flag. 〔►see JavaScript: String is 16-Bit Unit Sequence〕
y       (ES2015) “sticky”. The match must start at the index given by the regex object's lastIndex property. The pattern ^ still matches only at the beginning of the string, or of a line when flag m is set.
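
Here is a short sketch of each flag in action. The sample strings and variable names are made up for illustration.

```javascript
// g: find every match, not just the first
const allDigits = "a1b2c3".match(/\d/g);
console.log(allDigits); // [ '1', '2', '3' ]

// i: ignore case
const ignoreCase = /hello/i.test("HELLO");
console.log(ignoreCase); // true

// m: ^ and $ also match at the beginning and end of each line
const lineStarts = "one\ntwo".match(/^\w+/gm);
console.log(lineStarts); // [ 'one', 'two' ]

// u: match by code point. U+1F600 is stored as 2 UTF-16 code units,
// so without u the dot matches only half of it.
const face = "\u{1F600}";
const dotNoU = /^.$/.test(face);
const dotWithU = /^.$/u.test(face);
console.log(dotNoU, dotWithU); // false true

// y: the match must start exactly at lastIndex
const stickyB = /b/y;
stickyB.lastIndex = 1;
const stickyAt1 = stickyB.test("abc"); // "b" is at index 1
stickyB.lastIndex = 0;
const stickyAt0 = stickyB.test("abc"); // "a" is at index 0
console.log(stickyAt1, stickyAt0); // true false
```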

Special Escapes for Literal Characters

RegExp Special Escapes
\0        the NUL character (ASCII 0)
\t        horizontal tab (the common tab char)
\n        line feed (unix newline char)
\v        vertical tab (rarely used)
\f        form feed (often used in emacs as code section break)
\r        carriage return (used in Mac OS Classic as newline)
\xxx      the character with hex code xx. It must be 2 hex digits. For example, /\x61/ matches the letter “a” (ASCII code 97, hex 61)
\uxxxx    the Unicode character with hex code xxxx. It must be 4 hex digits. Add 0 in front if the code is shorter. For example, /\u03b1/ matches “α” (codepoint 945, hex 3b1)
\cX       an ASCII control character, where X is a letter. For example, /\cJ/ matches the unix newline \n.
[\b]      a backspace.

〔►see ASCII Character Symbols ␀ ␣ ¶〕

// example of RegExp matching unicode by hex codepoint
var xx = "alpha α";
console.log(xx.search(/\u03b1/)); // 6
console.log(xx.search(/α/));    // 6. literal Unicode is ok too.
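
The other escapes can be checked the same way. This is a minimal sketch; the variable names are only for illustration.

```javascript
const hexA = /\x61/.test("a");        // hex 61 is "a"
const ctrlJ = /\cJ/.test("\n");       // Ctrl-J is line feed
const tab = /\t/.test("a\tb");
const backspace = /[\b]/.test("\b");  // "\b" in a string literal is U+0008
const nul = /\0/.test("\0");
console.log(hexA, ctrlJ, tab, backspace, nul); // true true true true true

// outside of brackets, \b means word boundary, not backspace
const notBackspace = /\b/.test("\b");
console.log(notBackspace); // false
```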

Character Sets, Character Classes

RegExp Character Classes
syntax    meaning
[…]       any one character listed between the brackets.
[^…]      any one character that is not listed in the brackets.
.         any char, except newline characters: {\n, \r, \u2028, \u2029}.
\w        any ASCII letter (upper or lower), digit, or underscore. Same as [A-Za-z0-9_].
\W        any character that is not \w.
\d        any ASCII digit 0 to 9.
\D        any character that's not \d.
\s        any Unicode whitespace character, including newline characters.
\S        any character that is not \s.
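
A quick sketch of the character classes on a made-up sample string:

```javascript
const sample = "Tel: 555-0199, room_7";

const vowels = sample.match(/[aeiou]/g);
console.log(vowels); // [ 'e', 'o', 'o' ]

const digitRuns = sample.match(/\d+/g);
console.log(digitRuns); // [ '555', '0199', '7' ]

// \w includes the underscore
const wordRuns = sample.match(/\w+/g);
console.log(wordRuns); // [ 'Tel', '555', '0199', 'room_7' ]

// neither word char nor whitespace
const punct = sample.match(/[^\w\s]/g);
console.log(punct); // [ ':', '-', ',' ]

// the dot does not match a newline, but \s does
const dotNewline = /./.test("\n");
const spaceNewline = /\s/.test("\n");
console.log(dotNewline, spaceNewline); // false true

// \w is ASCII only
const greekWord = /\w/.test("α");
console.log(greekWord); // false
```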

Boundaries

RegExp Boundaries
syntax    meaning
^         beginning of string. If flag m is set, also match the beginning of each line.
$         end of string. If flag m is set, also match the end of each line.
\b        word boundary. For literal backspace, use [\b]
\B        not a word boundary.
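
A short sketch showing how the multiline flag changes ^ and $, and how \b and \B differ. The sample text is made up.

```javascript
const lines = "cat\ncatalog\nbobcat";

// without m, ^ matches only at the beginning of the whole string
const startNoM = lines.match(/^cat/g);
console.log(startNoM); // [ 'cat' ]

// with m, ^ matches at the beginning of each line
const startWithM = lines.match(/^cat/gm);
console.log(startWithM.length); // 2  ("cat" and "catalog")

// with m, $ matches at the end of each line
const endWithM = lines.match(/cat$/gm);
console.log(endWithM.length); // 2  ("cat" and "bobcat")

// \b on both sides: "cat" as a whole word only
const wholeWord = lines.match(/\bcat\b/g);
console.log(wholeWord.length); // 1

// \B: "cat" that does not start at a word boundary ("bobcat")
const insideWord = lines.match(/\Bcat/g);
console.log(insideWord.length); // 1
```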

Repetition

RegExp Repetition Syntax
Syntax    Meaning
*         Match previous pattern 0 or more times. Same as {0,}.
?         Match previous pattern 0 or 1 time. Same as {0,1}.
+         Match previous pattern 1 or more times. Same as {1,}.
{n}       Match previous pattern exactly n times.
{n,}      Match previous pattern n or more times.
{n,m}     Match previous pattern at least n times and at most m times (inclusive).

Note: these are greedy. They match as many repetitions as possible. For the non-greedy version, which matches as few as possible, add a ? after them. For example, *? or {n,m}?
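
A minimal sketch of greedy versus non-greedy repetition, on made-up strings:

```javascript
const tags = "<b>one</b><b>two</b>";

// greedy: .+ runs to the last ">" in the string
const greedy = tags.match(/<.+>/)[0];
console.log(greedy); // <b>one</b><b>two</b>

// non-greedy: .+? stops at the first ">"
const lazy = tags.match(/<.+?>/)[0];
console.log(lazy); // <b>

// {n,m} takes as many as allowed, {n,m}? as few as allowed
const upTo3 = "aaaa".match(/a{2,3}/)[0];
const atLeast2 = "aaaa".match(/a{2,3}?/)[0];
console.log(upTo3, atLeast2); // aaa aa

// ? makes the previous item optional
const color = /^colou?r$/.test("color");
const colour = /^colou?r$/.test("colour");
console.log(color, colour); // true true
```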

Alternate and Conditions

RegExp Alternation and Lookahead Syntax
Syntax    Meaning
x|y       Alternate. Match either x or y
x(?=y)    Match x only if x is followed by y. The y is not part of the match.
x(?!y)    Match x only if x is not followed by y
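
A short sketch of alternation and lookahead. The strings are made up. Note that the lookahead text is checked but not included in the match result.

```javascript
const pets = "I have a cat and a dog".match(/cat|dog/g);
console.log(pets); // [ 'cat', 'dog' ]

const sizes = "100px 200em 300px";

// numbers followed by "px"; the "px" itself is not captured
const pxNumbers = sizes.match(/\d+(?=px)/g);
console.log(pxNumbers); // [ '100', '300' ]

// numbers not followed by "px". The \d inside the lookahead stops
// the regex from backtracking and matching just "10" out of "100px".
const otherNumbers = sizes.match(/\d+(?!\d|px)/g);
console.log(otherNumbers); // [ '200' ]

// q not followed by u
const qInIraq = /q(?!u)/.test("Iraq");
const qInQueen = /q(?!u)/.test("queen");
console.log(qInIraq, qInQueen); // true false
```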

Capture Grouping, Back Reference

RegExp Grouping Syntax
Syntax    Meaning
(…)       Capture. A captured group can be referenced later in the pattern by \n where n is a digit. \1 is the first captured group.
(?:…)     Group, but don't capture.
\n        Match the same text as the nth captured group before it. \1 is the first captured group.
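
A minimal sketch of capture groups, a non-capturing group, and a back reference. The sample strings and variable names are made up for illustration.

```javascript
// captured groups show up at index 1, 2, 3 of the match result
const dateMatch = "2016-07-04".match(/(\d{4})-(\d{2})-(\d{2})/);
console.log(dateMatch[1], dateMatch[2], dateMatch[3]); // 2016 07 04

// (?:…) groups for repetition but captures nothing
const grouped = "abcabc".match(/(?:abc)+/);
console.log(grouped[0], grouped.length); // abcabc 1

// \1 matches the same text the first group matched: a doubled word
const doubled = "this is is a test".match(/\b(\w+) \1\b/);
console.log(doubled[0]); // is is
console.log(doubled[1]); // is
```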

Reference

ECMAScript® 2016 Language Specification#sec-patterns

RegExp Topic

  1. JavaScript: RegExp Tutorial
  2. JavaScript: String Methods for RegExp
  3. JavaScript: RegExp Object
  4. JavaScript: RegExp Constructor
  5. JavaScript: RegExp.prototype
  6. JavaScript: RegExp Syntax