JS: RegExp Syntax

By Xah Lee.

RegExp Flags

Regex flags are used in:

〔►see JS: RegExp Constructor〕

A regex flag changes the meaning of the pattern, or the behavior of the regex function that uses it.

〔►see JS: String Methods for RegExp〕

Regex Flags
g  “global”. Find all matches. Don't stop after the first one found.
i  Ignore case.
m  “multiline”. Make the RegExp syntax ^ and $ also match at the beginning/end of each line (not just the beginning/end of the whole string).
u  (ES2015) “unicode”. Treat the string as a sequence of Unicode characters (code points). If your string contains a character whose code point is ≥ 2^16, you should set this flag. 〔►see JS: String Code Unit vs Code Point〕
y  (ES2015) “sticky”. The match must start exactly at the index given by the regex object's lastIndex property, with no searching ahead. The pattern ^ still means beginning of string (or beginning of line, if flag m is set).
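
The following is a small sketch of how each flag changes a match. The variable names and sample strings are made up for illustration, they are not from any library.

```javascript
// g: find every match instead of stopping at the first one
const allDigits = "a1b2c3".match(/\d/g); // ["1", "2", "3"]

// i: ignore case
const ignoreCase = /abc/i.test("xABCx"); // true

// m: ^ and $ also match at the beginning/end of each line
const lineStarts = "one\ntwo".match(/^\w+/gm); // ["one", "two"]

// u: the dot matches a whole code point, even one ≥ 2^16
const astral = "\u{1F600}"; // one character, stored as 2 code units
const withoutU = /^.$/.test(astral); // false
const withU = /^.$/u.test(astral); // true

// y: the match must start exactly at lastIndex
const sticky = /b/y;
sticky.lastIndex = 1;
const atIndex1 = sticky.test("abc"); // true, "b" is at index 1
sticky.lastIndex = 0;
const atIndex0 = sticky.test("abc"); // false, index 0 is "a"
```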

Special Escapes for Literal Characters

RegExp Special Escapes
\0  the NUL character (ASCII 0)
\t  horizontal tab (common tab char)
\n  line feed (unix newline char)
\v  vertical tab (rarely used)
\f  form feed (often used in emacs as code section break)
\r  carriage return (used in Mac OS Classic as newline)
\xxx  an ASCII character with hex code xx. For example, /\x61/ matches the letter “a” (ASCII code 97, hex 61)
\uxxxx  a Unicode character with hex code xxxx. It must be 4 digits. Pad with 0 in front if shorter. For example, /\u03b1/ matches “α” (codepoint 945, hex 3b1)
\cX  an ASCII control character. For example, /\cJ/ matches the unix newline \n.
[\b]  a backspace.

〔►see ASCII Character Symbols ␀ ␣ ¶〕

// example of RegExp matching unicode by hex codepoint
var xx = "alpha α";
console.log(xx.search(/\u03b1/)); // 6
console.log(xx.search(/α/));    // 6. literal Unicode is ok too.
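
A few more of the escapes, as a quick sketch. The variable names are only for illustration.

```javascript
// each escape in the pattern matches one literal character
const hexEscape = /\x61/.test("a"); // true, hex 61 is "a"
const unicodeEscape = /\u03b1/.test("α"); // true
const controlEscape = /\cJ/.test("\n"); // true, control-J is line feed
const tabEscape = /\t/.test("a\tb"); // true
const nulEscape = /\0/.test("\0"); // true

// inside brackets, \b means backspace (U+0008), not word boundary
const backspace = /[\b]/.test("\u0008"); // true
```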

Character Sets, Character Classes

RegExp Character Classes
[…]  any character between the brackets.
[^…]  any char that's not one of the characters in the brackets.
.  any char, except newline characters: {\n, \r, \u2028, \u2029}.
\w  any ASCII letter (upper or lower), digit, or underscore. Same as [A-Za-z0-9_].
\W  any character that is not \w.
\d  any ASCII digit 0 to 9.
\D  any character that's not \d.
\s  any Unicode whitespace character.
\S  any character that is not \s.
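
Here is a short sketch of the character classes in use. The sample strings and variable names are made up for illustration.

```javascript
// […] one of the listed characters
const vowels = "regular expression".match(/[aeiou]/g).join(""); // "euaeeio"

// [^…] anything not listed
const onlyDigits = "a1b2".replace(/[^0-9]/g, ""); // "12"

// . does not match a newline
const dotLetter = /a.c/.test("abc"); // true
const dotNewline = /a.c/.test("a\nc"); // false

// \w covers letters, digits, and underscore, but not hyphen
const words = "foo_bar-baz".match(/\w+/g); // ["foo_bar", "baz"]

// \d digits, \s whitespace
const numbers = "tel 555 1234".match(/\d+/g); // ["555", "1234"]
const parts = "a \t b\nc".split(/\s+/); // ["a", "b", "c"]
```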


RegExp Anchors and Boundaries
^  beginning of string. If flag m is set, also match beginning of lines.
$  end of string. If flag m is set, also match end of lines.
\b  word boundary. For literal backspace, use [\b]
\B  not a word boundary.
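
A small sketch of the anchors and word boundary, with and without flag m. The names and strings are only for illustration.

```javascript
const lines = "apple\nbanana";

// without m, ^ and $ match only at the ends of the whole string
const startOfString = lines.match(/^\w+/g); // ["apple"]
const endOfString = lines.match(/\w+$/g); // ["banana"]

// with m, they also match at each line break
const startOfLines = lines.match(/^\w+/gm); // ["apple", "banana"]
const endOfLines = lines.match(/\w+$/gm); // ["apple", "banana"]

// \b is a position between a word char and a non-word char
const wholeWord = /\bcat\b/.test("a cat sat"); // true
const insideWord = /\bcat\b/.test("concatenate"); // false
const notBoundary = /\Bcat\B/.test("concatenate"); // true
```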


RegExp Repetition Syntax
*  Match previous pattern 0 or more times. Same as {0,}.
?  Match previous pattern 0 or 1 time. Same as {0,1}.
+  Match previous pattern 1 or more times. Same as {1,}.
{n}  Match previous pattern exactly n times.
{n,}  Match previous pattern n or more times.
{n,m}  Match previous pattern at least n times and at most m times (inclusive).

Note: these are greedy. They match as many repetitions as possible. For the non-greedy version, add a ? after them, for example *? or +? or {n,m}?.
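
The difference between greedy and non-greedy shows up clearly on a small sample. This is just a sketch, with made-up variable names.

```javascript
const html = "<b>bold</b>";

// greedy: .+ takes as much as possible
const greedy = html.match(/<.+>/)[0]; // "<b>bold</b>"

// non-greedy: .+? takes as little as possible
const lazy = html.match(/<.+?>/)[0]; // "<b>"

// counted repetition
const exactly3 = /^\d{3}$/.test("123"); // true
const twoToFour = "aaaaa".match(/a{2,4}/)[0]; // "aaaa"
const twoToFourLazy = "aaaaa".match(/a{2,4}?/)[0]; // "aa"

// ? makes the previous pattern optional
const optional = /^colou?r$/.test("color"); // true
```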

Alternate and Conditions

RegExp Alternation and Lookahead
x|y  Alternate. Match either x or y
x(?=y)  Match x only if x is followed by y. The y is not part of the match.
x(?!y)  Match x only if x is not followed by y
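
A sketch of alternation and the two lookahead forms. The sample string and variable names are made up for illustration.

```javascript
// x|y matches either side
const either = "I like cats and dogs".match(/cat|dog/g); // ["cat", "dog"]

const s = "foobar foobaz";

// foo followed by bar: found at index 0
const fooBeforeBar = s.search(/foo(?=bar)/); // 0

// foo not followed by bar: found at index 7
const fooNotBeforeBar = s.search(/foo(?!bar)/); // 7

// the lookahead part is checked but not included in the match
const matched = s.match(/foo(?=bar)/)[0]; // "foo"
```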

Capture Grouping, Back Reference

RegExp Capture and Back Reference
(…)  Capture. A captured group can later be referenced by \n where n is a digit. \1 is the first captured group.
(?:…)  Group, but don't capture.
\n  The nth captured group before (n is a digit). \1 is the first captured group.
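
Here is a short sketch of capture groups, non-capturing groups, and back reference. The date string and variable names are only for illustration.

```javascript
// each (…) is captured and shows up in the match result
const m = "2016-07-04".match(/(\d{4})-(\d{2})-(\d{2})/);
const year = m[1]; // "2016"
const day = m[3]; // "04"

// \1 refers back to whatever the first group matched
const hasDouble = /(\w)\1/.test("hello"); // true, matches "ll"
const noDouble = /(\w)\1/.test("world"); // false

// (?:…) groups without capturing, so the result has no extra entry
const captured = "abab".match(/(ab)+/).length; // 2
const notCaptured = "abab".match(/(?:ab)+/).length; // 1
```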


ECMAScript® 2016 Language Specification#sec-patterns

RegExp Topic

  1. JS: RegExp Tutorial
  2. JS: String Methods for RegExp
  3. JS: RegExp Object
  4. JS: RegExp Constructor
  5. JS: RegExp.prototype
  6. JS: RegExp Syntax