JS: RegExp Unicode Property

By Xah Lee.

New in JS2018

Each Unicode character has many properties. For example: whether it is an uppercase letter, a Latin letter, a non-letter, a punctuation mark, a math symbol, an emoji, a Chinese character, etc.

In JavaScript regex, you can match characters by their Unicode properties. The regex must have the u flag.

Syntax:

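\p{PropertyValue}
Match a char whose General_Category is PropertyValue. This shorthand works only for General_Category values, for example \p{L} or \p{Lu}.
\p{PropertyName=PropertyValue}
Match a char whose PropertyName is PropertyValue. For example, \p{Script=Greek}.
\p{BinaryPropertyName}
Match a char that has the binary property BinaryPropertyName. For example, \p{Alphabetic}.
\P{...}
(Note, uppercase P) Negation of \p{...}.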
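
The following is a minimal sketch of each form in the list above. The helper names (isUpper, isGreek, isAlphabetic, isNonLetter) are made up for illustration and are not part of any API. The last two lines show what happens when the u flag is left off.

```javascript
// Shorthand form: a General_Category value alone. Lu = uppercase letter.
const isUpper = (ch) => /^\p{Lu}$/u.test(ch);

// PropertyName=PropertyValue form.
const isGreek = (ch) => /^\p{Script=Greek}$/u.test(ch);

// Binary property form.
const isAlphabetic = (ch) => /^\p{Alphabetic}$/u.test(ch);

// Negation with uppercase P: any char that is not a letter.
const isNonLetter = (ch) => /^\P{L}$/u.test(ch);

console.log(isUpper("A"), isUpper("a")); // true false
console.log(isGreek("π"), isGreek("p")); // true false
console.log(isAlphabetic("x"), isAlphabetic("!")); // true false
console.log(isNonLetter("7"), isNonLetter("z")); // true false

// Without the u flag, \p is just an escaped letter p,
// so the pattern matches the literal text "p{L}".
console.log(/\p{L}/.test("a")); // false
console.log(/\p{L}/.test("p{L}")); // true
```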

Get emoji:

console.log(
"i ♥ 😸. $_$".match(/\p{Emoji_Presentation}/gu)
);
// [ '😸' ]
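
The heart is not matched above because it is displayed as text by default. As a small sketch of the difference, the broader Emoji property can be compared with Emoji_Presentation on the same string. Note that Emoji also covers ASCII digits, # and *, so it is often too broad on its own. The variable names here are just for illustration.

```javascript
const text = "i ♥ 😸. $_$";

// Emoji_Presentation: chars shown as emoji by default.
const emojiPresentation = text.match(/\p{Emoji_Presentation}/gu);
console.log(emojiPresentation); // [ '😸' ]

// Emoji: any char that can be shown as emoji, including ♥.
const emoji = text.match(/\p{Emoji}/gu);
console.log(emoji); // [ '♥', '😸' ]

// Emoji also includes ASCII digits.
const digits = "room 42".match(/\p{Emoji}/gu);
console.log(digits); // [ '4', '2' ]
```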

Get Latin characters:

console.log(
"i ♥ 😸. $_$".match( /\p{Script_Extensions=Latin}+/gu )
);
// [ 'i' ]
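
Script and Script_Extensions are different properties. Script gives each char exactly one script, while Script_Extensions also matches chars shared between a few scripts. A short sketch, using the prolonged sound mark ー (U+30FC) as the sample char. It is used in both Hiragana and Katakana, and its Script value is Common. The short aliases sc and scx also work.

```javascript
const mark = "\u30FC"; // ー KATAKANA-HIRAGANA PROLONGED SOUND MARK

// Script: the char's single script is Common, so this fails.
const byScript = /\p{Script=Hiragana}/u.test(mark);
console.log(byScript); // false

// Script_Extensions: the char is used in Hiragana text, so this matches.
const byScriptExt = /\p{Script_Extensions=Hiragana}/u.test(mark);
console.log(byScriptExt); // true

// Short aliases: scx for Script_Extensions, Latn for Latin.
const latinShort = "i ♥ 😸. $_$".match(/\p{scx=Latn}+/gu);
console.log(latinShort); // [ 'i' ]
```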

Get letters:

// get letters
console.log(
"∑Σπα".match( /\p{L}/gu )
);
// [ 'Σ', 'π', 'α' ]
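
L is a General_Category group. Its subcategories, such as Lu (uppercase letter) and Ll (lowercase letter), can be used the same way. A small sketch with made-up sample strings, also showing \p{L}+ to pull out words in any script:

```javascript
// Uppercase letters only.
const upper = "∑Σπα".match(/\p{Lu}/gu);
console.log(upper); // [ 'Σ' ]

// Lowercase letters only.
const lower = "∑Σπα".match(/\p{Ll}/gu);
console.log(lower); // [ 'π', 'α' ]

// Runs of letters, regardless of script. Digits are not letters.
const words = "Hello Привет 你好 123".match(/\p{L}+/gu);
console.log(words); // [ 'Hello', 'Привет', '你好' ]
```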

Get punctuation:

// punctuations
console.log(
"i ♥ 😸. $_$".match( /\p{P}/gu )
);
// [ '.', '_' ]
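
The same pattern works with String.prototype.replace. A minimal sketch that strips punctuation from a string. The function name stripPunctuation is hypothetical. Note that $ is a currency symbol, not punctuation, so it stays.

```javascript
// Remove every char in General_Category P (punctuation).
const stripPunctuation = (str) => str.replace(/\p{P}/gu, "");

console.log(stripPunctuation("Hello, world! (test)")); // Hello world test
console.log(stripPunctuation("i ♥ 😸. $_$")); // i ♥ 😸 $$
```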

Get currency symbols:

// currency symbols
console.log(
"i ♥ 😸. $_$".match( /\p{Sc}/gu )
);
// [ '$', '$' ]
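
Sc covers currency symbols beyond the dollar sign. As a sketch, with a made-up price string, it can be combined with \d+ to pull out simple amounts:

```javascript
const prices = "€100 ¥200 $3 £4";

// Each currency symbol on its own.
const symbols = prices.match(/\p{Sc}/gu);
console.log(symbols); // [ '€', '¥', '$', '£' ]

// A currency symbol followed by digits.
const amounts = prices.match(/\p{Sc}\d+/gu);
console.log(amounts); // [ '€100', '¥200', '$3', '£4' ]
```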

See also: Unicode Escape Sequence
