JS: Regex Unicode Property
(new in ECMAScript 2018)
Each Unicode character has many properties. For example, whether it is a capital letter, Latin letter, punctuation, math symbol ( ± ∑ ∫ ), emoji, Chinese character, etc.
JavaScript regex lets you match characters by their Unicode properties.
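Property escapes are only recognized when the regex has the u flag. The following is a minimal sketch of the difference, using sample strings chosen for illustration (they are not from the original page):

```javascript
// With the u flag, \p{L} is a Unicode property escape (any letter).
const withFlag = /\p{L}/u;
console.log(withFlag.test("ж")); // true

// Without the u flag, \p is just an escaped letter "p",
// so the pattern matches the literal text "p{L}" instead.
const withoutFlag = /\p{L}/;
console.log(withoutFlag.test("ж")); // false
console.log(withoutFlag.test("p{L}")); // true
```

All examples on this page pass the u flag for this reason.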
Syntax:
\p{PropertyValue}
match a char that has property PropertyValue
// get emoji chars
console.log("i ♥ 😸".match(RegExp("\\p{Emoji_Presentation}", "gu")));
// [ "😸" ]

\p{PropertyName=PropertyValue}
match a char whose PropertyName is PropertyValue
// get Latin chars
console.log("abc ♥ 😸. $_$".match(RegExp("\\p{Script_Extensions=Latin}", "gu")));
// [ "a", "b", "c" ]

\p{BinaryPropertyName}
// note: P, Sc, and L are General_Category values, written in the short \p{Value} form

// get punctuation chars
console.log("i ♥ 😸. $_$".match(RegExp("\\p{P}", "gu")));
// [ ".", "_" ]

// get currency symbols
console.log("i ♥ 😸. $_$".match(/\p{Sc}/gu));
// [ "$", "$" ]

// get characters that are Unicode letters
console.log("ä α ж の ♥ ⠮ 🦋 ∑ ° ⊕ +".match(/\p{L}/gu));
// [ "ä", "α", "ж", "の" ]

\P{x}
(Note, uppercase P) Negation of \p{x}. It matches any char that \p{x} does not match.
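As a small sketch of the negated form, the example below splits a string into letters and non-letters. The sample text and variable names are made up for illustration.

```javascript
// Sample text chosen for illustration only.
const text = "ab1, cd2!";

// \p{L} matches each letter.
const letters = text.match(/\p{L}/gu);
console.log(letters); // [ "a", "b", "c", "d" ]

// \P{L} (uppercase P) matches each char that is not a letter,
// including digits, punctuation, and spaces.
const nonLetters = text.match(/\P{L}/gu);
console.log(nonLetters); // [ "1", ",", " ", "2", "!" ]
```

Every char in the string lands in exactly one of the two results, since \P{L} is the complement of \p{L}.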