WolframLang: String pattern (regex)

By Xah Lee.

Many string functions take a string pattern as an argument. The pattern can be RegularExpression[regexStr] or StringExpression[wolframStrPatternSyntax].
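As a rough illustration of which functions accept such a pattern, here is a minimal sketch using three built-in functions. The variable names (isMatch, parts, replaced) are made up for this example.

```wolfram
(* test whether the whole string matches a regex *)
isMatch = StringMatchQ["abc123", RegularExpression["[a-z]+\\d+"]]
(* True *)

(* split a string at every run of digits *)
parts = StringSplit["a1b22c", RegularExpression["\\d+"]]
(* {"a", "b", "c"} *)

(* replace every run of digits, this time with a StringExpression-style pattern *)
replaced = StringReplace["a1b22c", StringExpression[DigitCharacter ..] -> "-"]
(* "a-b-c" *)
```

Either pattern form can be used in each of these functions.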

RegularExpression

RegularExpression[regexStr]
Represents a string pattern written in regular expression syntax. Used in functions that take a pattern.

RegularExpression example

(* match any single character *)
StringCases["😅abc", RegularExpression[ "." ] ]
(* {"😅", "a", "b", "c"} *)
(* match a word char, one or more times *)
StringCases["😅abc", RegularExpression["\\w+"]]
(* {"abc"} *)
(* \\w also matches non-ASCII letters, such as Greek *)
StringCases["πσ😅ps", RegularExpression["\\w+"]]
(* {"πσ", "ps"} *)
(* match 2 literal chars, ignoring case *)
StringCases["πσ😅ps", RegularExpression["ΠΣ"], IgnoreCase -> True]
(* {"πσ"} *)

RegularExpression Captured Groups

In the replacement string on the right side of a rule, captured groups are referred to by "$1", "$2", etc., and "$0" stands for the whole matched string.

StringCases[  "some 172 " , RegularExpression[ " (\\d+) *$" ] -> "$1" ] === {"172"}
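The same "$n" references also work with StringReplace. The following is a small sketch, with made-up variable names (swapped, wrapped), showing numbered groups and "$0":

```wolfram
(* swap two words using the numbered groups $1 and $2 *)
swapped = StringReplace["hello world",
  RegularExpression["(\\w+) (\\w+)"] -> "$2 $1"]
(* "world hello" *)

(* $0 is the whole match, here each run of digits *)
wrapped = StringReplace["a1b22",
  RegularExpression["\\d+"] -> "[$0]"]
(* "a[1]b[22]" *)
```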

StringExpression

StringExpression[wolframStrPatternSyntax]
Represents a string pattern, with syntax based on Wolfram's symbolic pattern matching syntax. Used in functions that take a pattern.

StringExpression example

(* match any single character *)
StringCases["😅abc", StringExpression[ _ ] ]
(* {"😅", "a", "b", "c"} *)
(* match a letter char, one or more times *)
StringCases["😅abc", StringExpression[LetterCharacter..]]
(* {"abc"} *)
(* LetterCharacter also matches non-ASCII letters, such as Greek *)
StringCases["πσ😅ps", StringExpression[LetterCharacter..]]
(* {"πσ", "ps"} *)
(* match 2 literal chars, ignoring case *)
StringCases["πσ😅ps", StringExpression[ "ΠΣ"], IgnoreCase -> True]
(* {"πσ"} *)
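StringExpression is usually written with the infix operator ~~, and a named sub-pattern plays the role that a captured group plays in a regex. The sketch below redoes the "some 172 " captured-group example in that style. The names sameForm, digits, and x are illustrative only.

```wolfram
(* a ~~ b ~~ c is shorthand for StringExpression[a, b, c] *)
sameForm = ("a" ~~ _ ~~ "c") === StringExpression["a", _, "c"]
(* True *)

(* name the digit run x, then use x on the right side of the rule *)
digits = StringCases["some 172 ",
  " " ~~ (x : DigitCharacter ..) ~~ (" " ...) ~~ EndOfString -> x]
(* {"172"} *)
```

The parentheses around the named part are not strictly needed, but they make the grouping explicit.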

RegularExpression vs StringExpression

In terms of syntax, RegularExpression is more widely understood, because regex is used in many languages. StringExpression is more readable.
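To make the comparison concrete, here is a minimal sketch that writes one pattern both ways, and then mixes the two by placing a RegularExpression inside a StringExpression. The sample string and variable names (text, viaRegex, viaStrExpr, mixed) are made up for illustration.

```wolfram
text = "id=42, id=7";

(* regex form, with a captured group *)
viaRegex = StringCases[text, RegularExpression["id=(\\d+)"] -> "$1"]
(* {"42", "7"} *)

(* StringExpression form, with a named pattern *)
viaStrExpr = StringCases[text, "id=" ~~ (n : DigitCharacter ..) -> n]
(* {"42", "7"} *)

(* a RegularExpression used as one piece of a StringExpression *)
mixed = StringCases[text, "id=" ~~ RegularExpression["\\d+"]]
(* {"id=42", "id=7"} *)
```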

They have almost the same power, except:

[Screenshot: WolframLang string pattern 2022-04-29]

For a tutorial, see https://reference.wolfram.com/language/tutorial/WorkingWithStringPatterns.html
