Python Regex Flags

By Xah Lee.

Many Python regex functions and regex methods take an optional argument called “flags”. The flags modify the meaning of the given regex pattern.

The flags can be any of:

Summary of Regex Flags
syntax  long syntax     meaning
re.I    re.IGNORECASE   ignore case.
re.M    re.MULTILINE    make begin/end {^, $} consider each line.
re.S    re.DOTALL       make . match newline too.
re.U    re.UNICODE      make {\w, \W, \b, \B} follow Unicode rules.
re.L    re.LOCALE       make {\w, \W, \b, \B} follow locale.
re.X    re.VERBOSE      allow comment in regex.

To specify more than one of them, use the | operator to combine them. For example, re.search(pattern, string, flags=re.IGNORECASE|re.MULTILINE|re.UNICODE).
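
As a minimal sketch of combining flags, the following uses Python 3 syntax (an assumption, since the examples on this page are written for Python 2). The sample text and the variable names are made up for illustration.

```python
import re

text = "First line\nsecond LINE\nthird line"

# Combine two flags with the bitwise OR operator.
flags = re.IGNORECASE | re.MULTILINE

# ^ and $ anchor at each line (MULTILINE); "line" matches in any case (IGNORECASE).
combined = re.findall(r"^\w+ line$", text, flags=flags)
print(combined)  # ['First line', 'second LINE', 'third line']

# With only re.MULTILINE, the uppercase "LINE" no longer matches.
multiline_only = re.findall(r"^\w+ line$", text, flags=re.MULTILINE)
print(multiline_only)  # ['First line', 'third line']
```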

「re.IGNORECASE」 or 「re.I」

Indicates case-insensitive matching.
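
A short illustration of the difference, sketched in Python 3 syntax (an assumption, the other examples on this page use Python 2). The sample string and variable names are hypothetical.

```python
import re

text = "Python python PYTHON"

# Without the flag, only the exact lowercase spelling matches.
case_sensitive = re.findall(r"python", text)

# With re.IGNORECASE, every capitalization matches.
case_insensitive = re.findall(r"python", text, flags=re.IGNORECASE)

print(case_sensitive)    # ['python']
print(case_insensitive)  # ['Python', 'python', 'PYTHON']
```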

「re.MULTILINE」 or 「re.M」

When specified, the pattern character ^ matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character $ matches at the end of the string and at the end of each line (immediately preceding each newline).

Without this flag, ^ matches only at the beginning of the string, and $ matches only at the end of the string (or just before a newline that ends the string). [see Python Regex Syntax]

# -*- coding: utf-8 -*-
# python 2

# example of regex flag re.MULTILINE

import re

ss = """abc
def
ghi"""

r1 = re.findall(r"^\w", ss)
r2 = re.findall(r"^\w", ss, flags = re.MULTILINE)

print r1    # ['a']
print r2    # ['a', 'd', 'g']
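
The example above shows the effect on ^. The same idea applies to $, which can be sketched as follows in Python 3 syntax (an assumption, the example above is Python 2). The variable names are made up for illustration.

```python
import re

ss = """abc
def
ghi"""

# Without the flag, $ matches only at the end of the whole string.
end_default = re.findall(r"\w$", ss)

# With re.MULTILINE, $ also matches just before each newline.
end_multiline = re.findall(r"\w$", ss, flags=re.MULTILINE)

print(end_default)    # ['i']
print(end_multiline)  # ['c', 'f', 'i']
```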

「re.DOTALL」 or 「re.S」

Make the dot character . match any character, including a newline. Without this flag, a dot will match anything except a newline.

# -*- coding: utf-8 -*-
# python 2

# example of regex flag re.DOTALL

import re

ss = """once upon a time,
there lived a king"""

r1 = re.findall(r".+", ss)
r2 = re.findall(r".+", ss, re.DOTALL)

print r1    # ['once upon a time,', 'there lived a king']

print r2    # ['once upon a time,\nthere lived a king']
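
One place this flag tends to matter is capturing text that spans several lines between two delimiters. A minimal sketch in Python 3 syntax (an assumption, the example above is Python 2), with a made-up input string:

```python
import re

html = "<b>one\ntwo</b>"

# Without re.DOTALL, .* cannot cross the newline, so nothing matches.
without_flag = re.search(r"<b>(.*)</b>", html)

# With re.DOTALL, .* crosses the newline and the whole content is captured.
with_flag = re.search(r"<b>(.*)</b>", html, flags=re.DOTALL)

print(without_flag)              # None
print(repr(with_flag.group(1)))  # 'one\ntwo'
```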

「re.UNICODE」 or 「re.U」

Make the pattern characters {\w, \W, \b, \B} dependent on the Unicode character properties database.

# -*- coding: utf-8 -*-
# python 2

# example of regex re.UNICODE flag

import re

x1 = re.search(r"\w+", u"♥αβγ!", re.U)
x2 = re.search(r"\w+", u"♥αβγ!")

if x1:
    print x1.group().encode("utf8") # → 「αβγ」
else:
    print "no match"

print x2    # → 「None」

Note that the pattern string itself can contain Unicode characters. Just be sure to add the Unicode prefix u to the pattern string.

# -*- coding: utf-8 -*-
# python 2

import re

result = re.findall(ur"β", u"αβγ", re.U)
print result[0].encode("utf8")  # prints β
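
The behavior shown above is for Python 2. In Python 3, a str pattern follows Unicode rules by default, so re.UNICODE is redundant there. The sketch below assumes Python 3 and uses the re.ASCII flag, which is not covered on this page, to get the ASCII-only behavior back. The variable names are made up for illustration.

```python
import re

text = "♥αβγ!"

# In Python 3, \w in a str pattern matches Unicode word characters by default.
default_match = re.search(r"\w+", text)

# re.ASCII restricts \w to [a-zA-Z0-9_], so nothing in this text qualifies.
ascii_match = re.search(r"\w+", text, flags=re.ASCII)

print(default_match.group())  # αβγ
print(ascii_match)            # None
```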

「re.LOCALE」 or 「re.L」

Make the word patterns {\w, \W} and boundary patterns {\b, \B} dependent on the current locale. [see Python Regex Syntax]

「re.VERBOSE」 or 「re.X」

This flag changes the regex syntax so that you can add annotations to a regex. Whitespace within the pattern is ignored, except when it is in a character class or preceded by an unescaped backslash. When a line contains a # that is neither in a character class nor preceded by an unescaped backslash, all characters from the leftmost such # through the end of the line are ignored.

# -*- coding: utf-8 -*-
# python 2

import re

# example of the regex re.VERBOSE flag

# matching a decimal number
p1 = re.compile(r"""\d +  # the integral part
                   \.    # the decimal point
                   \d *  # some fractional digits""", re.X)

p2 = re.compile(r"\d+\.\d*")    # pattern p2 is same as p1

r1 = re.findall(p1, u"a3.45")
r2 = re.findall(p2, u"a3.45")

print r1[0].encode("utf8")  # 3.45
print r2[0].encode("utf8")  # 3.45
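
Because verbose mode drops plain whitespace and treats # as the start of a comment, a literal space or a literal # has to be protected. A minimal sketch in Python 3 syntax (an assumption, the example above is Python 2), with a made-up pattern and input:

```python
import re

# In verbose mode, write a literal space as "[ ]" (or backslash-space),
# and escape a literal "#" or place it in a character class.
pattern = re.compile(r"""
    \w+   # a word
    [ ]   # a literal space, kept because it is inside a character class
    \#    # a literal hash sign, kept because it is escaped
    \d+   # one or more digits
""", re.VERBOSE)

found = pattern.findall("see issue #42 and item #7")
print(found)  # ['issue #42', 'item #7']
```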
