Python: Regex

By Xah Lee.

What is Regex

A Regular Expression (aka regex) is a character sequence that represents a text pattern.

For example, you can use it to find all email addresses in a file by matching the email address pattern.

Regex is used by many functions to check if a string contains a certain pattern, to extract the matching text, or to replace it with another string.
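
As a rough sketch of those three uses (check, extract, replace), the snippet below calls re.search, re.findall, and re.sub from Python's standard re module. The sample text and the simplified email pattern are assumptions made up for illustration; real email addresses are more varied than this pattern allows.

```python
import re

# sample text and a deliberately simple email pattern (illustrative assumptions)
sample = "write to ann@example.com or bob@example.org today"
email_pattern = r"[\w.]+@[\w.]+\.\w+"

# check: is there any match at all?
has_email = re.search(email_pattern, sample) is not None

# extract: list every match
found = re.findall(email_pattern, sample)

# replace: substitute every match with a fixed string
hidden = re.sub(email_pattern, "<email>", sample)

print(has_email)  # True
print(found)      # ['ann@example.com', 'bob@example.org']
print(hidden)     # write to <email> or <email> today
```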

Check If String Match

To use regex in Python, first you need to import re.

To check if a pattern is in a string, use:

re.search(pattern, str, flags)
If the pattern matches (part or the whole of a string), a Match Object is returned. Otherwise, None is returned. (A Match Object evaluates to True.) [see Python: Regex Match Object]

For regex flags, see: Python: Regex Flags.

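# regex matching an email address

import re

# someone@example.com is a placeholder address for illustration
text = "this someone@example.com that"
xx = re.search(r" (\w+@\w+\.com) ", text)

if xx:
    # group(1) is the text captured by the parentheses
    print(xx.group(1))
else:
    print("no match")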
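
The None return value and the optional flags argument described above can be seen in a small sketch. The sample string and pattern are assumptions chosen for illustration; re.IGNORECASE is one of the standard flags in the re module.

```python
import re

# the sample string and pattern are illustrative assumptions
line = "Hello World"

m1 = re.search(r"world", line)                 # case-sensitive: no match
m2 = re.search(r"world", line, re.IGNORECASE)  # flag makes matching case-insensitive

print(m1)          # None
print(bool(m2))    # True
print(m2.group())  # World
print(m2.span())   # (6, 11)
```

Because None is falsy and a Match Object is truthy, the result can be used directly in an if statement.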

Find and Replace

re.sub(pattern, repl, string)
Substitute matches of pattern in string with the replacement repl. If the pattern isn't found, string is returned unchanged. Returns a new string.

An optional 4th argument, count, is the maximum number of replacements to make. If omitted, all matches are replaced.

# example of regex replace
import re
x = "123"
x2 = re.sub(r"2", r"8", x)
print(x2)
# 183
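
The optional count argument can be sketched as follows. The sample string is an assumption for illustration; count is passed by keyword here, which keeps the call readable.

```python
import re

s = "a-b-c-d"

# replace every hyphen
all_replaced = re.sub(r"-", "+", s)

# replace only the first two hyphens
first_two = re.sub(r"-", "+", s, count=2)

print(all_replaced)  # a+b+c+d
print(first_two)     # a+b+c-d
```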

Here's a more complex example, replacing all “gif” image paths with “png” in an HTML file.

# regex example of replacing gif to png in html img tag

import re

myText = r"""<p><img src="rabbits.gif" width="30" height="20">
and <img class="xyz" src="../cats.gif">,
but <img src ="tigers.gif">,
 <img src=
"bird.gif">!</p>"""

newText = re.sub(r'src\s*=\s*"([^"]+)\.gif"', r'src="\1.png"', myText)

print(newText)

# <p><img src="rabbits.png" width="30" height="20">
# and <img class="xyz" src="../cats.png">,
# but <img src="tigers.png">,
#  <img src="bird.png">!</p>


Note: A successful match does not necessarily mean the match contains any part of the given string. For example, these patterns match any string, because they can match the empty string: '' and 'y*'.
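
A minimal sketch of this behavior, using an arbitrary sample string "abc" that contains no "y":

```python
import re

# "abc" is an arbitrary sample string with no "y" in it
m_empty = re.search(r"", "abc")
m_star = re.search(r"y*", "abc")

# both searches succeed, but the matched text is the empty string
print(m_empty is not None, repr(m_empty.group()))  # True ''
print(m_star is not None, repr(m_star.group()))    # True ''
print(m_star.span())                               # (0, 0)
```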

Note: The pattern string should be written as a raw string, like this: r"…". Otherwise, backslashes in it must be escaped. For example, to search for a sequence of tabs, use re.search(r"\t+", str) or re.search("\\t+", str). [see Python: Quote String]
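
A short sketch comparing the two spellings. The sample string is an assumption for illustration; it contains two real tab characters.

```python
import re

text_with_tabs = "a\t\tb"  # contains two real tab characters

raw = re.search(r"\t+", text_with_tabs)      # raw string: backslash is kept as-is
escaped = re.search("\\t+", text_with_tabs)  # escaped backslash: same pattern

print(r"\t+" == "\\t+")  # True
print(raw.span())        # (1, 3)
print(escaped.span())    # (1, 3)
```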
