Python: Regex re.sub
re.sub(regex, repl, text)
- Replace each match of regex in text with the replacement repl, and return the result as a new string. If the pattern isn't found, text is returned unchanged.
repl can also be a function, for more complicated replacements. The function must take a match object as its argument. It is called once for each non-overlapping match, and its return value is used as the replacement string.
re.sub(regex, repl, text, count)
- count is the maximum number of pattern occurrences to be replaced.
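As a minimal sketch of how count limits the number of replacements (the sample string and variable names are made up for illustration; count is passed by keyword, which works in every Python 3 version):

```python
import re

text = "a-b-c-d"

# Replace only the first two hyphens; later ones are left alone.
limited = re.sub(r"-", "+", text, count=2)
print(limited)  # a+b+c-d

# count=0 is the default and means "replace every occurrence".
unlimited = re.sub(r"-", "+", text, count=0)
print(unlimited)  # a+b+c+d
```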
# example of using re.sub( )
import re

# add alt to image tag
t1 = '<img src="cat.jpg">'
t2 = re.sub(r'src="([a-z]+)\.jpg">', r'src="\1.jpg" alt="\1">', t1)

print(t1)  # <img src="cat.jpg">
print(t2)  # <img src="cat.jpg" alt="cat">
Function as Replacement
# example of using re.sub(pattern, rep, str) where rep is a function
import re

def ff(xx):
    # xx is the match object; group(0) is the whole matched text
    if xx.group(0) == "ea":
        return "æ"
    elif xx.group(0) == "oo":
        return "u"
    else:
        return xx.group(0)

print(re.sub(r"[aeiou]+", ff, "encyclopeadia"))  # encyclopædia
print(re.sub(r"[aeiou]+", ff, "book"))  # buk
print(re.sub(r"[aeiou]+", ff, "geek"))  # geek
regex may be a string or a compiled regex object. If you need to specify regular expression flags, you can use a compiled regex object, or pass the flags keyword argument to re.sub. Alternatively, you can embed flags at the beginning of your pattern with an inline group such as (?i) (one or more letters from a, i, L, m, s, u, x). For example, re.sub("(?i)b+", "x", "bbbb BBBB") returns 'x x'. (See: regex pattern syntax for detail.)
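The following sketch shows three ways to apply a flag other than embedding it in the pattern, reusing the case-insensitive example from above. The variable names are illustrative only; re.compile, the pattern's sub method, and the flags keyword of re.sub are all part of the standard re module.

```python
import re

text = "bbbb BBBB"

# Option 1: compile the pattern with a flag, then call its sub() method.
pattern = re.compile(r"b+", re.IGNORECASE)
via_method = pattern.sub("x", text)

# Option 2: pass the compiled pattern object to re.sub() in place of a string.
via_object = re.sub(pattern, "x", text)

# Option 3: keep the pattern as a string and pass the flag by keyword.
via_keyword = re.sub(r"b+", "x", text, flags=re.IGNORECASE)

print(via_method)   # x x
print(via_object)   # x x
print(via_keyword)  # x x
```

A compiled pattern is mainly useful when the same regex is applied many times; for a one-off substitution, the flags keyword or an inline flag is usually shorter.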