Python: Split Line by Regex
This page shows how to split a line by regex in Python.
Let's say you have a file like this:
你是我最苦澀的等待 | you are my hardest wait
讓我歡喜又害怕未來 | giving me joy and also fear the future
Suppose you want just the Chinese part of each line. The script below splits each line at the | separator and keeps the first piece:
# -*- coding: utf-8 -*-
# python 3
# split lines example
import re

myText = """你是我最苦澀的等待 | you are my hardest wait
讓我歡喜又害怕未來 | giving me joy and also fear the future"""

# split the text into lines
myLines = re.split(r'\n', myText)

for aLine in myLines:
    # split on "|" plus any surrounding whitespace;
    # flags must be passed by keyword, because the third positional argument is maxsplit
    lineParts = re.split(r'\s*\|\s*', aLine, flags=re.U)
    # the first piece is the Chinese text
    print(lineParts[0])

# prints:
# 你是我最苦澀的等待
# 讓我歡喜又害怕未來
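
The same idea can be wrapped in a small reusable function. What follows is only a sketch under a few assumptions: it targets Python 3, the helper name left_parts and the variable names are made up for illustration, and it assumes the part you want is always the text before the first | on a line. One detail worth knowing: the signature is re.split(pattern, string, maxsplit=0, flags=0), so a flag such as re.U given as the third positional argument is read as maxsplit rather than as a flag.

```python
import re

# The separator pattern from the script above, compiled once for reuse.
SEPARATOR = re.compile(r'\s*\|\s*')

def left_parts(text):
    """Return the text before the first '|' on each non-blank line.

    left_parts is a hypothetical helper name used only for this sketch.
    """
    result = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # maxsplit=1 splits only at the first separator
        parts = SEPARATOR.split(line, maxsplit=1)
        result.append(parts[0])
    return result

sample = """你是我最苦澀的等待 | you are my hardest wait
讓我歡喜又害怕未來 | giving me joy and also fear the future"""

for part in left_parts(sample):
    print(part)

# prints:
# 你是我最苦澀的等待
# 讓我歡喜又害怕未來
```

Using splitlines instead of splitting on '\n' also copes with Windows line endings, and maxsplit=1 avoids extra work when a line contains more than one | character.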