Python 2: Walk Directory, List Files
Suppose you want to visit every file in a directory. For example, you want to do find/replace on all HTML files.
os.path.walk(dirPath, f, arg)
Walks the directory tree starting at dirPath. For each directory it visits (including dirPath itself), it calls f(arg, currentDir, childrenNames), where:
- currentDir → the path of the current directory (it begins with dirPath).
- childrenNames → a list of the names of all immediate children of the current directory. Each item is a string, and can be a file name or a directory name.
- arg → the third argument given to os.path.walk, passed through unchanged. It can be anything.
Note: os.path.walk is deprecated, and removed in Python 3. You should use os.walk instead, which is also available in Python 2.7.x. 〔see Python 3: Walk Directory, List Files〕
Note: before Python 2.4.2, the first arg to os.path.walk must not end in a slash.
# -*- coding: utf-8 -*-
# python 2

# traverse a directory

import os

mydir = "/home/xyz/Documents"

def myfun(s1, s2, s3):
    print "s1 is {}".format(s1)
    print "s2 is {}".format(s2)
    print "s3 is {}".format(s3)
    print "--------------------------"

os.path.walk(mydir, myfun, "xyz")
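Because os.path.walk is removed in Python 3, here is a rough sketch of the same kind of traversal written with os.walk. It should run under both Python 2.7 and Python 3. The function name visit_all, the variable demo_root, and the throwaway files it creates are made up for this illustration. Unlike os.path.walk, os.walk takes no callback: it yields one tuple per directory, with subdirectory names and file names in separate lists.

```python
# Sketch: traverse a directory with os.walk instead of os.path.walk.
import os
import shutil
import tempfile

def visit_all(top):
    """Return a list of (currentDir, childrenNames) pairs, one per directory."""
    visited = []
    for current_dir, dir_names, file_names in os.walk(top):
        # Joining the two lists gives all immediate children,
        # like the childrenNames list that os.path.walk passes to its callback.
        visited.append((current_dir, sorted(dir_names + file_names)))
    return visited

# Demo on a temporary directory tree (hypothetical names, created only for this example).
demo_root = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_root, "sub"))
open(os.path.join(demo_root, "a.html"), "w").close()
open(os.path.join(demo_root, "sub", "b.txt"), "w").close()

result = visit_all(demo_root)
for current_dir, children in result:
    print("dir: " + current_dir)
    print("children: " + ", ".join(children))
    print("--------------------------")

# Clean up the temporary tree.
shutil.rmtree(demo_root)
```

To try it on a real directory, call visit_all with a path of your own in place of demo_root.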
Here's an example of filtering files by file name extension, and calling a function on each file we want.
# -*- coding: utf-8 -*-
# python 2

# traverse a dir, and list only html files

import os

mydir = "/home/xyz/Documents/"

def processThisFile(fpath):
    print "g touched:", fpath

def filterFile(dummy, thisDir, dirChildrenList):
    for child in dirChildrenList:
        # os.path.join avoids a doubled slash when thisDir already ends in one
        fpath = os.path.join(thisDir, child)
        # keep only real files whose name ends in .html
        if ".html" == os.path.splitext(child)[1] and os.path.isfile(fpath):
            processThisFile(fpath)

os.path.walk(mydir, filterFile, None)
Note that os.path.splitext() splits a string into two parts: the portion before the last period, and the rest (the period and everything after it). In effect, it is used for getting the file suffix. The os.path.isfile() check makes sure that this is an actual file and not a dir with a “.html” suffix.
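To make the behavior of os.path.splitext() concrete, here is a small sketch that runs under Python 2.7 and Python 3. The helper has_html_suffix and the sample file names are invented for this illustration. Note that the suffix comparison is case-sensitive.

```python
# Sketch: what os.path.splitext returns for a few file names.
import os

def has_html_suffix(name):
    # splitext returns (root, ext). ext includes the leading period,
    # or is an empty string when the name has no suffix.
    return os.path.splitext(name)[1] == ".html"

print(os.path.splitext("index.html"))      # ('index', '.html')
print(os.path.splitext("notes.old.html"))  # ('notes.old', '.html')
print(os.path.splitext("README"))          # ('README', '')
print(has_html_suffix("index.HTML"))       # False, because ".HTML" != ".html"
```

If upper-case suffixes should match too, one option is to compare os.path.splitext(name)[1].lower() instead.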