Suppose you want to visit every file in a directory, for example to do find/replace on all HTML files. In Python 2, you can use os.path.walk().
# -*- coding: utf-8 -*-
# python

# traverse a directory

import os

mydir = "/home/joe/web/"

def myfun(s1, s2, s3):
    print s1 # arg from os.path.walk
    print s2 # current dir
    print s3 # list of children of current dir, including subdir
    print '------==(^_^)==------'

os.path.walk(mydir, myfun, "arg")
os.path.walk(‹base_dir›, ‹func_name›, ‹arg›) will walk a directory tree starting at ‹base_dir›. For each directory it visits, it calls ‹func_name›(‹arg›, ‹current_dir›, ‹list of children›).

Here's an example that filters files by file name extension and calls a function on each file we want.
# -*- coding: utf-8 -*-
# python

# traverse a dir, and list only html files

import os

mydir = "/home/xah/web/xahlee_info/perl-python/"

def processThisFile(fpath):
    print "g touched:", fpath

def filterFile(dummy, thisDir, dirChildrenList):
    for child in dirChildrenList:
        if ".html" == os.path.splitext(child)[1] and os.path.isfile(thisDir+"/"+child):
            processThisFile(thisDir+"/"+child)

os.path.walk(mydir, filterFile, None)
Note that os.path.splitext() splits a string into two parts: the portion before the last period, and the rest, starting with that period. Effectively it is used for getting the file suffix. The os.path.isfile() call makes sure that this is an actual file and not a dir with a “.html” suffix.
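To see what os.path.splitext() returns in a few edge cases, here is a small sketch. It is written in Python 3 syntax, and file_suffix is a made-up helper name for illustration, not something from the examples above.

```python
import os.path

def file_suffix(name):
    """Return the suffix of name, including its leading period, or '' if none.

    file_suffix is a hypothetical helper; it just wraps os.path.splitext,
    which returns a (root, ext) pair.
    """
    return os.path.splitext(name)[1]

print(file_suffix("index.html"))      # .html
print(file_suffix("archive.tar.gz"))  # .gz (only the last period counts)
print(repr(file_suffix("README")))    # '' (no period, so no suffix)
print(repr(file_suffix(".bashrc")))   # '' (a leading period is not a suffix)
```

This is why the filter compares against ".html" with the period included.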
One important thing to note: in older Python versions, mydir must not end in a slash. One would think Python would take care of such trivia, but no. This took me a while to debug. (As of Python 2.4.2, this is fixed.)
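Separately from that fix, the example above builds each child path by hand with thisDir+"/"+child, which produces a doubled slash when the directory already ends in one. A common way to sidestep the question is os.path.join. The sketch below assumes a Unix-like system; child_path is a hypothetical helper name.

```python
import os.path

def child_path(this_dir, child):
    """Join a directory and a child name (hypothetical helper).

    os.path.join adds a separator only when this_dir does not already
    end in one, so a trailing slash on this_dir makes no difference.
    """
    return os.path.join(this_dir, child)

# Both calls give the same path on a Unix-like system.
print(child_path("/home/joe/web", "index.html"))
print(child_path("/home/joe/web/", "index.html"))
```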
For Python 3, where os.path.walk() has been removed, see: Python 3: Traverse Directory
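As a rough idea of what the Python 3 version looks like, here is a minimal sketch of the same HTML filter using os.walk(), which yields a (current dir, subdir names, file names) tuple for each directory instead of calling a callback. The function name html_files is an assumption made for this sketch, and the directory is just the example path from above.

```python
import os

def html_files(base_dir):
    """Yield the path of every regular file under base_dir ending in .html.

    html_files is a hypothetical name used for illustration.
    """
    # os.walk yields (current_dir, subdir_names, file_names) per directory.
    for current_dir, _subdirs, file_names in os.walk(base_dir):
        for name in file_names:
            path = os.path.join(current_dir, name)
            if os.path.splitext(name)[1] == ".html" and os.path.isfile(path):
                yield path

if __name__ == "__main__":
    # Example directory from above; substitute your own.
    for fpath in html_files("/home/joe/web"):
        print("touched:", fpath)
```

Because os.walk() already separates subdirectories from files, a directory with a “.html” suffix never shows up in file_names; the os.path.isfile() check is kept only to mirror the Python 2 example.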
In Perl, to traverse a dir, use the “find” function from the File::Find module. Example:
# -*- coding: utf-8 -*-
# perl

# traverse a directory

use File::Find qw(find);

$mydir = '/home/xah/web/xahlee_info/perl-python/';

sub wanted {
    # if file name ends in .html and is a text file
    if ($_ =~/\.html$/ && -T $File::Find::name) {
        print $File::Find::name, "\n";
    }
}

find(\&wanted, $mydir);
The line use File::Find qw(find); imports the “find” function. The “find” function is a directory walker. It will visit every file and subdirectory in a given directory.
For each, it
sets the variable $_ to the name of the file,
sets the variable $File::Find::name to the full path of the current file,
sets the variable $File::Find::dir to the full path of the current dir.
The “find” function has 2 parameters. The first is a reference to a function that will be called each time “find” visits a file. The second is the path you want to traverse.
Note: The name “wanted” is just a convention used by the “File::Find” package. When your function “wanted” is called, nothing is passed to it as an argument. This means you cannot write your “wanted” function in a functional programming style that takes a file path as its parameter. Instead, you must read the variable $File::Find::name or $_ inside the body of “wanted” to know the current file name.
Note also: “wanted” cannot be written as a recursive function that calls itself to descend into subdirs.