Python & Perl: Traverse Directory

Xah Lee

Python

Suppose you want to visit every file in a directory, for example, to do find/replace on all HTML files. In Python 2, you can use os.path.walk().

# -*- coding: utf-8 -*-
# python

# traverse a directory

import os

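mydir = "/home/joe/web/"

# os.path.walk (Python 2 only) calls this function once per directory visited
def myfun(arg, dirname, names):
    print arg      # the third argument given to os.path.walk
    print dirname  # current dir
    print names    # list of children of current dir, including subdirs
    print '------==(^_^)==------'

os.path.walk(mydir, myfun, "arg")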

Here's an example that filters files by file name extension and calls a function on each file we want.

# -*- coding: utf-8 -*-
# python

# traverse a dir, and list only html files

import os

mydir= "/home/xah/web/xahlee_info/perl-python/"

def processThisFile(fpath):
    print "g touched:", fpath

def filterFile(dummy, thisDir, dirChildrenList):
    for child in dirChildrenList:
        fpath = os.path.join(thisDir, child)  # full path of this child
        # keep only actual files whose name ends in ".html"
        if ".html" == os.path.splitext(child)[1] and os.path.isfile(fpath):
            processThisFile(fpath)

os.path.walk(mydir, filterFile, None)

Note that os.path.splitext() splits a path into two parts: everything before the last period, and the rest (the suffix, including the period). Effectively, it is used for getting the file suffix. The os.path.isfile() call makes sure that this is an actual file and not a dir with a “.html” suffix.
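
To make the splitting rule concrete, here is a small sketch of what os.path.splitext() returns for a few names. The file names are made up for illustration, and has_html_suffix is a hypothetical helper, not part of os.path.

```python
import os.path

# splitext() cuts at the last period of the file name
print(os.path.splitext("index.html"))      # ('index', '.html')
print(os.path.splitext("archive.tar.gz"))  # ('archive.tar', '.gz')
print(os.path.splitext("README"))          # ('README', '')

def has_html_suffix(name):
    # hypothetical helper: True if the suffix is exactly ".html"
    return os.path.splitext(name)[1] == ".html"
```

The suffix comparison is exact, so a name such as “index.htm” or “index.HTML” would not pass this check.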

One important thing to note: the path in mydir must not end in a slash. One would think Python would take care of such trivia, but no. This took me a while to debug. (As of Python 2.4.2, this is fixed.)
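
Separately from that bug, building child paths by hand with "/" is itself sensitive to trailing slashes. A minimal sketch, assuming POSIX-style paths, of how os.path.join() gives the same result either way (child_path is a hypothetical helper for illustration):

```python
import os.path

def child_path(dir_path, name):
    # hypothetical helper: os.path.join adds a separator
    # only when dir_path does not already end with one
    return os.path.join(dir_path, name)

# with or without the trailing slash, the child path is the same
print(child_path("/home/joe/web", "index.html"))
print(child_path("/home/joe/web/", "index.html"))
```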

For Python 3, where os.path.walk() was removed, see: Python 3: Traverse Directory
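
As a rough idea of the Python 3 route, here is a minimal sketch that assumes os.walk() as the replacement for os.path.walk(). The function name list_html_files is hypothetical and not taken from the linked page; the directory is the same example path used above.

```python
import os

def list_html_files(top):
    """Return paths of files under top whose names end in .html."""
    found = []
    # os.walk yields (current dir, subdir names, file names) for each directory
    for this_dir, subdirs, files in os.walk(top):
        for name in files:
            if os.path.splitext(name)[1] == ".html":
                found.append(os.path.join(this_dir, name))
    return found

if __name__ == "__main__":
    # example directory from this page; adjust to your own path
    for fpath in list_html_files("/home/xah/web/xahlee_info/perl-python/"):
        print("g touched:", fpath)
```

Because os.walk() already separates subdirectory names from file names, a directory named with a “.html” suffix is not listed here.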

Perl

In Perl, to traverse a dir, use the “find” function from the File::Find module, loaded with use File::Find;. Example:

# -*- coding: utf-8 -*-
# perl

# traverse a directory

use File::Find qw(find);

$mydir = '/home/xah/web/xahlee_info/perl-python/';

sub wanted {

  # if file name ends in .html and is a text file
  if ($_ =~/\.html$/ && -T $File::Find::name) {
    print $File::Find::name, "\n";
  }
}

find(\&wanted, $mydir);

The line use File::Find qw(find); imports the “find” function, which is a directory walker. It visits every file and subdirectory in a given directory. For each one, it sets the variable $_ to the name of the file, $File::Find::name to the full path of the current file, and $File::Find::dir to the full path of the current dir.

The “find” function takes 2 parameters. The first is a reference to a function that is called each time “find” visits a file. The second is the path you want to traverse.

Note: The name “wanted” is just a convention used by the “File::Find” package. When your function “wanted” is called, nothing is passed to it as argument. This means you cannot write your “wanted” function in a functional programming style, taking a file path as its parameter. Instead, you must read the variable $File::Find::name or $_ inside the body of “wanted” to know the current file name.

Note also that “wanted” cannot be written as a recursive function that calls itself to descend into subdirs; “find” does the descending.

perldoc File::Find
