Python 2: Walk Directory, List Files

By Xah Lee.

Suppose you want to visit every file in a directory, for example, to do find/replace on all HTML files.

os.path.walk(dirPath, f, arg)
Walks a directory tree starting at dirPath. For each directory it visits (including dirPath itself), it calls f(arg, currentDir, childrenNames).

where:

  • currentDir → the path of the current directory, built from dirPath plus the names of the subdirectories below it.
  • childrenNames → a list of all immediate children of the current directory. Each item is a string: a bare file name or directory name, not a full path.

The arg can be any value. It is passed unchanged to f on every call.
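To make the calling convention concrete, here is a rough sketch of a function that behaves like os.path.walk as described above. The names path_walk_sketch and record are made up for illustration, and this is not the library's actual code. It assumes that symbolic links to directories are not followed. It runs on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# runs on Python 2.7 and Python 3

import os

def path_walk_sketch(top, func, arg):
    """Hypothetical stand-in that mimics the calling convention of os.path.walk."""
    try:
        names = os.listdir(top)
    except OSError:
        return  # missing or unreadable directory: skip it
    # one call per directory, with the names of its immediate children
    func(arg, top, names)
    for name in names:
        child = os.path.join(top, name)
        # assumption: descend into real subdirectories only, not symlinks
        if os.path.isdir(child) and not os.path.islink(child):
            path_walk_sketch(child, func, arg)

def record(arg, current_dir, children_names):
    # arg is passed through untouched; here it is a list used as an accumulator
    arg.append((current_dir, sorted(children_names)))

visited = []
path_walk_sketch("/home/xyz/Documents", record, visited)
print(visited)
```

Here arg is a list that collects results across calls, which shows why it is useful that arg can be any value.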

Note: os.path.walk is deprecated, and removed in Python 3. Use os.walk instead. It is also available in Python 2.7.x. 〔see Python 3: Walk Directory, List Files〕

Note: before Python 2.4.2, the first argument to os.path.walk must not end in a slash.

# -*- coding: utf-8 -*-
# python 2

# traverse a directory

import os

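mydir = "/home/xyz/Documents"

# called once per directory:
# s1 is the arg, s2 is the current dir, s3 is the list of children names
def myfun(s1, s2, s3):
    print "s1 is {}".format(s1)
    print "s2 is {}".format(s2)
    print "s3 is {}".format(s3)
    print "--------------------------"

# "xyz" is the arg. It is handed to myfun as s1 on every call
os.path.walk(mydir, myfun, "xyz")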
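For comparison, the same traversal can be written with os.walk, the replacement for the deprecated os.path.walk. This is only a sketch: the function name collect_dirs is made up for illustration, and the path is the same placeholder used above. It runs on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# runs on Python 2.7 and Python 3

import os

def collect_dirs(top):
    """Return one (dirpath, dirnames, filenames) tuple per directory under top."""
    result = []
    # os.walk yields the top directory first, then each subdirectory
    for dirpath, dirnames, filenames in os.walk(top):
        result.append((dirpath, sorted(dirnames), sorted(filenames)))
    return result

for dirpath, dirnames, filenames in collect_dirs("/home/xyz/Documents"):
    print("current dir is {}".format(dirpath))
    print("child dirs are {}".format(dirnames))
    print("child files are {}".format(filenames))
    print("--------------------------")
```

Unlike os.path.walk, os.walk needs no callback, and it gives directory names and file names as two separate lists.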

Here's an example that filters files by file name extension and calls a function on each file that matches.

# -*- coding: utf-8 -*-
# python 2

# traverse a dir, and list only html files

import os

mydir = "/home/xyz/Documents"

def processThisFile(fpath):
    print "g touched:", fpath

def filterFile(dummy, thisDir, dirChildrenList):
    for child in dirChildrenList:
        # os.path.join avoids a doubled slash when thisDir ends in "/"
        fpath = os.path.join(thisDir, child)
        # keep only actual files whose name ends in .html
        if ".html" == os.path.splitext(child)[1] and os.path.isfile(fpath):
            processThisFile(fpath)

os.path.walk(mydir, filterFile, None)

Note that os.path.splitext() splits a string into two parts: everything before the last period, and the rest, starting at that period. In effect, it gets the file name suffix. The os.path.isfile() check makes sure that the item is an actual file and not a directory whose name ends in “.html”.
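A few sample calls show how os.path.splitext() behaves. The helper has_html_suffix is a made-up name for illustration. It does the same comparison as the filter above. It runs on Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
# runs on Python 2.7 and Python 3

import os

def has_html_suffix(name):
    """True if name ends in the ".html" extension. The test is case-sensitive."""
    return os.path.splitext(name)[1] == ".html"

print(os.path.splitext("index.html"))      # ('index', '.html')
print(os.path.splitext("archive.tar.gz"))  # ('archive.tar', '.gz')
print(os.path.splitext("README"))          # ('README', '')
print(os.path.splitext(".bashrc"))         # ('.bashrc', '')

print(has_html_suffix("index.html"))  # True
print(has_html_suffix("index.HTML"))  # False
```

Only the last period counts, a leading period is not treated as the start of a suffix, and a file named “index.HTML” would be skipped by the filter unless the suffix is lowercased first.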
