Python: Traverse Directory

By Xah Lee.

Suppose you want to visit every file in a directory, for example to do find/replace on all HTML files. In Python 2, you can use os.path.walk().

Note: os.path.walk() is deprecated and was removed in Python 3. Use os.walk() instead, which is also available in Python 2.7.x. For how-to, see: Python 3: Traverse Directory.
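As a rough sketch of what the os.walk() replacement looks like: it yields one (dirpath, dirnames, filenames) tuple per directory, so no callback function is needed. The helper name describe_tree is made up for this illustration, and the path is the same sample path used on this page.

```python
import os

def describe_tree(top):
    """Collect what os.walk() yields for each directory under top.

    describe_tree is a hypothetical helper name, not part of the os module.
    """
    result = []
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory
    for dirpath, dirnames, filenames in os.walk(top):
        result.append((dirpath, sorted(dirnames), sorted(filenames)))
    return result

if __name__ == "__main__":
    for dirpath, dirnames, filenames in describe_tree("/home/xyz/Documents"):
        print("dir: {}".format(dirpath))
        print("subdirs: {}".format(dirnames))
        print("files: {}".format(filenames))
        print("--------------------------")
```

The names in dirnames and filenames are bare names, not full paths. Join them with dirpath using os.path.join() when you need a full path.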

# -*- coding: utf-8 -*-
# python 2

# traverse a directory

import os

mydir= "/home/xyz/Documents"

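# visitor function, called once for each directory:
# s1 is the arg given to os.path.walk, s2 is the current dir,
# s3 is the list of names of the children of s2
def myfun(s1, s2, s3):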
    print "s1 is {}".format(s1)
    print "s2 is {}".format(s2)
    print "s3 is {}".format(s3)
    print "--------------------------"

os.path.walk(mydir, myfun, "xyz")

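os.path.walk(base_dir, f, arg) walks a dir tree starting at base_dir. For each directory it sees (including base_dir), it calls f(arg, current_dir, list_of_children), where arg is the third argument given to os.path.walk (here "xyz"), current_dir is the path of the directory being visited, and list_of_children is a list of the names (not full paths) of the files and subdirectories in current_dir.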
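To make the calling convention concrete, here is a rough emulation of os.path.walk() that also runs on Python 3. It is a sketch of the behavior, not the library's actual source, and the name walk_with_callback is made up for this illustration.

```python
import os

def walk_with_callback(top, func, arg):
    """Rough emulation of the removed os.path.walk (hypothetical helper)."""
    try:
        names = os.listdir(top)
    except OSError:
        return  # directory cannot be read: skip it
    # call the visitor once per directory: f(arg, current_dir, list_of_children)
    func(arg, top, names)
    for name in names:
        path = os.path.join(top, name)
        # recurse into real subdirectories only, symlinks are not followed
        if os.path.isdir(path) and not os.path.islink(path):
            walk_with_callback(path, func, arg)

def myfun(s1, s2, s3):
    print("s1 is {}".format(s1))
    print("s2 is {}".format(s2))
    print("s3 is {}".format(s3))
    print("--------------------------")

if __name__ == "__main__":
    walk_with_callback("/home/xyz/Documents", myfun, "xyz")
```

The visitor is called with the same arg every time. Only the current dir and its list of children change from call to call.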

Here's an example that filters files by file name extension, and calls a function on each file we want.

# -*- coding: utf-8 -*-
# python 2

# traverse a dir, and list only html files

import os

mydir= "/home/xyz/Documents/"

def processThisFile(fpath):
    print "g touched:", fpath

def filterFile(dummy, thisDir, dirChildrenList):
    for child in dirChildrenList:
        fpath = os.path.join(thisDir, child)
        # keep only actual files whose name ends in .html
        if ".html" == os.path.splitext(child)[1] and os.path.isfile(fpath):
            processThisFile(fpath)

os.path.walk(mydir, filterFile, None)

Note that os.path.splitext() splits a path into two parts: the part before the last period, and the suffix starting at that period (an empty string if there is none). Effectively it is used for getting the file suffix. The os.path.isfile() call makes sure that this is an actual file and not a dir with a “.html” suffix.
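A few sample calls may help here. The file names are invented for illustration, and is_html_file is a hypothetical helper that combines the two checks used above.

```python
import os.path

print(os.path.splitext("index.html"))      # ('index', '.html')
print(os.path.splitext("archive.tar.gz"))  # ('archive.tar', '.gz')
print(os.path.splitext("README"))          # ('README', '')

def is_html_file(path):
    """True if path ends in .html and is a file, not a dir (hypothetical helper)."""
    return os.path.splitext(path)[1] == ".html" and os.path.isfile(path)
```

Only the last period counts, so "archive.tar.gz" has the suffix ".gz", not ".tar.gz".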

Note: before Python 2.4.2, the first arg to os.path.walk must not end in a slash.

For Python 3, see: Python 3: Traverse Directory
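For a quick idea of how the HTML filter might look with os.walk(), here is a minimal sketch. The name find_html_files is made up for this illustration and is not taken from the linked page.

```python
import os

def find_html_files(top):
    """Return the paths of .html files under top (hypothetical helper)."""
    found = []
    for dirpath, dirnames, filenames in os.walk(top):
        # subdirectory names go to dirnames, so a dir named "x.html"
        # never shows up in filenames
        for name in filenames:
            if os.path.splitext(name)[1] == ".html":
                found.append(os.path.join(dirpath, name))
    return found

if __name__ == "__main__":
    for fpath in find_html_files("/home/xyz/Documents/"):
        print("touched: {}".format(fpath))
```

Because os.walk() already separates subdirectories from the other entries, the sketch does not call os.path.isfile(). Add that check if you also want to skip things like broken symlinks.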
