Perl: Find Replace String Pairs in Directory
Here is a Perl script that do find and replace.
Features:
- Multiple pairs of find replace strings in one shot.
- Regex or plain string search.
- Optional backup to a separate folder
- Exclude large files
- Count number of replacements
use utf8; use strict; =pod Description: This script does find and replace on a given foler recursively. Features: • multiple Find and Replace string pairs can be given. • The find/replace strings can be set to regex or literal. • Files can be filtered according to file name suffix matching or other criterions. • Backup copies of original files will be made at a user specified folder that preserves all folder structures of original folder. • A report will be generated that indicates which files has been changed, how many changes, and total number of files changed. • files will retain their own/group/permissions settings. usage: 1. edit the parts under the section 2. edit the subroutine fileFilterQ to set which file will be checked or skipped. to do: • in the report, print the strings that are changed, possibly with surrounding lines. • allow just find without replace. • add the GNU syntax for unix command prompt. • Report if backup directory exists already, or provide toggle to overwrite, or some other smarties. Date created: 2000-02 Last run: 2019-01-13 web site: http://xahlee.info/perl/perl_find_replace_in_dir.html Author: Xah Lee =cut use File::Find; use File::Path; use File::Copy; use Data::Dumper; # dir search my $inputDirPath = q[/Users/xah/web/ergoemacs_org/emacs_manual/]; # backup dir path. if no exit, will be created my $backupDirPath = q[/Users/xah/backup]; # find/replace string pairs my %findReplaceH = ( q[find string here 1] => q[replace string here 1], q[find string here 2] => q[replace string here 2], ); # $useRegexQ has values 1 or 0. If 1, inteprets the pairs in %findReplaceH # to be regex. my $useRegexQ = 0; # in bytes. larger files will be skipped my $fileSizeLimit = 2000 * 1000; # -------------------------------------------------- # globals $inputDirPath =~ s[/$][]; # e.g. '/home/joe/public_html' $backupDirPath =~ s[/$][]; # e.g. '/tmp/joe_back'; $inputDirPath =~ m[/(\w+)$]; my $previousDir = $`; # e.g. '/home/joe' my $lastDir = $1; # e.g. 'public_html' my $backupRoot = $backupDirPath . '/' . $1; # e.g. '/tmp/joe_back/public_html' my $refLargeFiles = []; my $totalFileChangedCount = 0; # -------------------------------------------------- # subroutines # fileFilterQ($fullFilePath) return true if file is desired. sub fileFilterQ ($) { my $fileName = $_[0]; if ((-s $fileName) > $fileSizeLimit) { push (@$refLargeFiles, $fileName); return 0; }; if ($fileName =~ m{\.html$}) { print "processing: $fileName\n"; return 1;}; ## if (-d $fileName) {return 0;}; # directory ## if (not (-T $fileName)) {return 0;}; # not text file return 0; }; # go through each file, accumulate a hash. sub processFile { my $currentFile = $File::Find::name; # full path spect my $currentDir = $File::Find::dir; my $currentFileName = $_; if (not fileFilterQ($currentFile)) { return 1; } # open file. Read in the whole file. if (not(open FILE, "<$currentFile")) {die("Error opening file: $!");}; my $wholeFileString; {local $/ = undef; $wholeFileString = <FILE>;}; if (not(close(FILE))) {die("Error closing file: $!");}; # do the replacement. my $replaceCount = 0; foreach my $key1 (keys %findReplaceH) { my $pattern = ($useRegexQ ? $key1 : quotemeta($key1)); $replaceCount = $replaceCount + ($wholeFileString =~ s/$pattern/$findReplaceH{$key1}/g); }; if ($replaceCount > 0) { # replacement has happened $totalFileChangedCount++; # do backup # make a directory in the backup path, make a backup copy. my $pathAdd = $currentDir; $pathAdd =~ s[$inputDirPath][]; mkpath("$backupRoot/$pathAdd", 0, 0777); copy($currentFile, "$backupRoot/$pathAdd/$currentFileName") or die "error: file copying file failed on $currentFile\n$!"; # write to the original # get the file mode. my ($mode, $uid, $gid) = (stat($currentFile))[2,4,5]; # write out a new file. if (not(open OUTFILE, ">$currentFile")) {die("Error opening file: $!");}; print OUTFILE $wholeFileString; if (not(close(OUTFILE))) {die("Error closing file: $!");}; # set the file mode. chmod($mode, $currentFile); chown($uid, $gid, $currentFile); print "-----out77311-------------------------------\n"; print "$replaceCount replacements made at\n"; print "$currentFile\n"; } }; # -------------------------------------------------- # main find(\&processFile, $inputDirPath); print "--------------------------------------------\n\n\n"; print "Total of $totalFileChangedCount files changed.\n"; if (scalar @$refLargeFiles > 0) { print "The following large files are skipped:\n"; print Dumper($refLargeFiles); } __END__
Sample output
processing: /Users/xah/web/ergoemacs_org/emacs_manual/elisp/_0025_002dConstructs.html processing: /Users/xah/web/ergoemacs_org/emacs_manual/elisp/A-Sample-Function-Description.html -----out77311------------------------------- 4 replacements made at /Users/xah/web/ergoemacs_org/emacs_manual/elisp/A-Sample-Function-Description.html processing: /Users/xah/web/ergoemacs_org/emacs_manual/elisp/A-Sample-Variable-Description.html -----out77311------------------------------- 1 replacements made at /Users/xah/web/ergoemacs_org/emacs_manual/elisp/A-Sample-Variable-Description.html ... hundreds lines processing: /Users/xah/web/ergoemacs_org/emacs_manual/emacs/Words.html processing: /Users/xah/web/ergoemacs_org/emacs_manual/emacs/Writing-Calendar-Files.html processing: /Users/xah/web/ergoemacs_org/emacs_manual/emacs/X-Resources.html processing: /Users/xah/web/ergoemacs_org/emacs_manual/emacs/Xref-Commands.html processing: /Users/xah/web/ergoemacs_org/emacs_manual/emacs/Xref.html processing: /Users/xah/web/ergoemacs_org/emacs_manual/emacs/Yanking.html processing: /Users/xah/web/ergoemacs_org/emacs_manual/emacs/Yes-or-No-Prompts.html -------------------------------------------- Total of 18 files changed. The following large files are skipped: $VAR1 = [ '/Users/xah/web/ergoemacs_org/emacs_manual/elisp/index_index.html' ];
I've been using this script from 2000 to 2005.
- i switched to Python in ~2005.
- then switched to elisp in ~2007.
- then switched to golang 2018.
Find Replace Scripts
- Golang: Find String (grep) Script
- Golang: Script to Find Replace Multi-Pairs of Regex in a Directory
- Python: Find Replace Regex in Dir
- Perl: Find Replace String Pairs in Directory
- Emacs: Interactive Find Replace Text in Directory
- Emacs: Xah Find Replace (xah-find.el)
- WolframLang: Find/Replace Script