Emacs: Inconsistency of Search Features

By Xah Lee.

This page discusses the inconsistency of the many emacs features related to text search (grep).

Emacs has many commands related to searching text, e.g. {list-matching-lines, grep, rgrep, lgrep, grep-find, find-dired, dired-do-search, etc}. However, their interfaces are inconsistent. Their implementations are also inconsistent.

Interface Inconsistency

The interface isn't consistent. For example: grep and grep-find (with alias find-grep) both directly prompt you to enter a unix command in one shot. But {find-dired, rgrep, lgrep} do several prompts asking for: {search string, file extension, directory}. (Though, they still require the user to be familiar with the unix commands. For example: when find-dired prompts “Run find (with args):”, you have to give -name "*html" or -type f etc.)

People who are not familiar with unix won't be able to search a folder. The unix find/grep parameters are quite complex, and emacs documentation doesn't try to explain them.
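The difference also shows when these commands are called from emacs lisp. The following is a minimal sketch, not code from emacs itself: the wrapper names (my-search-with-grep, etc), the search word "TODO", and the "*.el" file pattern are made up for illustration. Only grep, rgrep, find-dired, and grep-compute-defaults are real emacs functions.

```elisp
(require 'grep)       ; defines grep, lgrep, rgrep
(require 'find-dired) ; defines find-dired

;; Hypothetical wrappers. Each one looks at *.el files under DIR,
;; but the shape of the arguments differs from command to command.

(defun my-search-with-grep (dir)
  "Run `grep' in DIR. It takes one complete shell command line."
  (let ((default-directory (file-name-as-directory dir)))
    (grep "grep -nH -e TODO *.el")))

(defun my-search-with-rgrep (dir)
  "Run `rgrep' on DIR. It takes regexp, file pattern, directory."
  ;; rgrep computes its command templates in its interactive prompt,
  ;; so a call from lisp sets them up first.
  (grep-compute-defaults)
  (rgrep "TODO" "*.el" dir))

(defun my-search-with-find-dired (dir)
  "Run `find-dired' on DIR. It takes raw arguments for unix find."
  (find-dired dir "-name \"*.el\""))
```

All three still end up running the external grep or find program, so none of them work unless those utilities are installed.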

Implementation Inconsistency

list-matching-lines (alias of occur) is implemented with elisp, while the others rely on unix utilities {grep, find}.

Calling an external utility has lots of problems under Microsoft Windows. On Windows, the external utilities may not be installed, yet searching text is a critical feature of emacs. Then, there are Cygwin, MinGW, and the issues of their many variations. Emacs has to go thru several layers to interface with them.

On unix/linux, including Mac OS X and BSD, there are also complex issues of identifying which version of grep/find is in use, because the versions differ in options and behavior. I recall that rgrep didn't work on Mac OS X around year 2007.

Then, emacs has to process the output to do syntax coloring. Also, if your search string contains Unicode, there are complex problems of locale/encoding setup, environment variable setup, and inheritance issues among the shell, the operating system, and emacs.

〔see Problems of grep in Emacs〕

Suggestion: Unified Interface and Implemented in Emacs Lisp

It seems to me, they could all use elisp, with a single code engine, and with the same interface. The files to be searched can come from the buffer list or dired marked files, or be entered at a prompt as an emacs regex. The output can be a listing like that of occur, or a dired list, or a buffer list. For example, you could have a command list-matching-lines, and then “list-matching-lines-directory” with options to search nested dirs, options to use a pattern to filter which files to search, options to take the marked files in dired, and also options to show the result as a file list in dired.

These text searching tasks are what emacs lisp is designed to do, and are trivial to implement. Doing them in elisp would eliminate the whole set of external program problems.

The core pieces are already in emacs: occur, dired-do-query-replace-regexp, dired-do-search, and probably something in eshell too. They are just scattered and not unified.
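To make the idea concrete, here is a minimal sketch of what the search engine part of “list-matching-lines-directory” could look like in pure elisp. The function name my-list-matching-lines-directory and its argument list are hypothetical. It also assumes directory-files-recursively, which exists only in Emacs 25.1 and later, so it is newer than the emacs version this article was written for. It returns a plain list and leaves the display (occur-style buffer, dired list, etc) to the caller.

```elisp
;; Hypothetical function, for illustration only.
;; Assumes `directory-files-recursively' (Emacs 25.1 or later).
(defun my-list-matching-lines-directory (regexp dir &optional file-regexp)
  "Return lines matching REGEXP in files under DIR, including subdirs.
FILE-REGEXP, if non-nil, limits the search to file names matching it.
Each result is a list of the form (FILE LINE-NUMBER LINE-TEXT)."
  (let ((results '()))
    (dolist (file (directory-files-recursively dir (or file-regexp "")))
      (with-temp-buffer
        (insert-file-contents file)
        (goto-char (point-min))
        ;; The eobp check guards against a regexp that matches the
        ;; empty string at end of buffer, which would loop forever.
        (while (and (not (eobp))
                    (re-search-forward regexp nil t))
          (push (list file
                      (line-number-at-pos)
                      (buffer-substring-no-properties
                       (line-beginning-position) (line-end-position)))
                results)
          ;; Report each line once, even if it matches several times.
          (forward-line 1))))
    (nreverse results)))
```

For example, (my-list-matching-lines-directory "cat" "~/web/" "\\.html\\'") would search every file ending in .html under ~/web/ (a made-up directory). No external program, shell, or locale setting is involved.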
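As an illustration of how far the existing pieces already go, here is a minimal sketch that feeds a list of files to the built-in occur engine via multi-occur. The two function names are hypothetical. multi-occur, find-file-noselect, and dired-get-marked-files are real emacs functions.

```elisp
(require 'dired) ; for dired-get-marked-files

;; Hypothetical helpers, for illustration only.
(defun my-occur-in-files (regexp files)
  "Show lines matching REGEXP in FILES in an occur buffer.
FILES is a list of file names. Each file is visited in a buffer first."
  (multi-occur (mapcar #'find-file-noselect files) regexp))

(defun my-occur-in-dired-marked-files (regexp)
  "Show lines matching REGEXP in the files marked in the dired buffer."
  (interactive "sList lines matching regexp: ")
  (my-occur-in-files regexp (dired-get-marked-files)))
```

One cost of this shortcut is that every searched file stays open as a buffer afterwards. A unified engine would more likely read each file into a temporary buffer instead.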

Emacs Lisp Speed Issues?

Doing it inside elisp is not that slow. It is slower than tools written in C, but remember that emacs has to parse their output, and the communication overhead might just make up the difference. I've written a grep command in elisp 〔see Elisp: Write grep〕. I just tested it on a dir with subdirs, with a total of 1592 files, using a search word that matches in over a thousand files. Both my script and the unix util take close to 3 seconds. (e.g. Alt+x shell in emacs, then give grep -c */*html) Calling grep -c */*html by shell-command takes less than 1 second. Calling grep -c */*html by emacs's grep takes about 3 seconds too.

(This article was originally posted to comp.emacs at http://groups.google.com/group/comp.emacs/browse_frm/thread/9458d26644fde3e4)
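If you want to repeat this kind of measurement on your own files, here is one way to time an elisp search. It is a minimal sketch and not the script used for the numbers above: the function names are hypothetical, and only benchmark-run (from emacs's benchmark library) is a real emacs facility.

```elisp
(require 'benchmark) ; for benchmark-run

;; Hypothetical helpers, for illustration only.
(defun my-count-files-matching (regexp files)
  "Return how many of FILES contain at least one match for REGEXP."
  (let ((count 0))
    (dolist (file files count)
      (with-temp-buffer
        (insert-file-contents file)
        (when (re-search-forward regexp nil t)
          (setq count (1+ count)))))))

(defun my-time-elisp-search (regexp files)
  "Return the seconds taken to search FILES for REGEXP once."
  ;; benchmark-run returns (ELAPSED-SECONDS GC-COUNT GC-SECONDS).
  (car (benchmark-run 1 (my-count-files-matching regexp files))))
```

The result depends on disk cache, file sizes, and garbage collection, so run it a few times before comparing it against the external grep.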

This report applies to GNU Emacs 24.0.93.1

Emacs Modernization