Emacs: Find Replace on Multiple Files by Function: Add Unicode Name in HTML

By Xah Lee.

This page shows an example of Find Replace on 6 thousand files, using an Emacs Lisp function that computes the proper replacement string.

Problem

You have this in an HTML file:

<mark class="unicode">♥</mark>

You want to change it to this:

<mark class="unicode" title="U+2665: BLACK HEART SUIT">♥</mark>

Here are other examples:

<mark class="unicode">ƒ</mark>
<mark class="unicode">ℜ</mark>
<mark class="unicode">°</mark>
<mark class="unicode">α</mark>
<mark class="unicode">👾</mark>

They need to become:

<mark class="unicode" title="U+192: LATIN SMALL LETTER F WITH HOOK">ƒ</mark>
<mark class="unicode" title="U+211C: BLACK-LETTER CAPITAL R">ℜ</mark>
<mark class="unicode" title="U+B0: DEGREE SIGN">°</mark>
<mark class="unicode" title="U+3B1: GREEK SMALL LETTER ALPHA">α</mark>
<mark class="unicode" title="U+1F47E: ALIEN MONSTER">👾</mark>

There are thousands of such markups, scattered across 4 thousand files. What do you do?

Emacs comes to the rescue.

Solution

First, I define a function:

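(defun ff ()
  "Return unicode <mark> markup, with a title, for the char in match group 1.
Temp function. Call it only from a regex replacement string, as \\,(ff),
because it reads the match data of the current search."
  (let* ((xcodepoint (string-to-char (match-string 1)))
         ;; Unicode name of the char, for example BLACK HEART SUIT
         (xname (get-char-code-property xcodepoint 'name)))
    ;; %X prints the codepoint in uppercase hex, %c prints the char itself
    (format "<mark class=\"unicode\" title=\"U+%X: %s\">%c</mark>" xcodepoint xname xcodepoint)))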

〔see Evaluate Emacs Lisp Code〕
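To check the output format before touching any file, here is a minimal sketch that does the same computation as ff but takes the character as an argument instead of reading the match data. The name my-unicode-mark is hypothetical, not part of the original workflow.

```elisp
(defun my-unicode-mark (xchar)
  "Return unicode <mark> markup for character XCHAR, with a title attribute.
Hypothetical helper for illustration. Same computation as ff, but the
character is passed in, so it can be evaluated on its own."
  (format "<mark class=\"unicode\" title=\"U+%X: %s\">%c</mark>"
          xchar
          (get-char-code-property xchar 'name)
          xchar))

(my-unicode-mark ?♥)
;; "<mark class=\"unicode\" title=\"U+2665: BLACK HEART SUIT\">♥</mark>"
```

Evaluating the last expression in a scratch buffer shows the string that would be inserted for ♥.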

Then I call find-dired, give a path, mark files by the regex \.html$, press Q (dired-do-query-replace-regexp), and type this search string:

<mark class="unicode">\(.\)</mark>

And type this replacement string:

\,(ff)

Then I type y to replace each match, or ! to replace all matches in the current file, or Y to replace all matches in all remaining files.
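For comparison, the same search and replacement can be sketched as a plain loop over one buffer, with no prompts. This is only an illustration under assumptions: the name my-add-unicode-title-in-buffer is hypothetical, and the body of ff is inlined so the block stands alone.

```elisp
(defun my-add-unicode-title-in-buffer ()
  "Add a title attribute to each bare unicode <mark> in the current buffer.
Return the number of replacements made.
Hypothetical helper for illustration, not part of the interactive workflow."
  (let ((case-fold-search nil)
        (xcount 0))
    (save-excursion
      (goto-char (point-min))
      ;; Same regex as typed at the query-replace prompt, escaped for a lisp string.
      (while (re-search-forward "<mark class=\"unicode\">\\(.\\)</mark>" nil t)
        (let* ((xcodepoint (string-to-char (match-string 1)))
               ;; save-match-data is defensive: the name lookup must not
               ;; disturb the match that replace-match is about to use.
               (xname (save-match-data
                        (get-char-code-property xcodepoint 'name))))
          ;; Args t t: keep letter case as is, and insert the text literally.
          (replace-match
           (format "<mark class=\"unicode\" title=\"U+%X: %s\">%c</mark>"
                   xcodepoint xname xcodepoint)
           t t)
          (setq xcount (1+ xcount)))))
    xcount))
```

Markup that already has a title attribute does not match the regex, so running the function twice on the same buffer makes no further change.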

Then I go to ibuffer, and type * u S D to mark all unsaved buffers, save them, and close them. All done. 〔see Emacs: List Buffers〕
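If you prefer no prompts at all, the dired and ibuffer steps can be swapped for a small loop over files. This is a different technique from the interactive one above, and only a sketch: the name my-apply-to-html-files is hypothetical, and it assumes every file is UTF-8 with Unix line endings. Back up the files first, because nothing asks for confirmation.

```elisp
(defun my-apply-to-html-files (xdir xfunc)
  "Call XFUNC in a temp buffer holding each .html file under XDIR.
XFUNC takes no arguments and edits the current buffer.
Write a file back only if XFUNC changed the buffer.
Return the list of changed files.
Hypothetical helper for illustration. Assumes all files are UTF-8."
  (let ((coding-system-for-read 'utf-8)   ; assumption: files are UTF-8
        (coding-system-for-write 'utf-8)
        (xchanged nil))
    (dolist (xfile (directory-files-recursively xdir "\\.html\\'"))
      (with-temp-buffer
        (insert-file-contents xfile)
        ;; Inserting the file marks the buffer modified. Reset the flag,
        ;; so that it reflects only the edits made by XFUNC.
        (set-buffer-modified-p nil)
        (funcall xfunc)
        (when (buffer-modified-p)
          (write-region (point-min) (point-max) xfile)
          (push xfile xchanged))))
    (nreverse xchanged)))
```

XFUNC can be any function that edits the current buffer, such as a loop of re-search-forward and replace-match. Files it leaves untouched are not rewritten.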

A 30-minute job with Ruby, Python, or Perl is now a 10-minute job with Emacs Lisp.

For a detailed explanation of how all this works, see: Elisp: Call Function in Replacement String
