Emacs Lisp: a Function That Works on String or Region

By Xah Lee.

Addendum

2018-10-30 i stopped writing functions this way, because i find that it complicates the code.

Now, i just write 2 versions if necessary: one for interactive use, and one for elisp code that takes a string and returns a string.

This article shows you how to write an elisp text-transformation function that can be used in 2 ways: (1) change text in a buffer region, or (2) take a string argument and return a string.

Emacs lisp level: advanced.

Problem

For a function that transforms text, find a way to code it so that:

  1. When called interactively: When there is a text selection, transform the selected text. Otherwise, use the current paragraph as input.
  2. When called in elisp code, the function can take a string and return a string, or, it can take buffer positions {$from, $to} and work on that region (i.e. replace the region with result).

For example, suppose you have a command remove-vowel that works on a region, but you also want a version “remove-vowel-in-string” which just takes a string input and returns a string. The string version is very convenient in lisp code. But i don't want to keep 2 functions. I want just one single function.

Detail

Been coding elisp for 5 years now, perhaps about 2 hours a day. I have ~30 commands that do text transformation on text under cursor.

In the past year, i find that i often need 2 versions of a function. One version works in a buffer, while the other simply works on a string. The string version is very convenient and simple when used in elisp code.

This is becoming a problem, because for every text processing function i seem to need to write and maintain 2 versions. For example, let's say i have a function named remove-vowel that changes “something” to “smthng”. Typically, i'd write a “remove-vowel-in-string” that takes a string as argument and outputs a string. Then i write another version, remove-vowel, that is an interface wrapper and calls “remove-vowel-in-string” to do the actual work.
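
For reference, the two-function arrangement described above might look roughly like this. It is a minimal sketch: the two names come from the paragraph above, but the bodies are assumed, and this remove-vowel takes only two positions, unlike the single-function version developed below.

```elisp
;; Sketch only: one pure string function, plus a thin command wrapper.
(defun remove-vowel-in-string ($string)
  "Return a copy of $STRING with the letters a e i o u removed."
  (let ((case-fold-search t))           ; also remove uppercase vowels
    (replace-regexp-in-string "[aeiou]" "" $string)))

(defun remove-vowel ($from $to)
  "Remove vowels in the buffer text between positions $FROM and $TO."
  (interactive "r")                     ; region bounds become $from $to
  (let ((newText (remove-vowel-in-string
                  (buffer-substring-no-properties $from $to))))
    (save-excursion
      (delete-region $from $to)
      (goto-char $from)
      (insert newText))))
```

The string function does all the work. The command only reads the region, calls the string function, and writes the result back.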

Having 2 versions of every function is becoming annoying. So, today i thought about it and came up with a solution.

Solution

The solution is this: The function would take 1 argument, and 2 more optional arguments, like this:

(defun remove-vowel ($string &optional $from $to) …)

When remove-vowel is called interactively, simply feed the function {nil, $from, $to}.

This way, the function can be used as a string manipulation function, or it can be used as a buffer text changing function, with no penalties or inefficiencies i can think of. Here's how it's done using remove-vowel as example:

(defun remove-vowel ($string &optional $from $to)
  "Remove the following letters: {a e i o u}.

When called interactively, work on current paragraph or text selection.

When called in lisp code, if $string is non-nil, returns a changed string.
If $string is nil, change the text in the region between positions $from $to."
  (interactive
   (if (use-region-p)
       (list nil (region-beginning) (region-end))
     (let ((bds (bounds-of-thing-at-point 'paragraph)) )
       (list nil (car bds) (cdr bds)) ) ) )

  (let (workOnStringP inputStr outputStr)
    ;; A non-nil $string selects string mode; nil selects region mode.
    (setq workOnStringP (if $string t nil))
    (setq inputStr (if workOnStringP $string (buffer-substring-no-properties $from $to)))
    (setq outputStr
          (let ((case-fold-search t))   ; match uppercase vowels too
            (replace-regexp-in-string "[aeiou]" "" inputStr) )  )

    (if workOnStringP
        outputStr
      (save-excursion
        (delete-region $from $to)
        (goto-char $from)
        (insert outputStr) )) ) )

The meat of this function is just (replace-regexp-in-string "[aeiou]" "" inputStr). But let's see how the input/output is done.

Use of (interactive)

The interactive form is a declaration that tells emacs how arguments are passed to the function when it is called interactively. For example, the arguments can come from a prompt in the minibuffer, or from universal-argument 【Ctrl+u】. It also tells emacs how to interpret the input: as a string, number, buffer name, file name, etc.

When a function has (interactive) (usually placed right after the doc string), it means the function is a command (i.e. it can be called by execute-extended-command 【Alt+x】).

When a function has (interactive "r"), then emacs will take the {beginning, ending} positions of the region and feed them to the function as the first 2 arguments. The "r" is called the “interactive code”. See: Interactive Codes (ELISP Manual).

Normally, the argument to interactive is a string, but it can be any other lisp expression. When it is a lisp expression, the return value of the expression must be a list, and the items are fed to the function as arguments.

So, in our case of remove-vowel, our argument to interactive is a lisp expression that returns a list of 3 items. Like this:
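
To make the two forms concrete, here is a small sketch with two hypothetical commands (the names my-region-length-a and my-region-length-b are made up for illustration). The first uses the code string "r". The second builds the same argument list with a lisp expression.

```elisp
;; Hypothetical commands, for illustration only.
(defun my-region-length-a ($from $to)
  "Return the number of characters between $FROM and $TO."
  (interactive "r")                     ; code string: region begin, end
  (let ((n (- $to $from)))
    (message "%d characters" n)
    n))

(defun my-region-length-b ($from $to)
  "Same as `my-region-length-a', with a lisp expression for the arguments."
  (interactive (list (region-beginning) (region-end))) ; must return a list
  (let ((n (- $to $from)))
    (message "%d characters" n)
    n))
```

In both cases the interactive form is only consulted when the command is called interactively. Lisp code calls the function with explicit arguments, e.g. (my-region-length-a 1 5).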

(defun remove-vowel ($string &optional $from $to)
 "…"
 (interactive
    (if (use-region-p)
        (list nil (region-beginning) (region-end))
      (let ((bds (bounds-of-thing-at-point 'paragraph)) )
        (list nil (car bds) (cdr bds)) ) ) )
…
)

If there's a text selection (region is active), it sets “$string” to nil and {$from, $to} to region {begin, end} positions.

If there is no text selection (region is not active), it sets “$string” to nil and {$from, $to} to paragraph's {begin, end} positions.

In both cases, the “$string” is set to nil, so the function will work on the region text.

(See: Using thing-at-point; What is Region, Active Region, transient-mark-mode?)
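
One case the interactive expression above does not cover: bounds-of-thing-at-point returns nil when there is no paragraph at point (for example, in an empty buffer), and then {$from, $to} both end up nil. A possible guard, as a sketch only, is to move the logic into a helper. The helper name my-region-or-paragraph-args is hypothetical and not part of the original code.

```elisp
(require 'thingatpt)  ; provides `bounds-of-thing-at-point'

(defun my-region-or-paragraph-args ()
  "Return (nil FROM TO) for the active region, else for the paragraph at point.
Hypothetical helper; signals a `user-error' if neither is available."
  (if (use-region-p)
      (list nil (region-beginning) (region-end))
    (let ((bds (bounds-of-thing-at-point 'paragraph)))
      (unless bds
        (user-error "No text selection and no paragraph at point"))
      (list nil (car bds) (cdr bds)))))
```

With this helper, the declaration in the function would shrink to (interactive (my-region-or-paragraph-args)).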

Rest of Code

The above takes care of interactive use of the function.

Now, remember that our function takes 3 arguments: {$string, $from, $to}. The {$from, $to} are optional. When “$string” is given (i.e. not nil), the function will take that as input and return a string. Otherwise, it takes {$from, $to} as region positions and transforms the text in the buffer.

For clarity, first we set “workOnStringP”:

(setq workOnStringP (if $string t nil))

then we set the “inputStr” like this:

(setq inputStr (if workOnStringP $string (buffer-substring-no-properties $from $to)))

Now, it works on the string, like this:

(setq outputStr
 (let ((case-fold-search t))
  (replace-regexp-in-string "[aeiou]" "" inputStr) ) )

Then, it either returns the outputStr or changes the region in the buffer, depending on whether “workOnStringP” is true, like this:

(if workOnStringP
        outputStr
      (save-excursion
        (delete-region $from $to)
        (goto-char $from)
        (insert outputStr) ))
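
As a quick check of both calling styles from lisp code, here is a sketch. The definition is repeated in compact form so the snippet runs on its own. The let* layout and the character-class regex "[aeiou]" are assumptions of this sketch, not a change in behavior.

```elisp
(require 'thingatpt)  ; for `bounds-of-thing-at-point'

(defun remove-vowel ($string &optional $from $to)
  "Remove the letters {a e i o u} from $STRING, or from the region $FROM $TO."
  (interactive
   (if (use-region-p)
       (list nil (region-beginning) (region-end))
     (let ((bds (bounds-of-thing-at-point 'paragraph)))
       (list nil (car bds) (cdr bds)))))
  (let* ((workOnStringP (if $string t nil))
         (inputStr (if workOnStringP
                       $string
                     (buffer-substring-no-properties $from $to)))
         (outputStr (let ((case-fold-search t))
                      (replace-regexp-in-string "[aeiou]" "" inputStr))))
    (if workOnStringP
        outputStr
      (save-excursion
        (delete-region $from $to)
        (goto-char $from)
        (insert outputStr)))))

;; String mode: the argument is left alone and a new string is returned.
(remove-vowel "something") ; ⇒ "smthng"

;; Region mode: pass nil as $string, then the two buffer positions.
(with-temp-buffer
  (insert "Emacs Lisp")
  (remove-vowel nil (point-min) (point-max))
  (buffer-string)) ; ⇒ "mcs Lsp"
```

Note that an empty string "" is non-nil in elisp, so (remove-vowel "") still takes the string branch and returns "".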

Note: i use dollar sign $ as sigil for function parameter names, for easy distinction from builtin symbols. [see Variable Naming: English Words Considered Harmful]
