Emacs Lisp: a Function That Works on String or Region
Addendum
2018-10-30 i stopped writing functions this way, because i found that it complicates the code.
Now, i just write 2 versions if necessary: one for interactive use, and one for elisp code that takes a string and returns a string.
This article shows you how to write an elisp text-transformation function that can be used in 2 ways: (1) change text in a buffer region. (2) take a string argument and return a string.
Emacs lisp level: advanced.
Problem
For a function that transforms text, find a way to code it so that:
- When called interactively: When there is a text selection, transform the selected text. Otherwise, use the current paragraph as input.
- When called in elisp code, the function can take a string and return a string, or it can take buffer positions {$from, $to} and work on that region (i.e. replace the region with the result).
For example, suppose you have a command remove-vowel that works on a region, but you also want a version “remove-vowel-in-string” which just takes a string input and returns a string. The string version is very convenient in lisp code. But i don't want to keep 2 functions. I want just one single function.
Detail
Been coding elisp for 5 years now, perhaps about 2 hours a day. I have ~30 commands that do text transformation on text under cursor. For example:
- Emacs Lisp: URL to HTML Link
- Emacs: HTML Image Path to Img Tag
- Emacs: Remove Accent Marks 🚀
- Emacs Lisp: Parse Date Time
- Emacs: HTML, Make Citation Link
- Emacs Lisp Find Replace String-Pairs Commands
- Emacs Lisp: Syntax Color Source Code in HTML
In the past year, i found that i often need 2 versions of a function: one version for working in a buffer, and another version that simply works on a string. The string version is very convenient and simple when used in elisp code.
This is becoming a problem, because for every text processing function i seem to need to write and maintain 2 versions. For example, let's say i have a function named remove-vowel that changes “something” to “smthng”. Typically, i'd write a “remove-vowel-in-string” that takes a string as argument and outputs a string. Then i write another version, remove-vowel, that is an interface wrapper and calls “remove-vowel-in-string” to do the actual work.
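The two-function pattern described above might look like the following minimal sketch. The names my-remove-vowel-in-string and my-remove-vowel-region are hypothetical, chosen here only for illustration, and the wrapper uses the plain "r" interactive code rather than the paragraph fallback discussed later.

```elisp
;; Sketch of the two-function pattern. Both function names are
;; hypothetical and not from the original code.

(defun my-remove-vowel-in-string ($string)
  "Return a copy of $string with the letters a e i o u removed."
  ;; case-fold-search t means uppercase vowels are removed too.
  (let ((case-fold-search t))
    (replace-regexp-in-string "[aeiou]" "" $string)))

(defun my-remove-vowel-region ($from $to)
  "Remove vowels in the buffer text between $from and $to.
This is only an interface wrapper around `my-remove-vowel-in-string'."
  (interactive "r")
  (let ((newText (my-remove-vowel-in-string
                  (buffer-substring-no-properties $from $to))))
    (save-excursion
      (delete-region $from $to)
      (goto-char $from)
      (insert newText))))

;; String version, convenient in lisp code:
(my-remove-vowel-in-string "something") ; returns "smthng"
```

The cost of this pattern is exactly what the text complains about: every transformation needs two definitions that must be kept in sync.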
Having 2 versions of every function is becoming annoying. So, today i thought about it and came up with a solution.
Solution
The solution is this: The function takes 1 argument, and 2 more optional arguments, like this:
(defun remove-vowel ($string &optional $from $to) …)
- If “$string” is given, then the function takes that as input and returns a string.
- If “$string” is nil, then the function takes the {$from $to} positions and changes the text in the region.
When remove-vowel is called interactively, simply feed the function {nil, $from, $to}.
This way, the function can be used as a string manipulation function, or it can be used as a buffer text changing function, with no penalties or inefficiencies i can think of. Here's how it's done using remove-vowel as example:
(defun remove-vowel ($string &optional $from $to)
  "Remove the following letters: {a e i o u}.

When called interactively, work on current paragraph or text selection.

When called in lisp code, if $string is non-nil, returns a changed string.
If $string nil, change the text in the region between positions $from $to."
  (interactive
   (if (use-region-p)
       (list nil (region-beginning) (region-end))
     (let ((bds (bounds-of-thing-at-point 'paragraph)))
       (list nil (car bds) (cdr bds)))))
  (let (workOnStringP inputStr outputStr)
    (setq workOnStringP (if $string t nil))
    (setq inputStr (if workOnStringP $string (buffer-substring-no-properties $from $to)))
    (setq outputStr
          (let ((case-fold-search t))
            (replace-regexp-in-string "a\\|e\\|i\\|o\\|u" "" inputStr)))
    (if workOnStringP
        outputStr
      (save-excursion
        (delete-region $from $to)
        (goto-char $from)
        (insert outputStr)))))
The meat of this function is just:
(replace-regexp-in-string "a\\|e\\|i\\|o\\|u" "" inputStr)
But let's see how the input/output is done.
Use of (interactive)
The interactive form is a declaration that lets emacs know how arguments are passed to the function when it is used interactively. For example, an argument can come from a prompt in the minibuffer, or from universal-argument 【Ctrl+u】. It also says how to interpret the input: as a string, number, buffer name, file name, etc.
When a function has (interactive) (usually placed right after the doc string), it means the function is a command (i.e. it can be called by execute-extended-command 【Alt+x】).
When a function has (interactive "r"), then emacs will take the {beginning, ending} positions of the region and feed them to the function as the first 2 arguments. The "r" is called the “interactive code”.
See: Interactive Codes (ELISP Manual).
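As a small illustration of the "r" interactive code, here is a sketch of a command that receives the region bounds as its 2 arguments. The name my-region-length is hypothetical. The same function can also be called from lisp code with explicit positions.

```elisp
;; Sketch only. `my-region-length' is a hypothetical name.

(defun my-region-length ($from $to)
  "Return the number of characters between $from and $to.
When called interactively, the \"r\" code supplies the region bounds
(smaller position first) and the count is shown in the echo area."
  (interactive "r")
  (let ((n (- $to $from)))
    (when (called-interactively-p 'interactive)
      (message "Region has %d characters" n))
    n))

;; From lisp code, pass the positions yourself:
(my-region-length 3 10) ; returns 7
```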
Normally, the argument to interactive is a string, but it can be any other lisp expression. When it is a lisp expression, the return value of the expression must be a list, and its items are fed to the function as arguments.
So, in our case of remove-vowel, our argument to interactive is a lisp expression that returns a list of 3 items. Like this:
(defun remove-vowel ($string &optional $from $to)
  "…"
  (interactive
   (if (use-region-p)
       (list nil (region-beginning) (region-end))
     (let ((bds (bounds-of-thing-at-point 'paragraph)))
       (list nil (car bds) (cdr bds)))))
  …
  )
If there's a text selection (region is active), it sets “$string” to nil and {$from, $to} to region {begin, end} positions.
If there is no text selection (region is not active), it sets “$string” to nil and {$from, $to} to paragraph's {begin, end} positions.
In both cases, the “$string” is set to nil, so the function will work on the region text.
(See: Using thing-at-point • What is Region, Active Region, transient-mark-mode?)
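The expression inside interactive can also be moved into a helper that returns the argument list. The following is a sketch under 2 assumptions: the helper name my-string-or-region-args is hypothetical, and the guard is there because bounds-of-thing-at-point is documented to return nil when no such thing is found (for example in an empty buffer), in which case (car bds) would silently pass nil positions.

```elisp
(require 'thingatpt)

;; Sketch only. `my-string-or-region-args' is a hypothetical helper.

(defun my-string-or-region-args ()
  "Return a list (nil FROM TO) suitable as the value of an `interactive' form.
FROM and TO are the active region bounds, else the bounds of the
paragraph at point.  Signal a user error when neither exists."
  (let ((bds (if (use-region-p)
                 (cons (region-beginning) (region-end))
               (bounds-of-thing-at-point 'paragraph))))
    (unless bds
      (user-error "No text selection and no paragraph at point"))
    ;; First item is nil so the function works on the buffer, not on a string.
    (list nil (car bds) (cdr bds))))

;; Inside a command it would be used like this:
;; (interactive (my-string-or-region-args))
```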
Rest of Code
The above takes care of interactive use of the function.
Now, remember that our function takes 3 arguments: {$string, $from, $to}. The {$from, $to} are optional. When “$string” is given (i.e. not nil), the function will take that as input and return a string. Otherwise, it takes {$from, $to} as region positions and transforms text in the buffer.
For clarity, first we set “workOnStringP”:
(setq workOnStringP (if $string t nil))
then we set the “inputStr” like this:
(setq inputStr (if workOnStringP $string (buffer-substring-no-properties $from $to)))
Now, it works on the string, like this:
(setq outputStr
      (let ((case-fold-search t))
        (replace-regexp-in-string "a\\|e\\|i\\|o\\|u" "" inputStr)))
Then, it either returns the outputStr or changes the region in the buffer, depending on whether “workOnStringP” is true, like this:
(if workOnStringP
    outputStr
  (save-excursion
    (delete-region $from $to)
    (goto-char $from)
    (insert outputStr)))
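To see both calling modes side by side, here is a condensed sketch of the same idea under the hypothetical name my-remove-vowel. It is a variant, not the function above: it uses the character class "[aeiou]" instead of the alternation, and it adds an argument check (an assumption about what is desirable) so that a call with neither a string nor both positions fails with a clear error.

```elisp
;; Condensed variant for illustration. `my-remove-vowel' is a
;; hypothetical name, and the argument check is an addition.

(defun my-remove-vowel ($string &optional $from $to)
  "Remove a e i o u from $string, or from the region $from $to if $string is nil."
  (unless (or $string (and $from $to))
    (error "Need either a string or both region positions"))
  (let ((outputStr
         (let ((case-fold-search t))
           (replace-regexp-in-string
            "[aeiou]" ""
            (or $string (buffer-substring-no-properties $from $to))))))
    (if $string
        outputStr
      (save-excursion
        (delete-region $from $to)
        (goto-char $from)
        (insert outputStr)))))

;; String mode: returns a new string, no buffer is touched.
(my-remove-vowel "something") ; returns "smthng"

;; Region mode: the text in the buffer is replaced.
(with-temp-buffer
  (insert "something")
  (my-remove-vowel nil (point-min) (point-max))
  (buffer-string)) ; returns "smthng"
```

Note that an empty string "" is non-nil in elisp, so it still selects string mode.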
Note: i use dollar sign $ as sigil for function parameter names, for easy distinction from builtin symbols. [see Variable Naming: English Words Considered Harmful]