Emacs Lisp: Automatic Code Formatting
For lisp languages, it would be nice if a programmer could press a button in emacs and have the current code block formatted by a simple lexical scan (similar to how fill-paragraph works).
I think it is not hard to write this command, but to my surprise, it is not done. I was told by one Scheme expert Taylor R Campbell (aka Riastradh, author of paredit mode) that this is non-trivial, but i couldn't believe it and maybe he misunderstood what i wanted about this command.
Here's an outline of how this would work.
Simply count the levels of nesting of parens. For example, consider this lisp code:
(defun previous-user-buffer () "Switch to the next user buffer in cyclic order." (interactive) (previous-buffer) (let ((i 0)) (while (and (string-match "^*" (buffer-name)) (< i 10)) (setq i (1+ i)) (previous-buffer) )))
Each left paren has a level of nesting, say n=0, n=1, n=2, etc. The simplest version of auto-format is to start a new line for each left paren, indented by n. So, the above code would be formatted like this (using 1 space per indent level in this example):
(defun previous-user-buffer ()
 "Switch to the next user buffer in cyclic order."
 (interactive)
 (previous-buffer)
 (let (
   (i 0))
  (while
   (and
    (string-match "^*"
     (buffer-name))
    (< i 10))
   (setq i
    (1+ i))
   (previous-buffer))))
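As a rough illustration of the rule just described, here is a minimal Emacs Lisp sketch. The name `my-naive-format-string` is made up for this example and is not part of any package. It applies the rule literally, so every nested left paren (an empty `()` included) moves to its own line, and it does nothing special for quote prefixes such as ' or #'.

```elisp
(defun my-naive-format-string (code)
  "Return CODE with a line break before every nested left paren.
Each new line is indented by the paren's nesting level, 1 space per level.
Hypothetical helper, a sketch only."
  (with-temp-buffer
    ;; Use the Emacs Lisp syntax table so that strings, comments and
    ;; escaped chars are recognized by the parser.
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert code)
    (goto-char (point-min))
    (while (search-forward "(" nil t)
      (let* ((paren-pos (1- (point)))
             ;; Parse from the start of the buffer up to the paren.
             (state (parse-partial-sexp (point-min) paren-pos))
             (depth (nth 0 state)))
        (if (or (nth 3 state)     ; inside a string
                (nth 4 state)     ; inside a comment
                (nth 5 state)     ; right after an escape char
                (zerop depth))    ; top-level paren stays where it is
            (goto-char (1+ paren-pos))
          ;; Replace the whitespace before the paren with newline + indent.
          (goto-char paren-pos)
          (skip-chars-backward " \t\n")
          (delete-region (point) paren-pos)
          (insert "\n" (make-string depth ?\s))
          (forward-char 1))))
    (buffer-string)))
```

Calling it on a string such as "(a (b (c)) (d))" puts `(b`, `(c)` and `(d)` on their own lines, indented 1, 2 and 1 spaces. Re-parsing from the start of the buffer for every paren is slow on large inputs; it is done here only to keep the sketch short.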
Now, this is probably too many short lines when compared to how lisp code is traditionally formatted. We can modify the auto-format heuristics to reduce short lines: if a complete unit of expression is less than 70 char, then render the whole expression in one line.
Here's how the code would look with this rule:
;23456789 123456789 123456789 123456789 123456789 123456789 123456789
(defun previous-user-buffer ()
 "Switch to the next user buffer in cyclic order."
 (interactive)
 (previous-buffer)
 (let ((i 0))
  (while (and (string-match "^*" (buffer-name)) (< i 10))
   (setq i (1+ i))
   (previous-buffer))))
… looks much better. I don't know how well this would work out for more complex code, but i think the idea is there. Adding to the heuristics might be special rules dealing with the doc string and other special non-regular lisp syntaxes (such as those involving special chars “ ' ` # ,@ . ;”, etc). (However, such special rules should be kept as minimal as possible.)
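The 70-char rule can be sketched in a few lines of Emacs Lisp. This is only an illustration under some assumptions: the name `my-format-form` is hypothetical, it works on an already-read form rather than on the source text (so comments are lost and `()` prints as `nil`), and it needs `proper-list-p` from Emacs 27 or later. When an expression is too long, it puts the head on the first line and every argument on its own line, which is a bit stricter than the hand-formatted example above.

```elisp
(defun my-format-form (form &optional depth)
  "Return FORM as a string, breaking only expressions of 70+ chars.
DEPTH is the nesting level, used as the indent (1 space per level).
Hypothetical helper, a sketch only."
  (let ((depth (or depth 0))
        (flat (prin1-to-string form)))
    (if (or (< (length flat) 70)          ; short enough: keep on one line
            (not (proper-list-p form)))   ; atoms and dotted pairs: never split
        flat
      (concat "("
              ;; The head stays on the line of the opening paren.
              (my-format-form (car form) (1+ depth))
              ;; Every remaining element starts a new, indented line.
              (mapconcat
               (lambda (child)
                 (concat "\n"
                         (make-string (1+ depth) ?\s)
                         (my-format-form child (1+ depth))))
               (cdr form)
               "")
              ")"))))

;; Example use: format the `while' expression from the sample code.
(princ (my-format-form
        (read "(while (and (string-match \"^*\" (buffer-name)) (< i 10)) (setq i (1+ i)) (previous-buffer))")))
```

A real tool would have to work on the text itself to keep comments and the author's spelling of things like `()`; the reader round-trip is used here only because it keeps the sketch small.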
On the whole, a simple formatting by lexical scan is not going to be as pretty as manual formatting. However, it is my opinion that if lispers adapt to such a uniform, simple, machine-produced auto-formatting, the impact on the lisp community as a whole will be tremendous. It would get rid of the “source code formatting style” literature and debates for good, because all coders would be accustomed to this machine-produced, uniform style from the time they begin to learn lisp. (Each coder can set some personal preferences for the auto-formatter if she so wishes, and re-format entire source code on the fly.) Once a language's source code is presented in a uniform style universally, it would fundamentally influence the idioms and program constructs lisp coders actually produce. (This is an advantage the Python language offers transparently.)
See also the section “Automatic, Uniform, Universal, Source Code Display” at Fundamental Problems of Lisp, for the relation between a regular syntax and source code formatting.
Some related links:
- Google group rant: http://groups.google.com/group/comp.lang.lisp/msg/5c3ad44be794ebec
- format whole buffer [2017-07-26 https://github.com/tuhdo/semantic-refactor/blob/master/srefactor-demos/demos-elisp.org ]
- An implementation by Andy Stewart, beta. http://www.emacswiki.org/emacs/ElispFormat
Note, starting with golang's auto format command gofmt in 2009, the idea of auto format became popular. As of 2021-04-03, it's in deno of JavaScript (deno fmt path), and also in the clang project for C/C++/Java/JavaScript/Objective-C/Protobuf/C# languages (clang-format).
Also note, an auto format command is not a “lint” tool or syntax checker. Lint, for example, only makes minor formatting corrections or suggestions. gofmt and deno fmt re-lay out the entire source code, whatever its original formatting was.
new package: emacs-elisp-autofmt
https://gitlab.com/ideasman42/emacs-elisp-autofmt
If you made that package robust, i'd highly recommend it. Back around 2007, i tried to pay someone some small money to implement it. It was a chinese coder, a hotshot, wrote tons of emacs packages but in general not good in my opinion. He coded it, but from what i see isn't usable.
I think what's needed is:
- Have a command to format current function.
- Have a command to format current region.
- Have a command to format current buffer.
- Have a command to format a file given file path.
- The command must be robust and 100% correct. If it screws up the code 1 time in a million, it's not usable. (Watch out for comments, strings, escapes in strings, and complex regex in strings.)
- The command must be fast. So, writing it in elisp isn't that great. Eventually, it needs to be written in golang or such.
- It must be able to run in batch and update format of the 1.5k elisp files bundled in gnu emacs. It must not be slow. (compare speed of gofmt of golang and deno of JavaScript.)
- It must support the traditional gnu elisp format style. MUST. Other format style is optional. Else, the package won't be adopted by many.
- Most important: it must be able to format a compacted long line of lisp code into multi-lines.
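The four commands in the list above could be thin wrappers around a single string-to-string formatter. The following is a sketch of that shape, not the interface of any existing package: all `my-fmt-` names are hypothetical, and `my-fmt-function` is a placeholder that defaults to `identity` (it changes nothing) until a real formatter is plugged in.

```elisp
(require 'thingatpt)

(defvar my-fmt-function #'identity
  "Function taking Lisp source as a string and returning formatted source.
Placeholder: `identity' changes nothing; plug in a real formatter here.")

(defun my-fmt-region (beg end)
  "Format the text between BEG and END with `my-fmt-function'."
  (interactive "r")
  (let* ((old (buffer-substring-no-properties beg end))
         (new (funcall my-fmt-function old)))
    ;; Touch the buffer only when the formatter changed something.
    (unless (string= old new)
      (save-excursion
        (goto-char beg)
        (delete-region beg end)
        (insert new)))))

(defun my-fmt-defun ()
  "Format the top-level form around point."
  (interactive)
  (let ((bounds (bounds-of-thing-at-point 'defun)))
    (when bounds
      (my-fmt-region (car bounds) (cdr bounds)))))

(defun my-fmt-buffer ()
  "Format the whole buffer."
  (interactive)
  (my-fmt-region (point-min) (point-max)))

(defun my-fmt-file (path)
  "Format the file at PATH in place."
  (interactive "fFile to format: ")
  (with-temp-buffer
    (insert-file-contents path)
    (my-fmt-buffer)
    (write-region nil nil path)))
```

For batch use, assuming the code above were saved in a file called my-fmt.el (a made-up name), something like `emacs --batch -l my-fmt.el --eval '(my-fmt-file "some-file.el")'` would format one file without opening a window.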
Now, here's the MOST important thing, of ALL the above items: It must do the reformat completely. E.g. If i have
(defun f () "DOCSTRING" (interactive) (let () 3 ))
it must format to
(defun f ()
  "DOCSTRING"
  (interactive)
  (let ()
    3 ))
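On the robustness requirement listed earlier, one cheap safety net is to read the code before and after formatting and refuse the result if the two lists of forms differ. Here is a minimal sketch under stated assumptions: `my-read-all-forms` and `my-same-forms-p` are hypothetical names, the check says nothing about comments (the reader drops them), and forms containing uninterned symbols would not compare as equal.

```elisp
(defun my-read-all-forms (code)
  "Return the list of all top-level forms read from the string CODE.
Signals an error if CODE contains a malformed expression."
  (with-temp-buffer
    ;; The Emacs Lisp syntax table lets `forward-comment' skip ; comments.
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert code)
    (goto-char (point-min))
    (let (forms)
      ;; Skip whitespace and comments, then read one form, until the end.
      (while (progn (forward-comment (buffer-size))
                    (not (eobp)))
        (push (read (current-buffer)) forms))
      (nreverse forms))))

(defun my-same-forms-p (before after)
  "Return non-nil if strings BEFORE and AFTER read as the same forms."
  (equal (my-read-all-forms before)
         (my-read-all-forms after)))

;; Example use: accept a reformatting only when the forms are unchanged.
(my-same-forms-p "(defun f () \"DOCSTRING\" (interactive) (let () 3 ))"
                 "(defun f ()\n  \"DOCSTRING\"\n  (interactive)\n  (let ()\n    3 ))")
```

A formatter could run this check on every region it rewrites and leave the original text alone whenever it fails.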