Docstring Convention: Python vs Emacs Lisp


Here's an interesting stylistic clash between languages. Here's an excerpt from the Python guide on docstring conventions.

Do not use the Emacs convention of mentioning the arguments of functions or methods in upper case in running text. Python is case sensitive and the argument names can be used for keyword arguments, so the docstring should document the correct argument names. It is best to list each argument on a separate line. For example:

def complex(real=0.0, imag=0.0):
    """Form a complex number.

    Keyword arguments:
    real -- the real part (default 0.0)
    imag -- the imaginary part (default 0.0)

    """
    if imag == 0.0 and real == 0.0: return complex_zero
    ...

The BDFL [Benevolent Dictator for Life] recommends inserting a blank line between the last paragraph in a multi-line docstring and its closing quotes, placing the closing quotes on a line by themselves. This way, Emacs' fill-paragraph command can be used on it.

source: 〔Docstring Conventions By David Goodger, Guido Van Rossum. @…

There are quite a few interesting aspects.

CAPS for Parameters?

Note that in Emacs's inline doc convention, a function's parameters should be written in CAPS in the docstring. See: (info "(elisp) Documentation Tips")

When a function's documentation string mentions the value of an argument of the function, use the argument name in capital letters as if it were a name for that value. Thus, the documentation string of the function eval refers to its second argument as ‘FORM’, because the actual argument name is form:

Evaluate FORM and return its value.

Here's an example:

(defun insert-register (register &optional arg)
  "Insert contents of register REGISTER.  (REGISTER is a character.)
Normally puts point before and mark after the inserted text.
If optional second arg is non-nil, puts mark before and point after.
Interactively, second arg is non-nil if prefix arg is supplied.

Interactively, reads the register using `register-read-with-preview'."

Python's convention is slightly better, because it's more intuitive and convenient. Programmers don't need to remember to type ALLCAPS in documentation.

Note: elisp is case sensitive too. In elisp, though, basically no parameter (or any identifier) is ever ALLCAPS. Lisp syntax allows hyphens in identifiers, so by convention basically ALL identifiers are all-lower-case or all-lower-case-with-hyphen. In Python, camelCase or Capfirst is common for objects, and all caps is usually for CONSTANTS.

The problem with the elisp convention is that it effectively limits identifiers to scripts whose upper-case letters readers recognize, which in practice means English letters only. For example:

(defun geometry-transform-f (ξ φ)
  "do transform Ξ and Φ …"

Note that now it's hard to understand, because most of us are not familiar with the capital forms of the Greek alphabet.
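To make the point concrete, here is a small Python sketch using only the built-in `str.upper`. The sample names and the two helper functions are made up for illustration. It shows what the CAPS convention would write for each parameter name, and whether the result can be told apart from the name itself.

```python
# Sample parameter names, made up for illustration:
# English, two Greek letters, and a Chinese character (a script with no letter case).
names = ["register", "ξ", "φ", "数"]

def caps_form(name):
    """Return the name as the CAPS convention would write it in a docstring."""
    return name.upper()

def is_distinguishable(name):
    """Return True if the CAPS form differs from the name itself."""
    return caps_form(name) != name

for n in names:
    print(n, "->", caps_form(n), "| distinguishable:", is_distinguishable(n))
```

For the Greek names, the CAPS form exists but is unfamiliar to most readers. For a script without letter case, the CAPS form is identical to the name, so the convention marks nothing.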

The disadvantage of Python's style is that a word in the documentation that happens to be the same as a parameter name becomes ambiguous. For example: “Insert contents of register REGISTER. (REGISTER is a character.)”. If it were documented in Python style, the reader can't tell whether “register” means the parameter or the ordinary word.

The best solution is actually to introduce markup for parameter references, for example:

"Insert contents of register p(register). "
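One benefit of such markup is that tools can process it. Below is a minimal Python sketch, assuming the hypothetical `p(name)` syntax from the example above. The regex, the helper names, and the sample function are all my own illustration, not an existing tool. It checks that every marked reference is a real parameter, and renders the markup Emacs-style as CAPS.

```python
import inspect
import re

# Hypothetical markup: p(name) marks a reference to the parameter `name`.
PARAM_REF = re.compile(r"\bp\((\w+)\)")

def param_refs(doc):
    """Return the parameter names referenced with p(...) in doc, in order."""
    return PARAM_REF.findall(doc)

def unknown_refs(func):
    """Return p(...) references in func's docstring that are not parameters of func."""
    doc = inspect.getdoc(func) or ""
    params = inspect.signature(func).parameters
    return [name for name in param_refs(doc) if name not in params]

def render_caps(doc):
    """Render each p(name) the Emacs way, as NAME."""
    return PARAM_REF.sub(lambda m: m.group(1).upper(), doc)

# Sample function for illustration. The docstring deliberately misspells
# the second parameter as `args` so the checker has something to catch.
def insert_register(register, arg=None):
    """Insert contents of register p(register).

    If p(args) is given, put mark before and point after.
    """

print(unknown_refs(insert_register))
print(render_caps("Insert contents of register p(register)."))
```

With markup, the plain word “register” and the parameter are no longer confused, and a renderer is free to show the parameter in CAPS, in italics, or as a link.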

Newline at the End?

Guido's Python style guide also suggests that the closing quotes be on a line by themselves (that is, the docstring text ends with a line break). Like this:

def f(x):
    """Something …
    x -- the arg.
    The End.
    """

On the other hand, Emacs's convention tells people not to do that. Like this:

(defun f (x)
  "Something …
X is ….
The End."
  ;; …
  )

Here, I'm not sure one convention is absolutely better than the other.

Python's style is more convenient (when you edit and cut lines). Computers can easily add or remove such a char when processing the docstring for rendering or any other purpose.
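Python's standard library already does this kind of cleanup. Here's a small sketch using `inspect.cleandoc`, which removes the common indentation and any blank lines at the start and end of a docstring. The functions `f` and `g` are samples for illustration, one in each style.

```python
import inspect

def f(x):
    """Something.

    x -- the arg.
    The End.
    """

def g(x):
    """Something.

    x -- the arg.
    The End."""

# The raw docstrings differ: f's has trailing whitespace after "The End.", g's does not.
print(f.__doc__ == g.__doc__)

# After cleanup, both styles give the same text.
print(inspect.cleandoc(f.__doc__) == inspect.cleandoc(g.__doc__))
```

So for tools that read docstrings through `inspect.cleandoc` (or `inspect.getdoc`, which applies the same cleanup), the trailing line break costs nothing.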

Emacs's style is more “clean”: if the docstring doesn't need a line ending, there's no reason to put one in your source code.

There are 3 ways to go about this:

The ④ is the best. Google's golang does this.

Line Length

Another interesting point is that Emacs's convention suggests using fewer than 67 chars per line. The Python docstring guide doesn't suggest this, but in practice most lines are under 80 chars, because each line needs indentation at the beginning to keep the left side aligned.

A better way is to have no line length limit, and to use the newline char (␤) for logical breaks only. Again, a computer can trivially parse and truncate lines when necessary; it should not be a human burden. Also, limiting ␤ to logical purposes makes it semantically meaningful. If ␤ is also used for formatting, a parser can't tell the two uses apart, and some of the automation power is lost.
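As a sketch of how trivial the machine side is, here's a Python example using the standard `textwrap` module. The `render` helper and the sample text are assumptions for illustration. Each line break in the source text is treated as a logical break, and the display width is chosen only at render time.

```python
import textwrap

# Sample docstring text in which each line break is a logical break
# (one paragraph per line, no hard wrapping).
doc = (
    "Insert contents of register REGISTER. (REGISTER is a character.)\n"
    "Normally puts point before and mark after the inserted text. "
    "If optional second arg is non-nil, puts mark before and point after."
)

def render(text, width):
    """Wrap each logical line to `width` columns, keeping the logical breaks."""
    return "\n".join(
        textwrap.fill(line, width=width, break_on_hyphens=False)
        for line in text.split("\n")
    )

print(render(doc, 40))
print()
print(render(doc, 67))
```

The same source text serves a 40-column window and a 67-column one. If the source were already hard-wrapped, the renderer could not tell a formatting break from a paragraph break.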
