Emacs Lisp Doc String Curly Quote Controversy
In 2015, the emacs-devel mailing list had a flame war about whether Emacs Lisp doc strings should contain curly quotes as literal Unicode characters.
“ U+201C: LEFT DOUBLE QUOTATION MARK
” U+201D: RIGHT DOUBLE QUOTATION MARK
It ended up with a thousand or so messages, spanning 4 months.
Here is how the thread titles evolved:
- Support curved quotes in doc strings. 2015-05-28
- Upcoming loss of usability of Emacs source files and Emacs. 2015-06-15 https://lists.gnu.org/archive/html/emacs-devel/2015-06/msg00202.html
- On the masking of undisplayable characters
- A simple solution to “Upcoming loss of usability ...”
- Escaping quotes in docstrings
- Please stop putting curly quotes into doc strings. 2015-09 https://lists.gnu.org/archive/html/emacs-devel/2015-09/msg00253.html
As far as I know, Richard Stallman had not participated in the thread, but in the end he killed it with a single email.
From: Richard Stallman
Subject: Please stop putting curly quotes into doc strings!
Date: Fri, 04 Sep 2015 21:14:42 -0400

Please stop inserting curly quotes into doc strings in Emacs sources.

--
Dr Richard Stallman
President, Free Software Foundation (gnu.org, fsf.org)
Internet Hall-of-Famer (internethalloffame.org)
Skype: No way! See stallman.org/skype.html.
He was not the maintainer.
One of the messages cited one of my articles.
From: Paul Eggert
Subject: Re: [Emacs-diffs] master 9ce1d38: Use curved quotes in core elisp diagnostics
Date: Tue, 18 Aug 2015 10:34:45 -0700

Bastien wrote:
> Paul Eggert writes:
>> Format strings are easier to read and use, particularly by novices, if characters typically stand for themselves.
> Did we ever receive a complaint from a novice about `...' readability?

Most novices don't bother to write bug reports -- they don't even know how to write bug reports. But yes, people occasionally gripe about the use of grave accent to quote, and this can hurt Emacs's reputation among people who may not know it better. For example, <http://wordyenglish.com/musing/typography.html> (2007) says:

"the problem with the GNU is that even today, in 2007, where curly quotes have been widely available in word processors for over a decade (and Unicode have been practical and widely available for at least 5 years...), they are still using plain ASCII hacks. (in general, GNU and the Open Source morons have like a 5 to 10 years lag in adopting technology, for reasons that are inadvertently intentional and or simply incapable)"

And here we are in 2015, with the quote problem still only partly fixed.
Paul Eggert is a long-time maintainer of GNU bison and is listed as the author in the GNU yacc man page. See man yacc, AUTHOR section. (The original Unix yacc was written by Stephen C. Johnson at Bell Labs.)
bison is a yacc copycat for the GNU project, written by Robert Corbett and Richard Stallman.
Stallman hasn't coded for something like 10 or 20 years, and now constantly makes a show of his ignorance.
The support for curly quotes, and the automatic rendering of `emacs style quote' to ‘single curly quotes’, was coded by Eggert. You can see a lot of his commits in the Emacs git repository. Apparently, after all these years, he still codes.
describe-function then type
Look at the quote chars.
Move cursor to the left of the char and Alt+x
Then, jump to the source code and look at the source code.
Here's a message that argues against using Unicode chars directly in Emacs source:
From: Dmitry Gutov
Subject: Re: [Emacs-diffs] master 9ce1d38: Use curved quotes in core elisp diagnostics
Date: Tue, 18 Aug 2015 23:47:36 +0300

On 08/18/2015 08:34 PM, Paul Eggert wrote:
> Most novices don't bother to write bug reports -- they don't even know how to write bug reports.

Bug reports are written by users who are at least a little experienced, sure, but we shouldn't assume that every such user has necessarily become accustomed to Emacs's quirks, and wouldn't call out this problem, if it were a real problem.

> But yes, people occasionally gripe about the use of grave accent to quote, and this can hurt Emacs's reputation among people who may not know it better. For example, <http://wordyenglish.com/musing/typography.html> (2007) says:

I sincerely hope the whole effort wasn't kicked off by this Xah Lee's rant. It's pretty shallow. And the author should really "know Emacs better" by now.

> "the problem with the GNU is that even today, in 2007, where curly quotes have been widely available in word processors for over a decade (and Unicode have been practical and widely available for at least 5 years...), they are still using plain ASCII hacks. (in general, GNU and the Open Source morons have like a 5 to 10 years lag in adopting technology, for reasons that are inadvertently intentional and or simply incapable)"

"morons"... yeah.

> And here we are in 2015, with the quote problem still only partly fixed.

One would have to define the "problem" first.

In 2015, the documentation markup languages (Markdown, Asciidoc, etc) support rich content (images, hyperlinks, document structure), and decoupling markup from presentation (usually through rendering into HTML). Yet here we are, not talking about any big features, and instead discussing using unicode quotes in the markup (which none of the modern markup languages do), because it's "easier" if the markup and presentation are the same.

That's a step back, if anything.
I've read all the reasons against using Unicode directly in source code. Most of them are idiotic.
One simple way to see this: in China or Japan, where you have a few thousand chars, people use them without ado.
Among Western white men, somehow, there's a sense that ASCII chars should be the only ones used in computer source code. This idea is usually held by old coders. Millennials mostly have no problem with non-ASCII chars. They grew up with them.
Here's a nice summary by Stephen J. Turnbull, who was the leader of the XEmacs project.
From: Stephen J. Turnbull
Subject: Re: [Emacs-diffs] master 9ce1d38: Use curved quotes in core elisp diagnostics
Date: Wed, 19 Aug 2015 15:31:47 +0900

Óscar Fuentes writes:

> Dmitry suggests this, and his comment about modern markup languages
> restricting themselves to ASCII is something to think about.

Not really. No chicken developed from that egg because there was no chicken to lay the egg in the first place.

By and large programmers' environments are deficient in respect of input methods, especially in the U.S., and until a few years ago solid multilingual Unicode environments weren't really available (and still aren't on Windows, if I understand Eli's descriptions correctly). So programmers (who design markup languages) restrict themselves to ASCII-based markup. It's only become reasonable to think about going beyond ASCII in the last 5 years or so (if you want to maintain fairly general appeal). And there's the counterexample of Xe[La]TeX, which in fact developed for Mac, the most complete Unicode implementation available at the time -- a single anecdote, but very suggestive IMHO.

Emacs is the perfect environment to experiment with *discoverable* *multilingual* input methods. AFAIK, they don't exist yet, *anywhere*. Apple is going backwards, even. Microsoft doesn't have them, either. The proprietary technology is quite good -- within the context of monolingual environments (which is where the money is, even in Europe the number of companies where individuals need multilingual environments is limited). But they require effort for neophytes to learn, and are less than useful for "inputting 'exotic' characters." As far as I can tell, there's nothing better out there for free software, either -- we're now on our fourth or fifth generation of new input management frameworks for GNOME and/or KDE, and *still* the most frequent n00b question on the Tokyo Linux Users Group[sic] is "I just upgraded MyDistro and now I can't input Japanese in WhateverOffice".

My Chinese students and Buddhist scholar friends all use Macs because it's very easy to switch among input methods (Chinese, Japanese, and Sanskrit are radically different -- it's sort of possible to share an input method between Chinese and Japanese, but it's very painful). But all of these methods are monolingual, and must be learned separately (or "taught", as most "learn" the user's habits, changing priorities in the dictionaries and storing common sequences of words for "predictive translation"). Emacs at least has Quail, giving language flexibility as good or better than Apple, although the input methods themselves are static, so aren't as user-friendly as the proprietary ones that "learn" the users' habits. And (one small step for Emacs, one giant step for mankind) Quail methods are self-documenting (although again discoverability needs to be improved for the purpose of "typing 'exotic' characters").

> I admit that I'm intrigued by your plan about how this change will
> initiate an evolution on Emacs input system that will make easier to
> type exotic characters (defining "exotic" by "something that it is
> infrequent in your daily usage.")

By giving people an itch they want to scratch. Most people will just cut'n'paste or add ad hoc keybindings for the characters they need. Some people will do more, and sooner or later one of them will come up with a much better way to do input methods. It's not obvious to me what that will be, and it's probably useless to ask Paul what it will be too. David K pointed out that there are some useful ideas in x-symbol. That might be one place to look.

Also, besides input methods, it will likely lead to improvements in other technologies such as searching (adding character classes of "cognates" such as ` and ‘, for example -- this is useful for repertoires like Japanese which has about a dozen variants on open parenthesis more or less commonly used in text, as well as a pile of numeral variants used for paragraph numbering, and the like). Those opposed to the change will cry YAGNI, and that's true -- if you live in an 8-bit world anyway, you just can't afford that kind of redundancy. But like it or not, the world is now mostly Unicode and that will only increase. Japanese is probably the most perverse character set in existence, but I believe Chinese and Korean also have similar issues of many classes of characters that have redundant functionality, and it shows up in other places (eg, arrows and emoticons).

> Maybe describing the specific user-visible improvements that this
> change will help to bring into reality would buy you more support.

The user-visible improvements have been described and are easily visible to the eye desiring to see them. Tastes just differ here; the people who don't like the change see little to no improvement, and IIUC Drew even considers it a clear step backward aesthetically.
from Stephen J. Turnbull http://lists.gnu.org/archive/html/emacs-devel/2015-08/msg00676.html
It was Alan Mackenzie who insisted that doc strings should be pure ASCII. Then, after hundreds of flame war messages, it was Richard Stallman who sealed the damage. RMS was not an Emacs maintainer at the time, and supposedly he did not have the authority for the final decision on anything non-political. The maintainer was Stefan Monnier.
What is the problem with using some form of encoding for non-ASCII chars?
Encoded forms are unreadable when you have lots of them, and they add the complexity of a transformation step.
e.g. in math, → ∑ α etc. become ＆rarr; ＆sum; ＆alpha; etc in HTML. Or worse, \rightarrow \sum \alpha in TeX escapes, or \u2192 \u2211 \u03b1 in many langs.
Or consider European languages, with é and others. Basically, when you have several of those in a sentence, the encoded form is not practical. You get unreadability, plus the complexity of conversion, which easily goes wrong, e.g. text encoded or decoded twice.
Imagine if Spanish speakers had to write niña as ni＆ntilde;a or ni\x00f1a. Or if French speakers had to read télé as t＆eacute;l＆eacute; or t\x00e9l\x00e9. That is what programmers who insist on ASCII-only source code are asking.
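Here is a small Emacs Lisp sketch of both problems (an illustration, not taken from the thread). It assumes the file is read as UTF-8, and the function names are made up for this example. The first function shows that the escaped and the literal spelling denote the same string, so the escape buys nothing except unreadability. The second mimics a conversion step done with the wrong assumption: it encodes the text as UTF-8 and then decodes the bytes as Latin-1.

```elisp
;; Illustration only; the function names are made up for this example.

(defun demo-escaped-equals-literal ()
  "Return t if the escaped and the literal spellings are the same string."
  (and (string= "ni\u00f1a" "niña")
       (string= "t\u00e9l\u00e9" "télé")))

(defun demo-wrong-decode (text)
  "Encode TEXT as UTF-8, then decode the resulting bytes as Latin-1.
This mimics a conversion step applied with the wrong assumption."
  (decode-coding-string (encode-coding-string text 'utf-8) 'latin-1))

(demo-escaped-equals-literal)   ; both spellings are the same string
(demo-wrong-decode "niña")      ; the ñ comes back as two junk characters
```

With literal characters in the source there is no conversion step to get wrong in the first place.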
Never make your code less readable simply out of fear that some programs might not handle non-ASCII characters properly. If that happens, those programs are broken and they must be fixed.
Here's how the curly quote complexity fucks things up.
Before, you could write things literally. In particular, symbol values could be written as
'symbol, and they displayed as is:
Now, after the render layer fuckup by Alan Mackenzie, it looks like this:
If you want the APOSTROPHE ' character as is, you have to either escape it (written as \\=' in the doc string source), or tell users to do
(setq text-quoting-style 'straight)
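To see the render layer at work without opening a help buffer, here is a minimal sketch. It assumes Emacs 25 or later, where substitute-command-keys applies the same quote translation that help uses. The function name demo-quote-styles and the sample string are made up for illustration.

```elisp
;; Sketch only. demo-quote-styles is a made-up name; assumes Emacs 25 or later.
(defun demo-quote-styles (doc)
  "Return an alist of DOC as rendered under each quoting style."
  (mapcar (lambda (style)
            ;; text-quoting-style is a special variable, so this binding
            ;; is visible to substitute-command-keys.
            (let ((text-quoting-style style))
              (cons style (substitute-command-keys doc))))
          '(curve straight grave)))

;; The sample doc string uses \\=' to keep one apostrophe literal.
(demo-quote-styles "Set it to `foo' or to \\='bar.")
```

Under curve the `foo' pair is rendered as ‘foo’, under straight it becomes 'foo', and under grave it is left as typed. The escaped apostrophe before bar stays a plain ' in all three.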
- rms Stole Emacs from Gosling
- 2020 Bozhidar Batsov RuboCop Incident
- 2019 rms Resigned from FSF
- 2019 rms at Microsoft
- 2018 rms is a Tyrant
- 2018 Linus: Respect is Earned Not Given
- 2017 rms Pushing for GPL3, Kicked Out Free Software from Free Software Platform
- 2016 Elisp Doc String Curly Quote Controversy
- 2016 rms Removes Color Emoji on Mac Emacs
- 2017 Language Server Protocol (LSP) Kills Elisp
- 2016 Ugly Redisplay Internals Hack
- 2015 rms: What's magit?
- 2013 Rants on Emacs Visual Lines by Don Hopkins, Mark Crispin
- 2013 rms Wants Emacs to be Word Processor
- 2013 How Much Donation FSF Get
- 2012 rms on Open Source
- 2012 Daniel Weinreb Died
- 2011 rms Speech Requirement
- 2007 Daniel Weinreb Rebuttal to rms's Lie
- Young rms on Software Freedom
The Emacs Cult
- 2010 Emacs Dev Inefficiency
- Emacs Dev Inefficiency, Emacs Web 2?
- 2001 Emacs and XEmacs Schism
- 2007 Emacs vs XEmacs
- 2008 Problems of Emacswiki
- 2011 Emacs vs Windows Notepad
- 2011 Emacs Undo Cult Problem
- 2010 Have You Read Emacs Manual