Richard Stallman, Emacs Tyrant (2018-03)

By Xah Lee.

Richard Stallman should be kicked out of Emacs development.

He has not written code for 10 years, and yet he has become arrogant, selfish, and tyrannical, and he has no idea what is going on in programming.

Here is the latest series of messages from the emacs-devel mailing list.

If you are impatient, here's my summary:


https://lists.gnu.org/archive/html/emacs-devel/2018-03/msg00014.html

From: Daniel Colascione
Subject: Let's make the GC safe and iterative (Was: Re: bug#30626)
Date:   Thu, 1 Mar 2018 15:22:39 -0800
User-agent:     Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0
Noam mentioned that I should make a new thread for this proposal, so I'm posting an edited version of my original message.

tl;dr: we should be able to make the GC non-recursive with minimal overhead, solving the “Emacs crashed because we ran out of stack space in GC” problem once and for all.

On 02/27/2018 10:08 AM, Eli Zaretskii wrote:
What can we do instead in such cases?  Stack-overflow protection
cannot work in GC, so you are shooting yourself in the foot by
creating such large recursive structures.  By the time we get to GC,
where the problem will happen, it's too late, because the memory was
already allocated.

Does anyone have a reasonable idea for avoiding the crash in such
programs?

We need to fix GC being deeply recursive once and for all. Tweaking stack sizes on various platforms and trying to spot-fix GC for the occasional deeply recursive structure is annoying. Here's my proposal:

I. NAIVE APPROACH

Turn garbage_collect_1 into a queue-draining loop, initializing the object queue with the GC roots before draining it. We'll make mark_object put an object on this queue, turning the existing mark_object code into a mark_queued_object function.

garbage_collect_1 will just call mark_queued_object in a loop; mark_queued_object can call mark_object, but since mark_object just enqueues an object and doesn't recurse, we can't exhaust the stack with deep object graphs. (We'll repurpose the mark bit to mean that the object is on the to-mark queue; by the time we fully drain the queue, just before we sweep, the mark bit will have the same meaning it does now.)
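
To make the queue-draining idea above concrete, here is a minimal sketch in C. It is not from the thread and it is not Emacs source: `struct obj`, `struct mark_queue`, `mark_from_roots`, and `demo_deep_list` are hypothetical stand-ins, and the queue is a single pre-allocated array rather than the block list the message goes on to describe. Only the names `mark_object` and `mark_queued_object` are borrowed from the proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy heap object (an assumption, not a Lisp_Object): two children, like a cons. */
struct obj {
  struct obj *car;
  struct obj *cdr;
  bool marked;   /* set when the object is put on the to-mark queue */
};

/* Pre-allocated queue. Each object is enqueued at most once, so a
   capacity of one slot per object is always enough. */
struct mark_queue {
  struct obj **slots;
  size_t head, tail, cap;
};

/* Enqueue-only mark_object: it never recurses. */
static void mark_object(struct mark_queue *q, struct obj *o)
{
  if (o == NULL || o->marked)
    return;
  o->marked = true;
  if (q->tail == q->cap)
    abort();                      /* cannot happen if cap >= object count */
  q->slots[q->tail++] = o;
}

/* Scan one dequeued object; its children are only enqueued. */
static void mark_queued_object(struct mark_queue *q, struct obj *o)
{
  mark_object(q, o->car);
  mark_object(q, o->cdr);
}

/* Stand-in for the mark phase: seed with roots, then drain the queue. */
static size_t mark_from_roots(struct mark_queue *q, struct obj **roots, size_t nroots)
{
  size_t scanned = 0;
  for (size_t i = 0; i < nroots; i++)
    mark_object(q, roots[i]);
  while (q->head < q->tail) {
    mark_queued_object(q, q->slots[q->head++]);
    scanned++;
  }
  return scanned;
}

/* Demo: a circular list n cells long (n >= 1). A recursive marker would
   need about n stack frames; this one uses constant stack.
   Returns the number of objects scanned, or 0 if any cell was missed. */
static size_t demo_deep_list(size_t n)
{
  struct obj *cells = calloc(n, sizeof *cells);
  struct obj **slots = malloc(n * sizeof *slots);
  if (cells == NULL || slots == NULL)
    abort();
  for (size_t i = 0; i + 1 < n; i++)
    cells[i].cdr = &cells[i + 1];
  cells[n - 1].cdr = &cells[0];   /* cycle: the mark bit stops re-enqueueing */

  struct mark_queue q = { slots, 0, 0, n };
  struct obj *root = &cells[0];
  size_t scanned = mark_from_roots(&q, &root, 1);

  bool all_marked = true;
  for (size_t i = 0; i < n; i++)
    all_marked = all_marked && cells[i].marked;
  free(slots);
  free(cells);
  return all_marked ? scanned : 0;
}
```

Calling `demo_deep_list(1000000)` walks a million-cell list with a flat loop, so the depth of the object graph costs queue slots instead of C stack.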

We can't allocate memory to hold the queue during GC, so we'll have to pre-allocate it. We can implement the queue as a list of queue blocks, where each queue block is an array of 16k or so Lisp_Objects. During allocation, we'll just make sure we have one Lisp_Object queue-block slot for every non-self-representing Lisp object we allocate.

Since we know that we'll have enough queue blocks for the worst GC case, we can have mark_object pull queue blocks from a free list, aborting if for some reason it ever runs out of queue blocks. (The previous paragraph guarantees we won't.) garbage_collect_1 will churn through these heap blocks and place each back on the free list after it's called mark_queued_object on every Lisp_Object in the queue block.

In this way, in non-pathological cases of GC, we'll end up using the same few queue blocks over and over. That's a nice optimization, because we can MADV_DONTNEED unused queue blocks so the OS doesn't actually have to remember their contents.
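
The queue-block bookkeeping described above could look roughly like the following sketch. Everything here is an assumption made for illustration: `struct queue_block`, `struct block_queue`, `reserve_slot_for_new_object`, `queue_push`, `queue_pop`, and `demo_peak_blocks` are hypothetical names, the block size is shrunk to 4 slots so the behavior is easy to follow (the message suggests about 16k), and the `MADV_DONTNEED` step is only noted in a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { QUEUE_BLOCK_SLOTS = 4 };   /* tiny for illustration only */

struct queue_block {
  void *slots[QUEUE_BLOCK_SLOTS];
  size_t used;                    /* slots written so far */
  size_t next_read;               /* next slot to hand out */
  struct queue_block *next;
};

struct block_queue {
  struct queue_block *free_list;  /* pre-allocated, currently idle blocks */
  struct queue_block *head;       /* block being drained */
  struct queue_block *tail;       /* block being filled */
  size_t objects;                 /* objects allocated so far */
  size_t reserved_slots;          /* slots across all blocks */
  size_t in_use, peak_in_use;     /* statistics for the demo */
};

/* Called by the allocator, outside GC: keep one slot per object in reserve. */
static void reserve_slot_for_new_object(struct block_queue *q)
{
  if (++q->objects > q->reserved_slots) {
    struct queue_block *b = calloc(1, sizeof *b);
    if (b == NULL)
      abort();
    b->next = q->free_list;
    q->free_list = b;
    q->reserved_slots += QUEUE_BLOCK_SLOTS;
  }
}

/* Called during GC: never allocates, only takes blocks off the free list. */
static void queue_push(struct block_queue *q, void *o)
{
  if (q->tail == NULL || q->tail->used == QUEUE_BLOCK_SLOTS) {
    struct queue_block *b = q->free_list;
    if (b == NULL)
      abort();                    /* reservation invariant was violated */
    q->free_list = b->next;
    b->used = b->next_read = 0;
    b->next = NULL;
    if (q->tail != NULL)
      q->tail->next = b;
    else
      q->head = b;
    q->tail = b;
    if (++q->in_use > q->peak_in_use)
      q->peak_in_use = q->in_use;
  }
  q->tail->slots[q->tail->used++] = o;
}

/* Precondition: q->head != NULL. A drained block goes back on the free
   list (a real implementation could MADV_DONTNEED idle blocks there). */
static void *queue_pop(struct block_queue *q)
{
  struct queue_block *b = q->head;
  void *o = b->slots[b->next_read++];
  if (b->next_read == b->used) {
    q->head = b->next;
    if (q->head == NULL)
      q->tail = NULL;
    b->next = q->free_list;
    q->free_list = b;
    q->in_use--;
  }
  return o;
}

/* Demo for n >= 1 objects: enqueue them all at once (worst case) or one
   at a time as a chain would. Returns the peak number of blocks in use. */
static size_t demo_peak_blocks(size_t n, int all_at_once)
{
  static int dummy;
  struct block_queue q = {0};
  for (size_t i = 0; i < n; i++)
    reserve_slot_for_new_object(&q);

  size_t pushed = 0;
  do {
    queue_push(&q, &dummy);
    pushed++;
  } while (all_at_once && pushed < n);

  while (q.head != NULL) {
    queue_pop(&q);
    if (pushed < n) {
      queue_push(&q, &dummy);
      pushed++;
    }
  }

  size_t peak = q.peak_in_use;
  while (q.free_list != NULL) {   /* release the reserve */
    struct queue_block *b = q.free_list;
    q.free_list = b->next;
    free(b);
  }
  return peak;
}
```

With 10 objects and 4-slot blocks, the worst case touches all 3 reserved blocks, while a chain-shaped traversal keeps reusing a single block, which is the "same few queue blocks over and over" effect the message describes.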

In this way, I think we can make the current GC model recursion-proof without drastically changing how we allocate Lisp objects. The additional memory requirements should be modest: it's basically one Lisp_Object per Lisp object allocated.

II. ELABORATION

The naive version of this scheme needs about 4.6MB of overhead on my current 20MB Emacs heap, but it should be possible to reduce the overhead significantly by taking advantage of the block allocation we do for conses and other types --- we can put whole blocks on the queue instead of pointers to individual block parts, so we can get away with a much smaller queue.

It's also interesting to note that we don't need separate queue blocks to put a block on the queue, as we do if we want to enqueue individual Lisp_Object pointers. Instead, we can add to each block type a pointer to the next block *on the to-be-marked queue* and a bitmask yielding the positions within that block that we want to mark.

For example, cons_block right now looks like this:

struct cons_block
{
  /* Place `conses' at the beginning, to ease up CONS_INDEX's job.  */
  struct Lisp_Cons conses[CONS_BLOCK_SIZE];
  bits_word gcmarkbits[1 + CONS_BLOCK_SIZE / BITS_PER_BITS_WORD];
  struct cons_block *next;
};

We'd turn it into something like this:

struct cons_block
{
  /* Place `conses' at the beginning, to ease up CONS_INDEX's job.  */
  struct Lisp_Cons conses[CONS_BLOCK_SIZE];
  bits_word gcmarkbits[1 + CONS_BLOCK_SIZE / BITS_PER_BITS_WORD];
  bits_word scan_pending[1 + CONS_BLOCK_SIZE / BITS_PER_BITS_WORD];
  struct cons_block *next;
  struct cons_block *next_scan_pending;
};

When we call mark_object on a cons, we'll look up its cons_block and look up the cons in gcmarkbits. If we find the cons mark bit set, we're done. Otherwise, we look at the scan_pending bit for the cons cell. If _that's_ set, we're also done. If we find the scan_pending bit unset, however, we set it, and then look at next_scan_pending. If that's non-zero, we know the block as a whole is enqueued for scanning, and we're done. If *that's* zero, then we add the whole block to the to-be-scanned queue.

We'll modify garbage_collect_1 to drain both the Lisp_Object queue I described in the last section (which we still need for big objects like buffers) *and* the queue of blocks pending scanning. When we get a cons block, we'll scan all the conses with scan_pending bits set to one, set their gcmarkbits, and remove the cons block from the queue.

That same cons block might make it back onto the queue later if someone calls mark_object for one of its conses we didn't already scan, but that's okay. Scanning scan_pending should be very cheap, especially on modern CPUs with bit-prefix-scan instructions.
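
The block-level scheme in the last few paragraphs can be sketched as follows. This is a toy, not Emacs code: the heap is a fixed array of four 64-cons blocks so that one `uint64_t` serves as each bitmap, the block of a cons is found by array arithmetic instead of address alignment, and `mark_cons`, `drain_scan_queue`, `nth_cons`, and `demo_mark_chain` are hypothetical names. One further assumption: a sentinel block, `queue_end`, terminates the pending list so that a non-NULL `next_scan_pending` can mean "this block is queued" even for the last block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { CONS_BLOCK_SIZE = 64, NBLOCKS = 4 };

struct cons { struct cons *car, *cdr; };

struct cons_block {
  struct cons conses[CONS_BLOCK_SIZE];   /* first, so indexing is easy */
  uint64_t gcmarkbits;                   /* cons has been scanned */
  uint64_t scan_pending;                 /* cons reached, not scanned yet */
  struct cons_block *next_scan_pending;  /* non-NULL while block is queued */
};

static struct cons_block heap[NBLOCKS];
static struct cons_block queue_end;              /* sentinel (assumption) */
static struct cons_block *scan_queue = &queue_end;

static void mark_cons(struct cons *c)
{
  if (c == NULL)
    return;
  struct cons_block *b = &heap[(size_t)((char *)c - (char *)heap) / sizeof heap[0]];
  uint64_t bit = UINT64_C(1) << (c - b->conses);
  if (b->gcmarkbits & bit)
    return;                        /* already scanned */
  if (b->scan_pending & bit)
    return;                        /* already waiting to be scanned */
  b->scan_pending |= bit;
  if (b->next_scan_pending != NULL)
    return;                        /* whole block is already queued */
  b->next_scan_pending = scan_queue;   /* LIFO order, for brevity */
  scan_queue = b;
}

static void drain_scan_queue(void)
{
  while (scan_queue != &queue_end) {
    struct cons_block *b = scan_queue;
    scan_queue = b->next_scan_pending;
    b->next_scan_pending = NULL;   /* dequeued; mark_cons may re-queue it */

    uint64_t pending = b->scan_pending;
    b->scan_pending = 0;
    b->gcmarkbits |= pending;
    /* Portable bit loop; real code would use a bit-scan instruction. */
    for (int i = 0; i < CONS_BLOCK_SIZE; i++)
      if ((pending >> i) & 1) {
        mark_cons(b->conses[i].car);
        mark_cons(b->conses[i].cdr);
      }
  }
}

/* k-th cons, striped across blocks so a chain keeps hopping between them. */
static struct cons *nth_cons(size_t k)
{
  return &heap[k % NBLOCKS].conses[k / NBLOCKS];
}

/* Demo: link the first `reachable` conses (1..256) into a chain, mark from
   its head, and return how many mark bits ended up set. */
static int demo_mark_chain(size_t reachable)
{
  memset(heap, 0, sizeof heap);
  scan_queue = &queue_end;
  for (size_t k = 0; k + 1 < reachable; k++)
    nth_cons(k)->cdr = nth_cons(k + 1);

  mark_cons(nth_cons(0));
  drain_scan_queue();

  int marked = 0;
  for (int b = 0; b < NBLOCKS; b++)
    for (int i = 0; i < CONS_BLOCK_SIZE; i++)
      marked += (int)((heap[b].gcmarkbits >> i) & 1);
  return marked;
}
```

In `demo_mark_chain(200)` each block is queued, drained, and queued again many times as the chain hops between blocks, yet no memory beyond the two bitmaps and one pointer per block is needed, and the 56 unreachable conses keep clear mark bits for the sweep.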

Under this approach, the reserved-queue-block scheme would impose an overhead of somewhere around 1MB on the same heap. (I think it'd actually be a bit smaller.) Conses, strings, and vectors are the overwhelming majority of heap-allocated objects, and thanks to block packing, we'd get bookkeeping for them for practically free. This amount of overhead seems reasonable. I think we may end up actually using less memory than we would for recursive mark_object stack invocation.

This scheme interacts well with the portable dumper too. pdumper already uses a big bit array to store mark bits; we'd just add another array for its scan_pending. We'd basically treat the entire pdumper region as one big cons_block for GC purposes.

What do you think? I think this approach solves a longstanding fiddly problem with Emacs GC without too much disruption to the internals. It also paves the way for concurrent or generational GC if we ever want to implement these features.

https://lists.gnu.org/archive/html/emacs-devel/2018-03/msg00089.html

From: Richard Stallman
Subject: What improvements would be truly useful?
Date:   Mon, 05 Mar 2018 08:11:38 -0500
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

An improvement in GC wouldn't be a bad thing, but it may not be worth
the effort.  It is likely to lead to many bugs that would be hard to
fix.  Once working, it would not make much difference to users.
It would only permit some operations on larger problems than now.

When I was working at the AI Lab, one of the older programmers told me
that hackers are often eager to make improvements of this sort: which
make the program better in an abstract sense, but not better for
users.  I took that advice to heart.  Now I pass it on.

Changing Emacs to handle indentation and alignment with
variable-width fonts would be an important and useful change.
Certain kinds of use would make sense, which currently don't.

It would be a big step towards making Emacs do the job of
a word processor, which is what I would like to see some day.
Imagine if you could edit nicely formatted documents directly
with Emacs, instead of using LibreOffice?  LibreOffice is
fine to use, it is free software, but it isn't Emacs.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
Skype: No way! See https://stallman.org/skype.html.

https://lists.gnu.org/archive/html/emacs-devel/2018-03/msg00104.html

From: Daniel Colascione
Subject: Re: What improvements would be truly useful?
Date:   Mon, 5 Mar 2018 11:18:29 -0800
User-agent:     Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0

On 03/05/2018 10:04 AM, Eli Zaretskii wrote:

{
From: Rostislav Svoboda
Date: Mon, 5 Mar 2018 18:32:06 +0100
Cc:  ,
        Ken Raeburn ,
        Paul Eggert

I know it's a bitter pill to swallow, but let's face it - do we think,
our beloved Emacs will ever be able to display anything like the
examples from https://threejs.org ?
}

It can already, with XEmbed --- same way Office does it, with OLE. ☺ It's a nice trick, but I don't think that sort of pretty display helps anyone accomplish a task.

Personally, I don't think word processing is a good focus for Emacs.
There are two groups of people who want to prepare documents: those who want a WYSIWYG system and those who don't.
The former group is well-served by LibreOffice, which is a free and powerful office suite.
The latter group is well-served by Emacs with its extensive LaTeX integration.

Instead of focusing on areas where we're weak and will realistically never catch up with projects dedicated to the task, we should focus on improving existing strengths.

(1) We should be the best editor around for text and program code.
There's an opportunity to do much better than the mainstream.
Conventional IDE groups put a ton of brute force effort into tuning IDEs for specific coding styles in specific environments.
We can be more generic and more flexible, ultimately offering more power and greater efficiency for people willing to invest time into learning the system.

(a) We should do a better job of integrating interesting ideas like undo-tree, ace-jump-mode, yasnippet, helm, and others into the core and enabling them by default.
I don't think we need to be as conservative as we've been historically, and I think there's still a lot of room to improve the core editing mechanics.

(b) There are long-standing defects that are minor in the scheme of things, but that tend to create a poor impression.
In particular, long-line handling is a sore point, as is support for very large files.
For long lines: I haven't sufficiently studied what the necessary redisplay hacks would look like.

For large files: by moving from a gap buffer to a rope representation for buffers, we can partially use memory-mapped backing storage, and even when we do need private, modifiable memory for editing, we can allocate only when we immediately need and not have to move the gap around through humongous amounts of main memory.
Such a system would not only improve our support for humongous files, but would also make a 32-bit Emacs capable of editing files larger than its address space.

(c) We need a project system.
There's been some good work in this area, but there's too much fragmentation, which hinders productive integration.
For example, there is no default keybinding to jump, in C++, between an “implementation” and a “header” file, and that's because Emacs by default has no idea what either concept means and there are something like, what, a dozen(?) different ways to teach it the concept.

(d) We need better fontification and indentation.
We don't have good language coverage, and support for more obscure languages is sometimes spotty, limited to fontifying comments, strings, and keywords.
Keeping up with language development is a constant struggle, and it's easy to introduce odd bugs, infloops, and so on in ad-hoc parsing code, especially when this code needs to be simultaneously fast, incremental, and error tolerant.

I'm now wondering whether the manual approach is wrong.
We've been using it along with everyone else, but there might be better options these days.
It's a somewhat radical idea: let's use a machine learning model to classify program tokens, then apply manual fontification and indentation rules to the resulting token classifications.
We'd train the model by taking labeled program text (say, from Savannah or GitHub, run through a parser), then perturb the program text, rewarding the model for retaining token labels under various editing and truncation operations.

In this way, we'd learn an approximate model for understanding even damaged program text without having to manually write a lot of code.
Tons and tons of stuff in cc-mode is heuristics for dealing with damaged program text, and I think we could learn this understanding instead.
The system is equivalent in power to anything we could write by hand: LSTMs and other systems are Turing-complete.
This way, to add support for a new language, you'd just feed Emacs examples.
I imagine you might even be able to gently correct the system when it misunderstands and improve the overall accuracy.

But it's probably a crazy idea. ☺

(2) Startup should be instant in all cases.
Now that we have a portable dumper, we should automatically dump the results of user initialization and regenerate the dump when we detect that something's changed.
This way, users perceive Emacs as a fast, modern system.
I know that the daemon exists and that it's possible to optimize even a customized initialization so that it's fast even without hacks (I do), but users shouldn't have to go to the trouble of this kind of manual setup and tweaking.

(3) Mobile support.
One of Emacs' strengths is its portability, and this portability comes with a much lower footprint than other approaches.
Desktop and laptop sales have been declining for six years.
There are lots of tools built on Emacs that would be useful (like gnus, org-mode, etc.), except that I can't use them on mobile, which means I end up choosing other tools entirely for these tasks.

There is no reason that Emacs couldn't be a good Android citizen.
A good Java<->elisp bridge would let us transparently use various system APIs.
While we would probably need mobile-specific GUI code (because the plain buffer interface wouldn't be suitable for most tasks, at least without mobile-desktop convergence), all the logic and back-end glue would work on mobile as well as it works anywhere else, greatly simplifying the task of building general-purpose tools like org-mode that really ought to work anywhere.

https://lists.gnu.org/archive/html/emacs-devel/2018-03/msg00121.html

From: Richard Stallman
Subject: Re: What improvements would be truly useful?
Date:   Mon, 05 Mar 2018 18:05:36 -0500
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > Personally, I don't think word processing is a good focus for Emacs.

I want to do my word processing in Emacs,
so please stop interfering.  You, personally, can work on other
areas if that area doesn't interest you.

The improvements you suggested for editing programs are also
desirable.  We want to improve what Emacs can do in editing programs.
One of the improvements we need is to have a GNU language server and
connect it to Emacs.  The best proposal for how to make a GNU language
server is to do it by building on GDB.

However, Emacs is also meant for editing textual documents.
To make that easier, we need more support for editing and
saving formatted documents with various fonts.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
Skype: No way! See https://stallman.org/skype.html.

https://lists.gnu.org/archive/html/emacs-devel/2018-03/msg00122.html

From: dancol
Subject: Re: What improvements would be truly useful?
Date:   Mon, 5 Mar 2018 15:16:41 -0800
User-agent:     SquirrelMail/1.4.23 [SVN]
> [[[ To any NSA and FBI agents reading my email: please consider    ]]]
> [[[ whether defending the US Constitution against all enemies,     ]]]
> [[[ foreign or domestic, requires you to follow Snowden's example. ]]]
>
>   > Personally, I don't think word processing is a good focus for Emacs.
>
> I want to do my word processing in Emacs,
> so please stop interfering.  You, personally, can work on other
> areas if that area doesn't interest you.

You are advocating for your suggested improvements. I am advocating for
mine. Neither of us is "interfering" with the other.

https://lists.gnu.org/archive/html/emacs-devel/2018-03/msg00189.html

From: Richard Stallman
Subject: Re: What improvements would be truly useful?
Date:   Tue, 06 Mar 2018 15:54:36 -0500
[[[ To any NSA and FBI agents reading my email: please consider    ]]]
[[[ whether defending the US Constitution against all enemies,     ]]]
[[[ foreign or domestic, requires you to follow Snowden's example. ]]]

  > You are advocating for your suggested improvements. I am advocating for
  > mine.

Emacs is not an independent project, and it isn't governed by its
contributors.  I'm the head of the GNU Project, and that includes
Emacs.

You, as a contributor, can advocate a certain decision, which means
you present arguments why I should approve it.  I don't need to
advocate a decision in that sense.

I leave most technical decisions up to the contributors, including
you, because for most of the questions I don't have any special
preference of my own.  For those questions, I'm happy with whatever
works, and I know the contributors can figure out what works.

However, making progress on Emacs as a word processor is one of my
specific goals.  This is what Emacs needs to do to be useful in all
the ways it should be useful.

It will only take a few more features to make Emacs start to be useful
as a word processor.  Once we can do proper formatting of paragraphs
with variable-width text, with a few kinds of alignment, and we can
save these in files, we will be able to use it for writing letters and
handouts, instead of LibreOffice.

I am really looking forward to this.  LibreOffice is free software,
and it's ethical, but it isn't Emacs.

--
Dr Richard Stallman
President, Free Software Foundation (https://gnu.org, https://fsf.org)
Internet Hall-of-Famer (https://internethalloffame.org)
Skype: No way! See https://stallman.org/skype.html.

https://lists.gnu.org/archive/html/emacs-devel/2018-03/msg00195.html

From: Daniel Colascione
Subject: Re: What improvements would be truly useful?
Date:   Tue, 6 Mar 2018 13:15:45 -0800
User-agent:     Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0

On 03/06/2018 12:54 PM, Richard Stallman wrote:

{
However, making progress on Emacs as a word processor is one of my specific
goals. This is what Emacs needs to do to be useful in all the ways it should
be useful.
}

If you want to add word processing features to Emacs, I suggest implementing the necessary features yourself instead of suggesting that discussion of other priorities is somehow "interfering".

Code wins arguments. `eval' tends not to work very well when you pass it a block of prose about how Emacs should be a word processor.

{
It will only take a few more features to make Emacs start to be useful
as a word processor. Once we can do proper formatting of paragraphs with
variable-width text, with a few kinds of alignment, and we can save these
in files, we will be able to use it for writing letters and handouts, instead
of LibreOffice.
}

Sure. It'd be nice if Emacs were suitable for that task. I don't see the urgency, considering that plenty of equally free packages exist.