The Importance of Terminology's Quality In Computer Languages

By Xah Lee. Date:

I'd like to introduce a blog post by Stephen Wolfram on the design process of Mathematica. In particular, he touches on the importance of naming functions.

[Ten Thousand Hours of Design Reviews By Stephen Wolfram. At , accessed on 2013-10-03 ]

The issue is fitting today, given our recent discussion of the “closure” terminology, as well as the jargon “lisp 1 vs lisp 2” (single-meaning space vs multi-meaning space), “tail recursion”, “currying”, “lambda”, that perennially crops up here and elsewhere in computer language forums amid wild misunderstanding and brouhaha.

The functions in Mathematica are usually very well named, in contrast to most other computing languages. In particular, the naming in Mathematica, as Stephen Wolfram indicated in the blog post above, takes the perspective of capturing the essence, or mathematical essence, of the keyword in question (as opposed to naming it according to convention, which often comes from historical happenstance). When a thing is well named from the perspective of what it actually “mathematically” is, as opposed to historical developments, a great deal of confusion is avoided.

Let me give a few examples.

• “lambda”, widely used as a keyword in functional languages, is named just “Function” in Mathematica. The “lambda” came to be called so in the field of symbolic logic due to the use of the Greek letter lambda “λ” by happenstance. The word does not convey what it means. The name “Function”, by contrast, stands for the mathematical concept of “function” as is.

• {Module, Block, With} in Mathematica correspond to lisp's various “let” keywords (such as let and let*). Lisp's keyword “let” is based on the English word “let”. That word is one of the English words with a multitude of meanings. If you look up its definition in a dictionary, you'll see that it means many disparate things.



Mathematica's choice of Module and Block is based on the idea that each builds a self-contained segment of code. (However, the choice of Block as a keyword here isn't perfect, since the word also has meanings like “obstruct; jam”.)

• Functions that take elements out of a list are variously named {First, Rest, Last, Extract, Part, Take, Select, Cases, DeleteCases, …}, as opposed to {car, cdr, filter, pop, push, shift, unshift} in lisps, perl, and other languages.
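The first and third examples above can be sketched in a few lines of Python, a language that happens to use the keyword `lambda`. This is only an illustration: the helper names `first`, `rest`, `last`, and `select` are hypothetical wrappers written for this sketch (they are not Python built-ins), chosen to mirror the Mathematica names. For comparison, the squaring function would be written `Function[x, x^2]` in Mathematica.

```python
# `lambda` merely creates a function without a name; the Greek letter
# itself carries no meaning beyond that.
square_lambda = lambda x: x * x

def square_function(x):
    """The same function, written with `def`."""
    return x * x

# Hypothetical helpers with self-describing names (not Python built-ins).
def first(items):
    """Return the first element (lisp: car)."""
    return items[0]

def rest(items):
    """Return everything after the first element (lisp: cdr)."""
    return items[1:]

def last(items):
    """Return the final element."""
    return items[-1]

def select(items, predicate):
    """Keep the elements for which predicate is true (elsewhere: filter, grep)."""
    return [x for x in items if predicate(x)]

numbers = [3, 1, 4, 1, 5, 9]
print(type(square_lambda) is type(square_function))  # True
print(first(numbers))                                # 3
print(rest(numbers))                                 # [1, 4, 1, 5, 9]
print(last(numbers))                                 # 9
print(select(numbers, lambda n: n > 3))              # [4, 5, 9]
```

A reader who has never seen lisp can likely guess what `first(numbers)` returns; the same cannot be said of `car`.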

The above are some examples. The thing to note is that Mathematica's choices are often such that the word stands for the meaning itself, in as logical and independent a way as possible, without depending on a particular computer science field's context or history. One easy way to confirm this is to take a keyword and ask a wide audience, who don't know the language or are even unfamiliar with computer programming, to guess what it means. The audience can be made up of mathematicians, scientists, engineers, programmers, and laymen. Such a general audience is more likely to guess correctly what a Mathematica keyword means in the language than what a name means in other computer languages whose naming choices go by convention or context.

For some examples of naming in popular computer languages: Perl's naming relies heavily on unix culture (grep, pipe, croak, pop, shift, stat, die, hash, glob, sigil, regex, file handle, etc). As with unix culture, many names and terms intentionally exude juvenile humor. If you are a unix hacker, you may have a blast.

But if you are a mathematician, scientist, or engineer, most of whom already struggle with writing a report in HTML/CSS or LaTeX, the unix-colored jargon will make the learning process harder.

Similarly, for programmers coming from other backgrounds such as Microsoft Windows, the puns and unix-styled naming reduce their chances of learning the language.

Functional languages' terminology is typically heavily based on the field of computer science (for example: lambda, currying, closure, monad, predicate, tail recursion, continuations). It is that way because the people who work with these languages are typically academics. They model their language and name its constructs based on a particular model of mathematical logic. Thus, we have keyword names, function names, library names, and language concepts like lambda, currying, closure, monad. Effectively, this type of naming puts an extra barrier in front of normal people who want to use the language. The language's features and constructs are not necessarily difficult, but practically speaking, these upfront terms are part of the reason these languages remain in a small academic or hobbyist community.
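As a rough illustration that the constructs behind two of these terms are not difficult, here is a minimal Python sketch of a “closure” and of “currying”. The names `make_counter`, `increment`, `add`, and `add_x` are made up for this example.

```python
def make_counter():
    """Closure: the inner function remembers `count` from the enclosing call."""
    count = 0

    def increment():
        nonlocal count  # refer to the variable of the enclosing function
        count += 1
        return count

    return increment

def add(x):
    """Currying: a two-argument addition written as a chain of one-argument functions."""
    def add_x(y):
        return x + y
    return add_x

counter = make_counter()
counter()
counter()
third = counter()     # the counter has now been called three times
add_five = add(5)     # a function that adds 5 to its argument
total = add_five(10)

print(third)  # 3
print(total)  # 15
```

In plain words, a closure is a function that remembers variables from where it was created, and currying turns a function of several arguments into a chain of functions that each take one.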

Lisp's cons, car, cdr are based on computer hardware (this particular naming has caused major damage to the lisp language to this day). (Other examples: pop, push, shift are based on a particular data structure concept in computer science called the “stack” (as in, a stack of books). Grep is from Global Regular Expression Print, while Regular Expression is from the theoretical computer science of Automata (whose etymology is from the meaning of “self-operating machine”). Through several levels of context and term borrowing, the extremely simple and clear term “string patterns” became known as “regular expression”. The term “regular expression”, as used for the syntax that matches strings in practical languages like perl, has practically no connection to the original concept of “regular expression” in the theory of Automata. The name regex has done major hidden damage to the computing industry, in the sense that had it just been called “string patterns”, a lot of explanations, literature, and confusion would have been avoided.)
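One concrete way in which practical “regular expressions” depart from the automata-theory concept is the backreference. The sketch below uses Python's standard `re` module; the names `mirrored` and `is_mirrored` are hypothetical. The pattern accepts strings made of some run of a's, a single b, and then the same run of a's again, which is a set of strings that is not a “regular” language in the formal sense.

```python
import re

# A backreference (\1) demands a repeat of exactly what group 1 matched.
# The strings a^n b a^n accepted here do not form a regular language in the
# automata-theory sense, yet the practical "regex" syntax handles them.
mirrored = re.compile(r"(a+)b\1")

def is_mirrored(text):
    """True if text is a run of a's, one b, then the same run of a's."""
    return mirrored.fullmatch(text) is not None

print(is_mirrored("aba"))      # True
print(is_mirrored("aaabaaa"))  # True
print(is_mirrored("aabaaa"))   # False
print(is_mirrored("ab"))       # False
```

What the programmer is writing here is simply a pattern for strings, which is the sense in which “string pattern” would describe the feature more directly.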

(Note: Keywords or functions in Mathematica are not necessarily always best named. Nor is there always one absolute best choice, as there are many other considerations, such as the force of wide existing convention, the context where the function is used, brevity, limitations of the English language, different scientific contexts (for example: math, physics, engineering), or even human preferences.)

Here are the relevant essays on many of the issues regarding the importance and effects of terminology's quality.