What is Expressiveness in Programing Languages

By Xah Lee. Date: 2005-02-28. Last updated: 2006-03-28.

In languages human or computer, there's a notion of expressiveness.

English, for example, is very expressive in manifestation: witness all the poetry and implications and allusions and connotations and diction. There are myriad ways to say one thing, fuzzy and warm and all. But when we look at what things it can say, its power of expression with respect to meaning, or its efficiency or precision, we find natural languages incapable.

This can be seen thru several means. A sure way is thru logic, linguistics, and/or what's called philosophy of language. One can also glean directly the incapacity and inadequacy of natural languages by studying the artificial language lojban, where one realizes that not only are natural languages incapable in precision and lacking in efficiency, but a huge number of things are simply near impossible to express thru them.

One thing commonly misunderstood in the computing industry is the notion of expressiveness. If a language has a vocabulary of (smile, laugh, grin, giggle, chuckle, guffaw, cackle), then that language will not be as expressive as a language with just (severe, slight, laugh, cry). The former is “expressive” in terms of nuance, whereas the latter is expressive with respect to meaning.

Similarly, in computer languages, expressiveness is significant with respect to semantics, not syntactical variation.

These two contrasting ideas can be easily seen thru Perl versus Lisp. Perl is a language of syntactical variegation. Lisp, on the other hand, is an example of austere syntactic uniformity, yet its efficiency and power in expression, with respect to semantics, showcases Perl's poverty in specification.

(Further readings from wikipedia: Philosophy of language, lojban, Mathematical logic )

[see Xah's lojban Tutorial]

Case Examples

In the following, i show concrete examples from various languages to demonstrate the comparative capabilities of expressiveness.

Example: String Patterns in Regex

In Perl, the facility for finding and replacing a text pattern is the construct $myText =~ s/myPattern/replStr/, where $myText is a string variable, s/// is the regex substitution operator, “myPattern” is a regex pattern, and “replStr” is the string to use when the pattern matches. (When the pattern matches in $myText, $myText is changed.)

In Python, the analogous facility is re.sub(myPattern, replStr, myText, myCount), where the replacement replStr can be a function, and a maximum number of replacements myCount can be specified. If there is a match, and replStr is a function myFunc instead of a string, then a special regex match object is fed to myFunc, which can return different strings based on how the pattern matched.
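A minimal sketch of the re.sub call described above, with a function as the replacement and an optional count. The sample text and the name shout are made up for illustration.

```python
import re

def shout(match):
    # `match` is the match object that re.sub hands to the callback.
    word = match.group(0)
    # Decide the replacement based on what was matched.
    return word.upper() if len(word) > 3 else word

my_text = "the quick brown fox jumps over the lazy dog"

# Replace every word, letting the function pick each replacement.
all_replaced = re.sub(r"[a-z]+", shout, my_text)
# → "the QUICK BROWN fox JUMPS OVER the LAZY dog"

# Same call, but only the first 3 matches are processed.
first_three = re.sub(r"[a-z]+", shout, my_text, count=3)
# → "the QUICK BROWN fox jumps over the lazy dog"
```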

This is an instance where the language is more expressive. Python's regex facilities have several other flexibilities not present in Perl's s/// construct. e.g. a regex pattern can be frozen as a compiled object and moved about. And, when there is a match, Python returns a special object that contains information about the match, such as the original pattern, the matched string, the captured groups and their positions, and this object can be passed around, or have information extracted from it.
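As a rough illustration of a compiled pattern and a match object being passed around as ordinary values, here is a small sketch. The date pattern, the sample string, and the helper find_date are assumptions for the example, not part of the original discussion.

```python
import re

# Compile once; the pattern object can be stored, passed around, and reused.
date_pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

def find_date(pattern, text):
    # Returns a match object, or None if nothing matches.
    return pattern.search(text)

m = find_date(date_pattern, "Last updated: 2006-03-28.")

original_pattern = m.re.pattern  # the pattern string that produced the match
whole_match = m.group(0)         # → "2006-03-28"
parts = m.groups()               # → ("2006", "03", "28")
position = m.span()              # → (14, 24)
```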

In this case, the flexibility of Python's facilities for textual pattern matching is an instance of superior expressiveness to Perl's.

Example: Exact Arithmetic

In Mathematica, the language supports fractions wherever a number is used, and if all numbers in an expression (a block of code) are written as fractions (or integers), the result of the computation is also an exact fraction, automatically reduced so that common factors in the numerator and denominator are taken out. (A fraction in Mathematica is written as a/b with a and b integers.)

In[1]:=
3/7+4/3

Out[1]=
37/21

In[2]:=
3/7+4/3.
(* notice the decimal in the second denominator*)

Out[2]=
1.7619

In[3]:=
Table[n/(n+1),{n,1,10,1/2}]
(* generate a list of n/(n+1), with n going from 1 to 10, in steps of 1/2*)

Out[3]=
{1/2, 3/5, 2/3, 5/7, 3/4, 7/9, 4/5, 9/11, 5/6, 11/13, 6/7,
    13/15, 7/8, 15/17, 8/9, 17/19, 9/10, 19/21, 10/11}

If the programer does not need exact arithmetic, he can simply input one of his numbers in decimal form (for example, write 1.5 instead of 3/2), or use the “N” function around the code (for example, N[π]), so that the result will be in decimal form. The reason that a programer even has to distinguish between exact numbers and approximate numbers is an engineering fact: approximate numbers are far easier to compute with than exact ones. (Specifically: numbers are turned into binary digits of uniform length (such as 16 bits, 32 bits, in types typically named “int”, “long”, “float”, “double”) to facilitate fast calculation.)

Such a built-in feature of exact arithmetic, where the programer does not have to think about the engineering issue of how numbers are represented in the hardware, is a powerful feature in a language. Besides Mathematica, Lisp and Haskell and some other functional languages also support this in a transparent way. In most imperative languages such as C, C++, Java, Perl, Python, JavaScript, the programer is presented with extraneous concepts from computer engineering such as “long”, “float”, “double”. A practical consequence is that these languages are harder to use, and in particular are cumbersome or unsuitable in fields of scientific computing where exact arithmetic is needed.

The transparent support for exact arithmetic is an example of expressiveness of the language.
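For comparison, here is a sketch of the same computations in Python using the standard library's fractions module. Note the difference in kind: the exactness has to be requested explicitly by wrapping numbers in Fraction, rather than being the default meaning of a/b as in Mathematica.

```python
from fractions import Fraction

# Exact: both operands are fractions, so the result is a reduced fraction.
exact_sum = Fraction(3, 7) + Fraction(4, 3)
# → Fraction(37, 21)

# Approximate: mixing in a float turns the whole result into a float.
approx_sum = Fraction(3, 7) + 4 / 3.0
# roughly 1.7619

# n/(n+1) for n from 1 to 10 in steps of 1/2, kept exact.
steps = [Fraction(k, 2) for k in range(2, 21)]
table = [n / (n + 1) for n in steps]
# first few entries: 1/2, 3/5, 2/3, 5/7, ...
```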

Example: Range

In Perl, there's a range construct (a..b) that returns a list of numbers from a to b. However, this construct cannot generate a list with regular intervals such as (1, 3, 5, 7, 9). Nor would it generate decreasing lists such as (4, 3, 2, 1) when a > b.

In Python, the range function range(a,b,c) returns a range of numbers from a up to but not including b, in steps of c. e.g. range(3,-5,-2) returns [3, 1, -1, -3]. The arguments can be negative, however, they all must be integers.
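A short sketch of the behavior of Python's range with a step argument. The calls are wrapped in list() so that the results are lists under both Python 2 and Python 3; the variable names are only for illustration.

```python
# Regular intervals and decreasing sequences both work.
odds = list(range(1, 10, 2))       # → [1, 3, 5, 7, 9]
countdown = list(range(4, 0, -1))  # → [4, 3, 2, 1]
mixed = list(range(3, -5, -2))     # → [3, 1, -1, -3]

# Non-integer arguments are rejected.
try:
    range(0, 1, 0.5)
    float_step_allowed = True
except TypeError:
    float_step_allowed = False
```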

In the Mathematica language, there's also a range function Range[a,b,c]. Here, the arguments can be either real numbers or exact fractions. If one of the input arguments is a decimal, then the computation is done in machine-precision approximation and the elements of the result list are also decimals. If all inputs are fractions (including integers), the elements of the returned list will also be exact fractions.

In[14]:=
Range[1,-3,-1/2]

Out[14]=
{1, 1/2, 0, -1/2, -1, -3/2, -2, -5/2, -3}

In[15]:=
Range[1,-3,-0.5]

Out[15]=
{1,0.5,0.,-0.5,-1.,-1.5,-2.,-2.5,-3.}

Mathematica's Range function shows flexible parameters: it accepts a negative increment, and the inputs can be exact fractions or decimals, with correspondingly exact or approximate results.
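To get a feel for what Python's built-in range leaves out, here is a sketch of a helper that imitates the exact-fraction case of Range[a,b,c]. The name exact_range is hypothetical (it is not a library function), and the endpoint is included, as in the Mathematica output above.

```python
from fractions import Fraction

def exact_range(start, stop, step):
    """Hypothetical helper: inclusive range with exact rational steps."""
    start, stop, step = Fraction(start), Fraction(stop), Fraction(step)
    if step == 0:
        raise ValueError("step must not be zero")
    result = []
    value = start
    # Walk toward `stop` in the direction given by the sign of `step`.
    while (step > 0 and value <= stop) or (step < 0 and value >= stop):
        result.append(value)
        value += step
    return result

halves = exact_range(1, -3, Fraction(-1, 2))
# → 1, 1/2, 0, -1/2, -1, -3/2, -2, -5/2, -3 (as Fraction objects)
```

That this takes a dozen lines of user code, where Mathematica needs none, is the point of the comparison.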

Example: Rich and Systematic Function Parameters

In many languages, there is a function called “map”. Usually it takes the form map(f, myList) and returns a list where f is applied to every element in myList. e.g. here are examples in Python, Perl, and Emacs Lisp:

# -*- coding: utf-8 -*-
# python 2

map( lambda x:x**2 , [1,2,3])

# -*- coding: utf-8 -*-
# perl

map {$_**2} (1,2,3);

; -*- coding: utf-8 -*-
; emacs lisp

(mapcar (lambda (x) (expt x 2)) (list 1 2 3))

In the Mathematica language, the Map function takes an optional third argument, which specifies the level of the list to map at. In the following example, the function is mapped to the leaves of the list {1,2,{3,4}}.

In[1]:=
Map[Function[#^2],{1,2,{3,4}}, {-1}]

Out[1]=
{1,4,{9,16}}
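Python's map has no level argument, so mapping at the leaves of a nested list has to be written by hand. The following is one possible sketch; map_leaves is a hypothetical helper and covers only the leaf level, not Mathematica's general level specification.

```python
def map_leaves(f, tree):
    """Hypothetical helper: apply f to every non-list element of a nested list."""
    if isinstance(tree, list):
        # Recurse into sublists, keeping the nesting structure intact.
        return [map_leaves(f, item) for item in tree]
    return f(tree)

squares = map_leaves(lambda x: x ** 2, [1, 2, [3, 4]])
# → [1, 4, [9, 16]]
```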

The expressive power of a language does not lie solely in its functions taking more options. Such is only a small aspect of expressibility. However, if a language's functions are designed such that they provide important, useful features in a consistent scheme, that certainly makes the language much more expressive.

Example: Symbolic Computation

Lisp differs from most imperative programing languages in that it deals with symbols. What does this mean?

In imperative languages, a value can be assigned a name, and this name is called a variable. e.g. x=3, and whenever this “name” is encountered, it is evaluated to its value. It does not make any sense to have variables without an assigned value. That is, the “x” is not useful and cannot be used until you assign something to it.

However, in lisp, there is a concept of Symbols. By way of explanation, a “variable” need not be something that has been assigned a value. Symbols can stand by themselves in the language. And, when a symbol is assigned a value, the symbol can retain its symbolic form without becoming a value.

This means that in lisp, “variables” can be manipulated in their unevaluated state. The closest analog in many languages is the “evaluate” command, where the programer can build code as strings and do evaluate(myCodeString) to achieve meta-programing.

For example, in imperative languages once you have defined x=3, you cannot manipulate the variable “x” because it gets evaluated to 3 right away. If you want to, you have to build a string "x" and manipulate this string, then finally use something like evaluate(myCodeString) to achieve the effect. Even if the imperative language does provide an evaluate() function, its use breaks down quickly because the language is not designed for it. It tends to be slow and very hard to debug, because the language lacks facilities to deal with such meta-programing.
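The string-and-evaluate workaround described above might look like this in Python, where the built-in eval plays the role of evaluate(). The expressions and variable names are made up for illustration; the last two lines show one way the approach breaks down, since plain text replacement knows nothing about the structure of the code.

```python
# Code held as a string, because the bare name x would evaluate right away.
code_string = "x + 1"

# "Manipulate" the code by editing text.
doubled = "(" + code_string + ") * 2"   # → "(x + 1) * 2"

# Bring the string to life, supplying x = 3 explicitly.
result = eval(doubled, {"x": 3})        # → 8

# Fragility: renaming x by text replacement also damages the name "max".
fragile = "max(x, 1)".replace("x", "y")  # → "may(y, 1)"
```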

In lisp, a variable's unevaluated form is always available. One just puts an apostrophe ' in front of it. In x=3, the x is a variable in the context of the code logic, x is the name of the variable in the context of meaning analysis, and x is a symbol in the context of the programing language. This concept of Symbols is foreign to imperative languages. It is also why lisps are known as symbolic languages. Such makes meta-programing possible.

The power of symbolic processing shows when, for example, you take user input as code, or need to manipulate math formulas, or write programs that manipulate source code, or generate programs on the fly. These are often needed in the advanced programing called Artificial Intelligence. This is also the reason why lisp's “macro” facility is a powerful feature unlike any so-called “pre-processors” or “templates” in imperative languages.

Mathematica, for example, is sometimes known as a Computer Algebra System. It too works with symbols in the language. They can be manipulated, transformed, assigned a value, evaluated, held unevaluated etc.

One way for a programer familiar with only imperative languages to understand symbols is to think of computing with strings, which Perl and Python are well known for. With strings, one can join two strings, select substrings, use string patterns (regex) to transform strings, split a string into a list for more powerful manipulation, or dump a list back into a string (aka “serialization”), and use evaluate() to make the string alive. Now imagine that all these strings need not be strings but are symbols in the language, where the entire language works in them and with them, not just string functions. That is symbolic computation.
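As a rough imitation of that idea in Python, an expression can be held as nested data instead of a string, with symbols as plain names inside it. This is only a sketch under made-up conventions (tuples of the form (operator, left, right), and the hypothetical helpers substitute and evaluate); in lisp or Mathematica nothing of the sort needs to be built, because the language itself already works this way.

```python
import operator

# Assumed convention: an expression is a number, a symbol name (str),
# or a tuple (operator_name, left, right).
OPS = {"+": operator.add, "*": operator.mul}

def substitute(expr, bindings):
    """Replace symbols by values (or by other symbols) without evaluating."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return (op, substitute(left, bindings), substitute(right, bindings))
    if isinstance(expr, str):
        return bindings.get(expr, expr)
    return expr

def evaluate(expr):
    """Evaluate an expression that contains no unbound symbols."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left), evaluate(right))
    if isinstance(expr, str):
        raise ValueError("unbound symbol: " + expr)
    return expr

formula = ("+", ("*", 2, "x"), 1)            # stands for 2*x + 1, unevaluated

renamed = substitute(formula, {"x": "y"})    # → ("+", ("*", 2, "y"), 1)
value = evaluate(substitute(formula, {"x": 3}))  # → 7
```

Unlike the text replacement on strings, the renaming here touches only the symbol x and cannot damage anything else in the expression.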

Here we see an expressibility unseen in languages outside the lisp family.

Example: Expressiveness in Syntax Variability

See: The Concepts and Confusions of Prefix, Infix, Postfix and Fully Functional Notations

Example: Defining Linear Algebra's Normalize Function

See: A Example of Mathematica's Expressiveness .

Note: This essay is incomplete. There are some important criteria that are as important as the above, e.g. what are typically called dynamically typed languages, or scripting languages. The characteristic of these is that they don't require the programer to deal with computer engineering byproducts (such as int, float, double, non-algebraic data types, declarations), and consequently, it is typically a few times faster to develop in these so-called scripting languages (python, ruby, JavaScript etc) than in what are typically known as compiled languages (java, C, etc.).

A good exposition on this topic is “Scripting: Higher Level Programming for the 21st Century” by John K Ousterhout, 1998-03.