How Purely Nested Notation Limits the Language's Utility

By Xah Lee. Date: .

There is a common complaint by programers about lisp's notation of nested parentheses: that it is unnatural or difficult to read. Long-time lisp programers often counter that it is a matter of conditioning, or blame the use of “inferior” text editors that are not designed to display nested notations. In the following, I describe how lisp notation is actually a problem, on several levels.

(1) Some 99% of programers are not used to the nested parenthesis syntax. This is a practical problem. On this aspect alone, lisp's syntax can be considered a problem.

(2) Arguably, the pure nested syntax is not natural for humans to read. Long-time lispers may disagree on this point.

(3) Most importantly, a pure nested syntax discourages frequent or advanced use of function sequencing or composition. This aspect is the most devastating.

The issue that most programers are not comfortable with nested notation is well known. It is not a technical issue. Whether it is considered a problem of the lisp language is a matter of philosophical disposition.

The issue of heavy nesting not being natural for humans to read may be debatable. I do think that deep nesting is a problem for the programer. Here is an example of 2 blocks of code that denote the same expression in the Mathematica language:

vectorAngle[{a1_, a2_}] := Module[{x, y},
    {x, y} = {a1, a2}/Sqrt[a1^2 + a2^2] // N;
    If[x == 0, If[Sign@y === 1, π/2, -π/2],
      If[y == 0, If[Sign@x === 1, 0, π],
        If[Sign@y === 1, ArcCos@x, 2 π - ArcCos@x]
        ]
      ]
    ]

SetDelayed[vectorAngle[List[Pattern[a1,Blank[]],Pattern[a2,Blank[]]]],
    Module[List[x,y],
      CompoundExpression[
        Set[List[x,y],
          N[Times[List[a1,a2],
              Power[Sqrt[Plus[Power[a1,2],Power[a2,2]]],-1]]]],
        If[Equal[x,0],
          If[SameQ[Sign[y],1],Times[π,Power[2,-1]],
            Times[Times[-1,π],Power[2,-1]]],
          If[Equal[y,0],If[SameQ[Sign[x],1],0,π],
            If[SameQ[Sign[y],1],ArcCos[x],
              Plus[Times[2,π],Times[-1,ArcCos[x]]]]]]]]]

The latter uses a fully nested form (called FullForm in Mathematica). This form is isomorphic to lisp's nested parenthesis syntax, token for token (i.e. lisp's (f a b) is Mathematica's f[a,b]). As you can see, this form, by the sheer number of nested brackets, is in practice problematic to read and type. In Mathematica, nobody really programs using this syntax. (The FullForm syntax is there because of a language design principle shared with lisp, that of “consistency and simplicity”, or the commonly touted lisp advantage of “data is program; program is data”.)

The following shows the same code with tokens transformed into the lisp style.

(SetDelayed
 (vectorAngle (list (Pattern a1 (Blank)) (Pattern a2 (Blank))))
 (let (list x y)
   (progn
    (setq (list x y)
          (N (* (list a1 a2)
                (expt (sqrt (+ (expt a1 2) (expt a2 2))) -1))))
    (if (eq x 0)
        (if (equal (signum y) 1) (* π (expt 2 -1))
          (* (* -1 π) (expt 2 -1)))
      (if (eq y 0) (if (equal (signum x) 1) 0 π)
        (if (equal (signum y) 1) (acos x)
          (+ (* 2 π) (* -1 (acos x)))))))))

Note: for those curious, the sample Mathematica code in FullForm was transformed to lisp's sexp by these operations:

(1) Move the head inside the bracket. (Using this regex replacement: \([A-Za-z]+\)\[ → [\1)

(2) Replace brackets by parens: [] → ()

(3) Replace function names with lisp-styled names, for example Plus → +, Times → *, List → list.
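The three operations above can be roughly sketched in Python (a language not otherwise used in this article). This is only an illustration under stated assumptions: the function name fullform_to_sexp and the RENAMES table are hypothetical, the table is a partial one read off the two listings above, and turning commas into spaces is an extra step added here so that the output reads as a sexp.

```python
import re

# Hypothetical, partial rename table, read off the two listings above.
RENAMES = {
    "List": "list",
    "Plus": "+",
    "Times": "*",
    "Sqrt": "sqrt",
    "ArcCos": "acos",
    "If": "if",
}


def fullform_to_sexp(src: str) -> str:
    """Rewrite a Mathematica FullForm string into a lisp-style sexp string."""
    # (1) Move the head inside its bracket: Head[ becomes [Head
    out = re.sub(r"([A-Za-z]+)\[", r"[\1 ", src)
    # (2) Replace brackets by parens. Commas become spaces (an added step).
    out = out.replace("[", "(").replace("]", ")").replace(",", " ")
    # (3) Rename heads that have a lisp-styled equivalent in the table.
    out = re.sub(
        r"\(([A-Za-z]+)",
        lambda m: "(" + RENAMES.get(m.group(1), m.group(1)),
        out,
    )
    # Tidy up: no space before a closing paren, single spaces elsewhere.
    out = re.sub(r"\s+\)", ")", out)
    return re.sub(r"\s+", " ", out).strip()


print(fullform_to_sexp("Plus[Times[2,x],Times[-1,ArcCos[x]]]"))
# prints: (+ (* 2 x) (* -1 (acos x)))
```

Heads that are missing from the table, such as Pattern or Blank, pass through unchanged, which is also what the lisp-style listing above does with them.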

The third issue, how nested syntax seriously discourages frequent use of inline function sequencing, is the most important.

One practical way to see this is to consider unix's shell syntax. You all know how convenient and powerful unix's pipes are. Here are some practical examples: ls -al | grep xyz, or cat a b c | grep xyz | sort | uniq.

Now suppose we get rid of unix's pipe notation and replace it with a pure functional notation, for example: (uniq (sort (grep xyz (cat a b c)))), or enrich it with a composition function and a pure function construct, so that this example can be written as: ((composition uniq sort (lambda (x) (grep xyz x))) (cat a b c)).

You see how this change, although syntactically equivalent to the pipe “|” (or semantically equivalent, in the example using function composition), will, due to the cumbersome nesting of parentheses, force a change in the nature of the code that programers produce in the language. Namely, the frequency of inline sequencing of functions on the fly will probably be reduced; instead, there will be more code that defines functions with temp variables and applies them just once, as in traditional languages.
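To make the two non-pipe forms concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: cat, grep and uniq are toy stand-ins that work on lists of strings rather than the real unix tools, Python's built-in sorted plays the role of sort, and composition is a hypothetical helper named after the construct in the example above.

```python
from functools import reduce


def cat(*sources):
    """Toy stand-in for cat: concatenate several lists of lines."""
    return [line for src in sources for line in src]


def grep(pattern, lines):
    """Toy stand-in for grep: keep lines containing the substring."""
    return [line for line in lines if pattern in line]


def uniq(lines):
    """Toy stand-in for uniq: drop adjacent duplicate lines."""
    out = []
    for line in lines:
        if not out or out[-1] != line:
            out.append(line)
    return out


def composition(*fns):
    """Right-to-left composition: composition(f, g, h)(x) is f(g(h(x)))."""
    return lambda x: reduce(lambda acc, fn: fn(acc), reversed(fns), x)


a = ["xyz 2", "abc"]
b = ["xyz 1", "xyz 2"]
c = ["def"]

# Pure nested form, read inside out.
nested = uniq(sorted(grep("xyz", cat(a, b, c))))

# Composition form, with an inline lambda to fix grep's first argument.
composed = composition(uniq, sorted, lambda x: grep("xyz", x))(cat(a, b, c))

print(nested)    # prints: ['xyz 1', 'xyz 2']
print(composed)  # prints: ['xyz 1', 'xyz 2']
```

Both forms compute the same thing as the pipe version, but the reader has to start from the innermost call or from the end of the argument list, instead of reading left to right.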

A language's syntax or notation system has a major impact on the kind of code, style, and thinking patterns of the language's users. This is a well-known fact for those acquainted with the history of math notations.

Sequential prefix notation, such as Haskell's f . g . h $ x or Mathematica's f@g@h@x, and postfix notation, such as unix's x|h|g|f, Ruby's x.h.g.f, or Mathematica's x//h//g//f, are much more convenient and easier to decipher than (f (g (h x))) or ((composition f g h) x). In real-world source code, any of f, g, h will likely be full of nested parens themselves, because often they are functions constructed inline (aka lambda).
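As a rough illustration of the difference, postfix sequencing can be imitated in Python with a small wrapper. The Stage class below is a hypothetical helper written for this sketch, not a standard library feature; it relies only on Python's reflected-operator method __ror__, so that value | Stage(fn) applies fn to value.

```python
class Stage:
    """Wrap a one-argument function so that `value | Stage(fn)` calls fn(value)."""

    def __init__(self, fn):
        self.fn = fn

    def __ror__(self, value):
        # Called for `value | self` when value's own type does not handle it.
        return self.fn(value)


data = [3, 1, 2, 3]

# Postfix sequencing, read left to right, with functions constructed inline.
piped = (
    data
    | Stage(lambda xs: [x * x for x in xs])
    | Stage(sorted)
    | Stage(lambda xs: xs[-1])
)

# The same computation in pure nested form, read inside out.
nested = (lambda xs: xs[-1])(sorted((lambda xs: [x * x for x in xs])(data)))

print(piped, nested)  # prints: 9 9
```

With inline lambdas in the chain, the nested form is already hard to decipher at three stages, while the piped form keeps each stage on its own line in the order it runs.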

Lisp, by sticking with an almost uniform nested parenthesis notation, immediately reduces the use of the pattern of sequencing functions, simply because the syntax does not readily lend itself to it. Programers who are aware of the coding pattern of sequencing functions (aka function chaining, filtering) now need to either think in terms of a separate “composition” construct, or subject themselves to the problematic typing and deciphering of nested parentheses. (In practice, it's mostly done by writing the inline functions as auxiliary function definitions on their own, and then having another code block sequence them together, e.g. (defun f …) (defun g …) (defun h …) (f (g (h x))).)

Note: Lisp syntax is actually not a pure nested form. It has ad hoc syntax equivalents such as the quote construct '(a b c); dotted notation for cons, for example (a . b) for (cons a b); special syntax for quoted vectors, [1 2 3]; splice and partial hold eval, for example `(,@ x ,@ y ,z 4); and a weird comment syntax, #|something|#. Mathematica's FullForm is actually a version of unadulterated nested notation. For a full discussion, see: Fundamental Problems of Lisp, Syntax Irregularity

For a practical example of this problem, see: LISP Syntax Problem of Piping Functions.