LispForm

By Xah Lee. Date:

This is the 6th weekly posting of a series of expositions on Mathematica expressions, trees, and index sets. I plan to continue this series untill all Mathematica tree functions are covered (i.e. Mathematica book, 2.1 - 2.2). So far it's all preliminary work. Today's topic is on a pair of functions that convert lisp to and from Mathematica expressions.

Last week we discussed how Mathematica and lisp expressions correspond to trees and index sets. When Mathematica expression contains non-atomic heads, it is often helpful to see the expression in lisp form, which reveals the expression's structure visually. Thus, we'll write a LispForm function that serves similar purpose as TreeForm. We'll also write a LispToMathematica function so we can convert expressions in both directions.

Here's the code for LispForm:

Clear[lispFormInternal,LispForm];

LispForm::"usage"=
  "LispForm[expr] returns boxes that represents expr in the Lisp language form.
Related: LispToMathematicaExpression. Example: ToString@LispForm@Array[f,{2,2}]";

lispFormInternal[expr_?AtomQ]:=ToString@expr;

lispFormInternal[expr_]:=
  RowBox[{"(",Sequence@@(lispFormInternal/@Level[expr,{1},Heads->True]),
      ")"}];

LispForm[expr_]:=
  Module[{rational,complex},
    DisplayForm@
      StyleBox[lispFormInternal[
            expr//.{Rational[a_,b_]:>rational[a,b],
                Complex[a_,b_]:>complex[a,b]}]/.{
            ToString@rational->"Rational",ToString@complex->"Complex"},
        AutoSpacing->False]];

The algorithm for LispForm is very simple. The two rules of lispFormInternal are the main code. The rule of LispForm[expr_] is an interface, serving as a gate. It first replaces symbols Complex and Rational in expr to two temporary symbols, then passes the result to lispFormInternal, and finally replaces back the Complex and Rational heads in the result. The reason for this extra trip is because objects like Complex[1,2] is actually an atom in the sense AtomQ@Complex[1,2] returns True. Similarly for Rational. If this step is not taken, then expresions containing Complex or Rational objects will not be converted correctly because the line lispFormInternal[expr_?AtomQ]:=ToString@expr; stops the conversion prematurely. For example,

lispFormInternal[Rational[1,2]] will return the argument unchanged, but the correct result should be (Rational 1 2).

Here's the code for LispToMathematicaExpression:

Clear[LispToMathematicaExpression];

LispToMathematicaExpression::"usage"=
  "LispToMathematicaExpression[LispExpressionString] returns the Mathematica form
of the lisp expression. All symbols in lisp expr are turned into string. '() and ()
are turned into nil. The lisp expression is not expected to contain strings. Note:
LispToMathematicaExpression returns a Mathematica structured expression, not a
string. Related: LispForm. Example: LispToMathematicaExpression[\"(define fact
(lambda (n) (if (eq? n 1) 1 (* n (fact (- n 1))))))\"]";

LispToMathematicaExpression[expr_String]:=
  Module[{tempSymb},(ReplaceRepeated[#,tempSymb[hh_,rest___]->hh[rest]]&)@
      ToExpression@(
            StringReplace[#,{"["->StringJoin[ToString@tempSymb,"["]}]&)@
          Function[{sexp},
              With[{atomPositionsInString=(DeleteCases[#,"delete me",{1}]&)@
                      Map[If[First@#+1>Last@#-1,"delete me",#+{1,-1}]&,(
                            Partition[#,2,1]&)@
                          First@Transpose@
                              StringPosition[sexp,{"[","]",","}]]},(
                  StringReplacePart[
                    sexp,(StringJoin["\"",StringTake[sexp,#],"\""]&)/@
                      atomPositionsInString,atomPositionsInString])]]@(
                StringReplace[#,{"( "->"("," )"->")","("->"[",")"->"]",
                      " "->","}]&)@(
                  FixedPoint[StringReplace[#,{"  "->" "}]&,#]&)@(
                    StringReplace[#,{"\t"->" ","\n"->" ","'()"->"nil",
                          "()"->"nil"}]&)@expr];

Description of algorithm used:

We are given a lisp expression in string form. For example, "(define fact (lambda (n) (if (eq? n 1) 1 (* n (fact (- n 1))))))"

1. Replace all white spaces to a singe space, replace spaces to comma, replace parenthesis to brackets, replace spaces around parenthesis. For example, ( + 1 2) becomes [+,1,2]. (string quotes are omitted in this description for clarity.)

2. Turn all atoms to string. For example, [+,1,2] become ["+","1","2"]. This is done so that all atoms in the eventual expression does not contain atoms foreign to Mathematica. For example, +[1,2] is not a legal expression, but "+"[1,2] is.

3. To convert the whole string to a Mathematica expression, add a temporary head. For example, ["+","1","2"] becomes tempHead["+","1","2"]. The latter is converted to an expression, no longer a string.

4. Delete the tempHead and use the first element as head. i.e. tempHead["+","1","2"] becomes "+"["1","2"].

5. Done.

Coding explained: The coding correspondes to its algorithm fairly well. tempSymb is a unique temporary symbol used for temporary head. The code is a one liner of the form a@b@c@...@arg. The steps corresponds to the functions from right to left.


Here are some examples of using these functions. In the following, expr1 is an array of dimensions {2,2,2}, in the sense that each node has two children at every generation for 3 generations. It is generated by FullArrayExpressionGenerator[{1,1,1},Heads->True]. We passed argument {1,1,1} because recall that we count branches starting at 0.

expr2 is the same array but using Head->False (Mathematica) intepretation to grow the tree. That is, Heads do not give birth. The following printout reveals much of their difference. (Out labels are removed for clarity.)

In[160]:=
 expr1=FullArrayExpressionGenerator[{1,1,1},Heads->True]
 expr2=FullArrayExpressionGenerator[{1,1,1},Heads->False]

 0[0][0[0]][0[0][0[0]]]

 0[0[0[0]]]

In[162]:=
 ToString@LispForm@expr1
 ToString@LispForm@expr2

 "(((0 0) (0 0)) ((0 0) (0 0)))"

 "(0 (0 (0 0)))"

In[164]:=
 expr1CompleteIndexSet=
   Sort@CompleteIndexSet@(First/@ExpressionToIndexSet@expr1)
 expr2CompleteIndexSet=
   Sort@CompleteIndexSet@(First/@ExpressionToIndexSet@expr2)

 {{},{0},{1},{0,0},{0,1},{1,0},{1,1},{0,0,0},{0,0,1},{0,1,0},{0,1,1},{1,0,0},{
     1,0,1},{1,1,0},{1,1,1}}

 {{},{0},{1},{1,0},{1,1},{1,1,0},{1,1,1}}

In[166]:=
 LeavesIndexSet@expr1CompleteIndexSet
 LeavesIndexSet@expr2CompleteIndexSet

 {{0,0,0},{0,0,1},{0,1,0},{0,1,1},{1,0,0},{1,0,1},{1,1,0},{1,1,1}}

 {{0},{1,0},{1,1,0},{1,1,1}}

In[168]:=
 MinimumIndexSet@expr1CompleteIndexSet
 MinimumIndexSet@expr2CompleteIndexSet

 {{0,0,1},{0,1,1},{1,0,1},{1,1,1}}

 {{1,1,1}}

Some of the functions used above have not been posted here yet. I'll probably write about them later, or put up a notebook on WWW.

Here a test for LispToMathematicaExpression.

In[171]:=
LispToMathematicaExpression[
  "(define fact (lambda (n) (if (eq? n 1) 1 (* n (fact (- n 1))))))"]

Out[171]=
"define"["fact",
  "lambda"["n"[],"if"["eq?"["n","1"],"1","*"["n","fact"["-"["n","1"]]]]]]

Finally,

ToString@LispToMathematicaExpression@ToString@LispForm@expr===
  ToString@expr

and

ToString@LispForm@LispToMathematicaExpression@sexpString===
    sexpString

should always be true. Please let me know bugs. Thanks.

Note:

LispForm is written using v.3 functions such as StyleForm and RowBox. I havn't considered how should it be written in v.2. The definition given is more or less a temporary hack, because I do not have a thorough understanding of Mathematica's formating engine and conventions. Also, I havn't considered evalutation issues. That is, the argument are evaluated before they're passed to LispForm. This may not be desirable. Considering evalutation issue will probably complicate the code. The current definition is good in the sene that it works. Mathematica gurus please feel free to correct or expound.

Question: Are there other objects like Complex and Rational? i.e. AtomQ returns True but LeafCount returns more than 1. (looks like we can write the code and see)