Sorting Tree Indexes

By Xah Lee. Date:

This is the 4th posting on a series of exposition on Mathematica expressions, Trees, and Indexes. We'll come to implementing Mathematica structure manipulating function after we've completely understood the isomorphism between expressions, trees, and index sets.

This week's topic is IndexSetSort.

Recall that Mathematica expressions consists of atoms and brackets. Each atom has an index. The full set of indexes of atoms can be shown using the function Function[Position[#,_,{-1}]]. example

Function[Position[#,_,{-1},Heads->True]]@Array[f,{2,2}]

returns:

{{0},{1,0},{1,1,0},{1,1,1},{1,1,2},{1,2,0},{1,2,1},{1,2,2},{2,0},{2,1,0},{2,1,1},{2,1,2},{2,2,0},{2,2,1},{2,2,2}}

Now notice the order returned by the function. In particular, {1,1,0} comes before {2,0}. This is not the standard order returned by Sort. What order is the returned list from Position? A more fundamental question is: if we have a complete index set (of a tree), what kinds of ordering are good?

There are two useful criterions for sorting index set.

• Sort by depth: By the level of an index. i.e. the Length of the index. For example, {2} comes before {1,2}.

• Sort by index: By the index itself. For example, {1,2} comes before {2}.

The order of a sorted index set depends on which criterion we use first. For example, compare the following two sorted (incomplete) index set: {{2},{1,2}} vs. {{1,2},{2}}. The fact we need both criterions together is because neither one uniquely determine an order for all possible indexes. For example, Sort by depth is neutral about {{3},{4}}, sort by index is neutral about {{1,2},{1}}.

Regardless of ordering criterions, there's always the choice of ascending and descending. Thus there are total of 8 possible orderings for our index set: (depth (ascending/descending), index (ascending/descending)) and (index (ascending/descending), depth (ascending/descending)).

The default order by Sort is clearly by depth then by index. Both in ascending fashion. This is a useful ordering. When we deal with trees, it turns out that two other orderings are useful: (index ascending, then depth ascending) and (index ascending, then depth decending). Here are some examples:

Here is a complete index set that includs non-leaves.

indexSet=Function[Position[#,_,{1,-1}]]@Array[f,{2,2}]
{{0},{1,0},{1,1,0},{1,1,1},{1,1,2},{1,1},{1,2,0},{1,2,1},{1,2,2},{1,2},{1},{
2,
    0},{2,1,0},{2,1,1},{2,1,2},{2,1},{2,2,0},{2,2,1},{2,2,2},{2,2},{2}}

Sort by (depth ascending, then index ascending), using Sort[indexSet]:
{{0},{1},{2},{1,0},{1,1},{1,2},{2,0},{2,1},{2,2},{1,1,0},{1,1,1},{1,1,2},{1,
2,
    0},{1,2,1},{1,2,2},{2,1,0},{2,1,1},{2,1,2},{2,2,0},{2,2,1},{2,2,2}}

Sort by (index ascending, then depth ascending):
{{0},{1},{1,0},{1,1},{1,1,0},{1,1,1},{1,1,2},{1,2},{1,2,0},{1,2,1},{1,2,2},{

2},{2,0},{2,1},{2,1,0},{2,1,1},{2,1,2},{2,2},{2,2,0},{2,2,1},{2,2,2}}

Sort by (index ascending, then depth decending):
{{0},{1,0},{1,1,0},{1,1,1},{1,1,2},{1,1},{1,2,0},{1,2,1},{1,2,2},{1,2},{1},{
2,
    0},{2,1,0},{2,1,1},{2,1,2},{2,1},{2,2,0},{2,2,1},{2,2,2},{2,2},{2}}

When we put a function to traverse a tree, it is usually more desirable to visit the leaves before the non-leaves.

In other words, depth first. In our terminology, the sort order is (index ascending, depth decending).

For example, the function may delete things, for which we must work on the leaves first before we visit its parent node. Otherwise, the leaves may no longer exist before we have a chance to visit it (because it's parent node's already gone).

Depth first is exactly what Level, Position, Map...etc. functions take.

For example, if MapAll[(Print[#];ReplaceAll[#,_->0])&,Array[f,{2,2}]] did not visit leaves first, then it would first replace the whole expression by 0 and resulting no subexpressions to test.

If we look at indexes, then these functions uses the order index ascending, then depth decending. It is desirable to have a function IndexSetSort, such that IndexSetSort[indexes,DepthFirst->True] sorts by index ascending, then depth ?d, and IndexSetSort[indexes,DepthFirst->False] sorts by index ascending, then depth ?a. Here are two solutions:

(*IndexSetSort The code is self-explainatory.*)

Clear[IndexSetSort,IndexSetSort2,DepthFirst]; IndexSetSort::"usage"=
  "IndexSetSort[{index1,index2,...},(DepthFirst->True)] sorts the
indexes \ properly as the indexes of an ordered tree. If
DepthFirst->True, then the \ deeper of two indexes of equal order will
be given first. For example, {1,2} comes \ before {1}. Example: \
IndexSetSort[{{1},{2},{1,2},{2,2},{1,2,3}},DepthFirst->False]";

DepthFirst::"usage"=
  "DepthFirst is an option for IndexSetSort. If DepthFirst->True, then
the \ deeper of two indexes of equal order will be given first. e.g.
{1,2} comes \ before {1}.";

Options[IndexSetSort]={DepthFirst->True};

IndexSetSort[indexes_List]:=
  IndexSetSort[indexes,DepthFirst->(DepthFirst/.Options[IndexSetSort])];

IndexSetSort[indexes_List,Rule[DepthFirst,p_]]:=
  Sort[indexes,
    Function[{x,
        y},(If[UnsameQ@@#,OrderedQ@#,
              If[p,Length@x>Length@y,Length@x<Length@y]]&)@(
          Take[#,Min[Length@x,Length@y]]&/@{x,y})]];

(* Here's a version that is faster by using different codes for
DepthFirst True or False. If DepthFirst->False, it's an order faster.*)

Clear[IndexSetSort2];
IndexSetSort2[indexes_List,DepthFirst->True]:=
  Sort[indexes,
    Function[{x,
        y},(If[UnsameQ@@#,OrderedQ@#,Length@x>Length@y]&)@(
          Take[#,Min[Length@x,Length@y]]&/@{x,y})]];

IndexSetSort2[indexes_List,DepthFirst->False]:=
  Module[{depth},depth=Max@(Length/@indexes); Last@Transpose@(

Sort@Transpose@{(Flatten@{#,Table[0,{depth-Length@#}]}&)/@indexes,
                indexes})];

(*test*)

(*DepthFirst->False cases.*)

Clear[li];
Do[li= RandomIndexSet[{0,7},{1,7},{1,50}];
If[Equal@@((#[DepthFirst->False]&)/@{IndexSetSort,IndexSetSort2})//Not,
    Print["problem: ",li]],{200}];

(*DepthFirst->True cases*)
(*this test requires RandomExpressionGenerator, which we'll come to
later.*) Clear[li];
Do[li=Position[RandomExpressionGenerator[Heads->True],_,{-1}];
If[IndexSetSort[li,DepthFirst->True]===li//Not,Print["problem: ",li]],{
    100}];

As usual, there are many different types of exemplary code: by clarity, by speed, by elegance, by extremity... or a combination of the above. If you are so inclined, give us your best version!