Unix Pipe as Functional Language


Found the following juicy interview snippet today:

Is there a connection between the idea of composing programs together from the command line through pipes and the idea of writing little languages, each for a specific domain?

Alfred Aho: I think there's a connection. Certainly in the early days of Unix, pipes facilitated function composition on the command line. You could take an input, perform some transformation on it, and then pipe the output into another program. …

When you say “function composition”, that brings to mind the mathematical approach of function composition.

Alfred Aho: That's exactly what I mean.

Was that mathematical formalism in mind at the invention of the pipe, or was that a metaphor added later when someone realized it worked the same way?

Alfred Aho: I think it was right there from the start. Doug McIlroy, at least in my book, deserves the credit for pipes. He thought like a mathematician and I think he had this connection right from the start. I think of the Unix command line as a prototypical functional language.

It is from an interview with Alfred Aho, one of the creators of AWK. The source is the book Masterminds of Programming: Conversations with the Creators of Major Programming Languages, by Federico Biancuzzi et al.

Since about 1998, when i got into the unix programing industry, i have seen the pipe as a postfix notation, and sequencing pipes as a form of functional programing, though i find it overall extremely badly designed. I've written a few essays explaining the functional programing connection and exposing the lousy syntax (mostly in the years around 2000). However, i've never seen another person express the idea that unix pipes are a form of postfix notation and functional programing. It is a great satisfaction to see one of the main unix authors say so.

[Image: funsh functional bash. A comic by Mike Ledger for his project fun.sh]

Unix Pipe as Functional Programing

The following email content (slightly edited) was posted to a Mac OS X mailing list, 2002-05.

From: xah@xahlee.org
Subject: Re: mail handling/conversion between OSes/apps
Date: May 12, 2002 8:41:58 PM PDT
Cc: macosx-talk@omnigroup.com

Yes, unix has this beautiful philosophy. The philosophy is functional programing. For example, define:

power(x,y) := x^y

so “power(3,2)” returns “9”.

Here “power” is a function that takes 2 arguments. The first parameter is the base, the number to multiply by itself; the second is the exponent, the number of times the base appears in the product.

Functions can be nested,

f(g(h(x)))

or composed

compose(f,g,h)(x)

Here “compose” is itself a function, which takes other functions as arguments, and the output of compose is a new function that is equivalent to nesting f g h.
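A minimal sketch of such a “compose” in Python, assuming only the description above. The sample functions f, g, h are made up for illustration and are not from the original email.

```python
from functools import reduce

def compose(*funcs):
    """Return a function equivalent to nesting funcs:
    compose(f, g, h)(x) == f(g(h(x)))."""
    def composed(x):
        # Apply the rightmost function first, as in f(g(h(x))).
        return reduce(lambda acc, fn: fn(acc), reversed(funcs), x)
    return composed

# Hypothetical sample functions, chosen only for illustration.
def h(x):
    return x + 1

def g(x):
    return x * 2

def f(x):
    return x ** 2

nested = f(g(h(3)))                # h first, then g, then f
via_compose = compose(f, g, h)(3)  # same order of application
print(nested, via_compose)
```

Both expressions apply h, then g, then f, so they give the same value.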

Nesting does not necessarily involve nested syntax. For example, f(g(h(x))) in Mathematica's postfix notation is written like this:

x // h // g // f

or prefix notation:

f @ g @ h @ x

The principle is that everything is either a function definition or a function application, and a function's behavior is strictly determined by its arguments.

Apple around 1997 or so had this OpenDoc technology, which is a similar idea applied more broadly across the OS. That is, instead of one monolithic browser or big image editor or other software, have lots of small tools or components that each do one specific thing, and all can call each other or be embedded in an application framework as services or the like. For example, in an email app, you can use BBEdit to write your email, use Microsoft's spell checker, use XYZ brand of recorder to record a message, without having to open many applications or use the Finder the way we would do today. This multiplies flexibility. (OpenDoc was killed when Steve Jobs became the iCEO around 1997 and did some serious house-cleaning, to the ghastly anger of Mac developers and fanatics. I'm sure many of you remember this piece of history.)

The unix pipe syntax “|” is a postfix notation for nesting. For example:

ps auwwx | awk '{print $2}' | sort -n | xargs echo

in conventional syntax it might look like this:

xargs(  echo, sort(n, awk('print $2', ps(auwwx)))  )
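The equivalence between the pipe form and the nested form can be sketched in Python. This is only an illustration under stated assumptions: the four functions are stand-ins for the commands, each mapping a list of lines to a list of lines, and the sample “ps” rows are invented (no header row, unlike real ps output).

```python
from functools import reduce

# Stand-ins for the four commands. Each takes lines of text and
# returns lines of text. The "ps" rows are invented sample data.
def ps(_lines):
    return ["root 301 0.0 init", "xah 12 0.1 zsh", "xah 45 0.3 emacs"]

def awk_print_field_2(lines):
    # Like awk '{print $2}': keep the second whitespace-separated field.
    return [line.split()[1] for line in lines]

def sort_n(lines):
    # Like sort -n: order the lines by numeric value.
    return sorted(lines, key=int)

def xargs_echo(lines):
    # Like xargs echo: join all input lines into one output line.
    return [" ".join(lines)]

# Conventional nested form: read inside-out.
nested = xargs_echo(sort_n(awk_print_field_2(ps([]))))

# Pipe form: read left to right, each stage fed by the previous one.
def pipe(value, *stages):
    return reduce(lambda acc, stage: stage(acc), stages, value)

piped = pipe([], ps, awk_print_field_2, sort_n, xargs_echo)
print(nested == piped, piped)
```

The hypothetical “pipe” helper plays the role of “|”: it is function application written in postfix order.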

So when you use “pipe” to string many commands in unix, you are doing supreme functional programing. That's why it is so flexible and useful, because each component or function does one thing, and you can combine them in a myriad of ways. However, this beautiful functional programing idea, when it is implemented by the unix heads, becomes a f***ing mess. Nothing works and nothing works right.

I don't feel like writing a comprehensive exposition on this at the moment. Here's a quick summary:

Maybe some other day when i'm pissed, i'll write a better exposition on this issue. I've been wanting to write a final-say essay on this for long. Don't feel like it now.

Unix Syntactical and Semantical Stupidity Exposition

The following was posted to a Mac OS X mailing list. Slightly modified here.

From: xah@xahlee.org
Subject: unix inanity: shell env syntax
Date: June 7, 2002 12:00:29 AM PDT
To: macosx-talk@omnigroup.com

Arguments are given with a dash prefix. For example:

ls -a -l

Order does not matter (USUALLY!!). So,

ls -a -l

is the same as

ls -l -a

But arguments can be combined. For example,

ls -al

means the same thing as

ls -a -l

However, some options consist of more than one character. For example:

perl -version
perl -help

Therefore, the meaning of an option string "-ab" depends ad hoc on the program. It can mean two parameters “a” and “b”, or just one parameter named "ab".
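The ambiguity can be reproduced with Python's standard argparse and getopt modules. In this sketch the three “programs” are hypothetical; each declares its options differently and so reads the same string "-ab" differently. The third reading (“-a” taking “b” as its value) is an extra case beyond the two named above.

```python
import argparse
import getopt

# Program 1 defines two single-letter flags, so "-ab" means "-a -b".
p1 = argparse.ArgumentParser()
p1.add_argument("-a", action="store_true")
p1.add_argument("-b", action="store_true")
two_flags = p1.parse_args(["-ab"])

# Program 2 defines one multi-letter option spelled with a single dash.
p2 = argparse.ArgumentParser()
p2.add_argument("-ab", action="store_true")
one_option = p2.parse_args(["-ab"])

# Program 3 declares that -a takes a value, so "b" is read as that value.
opts, rest = getopt.getopt(["-ab"], "a:")

print(vars(two_flags))
print(vars(one_option))
print(opts)
```

Nothing in the string "-ab" itself says which reading applies; only the program's own option declarations decide.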

Then, often there are two versions of the same optional argument. For example:

perl -help
perl -h
perl -version
perl -v

This equivalence depends on the program.

Different programs disagree on common options. For example, to get the version, here are common varieties:

-v
-V
-version

Sometimes v/V stands for "verbose mode", i.e. to output more detail.

Sometimes, if an option is given more than once, it specifies a degree of that option. For example, some commands accept -v for "verbose", meaning that they will output more detail. Sometimes there are a few levels of detail, and the number of times the option is given determines the level. For example, on Solaris 8,

/usr/ucb/ps -w
/usr/ucb/ps -w -w

Thus, a repeated option may have a special meaning depending on the program.

Oftentimes some options automatically turn on or suppress a bunch of others. For example, on Solaris 8,
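A small sketch of the “repeat to raise the level” convention, using the count action of Python's argparse. The -v flag here belongs to a hypothetical program; it is not the Solaris ps shown above.

```python
import argparse

# A hypothetical program whose verbosity level is the number of -v flags.
parser = argparse.ArgumentParser()
parser.add_argument("-v", action="count", default=0)

for argv in ([], ["-v"], ["-v", "-v"], ["-vv"]):
    level = parser.parse_args(argv).v
    print(argv, "-> verbosity level", level)
```

Note that in this parser the combined form "-vv" counts the same as "-v -v", which is yet another convention a user has to know per program.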

/usr/bin/ls -f

When a named optional parameter is of a boolean type, that is, a toggle of yes/no or true/false, then often, instead of taking a boolean value, its mere presence or absence defines its value. (This is a confusion of named parameter and optional parameter.)

Toggle options are sometimes represented by one option name for yes and another option name for no. And when both are present, the behavior is program dependent.

For named options, the syntax for arguments is inconsistent. Some programs use one syntax variation, others require another, some accept more than one variation, and for others a given variation is a syntax error. For example:

command -o="myvalue"
command -omyvalue
command -o myvalue
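As an illustration, two parsers from Python's own standard library already disagree on these three spellings. In this sketch the option name -o and the value are taken from the lines above; everything else is an assumption made for the demonstration.

```python
import argparse
import getopt

spellings = [["-o=myvalue"], ["-omyvalue"], ["-o", "myvalue"]]

# Parser A: argparse, with one option -o that takes a value.
parser = argparse.ArgumentParser()
parser.add_argument("-o")

results = []
for argv in spellings:
    # Parser B: getopt, where "o:" also declares that -o takes a value.
    getopt_value = getopt.getopt(argv, "o:")[0][0][1]
    argparse_value = parser.parse_args(argv).o
    results.append((getopt_value, argparse_value))
    print(argv, repr(getopt_value), repr(argparse_value))
```

For the first spelling getopt keeps the "=" as part of the value while argparse strips it; the other two spellings agree.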

Often one option may have many synonyms…

An example of a better design… (Mathematica, Scheme, Dylan, Python, Ruby… there's quite a lot of elegance and practicality, yet distinct designs and purposes and styles …)

(Recall that unix is a bad design to begin with; it's a donkey shit pile from the beginning and in its continuation. Again, unix is not simply technically incompetent. If it were only that, it would be easy to improve, and i wouldn't have a problem with it, since there are things considered horrendous in one way or another by today's standards, like COBOL or FORTRAN or DOS. But unix is a brain-washing idiot-making machine, churning out piles and piles of religiously idiotic and pigheaded keyboard punchers. For EVERY opportunity to improve engineering methodology or advance language design, unixers will unanimously turn it down.)

Inevitably someone will ask what's my point. My point in my series of unix-showcasing articles has always been clear for those who study it: Unix is a crime that caused society inordinate harm, and i want unix geeks to wake up and realize it.

Unix Shell Syntax, 2000 〜 2013

As of 2013, unix shell tool syntax has gone through more evolution.

Note: in the 1990s, GNU introduced the double-dash syntax, for example emacs --no-init-file, in the hope of making options readable. Unfortunately, it didn't really catch on. Most commands today do offer the double-dash variant, but only for some options. A double-dash option does not necessarily have a corresponding single-dash one, and vice versa. What happened instead is just one more syntax variation.

During the 2000s, a new syntax form became popular, one that has an “action” keyword immediately following the command name. Here are some prominent examples:

One good example of this ball of confusion within a single command can be seen in Linux's “ps” command. See man ps.

Microsoft PowerShell

Note: Microsoft's new shell programing language, PowerShell (released 2006), adopted much of unix shell's syntax and the pipe paradigm, but with consistent design. (see: PowerShell Tutorial)
