Why List Comprehension is Bad

By Xah Lee. Date: . Last updated: .

List Comprehension is a little language. It's an EXCEPTION in SYNTAX and SEMANTICS. Like, YE WHO ENTER HERE, ALL RULES NO APPLY!

List Comprehension destroys syntax algebra and functional programming. Here's why.

What is List Comprehension?

Here is an example of List Comprehension in Python:

xx = [2 * n for n in range(0, 9) if ((n % 2) == 0)]

# xx is now [0, 4, 8, 12, 16]

It generates a list of 0 to 8 by range(0, 9), removes the odd numbers by (n % 2) == 0, multiplies each remaining element by 2 in 2 * n, and returns the resulting list.

Python's list comprehension syntax has this form:

[expression for var_name in list if true_false_expr]

In summary, it is a special syntax for generating a list. It lets the programmer filter the list and apply a function to each element, all in one expression.

In functional notation, list comprehension is doing this:

map(f, filter(predicate, list))
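For instance, in Python the comprehension from earlier and the map/filter form produce the same list. A quick check (in Python 3, map and filter return iterators, so list() is needed to get an actual list):

```python
# the list comprehension form
comp = [2 * n for n in range(9) if n % 2 == 0]

# the equivalent map + filter form; list() is needed in Python 3,
# where map and filter return lazy iterators
mapped = list(map(lambda n: 2 * n, filter(lambda n: n % 2 == 0, range(9))))

assert comp == mapped == [0, 4, 8, 12, 16]
```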

Other languages' list comprehensions are similar. Here are some examples from Wikipedia. In the following, the filter used is x^2 > 3, and 2*x is applied to the result.

Here is how Wikipedia explains List comprehension. Quote:

A list comprehension is a syntactic construct available in some programming languages for creating a list based on existing lists.

The following features make up a list comprehension:

• An input list.

• A variable representing the members of the input list.

• An optional predicate expression, to filter the input list.

• An output expression, producing members of the output list from members of the input list that satisfy the predicate.

Why is List Comprehension Harmful?

List Comprehension is an opaque jargon; it hampers communication and encourages misunderstanding. (The phrase is borrowed from math.)

List Comprehension is a redundant concept in programming. It is a very simple list generator, easily done with the map and filter functions. e.g. Python:

list(map(lambda x: 2 * x, filter(lambda x: x % 2 == 0, range(9))))

or by a loop, e.g. Perl:

my @result;
for (0..9) { if ( ($_ % 2) == 0 ) { push @result, $_ * 2 } }
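The same loop, written in Python for comparison (a plain loop, no special syntax involved):

```python
# build the same list with an ordinary loop and an if test
result = []
for n in range(9):
    if n % 2 == 0:
        result.append(2 * n)

assert result == [0, 4, 8, 12, 16]
```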

The special syntax of List Comprehension, as it exists in many langs, is not necessary. If a special-purpose function is preferred, it can simply be a plain function, e.g. genlist(function, list, predicate).

Map + Filter = List Comprehension Semantics

The list comprehension semantics is not necessary. A better way, more in the spirit of functional langs, is simply to combine plain functions:

map(f, filter(predicate, list))

Here is the python syntax:

xx = list(map(lambda x: 2 * x, filter(lambda x: x % 2 == 0, range(9))))

# xx is [0, 4, 8, 12, 16]; the list() call is needed in Python 3, where map and filter return iterators

In Mathematica, this can be written as:

Map[ #*2 &, Select[Range@9, EvenQ]]

In Mathematica, arithmetic operations can be applied to a list directly, without using Map explicitly, so the above can be written as:

Select[Range@9, EvenQ] * 2

It can also be written in a linear prefix style:

(#*2 &) @ (Select[#, EvenQ]&) @ Range @ 9

or linear postfix style:

9 // Range  // (Select[#, EvenQ]&)  // (#*2 &)

In the above, we sequence functions together, as in a unix pipe. We start with 9, apply “Range” to it to get a list from 1 to 9, apply a function that filters out the odd numbers, then apply a function that multiplies each number by 2. The “//” sign is a postfix notation, analogous to bash's “|”, and “@” is a prefix notation, the reverse of “//”.
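Python has no built-in “//”-style pipe operator, but the same left-to-right sequencing can be sketched with a small helper function. (The pipe helper below is hypothetical, written for illustration; it is not a standard library function.)

```python
from functools import reduce

def pipe(value, *funcs):
    # hypothetical helper: thread value through funcs left to right,
    # analogous to Mathematica's "//" or bash's "|"
    return reduce(lambda acc, f: f(acc), funcs, value)

out = pipe(
    9,
    lambda n: range(1, n + 1),                    # Range: 1 to 9
    lambda xs: filter(lambda x: x % 2 == 0, xs),  # keep the even numbers
    lambda xs: map(lambda x: x * 2, xs),          # multiply each by 2
    list,                                         # realize the iterator
)
assert out == [4, 8, 12, 16]
```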

[see Short Intro of Mathematica For Lisp Programers]

List Comprehension Function Without Special Syntax

Suppose we want some “list comprehension” feature in a functional lang. Normally, by default this can be done by

map(func, filter(predicate, inputList))

but perhaps this usage is so frequent that we want to create a new function for it, to make it more convenient. As a single standalone function, it is also easier for a compiler to optimize. So, we might create a list comprehension function like this:

list_comprehension(func, inputList, Predicate)

This is about whether a lang should provide a new convenience function for what otherwise requires combining several functions. Common Lisp and Scheme Lisp are the typical examples of the opposite extremes.

Note, there is no new syntax involved.
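In Python, such a function is two lines. (A sketch; the name and argument order follow this article's hypothetical signature, not any standard library.)

```python
def list_comprehension(func, input_list, predicate):
    # plain function: filter the input, then map func over what remains;
    # no new syntax, just map + filter combined
    return list(map(func, filter(predicate, input_list)))

assert list_comprehension(lambda x: 2 * x, range(9),
                          lambda x: x % 2 == 0) == [0, 4, 8, 12, 16]
```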

Suppose someone argues that the special syntax is more convenient. For instance, this is far more convenient:

xx = [x + 1 for x in [1, 2, 3, 4, 5] if x % 2 == 0]
# [3, 5]

than this:

xx = list(map(lambda x: x + 1, filter(lambda x: x % 2 == 0, [1, 2, 3, 4, 5])))
# [3, 5]

How about this:

genlist(func, inputList, P)

compared to

[expr for myVar in inputList if P]

Issues and Decisions on Creating a New Function

Suppose we decided that generating a list through a filter is so frequently used that it is worthwhile to create a new function:

list_comprehension(func, inputList, Predicate)

Now, in functional langs, a general design principle is that you want to reduce the number of functions, unless one is really needed. Because any combination of list-related functions could potentially become a new function in your lang. So, if we really think list comprehension is useful, we might want to generalize it, to maximize the function's utility.

For example, we might consider: is it worthwhile to add a 4th parameter, so the user can specify returning just the first n elements? Example:

list_comprehension(func, inputList, Predicate, n)
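A sketch of that 4-parameter version in Python (the signature is the hypothetical one from the text; itertools.islice takes the first n items without computing the rest):

```python
from itertools import islice

def list_comprehension(func, input_list, predicate, n=None):
    # hypothetical 4th parameter: return only the first n results
    result = map(func, filter(predicate, input_list))
    if n is not None:
        result = islice(result, n)
    return list(result)

assert list_comprehension(lambda x: 2 * x, range(9),
                          lambda x: x % 2 == 0, 3) == [0, 4, 8]
```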

What about partitioning the list into m sublists?

list_comprehension(func, inputList, Predicate, n, m)

What about a more generalized partition: into m sublists, then m1 sublists, then m2 sublists?

list_comprehension(func, inputList, Predicate, n, list(m,m1,m2,…))

What about sorting? Maybe that's often needed together, whenever you generate a list?

list_comprehension(func, inputList, Predicate, n, list(m,m1,m2,…), sortPredicate)

When we apply a function to a list, sometimes we want to map in parallel to each branch (as in Common Lisp's “map”). So, perhaps we should have another optional parameter, example:

list_comprehension(func, inputList, Predicate, n, list(m,m1,m2,…), sortPredicate,

what if …

You see, each of these, or any combination of them, can be done by default in the lang by sequencing one or more functions (i.e. composition). But when we create a new function, we really should think hard about its justification, because otherwise the lang becomes a bag of non-essential, confusing functions.
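To illustrate in Python: each of the proposed parameters is already covered by composing existing functions, with no new function or syntax needed. (A minimal sketch of two of them.)

```python
# the core: map + filter
evens_doubled = list(map(lambda x: 2 * x,
                         filter(lambda x: x % 2 == 0, range(9))))

# the proposed "sortPredicate" parameter is just sorted();
# the proposed "first n" parameter is just slicing
first_three_desc = sorted(evens_doubled, reverse=True)[:3]

assert evens_doubled == [0, 4, 8, 12, 16]
assert first_three_desc == [16, 12, 8]
```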

So the question is: is generating a list really that much needed? And if so, why should we create a special syntax such as [expr for var in list if P], rather than list_comprehension(func, list, P)?

Also note that list comprehension is not capable of generating arbitrarily nested lists. For an example of a much more powerful list generator, one that can generate arbitrarily nested trees, see Mathematica's Table function.

List Comprehension, Ad Hoc Syntax, Imperative Languages

The real “advantage” of list comprehension is its special syntax, consistent with imperative languages' ad hoc syntax nature. In imperative languages, it is the tradition to have lots of ad hoc syntax and keywords for each construct. For example: i++, ++i, for(;;){}, while(){}, 0x123, expr1 ? expr2 : expr3, sprintf(…%s…,…), etc.

For those who find imperative languages' syntax good, perhaps “list comprehension” is good too, because it adds another idiosyncratic syntax to the lang. The ad hoc syntax and redundant keywords aid in reading code. For example, in the syntax [… for … in …], the square brackets give the programmer a clue that the value is a list, and the keywords {for, in} aid in recognizing the parts of the construct. Compare this to the functional syntax genlist(…, …, …): there is no syntax or keyword that aids reading. The programmer has to understand the function's parameters.

Bad Jargon and How To Judge a Jargon

Someone wrote:

The term “list comprehension” is intuitive, it's based on math set notation.

The jargon “list comprehension” is opaque. It hampers communication and increases misunderstanding. A better name is simply “list generator”.

What is your basis for saying that “List Comprehension” is intuitive? Any statistics, surveys, research, references?

To put this in context: are you saying that “lambda” is also intuitive? Is “let” intuitive? Is “for” intuitive? Is “when” intuitive? I mean, give your evaluation of some common computer language terminologies, and tell us which you think are good and which are bad, so we have some context to judge your expertise on terminology.

For example, let us know, in your view, how good these terms are: currying, lisp1 vs lisp2, tail recursion, closure, subroutine, command, object. Or perhaps expound on the comparative merits and meanings of the terms “module” vs “package” vs “add-on” vs “library”. I would like to see your view on this, with at least a few paragraphs of analysis on each. If you, say, write an essay of at least 1k words on this topic, then we all can make some judgment of your familiarity and understanding in this area.

Also, “being intuitive” is not the only aspect to consider in whether a term is good or bad. For example, consider emacs's use of the term “frame”. It's quite intuitive, because “frame” is a common English word everyone understands. Door frame, window frame, picture frame: all are analogous to emacs's “frame” on a computer. However, by some turn of history, in computer software we now call such a thing a “window”, and by happenstance the term “window” also has a technical meaning in emacs: what we call a “split window” or “pane” today. So in emacs, the terms “frame” and “window” are confusing, because emacs's “frame” is what we call a “window”, while emacs's “window” is what we call a pane. So here is an example where even an intuitive term can still be bad.

As another example, common understanding of a term by the group that uses it is also an important aspect. Consider the term “lambda”, the name of a Greek letter: it does not convey well what we use it for. The word's meaning by itself has no connection to the concept of a function. The “λ” happens to have been used by a logician as a shorthand notation in his study of what's called “lambda calculus” (the “calculus” part is basically 1700s terminology for a systematic science, especially one related to mechanical reasoning). However, the term “lambda” has been used this way in computer science and programming long and wide, around 50 years in recent history (and longer, if we trace origins). So, because of established use, we might consider it a less bad jargon, by the fact that it is already in common usage. Even so, note that even when a term has established usage, if the term itself is very bad in many other aspects, it may still warrant change. For one reason: the jargon will be a learning-curve problem for all new generations.

You see, when you judge a terminology, you have to consider many aspects. It is quite involved. When judging a jargon, some questions you want to ask are:

• Does the jargon convey its meaning by the word itself? (i.e. whether the jargon as a word is effective in communication)

• How long has the jargon been in use?

• Do people in the community understand the jargon? (For example, more scientifically: what percentage?)

Each of these sample questions can get quite involved to answer.

Also, you may not know it, but there are bodies of professional scientists who work on terminology for publication. It is not something like “O think it's good, becus it is intuitive to me.”.


Thanks to w_a_x_man for this Ruby code:

(0..9).select{|n| n.even?}.map{|n| 2*n}

Note that this is not list comprehension, because it does not use a special syntax. But it captures the map(f, filter(predicate, list)) semantics in Ruby style.

This article hit reddit, at www.reddit.com