Programming Language Design: the Hack of Bitmasks Used as Boolean Parameters


In this article, I explain how the use of bitmasks is a hack in many imperative languages.

Often, a function needs to take many true/false parameters. For example, suppose I have a function that draws a rainbow, and each color of the rainbow can be turned on or off individually. My function specification can be of this form: rainbow(red, orange, yellow, green, blue, violet, purple). Each parameter is a true or false value. So, to draw a rainbow with only the red and yellow stripes on, one would write, for example, rainbow(t,f,t,f,f,f,f), where “t” stands for true and “f” stands for false (or whatever values the language's boolean system uses).

The problem with this simple approach is that when a function has too many parameters, “which position means what” becomes difficult to remember and manage. Imagine: MissileControl(t,f,t,t,f,f,f,t,f,t,t,t,t,f,f,f,f,t,f,f). Alternatively, a high-level language may provide a system of named parameters. The function can then be called with 2 arguments, like this: rainbow(red:t, yellow:t), meaning: supply true values to the parameters named “red” and “yellow”. Parameters not given may automatically be assumed to have false values by default. Similarly, the language can simply have the function call look like this: rainbow(red, yellow), where an omitted parameter name simply means false.
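As a minimal sketch of the contrast between positional and named boolean parameters, here is the rainbow function written in Python, which supports named (keyword) parameters with defaults. The function body is an assumption made for illustration: instead of drawing anything, it just returns the names of the stripes that are switched on.

```python
def rainbow(red=False, orange=False, yellow=False, green=False,
            blue=False, violet=False, purple=False):
    """Return the names of the stripes that are switched on, in rainbow order."""
    stripes = {
        "red": red, "orange": orange, "yellow": yellow, "green": green,
        "blue": blue, "violet": violet, "purple": purple,
    }
    return [name for name, on in stripes.items() if on]


# Positional style: the reader must count slots to know which colors are on.
positional = rainbow(True, False, True, False, False, False, False)

# Named style: omitted parameters default to False.
named = rainbow(red=True, yellow=True)

print(positional)
print(named)
```

Both calls ask for the same rainbow, but only the second one says so in the source code.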

In many imperative languages, however, the need for a function with many true/false parameters is met with the bitmask, a concept that came from low-level languages. The call to the rainbow function would now look like this: rainbow(red | yellow). On the surface, it seems just a syntax variation. But actually, the “red” and “yellow” here are global constants of type integer, defined by the language, and the | is a bitwise binary operator. To explain this to an educated person (e.g. a mathematician) who is not a professional programmer gets a bit complex, as one has to drag in binary notation, boolean operations on binary notation realized as a sequence of slots, and the language designer's laziness in resorting to these instead of a high-level interface of named parameters.
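To make the mechanics concrete, here is a rough sketch of the bitmask version in Python. The constant names, the rainbow_mask function, and the choice of one bit per color are all assumptions made for illustration; a real library would pick its own names and bit positions.

```python
# Each color is an integer with exactly one bit set: 1, 2, 4, 8, 16, 32, 64.
RED, ORANGE, YELLOW, GREEN, BLUE, VIOLET, PURPLE = (1 << i for i in range(7))

COLOR_NAMES = ["red", "orange", "yellow", "green", "blue", "violet", "purple"]


def rainbow_mask(flags):
    """Return the names of the colors whose bit is set in the integer flags."""
    return [name for i, name in enumerate(COLOR_NAMES) if flags & (1 << i)]


mask = RED | YELLOW            # bitwise or: 0b001 | 0b100 gives 0b101
stripes = rainbow_mask(mask)   # red and yellow

# Because the "colors" are plain integers, unrelated arithmetic is accepted too.
same_as_mask = RED + YELLOW    # happens to equal RED | YELLOW
not_red_twice = RED + RED      # equals ORANGE, although RED | RED is still RED

print(stripes)
```

The last two lines show the jamming of semantics: the interface reads like a set of named options, yet it silently accepts any integer arithmetic on them.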

The hack of using the so-called bitmask as an interface for functions that need named parameters is similar to languages using 1 and 0 as the true/false symbols of their boolean system, and to languages using the “or” operator || as a substitute for nested “if else” program flow constructs. The problem with these hacks is that they jam logically disparate semantics into the same construct. Their effect is to make the source code more difficult to read, and thus to increase programmer error.
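Python can serve to illustrate both of these sibling hacks, since its bool type is a subclass of int and its “or” operator returns one of its operands. The following is only a small sketch; the stripe_width function and its default of 10 are made up for the example.

```python
# Truth values doubling as integers: arithmetic on booleans is accepted.
count = True + True
is_int = isinstance(True, int)


# The "or" operator doubling as program flow: it returns one of its operands,
# so it is often used in place of an explicit "if ... else".
def stripe_width(width=None):
    return width or 10


default_used = stripe_width()    # None is falsy, so the default is returned
explicit = stripe_width(3)       # 3 is truthy, so it is returned as-is
surprise = stripe_width(0)       # 0 is falsy too, so the caller's 0 is lost

print(count, is_int, default_used, explicit, surprise)
```

The last call is the kind of programmer error the paragraph above refers to: “no value given” and “the value zero” are logically different, but the construct cannot tell them apart.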

It may seem like nitpicking to call this a hack. However, when many such seemingly trivial design flaws appear in a language, they add up to the language's illness: overall they make the language difficult to learn, difficult to read, and difficult to extend, they increase programming errors, and, most importantly, they reduce a clear understanding of key concepts.

Unix and C are the primary perpetrators of this sin. Due to their “$free$”, “speedy”, and “simplistic” nature, like cigarettes given to children, they have passed these designs on to many imperative languages and left programmers not understanding the basic issues of a function's parameters and named parameters.

Examples of the bitmask hack, and of related confusion about parameters:

An example of confusion about a function's parameters is exhibited in unix's bewildering, inconsistent syntaxes for how its command line tools take arguments. (Some parameters influence other parameters. Argument order sometimes matters and sometimes doesn't, sometimes causing unintended output and sometimes causing a syntax error. Named parameters sometimes have optional names(!). Named parameters that are predicates sometimes act by their presence alone, sometimes by their value, sometimes accept both, and sometimes cause a syntax error. Some defaults are supplied to unnamed parameters, and some to named parameters. Some parameters have synonyms. …)

Another example, in a more modern language, is Python's re.search() function for text pattern matching. Its optional third parameter, flags, is a bitmask. (See: Regular Expressions in Python)
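A short sketch of what this looks like in practice, using only the standard re module. The sample text and pattern are made up for the example.

```python
import re

text = "Red stripe\nyellow stripe"

# Two independent on/off options are combined into one argument with bitwise or.
flags = re.IGNORECASE | re.MULTILINE
match = re.search(r"^YELLOW", text, flags)

# Without the flags, the same pattern finds nothing.
no_match = re.search(r"^YELLOW", text)

# The combined flags are, underneath, just an integer, and a plain int is accepted.
same_match = re.search(r"^YELLOW", text, int(flags))

print(match.group(0), no_match, same_match.group(0))
```

With named parameters, the same call could have read something like search(pattern, text, ignorecase=True, multiline=True); that signature is hypothetical and is not part of the re module.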

As an example of not clearly understanding a function's parameters, and the need for and role of named parameters in computing languages, Python's “sorted” function and its “lambda” construct are both victims.


If a bit field appears as a function's parameter, it is not necessarily a hack. Examples are functions dealing with IP addresses or other TCP/IP protocols, where the networking protocol primarily deals with bits, bytes, and bitmasks. Other examples would be functions in a byte processing package, or binary logic algorithms, etc.
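For contrast, here is a sketch of a case where the bitmask is the actual subject matter rather than a stand-in for named parameters: computing the network part of an IPv4 address. The helper names are assumptions made for illustration, and the code handles only well-formed dotted-quad strings.

```python
def to_int(dotted):
    """Convert a dotted-quad IPv4 string such as '192.168.1.77' to a 32-bit integer."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def to_dotted(n):
    """Convert a 32-bit integer back to a dotted-quad IPv4 string."""
    return ".".join(str((n >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def network_address(dotted, prefix_len):
    """Keep the top prefix_len bits of the address and clear the rest."""
    mask = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return to_dotted(to_int(dotted) & mask)


net24 = network_address("192.168.1.77", 24)
net16 = network_address("192.168.1.77", 16)
print(net24, net16)
```

Here the mask is not a bundle of unrelated true/false options. The bits themselves are the data the protocol defines, so bitwise operators are the natural interface.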
