In this article, I explain how the use of bit masks is a hack in many imperative languages.
Often, a function needs to take many
true/false parameters. For example, suppose I have a function that can
draw a rainbow, and each color of the rainbow can be turned on or off
individually. My function specification can be of this form:
rainbow(red, orange, yellow, green, blue, violet, purple). Each
parameter is a true or false value. So, to draw a rainbow with only
the red and yellow stripes on, one would write, for example,
rainbow(t,f,t,f,f,f,f), where “t” stands for true and “f” stands for
false (or whatever values the language's boolean system uses for
true and false).
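A minimal sketch of this positional style, written in Python purely for illustration. Only the parameter list comes from the specification above; the function body, which returns stripe names instead of drawing anything, is an assumption.

```python
def rainbow(red, orange, yellow, green, blue, violet, purple):
    """Return the names of the stripes that are switched on.

    Hypothetical stand-in for a drawing routine: a real version would
    paint the stripes instead of returning their names.
    """
    stripes = [
        ("red", red), ("orange", orange), ("yellow", yellow),
        ("green", green), ("blue", blue), ("violet", violet),
        ("purple", purple),
    ]
    return [name for name, on in stripes if on]


t, f = True, False

# Only the position tells the reader which stripe each value controls.
print(rainbow(t, f, t, f, f, f, f))  # ['red', 'yellow']
```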
The problem with this simple approach is that when a function has too
many parameters, “which position means what” becomes difficult to
remember and manage.
Imagine: MissileControl(t,f,t,t,f,f,f,t,f,t,t,t,t,f,f,f,f,t,f,f).
Alternatively, a high-level language may provide
a system for named parameters. So, for example, the function may be
called like this with 2 arguments: rainbow(red:t, yellow:t), meaning,
supply true values to the parameters named “red” and “yellow”.
Parameters not given may automatically be assumed to have false values by
default. Similarly, the language can simply have the function call
look like this: rainbow(red, yellow), where omitted parameter names
simply mean false.
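As a rough sketch of the named-parameter style, here is how it could look with Python's keyword arguments. The function body is an assumption for illustration; the point is only that every stripe has a name and defaults to false.

```python
def rainbow(*, red=False, orange=False, yellow=False, green=False,
            blue=False, violet=False, purple=False):
    """Hypothetical rainbow(): each stripe is a keyword-only flag, off by default."""
    flags = {
        "red": red, "orange": orange, "yellow": yellow, "green": green,
        "blue": blue, "violet": violet, "purple": purple,
    }
    # Return the names of the stripes that were switched on.
    return [name for name, on in flags.items() if on]


# The call site names what it turns on; everything else stays false.
print(rainbow(red=True, yellow=True))  # ['red', 'yellow']
```

Because the parameters are keyword-only, a call such as rainbow(True, False, True) is rejected, so the "which position means what" problem cannot arise.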
In many imperative languages, the need for a function with many
true/false parameters is dealt with by using the
concept of a bit-mask, which came from low-level languages. The way to
call this rainbow function would now look like this:
rainbow(red | yellow). On the surface, it seems just a syntax
variation. But actually, the “red” and “yellow” here are global
constants of type integer, defined by the language, and the | is
actually a bit-wise binary operator. To explain this to an educated
person (e.g. a mathematician) who is not a professional
programmer, it gets a bit complex, as one has to drag in binary
notation, boolean operations on binary notation realized as a sequence
of slots, and the language designer's laziness in resorting to
these instead of a high-level interface of named parameters.
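To make the mechanics concrete, here is a small sketch of the bit-mask interface in Python. The constant names (upper-cased here by Python convention), their bit values, and the function body are all assumptions for illustration.

```python
# Hypothetical global integer constants: one bit per stripe.
RED    = 0b0000001  # 1
ORANGE = 0b0000010  # 2
YELLOW = 0b0000100  # 4
GREEN  = 0b0001000  # 8
BLUE   = 0b0010000  # 16
VIOLET = 0b0100000  # 32
PURPLE = 0b1000000  # 64

NAMES = {RED: "red", ORANGE: "orange", YELLOW: "yellow", GREEN: "green",
         BLUE: "blue", VIOLET: "violet", PURPLE: "purple"}


def rainbow(mask):
    """Return the stripes whose bit is set in the integer mask."""
    return [name for bit, name in NAMES.items() if mask & bit]


mask = RED | YELLOW         # bitwise OR of the integers 1 and 4
print(mask, bin(mask))      # 5 0b101
print(rainbow(mask))        # ['red', 'yellow']

# The parameter is just an integer, so an unrelated number is accepted too.
print(rainbow(5))           # ['red', 'yellow']
```

The last call shows the cost of the design: the function cannot tell a deliberate choice of stripes from an arbitrary integer.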
The hack of using the so-called bit-mask as an interface for functions
that need named parameters is similar to languages using 1 and 0
as the true/false symbols in their boolean system, and languages using
the or operator || as a method of nested “if else” program flow
constructs. The problem with these hacks is that they jam logically
disparate semantics into the same construct. Their effect is to make
the source code more difficult to read, and thus to increase programmer
error.
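Both of these related hacks can be observed in Python, used here only as a convenient illustration (its “or” keyword plays the role of ||). The helper function names are hypothetical.

```python
# Booleans double as integers: True and False behave like 1 and 0.
print(True + True)    # 2
print(True == 1)      # True


def first_nonempty(a, b, fallback):
    # "or" used as program flow: evaluation stops at the first truthy operand.
    return a or b or fallback


def first_nonempty_explicit(a, b, fallback):
    # The same logic spelled out as the nested "if else" it stands in for.
    if a:
        return a
    elif b:
        return b
    else:
        return fallback


print(first_nonempty("", "yellow", "none"))           # yellow
print(first_nonempty_explicit("", "yellow", "none"))  # yellow
```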
It may seem like nitpicking to say that it is a hack. However, when many such seemingly trivial improper designs appear in a language, they add up to the language's illness, and overall make the language difficult to learn, difficult to read, and difficult to extend, increase programming errors, and, most importantly, reduce a clear understanding of key concepts.
Unix and C are the primary perpetrators of this sin. Due to their “free”, “speedy”, and “simplistic” nature, like cigarettes given to children, they have passed these designs to many imperative languages and left programmers not understanding the basic issues of a function's parameters and named parameters.
Examples of using a bitmask as a hack:
An example of confusion about a function's parameters is exhibited in unix's bewildering, inconsistent syntaxes in the ways its command line tools take arguments. (Some parameters influence other parameters. Argument order sometimes matters and sometimes doesn't, sometimes causing unintended output and sometimes causing a syntax error. Named parameters sometimes have optional names(!). Named parameters that are predicates sometimes act by their presence alone, sometimes by their value, sometimes accept both, and sometimes cause a syntax error. Some defaults are supplied to unnamed parameters, and some to named parameters. Some parameters have synonyms. …)
Another example, in a more modern language, is Python's
re.search() function for text pattern matching. Its optional third
parameter is a bitmask. (See: Regular Expressions in Python)
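A short sketch of that third parameter in use. The sample text and pattern are made up for illustration; re.search, re.IGNORECASE, and re.MULTILINE are from Python's standard re module.

```python
import re

text = "Red stripe\nyellow stripe"

# The third argument combines two options with the bitwise OR operator.
match = re.search(r"^YELLOW", text, re.IGNORECASE | re.MULTILINE)
print(match.group(0))               # yellow

# Without the flags, the same pattern finds nothing.
print(re.search(r"^YELLOW", text))  # None

# The flags are integer-valued, so the combination is just a number.
print(int(re.IGNORECASE), int(re.MULTILINE),
      int(re.IGNORECASE | re.MULTILINE))  # 2 8 10
```

The parameter can also be passed by name, as flags=re.IGNORECASE | re.MULTILINE, but the individual options are still merged into a single integer.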
As an example of not clearly understanding a function's parameters, and the need for and role of named parameters in computing languages, Python's “sorted” function and its “lambda” construct are both victims.
If a bit-field appears as a function's parameter, it is not necessarily a hack. Examples are functions dealing with IP addresses or other parts of the TCP/IP protocols, where the networking protocol primarily deals with bits, bytes, and bitmasks. Other examples would be functions in a byte processing package, or a binary logic algorithm, … etc.
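A small sketch of such a legitimate use, where the mask is the subject matter itself and not a stand-in for named parameters. The address and netmask values are made up for illustration; the ipaddress module is part of Python's standard library.

```python
import ipaddress

address = ipaddress.IPv4Address("192.168.1.37")
netmask = ipaddress.IPv4Address("255.255.255.0")

# Bitwise AND with the netmask keeps only the network part of the address.
network = ipaddress.IPv4Address(int(address) & int(netmask))
print(network)  # 192.168.1.0

# AND with the inverted mask (limited to 32 bits) keeps the host part.
host = int(address) & ~int(netmask) & 0xFFFFFFFF
print(host)     # 37
```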