Learning Notes on Goto, Continuation, Coroutine

By Xah Lee. Date: 2010-02-05

Some learning notes on Goto, Continuation, Coroutine, in programing languages.

a Semantic Feature of a Language

I realized today that these concepts are not about implementation, hardware, efficiency, nor algorithm, but rather, they are a particular construct, choice of design, in the semantics of computer languages. For example, to express a repetition, some language has “for” loop, some has “while”, some has “do … until”, some relies on “Nest”, “Fold”, “FixedPoint”, most languages have a combination of these, but any single one of them is sufficient with respect to expressing a algorithm. Same thing can be said for constructs like “case”, “which”, “switch”, “if then elseif … elseif …”, all can be expressed by just having a simple “if else” construct. Continuations and Coroutine, are a bit more complex, but they are basically a form of GOTO statement.

More specifically, i realized, that any language without any of goto, continuation, coroutine, does not necessarily mean the language is less powerful, or less expressive, or require more lines of code in practice. In fact, i think a language with them is probably not desirable. I think these are mostly useful in concurrency applications, and languages designed for concurrency have better alternative models of control than any of Goto, Continuation, Coroutine.

These language design issues are quite interesting. As far as i know, there is not any practical, rigorous mathematical foundation about this.

Alternatives and Absence of Constructs

When you code in some lang, suppose it doesn't have “do … until”, but has “while”, it is trivial to code around it. However, in many situations, it is not easy to code one construct using another. If you have coded in several languages, you'll know this. Often the difficulty is due to unfamiliarity with the lang, and you are coding in lang X with lang Y habits. However, it does happen sometimes you are expert in both languages and certain construct is optimal for your task but lang X doesn't have it.

For example, when you code in a lang using nested “for” loop construct and need conditional exit in several places in inner loops, and if the lang does not have any “break” or “continue”, usually it is quite a pain; you have to re-structure your whole nested loop logic. Also, in some functional langs that doesn't have any sort of iterations but only “map” and “nest” constructs that applies to a whole list, it is also a pain in the ass. Sometimes a iteration type of construct is the most clear and optimal. Likewise, sometimes all you need is a Map or Fold, or passing functions as a expression as a argument, but most imperative langs don't have it.

It's important to note here a few issues:

In online forums, most programer's complaints about some lang lacking some construct is due to their unfamiliarity of the lang.
There are situations that some construct is optimal and not in the lang, even for a well designed lang.
There is not a PRACTICAL, rigorous, mathematical model (algebra), that can tell us what construct should be in a lang with a given goal/context, nor even how to measure the expressiveness of a lang.

I've been doing functional programing for 15 years. I consider myself to be among the world's top 500 expert in Mathematica. Over the years i learned a little bit of other functional langs: Scheme, Haskell, OCaml. But never obtained any practical coding expertise in these (except Emacs Lisp). In argument in rowdy discussion forums such as newsgroup comp.lang.scheme, often some tech geekers will throw out terms like continuation and coroutine and taunt you for ignorance. Sometimes i challenge them to explain what these really means. I want them to tell me, mathematically, what these are. I was never able to get any answer. Of course, today i know, that a mathematical answer is impossible, because there is no math model to speak of, and continuation and coroutine are rather complex to explain because they involve passing program state, which, inevitably involves implementation issues such as “stacks”, which has very little relevance to a language's semantics expressed as a piece of math. However, it is possible to give a clear context of what these construct means, or their significance, in the domain of language semantics and design. However, to do that is beyond their ability, either in actually understanding it, or expressing it in words.

Continuation and Coroutine in One Sentence Summary

So, what's continuation in one sentence? We can say that a continuation is a controlled form of goto statement. Instead of simply moving execution point from one place to another, it also passes the current program state (For example, local vars).

So, what's coroutine in one sentence? We can say that a coroutine is a subroutine/function in which you can call other subroutine in the block and passing all local state info to the called routine. In a sense, it is a local form of continuation. So, suppose you have subroutine f and g. f can call g in its definition, and when it calls g, it passes all state info to g, and f is simply done and exits, and all control is now in g. In particular, f does not wait for g to finish and return control to f. Likewise, g can call f the same way. So, 2 subroutines can mutually call each other and form a recursion, but without incurring the non-ending resource growth problem. (in this situation, it's more like bouncing a ping-pong ball between two pads.)

The following explains in more detail.

Understanding Goto

For modern programers, to understand goto, continuation, coroutine, a easy way to understand them is by first understanding goto as used in 1960 or 1970's languages. First, read Wikipedia Goto .

Unstructured vs Structured Programing

Essentially, older languages, like BASIC, at that time, does not have much of a concept like today's subroutines, code blocks such as “while”, “for” etc, and of course no libraries or module concept. Here's a code example from Wikipedia that illustrate this:

10   INPUT "What is your name: ", U$
20   PRINT "Hello "; U$
30   INPUT "How many stars do you want: ", N
40   S$ = ""
50   FOR I = 1 TO N
60   S$ = S$ + "*"
70   NEXT I
80   PRINT S$
90   INPUT "Do you want more stars? ", A$
100  IF LEN(A$) = 0 THEN GOTO 90
110  A$ = LEFT$(A$, 1)
120  IF A$ = "Y" OR A$ = "y" THEN GOTO 30
130  PRINT "Goodbye "; U$
140  END

Note that basically a source code is just one single file of hundreds of lines. There's not much of a code block or subroutine, not much of local variables, and the program basically just execute lines sequentially, and GOTO is used to jump to different lines, as primary means of aptly-called “control flow” for program logic.

Now look at the Wikipedia article on GOTO again, you'll understand it. Now, you can imagine this unstructured way is complex and is a problem. The use of GOTO as flow control turns your code logic into spaghetti. There is no clarity of logical units. So, in the 1970s or 1980s, languages evolved into what's called structured programing. This is familiar to programers today. Regardless what language you use, we are all doing structured programing. So, what's structured programing? You can see from this example of improved BASIC language:

INPUT "What is your name: ", UserName$
PRINT "Hello "; UserName$
DO
  INPUT "How many stars do you want: ", NumStars
  Stars$ = STRING$(NumStars, "*")
  PRINT Stars$
  DO
    INPUT "Do you want more stars? ", Answer$
  LOOP UNTIL Answer$ <> ""
  Answer$ = LEFT$(Answer$, 1)
LOOP WHILE UCASE$(Answer$) = "Y"
PRINT "Goodbye "; UserName$

You can see that now there's more clear concept of code blocks or code units, and GOTO is not used here. This is more of a Structured programming, which is familiar to all of us. But more interesting is that this switch from unstructured to structured languages isn't so obvious in the 1970s. You can read Structured program theorem, and see back then it was heatedly debated, and in fact the theoretical background was just budding.

At this point, it is worth to remember that whether the language is unstructured or allows structured code, both are equivalent with respect to the possible algorithms they can express. More precisely, here's how Wikipedia puts it:

[Structured program theorem] states that every computable function can be implemented in a programming language that combines subprograms in only three specific ways. These three control structures are

(1) Executing one subprogram, and then another subprogram (sequence)

(2) Executing one of two subprograms according to the value of a boolean variable (selection)

(3) Executing a subprogram until a boolean variable is true (iteration)

Understanding Continuation

Continuations is in a sense a controlled form of goto. Scan these articles:

The last one is most easy to understand. By understanding continuation passing style with actual code, you might get some idea about continuation.

Suppose we define a function named “times”, that takes 2 arguments and multiply them together. However, we make it to have 3 parameters. The 3rd parameter will be a function with 1 argument. (the purpose of the 3rd parameter will be explained soon) The “return value” of times is the return value of the f applied to “x * y”.

def f (x y g) {
g(x * y);
}

(define (times x y f)
  (f (* x y)))

So, to compute “x * y” using our “times” function, we give it a third argument of a “identity” function, which takes one argument and returns it. (it doesn't do anything)

; define the “identity” function
(define (identity x) x)

; compute “3 * 4” using our function “times”
(times 3 4 identity)

Following is a code in Scheme Lisp that defines a function “pyth” that computes Sqrt[x^2+y^2].

(define (pyth x y)
  (sqrt (+ (* x x) (* y y))))

Following is the same function written in Continuation Passing Style.

(define (pyth x y k)
  (* x x (lambda (x2)
           (* y y (lambda (y2)
                    (+ x2 y2 (lambda (x2py2)
                               (sqrt x2py2 k))))))))

You can see that, basically it is like a style of piping, or called filtering or function sequencing.

Am not clear what is the significance of this continuation passing. Or, what's the level or context of this style. Is it implementation of a language (compiler)? Syntax design? Language semantic design? or just different style of coding of a given language for a given problem? If the last, then it doesn't make much sense, since it is better or easier to write it simply as nested function or function sequence. Then, perhaps with continuation passing style, the compiler creates efficient code… but then that is not necessarily so due to the style, since there is no reason a function sequencing or nesting style wouldn't produce the same background for creating efficient machine code.

Don't fully understand continuation yet, will have to dig later.

Understanding Coroutine

Here is Wikipedia quote:

In computer science, coroutines are program components that generalize subroutines to allow multiple entry points for suspending and resuming execution at certain locations. Coroutines are well-suited for implementing more familiar program components such as cooperative tasks, iterators, infinite lists and pipes.

Here is a sample pseudo-code:

var q := new queue

coroutine produce
    loop
        while q is not full
            create some new items
            add the items to q
        yield to consume

coroutine consume
    loop
        while q is not empty
            remove some items from q
            use the items
        yield to produce

As with Continuation, will need to have hands on coding experience of a lang with that feature to really understand fully.

2010-01-04

Coroutine as a Model of Instruction Set for Abacus

Was reading about Eve online, a massive multi-player online spaceship game. Here's a interesting quote:

Both the server and the client software for Eve Online are developed in Stackless Python, a variant of the Python programming language. Stackless Python allows a relatively large number of players to perform tasks without the overhead of using the call stack used in the standard Python distribution. This frees the game developers from performing some routine work and allows them to apply changes to the game universe without resetting the server.

Humm… Stackless Python. And then on this article, it mentioned that Second Life is also using it. The source of this claim is here http://wiki.secondlife.com/w/index.php?title=Eventlet&oldid=51543 .

Central to the issue here is Coroutine, which i never understood. From reading the Wikipedia article, it appears to be a language feature that allows mutual loop. For example, 2 functions f and g, each calling the other at the end. Normally, that would quickly run out of memory or such. hum… still not sure i fully understand. How is the mathematical nature of this feature?

When thinking about computers, often i model it to manipulating abacus. (you could use Turing Machine to model but more complicated to think about) So, a subroutine, is then a reusable part of algorithm, or, reusable part of instruction on how to manipulate the abacus. So, normally, subroutine is expected to terminate, meaning, given a unit of abacus manipulation instruction set, the way you use this is to expect it to reach the end of the instruction set, then follow another instruction set. But coroutine, is then such a instruction set that is not expected to reach a end. Rather, you expect that in the middle of the instruction to follow some other instruction set (kinda like Goto). This all makes much sense now.

Some Wikipedia Reading on Compiler Optimization

Some easy to understand compiler optimization techniques: Constant folding, Dead code elimination .

Also, to facilitate the above optimizations , modern compilers uses a Static single assignment form for their intermediate form.

On reading these, gave me some thoughts about how to write programs today. For example, in the past, especially functional programing, you avoid defining variables, so that your code runs faster. For example, say you want a constant “delayTime” to be 2 hours, and you may write delayTime = 60*60*2, which is easier to understand than delayTime = 7200. However, ten years ago you may want to use the hard-coded version because that way you avoid computing some multiplication. But today, compiler technology has advanced a lot. The bottom line is: Always write your code in the MOST human readable way, and never optimize until in the final stage, and if you really have a NEED. Also, when u optimize, the most important step is the algorithmic level. That is, exam your approach to the problem and the overall algorithm chosen. After that, fix bottle-necks. The aspect of optimization you should have the least concern, is diddling at the code level, such as trying to use more “idioms”. The reasons for this principle are few:

(1) Hardware speed increases faster than typical software have to perform, in general.
(2) Compiler technology optimizes away mathematical equivalences in a way more able than human ever possibly can.
(3) Premature optimization is harmful and wasteful. It reduces your code's flexibility, and human time is some thousand times more expensive than cpu time.

Tech geeking programers often have a unhealthy infatuation with optimization by diddling their code (For example, perlers, and otherwise “idiom” lovers, or “pythonic” faak.)

For example of how technology fundamentally changes programing style in a major way, see Guy Steele on Parallel Programing .

More items to study: