Optional Reading: Accumulator Variables

Accumulator Variables

The second argument to our factorial function in Session 11 is called an accumulator variable. How do we create one when writing a recursive function?

Suppose we started with the standard recursive implementation of factorial:

(define factorial
  (lambda (n)
    (if (zero? n)
        1
        (* n (factorial (sub1 n))))))

What happens on each recursive call?

To compute (factorial n), the function must first compute (factorial (sub1 n)). But to compute (factorial (sub1 n)), it must wait for the result of (factorial (- n 2)), which must wait for the result of (factorial (- n 3)), and so on. This approach makes heavy use of the system stack: it computes every smaller factorial, from (factorial (sub1 n)) all the way down to (factorial 0), before it ever multiplies anything by n.
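Here is a hand trace of (factorial 4), showing how the pending multiplications pile up while the function waits for its recursive calls to return:

(factorial 4)
= (* 4 (factorial 3))
= (* 4 (* 3 (factorial 2)))
= (* 4 (* 3 (* 2 (factorial 1))))
= (* 4 (* 3 (* 2 (* 1 (factorial 0)))))
= (* 4 (* 3 (* 2 (* 1 1))))
= 24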

This process is expensive in its use of space. It is the reason most of us learned early to be wary of recursion for fear of causing a stack overflow.

If only we could write a procedure that evaluates the (* n ...) part of the computation right away. Then we could eliminate the need to save up all those pending computations.

We can do that by reorganizing the way we compute the answer. That is how I created factorial-aps:

(define factorial-aps
  (lambda (n answer)
    (if (zero? n)
        answer
        (factorial-aps (sub1 n) (* n answer)))))

This function evaluates the (* n ...) portion of its work first and then passes the result as an argument on the recursive call, which computes the factorial of (sub1 n). Instead of computing...

n * ( (n-1) * ( (n-2) * ... (2 * 1)))

from the bottom up, as the original function does, factorial-aps computes ...

(((n * (n-1)) * (n-2)) * ... 2) * 1

from the top down. Multiplication is associative, so the answer is the same, and so we are still happy.
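Here is the corresponding hand trace of (factorial-aps 4 1). Each multiplication happens before the next recursive call, so nothing is left pending:

(factorial-aps 4 1)
= (factorial-aps 3 (* 4 1))     ; answer is now 4
= (factorial-aps 2 (* 3 4))     ; answer is now 12
= (factorial-aps 1 (* 2 12))    ; answer is now 24
= (factorial-aps 0 (* 1 24))    ; answer is now 24
= 24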

As we saw in a cool demo of Racket's behavior during Session 2, this function performs much better, because it makes a vast improvement in the amount of space the function uses. Because the recursive call is in tail position, Racket does not have to grow the stack at all; the computation runs in constant space. That is the performance improvement we saw earlier in the session.

The formal parameter answer is known as an accumulator variable. It accumulates the intermediate results of the computation in much the same way that a local variable accumulates a running total in the loop of a procedural program.

Notice that using an accumulator variable usually requires us to create an interface procedure. Because we pass the accumulator as an extra argument on each recursive call, we need a procedure that passes the accumulator's initial value on the first call. That initial value is the identity value of the operation being used. With multiplication, that is 1:

(define factorial
  (lambda (n)
    (factorial-aps n 1)))
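With the interface procedure in place, callers never see the accumulator at all:

(factorial 5)    ; calls (factorial-aps 5 1), which evaluates to 120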

By the way, I use the suffix -aps in the name of my helper function to indicate that it is written in Accumulator Passing Style. That is the name for the style of programming in which we use accumulator variables to track our intermediate solutions.

Using an accumulator variable to implement factorial has the feel of writing a loop. That feeling is not a coincidence; as we saw in Session 11, the two are closely related. In general, accumulator-passing style resembles imperative, sequential programming of the sort you are used to doing in Python, Java, and C. Here, we are just doing it through the order of function applications!
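For comparison, here is a sketch of the same idea written with Racket's for/fold, an explicitly loop-like form. (The name factorial-loop is mine, just for illustration.) It threads a running product through the iteration from n down to 1, much as factorial-aps threads its answer argument:

(define factorial-loop
  (lambda (n)
    (for/fold ((answer 1))          ; answer starts at the identity value
              ((i (in-range n 0 -1)))
      (* i answer))))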

In this example, we used an accumulator variable to create a tail recursive function. However, this is only one use of the technique. The true effect of an accumulator variable is that it gives the programmer greater control over the order of execution. Notice that we used the accumulator in our factorial function to do multiplications before function calls. When we use an accumulator variable, we control the order of execution not by doing things in sequence and rearranging the sequence, but by making function calls and rearranging the order in which we nest arguments.
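As another illustration (a sketch, not something from the session), consider reversing a list in accumulator passing style. Here the accumulator is a list, its initial value is the empty list, and the order in which we cons elements onto it determines the order of the result:

(define reverse-aps
  (lambda (lst acc)
    (if (null? lst)
        acc
        (reverse-aps (cdr lst) (cons (car lst) acc)))))

(define my-reverse                  ; the interface procedure
  (lambda (lst)
    (reverse-aps lst '())))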

Continuation-Passing Style

Note: This is a theoretical digression, even more optional than the discussion above!

A natural extension to the idea of passing an accumulator variable is to pass a function that can be applied to the initial value to compute the desired answer. This defers all of the actual computation until later, which can be handy in a variety of contexts, such as recognizing and handling error conditions.

When the accumulator is a function, we often refer to it as a continuation, because it is the continuation of the computation yet to be done. This may seem strange, but keep in mind that we can pass this function to any function at any time. Passing continuations around — so-called continuation passing style — makes it possible to implement all sorts of exotic control structures, such as exceptions, threads, backtracking, and the like. How? Because the called function gets to decide when — and even if! — to call the continuation.
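To make the idea concrete, here is a sketch of factorial in continuation-passing style. (The name factorial-cps is mine, for illustration.) The extra argument k is a function that says what to do with the result once it is known:

(define factorial-cps
  (lambda (n k)
    (if (zero? n)
        (k 1)
        (factorial-cps (sub1 n)
                       (lambda (result)
                         (k (* n result)))))))

(factorial-cps 5 (lambda (x) x))    ; pass the identity function to get 120

Until the base case is reached, no multiplication happens at all; the function only builds a bigger and bigger continuation. The base case then hands 1 to that continuation, which performs all of the deferred work.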

Scheme is a minimalist language, in that it tends to provide only a necessary core of operations out of which all other operations can be built. This minimalism accounts, for instance, for its lack of built-in loops, which can be simulated with recursion. Likewise, rather than building in exotic control structures directly, Scheme provides support for accessing the "current continuation" of any computation: see the middle of the language definition of control features. With the current continuation of a computation, we can implement most of the control structures we desire!
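As a small taste of what the current continuation makes possible, here is a sketch (not from the session) that uses call-with-current-continuation to escape from a computation early. If the list contains a 0, the product must be 0, so we jump out immediately without performing any multiplications:

(define product
  (lambda (lst)
    (call-with-current-continuation
      (lambda (return)
        (product-helper lst return)))))

(define product-helper
  (lambda (lst return)
    (cond ((null? lst) 1)
          ((zero? (car lst)) (return 0))
          (else (* (car lst) (product-helper (cdr lst) return))))))

(product '(1 2 3 4 5))    ; 120
(product '(6 0 8 9))      ; 0, with no multiplications performed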