Optional Reading: Accumulator Variables
Accumulator Variables
The second argument to our factorial
function in
Session 11
is called an accumulator variable. How do we create one
when writing a recursive function?
Suppose we started with the standard recursive implementation of factorial:
(define factorial
  (lambda (n)
    (if (zero? n)
        1
        (* n (factorial (sub1 n))))))
What happens on each recursive call?
- factorial must wait for the result of (factorial (sub1 n)) before it can apply the * function to n and the result.
- To wait, it must remember the value of n and the value of *. As you may have learned in prior courses, each call to factorial requires its own stack frame to remember the state of its computation.
But to compute (factorial (sub1 n)), factorial must wait for the result of (factorial (- n 2)), which must wait for the result of (factorial (- n 3)), which must wait for the result of ... and so on. This approach makes heavy use of the system stack: it computes all of the (factorial (- n k)) values, from (factorial (- n 1)) down to (factorial 0), before it multiplies anything by n.
This process is expensive in its use of space. It is the reason most of us learned early to be wary of recursion for fear of causing a stack overflow.
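To make the pending work concrete, here is a hand trace of (factorial 3), written as comments beneath the standard definition. It is only a sketch of how the substitution unfolds, not output produced by Racket. Each line that still contains a call to factorial stands for multiplications that are waiting on the stack.

```racket
#lang racket

;; The standard recursive definition of factorial.
(define factorial
  (lambda (n)
    (if (zero? n)
        1
        (* n (factorial (sub1 n))))))

;; Hand trace of (factorial 3). No multiplication happens until
;; the base case returns; every pending (* n ...) needs its own frame.
;;
;;   (factorial 3)
;;   (* 3 (factorial 2))
;;   (* 3 (* 2 (factorial 1)))
;;   (* 3 (* 2 (* 1 (factorial 0))))
;;   (* 3 (* 2 (* 1 1)))
;;   (* 3 (* 2 1))
;;   (* 3 2)
;;   6

(factorial 3) ; evaluates to 6
```

The expression grows to its widest point before any arithmetic is done, and that width is the stack space the paragraph above describes.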
If only we could write a procedure that evaluates the
(* n ...)
part of the computation right away. Then we
could eliminate the need to save up all those pending computations.
We can do that by reorganizing the way we compute the answer. That's how I created factorial-aps:
(define factorial-aps
  (lambda (n answer)
    (if (zero? n)
        answer
        (factorial-aps (sub1 n) (* n answer)))))
This function evaluates the (* n ...) portion of its work first and then passes that result as an argument on the recursive call that handles (sub1 n). Instead of computing ...
n * ( (n-1) * ( (n-2) * ... (2 * 1)))
from the bottom up, as the original function does, factorial-aps computes ...
(((n * (n-1)) * (n-2)) * ... 2) * 1
from the top down. Multiplication is associative, so the answer is the same, and so we are still happy.
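A hand trace of (factorial-aps 3 1) shows the difference. As before, the trace is my own sketch written in comments, not Racket output. Notice that each line is a single call with nothing left waiting around it.

```racket
#lang racket

;; The accumulator-passing version of factorial.
(define factorial-aps
  (lambda (n answer)
    (if (zero? n)
        answer
        (factorial-aps (sub1 n) (* n answer)))))

;; Hand trace of (factorial-aps 3 1). The multiplication is done
;; before each recursive call, so the expression never grows.
;;
;;   (factorial-aps 3 1)
;;   (factorial-aps 2 3)   ; (* 3 1)
;;   (factorial-aps 1 6)   ; (* 2 3)
;;   (factorial-aps 0 6)   ; (* 1 6)
;;   6

(factorial-aps 3 1) ; evaluates to 6
```

Because the recursive call is the last thing the function does, Racket can reuse the current frame instead of stacking a new one.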
As we saw in a cool demo of Racket's behavior during Session 2, this function offers phenomenal performance. The recursive call is the last thing the function does, so no multiplication is left pending and the function needs far less space. That is the performance improvement we saw earlier in the session.
The formal parameter answer
is known as an
accumulator variable. It accumulates the intermediate
results of the computation in much the same way that a local
variable accumulates a running total in the loop of a procedural
program.
Notice that using an accumulator variable usually requires us to create an interface procedure. We have to pass the accumulator as an extra argument on each recursive call. The interface procedure passes the initial value of the accumulator on the first call. This value is the identity value of the operation being used. With multiplication, that is 1:
(define factorial
  (lambda (n)
    (factorial-aps n 1)))
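The same pattern applies to other operations. As a small sketch, here is a function that sums a list of numbers. The names sum-list and sum-list-aps are hypothetical, chosen only for this illustration. The operation is addition, so the interface procedure starts the accumulator at 0, the identity for +.

```racket
#lang racket

;; Hypothetical helper written in accumulator-passing style.
;; total holds the sum of the numbers seen so far.
(define sum-list-aps
  (lambda (lon total)
    (if (null? lon)
        total
        (sum-list-aps (cdr lon) (+ (car lon) total)))))

;; Interface procedure: supplies the initial accumulator, 0,
;; which is the identity value for addition.
(define sum-list
  (lambda (lon)
    (sum-list-aps lon 0)))

(sum-list '(1 2 3 4)) ; evaluates to 10
```

Callers see only the one-argument sum-list; the extra argument stays an implementation detail of the helper.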
By the way, I use the suffix -aps in the name of my helper function to indicate that it is written in Accumulator Passing Style. That is the name for the style of programming in which we use accumulator variables to track our intermediate solutions.
Using an accumulator variable to implement factorial
has the feel of writing a loop. The fact that using an accumulator
variable gives us these feelings is not a coincidence; as we saw in
Session 11,
they are closely related. In general, accumulator-passing style
resembles imperative, sequential programming of the sort you are
used to doing in Python, Java, and C. Here, we are just doing it
through the order of function applications!
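For comparison, here is one way the same computation might look as an explicit loop. This sketch uses Racket's for/fold form, which is a Racket convenience rather than part of the minimal Scheme core, and the name factorial-loop is mine, chosen for illustration.

```racket
#lang racket

;; Hypothetical loop version of factorial.
;; answer plays the same role as the accumulator argument:
;; it starts at 1 and is updated once per iteration.
(define factorial-loop
  (lambda (n)
    (for/fold ([answer 1])
              ([i (in-range 1 (add1 n))])
      (* i answer))))

(factorial-loop 5) ; evaluates to 120
```

The loop variable answer and the accumulator parameter do the same job. The only difference is whether the update happens in a loop body or in the argument of a function call.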
In this example, we used an accumulator variable to create a tail
recursive function. However, this is only one use of the technique.
The true effect of an accumulator variable is that it gives the
programmer greater control over the order of execution.
Notice that we used the accumulator in our factorial
function to do multiplications before function calls. When we
use an accumulator variable, we control the order of execution
not by doing things in sequence and rearranging the sequence,
but by making function calls and rearranging the order in which
we nest arguments.
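Reversing a list is a handy illustration of this control over order. In the sketch below, reverse-aps and my-reverse are hypothetical names used only for this example. The helper conses each item onto the accumulator before making the recursive call, so the first item of the input ends up last in the result.

```racket
#lang racket

;; Hypothetical helper: result holds the items seen so far,
;; most recent first.
(define reverse-aps
  (lambda (lst result)
    (if (null? lst)
        result
        (reverse-aps (cdr lst) (cons (car lst) result)))))

;; Interface procedure: the accumulator starts as the empty list.
(define my-reverse
  (lambda (lst)
    (reverse-aps lst '())))

(my-reverse '(1 2 3)) ; evaluates to '(3 2 1)
```

If we instead wrote (cons (car lst) (recursive-call (cdr lst))), doing the cons after the call returns, we would get the list back in its original order. Moving the cons into the argument is what changes the order of the work.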
Continuation-Passing Style
Note: This is a theoretical digression, even more optional than the discussion above!
A natural extension to the idea of passing an accumulator variable is to pass a function that can be applied to the initial value to compute the desired answer. This defers all of the actual computation until later, which can be handy in a variety of contexts, such as recognizing and handling error conditions.
When the accumulator is a function, we often refer to it as a continuation, because it is the continuation of the computation yet to be done. This may seem strange, but keep in mind that we can pass this function to any function at any time. Passing continuations around — so-called continuation passing style — makes it possible to implement all sorts of exotic control structures, such as exceptions, threads, backtracking, and the like. How? Because the called function gets to decide when — and even if! — to call the continuation.
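Here is a rough sketch of factorial in this style. The name factorial-cps and the parameter name k are my own choices for illustration. Instead of a number, the second argument is a one-argument function that says what to do with the result.

```racket
#lang racket

;; Hypothetical continuation-passing version of factorial.
;; k is the continuation: it receives the factorial of n.
(define factorial-cps
  (lambda (n k)
    (if (zero? n)
        (k 1)                          ; apply the continuation to the initial value
        (factorial-cps (sub1 n)
                       (lambda (v)     ; new continuation: multiply by n,
                         (k (* n v))))))) ; then continue with k

;; Interface procedure: the initial continuation is the identity function.
(define factorial
  (lambda (n)
    (factorial-cps n (lambda (v) v))))

(factorial 5) ; evaluates to 120
```

No multiplication happens until the base case applies k to 1. All of the work has been packed into the nested functions, which is the sense in which the computation is deferred.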
Scheme is a minimalist language, in that it tends to provide only a necessary core of operations out of which all other operations can be built. This minimalism accounts for its lack of loops, for instance, which can be simulated recursively. Rather than build in many separate control structures, Scheme provides support for accessing the "current continuation" of any computation through the procedure call-with-current-continuation: see the middle of the language definition of control features. With the current continuation of a computation, we can implement most of the control structures we desire!
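As one small illustration, the sketch below uses call/cc, Racket's short name for call-with-current-continuation, to leave a recursive computation early. The function list-product and the name return are hypothetical, chosen for this example. If the list contains a zero, the function jumps straight out with 0 and abandons the multiplications that were waiting.

```racket
#lang racket

;; Hypothetical example: multiply the numbers in a list,
;; escaping immediately if a zero is found.
(define list-product
  (lambda (lon)
    (call/cc
      (lambda (return)                 ; return is the current continuation
        (let loop ([rest lon])
          (cond [(null? rest) 1]
                [(zero? (car rest)) (return 0)] ; skip all pending work
                [else (* (car rest) (loop (cdr rest)))]))))))

(list-product '(2 3 4)) ; evaluates to 24
(list-product '(2 0 4)) ; evaluates to 0
```

This is the same idea as an early return or an exception in other languages, built here from nothing more than a function that captures "the rest of the computation."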