CS 3540 Reading Program Derivation

Program Derivation

Increasing Efficiency Through Program Derivation

Our original definition of subst in Session 10 was somewhat confusing — both to read and to write. We then saw that following the BNF and using mutual recursion made the code easier to write and easier to understand. This ease comes, however, at the cost of extra function calls.

How so?

Notice: we now make two function calls each time the first of the s-list is an s-list: one to subst-symbol-expr, and then an immediate return call to subst. Such "double dispatch" can be expensive on a large dataset.

Sometimes, the run-time costs introduced by mutual recursion outweigh the program-time and read-time benefits of the separate functions. Can we modify our definition without losing too many of its benefits?

We can use Racket's substitution model to get back to a single function. Our solution currently looks like this:

(define (subst new old slist)
  (if (null? slist)
      '()
      (cons (subst-symbol-expr new old (first slist))
            (subst new old (rest slist)))))

(define (subst-symbol-expr new old symexp)
  (if (symbol? symexp)
      (if (eq? symexp old) new symexp)
      (subst new old symexp)))

We can substitute the definition of subst-symbol-expr into subst, using the standard rules from the substitution model. Conceptually, this is the same step the Racket evaluator takes at run-time when it applies a function. First, we substitute the lambda in place of the name:

(define subst
  (lambda (new old slist) 
    (if (null? slist)
        '()
        (cons ( (lambda (new old symexp)       ;; 
                  (if (symbol? symexp)         ;; Here
                      (if (eq? symexp old)     ;; is
                              new              ;; the
                              symexp)          ;; first
                      (subst new old symexp))) ;; substitution.
                new old (first slist))
              (subst new old (rest slist))))))

Next, we replace the application of the lambda with the body of the lambda, substituting the arguments for the corresponding formal parameters: new for new, old for old, and (first slist) for symexp:

(define subst
  (lambda (new old slist) 
      (if (null? slist)
          '()
          (cons (if (symbol? (first slist))         ;;
                    (if (eq? (first slist) old)     ;; Here is
                        new                         ;; the second
                        (first slist))              ;; substitution.
                    (subst new old (first slist)))  ;;
                (subst new old (rest slist))))))

The result is a single function that behaves exactly like the two original functions. After all, all we did was to derive by hand the same result that the Racket evaluator will produce. So, provided that we made no errors in our derivation, the resulting function has the same functionality. Our unit tests can help us ensure that we haven't broken the code.
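To make the point about unit tests concrete, here is a minimal sketch that checks the derived function against the original pair. It assumes the rackunit library, and the names subst-v1, subst-symbol-expr-v1, and subst-v2 are hypothetical: they exist only so that both versions can live in one file.

```racket
#lang racket
(require rackunit)

;; Original two-function version, renamed (hypothetical names) so that
;; it can sit next to the derived version.
(define (subst-v1 new old slist)
  (if (null? slist)
      '()
      (cons (subst-symbol-expr-v1 new old (first slist))
            (subst-v1 new old (rest slist)))))

(define (subst-symbol-expr-v1 new old symexp)
  (if (symbol? symexp)
      (if (eq? symexp old) new symexp)
      (subst-v1 new old symexp)))

;; Derived single-function version.
(define (subst-v2 new old slist)
  (if (null? slist)
      '()
      (cons (if (symbol? (first slist))
                (if (eq? (first slist) old)
                    new
                    (first slist))
                (subst-v2 new old (first slist)))
            (subst-v2 new old (rest slist)))))

;; Both versions must agree on every sample s-list.
(for ([slist (list '() '(a) '(b) '(a b a) '((a) b (c (a))) '(() (a ())))])
  (check-equal? (subst-v2 'x 'a slist)
                (subst-v1 'x 'a slist)))

;; One check against a hand-computed answer as well.
(check-equal? (subst-v2 'x 'a '((a) b (c (a)))) '((x) b (c (x))))
```

Agreement on a handful of inputs is evidence, not proof, but it catches the most likely slip in a derivation: a parameter name that was not replaced.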

However, the new version is more efficient, because it eliminates the extra function calls. We hope that it is nearly as readable as the two-function version.

Take a closer look. The derived function is not like the single-function solution we wrote earlier. That function repeated the expression (subst new old (cdr slist)) several times, because we worked through the details of every possible case. Using mutual recursion followed by program derivation — letting Racket's substitution model do some of the work for us — results in a program with a single (subst new old (rest slist)).
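For comparison, a case-by-case solution might look like the sketch below. This is a hypothetical reconstruction written for this reading, not the Session 10 code itself, and the name subst-by-cases is an assumption. Notice how often the recursive call on (cdr slist) appears.

```racket
#lang racket

;; Hypothetical case-by-case version: every branch builds its own result,
;; so the recursive call on (cdr slist) is written out three times.
(define (subst-by-cases new old slist)
  (cond
    [(null? slist) '()]
    [(symbol? (car slist))
     (if (eq? (car slist) old)
         (cons new (subst-by-cases new old (cdr slist)))
         (cons (car slist) (subst-by-cases new old (cdr slist))))]
    [else
     (cons (subst-by-cases new old (car slist))
           (subst-by-cases new old (cdr slist)))]))
```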

We can do this in Racket because the if construct is an expression that returns a value, not a statement. In many languages, if is a statement and returns no value. Some of those languages, including Java and C++, also provide a conditional ("computed if") expression that lets us do something similar. In Java, a "computed if" is written as

<test> ? <then-value> : <else-value>
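A small sketch shows why this matters. Because if yields a value, it can sit directly in argument position, just as it does inside the cons of the derived subst. The function replace-first is hypothetical and exists only for this illustration.

```racket
#lang racket

;; The inner if is an expression, so its value is passed straight to cons.
;; replace-first is a hypothetical helper: it examines only the first
;; item of a flat list of symbols.
(define (replace-first new old symbols)
  (if (null? symbols)
      '()
      (cons (if (eq? (first symbols) old) new (first symbols))
            (rest symbols))))

(replace-first 'x 'a '(a b a))   ; => '(x b a)
```

With a statement-only if, we would have to assign the chosen value to a temporary variable first, or repeat the call to cons in each branch.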

A Related Concept: Function Inlining

C++ has a concept that is similar in spirit to program derivation: the inlining of member functions. The difference, though, is that it is implemented by the compiler. When we declare a function inline, the compiler tries to replace all calls to the function with equivalent code from the body of the function.

For example, we may well use an accessor method x() frequently when interacting with an object that has an x-coordinate. If we declare the x() method inline, the compiler can replace each call to the method with the equivalent code from the body of the method.

This lets us eliminate the overhead of the extra function calls at run time without sacrificing the readability and design of our class.

Many other languages do not have an inline keyword, but their compilers often inline code aggressively as a way to make programs more efficient. This is especially valuable in languages that depend heavily on function calls, including Java and functional programming languages.

Program derivation works like inlining, but it is a technique used by programmers to modify their code. (I can certainly imagine having a Racket compiler implementing program derivation automatically, thus saving the programmer the effort and risk of error!)
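Racket does offer a way to request inlining explicitly. The sketch below assumes the define-inline form from the racket/performance-hint library. The point representation and the names make-point, point-x, and sum-x are hypothetical, chosen to echo the x() accessor example above.

```racket
#lang racket
(require racket/performance-hint)

;; Hypothetical point representation: a pair holding x and y.
(define (make-point x y) (cons x y))

;; define-inline asks Racket to expand calls to point-x at the call site,
;; much as an inline accessor would be expanded in C++.
(define-inline (point-x p) (car p))

;; The call to point-x in the loop body is a candidate for inlining.
(define (sum-x points)
  (for/sum ([p (in-list points)])
    (point-x p)))

(sum-x (list (make-point 1 2) (make-point 3 4)))   ; => 4
```

This is inlining rather than program derivation: the source still contains the separate accessor, and the expansion happens when the code is compiled.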

Final Note

We will use program derivation occasionally to simplify the result of mutual recursion, or of any other technique that introduces extra function calls, but only when the run-time cost of those calls outweighs the benefits of keeping separate functions.

A Closing Exercise: count-occurrences

In Session 10, we implemented a function named count-occurrences using mutual recursion. Our solution is in this Racket file from Session 10's zip file.

Our implementation of count-occurrences has the same "double dispatch" behavior as subst. In this case, it seems even more of a problem, given how simple the helper function is.

Use program derivation to eliminate the count-occurrences-symbol-expr function from our solution.

Do you like the result?

You can see my solution in this Racket file. (Control-click to download the file to your computer.)
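If you would like to check your work, here is one possible derivation. It is a sketch under an assumption: the Session 10 file is not reproduced in this reading, so the two-function starting point below is a hypothetical reconstruction, as is the argument order (s slist). The name count-occurrences/derived is also an assumption, used so that both versions can be loaded together.

```racket
#lang racket

;; Assumed starting point: mutual recursion that follows the BNF.
(define (count-occurrences s slist)
  (if (null? slist)
      0
      (+ (count-occurrences-symbol-expr s (first slist))
         (count-occurrences s (rest slist)))))

(define (count-occurrences-symbol-expr s symexp)
  (if (symbol? symexp)
      (if (eq? symexp s) 1 0)
      (count-occurrences s symexp)))

;; After derivation: the helper's body replaces the call to the helper,
;; with (first slist) substituted for symexp.
(define (count-occurrences/derived s slist)
  (if (null? slist)
      0
      (+ (if (symbol? (first slist))
             (if (eq? (first slist) s) 1 0)
             (count-occurrences/derived s (first slist)))
         (count-occurrences/derived s (rest slist)))))

(count-occurrences/derived 'a '(a (b a) ((a)) c))   ; => 3
```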

You can practice program derivation on any code we implement using mutual recursion. Give it a try!
