The Visitor Design Pattern
Balancing Forces
The second of the readings for Session 14 focused on representing abstract syntax trees in a program. Whichever way we went, we were left with an undesirable situation:
- Use a record and be faced with the same multi-way switch statement in every function that processes the tree — and interminable up- and down-casting, if we use a statically-typed language.
- Use an object and be faced with modifying every tree node class every time we want to implement a behavior we did not anticipate up front — or leave the objects' data open to the public, thus coupling every behavior class tightly with the underlying AST representations.
Can we achieve the best of both worlds? To do so, we need a way to add behavior for manipulating ASTs without having to modify the AST classes themselves. But we also need a way to expose type information dynamically without casting. Let's consider how we can implement abstract syntax trees in an object-oriented language in a way that balances these competing forces, using the Visitor pattern.
Using Visitors to Implement Abstract Syntax in an OO Language
The Visitor pattern is a design construct that balances the forces between working with objects and adding new behaviors. It builds on a more general pattern, the type-revealing message, that is commonly found in dynamically-typed languages. By revealing type information through messages, the programmer does not have to use typecasts, and the object does not have to make its instance variables public to all objects.
For more on type-revealing messages and similar techniques, check out Kent Beck's Smalltalk Best Practice Patterns. It is an awesome book that will teach you a lot about programming in any style or language. An early draft of the book is available as a PDF, and the book itself can be bought through Amazon.
In code...
- Each object in the system (for us, nodes in the AST) must implement an interface that specifies not only the objects' actual behaviors but also the ability to accept a visitor.
- Each visitor object must implement an interface that specifies behaviors for visiting each object (again, for us, nodes in the AST).
To add a new behavior to the system, we need only create a new kind of visitor, with one method for each type of node. This is very similar to writing a new procedure with an arm for each type of node.
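As a rough illustration of the two interfaces and of adding a behavior after the fact, here is a minimal Java sketch. It assumes only two node types, and the names NumberExp and OperatorCounter are hypothetical, chosen for this example rather than taken from the course code.

```java
// Hypothetical sketch: NumberExp and OperatorCounter are illustrative names.

// Every visitor must say what it does for each kind of node.
interface AstVisitor {
    void visit(NumberExp e);
    void visit(AdditionExp e);
}

// Every node must be able to accept a visitor.
interface Expression {
    void accept(AstVisitor v);
}

final class NumberExp implements Expression {
    final int value;
    NumberExp(int value) { this.value = value; }
    public void accept(AstVisitor v) { v.visit(this); }
}

final class AdditionExp implements Expression {
    final Expression left, right;
    AdditionExp(Expression left, Expression right) {
        this.left = left;
        this.right = right;
    }
    public void accept(AstVisitor v) { v.visit(this); }
}

// A new behavior, written without touching the node classes:
// count the operator nodes in a tree.
final class OperatorCounter implements AstVisitor {
    private int count = 0;
    int count() { return count; }
    public void visit(NumberExp e) { /* a leaf contains no operators */ }
    public void visit(AdditionExp e) {
        count++;
        e.left.accept(this);
        e.right.accept(this);
    }
}

class CounterDemo {
    public static void main(String[] args) {
        Expression e = new AdditionExp(new NumberExp(1),
                new AdditionExp(new NumberExp(2), new NumberExp(3)));
        OperatorCounter counter = new OperatorCounter();
        e.accept(counter);
        System.out.println(counter.count());   // prints 2
    }
}
```

Each visit method plays the role of one arm of the procedural switch statement, but the choice of arm is made by the node itself when it receives accept().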
The primary strength of this approach is that it is relatively easy to add behaviors to the system without having to edit the tree node classes themselves. Note, too, another big advantage of this approach: Visitor lets us do type-driven programming with objects, with all of the type discrimination done at run time by message dispatch rather than by casts.
As with any design construct, there are also costs. We must write a lot of little methods in the visitors. If we add a new type of thing to our system, then we must edit (and perhaps re-compile) all of the visitor classes! In essence, this technique trades OO benefits and costs for procedural benefits and costs, in an OOP setting.
You can find a longer description of the Visitor pattern on Wikipedia.
Bonus Code: Implementing Abstract Syntax Using Visitors
What might it look like for our little expression grammar? Consider this implementation:
- First, we have a simple set of AST classes defined by the interface Expression. Each kind of expression knows how to convert itself to an infix string, and accepts an AstVisitor.
- The visitor classes are specified by the interface AstVisitor. Each kind of visitor knows how to visit each kind of Expression.
- A test program applies a postfix printer and an evaluator, both implemented as visitors, to a simple expression.
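The three pieces just described might look roughly like the following Java sketch. This is not the code from the course zip file. It is a guess at the general shape, and the names NumberExp, toInfix, PostfixPrinter, Evaluator, and VisitorDemo are assumptions, as is the choice to have visitors accumulate their results in private fields.

```java
// Hypothetical sketch, not the course's code. NumberExp, toInfix,
// PostfixPrinter, Evaluator, and VisitorDemo are assumed names.
interface AstVisitor {
    void visit(NumberExp e);
    void visit(AdditionExp e);
    void visit(MultiplicationExp e);
}

interface Expression {
    String toInfix();            // behavior the node implements itself
    void accept(AstVisitor v);   // hook for behaviors added later
}

final class NumberExp implements Expression {
    final int value;
    NumberExp(int value) { this.value = value; }
    public String toInfix() { return Integer.toString(value); }
    public void accept(AstVisitor v) { v.visit(this); }
}

final class AdditionExp implements Expression {
    final Expression left, right;
    AdditionExp(Expression l, Expression r) { left = l; right = r; }
    public String toInfix() { return "(" + left.toInfix() + " + " + right.toInfix() + ")"; }
    public void accept(AstVisitor v) { v.visit(this); }
}

final class MultiplicationExp implements Expression {
    final Expression left, right;
    MultiplicationExp(Expression l, Expression r) { left = l; right = r; }
    public String toInfix() { return "(" + left.toInfix() + " * " + right.toInfix() + ")"; }
    public void accept(AstVisitor v) { v.visit(this); }
}

// Visitor 1: builds the postfix form of the expression.
final class PostfixPrinter implements AstVisitor {
    private final StringBuilder out = new StringBuilder();
    String result() { return out.toString().trim(); }
    public void visit(NumberExp e) { out.append(e.value).append(' '); }
    public void visit(AdditionExp e) {
        e.left.accept(this);
        e.right.accept(this);
        out.append("+ ");
    }
    public void visit(MultiplicationExp e) {
        e.left.accept(this);
        e.right.accept(this);
        out.append("* ");
    }
}

// Visitor 2: computes the value, holding the latest result in a field.
final class Evaluator implements AstVisitor {
    private int value;
    int result() { return value; }
    public void visit(NumberExp e) { value = e.value; }
    public void visit(AdditionExp e) {
        e.left.accept(this);
        int leftValue = value;
        e.right.accept(this);
        value = leftValue + value;
    }
    public void visit(MultiplicationExp e) {
        e.left.accept(this);
        int leftValue = value;
        e.right.accept(this);
        value = leftValue * value;
    }
}

class VisitorDemo {
    public static void main(String[] args) {
        Expression e = new AdditionExp(new NumberExp(2),
                new MultiplicationExp(new NumberExp(3), new NumberExp(4)));
        PostfixPrinter printer = new PostfixPrinter();
        e.accept(printer);
        Evaluator evaluator = new Evaluator();
        e.accept(evaluator);
        System.out.println(e.toInfix());          // (2 + (3 * 4))
        System.out.println(printer.result());     // 2 3 4 * +
        System.out.println(evaluator.result());   // 14
    }
}
```

In this sketch the operand fields are package-private so that visitors in the same package can reach the sub-expressions. Accessor methods would work just as well and expose less.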
Notice the interplay between accept() messages that are sent to Expressions and visit() messages that are sent to AstVisitors. Perhaps now you understand why Visitor is an example of what is called double dispatch.
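To make the two dispatches visible, here is a small Java sketch that records the order in which visit() messages arrive. The names NumberExp and TracingVisitor are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: NumberExp and TracingVisitor are illustrative names.
interface AstVisitor {
    void visit(NumberExp e);
    void visit(AdditionExp e);
}

interface Expression {
    void accept(AstVisitor v);
}

final class NumberExp implements Expression {
    final int value;
    NumberExp(int value) { this.value = value; }
    // Dispatch 1 brought us here, based on the node's run-time class.
    // Dispatch 2 is the call below: `this` is statically a NumberExp,
    // so the compiler picks visit(NumberExp), and the visitor's
    // run-time class decides which implementation of it runs.
    public void accept(AstVisitor v) { v.visit(this); }
}

final class AdditionExp implements Expression {
    final Expression left, right;
    AdditionExp(Expression left, Expression right) {
        this.left = left;
        this.right = right;
    }
    public void accept(AstVisitor v) { v.visit(this); }
}

final class TracingVisitor implements AstVisitor {
    final List<String> log = new ArrayList<>();
    public void visit(NumberExp e) { log.add("visit(NumberExp " + e.value + ")"); }
    public void visit(AdditionExp e) {
        log.add("visit(AdditionExp)");
        // The operands are only known to be Expressions here, so calling
        // visit(e.left) directly would not compile: no overload takes an
        // Expression. Sending accept() lets each operand reveal its type.
        e.left.accept(this);
        e.right.accept(this);
    }
}

class TraceDemo {
    public static void main(String[] args) {
        Expression e = new AdditionExp(new NumberExp(1), new NumberExp(2));
        TracingVisitor tracer = new TracingVisitor();
        e.accept(tracer);
        for (String line : tracer.log) {
            System.out.println(line);
        }
    }
}
```

Java chooses among overloaded methods at compile time from the static type of the argument, which is why the visitor cannot simply call visit() on an operand of type Expression and must go through accept() instead.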
Notice, too, that the composite objects AdditionExp and MultiplicationExp delegate to their operands, which are themselves Expressions. In a functional or procedural solution, the sub-expressions would be processed by a recursive call. What you are seeing is the object-oriented version of recursion: a composite delegating to simpler components, which ultimately reach the "base case" in the form of concrete components. This has been documented as a design pattern in its own right, called Object Recursion. (Be sure to check out the acknowledgements in that paper! :-)
You will find all of this code in the zip file for Session 14.