Program Derivation
Increasing Efficiency Through Program Derivation
Our original definition of subst in Session 10 was somewhat confusing, both to read and to write. We then saw that following the BNF and using mutual recursion made the code easier to write and easier to understand. This ease comes, however, at the cost of extra function calls.
How so? Notice that we now make two function calls each time the first of the s-list contains an s-list: one to subst-symbol-expr, and then a return call to subst. Such "double dispatch" can be expensive on a large dataset.
Sometimes, the run-time costs introduced by mutual recursion outweigh the program-time and read-time benefits of the separate functions. Can we modify our definition without losing too many of its benefits?
We can use Racket's substitution model to get back to a single function. Our solution currently looks like this:
    (define subst
      (lambda (new old slist)
        (if (null? slist)
            '()
            (cons (subst-symbol-expr new old (first slist))
                  (subst new old (rest slist))))))

    (define subst-symbol-expr
      (lambda (new old symexp)
        (if (symbol? symexp)
            (if (eq? symexp old) new symexp)
            (subst new old symexp))))
We can substitute the definition of subst-symbol-expr into subst, using the standard rules from the substitution model. This is exactly what the Racket interpreter will do at run-time. First, we substitute the lambda in place of the name:
    (define subst
      (lambda (new old slist)
        (if (null? slist)
            '()
            (cons ((lambda (new old symexp)          ;;
                     (if (symbol? symexp)            ;; Here
                         (if (eq? symexp old)        ;; is
                             new                     ;; the
                             symexp)                 ;; first
                         (subst new old symexp)))    ;; substitution.
                   new old (first slist))
                  (subst new old (rest slist))))))
Next, we replace the application of the lambda with the body of the lambda, substituting the arguments for the corresponding formal parameters: new for new, old for old, and (first slist) for symexp:
    (define subst
      (lambda (new old slist)
        (if (null? slist)
            '()
            (cons (if (symbol? (first slist))           ;;
                      (if (eq? (first slist) old)       ;; Here is
                          new                           ;; the second
                          (first slist))                ;; substitution.
                      (subst new old (first slist)))    ;;
                  (subst new old (rest slist))))))
The result is a single function that behaves exactly like the two original functions. After all, all we did was to derive by hand the same result that the Racket evaluator will produce. So, provided that we made no errors in our derivation, the resulting function has the same functionality. Our unit tests can help us ensure that we haven't broken the code.
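One way to check the derivation is to run both versions side by side on the same inputs. The sketch below assumes rackunit is available and renames the functions to subst-mutual and subst-derived so that both can live in one file; those names and the sample s-lists are illustrative, not part of the session code.

```racket
#lang racket
(require rackunit)

;; Two-function version from the text, renamed so both versions can coexist.
(define subst-mutual
  (lambda (new old slist)
    (if (null? slist)
        '()
        (cons (subst-symbol-expr new old (first slist))
              (subst-mutual new old (rest slist))))))

(define subst-symbol-expr
  (lambda (new old symexp)
    (if (symbol? symexp)
        (if (eq? symexp old) new symexp)
        (subst-mutual new old symexp))))

;; Derived single-function version from the text, also renamed.
(define subst-derived
  (lambda (new old slist)
    (if (null? slist)
        '()
        (cons (if (symbol? (first slist))
                  (if (eq? (first slist) old)
                      new
                      (first slist))
                  (subst-derived new old (first slist)))
              (subst-derived new old (rest slist))))))

;; Hypothetical sample inputs: empty, flat, and nested s-lists.
(define test-slists
  '(() (a) (b) (a b a) ((a) b (c (a))) (() (a ()))))

;; The derived version should agree with the original on every input.
(for ([slist test-slists])
  (check-equal? (subst-derived 'x 'a slist)
                (subst-mutual 'x 'a slist)))

;; One concrete expected value, worked out by hand.
(check-equal? (subst-derived 'x 'a '((a) b (c (a))))
              '((x) b (c (x))))
```

Agreement on a handful of inputs does not prove the two versions are equivalent, but it catches the most likely slips in a hand derivation, such as a parameter that was not renamed.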
However, the new version is more efficient, because it eliminates the extra function calls. We hope that it is nearly as readable as the two-function version.
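To make the savings concrete, here is a rough sketch that counts function entries in each version. The counter, the tick! and count-calls helpers, and the renamed functions are all hypothetical instrumentation added for illustration. The sketch counts calls, not elapsed time, so it says nothing about how a particular compiler might optimize either version.

```racket
#lang racket

;; Hypothetical instrumentation: a global counter bumped on every function entry.
(define calls 0)
(define (tick!) (set! calls (+ calls 1)))

;; Two-function version, with a tick! at the top of each function.
(define subst-mutual
  (lambda (new old slist)
    (tick!)
    (if (null? slist)
        '()
        (cons (subst-symbol-expr new old (first slist))
              (subst-mutual new old (rest slist))))))

(define subst-symbol-expr
  (lambda (new old symexp)
    (tick!)
    (if (symbol? symexp)
        (if (eq? symexp old) new symexp)
        (subst-mutual new old symexp))))

;; Derived version, with a tick! at the top of its only function.
(define subst-derived
  (lambda (new old slist)
    (tick!)
    (if (null? slist)
        '()
        (cons (if (symbol? (first slist))
                  (if (eq? (first slist) old)
                      new
                      (first slist))
                  (subst-derived new old (first slist)))
              (subst-derived new old (rest slist))))))

;; Reset the counter, run the thunk, and return how many calls it made.
(define (count-calls thunk)
  (set! calls 0)
  (thunk)
  calls)

(define sample '(a (b c) d))

(count-calls (lambda () (subst-mutual 'x 'a sample)))   ; 12
(count-calls (lambda () (subst-derived 'x 'a sample)))  ; 7
```

The difference of five is one call to subst-symbol-expr for each of the five elements in the sample (a, (b c), d, b, and c), which is exactly the call that the derivation removed.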
Take a closer look. The derived function is not like the single-function solution we wrote earlier. That function repeated the expression (subst new old (cdr slist)) several times, because we worked through the details of every possible case. Using mutual recursion followed by program derivation, and so letting Racket's substitution model do some of the work for us, results in a program with a single (subst new old (rest slist)).
We can do this in Racket because the if construct is an expression that returns a value, not a statement. In many languages, if is a statement and returns no value. A few, including Java and C++, have a "computed if" expression that may let us do something like this. In Java, a "computed if" is written as

    <test> ? <then-value> : <else-value>
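As a small illustration of why this matters, the sketch below passes the value of an if directly to cons, just as the derived subst does. The function replace-first is a hypothetical helper invented for this example; it is not part of the session code.

```racket
#lang racket

;; Hypothetical helper: replace the first element of a non-empty list
;; of symbols with new, but only if that element is old.
(define replace-first
  (lambda (new old los)
    ;; The if is an expression, so its value becomes cons's first argument.
    (cons (if (eq? (first los) old)
              new
              (first los))
          (rest los))))

(replace-first 'x 'a '(a b c))  ; '(x b c)
(replace-first 'x 'a '(b a c))  ; '(b a c)
```

If if were a statement, we would need a temporary variable or two separate calls to cons, one in each branch.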
C++ has a concept that is similar in spirit to program derivation: the in-lining of member functions. The difference, though, is that it is implemented by the compiler. When we declare a class member function inline, the compiler tries to replace all calls to the function with equivalent code from the body of the function.
For example, we may well use an accessor method x() frequently when interacting with an object that has an x-coordinate. If we declare the method inline, the compiler will try to replace each call to it with the equivalent code from the body of the function.
This enables the programmer to eliminate the overhead of extra function calls at run time, without sacrificing the readability and design of the class.
Program derivation works like inlining, but it is a technique used by programmers to modify their code. (I can certainly imagine having a Racket compiler implementing program derivation automatically, thus saving the programmer the effort and risk of error!)
We will use the program derivation technique occasionally to simplify the result of mutual recursion, or of any other technique that introduces unwanted function calls at run-time, but only when the cost of the extra function calls outweighs the benefits of separate functions.
An Exercise Applying Program Derivation
Use program derivation to eliminate the count-occurrences-symbol-expr function in our solution to the count-occurrence problem. Do you like the result?
You can see my solution in the zip file for Session 10.