Session 20:
Computing Lexical Addresses

Lessons from a Wise Tiger

A four-panel black-and-white 'Calvin and Hobbes' comic strip. Calvin says that the way to handle a seemingly overwhelming problem is to break it into parts.  Hobbes asks if he intends to use Structural Recursion to decompose his problem. Calvin asks, 'Do I even care?'
Full disclosure: This is a riff on the original comic.

We all might have guessed that Calvin would cut corners, but who knew that Hobbes was a fan of structural recursion?

Finishing Up Our Exercise in Lexical Addressing

We are writing a function called (lexical-address exp), in which exp is any expression in this little language:

<exp> ::= <varref>
      | (lambda (<var>*) <exp>)     ; 0 or more parameters
      | (<exp> <exp>*)              ; 0 or more arguments
      | (if <exp> <exp> <exp>)

lexical-address returns an equivalent expression with every variable reference v replaced by a list (v : d p), as described in Session 18.

I imagine that this problem looks overwhelming to many of you, or did when you first started to work on it.

While Calvin's notorious devil-may-care attitude makes for a good laugh, he is right in an important respect: Decomposing this problem into several smaller problems can make it seem less imposing. Knowing how to decompose a problem is always the difficult first step, and that's where Hobbes's wisdom comes in. You have a not-so-secret weapon at your disposal: Structural Recursion over the inductively-defined data type that is our little language!

By the end of last session, many of you had produced some useful code, after working through some of the key ideas needed to write lexical-address:

Spend a few minutes now and try to extend your solution. Some of you are close enough that you could finish!

Then write down the biggest challenge you faced while trying to solve this problem. (If you really want to write down two, please do.)

The Lexical Addressing Function

Let's build a solution now.

1.  Use Structural Recursion over the inductive definition for expressions to build the basic structure for our solution.

The problem says, "Write a function named (lexical-address exp) ...":

(define lexical-address
  (lambda (exp)
    ...))

... where exp is any expression in our language. That is a trigger to use structural recursion, to follow the grammar of the language:

(define lexical-address
  (lambda (exp)
    (cond ((varref? exp) ...)
          ((if?     exp) ...)
          ((app?    exp) ...)
          ((lambda? exp) ...)
          (else (error 'lexical-address "unknown exp ~a" exp) ))))

The next sentence of the problem describes the return value, which gives us a clue for implementing the four cases: "lexical-address returns an equivalent expression with every variable reference v replaced by its lexical address...."

This tells us that, for three of the cases, we will be calling the corresponding constructor (make-app in the app? case, etc.). Variable references are replaced by lexical addresses, so we will have to do something different there.

(define lexical-address
  (lambda (exp)
    (cond ((varref? exp) ...)
          ((if?     exp) (make-if ...))
          ((app?    exp) (make-app ...))
          ((lambda? exp) (make-lambda ...))
          (else (error 'lexical-address "unknown exp ~a" exp) ))))
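The predicates, accessors, and constructors used in this skeleton come from the course's supporting files, which these notes do not show. As a rough sketch only, assuming expressions are represented as quoted lists, they might look something like this. Every definition below is a guess at that interface, not the course's actual code.

```racket
#lang racket

;; Hypothetical syntax procedures: an assumed list-based representation,
;; not the course's supporting code.

;; <varref>
(define (varref? exp) (symbol? exp))

;; (lambda (<var>*) <exp>)
(define (lambda? exp)
  (and (list? exp)
       (= (length exp) 3)
       (eq? (first exp) 'lambda)
       (list? (second exp))
       (andmap symbol? (second exp))))
(define (lambda->params exp) (second exp))
(define (lambda->body exp) (third exp))
(define (make-lambda params body) (list 'lambda params body))

;; (if <exp> <exp> <exp>)
(define (if? exp)
  (and (list? exp)
       (= (length exp) 4)
       (eq? (first exp) 'if)))
(define (if->test exp) (second exp))
(define (if->then exp) (third exp))
(define (if->else exp) (fourth exp))
(define (make-if test consequent alternative)
  (list 'if test consequent alternative))

;; (<exp> <exp>*): any non-empty list that is not headed by a keyword
(define (app? exp)
  (and (pair? exp)
       (list? exp)
       (not (memq (first exp) '(lambda if)))))
(define (app->proc exp) (first exp))
(define (app->args exp) (rest exp))
(define (make-app proc args) (cons proc args))

(if->test '(if a b c))      ; 'a
(make-app 'f '(x y))        ; '(f x y)
(app? '(lambda (x) x))      ; #f
```

Because the cond in our skeleton tests app? before lambda?, this sketch makes app? reject lists headed by lambda or if. A different clause order would allow a looser test.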

2.  Solve the if case.

The if expression is a simple recursive case. An if expression neither declares a new variable nor accesses one, and it has exactly three components. We only have to build an equivalent expression beginning with if, with each of its parts lexically addressed.

(define lexical-address
  (lambda (exp)
    (cond ((varref? exp) ...)
          ((if? exp)
              (make-if (lexical-address (if->test exp))
                       (lexical-address (if->then exp))
                       (lexical-address (if->else exp))))
          ((app?    exp) (make-app ...))
          ((lambda? exp) (make-lambda ...))
          (else (error 'lexical-address "unknown exp ~a" exp) ))))

3.  Solve the application case.

Apps are also a simple recursive case, with a function expression and zero or more expressions as arguments. app->args returns a list of expressions, so we can save ourselves from having to write a helper function by using map:

(define lexical-address
  (lambda (exp)
    (cond ((varref? exp) ...)
          ((if? exp)
              (make-if (lexical-address (if->test exp))
                       (lexical-address (if->then exp))
                       (lexical-address (if->else exp))))
          ((app? exp)
              (make-app (lexical-address (app->proc exp))
                        (map lexical-address (app->args exp))))
          ((lambda? exp) (make-lambda ...))
          (else (error 'lexical-address "unknown exp ~a" exp) ))))

And just like that we are halfway done, sort of. It's the easy half, but still. Half is half.

4.  Solve the varref case.

Now we have to bite the bullet and solve one of the cases that involves variable references and declarations. It mostly doesn't matter which we do first. Either will force us to think more deeply about how we compute the actual addresses.

Today, I'll choose to solve the varref case first, because I sense that computing a variable reference's lexical address will require some work. Why solve it first then? I know that I'll need to do some non-trivial computations, and attacking it first means that:

Let's assume for now that we have a function, lexical-address-for, that computes the lexical address of a single variable reference. It requires some new information in order to do its job: it needs to know all of the variable declarations that we have seen so far, block-by-block, inside out!

Consider this case:

(lambda (a b)
  (lambda (c)
    (lambda (d e)
      b)))

To compute the lexical address for 'b, our function needs the list of variables declared in each block along the way: the two inner blocks, to know that 'b is not declared there, and the outermost, to know that it is. (What if we replace the 'b with an 'f?)

To compute a lexical address for a variable reference, we look at declarations in the current block first, then declarations in the block that contains the current block, then declarations in the block that contains that block, and so on, until we find a matching parameter. So, our function will need to search through a list of variable declarations created by the blocks in most-recent-first order, to find a variable declaration that matches the variable reference it is given. As it goes deeper into the sequence of variable declarations, it must increment a counter that keeps track of the depth of the search.

So, we need to pass two arguments to lexical-address-for: a list of block declarations and an initial depth counter of 0:

(define lexical-address
  (lambda (exp)
    (cond ((varref? exp)
              (lexical-address-for exp list-of-decls 0))
          ((if? exp)
              (make-if (lexical-address (if->test exp))
                       (lexical-address (if->then exp))
                       (lexical-address (if->else exp))))
          ((app? exp)
              (make-app (lexical-address (app->proc exp))
                        (map lexical-address (app->args exp))))
          ((lambda? exp) (make-lambda ...))
          (else (error 'lexical-address "unknown exp ~a" exp) ))))

If lexical-address is going to pass list-of-decls to lexical-address-for at this point, then it needs to know about the declarations, too. It will have to receive list-of-decls as an argument! lexical-address will have to use this information in other parts of the function as well:

It turns out that list-of-decls is a part of every section of lexical-address, even the arms that don't work with variable references directly.

If the function we are writing has to receive list-of-decls as an argument, then it isn't lexical-address after all. lexical-address receives only one argument. The function we are writing is a helper that does the recursive work for lexical-address.

5.  Convert this version of lexical-address into a helper function.

So we add the declaration list as an argument to lexical-address and convert lexical-address into a helper function.

(define lexical-address-helper
  (lambda (exp list-of-decls)
    (cond ((varref? exp)
              (lexical-address-for exp list-of-decls 0))
          ((if? exp)
              (make-if (lexical-address-helper (if->test exp) list-of-decls)
                       (lexical-address-helper (if->then exp) list-of-decls)
                       (lexical-address-helper (if->else exp) list-of-decls)))
          ((app? exp)
              (make-app (lexical-address-helper (app->proc exp) list-of-decls)
                             ;; ***** problem *****
                        (map lexical-address-helper (app->args exp))))
          ((lambda? exp) (make-lambda ...))
          (else (error 'lexical-address "unknown exp ~a" exp) ))))

But now we have to backtrack and fix the app case. We used map to apply lexical-address to each of the arguments in an application. lexical-address-helper now takes a second argument, the list of declarations, so we cannot map it over a list.

Notice, though: We use the same declaration list to process each of the sub-expressions, because an app does not create any new variables of its own. So we can create a one-argument lambda on the fly to curry lexical-address-helper with a "hardwired" declaration list!

(define lexical-address-helper
  (lambda (exp list-of-decls)
    (cond ((varref? exp)
              (lexical-address-for exp list-of-decls 0))
          ((if? exp)
              (make-if (lexical-address-helper (if->test exp) list-of-decls)
                       (lexical-address-helper (if->then exp) list-of-decls)
                       (lexical-address-helper (if->else exp) list-of-decls)))
          ((app? exp)
              (make-app (lexical-address-helper (app->proc exp) list-of-decls)
                        (map (lambda (e)
                               (lexical-address-helper e list-of-decls))
                             (app->args exp))))
          ((lambda? exp) (make-lambda ...))
          (else (error 'lexical-address "unknown exp ~a" exp) ))))

This is a great example of the practical value of currying. It's also a good example of using an anonymous lambda expression. If we want, we can create a stand-alone helper function with a name, but it wouldn't add much to our solution. We need this function only once, in this expression, for the sole purpose of mapping lexical-address-helper. It truly is a temporary function.
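To see the idea in isolation, here is a small sketch with a made-up two-argument function, tag-with, standing in for lexical-address-helper. It compares the anonymous lambda we wrote with Racket's library function curryr, which fixes the rightmost argument of a function. Only curryr is a real Racket name; everything else is invented for the illustration.

```racket
#lang racket

;; Hypothetical two-argument function standing in for lexical-address-helper.
(define (tag-with item label)
  (list item label))

;; The anonymous lambda fixes the second argument, so map sees a
;; one-argument function.
(define with-lambda
  (map (lambda (item) (tag-with item 'seen))
       '(a b c)))

;; curryr (from racket/function, included in #lang racket) fixes the
;; rightmost argument in the same way.
(define with-curryr
  (map (curryr tag-with 'seen)
       '(a b c)))

with-lambda                        ; '((a seen) (b seen) (c seen))
(equal? with-lambda with-curryr)   ; #t
```

Either form gives map the one-argument function it needs. The explicit lambda keeps the fixed argument visible at the call site, which is one reason to prefer it in a solution we are still reading and revising.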

6.  Re-create lexical-address as an interface procedure.

Now we re-create lexical-address as an interface function:

(define lexical-address
  (lambda (exp)
    (lexical-address-helper exp ?????)))

We need to pass an initial declaration list to lexical-address-helper from our interface procedure. An empty list might seem like a reasonable solution at first, because at that moment our program has not yet encountered any variable declarations. But what about the free variables in our expression?

> (lexical-address 'a)
(a : 0 0)

> (lexical-address '(if a b c))
(if (a : 0 0) (b : 0 1) (c : 0 2))

Remember, all of the free variables in an expression must be bound at run-time to some declaration, either to a primitive or to a definition at the top level. Our program can simulate this by treating all free variables as bound to some declaration that exists at the time exp is evaluated. So, (free-vars exp) can serve as the variables that are declared in our initial block. But we need to have a list of declaration lists, so we pass (list (free-vars exp)) to the helper as the initial value.

(define lexical-address
  (lambda (exp)
    (lexical-address-helper exp (list (free-vars exp)))))
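free-vars is another function these notes take as given. Here is a minimal sketch, assuming that expressions are plain quoted lists and that free variables should be listed in order of first appearance, which is what the transcript for (if a b c) suggests. The helper name free-vars-in is made up for this illustration, and the course's own free-vars may differ.

```racket
#lang racket

;; Hypothetical helper: collect the free variable references in exp, in
;; order of appearance, given the symbols bound by enclosing lambdas.
(define (free-vars-in exp bound)
  (match exp
    [(? symbol? v)
     (if (memq v bound) '() (list v))]
    [(list 'lambda (list params ...) body)
     (free-vars-in body (append params bound))]
    [(list 'if test consequent alternative)
     (append (free-vars-in test bound)
             (free-vars-in consequent bound)
             (free-vars-in alternative bound))]
    [(list proc args ...)
     (append-map (lambda (e) (free-vars-in e bound))
                 (cons proc args))]))

;; Assumed interface: each free variable appears once, first occurrence first.
(define (free-vars exp)
  (remove-duplicates (free-vars-in exp '())))

(free-vars '(if a b c))                          ; '(a b c)
(free-vars '(lambda (a b c)
              (if (eq? b c)
                  ((lambda (c) (cons a c)) a)
                  b)))                           ; '(eq? cons)
```

The order of the result matters here, because a free variable's position in this list becomes the position part of its lexical address.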

7.  Finally, solve the lambda case.

lambda expressions create new bindings. We need to:

  1. ... compute the lexical address of all variables in the lambda's body with a recursive call. On this call, though, we must also include the variables declared by the lambda, in addition to the list of all previous declarations.
  2. ... return an equivalent lambda expression.

So:

(define lexical-address-helper
  (lambda (exp list-of-decls)
    (cond ((varref? exp)
              (lexical-address-for exp list-of-decls 0))
          ((if? exp)
              (make-if (lexical-address-helper (if->test exp) list-of-decls)
                       (lexical-address-helper (if->then exp) list-of-decls)
                       (lexical-address-helper (if->else exp) list-of-decls)))
          ((app? exp)
              (make-app (lexical-address-helper (app->proc exp) list-of-decls)
                        (map (lambda (e)
                               (lexical-address-helper e list-of-decls))
                             (app->args exp))))
          ((lambda? exp)
              (make-lambda (lambda->params exp)
                           (lexical-address-helper
                             (lambda->body exp)
                             (cons (lambda->params exp) list-of-decls))))
          (else (error 'lexical-address "unknown exp ~a" exp) ))))

Notice how the make-lambda constructor lets us think about the parts of the solution without worrying about its form. This is the difference between concrete and abstract syntax.
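The lambda clause is the only place where list-of-decls grows. To watch that happen, here is a small tracing sketch, assuming expressions are plain quoted lists. The function decls-at-innermost-body is made up for this illustration and is not part of the solution. It follows nested lambdas inward and returns the declaration list that the innermost body would see.

```racket
#lang racket

;; Hypothetical tracing helper: push each lambda's parameter list onto the
;; front of list-of-decls, exactly as the lambda clause does with cons.
(define (decls-at-innermost-body exp list-of-decls)
  (match exp
    [(list 'lambda (list params ...) body)
     (decls-at-innermost-body body (cons params list-of-decls))]
    [_ list-of-decls]))

;; The nested-lambda example from step 4 has no free variables, so the
;; interface procedure would start with (list '()).
(decls-at-innermost-body '(lambda (a b)
                            (lambda (c)
                              (lambda (d e)
                                b)))
                         (list '()))
;; => '((d e) (c) (a b) ())
```

Because each new parameter list is added with cons, the innermost block always comes first and the initial block of free variables always comes last.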

And that's lexical-address. We aren't quite done yet, though, because we still haven't written the final piece of the puzzle: lexical-address-for!

Computing the Addresses Themselves

As complex as lexical-address and its helper may be, our assumption of a function that computes the lexical address of a particular variable reference made our job above a bit easier, while at the same time helping us to think through some of the subtleties that lie in making our recursive calls.

Notice how far we were able to defer writing this piece of code. Even if this function turns out to be easy to write, it would have been a distraction while we were writing the code that processes expressions. Now that we are happy with lexical-address, we can devote our full attention to this last detail.

We can begin by writing the header:

(define lexical-address-for
  (lambda (var list-of-decls curr-depth)
    ;; FILL IN THE BLANK
  ))

What next? list-of-decls is a list of binding lists, where each binding list is a list of variable declarations (symbols).

<list-of-decls> ::= ()
                  | (<list-of-symbols> . <list-of-decls>)

The lists are in most-recent-first order of the blocks that lexical-address has seen thus far. That is, the first list of identifiers is for the innermost scope, and the last is for the outermost scope (the free variables of the expression).

Reconsider our earlier example:

(lambda (a b)
  (lambda (c)
    (lambda (d e)
      b)))

When lexical-address-for is asked to compute the address for 'b, it will receive this list as its second argument:

( (d e) (c) (a b) () )
Where did the '() come from?

We need to search through these lists in order until we find a match for var. The 0-based number of the binding list in which we find the first occurrence of var is its depth d, and the 0-based position of the variable in that list is its position p.

Let's build the structure for finding the right block. list-of-decls is a list of lists, so we can process it using standard structural recursion:

(define lexical-address-for
  (lambda (var list-of-decls curr-depth)
    (if (null? list-of-decls)
        ;; base case: no declarations
        ;; pair case: look in first, then look in rest
    )))

What happens if list-of-decls is empty? We have an error somewhere!! Every variable that occurs in the expression is either free or bound, and we started our list of declarations with an outermost list that contains all the free variables. Rather than tinker with the basic structure of our programs, let's stick with this base case and signal an error if we ever reach it:

(define lexical-address-for
  (lambda (var list-of-decls curr-depth)
    (if (null? list-of-decls)
        (error 'lexical-address-for
               "variable ~a neither free nor bound" var)
        ;; recursive case
    )))

We could try to write this function on the assumption that we will never reach an empty list. But structural recursion takes us directly here, and thinking about special cases differently is actually more work than writing the structurally-recursive solution in this way! If you prefer, you can make your base case (null? (rest list-of-decls)), though that may result in some duplication you will want to eliminate later.

Okay, so what if list-of-decls is not empty? We need to determine if the var occurs in the first of the list. If it does, then we can compute its lexical address using that list and the current depth. If not, then we compute its address using the remaining lists and the next depth, with a recursive call. Let's do the recursive case first, because it seems simpler:

(define lexical-address-for
  (lambda (var list-of-decls curr-depth)
    (if (null? list-of-decls)
        (error 'lexical-address-for
               "variable ~a neither free nor bound" var)
        (if ;; var occurs in (first list-of-decls)
            ;; compute the address
            (lexical-address-for var (rest list-of-decls)
                                     (add1 curr-depth)) )) ))

Notice that we increment curr-depth to record the fact that we have to look into an outer block for the declaration. The original caller of lexical-address-for needs to pass the initial count, which is 0. That's because each time we start looking for a variable in order to compute its address, we always start with the innermost scope, whose depth is 0.

We are finally at the point of actually computing the lexical address of a var. If var is a member of the first of the list, then its depth is curr-depth, and its position is its index in this block's list of declarations. We already have a function that will find a symbol's 0-based position in a list of symbols, list-index. (Building generic tools often pays off later, when we find opportunities to use them!)

Even better, we can use list-index to determine whether the variable is a member of the list in the first place. If we ask for the list index of a symbol that doesn't occur in the list, list-index returns -1 as a failure code. We can use that code as the test in our if expression to see if var is bound in the current block.

So: we will call list-index to find the index of var in (first list-of-decls). If it returns something other than -1, we will use that as var's position; otherwise, we will make the recursive call:

(define lexical-address-for
  (lambda (var list-of-decls curr-depth)
    (if (null? list-of-decls)
        (error 'lexical-address-for
               "variable ~a neither free nor bound" var)
        (let ( (position (list-index var (first list-of-decls))) )
            (if (> position -1)
              (list var ': curr-depth position)
              (lexical-address-for var (rest list-of-decls)
                                       (add1 curr-depth)))))))

And, finally, we are done. Try some test cases. Or run these automated Rackunit tests.
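list-index comes from earlier course code that these notes do not reproduce. To try the finished function on its own, here is one possible stand-in, assuming the argument order and the -1 failure code described above. The definition of list-index below is a guess, while lexical-address-for is copied from the text.

```racket
#lang racket

;; Hypothetical stand-in for the course's list-index: the 0-based position
;; of target in los, or -1 if target does not occur.
(define (list-index target los)
  (let loop ((remaining los) (i 0))
    (cond ((null? remaining) -1)
          ((eq? target (first remaining)) i)
          (else (loop (rest remaining) (add1 i))))))

;; lexical-address-for, as developed above.
(define lexical-address-for
  (lambda (var list-of-decls curr-depth)
    (if (null? list-of-decls)
        (error 'lexical-address-for
               "variable ~a neither free nor bound" var)
        (let ( (position (list-index var (first list-of-decls))) )
          (if (> position -1)
              (list var ': curr-depth position)
              (lexical-address-for var (rest list-of-decls)
                                       (add1 curr-depth)))))))

;; b is declared two blocks out, in second position.
(lexical-address-for 'b '((d e) (c) (a b) ()) 0)    ; '(b : 2 1)

;; If the body referred to f instead, f would be free, so it would sit in
;; the outermost list.
(lexical-address-for 'f '((d e) (c) (a b) (f)) 0)   ; '(f : 3 0)
```

Calling the function with a variable that appears in none of the lists reaches the base case and signals the error.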

Debriefing the Solution

The solution seems almost anticlimactic. This is a complex problem. However, we have all the tools we need to decompose the complex problem into a number of simpler problems, the solutions to which we can reassemble to build an answer to the complex problem. The keys are:

  1. not to paralyze ourselves with the fear of a "hard problem", and
  2. to use the techniques we have learned to guide the process, from decomposition to solution.

Notice that our solution uses:

It also uses map and a curried lexical-address-helper function as part of a functional solution to the app case.

Solving such a hard problem will help you to build confidence in your skills and just plain feels good!

Full Implementation

Here is a complete implementation of the function lexical-address, with references to the required helper files. Study it; run it; modify it to do other things! The zip file for today contains this file and all of its supporting code.

Does it work? Let's see:

> (lexical-address 'a)
(a : 0 0)

> (lexical-address '(if a b c))
(if (a : 0 0) (b : 0 1) (c : 0 2))

> (lexical-address '(lambda (a b c)
                      (if (eq? b c)
                          ((lambda (c) (cons a c)) a)
                          b)) )
(lambda (a b c)
    (if ((eq? : 1 0) (b : 0 1) (c : 0 2))
        ((lambda (c) ((cons : 2 1) (a : 1 0) (c : 0 0))) (a : 0 0))
        (b : 0 1)))

Quick Exercise: No Variables

Earlier we saw that with lexical addresses we can remove variable references from a program.

Modify our lexical-address solution to eliminate the names in variable references.

Hint: This change is really small.

We also made the bolder claim that we can eliminate even the parameter names themselves from a program.

Modify our lexical-address solution to eliminate the names in variable declarations.

Hint: This one is almost as small.

Removing Variable References and Declarations

For the first task, we can look to the only location in the code that produces a lexical address for a variable reference: lexical-address-for. Simply take the variable name out of the result:

(if (> position -1)
    (list ': curr-depth position)
    ...)

For the second task, we can look to the only location in the code that creates a new variable: the lambda clause in lexical-address-helper's cond. Instead of reproducing the lambda expression's list of parameters, we can substitute the length of that list:

((lambda? exp)
   (make-lambda (length (lambda->params exp))
                (lexical-address-helper
                  (lambda->body exp)
                  (cons (lambda->params exp) list-of-decls))))

Is there any danger in eliminating the variable declarations while we are in the process of lexically addressing the program? No!

Before lexical-address-helper returns the new lambda expression, it will first lexically address the lambda's body. And that is the only place where that lambda expression's parameters are meaningful! The region of those declarations is the body of the lambda and nowhere else.

Does it work? Let's see:

> (lexical-address '(lambda (x)
                      (lambda (y)
                        ((lambda (x)
                           (x y))
                         x))))
(lambda 1
  (lambda 1
    ((lambda 1
       ((: 0 0) (: 1 0)))
     (: 1 0))))

> (lexical-address '(lambda (x y)
                      ((lambda (a)
                         (x (a y)))
                       x)))
(lambda 2
  ((lambda 1
     ((: 1 0) ((: 0 0) (: 1 1))))
   (: 0 0)))

> (lexical-address '(lambda (f)
                      ((lambda (h)
                         (lambda (n)
                           ((f (h h)) n)))
                       (lambda (h)
                         (lambda (n)
                           ((f (h h)) n))))))
(lambda 1
  ((lambda 1 (lambda 1 (((: 2 0) ((: 1 0) (: 1 0))) (: 0 0))))
   (lambda 1 (lambda 1 (((: 2 0) ((: 1 0) (: 1 0))) (: 0 0))))))

> (lexical-address '(lambda (a b c)
                      (if (eq? b c)
                          ((lambda (c) (cons a c)) a)
                          b)) )
(lambda 3
  (if ((: 1 0) (: 0 1) (: 0 2))
      ((lambda 1 ((: 2 1) (: 1 0) (: 0 0))) (: 0 0))
      (: 0 1)))

Looks good!
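For readers who want to reproduce these transcripts without the course's supporting files, here is a compact sketch of the nameless version. It is not the course's implementation: it pattern-matches directly on quoted lists with Racket's match instead of using the abstract-syntax procedures, it uses index-of from racket/list in place of list-index, and it asks the caller to supply the free variables. The names nameless, nameless-helper, and address-of are invented for this illustration.

```racket
#lang racket

;; Hypothetical counterpart of lexical-address-for that drops the name.
;; index-of returns #f when var is absent, so cond's => passes on a hit.
(define (address-of var list-of-decls depth)
  (cond [(null? list-of-decls)
         (error 'address-of "variable ~a neither free nor bound" var)]
        [(index-of (first list-of-decls) var)
         => (lambda (pos) (list ': depth pos))]
        [else (address-of var (rest list-of-decls) (add1 depth))]))

;; Hypothetical counterpart of lexical-address-helper on concrete syntax.
(define (nameless-helper exp list-of-decls)
  (match exp
    [(? symbol? v) (address-of v list-of-decls 0)]
    [(list 'lambda (list params ...) body)
     (list 'lambda
           (length params)
           (nameless-helper body (cons params list-of-decls)))]
    [(list 'if test consequent alternative)
     (list 'if
           (nameless-helper test list-of-decls)
           (nameless-helper consequent list-of-decls)
           (nameless-helper alternative list-of-decls))]
    [(list proc args ...)
     (map (lambda (e) (nameless-helper e list-of-decls))
          (cons proc args))]))

;; Assumption: the caller lists the free variables (default: none).
(define (nameless exp [free-variables '()])
  (nameless-helper exp (list free-variables)))

(nameless '(lambda (x)
             (lambda (y)
               ((lambda (x)
                  (x y))
                x))))
;; => '(lambda 1 (lambda 1 ((lambda 1 ((: 0 0) (: 1 0))) (: 1 0))))
```

Passing '(eq? cons) as the second argument handles the last transcript above, whose expression refers to two free variables.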

Lexical Address Redux

The exercise of writing lexical-address serves us in two ways. One, it lets us exercise our recursive programming skills on a topic in programming languages. Practice is good, and practice that begins to expand our range of skills is even better.

Two, it helps us to understand that programming languages topic — variable references and scope — in a way that a definition alone usually cannot. To write the program, we have to understand how a new block is created and how each variable reference relates back to the blocks in which it resides. We also have to understand what scope means for each kind of expression, and how the kinds of expression relate to one another.

Watching a small program compute lexical addresses and remove variable declarations without loss of information brings home the point that motivated this idea in the first place: Variable names are syntactic sugar! It also reinforces the idea of static analysis. A program really can do this.

As we discussed last time, lexical addressing is quite similar to what a compiler must do when translating a source program into assembly or machine language. Compiled code is much faster because it can directly access the value of a referenced variable. (Imagine if a compiled program had to look up the value for every reference à la lexical-address-for!) Lexical addresses really are addresses that allow the compiler to compute and hardcode the address of a variable in a piece of code.

Interpreters sometimes do something like this, too. In any programming language, but especially functional languages, a large percentage of total execution time of a program is spent looking up values for variables in some data structure (usually called the environment). For every variable reference and assignment, the interpreter has to compute the location of the value in memory. Anything an interpreter can do to speed up this process will have a huge effect on run time.
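As a rough illustration of why addresses help, suppose an interpreter kept its environment as a list of frames, innermost first, with each frame holding argument values in parameter order. That representation is an assumption made for this sketch, not something defined in these notes, and lookup-by-address is a made-up name. With a (depth position) address in hand, a lookup needs no name comparisons at all:

```racket
#lang racket

;; Hypothetical environment: a list of frames, innermost first, where each
;; frame lists argument values in parameter order.
(define (lookup-by-address env depth position)
  (list-ref (list-ref env depth) position))

;; Frames that might exist while evaluating the inner body of
;;   ((lambda (a b) ((lambda (c) ...) 30)) 10 20)
(define env '((30) (10 20)))

(lookup-by-address env 0 0)   ; value of c: 30
(lookup-by-address env 1 1)   ; value of b: 20
```

A real implementation would likely use vectors or stack offsets rather than list-ref, but the point is the same: the search that lexical-address-for performs happens once, before the program runs.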

Coming Soon

When we needed to curry lexical-address-helper so that we could map it over an application's arguments, we had to write:

(map (lambda (exp)
       (lexical-address-helper exp list-of-decls))
     exps)

This is a place where Racket's verbosity really gets in the way of reading our code. Wouldn't it be nice if we could use a shorthand notation for writing anonymous functions? Perhaps something like this would do:

(map (lexical-address-helper _ list-of-decls)
     exps)

Some languages, including Racket, enable us to create our own syntactic abstractions and add them to the language. In an upcoming session, we'll learn how to do this in Racket.

Wrap Up