Session 19:
Computing Lexical Addresses
Introduction
Last time, we learned about the idea of a lexical address, which identifies the variable declaration to which a variable reference refers. A lexical address gives the depth of the reference from the block in which the variable was declared and the position of the variable's declaration in that block.
For example, consider this exercise from last session:
(lambda (x)              ;; block 1
  (lambda (y)            ;; block 2
    ((lambda (x)         ;; block 3
       (x y))            ;; line 4
     x)))                ;; line 5
There are three blocks in this code and three variable references. All of the blocks declare one variable each, so the positions of all the lexical addresses will be 0. The variable references x and y on Line 4 are in Block 3. The x refers to the parameter in the same block, so its address is (x : 0 0). The y on the same line refers to the y declared in Block 2. Its depth is 1, so the address is (y : 1 0). Finally, the x on Line 5 is in Block 2, outside of Block 3. So it refers to the parameter on Block 1, and its address is (x : 1 0).
Even though we often label blocks from the outside in, when we compute lexical addresses, we look at the blocks from the inside out, from the perspective of the variable reference itself.
A Warm-Up Exercise
(lambda (f)
  ((lambda (h)
     (lambda (n)
       ((f (h h)) n)))
   (lambda (h)
     (lambda (n)
       ((f (h h)) n)))))
The Solution — and More
This expression has five lambda expressions, so it has five blocks. The outermost block contains two blocks, each of which contains another. The variable references n, h, and f are to variables declared in three different blocks, so their depths will be 0, 1, and 2, respectively. Each reference is to the first and only variable declared in that block, so its position is 0. The two (lambda (h) ...) expressions are identical, which means that the lexical addresses of the two sets of variables are identical. So:
(lambda (f)
  ((lambda (h)
     (lambda (n)
       (((f : 2 0) ((h : 1 0) (h : 1 0))) (n : 0 0))))
   (lambda (h)
     (lambda (n)
       (((f : 2 0) ((h : 1 0) (h : 1 0))) (n : 0 0))))))
What happens if we delete the f's, h's, and the n's from the lexical addresses? What happens if we replace each of the parameter lists with the number 1?
As we saw at the end of Session 18, once we have computed lexical addresses for all variable references, we no longer really need the variable references themselves. What's more, we don't even really need the variable declarations, either! They are sugar.
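To make the two questions above concrete, here is a minimal sketch of a function that drops the names from an already-addressed expression. The names strip-names and address? are hypothetical and not part of the course code; the sketch assumes every variable reference has already been replaced by a list of the form (v : d p).

```racket
#lang racket

;; Hypothetical helpers, not part of the course code.
;; An addressed variable reference has the form (v : d p).
(define (address? exp)
  (and (list? exp)
       (= (length exp) 4)
       (symbol? (first exp))
       (eq? (second exp) ':)
       (exact-nonnegative-integer? (third exp))
       (exact-nonnegative-integer? (fourth exp))))

;; Remove variable names from an addressed expression:
;;   (v : d p)          becomes (: d p)
;;   a parameter list   becomes the number of parameters
(define (strip-names exp)
  (cond
    [(address? exp) (rest exp)]
    [(and (pair? exp) (eq? (first exp) 'lambda))
     (list 'lambda
           (length (second exp))
           (strip-names (third exp)))]
    [(pair? exp) (map strip-names exp)]   ; an application or an if
    [else exp]))                          ; the keyword if

;; The opening example, with its addresses already computed:
(strip-names
 '(lambda (x)
    (lambda (y)
      ((lambda (x)
         ((x : 0 0) (y : 1 0)))
       (x : 1 0)))))
```

Each parameter list in this example declares one variable, so each becomes the number 1, and each address keeps only its depth and position.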
To remind yourselves of that, try Exercise 3 from last session's closing exercise.
(lambda 3                           ;; Problem 3
  ((: 0 1)
   ((lambda 2
      ((: 0 0) (: 1 0) (: 0 1)))
    (: 0 2))))
Can we reuse any of the variable names from the outer block when we declare the inner block? Which ones?
Not every expression that looks like a lexical address expression can be "reverse engineered" in this way. Consider Exercise 2 from last session's close:
(lambda (x)              ;; Problem 2
  (lambda (x)
    (: 1 0)))
Why isn't this legal?
Fortunately, any time we start with a legal expression and compute lexical addresses for its variables, the result is a legal lexical address expression that can be reversed — losing only the actual names used.
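Reversing an address amounts to a lookup: once names have been chosen for each block, a depth selects a block and a position selects a name within it. The following is a minimal sketch under an assumed representation, in which the enclosing blocks are a list of parameter lists with the innermost block first. The name address->name and the parameter names chosen for Problem 3 are hypothetical.

```racket
#lang racket

;; Assumed representation: the enclosing blocks as a list of parameter
;; lists, innermost block first. The function name is hypothetical.
(define (address->name depth position blocks)
  (list-ref (list-ref blocks depth) position))

;; Inside the inner lambda of Problem 3, after choosing the names
;; (d e) for the inner block and (a b c) for the outer block:
(define inner-blocks '((d e) (a b c)))

(address->name 0 0 inner-blocks)   ; the symbol d
(address->name 1 0 inner-blocks)   ; the symbol a
(address->name 0 1 inner-blocks)   ; the symbol e
```

With fresh names in every block, the name found this way always refers back to the intended declaration. Whether that still holds when a name is reused in an inner block is exactly the question that Problem 2 raises.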
You may have noticed that the expression in our opening exercise is a combinator: a function definition with no free variables. Indeed, it is the Y combinator, famous in programming language theory for its role in the proof that it is possible to create recursive local functions that do not have a name. But how can a function call itself if it doesn't have a name? If you would like to find out, check out this very optional reading — LINK COMING SOON — and the associated code. Warning: this one may make your head hurt for a few minutes. But it's pretty cool.
Now, onward.
Where Are We?
For the last couple of weeks, we have been devoting our attention to the idea of a syntactic abstraction, a language feature that is not strictly necessary because we could conceivably do without it. We can do without such "syntactic sugar" because we have an equivalent way to express the same idea using other language features. For the same reason, the interpreter doesn't really need to understand the feature in order to determine the meaning of our program.
At this point, we have seen that a number of common language features are in fact syntactic abstractions:
- procedures that take more than one argument
- local variables
- local functions, even recursive ones
- logical connectives
- conditional expressions
- variable references
- parameter names (sort of)
This list is longer than it was the last time I gave it. In a reading assignment, you examined the idea of local recursive functions. The syntax is identical to that of a let expression, but its semantics are a bit different. Now we can make sense of the superfast factorial function I showed you back in Session 2!
Like other local variables, local recursive functions are a syntactic abstraction. Unlike other local variables, they require something more complex than a simple rewrite to an application of a lambda expression.
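As a reminder of what the "simple rewrite" looks like, here is a small sketch in full Racket rather than in our little language. The definitions with-let, with-lambda, and fact-of-5 are made up for illustration.

```racket
#lang racket

;; A local variable declared with let ...
(define with-let
  (let ([x 3])
    (* x x)))

;; ... and the same computation written as the application of a
;; lambda expression to the variable's value.
(define with-lambda
  ((lambda (x) (* x x)) 3))

;; A local recursive function needs letrec. The same rewrite does not
;; work here: if the lambda for fact were passed as an argument to
;; (lambda (fact) (fact 5)), the reference to fact inside its own body
;; would lie outside the block that declares fact.
(define fact-of-5
  (letrec ([fact (lambda (n)
                   (if (zero? n)
                       1
                       (* n (fact (- n 1)))))])
    (fact 5)))
```

The first two definitions produce the same value, which is the sense in which let is sugar. The third shows why local recursive functions call for a more involved translation.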
Last session, we saw that variable names are not strictly necessary: they are really syntactic sugar. I supported this claim by showing how a piece of code without explicit variable references can convey the same information as one that uses variable names.
Today, we will take a deeper journey into the idea of lexical addressing, which we first explored last time, by writing a program that does lexical addressing. The program will make the idea clearer by putting it into a concrete program that you can run and modify. Writing the program will give you another opportunity to create a processor for a little language using Structural Recursion. Using Structural Recursion, this problem is tricky but manageable; without it, this problem might seem impossible.
An Exercise in Lexical Addressing
Here is the BNF description of a new version of our little language, the small Racket-like language we've been using:
<exp> ::= <varref>
        | (lambda (<var>*) <exp>)     ; 0 or more parameters
        | (<exp> <exp>*)              ; 0 or more arguments
        | (if <exp> <exp> <exp>)
Notice that this version of the language allows functions with zero or more parameters, and so applications with zero or more arguments. It also has a standard if-then-else expression.
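One way to read the grammar is as a recognizer with one case per alternative. The sketch below is a hypothetical exp? predicate written for illustration; it is not one of the course's syntax procedures, which may be organized differently. It assumes that the keywords lambda and if may not be used as variable names.

```racket
#lang racket

;; Hypothetical recognizer for the grammar above, one clause per
;; alternative. Assumes lambda and if are reserved words.
(define (exp? x)
  (cond
    [(symbol? x)                                ; <varref>
     (not (memq x '(lambda if)))]
    [(not (and (pair? x) (list? x))) #f]        ; must be a non-empty list
    [(eq? (first x) 'lambda)                    ; (lambda (<var>*) <exp>)
     (and (= (length x) 3)
          (list? (second x))
          (andmap symbol? (second x))
          (exp? (third x)))]
    [(eq? (first x) 'if)                        ; (if <exp> <exp> <exp>)
     (and (= (length x) 4)
          (andmap exp? (rest x)))]
    [else (andmap exp? x)]))                    ; (<exp> <exp>*)

(exp? '(lambda () (f)))        ; zero parameters, zero arguments
(exp? '(if a b))               ; an if needs exactly three parts
```

The structure of the cond mirrors the structure of the BNF, which is the same correspondence that Structural Recursion relies on.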
... demo language extensions and new syntax procs
Your task is to write a function (lexical-address exp), where exp is any expression in our language. lexical-address returns an equivalent expression with every variable reference v replaced by its lexical address, in the form of a list (v : d p), as described last session.
> (lexical-address '(lambda (f)
                      ((lambda (h)
                         (lambda (n)
                           ((f (h h)) n)))
                       (lambda (h)
                         (lambda (n)
                           ((f (h h)) n))))))
(lambda (f)
  ((lambda (h)
     (lambda (n)
       (((f : 2 0) ((h : 1 0) (h : 1 0))) (n : 0 0))))
   (lambda (h)
     (lambda (n)
       (((f : 2 0) ((h : 1 0) (h : 1 0))) (n : 0 0))))))
Let's allow our expressions to contain free variables. In order to do that, we have to make an assumption similar to the one we make when we use a Racket interpreter: the free variables are bound at the "top level". We can imagine that the expression to process is contained within a lambda expression that binds references to any variables that occur free in the expression. This will account for system primitives such as eq? and cons.
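The imagined top-level block can be written out explicitly. This is a minimal sketch with a hypothetical helper, wrap-top-level, which takes the free variables as a hand-written list. It does not call the course's free-vars, and the order of the list is an assumption.

```racket
#lang racket

;; Hypothetical helper: make the imagined top-level block explicit by
;; wrapping exp in a lambda that declares the given free variables.
(define (wrap-top-level free exp)
  (list 'lambda free exp))

;; The free variables a, b, and c become the parameters of an
;; enclosing block.
(wrap-top-level '(a b c) '(if a b c))

;; System primitives are treated the same way.
(wrap-top-level '(eq? cons)
                '(lambda (a b c)
                   (if (eq? b c)
                       ((lambda (c) (cons a c)) a)
                       b)))
```

Read against the wrapped form, every reference in the original expression has a declaration somewhere in an enclosing block, so every reference has a depth and a position.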
For example:
> (lexical-address 'a)
(a : 0 0)

> (lexical-address '(if a b c))
(if (a : 0 0) (b : 0 1) (c : 0 2))

> (lexical-address '(lambda (a b c)
                      (if (eq? b c)
                          ((lambda (c)
                             (cons a c))
                           a)
                          b)))
(lambda (a b c)
  (if ((eq? : 1 0) (b : 0 1) (c : 0 2))
      ((lambda (c)
         ((cons : 2 1) (a : 1 0) (c : 0 0)))
       (a : 0 0))
      (b : 0 1)))
Code to Use in Your Code
Here is some code to use in building your solution.
- These syntax procedures allow us to work in terms of the language's abstract syntax. We first implemented a set of such procedures in Session 13. I have extended them to deal with if expressions and with the generalized lambda and app expressions.
- At some point, you will want to use the function list-index, which we used as an example in Session 18.
- At some point, you will also want to use the function free-vars. This function, a version of which you wrote for Homework 7, computes the set of free variable references that appear in an expression.
Today's .zip file contains all three of these items, along with two helper files used by syntax-procs (utility functions) and free-vars (a set ADT). You don't need these files to write lexical-address. Feel free to review their interfaces now and their implementations later.
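If you do not have the Session 18 code at hand, the following sketch shows the general shape of a function that finds the position of an item in a list. It is a hypothetical stand-in named position-of, with an assumed interface: it takes the target first and returns #f when the target is absent. The course's list-index may take its arguments in a different order or signal a missing item differently, so check its interface before relying on this.

```racket
#lang racket

;; Hypothetical stand-in with an assumed interface: returns the 0-based
;; position of the first occurrence of target in lst, or #f if target
;; does not occur in lst.
(define (position-of target lst)
  (cond
    [(null? lst) #f]
    [(eq? target (first lst)) 0]
    [else
     (let ([rest-position (position-of target (rest lst))])
       (and rest-position (+ 1 rest-position)))]))

(position-of 'c '(a b c))
(position-of 'z '(a b c))
```

The function follows the structure of the list: one case for the empty list, and one for a pair in which either the first item matches or the answer comes from the rest of the list.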
Helpful Ideas on the Way to a Solution
... time passes as students work and think and work. And then we reach the end of our time.
Now that you've worked on this for a while, you have begun to discover some of the secrets to building a solution, among them several things you already know:
- Hint 1: how to design the function's structure
- Hint 2: how to assemble the answer returned by the function
- Hint 3: how to break the problem down into manageable pieces
- Hint 4: how to wait to compute lexical addresses
Soon, if not yet, you may hit on these ideas:
- Hint 5: how to give your function the information it needs to create addresses
- Hint 6: how to keep track of variable declarations
- Hint 7: how to update the list of variable declarations
And eventually you will come to two final issues:
- Hint 8: where the original list of variable declarations comes from
- Hint 9: Why did he give us free-vars again?
With these ideas in hand, you can build this function! Give it a serious try before our next session.
Wrap Up
- Reading
  Looking ahead to Quiz 3, this might be a good time to catch up on your reading since Quiz 2. The reading assignments in that time have included:
  - a short section about let expressions as a syntactic abstraction
  - a short section about local functions as a syntactic abstraction
  - a short section about boolean operators and conditional expressions as syntactic abstractions
  - a short section about local recursive functions as a syntactic abstraction
  - two optional readings, 2.4 Variables and Let Expressions and 2.5 Lambda Expressions, in Dybvig's The Scheme Programming Language
  - the mini-lecture on syntax procedures might be worth reviewing, as you are now writing your own syntax procedures
- Homework
  - Homework 8 is available and will be due on Monday.
- Quiz
  - Quiz 3 is one week from today, on Thursday, April 4.