Session 17

Recursive Local Procedures

CS 3540
Programming Languages and Paradigms

In Our Previous Episode...

... we dug deeper into the idea that a let expression is a syntactic abstraction of a lambda application and wrote code to translate the former into the latter. In its simplest form, this is a matter of deconstructing the let expression and creating an app expression:

    (define let->app
      (lambda (let-exp)
        (let ((var  (let->var  let-exp))
              (val  (let->val  let-exp))
              (body (let->body let-exp)))
          (make-app (make-lambda var body)   ; using the constructors
                    val))))                  ; from Homework 7

Notice: we translate one expression in the language into another expression in the language. The new expression has exactly the same behavior as the original but uses only core features of the language.

To process expressions from the entire language, though, we need to write a preprocessor that can (1) translate sugar into its underlying form and (2) keep all core expressions in their original form. Most expressions in the language contain other expressions, so the preprocess function is recursive.
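Here is a minimal sketch of what such a recursive preprocessor might look like. It is not the Homework 7 solution. The expression shapes, the helper tagged?, and the list-based constructors are all assumptions made for illustration; the course's actual definitions may differ.

```racket
#lang racket
;; Sketch only. Assumed syntax for the little language:
;;   <exp> ::= <number> | <symbol>
;;           | (lambda (<var>) <exp>) | (<exp> <exp>)
;;           | (if <exp> <exp> <exp>)
;;           | (let (<var> <exp>) <exp>)        ; sugar

;; Hypothetical constructors, standing in for those from Homework 7.
(define (make-lambda var body) (list 'lambda (list var) body))
(define (make-app fn arg)      (list fn arg))
(define (make-if test consequent alternative)
  (list 'if test consequent alternative))

;; Hypothetical helper: is exp a list of the given length with the given tag?
(define (tagged? exp tag len)
  (and (list? exp) (= (length exp) len) (eq? (first exp) tag)))

(define (preprocess exp)
  (cond ((not (pair? exp)) exp)                 ; numbers and symbols are core
        ((tagged? exp 'let 3)                   ; sugar: translate, and recur
         (let ((var  (first  (second exp)))
               (val  (second (second exp)))
               (body (third exp)))
           (make-app (make-lambda var (preprocess body))
                     (preprocess val))))
        ((tagged? exp 'lambda 3)                ; core: keep the form, recur
         (make-lambda (first (second exp))
                      (preprocess (third exp))))
        ((tagged? exp 'if 4)
         (make-if (preprocess (second exp))
                  (preprocess (third exp))
                  (preprocess (fourth exp))))
        (else                                   ; application: (fn arg)
         (make-app (preprocess (first exp))
                   (preprocess (second exp))))))
```

Under these assumptions, (preprocess '(let (x 5) (f x))) produces the list ((lambda (x) (f x)) 5). Core expressions come back with the same form, but any sugar nested inside them is translated.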

Preprocessing on Homework 7

Problem 4 on Homework 7 asks you to extend the preprocessor for our little language to allow and and or expressions as syntactic sugar. You saw translations for the boolean connectives in the reading assignment from last session.

Your job here is the same as above: translate one expression in the language into another expression in the language that behaves the same as the original but uses only core features of the language. That means using constructors to produce language expressions, not Racket code that looks like it. Racket code will be evaluated as Racket; the lists you produce will be returned as values.

This doesn't mean that you can't use Racket to help you do your job. Consider the above example: I use a Racket let expression to create local variables that I use to assemble the expression that I return. That allows me to create a clean, simple app expression in the language using a clean, simple make-lambda. You may do the same. Just be sure that the value returned is an expression created by a constructor.

Issue #1: Boolean Syntax.   What are boolean values in the little language? This is a matter of what expressions look like. We care about this because the translation for and expressions includes a literal false:

    (if test_1
        (and test_2 ... test_n)             ;; looking for a reason
        false)                              ;; to return false...

As the little language is currently defined, we can't use #t and #f because they aren't legal expressions. We have at least two options: define specific forms for true and false literals, or designate the symbols true and false as keywords. Can you think of others?

For simplicity's sake, let's take the second option and treat true and false as keywords. But now we need to change our varref? type predicate, because true and false can't be used as variables anymore!
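To make the keyword idea concrete, here is a hedged sketch of a varref? that excludes keywords, along with one possible translation of and. The names keywords, make-if, and and->if are illustrative, not the course's, and the base cases for zero and one test follow the usual Scheme convention rather than anything stated above.

```racket
#lang racket
;; Assumption: true and false are reserved symbols in the little language.
(define keywords '(true false))

;; A variable reference is any symbol that is not a keyword.
(define (varref? exp)
  (and (symbol? exp) (not (memq exp keywords))))

;; Hypothetical constructor for if expressions in the little language.
(define (make-if test consequent alternative)
  (list 'if test consequent alternative))

;; Assumed translation:
;;   (and)            => true
;;   (and e)          => e
;;   (and e1 e2 ...)  => (if e1 <translation of (and e2 ...)> false)
(define (and->if exp)
  (let ((tests (rest exp)))
    (cond ((null? tests) 'true)
          ((null? (rest tests)) (first tests))
          (else (make-if (first tests)
                         (and->if (cons 'and (rest tests)))
                         'false)))))
```

With these definitions, (and->if '(and a b c)) returns the list (if a (if b c false) false). The result is a value built by a constructor, not Racket code that gets evaluated.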

Issue #2: Boolean Semantics.   What count as true and false values in the language? This is a matter of what expressions mean.

In Java, only the boolean values true and false can be used for truth and falsehood. In Python, though, some non-Boolean values count as true and false. We say that Python is "truthy" and "falsy". In Racket, "any value other than #f counts as true." It is a truthy language.

What counts as true and false in our little language matters because of or. In truthy languages, like Racket, an or often returns the first non-false value. In order to avoid recomputing the value (which could be dangerous in a language with side effects!), we use a local variable to hold the value.

    (let ((*value* test_1))
       (if *value*                        ;; looking for a reason
           *value*                        ;; to return true...
           (or test_2 ... test_n)))

This creates two more issues for us:

gensym creates yet another problem, though this one is smaller: How can I test my preprocessor code when I don't know which symbol it will generate for the local variable when translating an or expression? All of my tests fail:

    name:       check-equal?
    location:   homework07-tests.rkt:6:0
    actual:     '((lambda (g3378) (if g3378 g3378 b)) a)
    expected:   '((lambda (value) (if value value b)) a)

... but they really are correct. For now, I will eyeball them. This reminds us of one of functional programming's advantages: a pure function is easier to test, because we always know what its value should be!
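One possible alternative to eyeballing is to test the shape of the result rather than its exact symbols. The sketch below is only an illustration: or->app, or-shape?, and the constructors are hypothetical names, and the (or) base case is an assumption.

```racket
#lang racket
;; Hypothetical constructors; Homework 7's versions may differ.
(define (make-lambda var body) (list 'lambda (list var) body))
(define (make-app fn arg)      (list fn arg))
(define (make-if test consequent alternative)
  (list 'if test consequent alternative))

;; Assumed translation, with the let already rewritten as a lambda app:
;;   (or)            => false
;;   (or e)          => e
;;   (or e1 e2 ...)  => ((lambda (g) (if g g <rest>)) e1), where g is fresh
(define (or->app exp)
  (let ((tests (rest exp)))
    (cond ((null? tests) 'false)
          ((null? (rest tests)) (first tests))
          (else
           (let ((tmp (gensym)))
             (make-app
              (make-lambda tmp
                           (make-if tmp
                                    tmp
                                    (or->app (cons 'or (rest tests)))))
              (first tests)))))))

;; Shape check: the result must look like
;;   ((lambda (v) (if v v <alt>)) <arg>)
;; for some symbol v, whatever name gensym happened to choose.
(define (or-shape? result arg alt)
  (match result
    ((list (list 'lambda (list v) (list 'if v1 v2 result-alt)) result-arg)
     (and (symbol? v) (eq? v v1) (eq? v v2)
          (equal? result-alt alt)
          (equal? result-arg arg)))
    (_ #f)))
```

A test such as (or-shape? (or->app '(or a b)) 'a 'b) returns #t no matter which symbol is generated, so it does not depend on the impure behavior of gensym.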

Quick Exercise

What is the value of this dandy let expression?

    (let ((x 3) (y 4))
      (+ (let ((z 5)
               (x (* 3 y)))
           (+ (* x y) z))
         (let ((x 6)
               (w (- x y)))
           (+ (* x w) y))))

The challenge may seem familiar and new all at once. In the opening exercise last time, we saw a case where a variable did not mean what we thought it might because the region of a local variable is the body of the let. That does not include the variables' values. This time, though, that problem is turned inside out. We have a variable in an expression that cannot see the outer declaration, because another declaration hides it!
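A smaller example, separate from the exercise, may help keep the two ideas apart. In the first expression below, the binding for y sees the outer x, while the body sees the inner x. The second uses let* for contrast, where each binding can see the ones before it. The names outer-visible and sequential exist only for this illustration.

```racket
#lang racket
(define outer-visible
  (let ((x 1))
    (let ((x 10)
          (y (+ x 1)))     ; this x is the outer x, so y is 2
      (list x y))))        ; this x is the inner x

(define sequential
  (let ((x 1))
    (let* ((x 10)
           (y (+ x 1)))    ; with let*, this x is the inner x, so y is 11
      (list x y))))
```

Here outer-visible is the list (10 2) and sequential is the list (10 11).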

Keep this in mind as we tackle a seemingly unrelated problem...

Local Functions

At the end of last session, we saw that we can use a let expression to create a local function in Racket, because function names are like any other variable bindings:

    (define invert
      (lambda (list-of-2-lists)
        (let ((swap (lambda (lst)
                      (list (second lst) (first lst)) )))
          (map swap list-of-2-lists)) ))

Hurray! Racket gives us something unexpected for free, which follows from the design decision to make naming functions work like naming any other value. We have already taken advantage of this many times, when we pass a function name or value as an argument to another function.

As we've learned to write recursive programs, we have often found ourselves creating helper functions as a part of our programs. From now on, we will be able to create simple local functions whenever they help us.

Consider the case of (list-index target los), a function I sometimes have students write as a homework problem. It computes the 0-based first occurrence of target in los:

    > (list-index 'd '(a b c d e f d))
    3
    > (list-index 'z '(a b c d e f d))
    -1

QUICK EXERCISE: write the function now.

To solve this problem, we need a third argument: the position of the current symbol in the list. So we make list-index an interface procedure and make our structurally-recursive function a helper.

We end up with functions that look something like this:

    (define list-index
      (lambda (target los)
        (list-index-with-count target los 0)))

    (define list-index-with-count
      (lambda (target los count)
        (if (null? los)
            -1
            (if (eq? target (first los))
                count
                (list-index-with-count target
                                       (rest los)
                                       (+ count 1))))))

list-index-with-count exists only to serve list-index. No other function or programmer is likely ever to need it. That sounds like a perfect time to use a local function! So...

A Wrinkle in Code

... we define list-index-with-count as a local variable:

    (define list-index
      (lambda (target los)
        (let ((list-index-with-count
                (lambda (target los base)
                  (if (null? los)
                      -1
                      (if (eq? target (first los))
                          base
                          (list-index-with-count
                            target (rest los) (+ base 1)))))))
          (list-index-with-count target los 0))))

Trouble ensues... DrRacket doesn't even give us a chance to execute the function. It reports an error when it compiles the file, before we can load it successfully:

    list-index-with-count: unbound identifier in module in: list-index-with-count

What went wrong?

When I ask you a question, you usually know enough to figure out the answer. This time, the answer is the same one we found in our opening exercise last time, when y and z could not be initialized as we expected:

The region of the name list-index-with-count is the body of the let expression.

The first call to list-index-with-count, within the body of the let expression, is fine. But list-index-with-count is recursive and calls itself from within its own body. The body of the let does not include the definition of list-index-with-count itself! The recursive call within its body uses an undefined name.

This is perfectly clear if we translate the let expression into a semantically-equivalent lambda app:

    (define list-index
      (lambda (target los)
        ( (lambda (list-index-with-count)
            (list-index-with-count target los 0))
          (lambda (target los count)
            (cond ((null? los) -1)
                  ((eq? target (first los)) count)
                  (else (list-index-with-count target (rest los)
                                                      (+ count 1))))) )))

This code declares list-index-with-count as a formal parameter of a function. We pass the helper's definition as an argument, using a nameless lambda. But the body of that nameless lambda calls what used to be named list-index-with-count! When we call the nameless function that is created, list-index-with-count is a free variable.

Can we use a nested let expression to solve this problem? Something like:

    (let ((list-index-with-count ...))
      (let ((list-index-with-count ...))

After today's quick exercise, you know that this won't help us. The new local variable shadows the outer one.

We are stymied. let can't support the idea of a recursive function. Why? Because it is merely a syntactic abstraction of a lambda application. The arguments passed to the lambda are evaluated before they are passed to the function and only then bound to their names.
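We can watch this happen in a version that does compile. In the sketch below, a top-level function happens to share the helper's name, so the "recursive" call inside the let binding quietly refers to the top-level function instead of to the helper itself. The names countdown and result are made up for this illustration.

```racket
#lang racket
;; A top-level function that happens to share the local helper's name.
(define (countdown n) 'outer)

(define result
  (let ((countdown (lambda (n)
                     (if (zero? n)
                         'inner-done
                         (countdown (- n 1))))))  ; the top-level countdown!
    (list (countdown 0) (countdown 3))))
```

The value of result is the list (inner-done outer). The call (countdown 3) enters the local function once, and then the call in its body goes to the top-level countdown, because the lambda was evaluated before the local name was bound.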

To iron out this wrinkle, we need something more powerful than let.

Ironing Out the Wrinkle

In another style of programming, this might not be a big deal. We could try to work around the limitation. But in functional programming, we create functions all the time. We also recurse over tree and list structures. We want to be able to do so as flexibly as possible. For this reason, Racket provides another special form, named letrec, that supports local recursive definitions.

The syntax of letrec is identical to that of let:

    <letrec-expression> ::= (letrec <binding-list> <body>)

         <binding-list> ::= ()
                          | ( <binding> . <binding-list> )

              <binding> ::= (<var> <exp>)

The semantics of the letrec expression differ from the semantics of the let expression in an important way. In a letrec, the region associated with each <var> is the entire letrec expression: the body and all of the binding expressions, including the variable's own.
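A small sketch shows this region at work. Each of the two local functions below refers to the other from inside its own binding expression, which a plain let would not allow. The names parity, even-n?, and odd-n? are made up for the example, and the code assumes that n is a non-negative integer.

```racket
#lang racket
(define (parity n)
  (letrec ((even-n? (lambda (k)
                      (if (zero? k) #t (odd-n? (- k 1)))))   ; sees odd-n?
           (odd-n?  (lambda (k)
                      (if (zero? k) #f (even-n? (- k 1)))))) ; sees even-n?
    (if (even-n? n) 'even 'odd)))
```

For example, (parity 10) returns even and (parity 7) returns odd.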

With letrec, we can define list-index using a local procedure:

    (define list-index
      (lambda (target los)
        (letrec ((list-index-with-count
                   (lambda (target los base)
                     (cond ((null? los) -1)
                           ((eq? target (first los)) base)
                           (else (list-index-with-count target
                                                        (rest los)
                                                        (+ base 1)) )))))
          (list-index-with-count target los 0)) ))

... and satisfaction fills the room:

    > (list-index 'd '(a b c d e f d))
    3

Actually, we can now simplify list-index-with-count a bit. Because the value of target remains the same throughout the body of list-index and the body of list-index-with-count, we don't really need to pass target to list-index-with-count:

    (define list-index
      (lambda (target los)              ;; once target is passed in ...
        (letrec ((list-index-counted
                  (lambda (los base)
                    (cond ((null? los) -1)                 ;; ... its value
                          ((eq? target (first los)) base)  ;; never changes
                          (else (list-index-counted (rest los)
                                                    (+ base 1)) )))))
          (list-index-counted los 0))))

This makes for a more cohesive procedure, at the small expense of tracking target up to the declaration in list-index.

Exercise for home: What happens if we remove los from list-index-counted's parameter list, the way we did target?

Local function definitions are quite useful in cases that require an interface procedure. The helper function is the real function, while the interface procedure exists only to send the initial value for some argument.
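The same pattern fits other problems that need an extra argument to get started. As one more hedged illustration, here is a hypothetical function positions, which is not part of the course code. It returns every 0-based position at which target occurs, using a local helper named scan to carry the current index.

```racket
#lang racket
;; Hypothetical example: all 0-based positions of target in los.
(define positions
  (lambda (target los)
    (letrec ((scan (lambda (los index)
                     (cond ((null? los) '())
                           ((eq? target (first los))
                            (cons index (scan (rest los) (+ index 1))))
                           (else (scan (rest los) (+ index 1)))))))
      (scan los 0))))              ; the interface supplies the initial index
```

With this definition, (positions 'd '(a b c d e f d)) returns the list (3 6), and (positions 'z '(a b c d e f d)) returns the empty list. As in the simplified list-index, target is not passed to the helper, because its value never changes.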

Local Recursive Functions

As you are learning Racket, this type of construction may be hard to read for a while. Simple local variables tend to be defined with simple values, but local recursive procedures tend to be a bit longer and more complex. On the other hand, they are a clean, compact way to implement many solutions that require interface procedures to kick off a computation.

Perhaps you can help yourself understand the ideas in Racket by referring to another language you know:

Local function bindings offer some significant advantages over helpers declared at the top level:

letrec as Syntactic Abstraction

It is not often the case that Racket provides a new keyword or a new piece of syntax to solve a problem. letrec is an exception, but a well-motivated one. Recursive programming is a fundamental technique in functional programming, so it is important that Racket make writing recursive procedures as easy and straightforward as possible.

But we don't need a new piece of syntax. Like the let expression, letrec is a syntactic abstraction. We can implement the equivalent of a letrec expression in "vanilla" Racket, using a feature of the language we have not studied yet and a bit of a hack. Can you imagine how?

The Context

We are discussing the idea of syntactic abstractions, those features of a language that are convenient to have but that are not essential to the language. In the last few weeks, we have learned that a number of standard language features are really syntactic abstractions of more primitive features, including:

Now we return to the idea of a variable's scope and see how scope works in a block-structured language. This discussion is a prelude to something more radical: the idea that variable names themselves are not necessary. They are syntactic abstractions!

Scope and Lexical Analysis -- READING

As we saw in Session 14, the region of a variable declaration is the part of a program where that variable declaration is seen. The term "region" is often used synonymously with "scope". If the scope of a variable can be determined at compile time (that is, statically), then we say that the language is statically or lexically scoped.

Is there a difference between the region of a variable and the scope of a variable? Yes. We will study the distinction in some detail later. For now, we can consider the region of our Racket identifiers as identical to their scope. These identifiers are, of course, parameter names and local variable declarations.

If regions can be nested inside of each other, then we say that a language is block structured. These regions are called blocks. If a variable in one block is not visible because it has been re-declared within a nested block, we say that the "inner" variable creates a hole in the scope of the "outer" variable. We also say that the inner variable declaration shadows the outer one.

We saw an example of this in today's opening exercise. The nested let expressions created holes in the region of the x declared by the main expression. Each inner x shadows the outer one.

Consider the following Java-like snippet:

     {                    /* Block 1 */
        int x = 4,
            y = 0;
        {                 /* Block 2 */
           int x = 3,
               z = x + 1;
           System.out.println( x + " " + z );
        }
        y = x + 1;
        System.out.println( x + " " + y );
     }

What is the value of x printed in Block 1? In Block 2? What is the value of y printed in Block 1? Of z in Block 2? These results happen because the declaration of x in Block 2 shadows the declaration of x in Block 1.

Sidebar.   This is not legal Java, because Java does not allow a variable in a nested block to shadow a variable in the outer scope! We can simulate the same idea with a method-local variable that shadows an instance or class variable. See this simple test class. (In some contexts, we can declare a static block that lives within another block.) With different output statements, this code snippet would be a legal C program.

Can we write a Racket program with a similar structure? Certainly. We can use let to create blocks, though we will need let* to initialize our inner block properly:

     (let ((x 4)            ;; Block 1
           (y 0))
       (let* ((x 3)         ;; Block 2
              (z (+ x 1)))
         (printf "~a ~a\n" x z))
       (set! y (+ x 1))
       (printf "~a ~a\n" x y))

Quick Question: Could we use a letrec instead to solve the problem of nesting lets for variable bindings?

In both the Java example and the Racket example, we can begin to see how the variable names themselves are unnecessary. Can we change the names? If so, what remains constant in how the variable bindings are determined? Each variable reference can be determined by what block the variable is declared in, along with the position of the declaration within the block.

This is the idea of lexical addressing. We will pick up our exploration here next time.
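As a preview, here is a rough sketch of how such an address might be computed. It rests on assumptions of my own: the enclosing blocks are given as a list of declaration lists with the innermost block first, and an address is a list of the form (depth position). The function lexical-address is hypothetical, and the definition we adopt next time may differ.

```racket
#lang racket
;; Hypothetical helper. scopes lists the declarations of each enclosing
;; block, innermost block first. Returns (depth position) for the nearest
;; declaration of var, or #f if var is not declared in any block.
(define (lexical-address var scopes)
  (letrec ((search
            (lambda (scopes depth)
              (if (null? scopes)
                  #f
                  (let ((pos (index-of (first scopes) var)))
                    (if pos
                        (list depth pos)
                        (search (rest scopes) (+ depth 1))))))))
    (search scopes 0)))
```

Inside Block 2 of the example above, the enclosing declarations are ((x z) (x y)). There, x has address (0 0), z has address (0 1), and y has address (1 1). Renaming the variables consistently would change none of these addresses.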

Wrap Up

Eugene Wallingford ..... ..... March 6, 2018