Session 15

Syntactic Abstraction: Local Variables

CS 3540
Programming Languages and Paradigms

For Want of a Local Variable...

Last time, I used a new Racket function, assoc, in my code. It returns the first pair in a list that starts with a given symbol:

    > (assoc 'c '((a . 11) (b . 24) (c . 3)))
    '(c . 3)

If it doesn't find such a pair, it returns false:

    > (assoc 'd '((a . 11) (b . 24) (c . 3)))
    #f

This double feature makes assoc handy for implementing the lookup function we needed in our cipher language evaluator:

    (define lookup
      (lambda (var env)
        (if (assoc var env)
            (cdr (assoc var env))
            (error 'lookup "invalid variable reference ~a" var))))

This code is slick, but we have to call assoc twice whenever we find the var we are looking for. Can't we do better? Sure: we can use a local variable.

... but we have not used local variables in Racket.

This has caused some of you extra friction as you've learned to write Racket functions. That was intentional. We were learning a new style of programming, and local variables could lure us off the path of learning. See, one of the main reasons we use local variables in other languages is to sequence a computation: first do this, then do this, ..., with local variables holding partial results along the way. But that's not the primary way we think in functional programming, which uses functions to do as much work as possible.

So we ignored local variables for a while so that you would have a chance to practice the new style without so much temptation to fall back into procedural thinking. (You could be tempted, but we didn't have the tool...) I was careful to select as many problems as possible whose solutions did not need local variables, so that we could practice with as few distractions as possible.

Writing lookup reminds us that we use local variables for other reasons, too. We use them not only to sequence a computation but also to give a name to a value for readability. In lookup, I might want to call assoc once and give its value a name.

I want a local variable:

    match = (assoc var env)

so that I can say

    (if match
        (cdr match)
        (error 'lookup "invalid variable reference ~a" var))

The functional programmer in me knows that I can already do this, using a function:

    (define get-value
      (lambda (match var)    ; have to pass var for error case
        (if match
            (cdr match)
            (error 'lookup "invalid variable reference ~a" var))))

That creates a name and uses the named object twice. If I call it:

    (get-value (assoc var env) var)

I assign a value to match and get a result. Voilà!

I can even do without creating a stand-alone function, if my lambda-fu is strong:

    ((lambda (match)      ; var is available here!
       (if match
           (cdr match)
           (error 'lookup "invalid variable reference ~a" var)))
     (assoc var env))
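
To see the immediately-applied lambda at work, here is a minimal sketch that wraps it in a complete function and runs it on a small environment. The name lookup-via-lambda and the sample environment are made up for this illustration; they are not part of the cipher language evaluator.

```racket
#lang racket

;; Hypothetical name: lookup written with an immediately-applied lambda.
(define lookup-via-lambda
  (lambda (var env)
    ((lambda (match)                 ; match names the result of assoc
       (if match
           (cdr match)
           (error 'lookup "invalid variable reference ~a" var)))
     (assoc var env))))              ; assoc is called exactly once

;; Sample environment, invented for this example.
(define sample-env '((a . 11) (b . 24) (c . 3)))

(lookup-via-lambda 'c sample-env)    ; => 3
```

The argument expression (assoc var env) is evaluated once, and its value is then available under the name match throughout the body of the lambda.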

That's great and all, but why do I have to think about lambda here? Wouldn't it be nice if I could simply make a local variable?

It would be, and the designers of Racket give us the ability to do so. But we just saw something very important: Racket doesn't need much new machinery under the hood to make this happen. The language already supports everything we need to get the job done.

Syntactic Abstraction

We are all familiar with programming language features that are not strictly necessary to make the language complete. A good example is the for statement in Java. for is not strictly necessary, because you can always replace a for loop with a while loop that does the same thing. for is nice, though, because it brings all of the control elements of the loop together into one place.

Or consider the simple assignment statement:

    x = y + z

Programmers often find themselves using this statement to update the value of a variable:

    x = x + 5

... which gives rise to the convenient shorthand we see in Python:

    x += 5

In languages like C++ and Java, programmers write lots of for-loops and find themselves incrementing lots of counters:

    x = x + 1

... which gives rise to the even shorter shorthand:

    x++

We don't need these extra constructs, but they are handy.

The formal name for such language features is syntactic abstraction, though many people call them syntactic sugar. They make programming easier by abstracting away the details of a common construction into a simpler or more direct statement. They are convenient but not necessary. They make the language sweeter for humans.

As programmers, we often feel as if syntactic abstractions are essential to our task of writing code easily. Indeed, much of my research in artificial intelligence and object-oriented programming was built on the foundation of creating and using very high-level programming languages for developing intelligent systems. These languages aren't necessary. People could always have written their programs in Python, Ada, Java, Racket, Ruby, Smalltalk, Lisp, or C++. But the new languages made it possible to write programs in terms of domain knowledge and problem-solving strategies, rather than in terms of for statements or car and cdr expressions.

But from the programming language perspective, syntactic sugar is not essential and can complicate the process of interpreting a language unnecessarily. Part of our study of the design of programming languages is to identify which features are essential so that we can understand how interpreters work, and how an interpreter can pre-process sugar.

We have already learned that one feature we probably thought was essential -- functions that take more than one argument -- is really syntactic sugar. The idea underlying this abstraction was called "currying". You have also read about and used Racket's multi-way conditional expression, cond, and learned that it is a syntactic abstraction of the more basic if primitive. (Or vice versa!)
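
As a reminder of what those two abstractions look like when the sugar is removed, here is a small sketch. The function names (add, curried-add, sign-with-cond, sign-with-if) are invented for this example.

```racket
#lang racket

;; A two-argument function ...
(define add
  (lambda (x y)
    (+ x y)))

;; ... and a curried equivalent built only from one-argument functions.
(define curried-add
  (lambda (x)
    (lambda (y)
      (+ x y))))

(add 3 4)            ; => 7
((curried-add 3) 4)  ; => 7

;; A multi-way cond expression ...
(define sign-with-cond
  (lambda (n)
    (cond ((< n 0) 'negative)
          ((= n 0) 'zero)
          (else    'positive))))

;; ... and the same logic written with nested if expressions.
(define sign-with-if
  (lambda (n)
    (if (< n 0)
        'negative
        (if (= n 0)
            'zero
            'positive))))

(sign-with-cond -5)  ; => 'negative
(sign-with-if -5)    ; => 'negative
```

In each pair, the second version uses only the more basic feature and computes the same result as the first.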

Over the next few sessions, we will consider a number of other common programming language features and investigate whether they are essential or are "just sugar".

Local Bindings and lambda

Up to this point in the course, we have used only three kinds of identifiers in our programs: the names of primitive functions, the names of functions and other data values that we defined at the top level, and the names of formal parameters on functions. Of these, only the formal parameters behave like the "local variables" that we are accustomed to using in other programming languages.

These names are local to the function in which they are declared. They have not yet been variable, because we have not had a way to assign new values to a name yet. But we haven't needed to. Any time we have needed to "change the value" of a variable, we have passed the new value as an argument in a recursive call.

This idea also explains how we have managed to write programs for six weeks without creating any local variables. Any time we needed a local variable, like position in the function positions-of, we created a new function with a new parameter:

    (define positions-of-helper
      (lambda (s los position)
        ...))

and passed the value of the variable when we called the function:

    (define positions-of
      (lambda (s los)
        (positions-of-helper s los 0)))

A function satisfied our needs.

The lambda special form provides a binding mechanism by which names are created and values are associated with names. Last week, we considered how we can determine statically whether a variable reference is bound to the value of a formal parameter in a program. Let's now move on to consider identifiers in more detail and how they get their values.

Calling functions isn't the only way that people think of names and values. There are times when writing a mathematical expression out in full detail is ungainly. The detail obscures the nature of the relationship being expressed. In mathematics, we have a common solution. Instead of writing:

f(x,y) = x^3 y^2 + 2x^2 y + x + xy - xy^2 - y^2 + 1

... we often write a partially factored expression:

f(x,y) = x(1+xy)^2 + y(1-y) + (1+xy)(1-y)

The repeated expressions themselves may represent important abstractions in the problem. From this formula, we might go one step farther and write:

f(x,y) = xa^2 + yb + ab,
where a = 1 + xy and b = 1 - y.

The formulas for a and b are examples of an important abstraction mechanism called local variables.

The variables a and b behave in the same way we expect local variables to behave in a computer program. Would it be proper to substitute one value for y in the definitions of a and b, and then another value for y in the definition of f? Clearly not; such a substitution would violate all of the notational conventions with which we are familiar. All readers understand that the variables defined in the second part of the statement and the variable references in the first part refer to the same values.

So, even in mathematics, where functions are a fundamental tool, we find that local variables are a useful tool. As programmers, this doesn't surprise us.

Local Bindings and let

Unsurprisingly, Racket has a way of creating expressions that use local variables: the special form let. Here is a let expression that creates a local variable named x, assigns it the value 3, and uses it to compute another value:

    (let ((x 3)) 
      (+ x (* x 10)))

Using let, we can write the function f from above as:

    (define f
      (lambda (x y)
        (let ((a (+ 1 (* x y)))
              (b (- 1 y)))
          (+ (* x (sqr a)) (* y b) (* a b)))))   ; sqr is Racket's built-in squaring function
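
One way to convince ourselves that the let version computes the same function is to compare it with the fully expanded polynomial at a sample point. This is only a sketch: it uses sqr from #lang racket for squaring (on the assumption that square is a helper defined elsewhere in the course), and f-expanded is a name made up for the comparison.

```racket
#lang racket

;; The let-based definition, with a and b as local variables.
(define f
  (lambda (x y)
    (let ((a (+ 1 (* x y)))
          (b (- 1 y)))
      (+ (* x (sqr a)) (* y b) (* a b)))))

;; Hypothetical name: the fully expanded polynomial, for comparison.
;; x^3 y^2 + 2x^2 y + x + xy - xy^2 - y^2 + 1
(define f-expanded
  (lambda (x y)
    (+ (* x x x y y) (* 2 x x y) x (* x y)
       (- (* x y y)) (- (* y y)) 1)))

(f 2 3)           ; => 78
(f-expanded 2 3)  ; => 78
```

With x = 2 and y = 3, we have a = 7 and b = -2, so the let body computes 98 - 6 - 14 = 78, which matches the expanded form.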

You will notice that this resembles what we do in math: "Let a be ... and b be ... in ...."

More importantly, we now have the tool we need to implement the lookup function in the way we'd like:

    (define lookup
      (lambda (var env)
        (let ((match (assoc var env)))
          (if match
              (cdr match)
              (error 'lookup "invalid variable reference ~a" var)))))
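
Here is a short sketch of this version of lookup in use, covering both the success case and the error case. The sample environment is invented for the example, and with-handlers is used only so that the error message can be shown as a value rather than halting the program.

```racket
#lang racket

(define lookup
  (lambda (var env)
    (let ((match (assoc var env)))   ; assoc is called once, result named match
      (if match
          (cdr match)
          (error 'lookup "invalid variable reference ~a" var)))))

;; Sample environment, invented for this example.
(define sample-env '((a . 11) (b . 24) (c . 3)))

(lookup 'b sample-env)               ; => 24

;; Catch the error raised for a missing variable and return its message.
(with-handlers ((exn:fail? (lambda (e) (exn-message e))))
  (lookup 'd sample-env))
; => "lookup: invalid variable reference d"
```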

Of course, let uses prefix notation, like the rest of Racket, and the placement of the parentheses is -- as always! -- important. So let's have a closer look.

The general form of a Racket let expression is:

    <let-expression> ::= (let <binding-list> <body>)

      <binding-list> ::= ()
                       | ( <binding> . <binding-list> )

           <binding> ::= (<var> <exp>)

              <body> ::= <exp>

The let special form takes two arguments. The first is a list of variable bindings. Though Racket permits an empty list, in practice we almost never use one. (In all my years of programming in Lisp, Scheme, and Racket, I have never written a no-variable let expression.) The second is an expression that uses these bindings.

Quick Exercise: Why does let have to be a special form?

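The region of the variables -- that is, the code where these variables have meaning -- is the body of the let expression. This is important. We cannot use one of the variables declared in a let expression within the value expression of another variable in the same let. A reference there is resolved outside the let, using whatever binding, if any, exists in the surrounding code.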
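
A small sketch makes the region rule concrete. The name region-demo is made up for this example. The inner let binds both x and y, but the value expression for y is outside the region of the inner x, so it sees the outer x.

```racket
#lang racket

;; Hypothetical example of the region rule for let.
(define region-demo
  (let ((x 1))
    (let ((x 2)
          (y x))        ; this x is the outer x (1), not the x bound alongside y
      (list x y))))     ; in the body, x is the inner x (2)

region-demo             ; => '(2 1)
```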

As with nearly all Racket expressions, a let expression can be used anywhere an expression is expected, including outside the context of a function definition. For example:

    (+ (let ((x 3))                   ; what is the value?
         (+ x (* x 10)))
       x)

The following expression tells us a little more about how let works:

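    (define x 5)

    (+ (let ((x 3))                   ; what is the value now?
         (+ x (* x 10)))
       x)

When this expression is evaluated, it returns the value 38. Inside the let, x refers to the local binding, 3, so the let expression evaluates to 33. The final x lies outside the region of the let, so it refers to the top-level x, 5. The sum is 33 + 5 = 38.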

Do you notice any similarity between this example and our recent discussion of free and bound variables? That's not accidental...

We can use let expressions to create local variables for all the same reasons we use local variables in other languages. Consider this helper function, which you wrote for positions-of on Homework 4:

    (define positions-of-helper
      (lambda (s los position)
        (if (null? los)
            '()
            (if (eq? s (first los))
                (cons position
                      (positions-of-helper s (rest los) (add1 position)))
                (positions-of-helper s (rest los) (add1 position))))))

In the case of a pair, we always need to know the positions of s in the rest of the list. This code repeats the call, resulting in code that is harder to read. We could use a local variable to hold the result for the rest of the list:

    (define positions-of-helper
      (lambda (s los position)
        (if (null? los)
            '()
            (let ((positions-for-rest
                    (positions-of-helper s (rest los) (add1 position))))
              (if (eq? s (first los))
                  (cons position positions-for-rest)
                  positions-for-rest)))))

In some situations, we might prefer this solution.

Translational Semantics

Semantics refers to what a programming language construct means. Consider the let special form we have just discussed. How are we to interpret a let expression? This is important for human readers as well as language processors such as our Racket interpreter and our C++ compiler.

There are a number of ways to describe the semantics of a programming language feature. For instance, we could write a definition in English or some other natural language. But such definitions tend to be imprecise or ambiguous, even for human readers.

One of the more natural ways for a computer scientist to describe the semantics of a language feature is to write a program. We can translate expressions that use one feature into expressions that use another feature, perhaps one that we already understand well. This is called a translational semantics. Let's take a look at a translational semantics for the let expression.

A Translational Semantics for let

The primary purpose of a let expression is to bind variables to values. We know, too, that the application of a lambda expression binds variables to values, for use in evaluating an expression that contains those variables.

Recall that a let expression has the following form:

    (let ((<var_1> <exp_1>)
          (<var_2> <exp_2>)
          ...
          (<var_n> <exp_n>))
      <body>)

The semantics of this expression bind the value of <exp_i> to <var_i> in <body>. As noted above, the variables are bound to their values only in the body of the let expression. This is called the region of the variables. lambda expressions work the same way: formal parameters are bound to their values only in the body of the lambda expression.

So, we can express the meaning of a let expression using the following lambda expression:

    ((lambda (<var_1> <var_2> ... <var_n>)
       <body>)
     <exp_1> <exp_2> ... <exp_n>)

In fact, many Scheme interpreters automatically translate a let expression into an equivalent lambda application whenever they see one!
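
Because Racket programs are lists, we can sketch this translation as a short program of our own. The helper below is hypothetical and deliberately minimal: the name let->application is made up, and it assumes a well-formed let expression whose body is a single expression. It is not how any particular interpreter implements the translation.

```racket
#lang racket

;; Hypothetical helper: rewrite a quoted let expression as the
;; equivalent lambda application. Assumes a single-expression body.
(define let->application
  (lambda (exp)
    (let ((bindings (second exp))
          (body     (third exp)))
      (cons (list 'lambda (map first bindings) body)   ; (lambda (vars ...) body)
            (map second bindings)))))                  ; followed by the value expressions

(define sample-let '(let ((x 3)) (+ x (* x 10))))

(let->application sample-let)
; => '((lambda (x) (+ x (* x 10))) 3)

;; Both forms evaluate to the same value in a fresh namespace.
(define ns (make-base-namespace))
(eval sample-let ns)                      ; => 33
(eval (let->application sample-let) ns)   ; => 33
```

The variables of the bindings become the formal parameters of the lambda, and the value expressions become the arguments of the application.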

We programmers can do the same thing. Just as we could write Java code without for loops by using while loops, we can write Racket code without let expressions by using lambda applications. In both cases, though, the code we produce would probably not be as nice to read, and it would take more effort to write.

The idea of treating a local binding as a syntactic abstraction is not unique to Racket. We can apply the same semantics to local variables in languages like Python or Java.

Consider the following snippet of Python:

    x = 2
    # ... do some stuff
    return x * something

This code creates a local variable, x, assigns it the value 2, and uses the variable to compute a result. We can write code that has exactly the same meaning and no local variable by creating and calling a new helper function:

    return helper(2)

    def helper(x):
        # ... do some stuff
        return x * something

This code creates a formal parameter, x, assigns it a value when we call the function, and uses the variable to compute a result -- with no local variable.

Of course, this sort of construction is not as natural in Python as it is in Racket, because it is harder to create and use Python functions "on the fly". But we programmers do this sort of thing all the time, whenever we recognize a need to factor out functionality for reuse in a function.

Note: Study this Python example. It is often a stumbling block for students.

The syntax of a programming language generally caters to us programmers. At the implementation level, though, language interpreters are less concerned with ease of programming than they are with efficiency, completeness, and correctness of translation. And that is the way we programmers like it!

A language interpreter can accommodate both sides of this equation. Racket allows us to program with the syntactic sugar of let but pre-processes it away before evaluating our program. A Racket compiler can translate any let expression into an equivalent lambda application before evaluating it.

Don't make the mistake of thinking that the idea of pre-processing syntactic abstractions away is unique to Racket or to odd functional programming languages. C++ was designed as a language made up almost entirely of syntactic sugar! Its abstractions (classes and members) can be -- and originally were -- pre-processed into C code with structs that is suitable for a vanilla C compiler.

The key point to note here is this:

Local variables are not essential to a programming language!

Quick Exercises: For some practice, try converting these let expressions into equivalent lambda expressions:

    (let ((x 5)        ;; Exercise 1
          (y 6)
          (z 7))
       (- (+ x z) y))

    (let ((x 13)       ;; Exercise 2
          (y (+ y x))
          (z x))
       (- (+ x z) y))

Wrap Up

Eugene Wallingford ..... ..... March 7, 2023