CS 3540 Reading: Adding State

Adding State to a Language Interpreter

Context

You implemented an interpreter for the Huey color language for Homework 9 and then extended the language and interpreter to support local variables for Homework 10. We reviewed a sample implementation of Homework 9 for Session 26, during which we discussed some of the design choices we have to make when writing an interpreter.

Previously, you read a short discussion of the difference between denoted and expressed values, which introduced the distinction between thinking of a variable as referring to a value and thinking of it as referring to a location that holds a value. This distinction is essential if we want our programs to be able to record state that changes over time, a topic that we covered in Session 24 and Session 25.

In order to support mutable state, we must first change the environment of the language to map variables onto cells that hold values. We must then have expressions that enable programmers to modify the state of existing variables, either through an interactive interpreter or through sequences of statements that change the values of existing variables.

In this reading assignment, let's walk through the process of adding a sequence of assignment statements to a language and to the programs that process programs written in that language, including the preprocessor and the evaluator.

Adding a Sequence of Assignment Statements

As we created Huey and added new features to it over two homeworks, we went through the same set of steps each time. After choosing a concrete syntax for the feature, we extended our code in several steps:

extend the BNF representation of the language grammar,
define syntax procedures that embody the abstract syntax of the feature,
add a new case to the preprocessor that recognizes the feature and eliminates any syntactic abstractions, and
add a new case to the evaluator that describes the behavior of the new feature.

In order to extend the evaluator, we must also define the semantics of the new feature. What does the new expression mean when used in a program? For many features, the semantics were relatively straightforward, though there were occasional wrinkles.

For this assignment, we add a sequence of assignment statements as a new kind of compound color expression. A do expression consists of a sequence of assignment statements followed by a single expression that provides the value of the entire do expression. Let's keep things simple to start, so that we can learn as much as we can in a short period of time.

Concrete Syntax

When we create our own languages, we can use whatever concrete syntax we like. Designing syntax that is easy to learn, use, and grow with is an art. For Huey, we should use a concrete syntax for a sequence that is consistent with the rest of the language. The syntax defined in the updated language spec looks like this:

(do (rgb 255 0 0))

(color c = (rgb 0 255 0) in
  (do (c <= (c mix (rgb 0 0 255)))
      (c <= (invert blue-green))
      (c shift 5)))

do is the keyword that signals a sequence. It is followed by zero or more assignment expressions, which are parenthesized expressions using the symbol <= to indicate assignment. After the assignment expressions comes a single color expression that represents the value of the sequence.

The BNF representation of a sequence is:

<color> ::= ( do <assignment>* <color> )

where an <assignment> is defined as:

<assignment> ::= ( <varref> <= <color> )

Abstract Syntax

When we extend the concrete syntax, we also extend our syntax procedures to handle sequences. This will include a type predicate, a constructor, and corresponding access procedures. We also need to update the general type predicate color? to recognize the new type of expression.

Remember: the abstract syntax consists of the "variable" parts of the concrete syntax: the parts of an expression that change from one use of the feature to the next. There are two meaningful parts of the expression that defines a sequence: the list of assignments and the value expression at the end. The keyword do is a "constant".

Racket has a primitive function named last that returns the last item in a list. You will need to write code to pull out the list of assignment statements.

Suggestion: Write a utility function named all-but-last that takes a list as an argument and returns a list of all its items except for the last. Use it to pull out the list of assignment statements from a do expression.
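One way to sketch these syntax procedures, assuming a do expression is represented as the plain Racket list that the reader produces. The names do-exp?, make-do, do->assignments, and do->body are illustrative choices, not names fixed by the language spec, and the predicate checks only the outer shape of the expression.

```racket
#lang racket

;; Returns every item of a non-empty list except the last.
;; Assumes lst has at least one item.
(define (all-but-last lst)
  (if (null? (cdr lst))
      '()
      (cons (car lst) (all-but-last (cdr lst)))))

;; Type predicate (hypothetical name). It checks only the outer shape:
;; a list that starts with do and has at least one more item.
;; A full version would also check each assignment and the final color.
(define (do-exp? exp)
  (and (list? exp)
       (>= (length exp) 2)
       (eq? (car exp) 'do)))

;; Constructor (hypothetical name): builds (do <assignment>* <color>).
(define (make-do assignments body)
  (append (list 'do) assignments (list body)))

;; Accessors (hypothetical names).
(define (do->assignments exp) (all-but-last (cdr exp)))
(define (do->body exp) (last exp))
```

With these definitions, (do->assignments '(do (rgb 255 0 0))) is the empty list and (do->body '(do (rgb 255 0 0))) is (rgb 255 0 0), which matches the "zero or more assignments" case in the grammar.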

Your preprocessor and evaluator will have to process each assignment statement in a sequence, so it will be necessary to also define syntax procedures to handle assignment statements. As always, this includes a type predicate, a constructor, and corresponding access procedures. The abstract syntax of an assignment includes the variable and the value being assigned to it.
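The syntax procedures for an assignment might look like the sketch below. It assumes the same list representation as the concrete syntax, ( <varref> <= <color> ). The procedure names are hypothetical, and the predicate does not verify that the third item is a valid color expression, which a full version would do by calling color?.

```racket
#lang racket

;; Type predicate (hypothetical name): a three-item list whose first
;; item is a symbol and whose second item is the symbol <=.
;; It does not check that the third item is a color expression.
(define (assignment? exp)
  (and (list? exp)
       (= (length exp) 3)
       (symbol? (first exp))
       (eq? (second exp) '<=)))

;; Constructor (hypothetical name).
(define (make-assignment var value)
  (list var '<= value))

;; Accessors (hypothetical names): the variable and the value assigned to it.
(define (assignment->var exp) (first exp))
(define (assignment->value exp) (third exp))
```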

The Preprocessor

Next we add an arm to the preprocessor. The sequence is a core part of the language, so the preprocessor rebuilds a new sequence expression with its sub-expressions preprocessed. It must preprocess each assignment statement and the final value. In assignment statements, the variables are just symbols, not expressions, so we don't have to preprocess them. The new values are color expressions, so they need to be preprocessed.
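The new arm could be sketched as follows. This is not the course's preprocessor: the else clause is a stand-in that returns every other expression unchanged, where the real preprocessor would have its existing arms. The syntax procedures are simplified, hypothetical versions repeated here so that the sketch runs on its own.

```racket
#lang racket

;; Simplified syntax procedures (hypothetical names), repeated so this
;; sketch is self-contained.
(define (do-exp? exp)
  (and (list? exp) (>= (length exp) 2) (eq? (car exp) 'do)))
(define (make-do assignments body)
  (append (list 'do) assignments (list body)))
(define (do->assignments exp) (drop-right (cdr exp) 1)) ; all but the last item
(define (do->body exp) (last exp))
(define (make-assignment var value) (list var '<= value))
(define (assignment->var exp) (first exp))
(define (assignment->value exp) (third exp))

;; Preprocess one assignment: keep the variable, which is just a symbol,
;; and preprocess the color expression being assigned.
(define (preprocess-assignment assignment)
  (make-assignment (assignment->var assignment)
                   (preprocess (assignment->value assignment))))

(define (preprocess exp)
  (cond
    ;; New arm: rebuild the sequence from its preprocessed parts.
    [(do-exp? exp)
     (make-do (map preprocess-assignment (do->assignments exp))
              (preprocess (do->body exp)))]
    ;; Stand-in for all existing arms: return the expression unchanged.
    [else exp]))
```

Because the stand-in arm changes nothing, this sketch returns a do expression equal to its input. In the real preprocessor, any syntactic abstraction nested inside an assignment or the final expression would be translated on the way through.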

Toward Behavior: Adding Cells to the Interpreter

In order to implement the desired behavior, we must modify our environments to map variables onto cells that hold values. That way, when a variable is assigned a new value, the evaluator can update the variable's value in the environment rather than extend the environment with a new binding.

We will do this by representing a cell as a mutable object whose value can be changed with set!. This will be the second data structure we implement for our interpreter, after the environment ADT we implemented to support local variables.
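A minimal sketch of such a cell, assuming a closure-based representation in which the cell is a procedure that responds to two messages. The names make-cell, cell-value, and cell-set! are illustrative, and the version used in class may differ.

```racket
#lang racket

;; A cell is a closure over the variable value. Sending it the message
;; get returns the current value; sending set with an argument changes
;; the value using set!.
(define (make-cell value)
  (lambda (message . args)
    (case message
      [(get) value]
      [(set) (set! value (car args))]
      [else (error 'cell "unknown message: ~a" message)])))

;; Convenience procedures (hypothetical names) that hide the messages.
(define (cell-value cell) (cell 'get))
(define (cell-set! cell new-value) (cell 'set new-value))
```

Each call to make-cell creates a fresh closure, so two cells never share state, even when they start with the same value.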

Before we modify our evaluator to handle sequences, we need to modify the environments that it creates and extends to map variables to cells. This requires three changes:

change the initial environment to bind white and black to cells that hold their values,
change the part of the evaluator that extends an environment to bind the new variable to a cell that holds its value, and
change the part of the evaluator that looks up the value of a variable in the environment to ask the cell it finds for the actual value.
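The three changes above can be sketched with a toy environment. This is not the environment ADT from the homework: it assumes an association list of (variable . cell) pairs, and it represents a color value as an rgb list. The names bind, look-up-cell, look-up, and initial-env are hypothetical.

```racket
#lang racket

;; Closure-based cell (hypothetical names).
(define (make-cell value)
  (lambda (message . args)
    (case message
      [(get) value]
      [(set) (set! value (car args))]
      [else (error 'cell "unknown message: ~a" message)])))
(define (cell-value cell) (cell 'get))
(define (cell-set! cell new-value) (cell 'set new-value))

;; Change 2: extending an environment wraps the new value in a cell.
(define (bind var value env)
  (cons (cons var (make-cell value)) env))

;; Finds the cell bound to var, or signals an error if var is unbound.
(define (look-up-cell var env)
  (let ([binding (assq var env)])
    (if binding
        (cdr binding)
        (error 'look-up "unbound variable: ~a" var))))

;; Change 3: looking up a variable asks its cell for the current value.
(define (look-up var env)
  (cell-value (look-up-cell var env)))

;; Change 1: the initial environment binds white and black to cells.
(define initial-env
  (bind 'white '(rgb 255 255 255)
        (bind 'black '(rgb 0 0 0) '())))
```

Code that only calls look-up sees the same values as before, which is why the evaluator's behavior should not change at this step. The difference is that look-up-cell now gives an assignment something to update.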

After this step, our evaluator should work exactly as it did after Homework 10. But now we will be able to implement the behavior of sequence expressions.

The Evaluator

Now, to evaluate a sequence expression, we add a new arm to the helper of eval-exp that evaluates preprocessed expressions. This case evaluates a do expression by first evaluating its assignment statements in order and then evaluating the final color expression in the updated environment. The value of that final expression is the value of the sequence.

Remember: Any part of the expression that is a color must be evaluated with a recursive call, using the existing environment.

Suggestion: Create a helper function to evaluate the list of assignment statements. It is a list of the sort you have processed many times. Use structural recursion! An assignment statement does not have to return a value, because it is being evaluated for its side effect (a change to the value of an existing variable).
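To make the shape of this arm concrete, here is a sketch of a deliberately tiny evaluator. It handles only rgb literals, variable references, and do expressions, and it signals an error for everything else, so it is not the Huey evaluator. It assumes closure-based cells and an association-list environment, and all of its helper names other than eval-exp are hypothetical.

```racket
#lang racket

;; Closure-based cell and association-list environment (hypothetical names).
(define (make-cell value)
  (lambda (message . args)
    (case message
      [(get) value]
      [(set) (set! value (car args))]
      [else (error 'cell "unknown message: ~a" message)])))
(define (cell-value cell) (cell 'get))
(define (cell-set! cell new-value) (cell 'set new-value))
(define (bind var value env) (cons (cons var (make-cell value)) env))
(define (look-up-cell var env)
  (let ([binding (assq var env)])
    (if binding
        (cdr binding)
        (error 'look-up "unbound variable: ~a" var))))

;; Simplified syntax procedures (hypothetical names).
(define (rgb? exp)
  (and (list? exp) (= (length exp) 4) (eq? (car exp) 'rgb)))
(define (do-exp? exp)
  (and (list? exp) (>= (length exp) 2) (eq? (car exp) 'do)))
(define (do->assignments exp) (drop-right (cdr exp) 1)) ; all but the last item
(define (do->body exp) (last exp))
(define (assignment->var exp) (first exp))
(define (assignment->value exp) (third exp))

;; Structural recursion over the list of assignments. Each assignment is
;; evaluated for its side effect on an existing cell, so no value is kept.
(define (eval-assignments assignments env)
  (unless (null? assignments)
    (let* ([assignment (car assignments)]
           [value (eval-exp (assignment->value assignment) env)]
           [cell (look-up-cell (assignment->var assignment) env)])
      (cell-set! cell value)
      (eval-assignments (cdr assignments) env))))

(define (eval-exp exp env)
  (cond
    [(rgb? exp) exp]                                       ; a literal is its own value here
    [(symbol? exp) (cell-value (look-up-cell exp env))]    ; variable reference
    [(do-exp? exp)                                         ; the new arm
     (eval-assignments (do->assignments exp) env)
     (eval-exp (do->body exp) env)]
    [else (error 'eval-exp "not supported in this sketch: ~a" exp)]))
```

In an environment where c is bound to (rgb 0 255 0), evaluating (do (c <= (rgb 0 0 255)) c) returns (rgb 0 0 255), and c still holds that value after the expression finishes. The environment itself is never extended; only the contents of the cell change.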

Conclusion

That's it! Our interpreter should now be able to handle sequences as color expressions.