TITLE: Learn Exceptions Later AUTHOR: Eugene Wallingford DATE: September 26, 2017 3:58 PM DESC: ----- BODY: Yesterday, I mentioned rewriting the rules for computing FIRST and FOLLOW sets using only "plain English". As I was refactoring my descriptions, I realized that one of the reasons students have difficulty with many textbook treatments of the algorithms is that the books give complete and correct definitions of the sets upfront. The presence of X := ε rules complicates the construction of both sets, but they are unnecessary to understanding the commonsense ideas that motivate the sets. Trying to deal with ε too soon can interfere with the students learning what they need to learn in order to eventually understand ε! When I left the ε rules out of my descriptions, I ended up with what I thought were an approachable set of rules:

The FIRST set of a terminal contains only the terminal itself.

To compute FIRST for a non-terminal X, find all of the grammar rules that have X on the lefthand side. Add to FIRST(X) all of the items in the FIRST set of the first symbol of each righthand side.

The FOLLOW set of the start symbol contains the end-of-stream marker.

To compute FOLLOW for a non-terminal X, find all of the grammar rules that have X on the righthand side. If X is followed by a symbol in the rule, add to FOLLOW(X) all of the items in the FIRST set of that symbol. If X is the last symbol in the rule, add to FOLLOW(X) all of the items in the FOLLOW set of the symbol on the rule's lefthand side.

These rules are incomplete, but they have offsetting benefits. Each of these cases is easy to grok with a simple example or two. They also account for a big chunk of the work students need to do in constructing the sets for a typical grammar. As a result, they can get some practice building sets before diving into the gnarlier details ε, which affects both of the main rules above in a couple of ways. These seems like a two-fold application of the Concrete, Then Abstract pattern. The first is the standard form: we get to see and work with accessible concrete examples before formalizing the rules in mathematical notation. The second involves the nature of the problem itself. The rules above are the concrete manifestation of FIRST and FOLLOW sets; students can master them before considering the more abstract ε cases. The abstract cases are the ones that benefit most from using formal notation. I think this is an example of another pattern that works well when teaching. We might call it "Learn Exceptions Later", "Handle Exceptions Later", "Save Exceptions For Later", or even "Treat Exceptions as Exceptions". (Naming things is hard.) It is often possible to learn a substantial portion of an idea without considering exceptions at all, and doing so prepares students for learning the exceptions anyway. I guess I now have at least one idea for my next PLoP paper. Ironically, writing this post brings to mind a programming pattern that puts exceptions up top, which I learned during the summer Smalltalk taught me OOP. Instead of writing code like this:

    if normal_case(x) then
       // a bunch
       // of lines
       // of code
       // processing x
    else
       throw_an_error

you can write:

    if abnormal_case(x) then
       throw_an_error

    // a bunch
    // of lines
    // of code
    // processing x

This idiom brings the exceptional case to the top of the function and dispatches with it immediately. On the other hand, it also makes the normal case the main focus of the function, unindented and clear to the eye. It may look like this idiom violates the "Save Exceptions For Later" pattern, but code of this sort can be a natural outgrowth of following the pattern. First, we implement the function to do its normal business and makes sure that it handles all of the usual cases. Only then do we concern ourselves with the exceptional case, and we build it into the function with minimal disruption to the code. This pattern has served me well over the years, far beyond Smalltalk. -----