Recursion doesn't need to be scary. Sometimes, it's all about the data.
Fill in the blanks: 1, 2, 4, _, _.
Recall that a data type consists of two things: a set of values and a set of operations on those values.
As programmers, we are also concerned with other pragmatic issues, such as how values are represented as literals in programs and how values are represented for reading and printing. But these are interface issues that are essentially external to the data type itself.
In order to define operations on the values of a data type, we need a way to specify the set of values in the type. That way, we can be sure to handle all possible values of an object. More generally, we want to be able to recognize whether or not a value is in the set in order for our language processors to do type-checking.
The most concise way to define many data types is inductively.
Inductive definitions are a powerful form of shorthand. Though many of you cringe at the idea of an inductive proof, most of you think about certain problems in this way with ease. (Writing a proof requires a few proof-making skills, along with more precision -- and thus more discipline.)
Suppose that we want to specify the set of natural numbers. We might write:
We all understand this definition because we intuitively recognize the pattern. This specification relies on the reader to determine the pattern (add one to the previous value) and to continue it. But what if the "reader" is a computer program? Programs aren't nearly as strong at recognizing conceptual patterns in infinite lists ("--yet", says the old AI guy).
This type of specification can even cause problems for a human reader. Does "and so on" as used above mean what we think it means?
Most of us probably feel a strong compulsion to list 8 as the next item in the sequence. It turns out that our minds have a built-in bias for simplicity, and double the previous value is one of the simplest rules to use, based on our past experience.
But, as any of you who have studied the mathematics of functions know, there are infinitely many functions that fit any finite set of points. For example, the next value in my sequence is actually 7, followed by 12, because I used the rule add one to the sum of the previous two values. Or perhaps we are listing the positive integers and omitting the odd primes. Then the next value is 6.
So we can only use such specifications safely with human readers if we have a shared understanding of how to identify the pattern. When multiple reasonable patterns exist, we must have a common sense of simplicity. (This is one of the things that always bothered me about analogy problems on exams such as the ACT and SAT.)
We can be more concise and more explicit about the pattern by writing:
1. 1 is in the set.
2. If n is in the set, then 2n is in the set.
Now we know that 8 is the next member in the set and that, using this definition, we can generate all of the powers of two greater than zero. (That is, as long as we agree on what 2n means. As we've seen, this is not as simple as we might have thought a few weeks ago!)
The second part of our definition is the inductive part. It defines one member of the set in terms of another member of the set.
Quick Exercise: Using the inductive definition above, how might we convince someone that 8 is in the set?
We could reason forward:
1 is in the set. (Rule 1)
Therefore, 2 is in the set. (Rule 2)
Therefore, 4 is in the set. (Rule 2)
Therefore, 8 is in the set. (Rule 2)
Or we could reason backward:
8 would be in the set if 4 were. (Rule 2)
4 would be in the set if 2 were. (Rule 2)
2 would be in the set if 1 were. (Rule 2)
1 is in the set. (Rule 1)
Therefore, 8 is in the set.
Reasoning inductively is something humans do automatically, and it turns out to be a useful technique to embody definitions in computer programs, too. The two arguments above correspond roughly to the forward chaining and backward chaining techniques we learn in a course on artificial intelligence.
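Here is a minimal Racket sketch of the two lines of reasoning. It assumes that Rule 1 is "1 is in the set" and that Rule 2 is "if n is in the set, then 2n is in the set", which is what the derivations above suggest. The names in-set? and members-up-to are made up for this illustration.

```racket
#lang racket

;; Backward reasoning: n is in the set if it is 1 (Rule 1),
;; or if it is even and half of it is in the set (Rule 2).
(define (in-set? n)
  (cond [(not (exact-positive-integer? n)) #f]
        [(= n 1) #t]
        [(even? n) (in-set? (quotient n 2))]
        [else #f]))

;; Forward reasoning: start with 1 (Rule 1) and keep applying
;; Rule 2 until the next member would be larger than limit.
(define (members-up-to limit)
  (let loop ([n 1])
    (if (> n limit)
        '()
        (cons n (loop (* 2 n))))))

(in-set? 8)          ; #t
(in-set? 6)          ; #f
(members-up-to 20)   ; '(1 2 4 8 16)
```

Notice that in-set? never has to decide which rule to try next. The goal drives the computation, just as it does in the backward argument.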
Side note: It turns out that the effort needed to convert our two demonstrations into honest-to-goodness proofs is not that large! The first derivation leads naturally to a proof by construction. The second is the basis of a proof by contradiction that begins, "Assume that 8 is not in the set." Proof by contradiction generalizes to much larger numbers, too, because we can use Rule 2 n times to reach 1 from 2^n for any n > 0. Proof by contradiction can generate an efficient proof, because it is directed toward a goal and is not left with a bunch of decisions about how to derive the statement to be proved. This is useful when there are many possible rules to use when reasoning forward.
Inductive definitions are not limited to numbers. Suppose that we have a set of adjectives, A = {loud, hot, intense}. Then we can define a certain set of noun phrases in this way:
Our set includes "loud arena", "intense arena", and "loud hot intense arena". Is "loud loud loud arena" in the set?
We can even use an inductive definition to specify what counts as an arithmetic expression:
This can be handy for type checking computer programs... and much more. We return to this idea soon!
Inductive specifications are so plentiful and so useful that computer scientists have developed a specialized notation -- a language -- for writing them.
In case you haven't noticed, most of computer science is about notation -- and thus about language -- in one form or another.
The notation we use to write inductive definitions is called Backus-Naur Form, or BNF. You will see BNF used in a number of places, including most programming language references, and computer scientists will expect that you know what BNF is.
BNF has three major components:
You will sometimes see terminals ignored altogether, because they are obvious to the reader. Suppose that I wished to define a "list of numbers". I may assume that my readers will know what I mean by a "number" and so not define it with a rule. If that turns out to be a problem (for example, do negative numbers count?), I can always add a rule or two to define numbers.
Here is the BNF definition for a list of numbers in Racket:
<list-of-numbers> ::= ()
<list-of-numbers> ::= (<number> . <list-of-numbers>)
This definition is inductive, because the second part defines one list of numbers in terms of another (shorter!) list of numbers. It matches the catchy definition I gave for a Racket list back in Session 4: a list is a pair whose cdr (second item) is a list.
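An inductive definition like this one translates almost line for line into a recursive predicate. The following is only a sketch: the name list-of-numbers? is made up, and it assumes that <number> means whatever Racket's built-in number? accepts.

```racket
#lang racket

;; <list-of-numbers> ::= ()
;; <list-of-numbers> ::= (<number> . <list-of-numbers>)
(define (list-of-numbers? v)
  (cond [(null? v) #t]                         ; first rule
        [(pair? v) (and (number? (car v))      ; second rule: the car is a number
                        (list-of-numbers? (cdr v)))] ; ... and the cdr is a list of numbers
        [else #f]))                            ; no rule applies

(list-of-numbers? '())        ; #t
(list-of-numbers? '(4 3))     ; #t
(list-of-numbers? '(4 a 3))   ; #f
(list-of-numbers? '(4 . 3))   ; #f
```

The code has one cond clause for each rule in the definition, plus a clause that rejects everything else.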
There are a few things to notice about the BNF definition:
What about <number>? Is it a terminal or a non-terminal? It is written as a non-terminal. We might assume that it stands in place of the numeric literals 0, 1, 2, and so on, and thus requires no formal definition. But we really should define <number> formally, too:
<number> ::= 0
<number> ::= <number>+1
As you can see from these definitions, there may be more than one rule for a non-terminal. This happens so often that there is a special notation for it:
<list-of-numbers> ::= () | (<number> . <list-of-numbers>)
We read the vertical bar, |, as "... or ...".
As a shorthand, we will sometimes use another piece of notation, called the Kleene star, denoted by {...}*. This notation says that whatever is in the curly brackets is repeated zero or more times. Using the Kleene star, we can define a list of numbers simply as:
<list-of-numbers> ::= ( {<number>}* )
Another variation is the Kleene plus, denoted by {...}+. This notation differs from the Kleene star in that the structure within the brackets must appear one or more times.
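The star and plus notations describe repetition rather than recursion, so code that follows them tends to look like a loop over the whole list. Here is one possible sketch in Racket. The names number-list? and nonempty-number-list? are made up, and <number> is again assumed to mean Racket's number?.

```racket
#lang racket

;; ( {<number>}* ) : a proper list of zero or more numbers
(define (number-list? v)
  (and (list? v)
       (andmap number? v)))

;; ( {<number>}+ ) : the same, but with at least one number
(define (nonempty-number-list? v)
  (and (pair? v)
       (number-list? v)))

(number-list? '())              ; #t
(number-list? '(4 3))           ; #t
(nonempty-number-list? '())     ; #f
(nonempty-number-list? '(4 3))  ; #t
```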
In this course, we usually write our definitions of data types using the more verbose notation given in the first definition above. The more verbose notation will help us think more directly about the code that processes our definitions.
We can use our inductive definitions, whether in BNF or not, to demonstrate that a given element is or is not a member of a particular data type. This is just what we did above when I asked "Convince me that...".
For example, consider the list (4 3). We can use the technique of syntactic derivation to show that it is a member of the data type <list-of-numbers>:
<list-of-numbers>
(<number> . <list-of-numbers>)
(4 . <list-of-numbers>)
(4 . (<number> . <list-of-numbers>))
(4 . (3 . <list-of-numbers>))
(4 . (3 . ()))
Recall that (4 3) and (4 . (3 . ())) are two ways of writing the same Racket value!
You shouldn't be surprised that syntactic derivation resembles Racket's substitution model. They are closely related. And, as in the substitution model, the order in which we derive the sub-expressions is not important.
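A derivation like this is mechanical enough that a short program can produce it. The sketch below is one way to do that, not the only way. The name derivation is made up, and the function assumes that its argument really is a list of numbers. It always expands the leftmost non-terminal first, which is just one of the acceptable orders.

```racket
#lang racket

(define LON "<list-of-numbers>")

;; derivation : list of numbers -> list of strings
;; Returns the steps of a leftmost derivation, one string per step.
;; wrap places a partial result inside the pairs derived so far.
(define (derivation lon)
  (let loop ([remaining lon]
             [wrap (lambda (s) s)])
    (if (null? remaining)
        (list (wrap LON) (wrap "()"))
        (let* ([n (number->string (car remaining))]
               [inner (lambda (s)
                        (wrap (string-append "(" n " . " s ")")))])
          (list* (wrap LON)
                 (wrap (string-append "(<number> . " LON ")"))
                 (loop (cdr remaining) inner))))))

;; Prints one step per line.
(for-each displayln (derivation '(4 3)))

;; The two notations denote the same value.
(equal? '(4 3) '(4 . (3 . ())))   ; #t
```

For (4 3), the printed steps are the six lines of the derivation shown above.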
We have just seen how we can specify a list of numbers using BNF. We can use BNF to specify other data types as well. For example, we can specify a binary tree that has numbers as leaf nodes and symbols as internal nodes as:
<tree> ::= <number> | (<symbol> <tree> <tree>)
This BNF specification describes trees that look like the following:
1
(foo 1 2)
(bar (foo 1 2) (baz 3 4))
(bif 1 (foo (baz 3 4) 5))
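As with lists, the BNF for trees tells us what the code that processes trees will look like: one case for a number, and one case for a three-item list whose last two items are trees. Here is a minimal sketch. The names tree? and count-leaves are made up for this illustration.

```racket
#lang racket

;; <tree> ::= <number> | (<symbol> <tree> <tree>)
(define (tree? v)
  (or (number? v)
      (and (list? v)
           (= (length v) 3)
           (symbol? (first v))
           (tree? (second v))
           (tree? (third v)))))

;; count-leaves : tree -> number
;; Assumes that its argument satisfies tree?.
(define (count-leaves t)
  (if (number? t)
      1
      (+ (count-leaves (second t))
         (count-leaves (third t)))))

(tree? '(bar (foo 1 2) (baz 3 4)))          ; #t
(tree? '(foo 1))                            ; #f
(count-leaves '(bif 1 (foo (baz 3 4) 5)))   ; 4
```

Both functions make two recursive calls in the inductive case, because the inductive rule mentions <tree> twice.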
We will see many more examples of BNF throughout the course. It goes hand in hand with inductive definitions. Earlier today, we defined a simple set of arithmetic expressions inductively. (Can you give a BNF definition for that set?) That definition specifies a set of numbers, but it also specifies the sort of expressions we write in computer programs. Is that resemblance accidental? No, it reveals something deeper.
An important use of BNF is in specifying the syntax of programming languages. It may seem odd that a method we just described as being useful for specifying data types would be useful for specifying syntax, but remember: a compiler is just a program that treats other programs as data. In other words, syntax is just another data type!
This is a powerful idea. Let it sink in.
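To make the idea concrete, here is a small sketch in Racket. The grammar in the comment is hypothetical and chosen only for illustration; it is not the definition of arithmetic expressions given earlier. The names expr? and value-of are made up, too. One function checks whether a value belongs to the data type, and the other processes a member of the type.

```racket
#lang racket

;; A hypothetical grammar, for illustration only:
;; <expr> ::= <number>
;;          | (+ <expr> <expr>)
;;          | (* <expr> <expr>)
(define (expr? v)
  (or (number? v)
      (and (list? v)
           (= (length v) 3)
           (memq (first v) '(+ *))
           (expr? (second v))
           (expr? (third v)))))

;; value-of : expr -> number
;; Assumes that its argument satisfies expr?.
(define (value-of e)
  (cond [(number? e) e]
        [(eq? (first e) '+) (+ (value-of (second e))
                               (value-of (third e)))]
        [else               (* (value-of (second e))
                               (value-of (third e)))]))

;; The quote mark makes this "program" an ordinary list.
(define program '(+ 1 (* 2 3)))

(expr? program)      ; #t
(value-of program)   ; 7
```

The "program" here is nothing more than a list, and the functions that process it have the same shape as the functions we write for any other inductively defined data type.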
The notion that syntax is just a data type actually takes us a bit farther along our path of studying programming languages. How might we understand a language? With some help from Shriram Krishnamurthi's book, Programming Languages: Application and Interpretation, I can think of at least four ways:
All four of these are important to programmers, especially those who want to use the language to build big systems. In a course on programming languages, our focus is on syntax and semantics, with an emphasis on the latter: How can a program implement the behaviors we desire? How can a programming language describe those behaviors? How can an interpreter or compiler process a program?
It might not seem like it yet, because you are still early in your study of computer science, but in many ways syntax is the least important of these four features. Familiar or not, terse or not, it just doesn't matter. You'll see.
BNF gives us a tool for describing syntax. Programs that process BNF descriptions recursively give us a tool for understanding behavior.
We first encountered John Backus back in Session 1, as the leader of the first compiler project in history. For this and his many other contributions to computing, Backus received the Turing Award in 1977. The Turing Award is computing's equivalent of the Nobel Prize. The other half of Backus-Naur is Peter Naur, who received the Turing Award in 2005. Naur's award also recognizes fundamental contributions to programming languages: "For fundamental contributions to programming language design and the definition of Algol 60, to compiler design, and to the art and practice of computer programming."
The first Turing Award was given in 1966 to Alan Perlis -- another name we've seen this semester, in this quote on languages changing how we think. By now, I hope that you are not surprised that so many of the giants of computing made their biggest contributions to the discipline in the area of programming languages. As mentioned above, most of computer science is about notation -- and thus about language -- in one form or another.