TITLE: Basic Concepts: The Unity of Data and Program
AUTHOR: Eugene Wallingford
DATE: February 06, 2007 10:31 PM
DESC:
-----
BODY:
I remember vividly a particular moment of understanding that I experienced in graduate school. As I mentioned last time, I was studying knowledge-based systems, and one of the classic papers we read was William Clancey's Heuristic Classification. This paper described an abstract decomposition for classification programs, the basis of diagnostic systems, that was what we today would call a pattern. It gave us the prototype against which we could pattern our own analysis of problem-solving types.

In this paper, Clancey discussed how configuration and planning are two points of view on the same problem, design. A configuration system produces as output an artifact capable of producing state changes in some context; a planning system takes such an artifact as input. A configuration system takes as input a sequence of desired state changes, to be produced by the configured system; a planning system produces a sequence of operations that produces desired state changes in the given artifact.

Thus, the same kind of system could produce a thing, an artifact, or a process that creates an artifact. In a certain sense, things and processes were the same kind of entity.

Wow. Design and planning systems could be characterized by similar software patterns. I felt like I'd been shown a new world.

Later I learned that this interplay between thing and process ran much deeper. Consider this Lisp (or Scheme) "thing", a data value known as a list:
(a 1 2)

If I replace the symbol "a" with the symbol "+", I also have a Lisp list of size 3:
(+ 1 2)

But this Lisp list is also the Lisp program for computing the sum of 1 and 2! If I give this program to a Lisp interpreter, I will see the result:
> (+ 1 2)
3

In Lisp, there is no distinction between data and program. Indeed, this is true for C, Java, or any other programming language. But the syntax of Lisp (and especially Scheme) is so simple and uniform that the unity of data and program stands out starkly. It also makes Scheme a natural language to use in a course on the principles of programming languages.

The syntax and semantics of Lisp programs are so uniform that one can write a Lisp interpreter in about a page of Lisp code. (If you'd like, take a look at my implementation of John McCarthy's Lisp-in-Lisp, in Scheme, based on Paul Graham's essay The Roots of Lisp. If you haven't read that paper, please do soon.)

There is no distinction between data and program. This is one of the truly beautiful ideas in computer science. It runs through everything that we do, from von Neumann's stored-program computer itself to the implementation of a virtual machine for Java to run inside a web browser.

A related idea is the notion that programs can exist at arbitrary levels of abstraction. For each level at which a program is data to another program, there is yet another program whose behavior is to produce that data. An assembler produces machine language from assembly language.
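
Both ideas can be sketched in a few lines of Scheme: one procedure produces a list as its output, and another takes that same list as its input and treats it as a program. This is only an illustration under stated assumptions. The names make-sum and tiny-eval are hypothetical, invented here, and the evaluator understands nothing beyond numbers, + and *. It also assumes a Scheme that provides error, as most implementations do.

```scheme
;; make-sum and tiny-eval are hypothetical names used only for this sketch.

;; A program whose output is another program:
;; it builds the three-element list (+ a b) as ordinary data.
(define (make-sum a b)
  (list '+ a b))

;; A very small evaluator for nested lists built from numbers, + and *.
(define (tiny-eval expr)
  (cond ((number? expr) expr)
        ((pair? expr)
         (let ((op (car expr))
               (args (map tiny-eval (cdr expr))))
           (cond ((eq? op '+) (apply + args))
                 ((eq? op '*) (apply * args))
                 (else (error "unknown operator:" op)))))
        (else (error "cannot evaluate:" expr))))

(define program (make-sum 1 2))   ; the data value (+ 1 2)

(length program)                  ; 3, because it is a list of three elements
(car program)                     ; the symbol +
(tiny-eval program)               ; 3, the same list treated as a program
(tiny-eval (list '* program 4))   ; 12, since programs nest like any other list
```

Nothing about the list changes between the call to length and the call to tiny-eval. Only the procedure looking at it does.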