TITLE: Guy Steele's Own Strange Loop AUTHOR: Eugene Wallingford DATE: October 15, 2010 12:34 PM DESC: ----- BODY: The last talk of the afternoon of Day 1 was a keynote by Guy Steele. My notes for his talk are not all that long, or at least weren't when I started writing. However, as I expected, Steele presented a powerful talk, and I want to be able to link directly to it later. Steele opened with a story about a program he wrote forty years ago, which he called the ugliest program he ever wrote. It fit on a single punch card. To help us understand this program, he described in some detail the IBM hardware on which it ran. One problem he faced as a programmer is that the dumps were undifferentiated streams of bytes. Steele wanted line breaks, so he wrote an assembly language program to do that -- his ugliest program. Forty years later, all he has is the punch card -- no source. Steele's story then turned into CSI: Mainframe. He painstakingly reverse-engineered his code from punches on the card. We learned about instruction format, data words, register codes... everything we needed to know how this program managed to dump memory with newlines and fit on a single card. The number of hacks he used, playing on puns between op codes and data and addresses, was stunning. That he could resurrect these memories forty years later was just as impressive. I am just old enough to have programmed assembly for a couple of terms on punch cards. This talk brought back memories, even how you can recognize data tables on a card by the unused rows where there are no op codes. What a wonderful forensics story. The young guys in the room liked the story, but I think some were ready for the meet of the talk. But Steele told another, about a program for computing sin 3x on a PDP-11. To write this program, Steele took advantage of changes in the assembly languages between the IBM mainframe and the PDP-11 to create more readable code. Still, he had to use several idioms to make it run quickly in the space available. These stories are all about automating resource management, from octal code to assemblers on up to virtual memory and garbage collection. These techniques let the programmer release concerns about managing memory resources to tools. Steele's two stories demonstrate the kind of thinking that programmers had to do back in the days when managing memory was the programmer's job. It turns out that the best way to think about memory management is not to think about it at all. At this point, Steele closed his own strange loop back to the title of his talk. His thesis is this: the best way to think about parallel programming is not to have to. If we program using a new set of idioms, then parallelism can be automated in our tools. The idioms aren't about parallelism; they are more like functional programming patterns that commit the program less to underlying implementation. There are several implications of Steele's thesis. Here are two: Steele illustrated both of these implications by solving an example problem that would fit nicely in a CS1 course: finding all the words in a string. With such a simple problem, everyone in the room has an immediate intuition about how to solve it. And nearly everyone's intuition produces a program using accumulators that violates several important algebraic properties that our code might have. One thing I love about Steele's talks: he grounds ideas in real code. He developed a complete solution to the problem in Fortress, the language Steele and his team have been creating at Sun/Oracle for the last few years. I won't try to reproduce the program or the process. I will say this much. One, the process demonstrated a wonderful interplay between functions and objects. Two, in the end, I felt like we had just used a process very similar to the one I use when teaching students to create this functional merge sort function:
    (define mergesort
       (lambda (lst)
          (merge-all (map list lst))))
Steele closed his talk with the big ideas that his programs and stories embody. Among the important algebraic properties that programs should have whenever possible are ones we all learned in grade school, explicitly or implicitly. Though they may still sound scary, they all have intuitive common meanings: Steele said that "wiggle room" was the key buzzword to take away from his talk. Preserving invariants of these algebraic properties give the compiler wiggle room to choose among alternative ways to implement the solution. In particulars, associativity and commutativity give the compiler wiggle room to parallelize the implementation. (Note that the merge-all operation in my mergesort program satisfies all five properties.) One way to convert an imperative loop to a parallel solution is to think in terms of grouping and functions:
  1. Bunch mutable state together as a state "value".
  2. Look at the loop as an application of one or more state transformation functions.
  3. Look for an efficient way to compose these transformation functions into a single function.
The first two steps are relatively straightforward. The third step is the part that requires ingenuity! In this style of programming, associative combining operators are a big deal. Creating new, more diverse associative combining operators is the future of programming. Creating new idioms -- the patterns of programs written in this style -- is one of our challenges. Good programming languages of the future will provide, encourage, and enforce invariants that give compilers wiggle room. In closing, Steele summarized our task as this: We need to do for processor allocation what garbage collection did for memory allocation. This is essential in a world in which we have parallel computers of wildly different sizes. (Multicore processors, anyone?) I told some of the guys at the conference that I go to hear Guy Steele irrespective of his topic. I've been fortunate enough to be in a small OOPSLA workshop on creativity with Steele, gang-writing poetry and Sudoku generators, and I have seen him speak a few times over the years. Like his past talks, this talk makes me think differently about programs. It also crystallizes several existing ideas in a way that clarifies important questions. -----