TITLE: OOPSLA Day 2: Guy Steele on Fortress AUTHOR: Eugene Wallingford DATE: October 26, 2006 3:46 PM DESC: ----- BODY: The first two sentences of Guy Steele's OOPSLA 2006 keynote this morning were eerily reminiscent of his describes the application of the principles from his justly famous OOPSLA 1998 invited talk Growing a Language. I'm pretty sure that, for that few seconds, he used only one-syllable words! Guy Steele This talk, A Growable Language, was related to that talk, but in a more practical sense. It applied the spirit and ideas expressed in that talk to a new language. His team at Sun is designing Fortress, a "growable language" with the motto "To Do for Fortran What Java Did for C". The aim of Fortress is to support high-performance scientific and engineering programming without carrying forward the historical accidents of Fortran. Among the additions to Fortran will be extensive libraries, including for networking, a security model, type safety, dynamic compilation (to enable the optimization of running program), multithreading, and platform independence. The project is being funded by DARPA with the goal of improving programmer productivity for writing scientific and engineering applications -- to reduce the time between when a programmer receives a problem and when the programmer delivers the answer, rather than focus solely on the speed of the compiler or executable. Those are important, too, but we are shortsighted in thinking that they are the only formsof speed that matter. (Other DARPA projects in this vein are the Extend language from IBM[?] and the Chapel language from Cray.) Given that even desktop computers are moving toward multicore chips, this sort of project offers potential value beyond the scientific programming community. The key ideas behind the Fortress project are threefold: Steele reminded his audience of the motivation for growing a language from his first talk: If you plan for a language, and then design it, then build it, you will probably miss the optimal window of opportunity for language. One of the guiding questions of the Fortress project is, Will designing for the growth of a language and its user community change the technical decisions the team makes or, more importantly, the way it makes them? One of the first technical questions the team faced was what set of primitive data types to build into the language. Integer and float -- but what sizes? Data aggregates? Quaternions, octonions? Physical units such as meter and kilogram?? He "might say 'yes' to all of them, but he must say no to some of them." Which ones -- and why? The strategy of the team is to, wherever possible, add a desired feature via a library -- and to give library designers substantial control over both the semantics and the syntax of the library. The result is a two-level language design: a set of features to support library designers and a set of features to support application programmers. The former have turned out to be quite obbject-oriented, while the latter is not obbject-oriented at all -- something of a surprise to the team. At this time, the language defines some very cool types in libraries: lists, vectors, sets, maps (with better, more math-like notion), matrices and multidimensional vectors, and units of measurement. The language also offers as a feature mathematical typography, using a wiki-style mark-up to denote Unicode characters beyond what's available on the ASCII keyboard. In the old model for designing a language, the designers In the new model, though, designers At a strategic level, the Fortress team wants to avoid creating a monolithic "standard library", even when taking into account the user-defined libraries created by a single team or by many. Their idea is instead to treat libraries as replaceable components, perhaps with different versions. Steele says that Fortress effectively has make and svn built into its toolset! I can just hear some of my old-school colleagues decrying this "obvious bloat", which must surely degrade the language's performance. Steele and his colleagues have worked hard to make abstraction efficient in a way that surpasses many of today's languages, via aggressive static and dynamic optimization. We OO programmers have come to accept that our environments can offer decent efficiency while still having features that make us more productive. The challenge facing Fortress is to sell this mindset to C and Fortran programmers with habits of 10, 20, even 40 years thinking that you have to avoid procedure calls and abstract data types in order to ensure optimal performance. The first category of features in Fortress intend to support library developers. The list is an impressive integration of ideas from many corners of the programming language community, including first-class types, traits and trait descriptors (where, comprise, exclude), multiple inheritance of code but not fields, and type contracts. In this part of the language definition, knowledge that used to be buried in the compiler is brought out explicitly into code where it can be examined by programmers, reasoned over by the type inference system, and used by library designers. But these features are not intended for use by application programmers. In order to support application developers, Steele and his team watched (scientific) programmers scribble on their white boards and then tried to convert as much of what they say as possible into their language. For example, Fortress takes advantage of subtle whitespace cues, as in phrases such as
{ |x| | x ← S, 3 | x }
Note the four different uses of the vertical bar, disambiguated in part by the whitespace in the expression. The wired-in syntax of Fortress consists of some standard notation from programming and math: Any other operator can be defined as infix, prefix, or postfix. For example, ! is defined as a postfix for factorial. Similarly, juxtaposition is a binary operator, one which can be defined by the library designer for her own types. Even nicer for the cientific programmer, the compiler knows that the juxtaposition of functions is itself a function (composition!). The syntax of Fortress is rich and consistent with how scientific programmers think. But they don't think much about "data types", and Fortress supports that, too. The goal is for library designers to think about types a lot, but application programmers should be able to do their thing with type inference filling in most of the type information. Finally, scientists, engineers, and mathematicians use particular textual conventions -- fonts, characters, layout -- to communicate. Fortress allows programmers to post-process their code into a beautiful mathematical presentation. Of course, this idea and even its implementation are not new, but the question for the Fortress team was what it would be like if a language were designed with this downstream presentation as the primary mode o presentation? The last section of Steele's talk looked a bit more like a POPL or ICFP paper, as he explained the theoretical foundations underlying Fortress's biggest challenge: mediating the language's abstractions down to efficient executable code for parallel scientific computation. Steele asserted that parallel programming is not a feature but rather a pragmatic compromise. Programmers do not think naturally in parallel and sop need language support. Fortress is an experiment in making parallelism the default mode of computation. Steele's example focused on the loop, which in most languages conflates two ideas: "do this statement (or block) multiple times" and "do things in this order". In Fortress, the loop itself means only the former; by default its iterations can be parallelized. In order to force sequencing, the programmer modifies the interval of the loop (the range of values that "counts" the loop) with the seqoperator. So, rather than annotate code to get parallel compilation, we must annotate to get sequential compilation. Fortress uses the idea of generators and reducers -- functions that produce and manipulate, respectively, data structures like sequences and trees -- as the basis of the program transformations from Fortress source code down to executablecode. There are many implementations for these generators and reducers, some that are sequential and some that are not. From here Steele made a "deep dive" into how generators and reducers are used to implement parallelism efficiently. That discussion is way behind what I can write here. Besides, I will have to study the transformations more closely before I can explain them well. As Steele wrapped up, he reiterated The Big Idea that guides the Fortress team: to expose algorithm and design decisions in libraries rather than bury them in the compiler -- but to bury them in the libraries rather than expose them in application code. It's an experiment that many of us are eager to see run. One question from the crowd zeroed in on the danger of dialect. When library designers are able to create such powerful extensions, with different syntactic notations, isn't there a danger that different libraries will implement similar ideas (or different ideas) incompatibly? Yes, Steele acknolwedge, that is a real danger. He hopes that the Fortress community will grow with library designers thinking of themselves as language designers and so exercise restraint in the the extensions they make, and work together to create community standards. I also learned about a new idea that I need to read about... the BOOM hierarchy. My memory is vagues, but the discussion involved considering whether a particular operation -- which, I can't remember -- is associative, commutative, and idempotent. There are, of course, eight possible combinations of these features, four of which are meaningful (tree, list, bag/multiset, and set). One corresponds to an idea that Steele termed a "mobile", and the rest are, in his terms, "weird". I gotta read more! -----