TITLE: Patterns as Compression Technology AUTHOR: Eugene Wallingford DATE: July 13, 2009 3:23 PM DESC: ----- BODY: In trying to understand the role patterns and pattern languages play both in developing software and in learning to develop software, I often look for different angles from which to look at patterns. I've written the idea of patterns as descriptive grammar and the idea of patterns as a source of freedom in design. Both still seem useful to me as perspectives on patterns, and the latter is among the most-read articles on my blog. The notion of patterns-as-grammar also relates closely to one of the most commonly-cited roles that patterns play for the developer or learner, that of vocabulary for describing the meaningful components of a program. This weekend, I read Brian Hayes's instructive article on compressive sensing, The Best Bits. Hayes talks about how it is becoming possible to imagine that digital cameras and audio recorders could record compressed streams -- say, a 10-megapixel camera storing a 3MB photo directly rather than recording 30MB and then compressing it after the fact. The technique he calls compressive sensing is a beautiful application of some straightforward mathematics and a touch of algorithmic thinking. I highly recommend it. While reading this article, though, the idea of patterns as vocabulary came to mind in a new way, triggered initially by this passage:
... every kind of signal that people find meaningful has a sparse representation in some domain. This is really just another way of saying that a meaningful signal must have some structure or regularity; it's not a mere jumble of random bits.
an optical illusion -- can you see it? Programs are meaningful signals and have structure and regularity beyond the jumble of seemingly random characters at the level of the programming level. The chasm between random language stuff and high-level structure is most obvious when working with beginners. They have to learn that structure can exist and that there are tools for creating it. But I think developers face this chasm all the time, too, whenever they dive into a new language, a new library, or a new framework. Where is the structure? Knowing it is there and seeing it are too different matters. The idea of a sparse representation is fundamental to compression. We have to find the domain in which a signal, whether image or sound, can be represented in as few bits as possible while losing little or even none of the signal's information. A pattern language of programs does the same thing for a family of programs. It operates at a level (in Hayes' terms, in a domain) at which the signal of the program can be represented sparsely. By describing Java's I/O stream library as a set of decorators on a set of concrete streams, we convey a huge amount of information in very few words. That's compression. If we say nothing else, we have a lossy compression, in that we won't be able to reconstruct the library accurately from the sparse representation. But if we use more patterns to describe the library (such as Abstract Class and "Throw, Don't Catch"), we get a representation that pretty accurately captures the structure of the library, if not the bit-by-bit code that implements it. This struck me as a useful way to think about what patterns do for us. If you've seen other descriptions of patterns as a means for compression, I'd love to hear from you. -----