StrangeLoop 2015 is long in the books for most people, but I occasionally still think about some of the things I learned there. Chris Ford's recent blog post reminded me that I had a draft about his talk waiting to be completed and posted. Had I published this post closer to the conference, I would have called it Kolmogorov Music: Compression and Understanding.
(This is the second follow-up post about a StrangeLoop 2015 talk that is still on my mind. The previous follow-up was about Peter Alvaro's talk on languages and distributed systems.)
Chris Ford stepped to the podium in front of an emacs buffer. "Imagine a string of g's," he said, "infinitely long, going in both directions." This is an infinite string, he pointed out, with an 11-word description. That's the basic idea of Kolmogorov complexity, and the starting point for his talk.
I first read about Kolmogorov complexity in a couple of papers by Gregory Chaitin that I found on the early web back in the 1990s. It fascinated me then, and I went into "Kolmogorov Music", Ford's talk, with high hopes. It more than delivered. The talk was informative, technically clear, and entertaining.
Ford uses Clojure for this work in order to write macros. They allow him to talk about code at two levels: source and expansion. The source macro is his description of some piece of music, and the expansion is the music itself, "executable" by an interpreter.
He opened by demo'ing some cool music, including a couple of his own creations. Then he began his discussion of how complex a piece of music is. His measure of complexity is the ratio of the length of the evaluated data (the music) to the length of the macro (the program that generates it). This means that complexity is relative, in part, to the language of expression. If we used a language other than Clojure, the ratios would be different.
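To make the ratio concrete, here is a minimal Python sketch. It is not Ford's Clojure code: the two "descriptions" are hypothetical Python expressions standing in for his macros, and expansion_ratio is a name made up for illustration.

```python
def expansion_ratio(description: str) -> float:
    """Length of the evaluated data divided by the length of its description."""
    # Evaluate the description with no builtins available; it only needs
    # string literals and operators.
    expansion = eval(description, {"__builtins__": {}})
    return len(expansion) / len(description)

repetitive = '"g" * 1000'        # a long run of one note, described briefly
literal = '"gabcdefedcbag"'      # a phrase spelled out note by note

print(expansion_ratio(repetitive))
print(expansion_ratio(literal))
```

The same data would score differently under a different description language, which is the relativity noted above.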
Once we settle on a programming language, we can compare the relative complexity of two pieces of music. This also gives rise to cool ideas such as conditional complexity, based on the distance between the programs that encode two pieces of music.
Compression algorithms do something quite similar: exploit our understanding of data to express it in fewer bytes. Ford said that he based his exploration on the paper Analysis by Compression by David Meredith, a "musicologist with a computational bent". Meredith thinks of listening to music as a model-building process that can be described using algorithms.
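An off-the-shelf compressor gives a rough, computable stand-in for description length. The sketch below uses Python's zlib and is only an approximation, not Meredith's or Ford's method; the note strings and the function names are invented for illustration. The conditional measure here is one common approximation: the extra compressed bytes needed for a second piece once the first is known.

```python
import zlib

def compressed_size(data: bytes) -> int:
    """Size in bytes after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))

def conditional(x: bytes, y: bytes) -> int:
    """Approximate cost of describing y given x: C(x + y) - C(x)."""
    return compressed_size(x + y) - compressed_size(x)

# Hypothetical "pieces": a repeated theme and a near-copy with one changed note.
theme = b"gabcdefg" * 50
variation = b"gabcdefg" * 49 + b"gabcdefa"

print(len(theme), compressed_size(theme))
print(conditional(theme, variation))
```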
Programs have more expressive power than traditional music notation. Ford gave as an example clapping music that falls farther and farther behind itself as accompaniment continues. It's much easier to write this pattern using a programming language with repetition and offsets than using musical notation.
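Here is a small Python sketch of that kind of pattern, assuming a made-up twelve-beat clapping figure (x is a clap, a dot is a rest). It is not Ford's code; delay and phased_duet are hypothetical names. One voice holds steady while the other slips one more beat behind on each pass.

```python
def delay(pattern: str, offset: int) -> str:
    """Return the pattern delayed by `offset` beats, wrapping around the bar."""
    n = len(pattern)
    k = offset % n
    return pattern[n - k:] + pattern[:n - k]

def phased_duet(pattern: str):
    """Yield (steady, delayed) bars until the two voices line up again."""
    for offset in range(len(pattern) + 1):
        yield pattern, delay(pattern, offset)

FIGURE = "xxx.xx.x.xx."   # illustrative pattern, an assumption

for steady, delayed in phased_duet(FIGURE):
    print(steady, delayed)
```

Two short functions cover the whole piece; writing out every bar in staff notation takes far more ink.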
Everything has been cool so far. Ford pushed on to more coolness.
A minimalist idea can be described briefly. As Borges reminds us in The Library of Babel, a simple thing can contain things that are more complex than itself. Ford applied this idea to music. He recalled Carl Sagan's novel Contact, in which the constant pi was found to contain a hidden message. Inspired by Sagan, Ford looked to the Champernowne constant, a number created by concatenating all integers in succession -- 0.12345678910111213141516..., and turned it into music. Then he searched it for patterns.
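The Champernowne digits are easy to generate and search. The sketch below is in Python rather than Ford's Clojure, and the digit-to-note mapping is an assumption made for illustration; the talk's actual mapping may well differ.

```python
from itertools import count, islice

def champernowne_digits():
    """Yield the decimal digits of 0.123456789101112... one at a time."""
    for n in count(1):
        for ch in str(n):
            yield int(ch)

# Assumed mapping from digit to note name, for illustration only.
NOTE_NAMES = ["C", "D", "E", "F", "G", "A", "B", "C'", "D'", "E'"]

def find_motif(motif: str, limit: int = 100_000) -> int:
    """Index of the first occurrence of a digit string in the first `limit` digits, or -1."""
    digits = "".join(str(d) for d in islice(champernowne_digits(), limit))
    return digits.find(motif)

melody = [NOTE_NAMES[d] for d in islice(champernowne_digits(), 12)]
print(melody)
print(find_motif("1011"))
```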
Ford found something that sounded an awful lot like "Blurred Lines", a pop hit by Robin Thicke in 2013, and played it for us. He cheekily noted that his Champernowne song infringes the copyright on Thicke's song, which is quite humorous given the controversial resemblance of Thicke's song to "Got to Give It Up", a Marvin Gaye tune from 1977. Of course, Ford's song is infinitely long, so it likely infringes the copyright of every song ever written! The good news for him is that it also subsumes every song to be written in the future, offering him the prospect of a steady income as an IP troll.
Even more than usual, my summary of Ford's talk cannot possibly do it justice, because he shows code and plays music! Let me echo what was a common refrain on Twitter immediately after his talk at StrangeLoop: Go watch this video. Seriously. You'll get to see him give a talk using only emacs and a pair of speakers, and hear all of the music, too. Then check out Ford's raw material. All of his references, music, and code are available on his GitHub site.
After that, check out his latest blog entry. More coolness.
But we see the core message as something else altogether: In pretty much any area of human endeavor, people have a tremendous capacity to improve their performance, as long as they train in the right way. If you practice something for a few hundred hours, you will almost certainly see great improvement ... but you have only scratched the surface. You can keep going and going and going, getting better and better and better. How much you improve is up to you.
... courtesy of Anders Ericsson himself, in a Salon piece adapted from his new book, Peak (written with Robert Pool). Ericsson is the author of the oft-cited paper at the source of the rule, which was made famous by Malcolm Gladwell.
I've seen this dynamic play out over the years for many students. They claimed not to be any good at math or programming, but then they decided to do the work. And they kept getting better. Some ended up with graduate degrees, and most ended up with surprising success in industry.
Looked at from one perspective, the so-called 10,000 Hour Rule is daunting. "Look how far I am from being really good at this..." Many people shrink in the face of this mountain, willing to settle for limits placed on them by their supposed talents. But, as my friend Richard Gabriel often says, talent doesn't determine how good you get, only how fast you get good. As I quoted from Art and Fear long ago, "talent is rarely distinguishable, over the long run, from perseverance and lots of hard work".
That's the most important lesson behind Ericsson's research. If we practice right, we will get better and, with even more of the right kind of training, we will keep getting better. Our limits usually lie much farther away than we think.
When I read this interview with Louis C.K. last week, the following response spoke to me as a college professor, not as a stand-up comedian:
Can you explain the difference, practically, between the stand-up you were doing at your peak and what you're doing now?
I think I'm a better comedian overall than I was back then, but back then I was better at performing. When you're that greased up onstage, you just have a higher comedy IQ. It's the ability to go on any stage in the country and be perfectly present and able to maneuver the set and have great timing. Some of it is being in physical shape. When you're under pressure or strain, you get dumb, you know? It's why I started working out in boxing gyms, because you watch a guy who's fighting, he's in a terribly arduous moment and he's making intelligent choices. So to me that's when you're 55 minutes deep into your sixth show of the week, in your fifth city of the week. You have to be able to be great right in that moment. You have to be, "You're not going to believe what I'm going to do next." The audience is tired, and you have to have more energy than anyone in the room. You have to be able to control the pace. At my show last night, I was talking to myself a little bit while my mouth was moving delivering material. I was thinking, You're going too fast. Cool it. You have plenty of time and loads ... to say.
It's funny how so many of us, doing so many different things, experience so many of the same ups and downs in our professions. With a few changes to the surface of this story, it sounds like something a college instructor might say ten years on. I've even reached a point where I can talk to myself in an analytical way during a class session or a presentation. I was never as good in 2004 as Louis was, but I see the same evolution in how I feel about my work in the classroom.
One thing I haven't tried is boxing. (Perhaps that is one of my more intelligent choices.) I have had to make some tough decisions under grueling conditions while running marathons, but those tend to unfold at a slower pace than in the ring. "Everybody has a plan until they get punched in the face."
Like Louis, I'm always trying to get better at teaching in first gear. He is probably more natural than I am in an amped-up state, too. Both are a challenge for me.
In this Dr. Dobbs interview, Alan Kay reveals how he became interested in education as a life's work: as a teenager, he read the issue of Life magazine that introduced Margaret Bourke-White's photos from Buchenwald. Kay says:
That probably was the turning point that changed my entire attitude toward life. It was responsible for getting me interested in education. My interest in education is unglamorous. I don't have an enormous desire to help children, but I have an enormous desire to create better adults.
This desire has caused Kay to explore how children think and learn more deeply than most people do. Our greatest desires sometimes lead us down paths we would not otherwise go.
For some reason, Kay's comments on his enduring involvement in education made me think of this passage from a profile of Ludwig Wittgenstein in the Paris Review:
We all struggle to form a self. Great teaching, Wittgenstein reminds us, involves taking this struggle and engaging in it with others; his whole life was one great such struggle. In working with poor children, he wanted to transform himself, and them.
Wittgenstein wanted to create a better adult of himself and so engaged for six years in "the struggle to form a self" with elementary school students. Let's hope that the students in his charge grew into better adults as well. As Kay says later in the same interview, "Education is a double-edged sword. You have to start where people are, but if you stay there, you're not educating."
In Prime After Prime (great title!), Brian Hayes boils down into two sentences the fundamental challenge that faces people doing research:
What I find most surprising about the discovery is that no one noticed these patterns long ago. They are certainly conspicuous enough once you know how to look for them.
It would be so much easier to form hypotheses and run tests if interesting hypotheses were easier to find.
Once found, though, we can all see patterns. When they can be computed, we can all write programs to generate them! After reading a paper about the strong correlations among pairs of consecutive prime numbers, Hayes wrote a bunch of programs to visualize the patterns and to see what other patterns he might find. A lot of mathematicians did the same.
Evidently that was a common reaction. Evelyn Lamb, writing in Nature, quotes Soundararajan: "Every single person we've told this ends up writing their own computer program to check it for themselves."
Being able to program means being able to experiment with all kinds of phenomena, even those that seemingly took genius to discover in the first place.
Actually, though, Hayes's article gives a tour of the kind of thinking we all can do that can yield new insights. Once he had implemented some basic ideas from the research paper, he let his imagination roam. He tried different moduli. He visualized the data using heat maps. When he noticed some symmetries in his tables, he applied a cyclic shift to the data (which he termed a "twist") to see if some patterns were easier to identify in the new form.
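Anyone can try this kind of exploration in a few lines. The Python sketch below is not Hayes's code; it counts how often one residue follows another among consecutive primes for a chosen modulus. The twist function is only one plausible reading of a cyclic shift (re-indexing each pair by the difference of its residues), offered as an assumption.

```python
from collections import Counter

def primes_below(n: int) -> list[int]:
    """Sieve of Eratosthenes; assumes n >= 2."""
    sieve = bytearray([1]) * n
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            # Zero out every multiple of i starting at i*i.
            sieve[i * i::i] = bytearray(len(range(i * i, n, i)))
    return [i for i, flag in enumerate(sieve) if flag]

def pair_counts(primes: list[int], modulus: int) -> Counter:
    """Count pairs (p mod m, q mod m) over consecutive primes p, q."""
    return Counter((p % modulus, q % modulus) for p, q in zip(primes, primes[1:]))

def twist(counts: Counter, modulus: int) -> Counter:
    """Shift each row cyclically so that column j holds pairs with (b - a) mod m == j."""
    twisted = Counter()
    for (a, b), n in counts.items():
        twisted[(a, (b - a) % modulus)] += n
    return twisted

counts = pair_counts(primes_below(1000), 10)
print(counts[(1, 1)], counts[(1, 3)], counts[(1, 7)], counts[(1, 9)])
```

Changing the modulus or the range is a one-argument edit, which is what makes it so easy to try different moduli and views of the data.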
Being curious and asking questions like these is one of the ways that researchers manage to stumble upon new patterns that no one has noticed before. Genius may be one way to make great discoveries, but it's not a reliable one for those of us who aren't geniuses. Exploring variations on a theme is a tactic we mortals can use.
Some of the heat maps that Hayes generates are quite beautiful. The image above is a heat map of the normalized counts of consecutive eight-digit primes, taken modulo 31. He has more fun making images of his twists and with other kinds of primes. I recommend reading the entire article, for its math, for its art, and as an implicit narration of how a computational scientist approaches a cool result.
Yesterday, William Stein's talk about the origins of SageMath spread rapidly through certain neighborhoods of Twitter. It is a thorough and somewhat depressing discussion of how hard it is to develop open source software within an academic setting. Writing code is not part of the tenure reward system or the system for awarding grants. Stein has tenure at the University of Washington but has decided that he has to start a company, SageMath, and work for it full-time in order to create a viable open source alternative to the "four 'Ma's": Mathematica, Matlab, Maple, and Magma.
Stein's talk reminded me of something I read earlier this year, from a talk by Matthew Butterick:
"Information wants to be expensive, because it's so valuable ... On the other hand, information wants to be free, because the cost of getting it out is getting lower ... So you have these two fighting against each other."
This was said by a guy named Stewart Brand, way back in 1984.
So what's the message here? Information wants to be free? No, that's not the message. The message is that there are two forces in tension. And the challenge is how to balance the forces.
Proponents of open source software -- and I count myself one -- are often so glib with the mantra "information wants to be free" that we forget about the opposing force. Wolfram et al. have capitalized quite effectively on information's desire to be expensive. This force has an economic power that can overwhelm purely communitarian efforts in many contexts, to the detriment of open work. The challenge is figuring out how to balance the forces.
In my mind, Mozilla stands out as the modern paradigm of seeking a way to balance the forces between free and expensive, creating a non-profit shell on top of a commercial foundation. It also seeks ways to involve academics in the process. It will be interesting to see whether this model is sustainable.
Oh, and Stewart Brand. He pointed out this tension thirty years ago. I recently recommended How Buildings Learn to my wife and thought I should look back at the copious notes I took when I read it twenty years ago. But I should read the book again myself; I hope I've changed enough since then that reading it anew brings new ideas to mind.
A few years ago, I was giving a short presentation about one of our new programs to people from the community and campus. One of the prompts for the talk was how this program would contribute to teaching "critical thinking skills" across the curriculum. I made a mild joke about the idea, wondering aloud which programs on campus taught uncritical thinking skills. Only later did I learn that our new provost, who was in the audience, had just announced a major focus on critical thinking. Fortunately, our new provost had a sense of humor.
I don't believe that we can teach critical thinking in any useful way outside the context of a particular discipline. I do believe, though, that we can teach it as a part of any discipline -- not just in the humanities or liberal arts, which in too many people's minds don't include the sciences. Studies show that these skills don't tend to transfer when we move to a different discipline, but I am convinced that people who learn to think deeply in one discipline are better prepared to learn another discipline than someone who is learning deeply for the first time.
In a recent essay for Inside Higher Ed, John Schlueter offers a new analogy for thinking about critical thinking:
When it comes to thinking skills, it would be much more productive if we stop thinking "transfer" and start thinking "overlap". That is, once thinking skills become more explicitly taught, especially in general education classes, both professors and students will notice how thinking in the context of one domain (say, economics) overlaps with the kind of thinking processes at work in another (biology).
The idea of overlap fits nicely with how I think about these skills. Making thinking skills more explicit in our instruction might enable students to notice intersections and differences across the disciplines they study. That awareness may help them to internalize general strategies that are useful across disciplines, for times when they are in unknown waters, and be aware of possible points of failure in their own thinking.
I'm not sure if this analogy is any easier to operationalize or test than the notion of transfer, but it does give me a different way to think about thinking.
In an interesting article about words and concepts, Elisa Gabbert repeats a familiar sentiment about teaching:
... the physicist Richard Feynman reportedly said, after being asked to prepare a freshman lecture on why spin-1/2 particles obey Fermi-Dirac statistics, "I couldn't reduce it to the freshman level. That means we really don't understand it."
When I read this, my inner Sheldon Cooper thought, "With the data at hand, you really can't draw that conclusion. All you can say with absolute certainty is that you don't understand it."
Actually, I empathize deeply with Feynman's sentiment, which has been attributed to many famous people and stated in one form or other by many people who have tried to teach a challenging topic to others. Most teachers have had the experience of trying to explain an idea they think they know cold, only to find themselves stumbling over concepts or relationships that seem so obvious in their expert mind. I experience this feeling almost every semester. When I was a young teacher, such experiences disconcerted me. I soon learned that they were opportunities to understand the world better.
But I think that, at a logical level, people sometimes draw an invalid conclusion from statements of the sort Feynman reportedly made. It's certainly true that if we don't really understand a complex subject, then we probably won't be able to reduce it to a level appropriate for first-year students. But even if we do understand it really well, we still may have difficulty explaining it to beginners.
Teaching involves two parties: the teacher and the learner. Effective teaching requires being able to communicate new ideas in a way that connects with what the learner knows and can do.
To be effective teachers, we need two kinds of knowledge:
This latter understanding comes at two levels. First, we might know a specific individual well and be able to connect to his or her own personal experiences and knowledge. Second, we might understand a group, such as freshman CS students, based on some common background and maturity level.
Teaching individuals one-on-one can be most effective, but it takes a lot of time and doesn't scale well. As a result, we often find ourselves teaching a group of people all at once or writing for a mass audience. Teaching a class means being able to communicate new ideas to a group of students in a way that prepares most or all of them to learn on their own after they leave the classroom and begin to do their individual work.
Most people who try to teach find out that this is a lot harder than it looks. Over time, we begin to learn what a generic freshman CS student knows and is like. We build up a cache of stories for reaching them in different ways. We encounter pedagogical patterns of effective learning and learn ways to implement them in our teaching. We also begin to learn techniques for working with students individually so that, in our office after class, we can drop down from generic teaching to more intimate, one-on-one instruction.
If you want to find out simultaneously how well your students are understanding what you are teaching and how well you understand what you are teaching, let them ask questions. I am often amazed at the questions students ask, and equally amazed at how hard they can be to answer well. On truly glorious days, I surprise myself (and them!) with an answer or story or example that meets their needs perfectly.
However well I understand a topic, it always takes me time to figure out how to communicate effectively with a new audience. Once I understood that this was natural, it allowed me to take some of the pressure to be perfect off myself and get down to the business of learning how to teach and, often, learning computer science at a deeper level.
So, if we can't reduce some topic to the freshman level, it may well mean that we don't really understand it. But it may also mean that we don't yet understand our audience well enough. Figuring out which is true in a given case is yet another challenge that every teacher faces.
Yesterday, I wrote me some Java. It was fun.
A few days ago, I started wondering if there was something unique I could send my younger daughter for her birthday today. My daughters and I were all born in presidential election years, which is a neat little coincidence. This year's election is special for the birthday girl: it is her first opportunity to vote for the president. She has participated in the process throughout, which has seen both America's most vibrant campaign for a progressive candidate in at least forty years and the first nomination of a woman by a major party. Both of these are important to her.
In the spirit of programming and presidential politics, I decided to write a computer program to convert images into the style of Shepard Fairey's iconic Obama "Hope" poster and then use it to create a few images for her.
I dusted off Dr. Java and fired up some code I wrote when I taught media computation in our intro course many years ago. It had been a long time since I had written any Java at all, but it came back just like riding a bike. More than a decade of writing code in a language burns some pretty deep grooves in the mind.
I found RGB values to simulate the four colors in Fairey's poster in an old message to the mediacomp mailing list:
Color darkBlue  = new Color(0, 51, 76);
Color lightBlue = new Color(112, 150, 158);
Color red       = new Color(217, 26, 33);
Color yellow    = new Color(252, 227, 166);
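For readers who want to play along, here is one simple way to put four such colors to work, sketched in Python rather than Java. It is an assumption, not a record of the code I ran: it maps each pixel to a color by brightness, the cut points are illustrative guesses, and the image is just a list of rows of (r, g, b) tuples.

```python
DARK_BLUE = (0, 51, 76)
LIGHT_BLUE = (112, 150, 158)
RED = (217, 26, 33)
YELLOW = (252, 227, 166)

def luminance(pixel):
    """Weighted brightness of an (r, g, b) pixel on a 0-255 scale."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b

def hope_color(pixel, cuts=(64, 128, 192)):
    """Pick one of the four poster colors, darkest to lightest, by brightness band."""
    level = luminance(pixel)
    if level < cuts[0]:
        return DARK_BLUE
    if level < cuts[1]:
        return RED
    if level < cuts[2]:
        return LIGHT_BLUE
    return YELLOW

def posterize(image):
    """Return a new image with every pixel replaced by its poster color."""
    return [[hope_color(pixel) for pixel in row] for row in image]

tiny = [[(0, 0, 0), (100, 100, 100)], [(150, 150, 150), (255, 255, 255)]]
print(posterize(tiny))
```

Moving the cut points changes how much of each color appears, so they are the first thing to experiment with.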
Then came some experimentation...
I liked the outputs of this third effort quite a bit, at least for the photos I gave it as input. Two of them worked out especially well. With a little doctoring in Photoshop, they would have an even more coherent feel to them, like an artist might produce with a keener eye. Pretty good results for a few fun minutes of programming.
Now, let's hope my daughter likes them. I don't think she's ever received a computer-generated present before, at least not generated by a program her dad wrote!
The images I created were gifts to her, so I'll not share them here. But if you've read this far, you deserve a little something, so I give you these:
Now that is change we can all believe in.
Functions of Symbolic Expressions
and Their Computation by Machine,
courtesy of Classic Programmer Paintings
Earlier this week, Alan Kay was answering questions on Hacker News and mentioned Lisp 1.5:
This got deeper if one was aware of how Lisp 1.5 had been implemented with the possibility of late bound parameter evaluation ...
Kay mentions Lisp, and especially Lisp 1.5, often whenever he is talking about the great ideas of computing. He sometimes likens McCarthy's universal Lisp interpreter to Maxwell's equations in physics -- a small, simple set of equations that capture a huge amount of understanding and enable a new way of thinking. Late-bound evaluation of parameters is one of the neat ideas you can find embedded in that code.
The idea of a universal Lisp interpreter is pretty simple: McCarthy defined the features of Lisp in terms of the language features themselves. The interpreter consists of two main procedures:
These procedures recurse mutually to evaluate a program.
This is one of the most beautiful ideas in computing, one that we take for granted today.
The syntax and semantics of Lisp programs are so sparse and so uniform that McCarthy's universal Lisp interpreter consisted of about one page of Lisp code. Here that page is: Page 13 of the Lisp 1.5 Programmer's Manual, published in 1962.
You may see this image passed around Twitter and the web these days whenever Lisp 1.5 is mentioned. But the universal Lisp interpreter is a program. Why settle for a JPG image?
While preparing for the final week of my programming languages course this spring, I sat down and implemented the Lisp interpreter on Page 13 of the Lisp 1.5 manual in universal-lisp-interpreter.rkt, using Racket.
I tried to reproduce the main procedures from the manual as faithfully as I could. You see the main two functions underlying McCarthy's idea: "evaluate an expression" and "apply a function to its arguments". The program assumes the existence of only a few primitive forms from Racket:
't means true, and nil means both false and the empty list. My Racket implementation uses #t and #f internally, but they do not appear in the code for the interpreter.
Notice that this interpreter implements all of the language features that it uses: the same five primitive functions, the same two special forms, and lambda. It also defines label, a way to create recursive functions. (label offers a nice contrast to the ways we talk about implementing recursive functions in my course.)
The interpreter uses a few helper functions, which I also define as in the manual. evcon evaluates a cond expression, and evlis evaluates a list of arguments. assoc looks up the value for a key in an "association list", and pairlis extends an existing association list with new key/value pairs. (In my course, assoc and pairlis correspond to basic operations on a finite function, which we use to implement environments.)
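To show the shape of the idea without Racket at hand, here is a loose Python sketch of the same structure. It is not my Racket file and not a faithful transcription of Page 13: S-expressions are nested Python lists, atoms are strings, nil is the empty list, an association list is a list of (key, value) tuples, and the names lisp_eval and lisp_apply are chosen to avoid clashing with Python's own eval.

```python
NIL, T = [], "t"

def assoc(key, alist):
    """Look up the value bound to a key in an association list."""
    for k, v in alist:
        if k == key:
            return v
    raise KeyError(key)

def pairlis(keys, values, alist):
    """Extend an association list with new key/value pairs."""
    return list(zip(keys, values)) + alist

def evcon(clauses, env):
    """Evaluate a cond: return the result of the first clause whose test is true."""
    for test, result in clauses:
        if lisp_eval(test, env) != NIL:
            return lisp_eval(result, env)
    return NIL

def evlis(exprs, env):
    """Evaluate a list of argument expressions."""
    return [lisp_eval(e, env) for e in exprs]

def lisp_eval(expr, env):
    if isinstance(expr, str):              # an atom: look up its value
        return assoc(expr, env)
    head = expr[0]
    if head == "quote":
        return expr[1]
    if head == "cond":
        return evcon(expr[1:], env)
    return lisp_apply(head, evlis(expr[1:], env), env)

def lisp_apply(fn, args, env):
    if isinstance(fn, str):
        if fn == "car":
            return args[0][0]
        if fn == "cdr":
            return args[0][1:]
        if fn == "cons":
            return [args[0]] + args[1]
        if fn == "atom":
            return T if isinstance(args[0], str) or args[0] == NIL else NIL
        if fn == "eq":
            return T if args[0] == args[1] else NIL
        return lisp_apply(lisp_eval(fn, env), args, env)   # a named function
    if fn[0] == "lambda":
        return lisp_eval(fn[2], pairlis(fn[1], args, env))
    if fn[0] == "label":                   # bind the name, then apply the body
        return lisp_apply(fn[2], args, [(fn[1], fn[2])] + env)
    raise ValueError(f"cannot apply {fn!r}")

ENV = [("t", T), ("nil", NIL)]

# A recursive function defined with label: the last element of a list.
LAST = ["label", "last", ["lambda", ["xs"],
        ["cond", [["eq", ["cdr", "xs"], ["quote", []]], ["car", "xs"]],
                 ["t", ["last", ["cdr", "xs"]]]]]]

print(lisp_eval([LAST, ["quote", ["a", "b", "c"]]], ENV))   # prints c
```

The two main functions call each other, and the helpers mirror evcon, evlis, assoc, and pairlis as described above. One liberty to note: eq here compares any two values with Python's ==, which is more permissive than comparing atoms only.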
I enjoyed walking through this code briefly with my students. After reading this code, I think they appreciated anew the importance of meaningful identifiers...
The code works. Open it up in Racket and play with a Lisp from the dawn of time!
It really is remarkable how much can be built out of so little. I sometimes think of the components of this program as the basic particles out of which all computation is built, akin to an atomic theory of matter. Out of these few primitives, all programs can be built.
In a Startup School interview earlier this year, Paul Graham reminds software developers of an uncomfortable truth:
Still, to this day, one of the big things programmers do not get is how traumatized users have been by bad and hard-to-use software. The default assumption of users is, "This is going to be really painful, and in the end, it's not going to work."
I have encountered this trauma even more since beginning to work with administrators on campus a decade ago. "Campus solutions" track everything from enrollments to space usage. So-called "business-to-business software" integrates purchasing with bookkeeping. Every now and then the university buys and deploys a new system, to manage hiring, say, or faculty travel. In almost every case, interacting with the software is painful for the user, and around the edges it never quite seems to fit what most users really want.
When administrators or faculty relate their latest software-driven pain, I try to empathize while also bringing a little perspective to their concerns. These systems address large issues, and trying to integrate them into a coherent whole is a very real challenge, especially for an understaffed group of programmers. Sometimes, the systems are working exactly as they should to implement an inconvenient policy. Unfortunately, users don't see the policy on a daily basis; they see the inconvenient and occasionally incomplete software that implements it.
Yet there are days when even I have to complain out loud. Using software can be painful.
Today, though, I offer a story of nascent redemption.
After reviewing some enrollment data earlier this spring, my dean apologized in advance for any errors he had made in the reports he sent to the department heads. Before he can analyze the data, he or one of the assistant deans has to spend many minutes scavenging through spreadsheets to eliminate rows that are not part of the review. They do this several times a semester, which adds up to hours of wasted time in the dean's office. The process is, of course, tedious and error-prone.
I'm a programmer. My first thought was, "A program can do this filtering almost instantaneously and never make an error."
In fact, a few years ago, I wrote a simple Ruby program to do just this sort of filtering for me, for a different purpose. I told the dean that I would be happy to adapt it for use in his office to process data for all the departments in the college. My primary goal was to help the dean; my ulterior motive was self-improvement. On top of that, this was a chance to put my money where my mouth is. I keep telling people that a few simple programs can make our lives better, and now I could provide a concrete example.
Last week, I whipped up a new Python script. This week, I demoed it to the dean and an assistant dean. The dean's first response was, "Wow, this will help us a lot." The rest of the conversation focused on ways that the program could help them even more. Like all users, once they saw what was possible, they knew even better what they really wanted.
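The heart of such a script can be tiny. The sketch below is illustrative only and is not the script I delivered: the column names, the sample rows, and the rule for what belongs in the review are all hypothetical.

```python
import csv
import io

# Hypothetical export from the enrollment system.
SAMPLE = """Dept,Course,Status,Enrollment
CS,1510,Active,32
CS,1520,Cancelled,0
MATH,1420,Active,28
ART,1011,Active,19
"""

REVIEW_DEPTS = {"CS", "MATH"}   # assumed: departments covered by the review

def in_review(row: dict) -> bool:
    """Keep only active sections in the departments under review (assumed rule)."""
    return row["Status"] == "Active" and row["Dept"] in REVIEW_DEPTS

def filter_rows(source, keep=in_review):
    """Read CSV rows from a file-like object and return those that pass `keep`."""
    return [row for row in csv.DictReader(source) if keep(row)]

for row in filter_rows(io.StringIO(SAMPLE)):
    print(row["Dept"], row["Course"], row["Enrollment"])
```

Keeping the rule in its own small function means that when the users decide what they really want, only in_review has to change.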
I'll make a few changes and deliver a more complete program soon. I'll also help the users as they put it to work and run into any bugs that remain. It's been fun. I hope that this humble script is an antidote, however small, to the common pain of software that is hard to use and not as helpful as it should be. Many simple problems can be solved by simple programs.