May 19, 2021 3:07 PM

Today's Thinking Prompts, in Tweets

On teaching, via Robert Talbert:

Look at the course you teach most often. If you had the power to remove one significant topic from that course, what would it be, and why?

I have a high degree of autonomy in most of the courses I teach, so power isn't the limiting factor for me. Time is a challenge to making big changes, of course. Gumption is probably what I need most right now. Summer is a great time for me to think about this, both for my compiler course this fall and programming languages next spring.

On research, via Kris Micinski:

i remember back to Dana Scott's lecture on the history of the lambda calculus where he says, "If Turing were alive today, I don't know what he'd be doing, but it wouldn't be recursive function theory." I think about that a lot.

Now I am, too. Seriously. I'm no Turing, but I have a few years left and some energy to put into something that matters. Doing so will require some gumption to make other changes in my work life first. I am reaching a tipping point.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

April 30, 2021 1:56 PM

Good News at the End of a Long Year, v2.0

A couple of weeks ago, a former student emailed me after many years. Felix immigrated to the US from the Sudan back in the 1990s and wound up at my university, where he studied computer science. While in our program, he took a course or two with me, and I supervised his undergrad research project. He graduated and got busy with life, and we lost touch.

He emailed to let me know that he was about to defend his Ph.D. dissertation, titled "Efficient Reconstruction and Proofreading of Neural Circuits", at Harvard. After graduating from UNI, he programmed at DreamWorks Interactive and EA Sports, before going to grad school and working to "unpack neuroscience datasets that are almost too massive to wrap one's mind around". He defended his dissertation successfully this week.

Congratulations, Dr. Gonda!

Felix wrote initially to ask permission to acknowledge me in his dissertation and defense. As I told him, it is an honor to be remembered so fondly after so many years. People often talk about how teachers affect their students' futures in ways that are hard to see. This is one of those moments for me. Arriving at the end of what has been a challenging semester in the classroom for me, Felix's note boosted my spirits and energized me a bit going into the summer.

If you'd like to learn more about Felix and his research, here is his personal webpage. The Harvard School of Engineering also has a profile of Felix that shows you what a neat person he is.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

February 27, 2021 11:12 AM

All The Words

In a Paris Review interview, Fran Lebowitz joked about the challenge of writing:

Every time I sit at my desk, I look at my dictionary, a Webster's Second Unabridged with nine million words in it and think, All the words I need are in there; they're just in the wrong order.

Unfortunately, thinks this computer scientist, writing is a computationally more intense task than simply putting the words in the right order. We have to sample with replacement.

Computational complexity is the reason we can't have nice things.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

January 08, 2021 2:12 PM

My Experience Storing an Entire Course Directory in Git

Last summer, I tried something new: I stored the entire directory of materials for my database course in Git. This included all of my code, the course website, and everything else. It worked well.

The idea came from a post or tweet many years ago by Martin Fowler who, if I recall correctly, had put his entire home directory under version control. It sounded like the potential advantages might be worth the cost, so I made a note to try it myself sometime. I wasn't quite ready last summer to go all the way, so I took a baby step by creating my new course directory as a git repo and growing it file by file.

My context is pretty simple. I do almost all of my work on a personal MacBook Pro or a university iMac in my office. My main challenge is to keep my files in sync. When I make changes to a small number of files, or when the stakes of a missing file are low, copying files by hand works fine, with low overhead and no tooling necessary.

When I make a lot of changes in a short period of time, however, as I sometimes do when writing code or building my website, doing things by hand becomes more work. And the stakes of losing code or web pages are a lot higher than losing track of some planning notes or code I've been noodling with. To solve this problem, for many years I have been using rsync and a couple of simple shell scripts to manage code directories and my course web sites.

So, the primary goal for using Git in a new workflow was to replace rsync. Not being a Git guru, as many of you are, I figured that this would also force me to live with git more often and perhaps expand my tool set of handy commands.

My workflow for the semester was quite simple. When I worked in the office, there were four steps:

  1. git merge laptop
  2. [ do some work ]
  3. git commit
  4. git push

On my laptop, the opening and closing git commands changed:

  1. git pull origin main
  2. [ do some work ]
  3. git commit
  4. git push origin laptop

My work on a course is usually pretty straightforward. The most common task is to create files and record information with commit. Every once in a while, I had to back up a step with checkout.

You may say, "But you are not using git for version control!" You would be correct. The few times I checked out an older version of a file, it was usually to eliminate a spurious conflict, say, a .DS_Store file that was out of sync. Locally, I don't need a lot of version control, but using Git this way was a form of distributed version control, making sure that, wherever I was working, I had the latest version of every file.

I think this is a perfectly valid way to use Git. In some ways, Git is the new Unix. It provided me with a distributed filesystem and a file backup system all in one. The git commands ran effectively as fast as their Unix counterparts. My repo was not very much bigger than the directory would have been on its own, and I always had a personal copy of the entire repo with me wherever I went, even if I had to use another computer.

Before I started, several people reminded me that Git doesn't always work well with large images and binaries. That didn't turn out to be much of a problem for me. I had a couple of each in the repo, but they were not large and never changed. I never noticed a performance hit.

The most annoying hiccup all semester was working with OS X's .DS_Store files, which record screen layout information for OS X. I like to keep my windows looking neat and occasionally reorganize a directory layout to reflect what I'm doing. Unfortunately, OS X seems to update these files at odd times, after I've closed a window and pushed changes. Suddenly the two repos would be out of sync only because one or more .DS_Store files had changed after the fact. The momentary obstacle was quickly eliminated with a checkout or two before merging. Perhaps I should have left the .DS_Stores untracked...

All in all, I was pretty happy with the experience. I used more git, more often, than ever before and thus am now a bit more fluent than I was. (I still avoid the hairier corners of the tool, as all right-thinking people do whenever possible.) Even more, the repository contains a complete record of my work for the semester, false starts included, with occasional ruminations about troubles with code or lecture notes in my commit messages. I had a little fun after the semester ended looking back over some of those messages and making note of particular pain points.

The experiment went well enough that I plan to track my spring course in Git, too. This will be a bigger test. I've been teaching programming languages for many years and have a large directory of files, both current and archival. Not only are there more files, there are several binaries and a few larger images. I'm trying to decide if I should put the entire folder into git all at once upfront or start with an empty folder à la last semester and add files as I want or need them. The latter would be more work at early stages of development but might be a good way to clear out the clutter that has built up over twenty years.

If you have any advice on that choice, or any other, please let me know by email or on Twitter. You all have taught me a lot over the years. I appreciate it.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 30, 2020 3:38 PM

That's a Shame

In the middle of an old post about computing an "impossible" integral, John Cook says:

In the artificial world of the calculus classroom, everything works out nicely. And this is a shame.

When I was a student, I probably took comfort in the fact that everything was supposed to work out nicely on the homework we did. There *was* a solution; I just had to find the pattern, or the key that turned the lock. I suspect that I was a good student largely because I was good at finding the patterns, the keys.

It wasn't until I got to grad school that things really changed, and even then course work was typically organized pretty neatly. Research in the lab was very different, of course, and that's where my old skills no longer served me so well.

In university programs in computer science, where many people first learn how to develop software, things tend to work out nicely. That is a shame, too. But it's a tough problem to solve.

In most courses, in particular introductory courses, we create assignments that have "closed form" solutions, because we want students to practice a specific skill or learn a particular concept. Having a fixed target can be useful in achieving the desired outcomes, especially if we want to help students build confidence in their abilities.

It's important, though, that we eventually take off the training wheels and expose students to messier problems. That's where they have an opportunity to build other important skills they need for solving problems outside the classroom, which aren't designed by a benevolent instructor to follow a pattern. As Cook says, neat problems can create a false impression that every problem has a simple solution.

Students who go on to use calculus for anything more than artificial homework problems may incorrectly assume they've done something wrong when they encounter an integral in the wild.

CS students need experience writing programs that solve messy problems. In more advanced courses, my colleagues and I all try to extend students' ability to solve less neatly-designed problems, with mixed results.

It's possible to design a coherent curriculum that exposes students to an increasingly messy set of problems, but I don't think many universities do this. One big problem is that doing so requires coordination across many courses, each of which has its own specific content outcomes. There's never enough time, it seems, to teach everything about, say, AI or databases, in the fifteen weeks available. It's easier to be sure that we cover another concept than it is to be sure students take a reliable step along the path from being able to solve elementary problems to being able to solve the problems they'll find in the wild.

I face this set of competing forces every semester and do my best to strike a balance. It's never easy.

Courses that involve large systems projects are one place where students in my program have a chance to work on a real problem: writing a compiler, an embedded real-time system, or an AI-based system. These courses have closed form solutions of sorts, but the scale and complexity of the problems require students to do more than just apply formulas or find simple patterns.

Many students thrive in these settings. "Finally," they say, "this is a problem worth working on." These students will be fine when they graduate. Other students struggle when they have to do battle for the first time with an unruly language grammar or a set of fussy physical sensors. One of my challenges in my project course is to help this group of students move further along the path from "student doing homework" to "professional solving problems".

That would be a lot easier to do if we more reliably helped students take small steps along that path in their preceding courses. But that, as I've said, is difficult.

This post describes a problem in curriculum design without offering any solutions. I will think more about how I try to balance the forces between neat and messy in my courses, and then share some concrete ideas. If you have any approaches that have worked for you, or suggestions based on your experiences as a student, please email me or send me a message on Twitter. I'd love to learn how to do this better.

I've written a number of posts over the years that circle around this problem in curriculum and instruction. Here are three:

I'm re-reading these to see if past me has any ideas for present-day me. Perhaps you will find them interesting, too.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

November 28, 2020 11:04 AM

How Might A Program Help Me Solve This Problem?

Note: The following comes from the bottom of my previous post. It gets buried there beneath a lot of code and thinking out loud, but it's a message that stands on its own.


I demo'ed a variation of my database-driven passphrase generator to my students as we closed the course last week. It let me wrap up my time with them by reminding them that they are developing skills that can change how they see every problem they encounter in the future.

Knowing how to write programs gives you a new power. Whenever you encounter a problem, you can ask yourself, "How might a program help me solve this?"

The same is true for many more specialized CS skills. People who know how to create a language and implement an interpreter can ask themselves, "How might a language help me solve this problem?" That's one of the outcomes, I hope, of our course in programming languages.

The same is true for databases, too. Whenever you encounter a problem, you can ask yourself, "Can a database help me solve this?"

Computer science students can use the tools they learn each semester to represent and interpret information. That's a power they can use to solve many problems. It's easy to lose sight of that fact during a busy semester and worth reflecting on in calmer moments.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 25, 2020 12:15 PM

Three Bears-ing a Password Generator in SQLite

A long semester -- shorter in time than usual by more than a week, but longer psychologically than any in a long time -- is coming to an end. Teaching databases for the first time was a lot of fun, though the daily grind of preparing so much new material in real time wore me down. Fortunately, I like to program enough that there were moments of fun scattered throughout the semester as I played with SQLite for the first time.

In the spirit of the Three Bears pattern, I looked for opportunities all semester to use SQLite to solve a problem. When I read about the Diceware technique for generating passphrases, I found one.

Diceware is a technique for generating passphrases using dice to select words from the "Diceware Word List", in which each word is paired with a five digit number. All of the digits are between one and six, so five dice rolls are all you need to select a word from the list. Choose a number of words for the passphrase, roll your dice, and select your words.
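To make the lookup concrete, here is a minimal Python sketch of the technique as described above. It is not part of the SQLite solution. The function names and the placeholder word list are hypothetical, standing in for the real Diceware Word List, and `secrets` is used only as a stand-in for physical dice.

```python
import itertools
import secrets

DIE_FACES = "123456"

def roll_key(num_dice=5):
    """Simulate rolling num_dice dice; return the faces as a string such as '11113'."""
    return "".join(secrets.choice(DIE_FACES) for _ in range(num_dice))

def make_passphrase(wordlist, num_words=3):
    """Look up one word per simulated five-dice roll and join the words with spaces."""
    return " ".join(wordlist[roll_key()] for _ in range(num_words))

# Hypothetical stand-in for the real list: every possible roll maps to a
# placeholder word, so the sketch runs without the Diceware file.
placeholder_list = {
    "".join(faces): "word" + "".join(faces)
    for faces in itertools.product(DIE_FACES, repeat=5)
}

print(len(placeholder_list))   # number of distinct five-dice rolls
print(make_passphrase(placeholder_list))
```

With the real list loaded into a dictionary keyed on the five-digit roll, the same two functions would select actual words.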

The Diceware Word List is a tab-separated file of dice rolls and words. SQLite can import TSV data directly into a table, so I almost have a database. I had to preprocess the file twice to make it importable. First, the file is wrapped as a PGP signed message, so I stripped the header and footer by hand, to create diceware-wordlist.txt.

Second, some of the words in this list contain quote characters. Like many applications, SQLite struggles with CSV and TSV files that contain embedded quote characters. There may be some way to configure it to handle these files gracefully, but I didn't bother looking for one. I just replaced the ' and " characters with _ and __, respectively:

    cat diceware-wordlist.txt \
      | sed "s/'/_/g"         \
      | sed 's/"/__/g'        \
      > wordlist.txt

Now the file is ready to import:

    sqlite> CREATE TABLE WordList(
       ...>    diceroll CHAR(5),
       ...>    word VARCHAR(30),
       ...>    PRIMARY KEY(diceroll)
       ...>    );

    sqlite> .mode tabs
    sqlite> .import 'wordlist.txt' WordList

    sqlite> SELECT * FROM WordList
       ...> WHERE diceroll = '11113';
    11113   a_s

That's one of the words that used to contain an apostrophe.

So, I have a dice roll/word table keyed on the dice roll. Now I want to choose words at random from the table. To do that, I needed a couple of SQL features we had not used in class: random numbers and string concatenation. The random() function returns a big integer. A quick web search showed me this code to generate a random base-10 digit:

    SELECT abs(random())%10
    FROM (SELECT 1);

which is easy to turn into a random die roll:

    SELECT 1+abs(random())%6
    FROM (SELECT 1);
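One way to convince yourself that the expression stays within one to six is to run it many times and tally the results. The sketch below does that from Python using the standard sqlite3 module and a throwaway in-memory database. The helper name `roll_die` is made up for illustration.

```python
import sqlite3

# A throwaway in-memory database; no tables are needed for this query.
conn = sqlite3.connect(":memory:")

def roll_die(connection):
    """Evaluate the die-roll query once and return the integer it produces."""
    row = connection.execute(
        "SELECT 1+abs(random())%6 FROM (SELECT 1)"
    ).fetchone()
    return row[0]

rolls = [roll_die(conn) for _ in range(1000)]

# Tally how often each face came up.
counts = {face: rolls.count(face) for face in range(1, 7)}
print(counts)
```

Every key in the tally should have a count, and no value outside one to six should ever appear.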

I need to evaluate this query repeatedly, so I created a view that wraps the code in what acts, effectively, as a function:

    sqlite> CREATE VIEW RandomDie AS
       ...>   SELECT 1+abs(random())%6 AS n
       ...>   FROM (SELECT 1);

Aliasing the random value as 'n' is important because I need to string together a five-roll sequence. SQL's concatenation operator helps there:

    SELECT 'Eugene' || ' ' || 'Wallingford';

I can use the operator to generate a five-character dice roll by selecting from the view five times...

    sqlite> SELECT n||n||n||n||n FROM RandomDie;
    sqlite> SELECT n||n||n||n||n FROM RandomDie;

... and then use that phrase to select random words from the list:

    sqlite> SELECT word FROM WordList
       ...> WHERE diceroll =
       ...>         (SELECT n||n||n||n||n FROM RandomDie);

Hurray! Diceware defaults to three-word passphrases, so I need to do this three times and concatenate.

This won't work...

    sqlite> SELECT word, word, word FROM WordList
       ...> WHERE diceroll =
       ...>         (SELECT n||n||n||n||n FROM RandomDie);

... because the dice roll is computed only once. A view can help us here, too:

    sqlite> CREATE VIEW OneRoll AS
       ...>   SELECT word FROM WordList
       ...>   WHERE diceroll =
       ...>           (SELECT n||n||n||n||n FROM RandomDie);

OneRoll acts like a table that returns a random word:

    sqlite> SELECT word FROM OneRoll;
    sqlite> SELECT word FROM OneRoll;
    sqlite> SELECT word FROM OneRoll;

Almost there. Now, this query generates three-word passphrases:

    sqlite> SELECT Word1.word || ' ' || Word2.word || ' ' || Word3.word FROM
       ...>   (SELECT * FROM OneRoll) AS Word1,
       ...>   (SELECT * FROM OneRoll) AS Word2,
       ...>   (SELECT * FROM OneRoll) AS Word3;
    eagle crab pinch

Yea! I saved this query in gen-password.sql and saved the SQLite database containing the table WordList and the views RandomDie and OneRoll as diceware.db. This lets me generate passphrases from the command line:

    $ sqlite3 diceware.db < gen-password.sql
    ywca maine over

Finally, I saved that command in a shell script named gen-password, and I now have a passphrase generator ready to use with a few keystrokes. Success.

Yes, this is a lot of work to get a simple job done. Maybe I could do better with Python and a CSV reader package, or some other tools. But that wasn't the point. I was revisiting SQL and learning SQLite with my students. By overusing the tools, I learned them both a little better and helped refine my sense of when they will and won't be helpful to me in the future. So, success.


I demo'ed a variation of this to my students on the last day of class. It let me wrap up my time with them by pointing out that they are developing skills which can change how they see every problem they encounter in the future.

Knowing how to write programs gives you a new power. Whenever you encounter a problem, you can ask yourself, "How might a program help me solve this?" I do this daily, both as faculty member and department head.

The same is true for many more specialized CS skills. People who know how to create a language and implement an interpreter can ask themselves, "How might a language help me solve this problem?" That's one of the outcomes, I hope, of our course in programming languages.

The same is true for databases. When I came across a technique for generating passphrases, I could ask myself, "How might a database help me build a passphrase generator?"

Computer science students can use the tools they learn each semester to represent and interpret information. That's a power they can use to solve many problems. It's easy to lose sight of this incredible power during a hectic semester, and worth reflecting on in calmer moments after the semester ends.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 19, 2020 2:27 PM

An "Achievement Gap" Is Usually A Participation Gap

People often look at the difference between the highest-rated male chess player in a group and the highest-rated female chess player in the same group and conclude that there is a difference between the abilities of men and women to play chess, despite the fact that there are usually many, many more men in the group than women. But that's not even good evidence that there is an achievement gap. From What Gender Gap in Chess?:

It's really quite simple. Let's say I have two groups, A and B. Group A has 10 people, group B has 2. Each of the 12 people gets randomly assigned a number between 1 and 100 (with replacement). Then I use the highest number in Group A as the score for Group A and the highest number in Group B as the score for Group B. On average, Group A will score 91.4 and Group B 67.2. The only difference between Groups A and B is the number of people. The larger group has more shots at a high score, so will on average get a higher score. The fair way to compare these unequally sized groups is by comparing their means (averages), not their top values. Of course, in this example, that would be 50 for both groups -- no difference!

I love this paragraph. It's succinct and uses only the simplest ideas from probability and statistics. It's the sort of statistics that I would hope our university students learn in their general education stats course. While learning a little math, students can also learn about an application that helps us understand something important in the world.

The experiment described is also simple enough for beginning programmers to code up. Over the years, I've used problems like this with intro programming students in Pascal, Java, and Python, and with students learning Scheme or Racket who need some problems to practice on. I don't know whether learning science supports my goal, but I hope that this sort of problem (with suitable discussion) can do double duty for learners: learn a little programming, and learn something important about the world.
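Here is a minimal Python sketch of the experiment, of the sort a beginner might write. It computes the expected top score exactly and also estimates it by simulation. The function names, the trial count, and the seed are arbitrary choices for illustration.

```python
import random

def expected_top_score(group_size, high=100):
    """Exact expected maximum of group_size independent draws from 1..high."""
    # P(max >= k) = 1 - ((k - 1) / high) ** group_size, and summing
    # P(max >= k) over k = 1..high gives the expected maximum.
    return sum(1 - ((k - 1) / high) ** group_size for k in range(1, high + 1))

def simulated_top_score(group_size, trials=20_000, high=100, seed=0):
    """Estimate the same quantity by drawing scores for many random groups."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += max(rng.randint(1, high) for _ in range(group_size))
    return total / trials

for size in (10, 2):
    exact = expected_top_score(size)
    estimate = simulated_top_score(size)
    print(f"group of {size}: exact {exact:.1f}, simulated {estimate:.1f}")
```

The exact values round to the 91.4 and 67.2 quoted above, and the simulated averages should land close to them.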

With educational opportunities like this available to us, we really should be able to turn out graduates who have a decent understanding of why so many of our naive conclusions about the world are wrong. Are we putting these opportunities to good use?

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

July 16, 2020 10:47 AM

Dreaming in Git

I recently read a Five Books interview about the best books on philosophical wonder. One of the books recommended by philosopher Eric Schwitzgebel was Diaspora, a science fiction novel by Greg Egan I've never read. The story unfolds in a world where people are able to destroy their physical bodies to upload themselves into computers. Unsurprisingly, this leads to some fascinating philosophical possibilities:

Well, for one thing you could duplicate yourself. You could back yourself up. Multiple times.

And then have divergent lives, as it were, in parallel but diverging.

Yes, and then there'd be the question, "do you want to merge back together with the person you diverged from?"

Egan wrote Diaspora before the heyday of distributed version control, before darcs and mercurial and git. With distributed VCS, a person could checkout a new personality, or change branches and be a different person every day. We could run diffs to figure out what makes one version of a self so different from another. If things start going too wrong, we could always revert to an earlier version of ourselves and try again. And all of this could happen with copies of the software -- ourselves -- running in parallel somewhere in the world.

And then there's Git. Imagine writing such a story now, with Git's complex model of versioning and prodigious set of commands and flags. Not only could people branch and merge, checkout and diff... A person could try something new without ever committing changes to the repository. We'd have to figure out what it means to push origin or reset --hard HEAD. We'd be able to rewrite history by rebasing, amending, and squashing. A Git guru can surely explain why we'd need to --force-with-lease or --unset-upstream, but even I can imagine the delightful possibilities of git stash in my personal improvement plan.

Perhaps the final complication in our novel would involve a merge so complex that we need a third-party diff tool to help us put our desired self back together. Alas, a Python library or Ruby gem required by the tool has gone stale and breaks an upgrade. Our hero must find a solution somewhere in her tree of blobs, or be doomed to live a forever splintered life.

If you ever see a book named Dreaming in Git or Bug Report on an airport bookstore's shelves, take a look. Perhaps I will have written the first of my Git fantasies.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development

June 17, 2020 3:53 PM

Doing Concatenative Programming in a Spreadsheet

[Image: a humble spreadsheet cell]

It's been a long time since I was excited by a new piece of software the way I was excited by Loglo, Avi Bryant's new creation. Loglo is "LOGO for the Glowforge", an experimental programming environment for creating SVG images. That's not a problem I need to solve, but the way Loglo works drew me in immediately. It consists of a stack programming language and a set of primitives for describing vector graphics, integrated into a spreadsheet interface. It's the use of a stack language to program a spreadsheet that excites me so much.

Actually, it's the reverse relationship that really excites me: using a spreadsheet to build and visualize a stack-based program. Long-time readers know that I am interested in this style of programming (see Summer of Joy for a post from last year) and sometimes introduce it in my programming languages course. Students understand small examples easily enough, but they usually find it hard to grok larger programs and to fully appreciate how typing in such a language can work. How might Loglo help?

In Loglo, a cell can refer to the values produced by other cells in the familiar spreadsheet way, with an absolute address such as "a1" or "f2". But Loglo cells have two other ways to refer to other cells' values. First, any cell can access the value produced by the cell to its left implicitly, because Loglo leaves the result of a cell's computation sitting on top of the stack. Second, a cell can access the value produced by the cell above it by using the special variable "^". These last two features strike me as a useful way for programmers to see their computations grow over time, which can be an even more powerful mode of interaction for beginners who are learning this programming style.

Stack-oriented programming of this sort is concatenative: programs are created by juxtaposing other programs, with a stack of values implicitly available to every operator. Loglo uses the stack as leverage to enable programmers to build images incrementally, cell by cell and row by row, referring to values on the stack as well as to predecessor cells. The programmer can see in a cell the value produced by a cumulative bit of code that includes new code in the cell itself. Reading Bryant's description of programming in Loglo, it's easy to see how this can be helpful when building images. I think my students might find it helpful when learning how to write concatenative programs or learning how types and programs work in a concatenative language.

For example, here is a concatenative program that works in Loglo as well as other stack-based languages such as Forth and Joy:

 2 3 + 5 * 2 + 6 / 3 /

Loglo tells us that it computes the value 1.5:

[Image: a stack program in Loglo]

This program consists of eleven tokens, each of which is a program in its own right. More interestingly, we can partition this program into smaller units by taking any subsequences of the program:

 2 3 + 5 *   2 + 6 /   3 /
 ---------   -------   ---

These are the programs in cells A1, B1, and C1 of our spreadsheet. The first computes 25, the second uses that value to compute 4.5, and the third uses the 4.5 to compute 1.5. Notice that the programs in cells B1 and C1 require an extra value to do their jobs. They are like functions of one argument. Rather than pass an argument to the function, Loglo allows it to read a value from the stack, produced by the cell to its left.
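The partitioning can be checked with a tiny stack evaluator. The Python sketch below is not Loglo and knows nothing about cells. It is a hypothetical interpreter for the four arithmetic operators only, with each fragment run on the stack left behind by the fragment before it.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def run(program, stack=None):
    """Run a whitespace-separated stack program and return the resulting stack."""
    stack = list(stack or [])
    for token in program.split():
        if token in OPERATORS:
            right = stack.pop()   # top of the stack is the right operand
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(float(token))
    return stack

whole = run("2 3 + 5 * 2 + 6 / 3 /")

# The same program, split into the three fragments from cells A1, B1, and C1.
a1 = run("2 3 + 5 *")
b1 = run("2 + 6 /", a1)
c1 = run("3 /", b1)

print(whole, a1, b1, c1)   # prints [1.5] [25.0] [4.5] [1.5]
```

Running the fragments in sequence produces the same final stack as running the whole program, which is what makes any split point a valid place to start a new cell.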

[Image: a partial function in Loglo]

By making the intermediate results visible to the programmer, this interface might help programmers better see how pieces of a concatenative program work and learn what the type of a program fragment such as 2 + 6 / (in cell B1 above) or 3 / is. Allowing locally-relative references on a new row will, as Avi points out, enable an incremental programming style in which the programmer uses a transformation computed in one cell as the source for a parameterized version of the transformation in the cell below. This can give the novice concatenative programmer an interactive experience more supportive than the usual REPL. And Loglo is a spreadsheet, so changes in one cell percolate throughout the sheet on each update!

Am I the only one who thinks this could be a really cool environment for programmers to learn and practice this style of programming?

Teaching concatenative programming isn't a primary task in my courses, so I've never taken the time to focus on a pedagogical environment for the style. I'm grateful to Avi for demonstrating a spreadsheet model for stack programs and stimulating me to think more about it.

For now, I'll play with Loglo as much as time permits and think more about its use, or use of a tool like it, in my courses. There are a couple of features I'll have to get used to. For one, it seems that a cell can access only one item left on the stack by its left neighbor, which limits the kind of partial functions we can write into cells. Another is that named functions such as rotate push themselves onto the stack by default and thus require a ! to apply them, whereas operators such as + evaluate by default and thus require quotation in a {} block to defer execution. (I have an academic's fondness for overarching simplicity.) Fortunately, these are the sorts of features one gets used to whenever learning a new language. They are part of the fun.

Thinking beyond Loglo, I can imagine implementing an IDE like this for my students that provides features that Loglo's use cases don't require. For example, it would be cool to enable the programmer to ctrl-click on a cell to see the type of the program it contains, as well as an option to see the cumulative type along the row or built on a cell referenced from above. There is much fun to be had here.

To me, one sign of a really interesting project is how many tangential ideas flow out of it. For me, Loglo is teeming with ideas, and I'm not even in its target demographic. So, kudos to Avi!

Now, back to administrivia and that database course...

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

June 11, 2020 1:02 PM

Persistence Wins, Even For Someone Like You

There's value to going into a field that you find difficult to grasp, as long as you're willing to be persistent. Even better, others can benefit from your persistence, too.

In an old essay, James Propp notes that working in a field where you lack intuition can "impart a useful freedom from prejudice". Even better...

... there's value in going into a field that you find difficult to grasp, as long as you're willing to be really persistent, because if you find a different way to think about things, something that works even for someone like you, chances are that other people will find it useful too.

This reminded me of a passage in Bob Nystrom's post about his new book, Crafting Interpreters. Nystrom took a long time to finish the book in large part because he wanted the interpreter at the end of each chapter to compile and run, while at the same time growing into the interpreter discussed in the next chapter. But that wasn't the only reason:

I made this problem harder for myself because of the meta-goal I had. One reason I didn't get into languages until later in my career was because I was intimidated by the reputation compilers have as being only for hardcore computer science wizard types. I'm a college dropout, so I felt I wasn't smart enough, or at least wasn't educated enough to hack it. Eventually I discovered that those barriers existed only in my mind and that anyone can learn this.

Some students avoid my compilers course because they assume it must be difficult, or because friends said they found it difficult. Even though they are CS majors, they think of themselves as average programmers, not "hardcore computer science wizard types". But regardless of the caliber of the student at the time they start the course, the best predictor of success in writing a working compiler is persistence. The students who plug away, working regularly throughout the two-week stages and across the entire project, are usually the ones who finish successfully.

One of my great pleasures as a prof is seeing the pride in the faces of students who demo a working compiler at the end of the semester, especially in the faces of the students who began the course concerned that they couldn't hack it.

As Propp points out in his essay, this sort of persistence can pay off for others, too. When you have to work hard to grasp an idea or to make something, you sometimes find a different way to think about things, and this can help others who are struggling. One of my jobs as a teacher is to help students understand new ideas and use new techniques. That job is usually made easier when I've had to work persistently to understand the idea myself, or to find a better way to help the students who teach me the ways in which they struggle.

In Nystrom's case, his hard work to master a field he didn't grasp immediately pays off for his readers. I've been following the growth of Crafting Interpreters over time, reading chapters in depth whenever I was able. Those chapters were uniformly easy to read, easy to follow, and entertaining. They have me thinking about ways to teach my own course differently, which is probably the highest praise I can give as a teacher. Now I need to go back and read the entire book and learn some more.

Teaching well enough that students grasp what they thought was not graspable and do what they thought was not doable is a constant goal, rarely achieved. It's always a work in progress. I have to keep plugging away.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

May 14, 2020 2:35 PM

Teaching a New Course in the Fall

I blogged weekly about the sudden switch to remote instruction starting in late March, but only for three weeks. I stopped mostly because my sense of disorientation had disappeared. Teaching class over Zoom started to feel more normal, and my students and I got back into the usual rhythm. A few struggled in ways that affected their learning and performance, and a smaller few thrived. My experience was mostly okay: some parts of my work suffered as I learned how to use tools effectively, but not having as many external restrictions on my schedule offset the negatives. Grades are in, summer break begins to begin, and at least some things are right with the world.

Fall offers something new for me to learn. My fall compilers course had a lower enrollment than usual and, given the university's current financial situation, I had to cancel it. This worked out fine for the department, though, as one of our adjunct instructors asked to take next year off in order to deal with changes in his professional and personal lives. So there was a professor in need of a course, and a course in need of a professor: Database Systems.

Databases is one of the few non-systems CS courses that I have never taught as a prof or as a grad student. It's an interesting course, mixing theory and design with a lot of practical skills that students and employers prize. In this regard, it's a lot like our OO design and programming course in Java, only with a bit more visible theory. I'm psyched to give it a go. At the very least, I should be able to practice some of those marketable skills and learn some of the newer tools involved.

As with all new preps, this course has me looking for ideas. I'm aware of a few of the standard texts, though I am hoping to find a good open-source text online, or a set of online materials out of which to assemble the readings my students will need for the semester. I'm going to be scouting for all the other materials I need to teach the course as well, including examples, homework assignments, and projects. I tend to write a lot of my own stuff, but I also like to learn from good courses and good examples already out there. Not being a database specialist, I am keen to see what specialists think is important, beyond what we find in traditional textbooks.

Then there is the design of the course itself. Teaching a course I've never taught before means not having an old course design to fall back on. This means more work, of course, but is a big win for a curious mind. Sometimes, it's fun to start from scratch. I have always found instructional design fascinating, much like any kind of design, and building a new course leaves open a lot of doors for me to learn and to practice some new skills.

COVID-19 is a big part of why I am teaching this course, but it is not done with us. We still do not know what fall semester will look like, other than to assume that it won't look like a normal semester. Will we be on campus all semester, online all semester, or a mix of both? If we do hold instruction on campus, as most universities are hoping to do, social distancing requirements will require us to do some things differently, such as meeting students in shifts every other day. This uncertainty suggests that I should design a course that depends less on synchronous, twice-weekly, face-to-face direct instruction and more on ... what?

I have a lot to learn about teaching this way. My university is expanding its professional development offerings this summer and, in addition to diving deep into databases and SQL, I'll be learning some new ways to design a course. It's exciting but also means a bit more teaching prep than usual for my summer.

This is the first entirely new prep I've taught in a while. I think the most recent was the fall of 2009, when I taught Software Engineering for the first and only time. Looking back at the course website reminds me that I created this delightful logo for the course:

course logo for Software Engineering, created using YUML

So, off to work I go. I could sure use your help. Do you know of model database courses that I should know about? What database concepts and skills should CS graduates in 2021 know? What tools should they be able to use? What has changed in the world since I last took database courses that must be reflected in today's database course? Do you know of a good online textbook for the course, or a print book that my students would find useful and be willing to pay for?

If you have any ideas to share, feel free to email me or contact me on Twitter. If not for me, do it for my students!

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 10, 2020 2:16 PM

Software Can Make You Feel Alive, or It Can Make You Feel Dead

This week I read one of Craig Mod's old essays and found a great line, one that everyone who writes programs for other people should keep front of mind:

When it comes to software that people live in all day long, a 3% increase in fun should not be dismissed.

Working hard to squeeze a bit more speed out of a program, or to create even a marginally better interaction experience, can make a huge difference to someone who uses that program every day. Some people spend most of their professional days inside one or two pieces of software, which accentuates further the human value of Mod's three percent. With shelter-in-place and work-from-home the norm for so many people these days, we face a secondary crisis of software that is no fun.

I was probably more sensitive than usual to Mod's sentiment when I read it... This week I used Blackboard for the first time, at least my first extended usage. The problem is not Blackboard, of course; I imagine that most commercial learning management systems are little fun to use. (What a depressing phrase "commercial learning management system" is.) And it's not just LMSes. We use various PeopleSoft "campus solutions" to run the academic, administrative, and financial operations on our campus. I always feel a little of my life drain away whenever I spend an hour or three clicking around and waiting inside this large and mostly functional labyrinth.

It says a lot that my first thought after downloading my final exams on Friday morning was, "I don't have to log in to Blackboard again for a good long while. At least I have that going for me."

I had never used our LMS until this week, and then only to create a final exam that I could reliably time after being forced into remote teaching with little warning. If we are in this situation again in the fall, I plan to have an alternative solution in place. The programmer in me always feels an urge to roll my own when I encounter substandard software. Writing an entire LMS is not one of my life goals, so I'll just write the piece I need. That's more my style anyway.

Later the same morning, I saw this spirit of writing a better program in a context that made me even happier. The Friday of finals week is my department's biennial undergrad research day, when students present the results of their semester- or year-long projects. Rather than give up the tradition because we couldn't gather together in the usual way, we used Zoom. One student talked about alternative techniques for doing parallel programming in Python, and another presented empirical analysis of using IR beacons for localization of FIRST Lego League robots. Fun stuff.

The third presentation of the morning was by a CS major with a history minor, who had observed how history profs' lectures are limited by the tools they have available. The solution? Write better presentation software!

As I watched this talk, I was proud of the student, whom I'd had in class and gotten to know a bit. But I was also proud of whatever influence our program had on his design skills, programming skills, and thinking. This project, I thought, is a demonstration of one thing every CS student should learn: We can make the tools we want to use.

This talk also taught me something non-technical: Every CS research talk should include maps of Italy from the 1300s. Don't dismiss 3% increases in fun wherever they can be made.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

April 04, 2020 11:37 AM

Is the Magic Gone?

This passage from Remembering the LAN recalls an earlier time that feels familiar:

My father, a general practitioner, used this infrastructure of cheap 286s, 386s, and 486s (with three expensive laser printers) to write the medical record software for the business. It was used by a dozen doctors, a nurse, and receptionist. ...
The business story is even more astonishing. Here is a non-programming professional, who was able to build the software to run their small business in between shifts at their day job using skills learned from a book.

I wonder how many hobbyist programmers and side-hustle programmers of this sort there are today. Does programming attract people the way it did in the '70s or '80s? Life is so much easier than typing programs out of Byte or designing your own BASIC interpreter from scratch. So many great projects out on GitHub and the rest of the web to clone, mimic, adapt. I occasionally hear a student talking about their own projects in this way, but it's rare.

As Crawshaw points out toward the end of his post, the world in which we program now is much more complex. It takes a lot more gumption to get started with projects that feel modern:

So much of programming today is busywork, or playing defense against a raging internet. You can do so much more, but the activation energy required to start writing fun collaborative software is so much higher you end up using some half-baked SaaS instead.

I am not a great example of this phenomenon -- Crawshaw and his dad did much more -- but even today I like to roll my own, just for me. I use a simple accounting system I've been slowly evolving for a decade, and I've cobbled together bits and pieces of my own tax software, not an integrated system, just what I need to scratch an itch each year. Then there are all the short programs and scripts I write for work to make Spreadsheet City more habitable. But I have multiple CS degrees and a lot of years of experience. I'm not a doctor who decides to implement what his or her office needs.

I suspect there are more people today like Crawshaw's father than I hear about. I wish it were more of a culture that we cultivated for everyone. Not everyone wants to bake their own bread, but people who get the itch ought to feel like the world is theirs to explore.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Software Development

March 15, 2020 9:35 AM

Things I've Been Reading

This was a weird week. It started with preparations for spring break and an eye on the news. It turned almost immediately into preparations for at least two weeks of online courses and a campus on partial hiatus. Of course, we don't know how the COVID-19 outbreak will develop over the next three weeks, so we may be facing the remaining seven weeks of spring semester online, with students at a distance.

Here are three pieces that helped me get through the week.

Even If You Believe

From When Bloom Filters Don't Bloom:

Advanced data structures are very interesting, but beware. Modern computers require cache-optimized algorithms. When working with large datasets that do not fit in L3, prefer optimizing for a reduced number of loads over optimizing the amount of memory used.

I've always liked the Bloom filter. It seems such an elegant idea. But then I've never used one in a setting where performance mattered. It still surprises me how well current architectures and compilers optimize performance for us in ways that our own efforts can only frustrate. The article is also worth reading for its link to a nice visualization of the interplay among the parameters of a Bloom filter. That will make a good project in a future class.
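For readers who have not met the data structure, here is a minimal Python sketch of a Bloom filter along with the textbook approximation of its false-positive rate. It is a teaching sketch, not the cache-conscious code the article discusses. The class and function names are hypothetical, and deriving the k bit positions from two halves of a SHA-256 digest is just one convenient assumption among many possible hashing schemes.

```python
import hashlib
import math

class BloomFilter:
    """A set-membership sketch: no false negatives, some false positives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Assumption: derive all positions from two 64-bit values (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # keep the step nonzero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )

def false_positive_rate(num_bits, num_hashes, num_items):
    """Standard approximation: (1 - e^(-k*n/m))^k."""
    return (1 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes

bf = BloomFilter(num_bits=1024, num_hashes=3)
bf.add("compiler")
print("compiler" in bf)                     # True: added items are always found
print(false_positive_rate(1024, 3, 100))    # estimated rate after 100 insertions
```

The three parameters in `false_positive_rate` (bits, hash functions, and items stored) are the ones whose interplay a visualization would show. Note that each lookup touches several scattered bits, which is the kind of memory-access pattern the quoted passage warns about.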

Even If You Don't Believe

From one of Tyler Cowen's long interviews:

Niels Bohr had a horseshoe at his country house across the entrance door, a superstitious item, and a friend asked him, "Why do you have it there? Aren't you a scientist? Do you believe in it?" You know what was Bohr's answer? "Of course I don't believe in it, but I have it there because I was told that it works, even if you don't believe in it."

You don't have to believe in good luck to have good luck.

You Gotta Believe

From Larry Tesler's annotated manual for the PUB document compiler:

In 1970, I became disillusioned with the slow pace of artificial intelligence research.

The commentary on the manual is like a mini-memoir. Tesler writes that he went back to the Stanford AI lab in the spring of 1971. John McCarthy sent him to work with Les Earnest, the lab's chief administrator, who had an idea for a "document compiler", à la RUNOFF, for technical manuals. Tesler had bigger ideas, but he implemented PUB as a learning exercise. Soon PUB had users, who identified shortcomings that were in sync with Tesler's own ideas.

The solution I favored was what we would now call a WYSIWYG interactive text editing and page layout system. I felt that, if the effect of any change was immediately apparent, users would feel more in control. I soon left Stanford to pursue my dream at Xerox PARC (1973-80) and Apple Computer (1980-1997).

Thus began the shift to desktop publishing. And here I sit, in 2020, editing this post using emacs.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

February 18, 2020 4:00 PM

Programming Feels Like Home

I saw Robin Sloan's An App Can Be a Home-Cooked Meal floating around Twitter a few days back. It really is quite good; give it a read if you haven't already. This passage captures a lot of the essay's spirit in only a few words:

The exhortation "learn to code!" has its foundations in market value. "Learn to code" is suggested as a way up, a way out. "Learn to code" offers economic leverage, a squirt of power. "Learn to code" goes on your resume.
But let's substitute a different phrase: "learn to cook." People don't only learn to cook so they can become chefs. Some do! But far more people learn to cook so they can eat better, or more affordably, or in a specific way. Or because they want to carry on a tradition. Sometimes they learn just because they're bored! Or even because -- get this -- they love spending time with the person who's teaching them.

Sloan expresses better than I ever have an idea that I blog about every so often. Why should people learn to program? Certainly it offers a path to economic gain, and that's why a lot of students study computer science in college, whether as a major, a minor, or a high-leverage class or two. There is nothing wrong with that. It is for many a way up, a way out.

But for some of us, there is more than money in programming. It gives you a certain power over the data and tools you use. I write here occasionally about how a small script or a relatively small program makes my life so much easier, and I feel bad for colleagues who are stuck doing drudge work that I jump past. Occasionally I'll try to share my code, to lighten someone else's burden, but most of the time there is such a mismatch between the worlds we live in that they are happier to keep plugging along. I can't say that I blame them. Still, if only they could program and used tools that enabled them to improve their work environments...

But... There is more still. From the early days of this blog, I've been open with you all:

Here's the thing. I like to write code.

One of the things that students like about my classes is that I love what I do, and they are welcome to join me on the journey. Just today a student in my Programming Languages course drifted back to my office with me after class, where we ended up talking for half an hour and sketching code on a whiteboard as we deconstructed a vocabulary choice he made on our latest homework assignment. I could sense this student's own love of programming, and it raised my spirits. It makes me more excited for the rest of the semester.

I've had people come up to me at conferences to say that the reason they read my blog is because they like to see someone enjoying programming as much as they do. Many of them share links with their students as if to say, "See, we are not alone." I look forward to days when I will be able to write in this vein more often.

Sloan reminds us that programming can be -- is -- more than a line on a resume. It is something that everyone can do, and want to do, for a lot of different reasons. It would be great if programming "were marbled deeply into domesticity and comfort, nerdiness and curiosity, health and love" in the way that cooking is. That is what makes Computing for All really worth doing.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Software Development, Teaching and Learning

February 10, 2020 2:37 PM

Some Things I Read Recently

Campaign Security is a Wood Chipper for Your Hopes and Dreams

Practical campaign security is a wood chipper for your hopes and dreams. It sits at the intersection of 19 kinds of status quo, each more odious than the last. You have to accept the fact that computers are broken, software is terrible, campaign finance is evil, the political parties are inept, the DCCC exists, politics is full of parasites, tech companies are run by arrogant man-children, and so on.

This piece from last year has some good advice, plenty of sarcastic humor from Maciej, and one remark that was especially timely for the past week:

You will fare especially badly if you have written an app to fix politics. Put the app away and never speak of it again.

Know the Difference Between Neurosis and Process

In a conversation between Tom Waits and Elvis Costello from the late 1980s, Waits talks about tinkering too long with a song:

TOM: "You have to know the difference between neurosis and actual process, 'cause if you're left with it in your hands for too long, you may unravel everything. You may end up with absolutely nothing."

In software, when we keep code in our hands for too long, we usually end up with an over-engineered, over-abstracted boat anchor. Let the tests tell you when you are done, then stop.

Sometimes, Work is Work

People say, "if you love what you do you'll never work a day in your life." I think good work can be painful--I think sometimes it feels exactly like work.

Some weeks more than others. Trust me. That's okay. You can still love what you do.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development

January 22, 2020 3:54 PM

The Roots of TDD -- from 1957

In 1957, Dan McCracken published Digital Computer Programming, perhaps the first book on the new art of programming. His book shows that the roots of extreme programming run deep. In this passage, McCracken encourages both the writing of tests before the writing of code and the involvement of the customer in the software development process:

The first attack on the checkout problem may be made before coding is begun. In order to fully ascertain the accuracy of the answers, it is necessary to have a hand-calculated check case with which to compare the answers which will later be calculated by the machine. This means that stored program machines are never used for a true one-shot problem. There must always be an element of iteration to make it pay. The hand calculations can be done at any point during programming. Frequently, however, computers are operated by computing experts to prepare the problems as a service for engineers or scientists. In these cases it is highly desirable that the "customer" prepare the check case, largely because logical errors and misunderstandings between the programmer and customer may be pointed out by such procedure. If the customer is to prepare the test solution it is best for him to start well in advance of actual checkout, since for any sizable problem it will take several days or weeks to calculate the test.
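In modern terms, the hand-calculated check case is a test written before the code it checks. A minimal Python sketch of the idea follows. The polynomial, the name `evaluate_polynomial`, and the choice of Horner's rule are all hypothetical illustrations, not anything taken from McCracken's book.

```python
# Step 1: the "customer" prepares a check case by hand, before any code exists.
# For p(x) = 2x^2 + 3x + 1 at x = 2: 2*4 + 3*2 + 1 = 15.
CHECK_CASE = {"coefficients": [2, 3, 1], "x": 2, "expected": 15}

# Step 2: the programmer writes the code afterward.
def evaluate_polynomial(coefficients, x):
    """Evaluate a polynomial by Horner's rule; coefficients run from the
    highest-degree term down to the constant."""
    result = 0
    for c in coefficients:
        result = result * x + c
    return result

# Step 3: the machine's answer is compared with the hand calculation.
actual = evaluate_polynomial(CHECK_CASE["coefficients"], CHECK_CASE["x"])
assert actual == CHECK_CASE["expected"], f"expected 15, got {actual}"
print("check case passed")
```

If the programmer and customer disagree about what the coefficients mean, say, their order, the check case exposes the misunderstanding on the first run.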

I don't have a copy of this book, but I've read a couple of other early books by McCracken, including one of his Fortran books for engineers and scientists. He was a good writer and teacher.

I had the great fortune to meet Dan at an NSF workshop in Clemson, South Carolina, back in the mid-1990s. We spent many hours in the evening talking shop and watching basketball on TV. (Dan was cheering his New York Knicks on in the NBA finals, and he was happy to learn that I had been a Knicks and Walt Frazier fan in the 1970s.) He was a pioneer of programming and programming education who was willing to share his experience with a young CS prof who was trying to figure out how to teach. We kept in touch by email thereafter. It was an honor to call him a friend.

You can find the above quotation in A History of Test-Driven Development (TDD), as Told in Quotes, by Rob Myers. That post includes several good quotes that Myers had to cut from his upcoming book on TDD. "Of course. How else could you program?"

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

December 26, 2019 1:55 PM

An Update on the First Use of the Term "Programming Language"

This tweet and this blog entry on the first use of the term "programming language" evoked responses from readers with some new history and some prior occurrences.

Doug Moen pointed me to the 1956 Fortran manual from IBM, Chapter 2 of which opens with:

Any programming language must provide for expressing numerical constants and variable quantities.

I was aware of the Fortran manual, which I link to in the notes for my compiler course, and its use of the term. But I had been linking to a document dated October 1957, and the file at is dated October 15, 1956. That beats the January 1957 Newell and Shaw paper by a few months.

As Moen said in his email, "there must be earlier references, but it's hard to find original documents that are early enough."

The oldest candidate I have seen comes from @matt_dz. His tweet links to this 1976 Stanford tech report, "The Early Development of Programming Languages", co-authored by Knuth. On Page 26, it refers to work done by Arthur W. Burks in 1950:

In 1950, Burks sketched a so-called "Intermediate Programming Language" which was to be the step one notch above the Internal Program Language.

Unfortunately, though, this report's only passage from Burks refers to the new language as "Intermediate PL", which obscures the meaning of the 'P'. Furthermore, the title of Burks's paper uses "program" in the language's name:

Arthur W. Burks, "An intermediate program language as an aid in program synthesis", Engineering Research Institute, Report for Burroughs Adding Machine Company (Ann Arbor, Mich.: Univ. of Michigan, 1951), ii+15 pp.

The use of "program language" in this title is consistent with the terminology in Burks's previous work on an "Internal Program Language", to which Knuth et al. also refer.

Following up on the Stanford tech report, Douglas Moen found the book Mathematical Methods in Program Development, edited by Manfred Broy and Birgit Schieder. It includes a paper that attempts "to identify the first 'programming language', and the first use of that term". Here's a passage from Page 219, via Google Books:

There is not yet an indication of the term 'programming languages'. But around 1950 the term 'program' comes into wide use: 'The logic of programming electronic digital computers' (A. W. Burks 1950), 'Programme organization and initial orders for the EDSAC' (D. J. Wheeler 1950), 'A program composition technique' (H. B. Curry 1950), 'La conception du programme' (Corrado Bohm 1952), and finally 'Logical or non-mathematical programmes' (C. S. Strachey 1952).

And then, on Page 224, it comments specifically on Burks's work:

A. W. Burks ('An intermediate program language as an aid in program synthesis', 1951) was among the first to use the term program(ming) language.

The parenthetical in that phrase -- "the first to use the term program(ming) language" -- leads me to wonder if Burks may use "program language" rather than "programming language" in his 1951 paper.

Is it possible that Knuth et al. retrofitted the use of "programming language" onto Burks's language? Their report documents the early development of PL ideas, not the history of the term itself. The authors may have used a term that was in common parlance by 1976 even if Burks had not. I'd really like to find an original copy of Burks's 1951 ERI report to see if he ever uses "programming language" when talking about his Intermediate PL. Maybe after the holiday break...

In any case, the use of program language by Burks and others circa 1950 seems to be the bridge between use of the terms "program" and "language" independently and the use of "programming language" that soon became standard. If Burks and his group never used the new term for its intermediate PL, it's likely that someone else did between 1951 and release of the 1956 Fortran manual.

There is so much to learn. I'm glad that Crista Lopes tweeted her question on Sunday and that so many others have contributed to the answer!

Posted by Eugene Wallingford | Permalink | Categories: Computing

December 23, 2019 10:23 AM

The First Use of the Term "Programming Language"?

Yesterday, Crista Lopes asked a history question on Twitter:

Hey, CS History Twitter: I just read Iverson's preface of his 1962 book carefully, and suddenly this occurred to me: did he coin the term "programming language"? Was that the first time a programming language was called "programming language"?

In a follow-up, she noted that McCarthy's CACM paper on LISP from roughly the same time called Lisp a "programming system", not a programming language.

I had a vague recollection from my grad school days that Newell and Simon might have used the term. I looked up IPL, the Information Processing Language they created in the mid-1950s with Shaw. IPL pioneered the notion of list processing, though at the level of assembly language. I first learned of it while devouring Newell and Simon's early work on AI and reading everything I could find about programs such as the General Problem Solver and Logic Theorist.

That Wikipedia page has a link to this unexpected cache of documents on IPL from Newell, Simon, and Shaw's days at Rand. The oldest of these is a January 1957 paper, Programming the Logic Theory Machine, by Newell and Shaw that was presented at the Western Joint Computer Conference (WJCC) the next month. It details their efforts to build computer systems to perform symbolic reasoning, as well as the language they used to code their programs.

There it is on Page 5: a section titled "Requirements for the Programming Language". They even define what they mean by programming language:

We can transform these statements about the general nature of the program of LT into a set of requirements for a programming language. By a programming language we mean a set of symbols and conventions that allows a programmer to specify to the computer what processes he wants carried out.

Other than the gendered language, that definition works pretty well even today.

The fact that Newell and Shaw defined "programming language" in this paper indicates that the term probably was not in widespread use at the time. The WJCC was a major computing conference of the day. The researchers and engineers who attended it would likely be familiar with common jargon of the industry.

Reading papers about IPL is an education across a range of ideas in computing. Researchers at the dawn of computing had to contend with -- and invent -- concepts at multiple levels of abstraction and figure out how to implement them on machines with limited size and speed. What a treat these papers are.

I love to read original papers from the beginning of our discipline, and I love to learn about the history of words. A few of my students do, too. One student stopped in after the last day of my compilers course this semester to thank me for telling stories about the history of compilers occasionally. Next semester, I teach our Programming Languages and Paradigms course again, and this little story might add a touch of color to our first days together.

All this said, I am neither a historian of computer science nor a lexicographer. If you know of an earlier occurrence of the term "programming language" than Newell and Shaw's from January 1957, I would love to hear from you by email or on Twitter.

Posted by Eugene Wallingford | Permalink | Categories: Computing

December 20, 2019 1:45 PM

More Adventures in Constrained Programming: Elo Predictions

I like tennis. The Tennis Abstract blog helps me keep up with the game and indulge my love of sports stats at the same time. An entry earlier this month gave a gentle introduction to Elo ratings as they are used for professional tennis:

One of the main purposes of any rating system is to predict the outcome of matches--something that Elo does better than most others, including the ATP and WTA rankings. The only input necessary to make a prediction is the difference between two players' ratings, which you can then plug into the following formula:

1 - (1 / (1 + (10 ^ (difference / 400))))

This formula always makes me smile. The first computer program I ever wrote because I really wanted to was a program to compute Elo ratings for my high school chess club. Over the years I've come back to Elo ratings occasionally whenever I had an itch to dabble in a new language or even an old favorite. It's like a personal kata of variable scope.
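For readers who want to try the formula before worrying about constrained languages, here is a minimal sketch in ordinary floating-point Python. The function name is my own, and I am assuming that `difference` is the first player's rating minus the second's, so the result is the first player's chance of winning.

```python
def elo_win_probability(difference):
    """Chance that a player wins, given their rating minus the opponent's."""
    return 1 - (1 / (1 + (10 ** (difference / 400))))


# Print the prediction for a few rating gaps.
for gap in (0, 100, 400):
    print(gap, elo_win_probability(gap))
```

An even match comes out at exactly one half, and a 400-point favorite wins ten times in eleven, since 10 ^ 1 is 10.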

I read the Tennis Abstract piece this week as my students were finishing up their compilers for the semester and as I was beginning to think of break. Playful me wondered how I might implement the prediction formula in my students' source language. It is a simple functional language with only two data types, integers and booleans; it has no loops, no local variables, no assignment statements, and no sequences. In another old post, I referred to this sort of language as akin to an integer assembly language. And, heaven help me, I love to program in integer assembly language.

To compute even this simple formula in Klein, I need to think in terms of fractions. The only division operator performs integer division, so 1/x for any x greater than 1 gives 0. I also need to think carefully about how to implement the exponentiation 10 ^ (difference / 400). The difference between two players' ratings is usually less than 400 and, in any case, almost never divisible by 400. So my program will have to take an arbitrary root of 10.

Which root? Well, I can use our gcd() function (implemented using Euclid's algorithm, of course) to reduce diff/400 to its lowest terms, n/d, and then compute the dth root of 10^n. Now, how to take the dth root of an integer for an arbitrary integer d?
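The reduction step is easy to sketch outside of Klein. The Python below is only a stand-in for the idea: it uses a loop where Klein would need recursion, and the names `gcd` and `reduce_fraction` are my own rather than anything from the course's code.

```python
def gcd(a, b):
    """Euclid's algorithm on non-negative integers."""
    while b != 0:
        a, b = b, a % b
    return a


def reduce_fraction(numerator, denominator):
    """Return numerator/denominator in lowest terms (denominator > 0 assumed)."""
    g = gcd(abs(numerator), denominator)
    return numerator // g, denominator // g


# A 150-point rating gap: 150/400 reduces to 3/8,
# so the exponentiation becomes the 8th root of 10^3.
n, d = reduce_fraction(150, 400)
print(n, d)
```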

Fortunately, my students and I have written code like this in various integer assembly languages over the years. For instance, we have a SQRT function that uses binary search to home in on the integer closest to the square root of a given integer. Even better, one semester a student implemented a square root program that uses Newton's method:

   x_{n+1} = x_n - f(x_n)/f'(x_n)

That's just what I need! I can create a more general version of the function that uses Newton's method to compute an arbitrary root of an arbitrary base. Rather than work with floating-point numbers, I will implement the function to take its guess as a fraction, represented as two integers: a numerator and a denominator.
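Here is one way the generalized function might look, sketched in Python rather than Klein. For the dth root of a base b, take f(x) = x^d - b; with the guess written as p/q, one Newton step simplifies to ((d-1)*p^d + b*q^d) / (d * p^(d-1) * q), which needs only integer arithmetic. The function names, the starting guess, and the step count are all assumptions of this sketch. Python's unbounded integers also hide the overflow that a fixed-width integer machine would have to manage.

```python
from math import gcd


def newton_root_step(p, q, base, d):
    """One Newton step toward the d-th root of base, from the guess p/q."""
    numerator = (d - 1) * p**d + base * q**d
    denominator = d * p**(d - 1) * q
    g = gcd(numerator, denominator)      # keep the fraction in lowest terms
    return numerator // g, denominator // g


def rational_root(base, d, steps=6):
    """Approximate the d-th root of base as a fraction (p, q)."""
    p, q = base, 1                       # assumed starting guess: the base itself
    for _ in range(steps):
        p, q = newton_root_step(p, q, base, d)
    return p, q


# Three steps toward the square root of 2.
print(rational_root(2, 2, steps=3))
```

The numerator and denominator grow very quickly from step to step, so a real Klein version would need to cap the number of iterations or the size of the fraction.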

This may seem like a lot of work, but that's what working in such a simple programming language is like. If I want my students' compilers to produce assembly language that predicts the result of a professional tennis match, I have to do the work.

This morning, I read a review of Francis Su's new popular math book, Mathematics for Human Flourishing. It reminds us that math isn't about rules and formulas:

Real math is a quest driven by curiosity and wonder. It requires creativity, aesthetic sensibilities, a penchant for mystery, and courage in the face of the unknown.

Writing my Elo rating program in Klein doesn't involve much mystery, and it requires no courage at all. It does, however, require some creativity to program under the severe constraints of a very simple language. And it's very much true that my little programming diversion is driven by curiosity and wonder. It's fun to explore ideas in a small space using limited tools. What will I find along the way? I'll surely make design choices that reflect my personal aesthetic sensibilities as well as the pragmatic sensibilities of a virtual machine that knows only integers and booleans.

As I've said before, I love to program.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

December 06, 2019 2:42 PM

OOP As If You Meant It

This morning I read an old blog post by Michael Feathers, The Flawed Theory Behind Unit Testing. It discusses what makes TDD and Clean Room software development so effective for writing code with fewer defects: they define practices that encourage developers to work in a continuous state of reflection about their code. The post holds up well ten years on.

The line that lit my mind up, though, was this one:

John Nolan gave his developers a challenge: write OO code with no getters.

Twenty-plus years after the movement of object-oriented programming into the mainstream, this still looks like a radical challenge to many people. "Whenever possible, tell another object to do something rather than ask for its data." This sort of behavioral abstraction is the heart of OOP and the source of its design power. Yet it is rare to find big Java or C++ systems where most classes don't provide public accessors. When you open that door, client code will walk in -- even if you are the person writing the client code.
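To make the challenge concrete, here is a small sketch of what "no getters" can look like. The `Account` class and its method names are hypothetical, invented for this illustration. Clients never ask for the balance; they tell the account to do something, including to report on itself.

```python
class Account:
    """A bank account that exposes behavior, not data."""

    def __init__(self, balance):
        self._balance = balance

    def deposit(self, amount):
        self._balance += amount

    def withdraw(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount

    def transfer_to(self, other, amount):
        # Tell, don't ask: neither account reveals its balance to the other.
        self.withdraw(amount)
        other.deposit(amount)

    def report_to(self, sink):
        # The account decides how to describe itself and hands that to the sink.
        sink(f"balance: {self._balance}")


checking = Account(100)
savings = Account(0)
checking.transfer_to(savings, 30)
savings.report_to(print)
```

The trade-off shows up right away: every new thing a client wants to know becomes a new message the object must understand, which is exactly the design pressure the challenge is meant to create.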

Whenever I look at a textbook intended for teaching undergraduates OOP, I look to see how it introduces encapsulation and the use of "getters" and "setters". I'm usually disappointed. Most CS faculty think doing otherwise would be too extreme for relative beginners. Once we open the door, though, it's a short step to using (gulp) instanceof to switch on kinds of objects. No wonder that some students are unimpressed and that too many students don't see much value in OO programming, which as they learn it doesn't feel much different from what they've done before but which puts new limitations on them.

To be honest, though, it is hard to go Full Metal OOP. Nolan was working with professional programmers, not second-year students, and even so programming without getters was considered a challenge for them. There are certainly circumstances in which the forces at play may drive us toward cracking the door a bit and letting an instance variable sneak out. Experienced programmers know how to decide when the trade-off is worth it. But our understanding the downside of the trade-off is improved after we know how to design independent objects that collaborate to solve problems without knowing anything about the other objects beyond the services they provide.

Maybe we need to borrow an idea from the TDD As If You Meant It crowd and create workshops and books that teach and practice OOP as if we really meant it. Nolan's challenge above would be one of the central tenets of this approach, along with the Polymorphism Challenge and other practices that look odd to many programmers but which are, in the end, the heart of OOP and the source of its design power.


If you like this post, you might enjoy The Summer Smalltalk Taught Me OOP. It isn't about OOP itself so much as about me throwing away systems until I got it right. But the reason I was throwing systems away was that I was still figuring out how to build an object-oriented system after years programming procedurally, and the reason I was learning so much was that I was learning OOP by building inside of Smalltalk and reading its standard code base. I'm guessing that code base still has a lot to teach many of us.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 10, 2019 11:06 AM

Three of the Hundred Falsehoods CS Students Believe

Jan Schaumann recently posted a list of one hundred Falsehoods CS Students (Still) Believe Upon Graduating. There is much good fun here, especially for a prof who tries to help CS students get ready for the world, and a fair amount of truth, too. I will limit my brief comments to three items that have been on my mind recently even before reading this list.

18. 'Email' and 'Gmail' are synonymous.

CS grads are users, too, and their use of Gmail, and systems modeled after it, contributes to the truths of modern email: top posting all the time, with never a thought of trimming anything. Two-line messages sitting atop icebergs of text which will never be read again, only stored in the seemingly infinite space given us for free.

Of course, some of our grads end up in corporate IT, managing email as merely one tool in a suite of lowest-common-denominator tools for corporate communication. The idea of email as a stream of text that can, for the most part, be read as such, is gone -- let alone the idea that a mail stream can be processed by programs such as procmail to great benefit.

I realize that most users don't ask for anything more than a simple Gmail filter to manage their mail experience, but I really wish it were easier for more users with programming skills to put those skills to good use. Alas, that does not fit into the corporate IT model, and not even the CS grads running many of these IT operations realize or care what is possible.

38. Employers care about which courses they took.

It's the time of year when students register for spring semester courses, so I've been meeting with a lot of students. (Twice as many as usual, covering for a colleague on sabbatical.) It's interesting to encounter students on both ends of the continuum between not caring at all what courses they take and caring a bit too much. The former are so incurious I wonder how they fell into the major at all. The latter are often more curious but sometimes are captive to the idea that they must, must, must take a specific course, even if it meets at a time they can't attend or is full by the time they register.

I do my best to help them get into these courses, either this spring or in a later semester, but I also try to do a little teaching along the way. Students will learn useful and important things in just about every course they take, if they want to, and taking any particular course does not have to be either the beginning or the end of their learning of that topic. And if the reason they think they must take a particular course is because future employers will care, they are going to be surprised. Most of the employers who interview our students are looking for well-rounded CS grads who have a solid foundation in the discipline and who can learn new things as needed.

90. Two people with a CS degree will have a very similar background and shared experience/knowledge.

This falsehood operates in a similar space to #38, but at the global level I reached at the end of my previous paragraph. Even students who take most of the same courses together will usually end their four years in the program with very different knowledge and experiences. Students connect with different things in each course, and these idiosyncratic memories build on one another in subsequent courses. They participate in different extracurricular activities and work different part-time jobs, both of which shape and augment what they learn in class.

In the course of advising students over two, three, or four years, I try to help them see that their studies and other experiences are helping them to become interesting people who know more than they realize and who are individuals, different in some respects from all their classmates. They will be able to present themselves to future employers in ways that distinguish them from everyone else. That's often the key to getting the job they desire now, or perhaps one they didn't even realize they were preparing for while exploring new ideas and building their skillsets.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

September 13, 2019 3:12 PM

How a government boondoggle paved the way for the expansion of computing

In an old interview at Alphachatterbox, economist Brad DeLong adds another programming tale to the annals of unintended consequences:

So the Sage Air Defense system, which never produced a single usable line of software running on any piece of hardware -- we spent more on the Sage Air Defense System than we did on the entire Manhattan Project. And it was in one sense the ultimate government Defense Department boondoggle. But on the other hand it trained a whole generation of computer programmers at a time when very little else was useful that computer programmers could exercise their skills on.
And by the time the 1960s rolled around we not only ... the fact that Sage had almost worked provided say American Airlines with the idea that maybe they should do a computer-driven reservations system for their air travel, which I think was the next big Manhattan Project-scale computer programming project.
And as that moved on the computer programmers began finding more and more things to do, especially after IBM developed its System 360.
And we were off and running.

As DeLong says earlier in the conversation, this development upended IBM president Thomas Watson's alleged claim that there was "a use for maybe five computers in the world". This famous quote is almost certainly an urban legend, but Watson would not have been as off-base as people claim even if he had said it. In the 1950s, there was not yet a widespread need for what computers did, precisely because most people did not yet understand how computing could change the landscape of every activity. Training a slew of programmers for a project that ultimately failed had the unexpected consequence of creating the intellectual and creative capital necessary to begin exploring the ubiquitous applications of computing. Money unexpectedly well spent.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 12, 2019 3:57 PM

Pain and Shame

Today's lecture notes for my course include a link to @KentBeck's article on Prune, which I still enjoy.

The line that merits its link in today's session is:

We wrote an ugly, fragile state machine for our typeahead, which quickly became a source of pain and shame.

My students will soon likely experience those emotions about the state machines they are building for lexers in their semester-long compiler project. I reassure them: These emotions are normal for programmers.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

August 30, 2019 4:26 PM

Unknown Knowns and Explanation-Based Learning

Like me, you probably see references to this classic quote from Donald Rumsfeld all the time:

There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say, we know there are some things we do not know. But there are also unknown unknowns -- the ones we don't know we don't know.

I recently ran across it again in an old Epsilon Theory post that uses it to frame the difference between decision-making under risk (the known unknowns) and decision-making under uncertainty (the unknown unknowns). It's a good read.

Seeing the passage again for the umpteenth time, it occurred to me that no one ever seems to talk about the fourth quadrant in that grid: the unknown knowns. A quick web search turns up a few articles such as this one, which consider unknown knowns from the perspective of others in a community: maybe there are other people who know something that you do not. But my curiosity was focused on the first-person perspective that Rumsfeld was implying. As a knower, what does it mean for something to be an unknown known?

My first thought was that this combination might not be all that useful in the real world, such as the investing context that Ben Hunt writes about in Epsilon Theory. Perhaps it doesn't make any sense to think about things you don't know that you know.

As a student of AI, though, I suddenly made an odd connection ... to explanation-based learning. As I described in a blog post twelve years ago:

Back when I taught Artificial Intelligence every year, I used to relate a story from Russell and Norvig when talking about the role knowledge plays in how an agent can learn. Here is the quote that was my inspiration, from Pages 687-688 of their 2nd edition:

Sometimes one leaps to general conclusions after only one observation. Gary Larson once drew a cartoon in which a bespectacled caveman, Zog, is roasting his lizard on the end of a pointed stick. He is watched by an amazed crowd of his less intellectual contemporaries, who have been using their bare hands to hold their victuals over the fire. This enlightening experience is enough to convince the watchers of a general principle of painless cooking.

I continued to use this story long after I had moved on from this textbook, because it is a wonderful example of explanation-based learning.

In a mathematical sense, explanation-based learning isn't learning at all. The new fact that the program learns follows directly from other facts and inference rules already in its database. In EBL, the program constructs a proof of a new fact and adds the fact to its database, so that it is ready-at-hand the next time it needs it. The program has compiled a new fact, but in principle it doesn't know anything more than it did before, because it could always have deduced that fact from things it already knows.

As I read the Epsilon Theory article, it struck me that EBL helps a learner to surface unknown knowns by using specific experiences as triggers to combine knowledge it already has into a piece of knowledge that is usable immediately without having to repeat the (perhaps costly) chain of inference ever again. Deducing deep truths every time you need them can indeed be quite costly, as anyone who has ever looked at the complexity of search in logical inference systems can tell you.
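The "compile a fact you could already deduce" idea can be sketched in a few lines of Python. This is a toy of my own devising, not a real EBL system: it works only with propositional facts, assumes the rules contain no cycles, and skips the step in which EBL generalizes the explanation. The class name, the rule format, and the Zog-inspired facts are all invented for the illustration.

```python
class KnowledgeBase:
    """Facts plus rules of the form (premises, conclusion)."""

    def __init__(self, facts, rules):
        self.facts = set(facts)
        self.rules = rules
        self.inferences = 0              # how many rule applications we tried

    def prove(self, goal):
        if goal in self.facts:           # a known known: simple lookup
            return True
        for premises, conclusion in self.rules:
            if conclusion == goal:
                self.inferences += 1
                if all(self.prove(p) for p in premises):
                    self.facts.add(goal) # surface the unknown known
                    return True
        return False


kb = KnowledgeBase(
    facts={"stick is long", "stick is pointed"},
    rules=[
        (("stick is pointed",), "stick holds food"),
        (("stick is long",), "hand stays out of fire"),
        (("stick holds food", "hand stays out of fire"), "painless cooking"),
    ],
)
print(kb.prove("painless cooking"), kb.inferences)
print(kb.prove("painless cooking"), kb.inferences)
```

The first query pays for the chain of inference; the second finds the fact ready at hand, so the inference counter does not move.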

When I begin to think about unknown knowns in this way, perhaps it does make sense in some real-world scenarios to think about things you don't know you know. If I can figure it all out, maybe I can finally make my fortune in the stock market.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

August 08, 2019 2:42 PM

Encountering an Old Idea Three Times in Ten Days

I hope to eventually write up a reflection on my first Dagstuhl seminar, but for now I have a short story about how I encountered a new idea three times in ten days, purely by coincidence. Actually, the idea is over one hundred fifty years old but, as my brother often says, "Hey, it's new to me."

On the second day of Dagstuhl, Mark Guzdial presented a poster showing several inspirations for his current thinking about task-specific programming languages. In addition to displaying screenshots of two cool software tools, the poster included a picture of an old mechanical device that looked both familiar and strange. Telegraphy had been invented in the early 1840s, and telegraph operators needed some way to type messages. But how? The QWERTY keyboard was not created for the typewriter until the early 1870s, and no other such devices were in common use yet. To meet the need, Royal Earl House adapted a portion of a piano keyboard to create the input device for the "printing telegraph", or teleprinter. The photo on Mark's poster looked similar to the one on the Wikipedia page for the teleprinter.

There was a need for a keyboard thirty years before anyone designed a standard typing interface, so telegraphers adapted an existing tool to fit their needs. What if we are in that same thirty-year gap in the design of programming languages? This has been one of Mark's inspirations as he works with non-computer scientists on task-specific programming languages. I had never seen an 1870s teleprinter before and thought its keyboard to be a rather ingenious way to solve a very specific problem with a tool borrowed from another domain.

When Dagstuhl ended, my wife and I spent another ten days in Europe on a much-needed vacation. Our first stop was Paris, and on our first full day there we visited the museum of the Conservatoire National des Arts et Métiers. As we moved into the more recent exhibits of the museum, what should I see but...

a Hughes teleprinter with piano-style keyboard, circa 1975, in the CNAM museum, Paris

... a Hughes teleprinter with piano-style keyboard, circa 1975. Déjà vu! I snapped a photo, even though the device was behind glass, and planned to share it with Mark when I got home.

We concluded our vacation with a few days in Martinici, Montenegro, the hometown of a department colleague and his wife. They still have a lot of family in the old country and spend their summers there working and relaxing. On our last day in this beautiful country, we visited its national historical museum, which is part of the National Museum of Montenegro in the royal capital of Cetinje. One of the country's most influential princes was a collector of modern technology, and many of his artifacts are in the museum -- including:

a teleprinter with piano-style keyboard in the Historical Museum of Montenegro, Cetinje

This full-desk teleprinter was close enough to touch and examine up close. (I didn't touch!) The piano keyboard on the device shows the wear of heavy use, which brings to mind each of my laptops' keyboards after a couple of years. Again, I snapped a photo, this time in fading light, and made a note to pass it on.

In ten days, I went from never having heard much about a "printing telegraph" to seeing a photo of one, hearing how it is an inspiration for research in programming language design, and then seeing two such devices that had been used in the 19th-century heyday of telegraphy. It was an unexpected intersection of my professional and personal lives. I must say, though, that having heard Mark's story made the museum pieces leap into my attention in a way that they might not have otherwise. The coincidence added a spark to each encounter.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

August 02, 2019 2:48 PM

Programming is an Infinite Construction Kit

As so often, Marvin Minsky loved to tell us about the beauty of programming. Kids love to play with construction sets like Legos, TinkerToys, and Erector sets. Programming provides an infinite construction kit: you never run out of parts!

In the linked essay, which was published as a preface to a 1986 book about Logo, Minsky tells several stories. One of the stories relates that once, as a small child, he built a large tower out of TinkerToys. The grownups who saw it were "terribly impressed". He inferred from their reaction that:

some adults just can't understand how you can build whatever you want, so long as you don't run out of sticks and spools.

Kids get it, though. Why do so many of us grow out of this simple understanding as we get older? Whatever its cause, this gap between children's imaginations and the imaginations of adults around them creates a new sort of problem when we give the children a programming language such as Logo or Scratch. Many kids take to these languages just as they do to Legos and TinkerToys: they're off to the races making things, limited only by their expansive imaginations. The memory on today's computers is so large that children never run out of raw material for writing programs. But adults often don't possess the vocabulary for talking with the children about their creations!

... many adults just don't have words to talk about such things -- and maybe, no procedures in their heads to help them think of them. They just do not know what to think when little kids converse about "representations" and "simulations" and "recursive procedures". Be tolerant. Adults have enough problems of their own.

Minsky thinks there are a few key ideas that everyone should know about computation. He highlights two:

Computer programs are societies. Making a big computer program is putting together little programs.

Any computer can be programmed to do anything that any other computer can do--or that any other kind of "society of processes" can do.

He explains the second using ideas pioneered by Alan Turing and long championed in the popular sphere by Douglas Hofstadter. Check out this blog post, which reflects on a talk Hofstadter gave at my university celebrating the Turing centennial.

The inability of even educated adults to appreciate computing is a symptom of a more general problem. As Minsky says toward the end of his essay, people who don't appreciate how simple things can grow into entire worlds are missing something important. If you don't understand how simple things can grow into complex systems, it's hard to understand much at all about modern science, including how quantum mechanics accounts for what we see in the world and even how evolution works.

You can usually do well by reading Minsky; this essay is a fine example of that. It comes linked to an afterword written by Alan Kay, another computer scientist with a lot to say about both the beauty of computing and its essential role in a modern understanding of the world. Check both out.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

July 05, 2019 12:40 PM

A Very Good Reason to Leave Your Home and Move to a New Country

He applied to switch his major from mathematics to computer science, but the authorities forbade it. "That is what tipped me to accept the idea that perhaps Russia is not the best place for me," he says. "When they wouldn't allow me to study computer science."

-- Sergey Aleynikov, as told to Michael Lewis and reported in Chapter 5 of Flash Boys.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

June 21, 2019 2:35 PM

Computing Everywhere, Sea Mammal Edition

In The Narluga Is a Strange Beluga-Narwhal Hybrid, Ed Yong tells the story of a narluga, the offspring of a beluga father and a narwhal mother:

Most of its DNA was a half-and-half mix between the two species, but its mitochondrial DNA -- a secondary set that animals inherit only from their mothers -- was entirely narwhal.

This strange hybrid had a mouth and teeth unlike either of its parents, the product of an unexpected DNA computation:

It's as if someone took the program for creating a narwhal tusk and ran it in a beluga's mouth.

The analogy to software doesn't end there, though...

There's something faintly magical about that. This fluky merger between two species ended up with a mouth that doesn't normally exist in nature but still found a way of using it. It lived neither like a beluga nor a narwhal, but it lived nonetheless.

Fluky and abnormal; a one-off, yet it adapts and survives. That sounds like a lot of the software I've used over the years and, if I'm honest, like some of the software I've written, too.

That said, nature is amazing.

Posted by Eugene Wallingford | Permalink | Categories: Computing

June 20, 2019 3:51 PM

Implementing a "Read Lines" Operator in Joy

I wasn't getting any work done today on my to-do list, so I decided to write some code.

One of my learning exercises to open the Summer of Joy is to solve the term frequency problem from Crista Lopes's Exercises in Programming Style. Joy is a little like Scheme: it has a lot of cool operations, especially higher-order operators, but it doesn't have much in the way of practical level tools for basic tasks like I/O. To compute term frequencies on an arbitrary file, I need to read the file onto Joy's stack.

I played around with Joy's low-level I/O operators for a while and built a new operator called readfile, which expects the pathname for an input file on top of the stack:

    DEFINE readfile ==
        (* 1 *)  [] swap "r" fopen
        (* 2 *)  [feof not] [fgets swap swonsd] while
        (* 3 *)  fclose.

The first line leaves an empty list and an input stream object on the stack. Line 2 reads lines from the file and conses them onto the list until it reaches EOF, leaving a list of lines under the input stream object on the stack. The last line closes the stream and pops it from the stack.

This may not seem like a big deal, but I was beaming when I got it working. First of all, this is my first while in Joy, which requires two quoted programs. Second, and more meaningful to me, the loop body not only works in terms of the dip idiom I mentioned in my previous post, it even uses the higher-order swonsd operator to implement the idiom. This must be how I felt the first time I mapped an anonymous lambda over a list in Scheme.

readfile leaves a list of lines on the stack. Unfortunately, the list is in reverse order: the last line of the file is the front of the list. Besides, given that Joy is a stack-based language, I think I'd like to have the lines on the stack itself. So I noodled around some more and implemented the operator pushlist:

    DEFINE pushlist ==
        (* 1 *)  [ null not ] [ uncons ] while
        (* 2 *)  pop.

Look at me... I get one loop working, so I write another. The loop on Line 1 iterates over a list, repeatedly taking (head . tail) and pushing head and tail onto the stack in that order. Line 2 pops the empty list after the loop terminates. The result is a stack with the lines from the file in order, first line on top:

    line-n ... line-3 line-2 line-1

Put readfile and pushlist together:

    DEFINE fileToStack == readfile pushlist.

and you get fileToStack, something like Python's readlines() function, but in the spirit of Joy: the file's lines are on the stack ready to be processed.
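For readers who don't speak Joy, the data flow of the two operators can be modeled in Python with a plain list as the stack, top of stack at the end of the list. This is only a model of the stack shuffling described above, not a translation of Joy's I/O operators; the function names mirror the Joy ones, and reading lines with a `for` loop is my simplification of the feof/fgets loop.

```python
def readfile(stack):
    """Pop a pathname; push a list of the file's lines, last line first."""
    path = stack.pop()
    lines = []
    with open(path) as source:
        for line in source:
            lines.insert(0, line)        # cons each new line onto the front
    stack.append(lines)


def pushlist(stack):
    """Replace the list on top of the stack with its items, one by one."""
    while stack[-1]:                     # [ null not ]
        items = stack.pop()
        stack.append(items[0])           # uncons: push the head ...
        stack.append(items[1:])          # ... and then the tail
    stack.pop()                          # discard the empty list


def file_to_stack(stack):
    readfile(stack)
    pushlist(stack)
```

Because readfile builds its list back to front and pushlist unloads it front to back, the two reversals cancel and the file's first line ends up on top of the stack.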

I'll admit that I'm pleased with myself, but I suspect that this code can be improved. Joy has a lot of dandy higher-order operators. There is probably a better way to implement pushlist and maybe even readfile. I won't be surprised if there is a more idiomatic way to implement the two that makes the two operations plug together with less rework. And I may find that I don't want to leave bare lines of text on the stack after all and would prefer having a list of lines. Learning whether I can improve the code, and how, are tasks for another day.

My next job for solving the term frequency problem is to split the lines into individual words, canonicalize them, and filter out stop words. Right now, all I know is that I have two more functions in my toolbox, I learned a little Joy, and writing some code made my day better.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

June 11, 2019 3:04 PM

Summer of Joy

"Elementary" ideas are really hard & need to be revisited
& explored & re-revisited at all levels of mathematical
sophistication. Doing so actually moves math forward.

-- James Tanton

Three summers ago, I spent a couple of weeks re-familiarizing myself with the concatenative programming language Joy and trying to go a little deeper with the style. I even wrote a few blog entries, including a few quick lessons I learned in my first week with the language. Several of those lessons hold up, but please don't look at the code linked there; it is the raw code of a beginner who doesn't yet get the idioms of the style or the language. Then other duties at work and home pulled me away, and I never made the time to get back to my studies.

my Summer of Joy folder

I have dubbed this the Summer of Joy. I can't devote the entire summer to concatenative programming, but I'm making a conscious effort to spend a couple of days each week in real study and practice. After only one week, I have created enough forward momentum that I think about problems and solutions at random times of the day, such as while walking home or making dinner. I think that's a good sign.

An even better sign is that I'm starting to grok some of the idioms of the style. Joy is different from other concatenative languages like Forth and Factor, but it shares the mindset of using stack operators effectively to shape the data a program uses. I'm finally starting to think in terms of dip, an operator that enables a program to manipulate data just below the top of the stack. As a result, a lot of my code is getting smaller and beginning to look like idiomatic Joy. When I really master dip and begin to think in terms of other "dipping" operators, I'll know I'm really on my way.

One of my goals for the summer is to write a Joy compiler from scratch that I can use as a demonstration in my fall compiler course. Right now, though, I'm still in Joy user mode and am getting the itch for a different sort of language tool... As my Joy skills get better, I find myself refactoring short programs I've written in the past. How can I be sure that I'm not breaking the code? I need unit tests!

So my first bit of tool building is to implement a simple JoyUnit. As a tentative step in this direction, I created the simplest version of RackUnit's check-equal? function possible:

    DEFINE check-equal == [i] dip i =.

This operator takes two quoted programs (a test expression and an expected result), executes them, and compares the results. For example, this test exercises a square function:

    [ 2 square ] [ 4 ] check-equal.
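
For readers who don't speak Joy, here is a rough Python model of how that definition plays out on the stack. It is a sketch under assumptions: the stack is a Python list with its top at the end, a quoted program is modeled as a Python function that takes the stack, and the helpers i, dip, and equal are simplified stand-ins for the Joy operators of the same names, not a real Joy implementation.

```python
def i(stack):
    """Model of Joy's i: pop a quoted program and run it."""
    program = stack.pop()
    program(stack)


def dip(stack):
    """Model of Joy's dip: pop a program and the item below it,
    run the program on the rest of the stack, then restore the item."""
    program = stack.pop()
    saved = stack.pop()
    program(stack)
    stack.append(saved)


def equal(stack):
    """Model of Joy's =: replace the top two items with a boolean."""
    right = stack.pop()
    left = stack.pop()
    stack.append(left == right)


def check_equal(stack):
    """Model of: DEFINE check-equal == [i] dip i =."""
    stack.append(i)   # [i]
    dip(stack)        # run the test expression beneath the expected one
    i(stack)          # run the expected expression
    equal(stack)      # compare the two results


def square(stack):
    n = stack.pop()
    stack.append(n * n)


def two_square(stack):        # stands in for the quotation [ 2 square ]
    stack.append(2)
    square(stack)


def four(stack):              # stands in for the quotation [ 4 ]
    stack.append(4)


stack = [two_square, four]
check_equal(stack)
print(stack)   # [True]
```

The dip step is what lets the test expression run while the expected expression waits, untouched, just above it.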

This is, of course, only the beginning. Next I'll add a message to display when a test fails, so that I can tell at a glance which tests have failed. Eventually I'll want my JoyUnit to support tests as objects that can be organized into suites, so that their results can be tallied, inspected, and reported on. But for now, YAGNI. With even a few simple functions like this one, I am able to run tests and keep my code clean. That's a good feeling.

To top it all off, implementing JoyUnit will force me to practice writing Joy and push me to extend my understanding while growing the set of programming tools I have at my disposal. That's another good feeling, and one that might help me keep my momentum as a busy summer moves on.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

May 07, 2019 11:15 AM

A PL Design Challenge from Alan Kay

In an answer on Quora from earlier this year:

There are several modern APL-like languages today -- such as J and K -- but I would criticize them as being too much like the classic APL. It is possible to extract what is really great from APL and use it in new language designs without being so tied to the past. This would be a great project for some grad students of today: what does the APL perspective mean today, and what kind of great programming language could be inspired by it?

The APL perspective was more radical even twenty years ago, before MapReduce became a thing and before functional programming ascended. When I was an undergrad, though, it seemed otherworldly: setting up a structure, passing it through a sequence of operators that changed its shape, and then passing it through a sequence of operators that folded up a result. We knew we weren't programming in Fortran anymore.

I'm still fascinated by APL, but I haven't done a lot with it in the intervening years. These days I'm still thinking about concatenative programming in languages like Forth, Factor, and Joy, a project I reinitiated (and last blogged about) three summers ago. Most concatenative languages work with an implicit stack, which gives them a very different feel from APL's dataflow style. I can imagine, though, that working in the concision and abstraction of concatenative languages for a while will spark my interest in diving back into APL-style programming some day.

Kay's full answer is worth a read if only for the story in which he connects Iverson's APL notation, and its effect on how we understand computer systems, to the evolution of Maxwell's equations. Over the years, I've heard Kay talk about McCarthy's Lisp interpreter as akin to Maxwell's equations, too. In some ways, the analogy works even better with APL, though it seems that the lessons of Lisp have had a larger historical effect to date.

Perhaps that will change? Alas, as Kay says in the paragraph that precedes his challenge:

As always, time has moved on. Programming language ideas move much slower, and programmers move almost not at all.

Kay often comes off as pessimistic, but after all the computing history he has lived through (and created!), he has earned whatever pessimism he feels. As usual, reading one of his essays makes me want to buckle down and do something that would make him proud.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

April 29, 2019 2:42 PM

The Path to Nothing

Dick Gabriel writes, in Lessons From The Science of Nothing At All:

Nevertheless, the spreadsheet was something never seen before. A chart indicating the 64 greatest events in accounting and business history contains VisiCalc.

This reminds me of a line from The Tao of Pooh:

Take the path to Nothing, and go Nowhere until you reach it.

A lot of research is like this, but even more so in computer science, where the things we produce are generally made out of nothing. Often, like VisiCalc, they aren't really like anything we've ever seen or used before either.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

April 16, 2019 3:40 PM

The Importance of Giving Credit in Context

From James Propp's Prof. Engel's Marvelously Improbable Machines:

Chip-firing has been rediscovered independently in three different academic communities: mathematics, physics, and computer science. However, its original discovery by Engel is in the field of math education, and I strongly feel that Engel deserves credit for having been the first to slide chips around following these sorts of rules. This isn't just for Engel's sake as an individual; it's also for the sake of the kind of work that Engel did, blending his expertise in mathematics with his experience in the classroom. We often think of mathematical sophistication as something that leads practitioners to create concepts that can only be understood by experts, but at the highest reaches of mathematical research, there's a love of clarity that sees the pinnacle of sophistication as being the achievement of hard-won simplicity in settings where before there was only complexity.

First of all, Petri nets! I encountered Petri nets for the first time in a computer architecture course, probably as a master's student, and it immediately became my favorite thing about the course. I was never much into hardware and architecture, but Petri nets showed me a connection back to graph theory, which I loved. Later, I studied how to apply temporal logic to modeling hardware and found another way to appreciate my architecture courses.

But I really love the point that Propp makes in this paragraph and the section it opens. Most people think of research and teaching as being different sorts of activities. But the kind of thinking one does in one often crosses over into the other. The sophistication that researchers have and use helps us make sense of complex ideas and, at its best, helps us communicate that understanding to a wide audience, not just to researchers at the same level of sophistication. The focus that teachers put on communicating challenging ideas to relative novices can encourage us to seek new formulations for a complex idea and ways to construct more complex ideas out of the new formulations. Sometimes, that can lead to an insight we can use in research.

In recent years, my research has benefited a couple of times from trying to explain and demonstrate concatenative programming, as in Forth and Joy, to my undergraduate students. These haven't been breakthroughs of the sort that Engel made with his probability machines, but they've certainly helped me grasp in new ways ideas I'd been struggling with.

Propp argues convincingly that it's important that we tell stories like Engel's and recognize that his breakthrough came as a result of his work in the classroom. This might encourage more researchers to engage as deeply with their teaching as with their research. Everyone will benefit.

Do you know any examples similar to the one Propp relates, but in the field of computer science? If so, I would love to hear about them. Drop me a line via email or Twitter.

Oh, and if you like Petri nets, probability, or fun stories about teaching, do read Propp's entire piece. It's good fun and quite informative.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 10, 2019 10:53 AM

Weekend Shorts

Andy Ko, in SIGCSE 2019 report:

I always have to warn my students before they attend SIGCSE that it's not a place for deep and nuanced discussions about learning, nor is it a place to get critical feedback about their ideas.
It is, however, a wonderful place to be immersed in the concerns of CS teachers and their perceptions of evidence.

I'm not sure I agree that one can't have deep, nuanced discussions about learning at SIGCSE, but it certainly is not a research conference. It is a great place to talk to and learn from people in the trenches teaching CS courses, with a strong emphasis on the early courses. I have picked up a lot of effective, creative, and inspiring ideas at SIGCSE over the years. Putting them onto sure scientific footing is part of my job when I get back.


Stephen Kell, in Some Were Meant for C (PDF), an Onward! 2017 essay:

Unless we can understand the real reasons why programmers continue to use C, we risk researchers continuing to solve a set of problems that is incomplete and/or irrelevant, while practitioners continue to use flawed tools.

For example,

... "faster safe languages" is seen as the Important Research Problem, not better integration.

... whereas Kell believes that C's superiority in the realm of integration is one of the main reasons that C remains a dominant, essential systems language.

Even with the freedom granted by tenure, academic culture tends to restrict what research gets done. One cause is a desire to publish in the best venues, which encourages work that is valued by certain communities. Another reason is that academic research tends to attract people who are interested in a certain kind of clean problem. CS isn't exactly "round, spherical chickens in a vacuum" territory, but... Language support for system integration, interop, and migration can seem like a grungier sort of work than most researchers envisioned when they went to grad school.

"Some Were Meant for C" is an elegant paper, just the sort of work, I imagine, that Richard Gabriel had in mind when he envisioned the essays track at Onward. Well worth a read.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

February 28, 2019 4:29 PM

Ubiquitous Distraction

This morning, while riding the exercise bike, I read two items within twenty minutes or so that formed a nice juxtaposition for our age. First came The Cost of Distraction, an old blog post by L.M. Sacasas that reconsiders Kurt Vonnegut's classic story, "Harrison Bergeron" (*). In the story, it is 2081, and the Handicapper General of the United States ensures equality across the land by offsetting any advantages any individual has over the rest of the citizenry. In particular, those of above-average intelligence are required to wear little earpieces that periodically emit high-pitched sounds to obliterate any thoughts in progress. The mentally- and physically-gifted Harrison rebels, to an ugly end.

Soon after came Ian Bogost's Apple's AirPods Are an Omen, an article from last year that explores the cultural changes that are likely to ensue as more and more people wear AirPods and their ilk. ("Apple's most successful products have always done far more than just make money, even if they've raked in a lot of it....") AirPods free the wearer in so many ways, but they also bind us to ubiquitous distraction. Will we ever have a free moment to think deeply when our phones and laptops now reside in our heads?

As Sacasas says near the end of his post,

In the world of 2081 imagined by Vonnegut, the distracting technology is ruthlessly imposed by a government agency. We, however, have more or less happily assimilated ourselves to a way of life that provides us with regular and constant distraction. We have done so because we tend to see our tools as enhancements.

Who needs a Handicapper General when we all walk down to the nearest Apple Store or Best Buy and pop distraction devices into our own ears?

Don't get me wrong. I'm a computer scientist, and I love to program. I also love the productivity my digital tools provide me, as well as the pleasure and comfort they afford. I'm not opposed to AirPods, and I may be tempted to get a pair someday. But there's a reason I don't carry a smart phone and that the only iPod I've ever owned is a 1GB first-gen Shuffle. Downtime is valuable, too.

(*) By now, even occasional readers know that I'm a big Vonnegut fan who wrote a short eulogy on the occasion of his death, nearly named this blog after one of his short stories, and returns to him frequently.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

December 29, 2018 4:41 PM

No Big Deal

I love this line from Organizational Debt:

So my proposal for Rust 2019 is not that big of a deal, I guess: we just need to redesign our decision making process, reorganize our governance structures, establish new norms of communication, and find a way to redirect a significant amount of capital toward Rust contributors.

A solid understatement usually makes me smile. Decision-making processes, governance structure, norms of communication, and compensation for open-source developers... no big deal, indeed. We all await the results. If the results come with advice that generalizes beyond a single project, especially the open-source compensation thing, all the better.

Communication is a big part of the recommendation for 2019. Changing how communication works is tough in any organization, let alone an organization with distributed membership and leadership. In every growing organization there eventually comes the time for intentional systems of communication:

But we've long since reached the point where coordinating our design vision by osmosis is not working well. We need an active and intentional circulatory system for information, patterns, and frameworks of decision making related to design.

I'm not a member of the Rust community, only an observer. But I know that the language inspires some programmers, and I learned a bit about its tool chain and community support a couple of years ago when an ambitious student used it successfully to implement his compiler in my course. It's the sort of language we need, being created in what looks to be an admirable way. I wish the Rust team well as they tackle their organizational debt and work through their growing pains.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

December 24, 2018 2:55 PM

Using a Text Auto-Formatter to Enhance Human Communication

More consonance with Paul Romer, via his conversation with Tyler Cowen: They were discussing how much harder it is to learn to read English than other languages, due to its confusing orthography and in particular the mismatch between sounds and their spellings. We could adopt a more rational way to spell words, but it's hard to change the orthography of a large language spoken by a large, scattered population. Romer offered a computational solution:

It would be a trivial translation problem to let some people write in one spelling form, others in the other because it would be word-for-word translation. I could write you an email in rationalized spelling, and I could put it through the plug-in so you get it in traditional spelling. This idea that it's impossible to change spelling I think is wrong. It's just, it's hard, and we should -- if we want to consider this -- we should think carefully about the mechanisms.

This sounds similar to a common problem and solution in the software development world. Programmers working in teams often disagree about the orthography of code, not the spelling so much as its layout, the use of whitespace, and the placement of punctuation. Being programmers, we often address this problem computationally. Team members can stylize their code any way they see fit but, when they check it into the common repository, they run it through a language formatter. Often, these formatters are built into our IDEs. Nowadays, some languages even come with a built-in formatting tool, such as Go and gofmt.

Romer's email plug-in would play a similar role in human-to-human communication, enabling writers to use different spelling systems concurrently. This would make it possible to introduce a more rational way to spell words without having to migrate everyone to the new system all at once. There are still challenges to making such a big change, but they could be handled in an evolutionary way.
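
A word-for-word translator of this kind is easy to sketch. The Python below is only an illustration of the idea, not anything Romer described in detail: the function name to_traditional is made up, and the three "rationalized" spellings in the table are invented for the example. A real plug-in would need a full dictionary and would have to handle words that share a rationalized spelling.

```python
import re

# Hypothetical rationalized -> traditional spellings, invented for illustration.
TRADITIONAL = {
    "thru": "through",
    "tho": "though",
    "nite": "night",
}


def to_traditional(text, table=TRADITIONAL):
    """Translate word for word, leaving unknown words and punctuation alone."""

    def swap(match):
        word = match.group(0)
        key = word.lower()
        if key not in table:
            return word
        replacement = table[key]
        # Preserve a leading capital letter, e.g. at the start of a sentence.
        return replacement.capitalize() if word[0].isupper() else replacement

    return re.sub(r"[A-Za-z]+", swap, text)


print(to_traditional("Tho it was late, we drove thru the nite."))
# Though it was late, we drove through the night.
```

Inverting the table gives the translation in the other direction, which is what would let two writers use different spelling systems in the same conversation.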

Maybe Romer's study of Python is turning him into a computationalist! Certainly, being a programmer can help a person recognize the possibility of a computational solution.

Add this idea to his recent discovery of C.S. Peirce, and I am feeling some intellectual kinship to Romer, at least as much as an ordinary CS prof can feel kinship to a Nobel Prize-winning economist. Then, to top it all off, he lists Slaughterhouse-Five as one of his two favorite novels. Long-time readers know I'm a big Vonnegut fan and nearly named this blog for one of his short stories. Between Peirce and Vonnegut, I can at least say that Romer and I share some of the same reading interests. I like his tastes.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

December 23, 2018 10:45 AM

The Joy of Scholarship

This morning I read Tyler Cowen's conversation with Paul Romer. At one point, Romer talks about being introduced to C.S. Peirce, who had deep insights into "abstraction and how we use abstraction to communicate" (a topic Romer and Cowen discuss earlier in the interview). Romer is clearly enamored with Peirce's work, but he's also fascinated by the fact that, after a long career thinking about a set of topics, he could stumble upon a trove of ideas that he didn't even know existed:

... one of the joys of reading -- that's not a novel -- but one of the joys of reading, and to me slightly frightening thing, is that there's so much out there, and that a hundred years later, you can discover somebody who has so many things to say that can be helpful for somebody like me trying to understand, how do we use abstraction? How do we communicate clearly?

But the joy of scholarship -- I think it's a joy of maybe any life in the modern world -- that through reading, we can get access to the thoughts of another person, and then you can sample from the thoughts that are most relevant to you or that are the most powerful in some sense.

This process, he says, is the foundation for how we transmit knowledge within a culture and across time. It's how we grow and share our understanding of the world. This is a source of great joy for scholars and, really, for anyone who can read. It's why so many people love books.

Romer's interest in Peirce calls to mind my own fascination with his work. As Romer notes, Peirce had a "much more sophisticated sense about how science proceeds than the positivist sort of machine that people describe". I discovered Peirce through an epistemology course in graduate school. His pragmatic view of knowledge, along with William James's views, greatly influenced how I thought about knowledge. That, in turn, redefined the trajectory by which I approached my research in knowledge-based systems and AI. Peirce and James helped me make sense of how people use knowledge, and how computer programs might.

So I feel a great kinship with Romer in his discovery of Peirce, and the joy he finds in scholarship.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

October 02, 2018 4:04 PM

Strange Loop 5: Day Two

the video screen announcing Philip Wadler's talk

Friday was a long day, but a good one. The talks I saw were a bit more diverse than on Day One: a couple on language design (though even one of those covered a lot more ground than that), one on AI, one on organizations and work-life, and one on theory:

• "All the Languages Together", by Amal Ahmed, discussed a problem that occurs in multi-language systems: when code written in one language invalidates the guarantees made by code written in the other. Most languages are not designed with this sort of interoperability baked in, and their FFI escape hatches make anything possible within foreign code. As a potential solution, Ahmed offered principled escape hatches designed with specific language features in mind. The proposed technique seems like it could be a lot of work, but the research is in its early stages, so we will learn more as she and her students implement the idea.

This talk is yet another example of how so many of our challenges in software engineering are a result of programming language design. It's good to see more language designers taking issues like these seriously, but we have a long way to go.

• I really liked Ashley Williams's talk on the evolution of async in JavaScript and Rust. This kind of talk is right up my alley... Williams invoked philosophy, morality, and cognitive science as she reviewed how two different language communities incorporated asynchronous primitives into their languages. Programming languages are designed, to be sure, but they are also the result of "contingent turns of history" (à la Foucault). Even though this turned out to be more of a talk about the Rust community than I had expected, I enjoyed every minute. Besides, how can you not like a speaker who says, "Yes, sometimes I'll dress up as a crab to teach."?

(My students should not expect a change in my wardrobe any time soon...)

• I also enjoyed "For AI, by AI", by Connor Walsh. The talk's subtitle, "Freedom & Evolution of the Algopoetic Avant-Garde", was a bit disorienting, as was its cold open, but the off-kilter structure of the talk was easy enough to discern once Walsh got going: first, a historical review of humans making computers write poetry, followed by a look at something I didn't know existed... a community of algorithmic poets — programs — that write, review, and curate poetry without human intervention. It's a new thing, of Walsh's creation, that looks pretty cool to someone who became drunk on the promise of AI many years ago.

I saw two other talks the second day:

  • the after-lunch address by Philip Wadler, "Categories for the Working Hacker", which I wrote about separately
  • Rachel Krol's Some Things May Never Get Fixed, about how organizations work and how developers can thrive despite how they work

I wish I had more to say about the last talk but, with commitments at home, the long drive beckoned. So, I departed early, sadly, hopped in my car, headed west, and joined the mass exodus that is St. Louis traffic on a Friday afternoon. After getting past the main crush, I was able to relax a bit with the rest of Zen and the Art of Motorcycle Maintenance.

Even a short day at Strange Loop is a big win. This was the tenth Strange Loop, and I think I've been to five, or at least that's what my blog seems to tell me. It is awesome to have a conference like this in Middle America. We who live here benefit from the opportunities it affords us, and maybe folks in the rest of the world get a chance to see that not all great computing ideas and technology happen on the coasts of the US.

When is Strange Loop 2019?

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

October 01, 2018 7:12 PM

Strange Loop 4: The Quotable Wadler

Philip Wadler is a rockstar to the Strange Loop crowd. His 2015 talk on propositions as types introduced not a few developers to one of computer science's great unities. This year, he returned to add a third idea to what is really a triumvirate: categories. With a little help from his audience, he showed that category theory has elements which correspond directly to ...

  • logical 'and', which models the record (or tuple, or pair) data type
  • logical 'or', which models the union (or variant record) data type
  • a map, which models the function data type

What's more, the product/sum dual models De Morgan's laws, but with more structure, which enables it to model sets beyond the booleans!
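
Here is a small Python sketch of the three correspondences, for readers who think better in code. It is my own illustration, not code from the talk, and the names Pair, Left, Right, and either are chosen for the example. A product holds both of its parts at once, a sum holds exactly one of its alternatives, and a function over a sum needs one handler per alternative.

```python
from dataclasses import dataclass
from typing import Callable, TypeVar, Union

R = TypeVar("R")


@dataclass(frozen=True)
class Pair:
    """Product ('and'): an int and a str, both present at once."""
    first: int
    second: str


@dataclass(frozen=True)
class Left:
    """One alternative of the sum: carries an int."""
    value: int


@dataclass(frozen=True)
class Right:
    """The other alternative of the sum: carries a str."""
    value: str


# Sum ('or'): a value is either a Left or a Right, never both.
Either = Union[Left, Right]


def either(on_left: Callable[[int], R],
           on_right: Callable[[str], R],
           value: Either) -> R:
    """Consume a sum by supplying one function per alternative."""
    if isinstance(value, Left):
        return on_left(value.value)
    return on_right(value.value)


def describe(value: Either) -> str:
    return either(lambda n: f"number {n}", lambda s: f"text {s!r}", value)


pair = Pair(3, "three")
print(pair.first, pair.second)    # 3 three
print(describe(Left(3)))          # number 3
print(describe(Right("three")))   # text 'three'
```

The duality shows up in how the two are used: a product is taken apart with two projections, while a sum is consumed with two handlers.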

Wadler is an entertaining teacher; I recommend the video of his talk! But he is also as quotable as any CS prof I've encountered in a long while. Here is a smattering of his great lines from "Categories for the Working Hacker":

If you can use math to do something, do it. It will make your life better.

That's the great thing about math. It lets you see something obvious after only thirty or forty years.

Pick your favorite algebra. If you don't have one, get one.

Let's do that in Java. That's what you should always do when you learn a new idea: do it in Java.

That's what category theory is really about: avoiding traffic jams.

Sums are the secret origin of folds.

If you don't understand this, I don't mind, because it's Java.

While watching the presentation, I created a one-liner of my own: Surprise! If you do something that matches exactly what Haskell does, Haskell code will be much shorter than Java code.

This was a very good talk; I enjoyed it quite a bit. However, I also left the room with a couple of nagging questions. The talk was titled "Categories for the Working Hacker", and it did a nice job of presenting some basic ideas from category theory in a way that most any developer could understand, even one without much background in math. But... How does this knowledge make one a better hacker? Armed with this new, entertaining knowledge, what are software developers able to do that they couldn't do before?

I have my own ideas for answers to these questions, but I would love to hear Wadler's take.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 30, 2018 6:40 PM

Strange Loop 3: David Schmüdde and the Art of Misuse

the splash screen to open David Schmudde's talk, 'Misuse'

This talk, the first of the afternoon on Day 1, opened with a familiar image: René Magritte's "this is not a pipe" painting, next to a picture of an actual pipe from some e-commerce site. Throughout the talk, speaker David Schmüdde returned to the distinction between thing and referent as he looked at the phenomenon of software users who used -- misused -- software to do something other than intended by the designer. The things they did were, or became, art.

First, a disclaimer: David is a former student of mine, now a friend, and one of my favorite people in the world. I still have in my music carousel a CD labeled "Schmudde Music!!" that he made for me just before he graduated and headed off to a master's program in music at Northwestern.

I often say in my conference reports that I can't do a talk justice in a blog entry, but it's even more true of a talk such as this one. Schmüdde demonstrated multiple works of art, both static and dynamic, which created a vibe that loses most of its zing when linearized in text. So I'll limit myself here to a few stray observations and impressions from the talk, hoping that you'll be intrigued enough to watch the video when it's posted.

Art is a technological endeavor. Rembrandt and hip hop don't exist without advances in art-making technology.

Misuse can be a form of creative experimentation. Check out Jodi, a website created in 1995 and still available. In the browser, it seems to be a work of ASCII art, but show the page source... (That's a lot harder these days than it was in 1995.) Now that is ASCII art.

Schmüdde talked about another work from the same era, entitled Rain. It used slowness -- of the network, of the browser -- as a feature. Old HTML (or was it a bug in an old version of Netscape Navigator?) allowed one HEAD tag in a file with multiple BODY tags. The artist created such a document that, when loaded in sequence, gave the appearance of rain falling in the browser. Misusing the tools under the conditions of the day enabled the artist to create an animation before animated GIFs, Flash, and other forms of animation existed.

The talk followed with examples and demos of other forms of software misuse, which could:

  • find bugs in a system
  • lead to new system features
  • illuminate a system in ways not anticipated by the software's creator

Schmüdde wondered, when we fix bugs in this way, do we make the resulting system, or the resulting interaction, less human?

Accidental misuse is life. We expect it. Intentional misuse is, or can be, art. It can surprise us.

What does art preservation look like for these works? The original hardware and software systems often are obsolete or, more likely, gone. To me, this is one of the great things about computers: we can simulate just about anything. Digital art preservation becomes a matter of simulating the systems or interactions that existed at the time the art was created. We are back to Magritte's pipe... This is not a work of art; it is a pointer to a work of art.

It is, of course, harder to recreate the experience of the art from the time it was created, but isn't this true of all art? Each of us experiences a work of art anew each time we encounter it. Our experience is never the same as the experience of those who were present when the work was first unveiled. It's often not even the same experience we ourselves had yesterday.

Schmüdde closed with a gentle plea to the technologists in the room to allow more art into their process. This is a new talk, and he was a little concerned about his ending. He may find a less abrupt way to end in the future, but to be honest, I thought what he did this time worked well enough for the day.

Even taking my friendship with the speaker into account, this was the talk of the conference for me. It blended software, users, technology, ideas, programming, art, the making of things, and exploring software at its margins. These ideas may appear at the margin, but they often lie at the core of the work. And even when they don't, they surprise us or delight us or make us think.

This talk was a solid example of what makes Strange Loop a great conference every year. There were a couple of other talks this year that gave me a similar vibe, for example, Hannah Davis's "Generating Music..." talk on Day 1 and Ashley Williams's "A Tale of Two asyncs" talk on Day 2. The conference delivers top-notch technical content but also invites speakers who use technology, and explore its development, in ways that go beyond what you find in most CS classrooms.

For me, Day One of the conference ended better than most: over a beer with David at Flannery's with good conversation, both about ideas from his talk and about old times, families, and the future. A good day.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Personal

September 30, 2018 10:31 AM

Strange Loop 2: Simon Peyton Jones on Teaching CS in the Schools

Simon Peyton Jones discusses one of the myths of getting CS into the primary and secondary classroom: it's all about the curriculum

The opening keynote this year was by Simon Peyton Jones of Microsoft Research, well known in the programming languages community for Haskell and many other things. But his talk was about something considerably less academic: "Shaping Our Children's Education in Computing", a ten-year project to reform the teaching of computing in the UK's primary and secondary schools. It was a wonderful talk, full of history, practical advice, lessons learned, and philosophy of computing. Rather than try to summarize everything Peyton Jones said, I will let you watch the video when it is posted (which will be as early as next week, I think).

I would, though, like to highlight one particular part of the talk, the way he describes computer science to a non-CS audience. This is an essential skill for anyone who wants to introduce CS to folks in education, government, and the wider community who often see CS as either hopelessly arcane or as nothing more than a technology or a set of tools.

Peyton Jones characterized computing as being about information, computation, and communication. For each, he shared one or two ways to discuss the idea with an educated but non-technical audience. For example:

  • Information.   Show two images, say the Mona Lisa and a line drawing of a five-pointed star. Ask which contains more information. How can we tell? How can we compare the amounts? How might we write that information down?

  • Computation.   Use a problem that everyone can relate to, such as planning a trip to visit all the US state capitals in the fewest miles or sorting a set of numbers. For the latter, he used one of the activities from CS Unplugged on sorting networks as an example.

  • Communication.   Here, Peyton Jones used the idea underlying the Diffie-Hellman algorithm for sharing a secret as his primary example. It is simple and elegant, yet it's not at all obvious to most people who don't already know it that the problem can be solved at all!

In all three cases, it helps greatly to use examples from many disciplines and to ask questions that encourage the audience to ask their own questions, form their own hypotheses, and create their own experiments. The best examples and questions actually enable people to engage with computing through their own curiosity and inquisitiveness. We are fascinated by computing; other people can be, too.
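For readers who haven't seen the Diffie-Hellman idea, here is a minimal Python sketch of the exchange with toy numbers. The function names, the prime 23, the base 5, and the two private values are illustrative assumptions, not anything from the talk, and numbers this small offer no real security.

```python
def public_value(base, private, modulus):
    """What each party announces in the open: base**private mod modulus."""
    return pow(base, private, modulus)


def shared_secret(their_public, my_private, modulus):
    """What each party computes privately from the other's announcement."""
    return pow(their_public, my_private, modulus)


# Toy parameters, chosen only for illustration.
MODULUS, BASE = 23, 5
alice_private, bob_private = 6, 15

alice_public = public_value(BASE, alice_private, MODULUS)
bob_public = public_value(BASE, bob_private, MODULUS)

# Both sides arrive at the same number without ever sending it.
alice_secret = shared_secret(bob_public, alice_private, MODULUS)
bob_secret = shared_secret(alice_public, bob_private, MODULUS)
```

An eavesdropper sees only the modulus, the base, and the two public values. The two private values never cross the wire, yet both parties end up holding the same secret.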

There is a huge push in the US these days for everyone to learn how to program. This creates a tension among many of us computer scientists, who know that programming isn't everything that we do and that its details can obscure CS as much as they illuminate it. I thought that Peyton Jones used a very nice analogy to express the relationship between programming and CS more broadly: Programming is to computer science as lab work is to physics. Yes, you could probably take lab work out of physics and still have physics, but doing so would eviscerate the discipline. It would also take away a lot of what draws people to the discipline. So it is with programming and computer science. But we have to walk a thin line, because programming is seductive and can ultimately distract us from the ideas that make programming so valuable in the first place.

Finally, I liked Peyton Jones's simple summary of the reasons that everyone should learn a little computer science:

  • Everyone should be able to create digital media, not just consume it.
  • Everyone should be able to understand their tools, not just use them.
  • People should know that technology is not magic.
That last item grows increasingly important in a world where the seeming magic of computers redefines every sector of our lives.

Oh, and yes, a few people will get jobs that use programming skills and computing knowledge. People in government and business love to hear that part.

Regular readers of this blog know that I am a sucker for aphorisms. Peyton Jones dropped a few on us, most earnestly when encouraging his audience to participate in the arduous task of introducing and reforming the teaching of CS in the schools:

  • "If you wait for policy to change, you'll just grow old. Get on with it."
  • "There is no 'them'. There is only us."
(The second of these already had a home in my brain. My wife has surely tired of hearing me say something like it over the years.)

It's easy to admire great researchers who have invested so much time and energy into solving real-world problems, especially in our schools. As long as this post is, it covers only a few minutes from the middle of the talk. My selection and bare-bones outline don't do justice to Peyton Jones's presentation or his message. Go watch the talk when the video goes up. It was a great way to start Strange Loop.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

September 29, 2018 6:19 PM

Strange Loop 1: Day One

the Strange Loop splash screen from the main hall

Last Wednesday morning, I hopped in my car and headed south to Strange Loop 2018. It had been a few years since I'd listened to Zen and the Art of Motorcycle Maintenance on a conference drive, so I popped it into the tapedeck (!) once I got out of town and fell into the story. My top-level goal while listening to Zen was similar to my top-level goal for attending Strange Loop this year: to experience it at a high level; not to get bogged down in so many details that I lost sight of the bigger messages. Even so, though, a few quotes stuck in my mind from the drive down. The first is an old friend, one of my favorite lines from all of literature:

Assembly of Japanese bicycle require great peace of mind.

The other was the intellectual breakthrough that unified Phaedrus's philosophy:

Quality is not an object; it is an event.
This idea has been on my mind in recent months. It seemed a fitting theme, too, for Strange Loop.

On the first day of the conference, I saw mostly a mixture of compiler talks and art talks, including:

• @mraleph's "Six Years of Dart", in which he reminisced on the evolution of the language, its ecosystem, and its JIT. I took at least one cool idea from this talk. When he compared the performance of two JITs, he gave a histogram comparing their relative performances, rather than an average improvement. A new system often does better on some programs and worse on others. An average not only loses information; it may mislead.

• Jason Dagit's "Your Secrets are Safe with Julia", about a system that explores the use of homomorphic encryption to compile secure programs. In this context, the key element of security is privacy. As Dagit pointed out, "trust is not transitive", which is especially important when it comes to sharing a person's health data.

• I just loved Hannah Davis's talk on "Generating Music From Emotion". She taught me about data sonification and its various forms. She also demonstrated some of her attempts to tease multiple dimensions of human emotion out of large datasets and to use these dimensions to generate music that reflects the data's meaning. Very cool stuff. She also showed the short video Dragon Baby, which made me laugh out loud.

• I also really enjoyed "Hackett: A Metaprogrammable Haskell", by Alexis King. I've read about this project on the Racket mailing list for a few years and have long admired King's ability in posts there to present complex ideas clearly and logically. This talk did a great job of explaining that Haskell deserves a powerful macro system like Racket's, that Racket's macro system deserves a powerful type system like Haskell's, and that integrating the two is more challenging than simply adding a stage to the compiler pipeline.

I saw two other talks the first day:

  • the opening keynote address by Simon Peyton Jones, "Shaping Our Children's Education in Computing" [ link ]
  • David Schmüdde, "Misuser" [ link ]
My thoughts on these talks are more extensive and warrant short entries of their own, to follow.

I had almost forgotten how many different kinds of cool ideas I can encounter in a single day at Strange Loop. Thursday was a perfect reminder.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

September 20, 2018 4:44 PM

Special Numbers in a Simple Language

This fall I am again teaching our course in compiler development. Working in teams of two or three, students will implement from scratch a complete compiler for a simple functional language that consists of little more than integers, booleans, an if statement, and recursive functions. Such a language isn't suitable for much, but it works great for writing programs that do simple arithmetic and number theory. In the past, I likened it to an integer assembly language. This semester, my students are compiling a Pascal-like language of this sort that we call Flair.

If you've read my blog much in the falls over the last decade or so, you may recall that I love to write code in the languages for which my students write their compilers. It makes the language seem more real to them and to me, gives us all more opportunities to master the language, and gives us interesting test cases for their scanners, parsers, type checkers, and code generators. In recent years I've blogged about some of my explorations in these languages, including programs to compute Farey numbers and excellent numbers, as well as trying to solve one of my daughter's AP calculus problems.

When I run into a problem, I usually get an itch to write a program, and in the fall I want to write it in my students' language.

Yesterday, I began writing my first new Flair program of the semester. I ran across this tweet from James Tanton, which starts:

N is "special" if, in binary, N has a 1s and b 0s and a & b are each factors of N (so non-zero).

So, 10 is special because:

  • In binary, 10 is 1010.
  • 1010 contains two 1s and two 0s.
  • Two is a factor of 10.

9 is not special because its binary rep also contains two 1s and two 0s, but two is not a factor of 9. 3 is not special because its binary rep has no 0s at all.

My first thought upon seeing this tweet was, "I can write a Flair program to determine if a number is special." And that is what I started to do.

Flair doesn't have loops, so I usually start every new program by mapping out the functions I will need simply to implement the definition. This makes sure that I don't spend much time implementing loops that I don't need. I ended up writing headers and default bodies for three utility functions:

  • convert a decimal number to binary
  • count the number of times a particular digit occurs in a number
  • determine if a number x divides evenly into a number n

With these helpers, I was ready to apply the definition of specialness:

    return divides(count(1, to_binary(n)), n)
       and divides(count(0, to_binary(n)), n)

Calling to_binary on the same argument is wasteful, but Flair doesn't have local variables, either. So I added one more helper to implement the design pattern "Function Call as Variable Assignment", apply_definition:

    function apply_definition(binary_n : integer, n : integer) : boolean
and called it from the program's main:
    return apply_definition(to_binary(n), n)

This is only the beginning. I still have a lot of work to do to implement to_binary, count and divides, using recursive function calls to simulate loops. This is another essential design pattern in Flair-like languages.
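To make the plan concrete, here is a minimal sketch of how the three helpers might look. Flair isn't available to run, so the sketch is in Python, restricted to integers, booleans, conditionals, and recursion. It assumes that to_binary packs the binary digits into a decimal-looking integer (so 10 becomes 1010), which the post doesn't specify, and it assumes n is a positive integer. The name is_special for the main function is hypothetical.

```python
def to_binary(n):
    """Pack the binary digits of n into a decimal-looking integer: 10 -> 1010."""
    if n < 2:
        return n
    return to_binary(n // 2) * 10 + n % 2


def count(digit, number):
    """Count how many times digit occurs among the decimal digits of number."""
    if number < 10:
        return 1 if number == digit else 0
    last_matches = 1 if number % 10 == digit else 0
    return last_matches + count(digit, number // 10)


def divides(x, n):
    """True if x is a non-zero factor of n."""
    return x != 0 and n % x == 0


def apply_definition(binary_n, n):
    """Function Call as Variable Assignment: binary_n is computed only once."""
    return divides(count(1, binary_n), n) and divides(count(0, binary_n), n)


def is_special(n):
    """Stand-in for the program's main (hypothetical name)."""
    return apply_definition(to_binary(n), n)
```

Each recursive call peels off one digit, which is how recursion stands in for a loop here. The divides tests check factors of the decimal n, not of binary_n.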

As I prepared to discuss my new program in class today, I found a bug: My divides test was checking for factors of binary_n, not the decimal n. I also renamed a function and one of its parameters. Explaining my programs to students, a generalization of rubber duck debugging, often helps me see ways to make a program better. That's one of the reasons I like to teach.

Today I asked my students to please write me a Flair compiler so that I can run my program. The course is officially underway.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

September 05, 2018 3:58 PM

Learning by Copying the Textbook

Or: How to Learn Physics, Professional Golfer Edition

Bryson DeChambeau is a professional golfer, in the news recently for consecutive wins in the FedExCup playoff series. But he can also claim an unusual distinction as a student of physics:

In high school, he rewrote his physics textbook.

DeChambeau borrowed the textbook from the library and wrote down everything from the 180-page book into a three-ring binder. He explains: "My parents could have bought one for me, but they had done so much for me in golf that I didn't want to bother them in asking for a $200 book. ... By writing it down myself I was able to understand things on a whole comprehensive level."

I imagine that copying texts word-for-word was a more common learning strategy back when books were harder to come by, and perhaps it will become more common again as textbook prices rise and rise. There is certainly something to be said for it. Writing by hand takes time, and all the while our brains can absorb terms, make connections among concepts, and process the material into long-term memory. Zed Shaw argues for this as a great way to learn computer programming, implementing it as a pedagogical strategy in his "Learn <x> the Hard Way" series of books. (See Learn Python the Hard Way as an example.)

I don't think I've ever copied a textbook word-for-word, and I never copied computer programs from "Byte" magazine, but I do have similar experiences in note taking. I took elaborate notes all through high school, college, and grad school. In grad school, I usually rewrote all of my class notes -- by hand; no home PC -- as I reviewed them in the day or two after class. My clean, rewritten notes had other benefits, too. In a graduate graph algorithms course, they drew the attention of a classmate who became one of my best friends and were part of what attracted the attention of the course's professor, who asked me to consider joining his research group. (I was tempted... Graph algorithms was one of my favorite courses and research areas!)

I'm not sure many students these days benefit from this low-tech strategy. Most students who take detailed notes in my course seem to type rather than write which, if what I've read is correct, has fewer cognitive advantages. But at least those students are engaging with the material consciously. So few students seem to take detailed notes at all these days, and that's a shame. Without notes, it is harder to review ideas, to remember what they found challenging or puzzling in the moment, and to rehearse what they encounter in class into their long-term memories. Then again, maybe I'm just having a "kids these days" moment.

Anyway, I applaud DeChambeau for saving his parents a few dollars and for the achievement of copying an entire physics text. He even realized, perhaps after the fact, that it was an excellent learning strategy.

(The above passage is from The 11 Most Unusual Things About Bryson DeChambeau. He sounds like an interesting guy.)

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

August 31, 2018 3:06 PM

Reflection on a Friday

If you don't sit facing the window, you could be in any town.

I read that line this morning in Maybe the Cumberland Gap just swallows you whole, where it is a bittersweet observation of the similarities among so many dying towns across Appalachia. It's a really good read, mostly sad but a little hopeful, that applies beyond one region or even one country.

My mind is self-centered, though, and immediately reframed the sentence in a way that cast light on my good fortune.

I just downloaded a couple of papers on return-oriented programming so that I can begin working with an undergraduate on an ambitious research project. I have a homework assignment to grade sitting in my class folder, the first of the semester. This weekend, I'll begin to revise a couple of lectures for my compiler course, on NFAs and DFAs and scanning text. As always, there is a pile of department work to do on my desk and in my mind.

I live in Cedar Falls, Iowa, but if I don't sit facing the window, I could be in Ames or Iowa City, East Lansing or Durham, Boston or Berkeley. And I like the view out of my office window very much, thank you, so I don't even want to trade.

Heading into a three-day weekend, I realize again how fortunate I am. Do I put my good fortune to good enough use?

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

August 17, 2018 2:19 PM

LangSec and My Courses for the Year

As a way to get into the right frame of mind for the new semester and the next iteration of my compiler course, I read Michael Hicks's Software Security is a Programming Languages Issue this morning. Hicks incorporates software security into his courses on the principles of programming languages, with two lectures on security before having students study and use Rust. The article has links to lecture slides and supporting material, which makes it a post worth bookmarking.

I started thinking about adding LangSec to my course late in the spring semester, as I brainstormed topics that might spice the rest of the course up for both me and my students. However, time was short, so I stuck with a couple of standalone sessions on topics outside the main outline: optimization and concatenative languages. They worked fine but left me with an itch for something new.

I think I'll use the course Hicks and his colleagues teach as a starting point for figuring out how I might add to next spring's course. Students are interested in security, it's undoubtedly an essential issue for today's grads, and it is a great way to demonstrate how the design of programming languages is more than just the syntax of a loop or the lambda calculus.

Hicks's discussion of Rust also connects with my fall course. Two years ago, an advanced undergrad used Rust as the implementation language for his compiler. He didn't know the language but wanted to pair it with Haskell in his toolbox. The first few weeks of the project were a struggle as he wrestled with mastering ownership and figuring out some new programming patterns. Eventually he hit a nice groove and produced a working compiler with only a couple of small holes.

I was surprised how easy it was for me to install the tools I needed to compile, test, and explore his code. That experience increased my interest in learning the language, too. Adding it to my spring course would give me the last big push I need to buckle down.

This summer has been a blur of administrative stuff, expected and unexpected. The fall semester brings the respite of work I really enjoy: teaching compilers and writing some code. Hurray!

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

August 05, 2018 10:21 AM

Three Uses of the Knife

I just finished David Mamet's Three Uses of the Knife, a wide-ranging short book with the subtitle: "on the nature and purpose of drama". It is an extended essay on how we create and experience drama -- and how these are, in the case of great drama, the same journey.

Even though the book is only eighty or so pages, Mamet characterizes drama in so many ways that you'll have to either assemble a definition yourself or accept the ambiguity. Among them, he says that the job of drama and art is to "delight" us and that "the cleansing lesson of the drama is, at its highest, the worthlessness of reason."

Mamet clearly believes that drama is central to other parts of life. Here's a cynical example, about politics:

The vote is our ticket to the drama, and the politician's quest to eradicate "fill in the blank", is no different from the promise of the superstar of the summer movie to subdue the villain -- both promise us diversion for the price of a ticket and a suspension of disbelief.

As reader, I found myself using the book's points to ruminate about other parts of life, too. Consider the first line of the second essay:

The problems of the second half are not the problems of the first half.

Mamet uses this to launch into a consideration of the second act of a drama, which he holds equally to be a consideration of writing the second act of a drama. But with fall semester almost upon us, my thoughts jumped immediately to teaching a class. The problems of teaching the second half of a class are quite different from the problems of teaching the first half. The start of a course requires the instructor to lay the foundation of a topic while often convincing students that they are capable of learning it. By midterm, the problems include maintaining the students' interest as their energy flags and the work of the semester begins to overwhelm them. The instructor's energy -- my energy -- begins to flag, too, which echoes Mamet's claim that the journey of the creator and the audience are often substantially the same.

A theme throughout the book is how people immerse themselves in story, suspending their disbelief, even creating story when they need it to soothe their unease. Late in the book, he connects this theme to religious experience as well. Here's one example:

In suspending their disbelief -- in suspending their reason, if you will -- for a moment, the viewers [of a magic show] were rewarded. They committed an act of faith, or of submission. And like those who rise refreshed from prayers, their prayers were answered. For the purpose of the prayer was not, finally, to bring about intercession in the material world, but to lay down, for the time of the prayer, one's confusion and rage and sorrow at one's own powerlessness.

This all makes the book sound pretty serious. It's a quick read, though, and Mamet writes with humor, too. It feels light even as it seems to be a philosophical work.

The following paragraph wasn't intended as humorous but made me, a computer scientist, chuckle:

The human mind cannot create a progression of random numbers. Years ago computer programs were created to do so; recently it has been discovered that they were flawed -- the numbers were not truly random. Our intelligence was incapable of creating a random progression and therefore of programming a computer to do so.

This reminded me of a comment that my cognitive psychology prof left on the back of an essay I wrote in class. He wrote something to the effect, "This paper gets several of the particulars incorrect, but then that wasn't the point. It tells the right story well." That's how I felt about this paragraph: it is wrong on a couple of important facts, but it advances the important story Mamet is telling ... about the human propensity to tell stories, and especially to create order out of our experiences.

Oh, and thanks to Anna Gát for bringing the book to my attention, in a tweet to Michael Nielsen. Gát has been one of my favorite new follows on Twitter in the last few months. She seems to read a variety of cool stuff and tweet about it. I like that.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

July 28, 2018 11:37 AM

Three Things I Read This Morning

Why I Don't Love Gödel, Escher, Bach

I saw a lot of favorable links to this post a while back and finally got around to it. Meh. I generally agree with the author: GEB is verbose in places, there's a lot of unnecessary name checking, and the dialogues that lead off each chapter are often tedious. I even trust the author's assertion that Hofstadter's forays beyond math, logic, and computers are shallow.

So what? Things don't have to be perfect for me to like them, or for them to make me think. GEB was a swirl of ideas that caused me to think and helped me make a few connections. I'm sure if I read the book now that I would feel differently about it, but reading it when I did, as an undergrad CS major thinking about AI and the future, it energized me.

I do thank the author for his pointer (in a footnote) to Vi Hart's wonderful Twelve Tones. You should watch it. Zombie Schonberg!

The Web Aesthetic

This post wasn't quite what I expected, but even a few years old it has something to say to web designers today.

Everything on the web ultimately needs to degrade down to plain text (images require alt text; videos require transcripts), so the text editor might just become the most powerful app in the designer's toolbox.

XP Challenge: Compilers

People outside the XP community often don't realize how seriously the popularizers of XP explored the limitations of their own ideas. This page documents one of several challenges that push XP values and practices to the limits: When do they break down? Can they be adapted successfully to the task? What are the consequences of applying them in such circumstances?

Re-reading this old wiki page was worth it if only for this great line from Ron Jeffries:

The point of XP is to win, not die bravely.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

July 08, 2018 10:47 AM

Computing Everywhere: In the Dugout and On the Diamond

How's this for a job description: "The successful candidate will be able to hit a fungo, throw batting practice, and program in SQL."

We decided that in the minor leagues, we would hire an extra coach at each level. The requirements for that coach were that he had to be able to hit a fungo, throw batting practice, and program in SQL. It's a hard universe to find where those intersect, but we were able to find enough of them--players who had played in college that maybe played one year in the minors who had a technical background and could understand analytics.

The technical skills are not enough by themselves, though. In order to turn a baseball franchise into a data-informed enterprise, you have to change the culture of the team in the trenches, working with the people who have to change their own behavior. Management must take the time necessary to guide the organization's evolution.

The above passage is from How the Houston Astros are winning through advanced analytics. I picked it up expecting a baseball article, or perhaps a data analytics article, but it reads like a typical McKinsey Report piece. It was an interesting read, but for different reasons than I had imagined.

Posted by Eugene Wallingford | Permalink | Categories: Computing

June 29, 2018 11:46 AM

Computer Science to the Second Degree

Some thoughts on studying computer science from Gian-Carlo Rota:

A large fraction of MIT undergraduates major in computer science or at least acquire extensive computer skills that are applicable in other fields. In their second year, they catch on to the fact that their required courses in computer science do not provide the whole story. Not because of deficiencies in the syllabus; quite the opposite. The undergraduate curriculum in computer science at MIT is probably the most progressive and advanced such curriculum anywhere. Rather, the students learn that side by side with required courses there is another, hidden curriculum consisting of new ideas just coming into use, new techniques that spread like wildfire, opening up unsuspected applications that will eventually be adopted into the official curriculum.

Keeping up with this hidden curriculum is what will enable a computer scientist to stay ahead in the field. Those who do not become computer scientists to the second degree risk turning into programmers who will only implement the ideas of others.

MIT is, of course, an exceptional school, but I think Rota's comments apply to computer science at most schools. So much learning of CS happens in the spaces between courses: in the lab, in the student lounge, at meetings of student clubs, at part-time jobs, .... That can sometimes be a challenge for students who don't have much curiosity, or who don't develop it as they are exposed to new topics.

As profs, we encourage students to be aware of all that is going on in computer science beyond the classroom and to take part in the ambient curriculum to the extent they are able. Students who become computer scientists only to the first degree can certainly find good jobs and professional success, but there are more opportunities open at the second degree. CS can also be a lot more fun there.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 17, 2018 9:52 AM

Sometimes, Evolution Does What No Pivot Can

From an old New Yorker article about Spotify:

YouTube, which is by far the largest streaming-music site in the world (it wasn't designed that way--that's just what it became)...

Companies starting in one line of business and evolving into something else is nothing new. I mean, The Connecticut Leather Company became Coleco and made video game consoles. But there's something about software that makes this sort of evolution seem so normal. We build a computer system to solve one problem and find that our users -- who have needs and desires that neither we nor they fully comprehend -- use it to solve a different problem. Interesting times. Don't hem yourself in, and don't hem your software in, or the people who use it.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 16, 2018 3:58 PM

Computing Everywhere: The Traveling Salesman Problem and Paris Fashion Week

I just read Pops, Michael Chabon's recent book of essays on fatherhood. The first essay, which originally appeared as an article in GQ, includes this parenthetical about his tour of Paris fashion week with his son:

-- a special mapping algorithm seemed to have been employed to ensure that every show was held as far as possible from its predecessor and its successor on the schedule --

My first thought was to approach this problem greedily: Start with the first show, then select a second show that is as far away as possible, then select a third show that is as far away as possible from that one, and so on, until all of the shows had been scheduled. But then I figured that a schedule so generated might seem laborious to travel at first, when there are plenty of faraway shows to choose from, but it might eventually start to seem pretty reasonable as the only shows left to schedule are relatively close.
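The greedy idea is short enough to sketch in Python. The function name, the list-of-lists distance matrix, and the rule of breaking ties by lowest index are all assumptions made for illustration.

```python
def greedy_far_schedule(d, start=0):
    """Order the shows by always hopping to the farthest unscheduled one.

    d is a square matrix (list of lists) of distances between shows.
    Ties go to the lowest-numbered show, an arbitrary choice.
    """
    order = [start]
    remaining = set(range(len(d))) - {start}
    while remaining:
        here = order[-1]
        farthest = max(remaining, key=lambda j: (d[here][j], -j))
        order.append(farthest)
        remaining.remove(farthest)
    return order


# Four venues spaced evenly along one street.
street = [[abs(i - j) for j in range(4)] for i in range(4)]
schedule = greedy_far_schedule(street)
hops = [street[a][b] for a, b in zip(schedule, schedule[1:])]
```

On this tiny example the hops shrink from 3 to 2 to 1, which is the weakness described above: the trip starts out laborious and ends up fairly reasonable.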

We can generate a more wholly unsatisfactory sort of schedule by maximizing the total travel time of the circuit. That's the Traveling Salesman Problem, inverted. Taking this approach, our algorithm is quite simple. We start with the usual nxn matrix d, where d[i,j] equals the distance between shows i and j. Then:

  1. Replace every distance d[i,j] between two show locations with -(d[i,j]), its additive inverse.
  2. Call your favorite TSP solver with the new graph.

Easy! I leave implementation of the individual steps as an exercise for the reader.
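For the impatient reader, the two steps can be sketched in Python. The "favorite TSP solver" here is an assumed stand-in: a brute-force search over all circuits, which is practical only for a handful of shows but has no trouble with negative distances. Some real solvers assume non-negative weights, so check yours before handing it a negated matrix. All of the function names are hypothetical.

```python
from itertools import permutations


def tour_length(d, tour):
    """Total length of the closed circuit that visits tour in order."""
    return sum(d[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def brute_force_tsp(d):
    """Return a minimum-length circuit starting at show 0, by exhaustive search."""
    rest = min(permutations(range(1, len(d))),
               key=lambda p: tour_length(d, (0,) + p))
    return (0,) + rest


def worst_schedule(d):
    """Step 1: negate every distance. Step 2: call the TSP solver."""
    negated = [[-dist for dist in row] for row in d]
    return brute_force_tsp(negated)


# Four venues spaced evenly along one street.
street = [[abs(i - j) for j in range(4)] for i in range(4)]
longest = tour_length(street, worst_schedule(street))
shortest = tour_length(street, brute_force_tsp(street))
```

Minimizing the negated distances is the same as maximizing the original ones, so the circuit that comes back is a longest one. On the four-venue street, the longest circuit covers 8 units against 6 for the shortest.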

(By the way, Chabon's article is a sweet story about an already appreciative dad coming to appreciate his son even better. If you like that sort of thing, give it a read.)

Posted by Eugene Wallingford | Permalink | Categories: Computing

May 29, 2018 3:41 PM

Software as Adaptation-Executer, Not Fitness-Maximizer

In Adaptation-Executers, not Fitness-Maximizers, Eliezer Yudkowsky talks about how evolution has led to human traits that may no longer be ideal in our current environment. He also talks about tools, though, and this summary sentence made me think of programs:

So the screwdriver's cause, and its shape, and its consequence, and its various meanings, are all different things; and only one of these things is found within the screwdriver itself.

I often fall victim to thinking that the meaning of software is at least somewhat inherent in its code, but that really is what the designer intended as its use -- a mix of what Yudkowsky calls its cause and its consequence. These are things that exist only in the mind of the designer and the user, not in the computational constructs that constitute the code.

When approaching new software, especially a complex piece of code with many parts, it's helpful to remember that it doesn't really have objective meaning or consequences, only those intended by its designers and those exercised by its users. Over time, the users' conception tends to drive the designers' conception as they put the software to particular uses and even modify it to better suit these new uses.

Perhaps software is best thought of as an adaptation-executer, too, and not as a fitness-maximizer.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

May 27, 2018 10:20 AM

AI's Biggest Challenges Are Still To Come

Semantic Information Processing on my bookshelf

A lot of people I know have been discussing last week's NY Times op-ed about recent advances in neural networks and what they mean for AI. The article even sparked conversation among colleagues from my grad school research lab and among my PhD advisor's colleagues from when he was in grad school. It seems that many of us are frequently asked by non-CS folks what we think about recent advances in AI, from AlphaGo to voice recognition to self-driving cars. My answers sound similar to what some of my old friends say. Are we now afraid of AI being able to take over the world? Um, no. Do you think that the goals of AI are finally within reach? No. Much remains to be done.

I rate my personal interest in recent deep learning advances as meh. I'm not as down on the current work as the authors of the Times piece seem to be; I'm just not all that interested. It's wonderful as an exercise in engineering: building focused systems that solve a single problem. But, as the article points out, that's the key. These systems work in limited domains, to solve limited problems. When I want one of these problems to be solved, I am thankful that people have figured out how to solve it and make the solution commercially available for us to use. Self-driving cars, for instance, have the potential to make the world safer and to improve the quality of my own life.

My interest in AI, though, has always been at a higher level: understanding how intelligence works. Are there general principles that govern intelligent behavior, independent of hardware or implementation? One of the first things to attract me to AI was the idea of writing a program that could play chess. That's an engineering problem in a very narrow domain. But I soon found myself drawn to cognitive issues: problem-solving strategies, reflection, explanation, conversation, symbolic reasoning. Cognitive psychology was one of my favorite courses in grad school in large part because it tried to connect low-level behaviors in the human brain to the symbolic level. AlphaGo is exceedingly cool as a game player, but it can't talk to me about Go, and for me that's a lot of the fun of playing.

In an email message earlier this week, my quick take on all this work was: We've forgotten the knowledge level. And for me, the knowledge level is what's most interesting about AI.

That one-liner oversimplifies things, as most one-liners do. The AI world hasn't forgotten the knowledge level so much as moved away from it for a while in order to capitalize on advances in math and processing power. The results have been some impressive computer systems. I do hope that the pendulum swings back soon as AI researchers step back from these achievements and build some theories at the knowledge level. I understand that this may not be possible, but I'm not ready to give up on the dream yet.

Posted by Eugene Wallingford | Permalink | Categories: Computing

April 06, 2018 3:19 PM

Maps and Abstractions

I've been reading my way through Frank Chimero's talks online and ran across a great bit on maps and interaction design in What Screens Want. One of the paragraphs made me think about the abstractions that show up in CS courses:

When I realized that, a little light went off in my head: a map's biases do service to one need, but distort everything else. Meaning, they misinform and confuse those with different needs.

CS courses are full of abstractions and models of complex systems. We use examples, often simplified, to expose or emphasize a single facet of a system, as a way to help students cut through the complexity. For example, compilers and full-strength interpreters are complicated programs, so we start with simple interpreters operating over simple languages. Students get their feet wet without drowning in detail.

In the service of trying not to overwhelm students, though, we run the risk of distorting how they think about the parts we left out. Worse, we sometimes distort even their thinking about the part we're focusing on, because they don't see its connections to the more complete picture. There is an art to identifying abstractions, creating examples, and sequencing instruction. Done well, we can minimize the distortions and help students come to understand the whole with small steps and incremental increases in size and complexity.

At least that's what I think on my good days. There are days and even entire semesters when things don't seem to progress as smoothly as I hope or as smoothly as past experience has led me to expect. Those days, I feel like I'm doing violence to an idea when I create an abstraction or adopt a simplifying assumption. Students don't seem to be grokking the terrain, so we change the map. We try different problems or work through more examples. It's hard to find the balance sometimes between adding enough to help and not adding so much as to overwhelm.

The best teachers I've encountered know how to approach this challenge. More importantly, they seem to enjoy the challenge. I'm guessing that teachers who don't enjoy it must be frustrated a lot. I enjoy it, and even so there are times when this challenge frustrates me.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 29, 2018 3:05 PM

Heresy in the Battle Between OOP and FP

For years now, I've been listening to many people -- smart, accomplished people -- feverishly proclaim that functional programming is here to right the wrongs of object-oriented programming. For many years before that, I heard many people -- smart, accomplished people -- feverishly proclaim that object-oriented programming was superior to functional programming, an academic toy, for building real software.

Alas, I don't have a home in the battle between OOP and FP. I like and program in both styles. So it's nice whenever I come across something like Alan Kay's recent post on Quora, in response to the question, "Why is functional programming seen as the opposite of OOP rather than an addition to it?" He closes with a paragraph I could take on as my credo:

So: both OOP and functional computation can be completely compatible (and should be!). There is no reason to munge state in objects, and there is no reason to invent "monads" in FP. We just have to realize that "computers are simulators" and figure out what to simulate.

As in many things, Kay encourages us to go beyond today's pop culture of programming to create a computational medium that incorporates big ideas from the beginning of our discipline. While we work on those ideas, I'll continue to write programs in both styles, and to enjoy them both. With any luck, I'll bounce between mindsets long enough that I eventually attain enlightenment, like the venerable master Qc Na. (See the koan at the bottom of that link.)

Oh: Kay really closes his post with

I will be giving a talk on these ideas in July in Amsterdam (at the "CurryOn" conference).

If that's not a reason to go to Amsterdam for a few days, I don't know what is. Some of the other speakers look pretty good, too.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 12, 2018 3:43 PM

Technology is a Place Where We Live

Yesterday morning I read The Good Room, a talk Frank Chimero gave last month. Early on in the talk, Chimero says:

Let me start by stating something obvious: in the last decade, technology has transformed from a tool that we use to a place where we live.

This sentence jumped off the page both for the content of the assertion and for the decade time frame with which he bounds it. In the fall of 2003, I taught a capstone course for non-majors that is part of my university's liberal arts core. The course, titled "Environment, Technology, and Society", brings students from all majors on campus together in a course near the end of their studies, to apply their general education and various disciplinary expertises to problems of some currency in the world. As you might guess from the title, the course focuses on problems at the intersection of the natural environment, technology, and people.

My offering of the course put a twist on the usual course content. We focused on the man-made environment we all live in, which even by 2003 had begun to include spaces carved out on the internet and web. The only textbook for the course was Donald Norman's The Design of Everyday Things, which I think every university graduate should have read. The topics for the course, though, had a decided IT flavor: the effect of the Internet on everyday life, e-commerce, spam, intellectual property, software warranties, sociable robots, AI in law and medicine, privacy, and free software. We closed with a discussion of what an educated citizen of the 21st century ought to know about the online world in which they would live in order to prosper as individuals and as a society.

The change in topic didn't excite everyone. A few came to the course looking forward to a comfortable "save the environment" vibe and were resistant to considering technology they didn't understand. But most were taking the course with no intellectual investment at all, as a required general education course they didn't care about and just needed to check off the list. In a strange way, their resignation enabled them to engage with the new ideas and actually ask some interesting questions about their future.

Looking back now after fifteen years, the course design looks pretty good. I should probably offer to teach it again, updated appropriately, of course, and see where young people of 2018 see themselves in the technological world. As Chimero argues in his talk, we need to do a better job building the places we want to live in -- and that we want our children to live in. Privacy, online peer pressure, and bullying all turned out differently than I expected in 2003. Our young people are worse off for those differences, though I think most have learned ways to live online in spite of the bad neighborhoods. Maybe they can help us build better places to live.

Chimero's talk is educational, entertaining, and quotable throughout. I tweeted one quote: "How does a city wish to be? Look to the library. A library is the gift a city gives to itself." There were many other lines I marked for myself, including:

  • Penn Station "resembles what Kafka would write about if he had the chance to see a derelict shopping mall." (I'm a big Kafka fan.)
  • "The wrong roads are being paved in an increasingly automated culture that values ease."
Check the talk out for yourself.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

March 06, 2018 4:11 PM

A Good Course in Epistemology

Theoretical physicist Marcelo Gleiser, in The More We Know, the More Mystery There Is:

But even if we did [bring the four fundamental forces together in a common framework], and it's a big if right now, this "unified theory" would be limited. For how could we be certain that a more powerful accelerator or dark matter detector wouldn't find evidence of new forces and particles that are not part of the current unification? We can't. So, dreamers of a final theory need to recalibrate their expectations and, perhaps, learn a bit of epistemology. To understand how we know is essential to understand how much we can know.

People are often surprised to hear that, in all my years of school, my favorite course was probably PHL 440 Epistemology, which I took in grad school as a cognate to my CS courses. I certainly enjoyed the CS courses I took as a grad student, and as an undergrad, too, but my study of AI was enhanced significantly by courses in epistemology and cognitive psychology. The prof for PHL 440, Dr. Rich Hall, became a close advisor to my graduate work and a member of my dissertation committee. Dr. Hall introduced me to the work of Stephen Toulmin, whose model of argument influenced my work immensely.

I still have the primary volume of readings that Dr. Hall assigned in the course. Looking back now, I'd forgotten how many of W.V.O. Quine's papers we'd read... but I enjoyed them all. The course challenged most of my assumptions about what it means "to know". As I came to appreciate different views of what knowledge might be and how we come by it, my expectations of human behavior -- and my expectations for what AI could be -- changed. As Gleiser suggests, to understand how we know is essential to understanding what we can know, and how much.

Gleiser's epistemology meshes pretty well with my pragmatic view of science: it is descriptive, within a particular framework and necessarily limited by experience. This view may be why I gravitated to the pragmatists in my epistemology course (Peirce, James, Rorty), or perhaps the pragmatists persuaded me better than the others.

In any case, the Gleiser interview is a delightful and interesting read throughout. His humble view of science may get you thinking about epistemology, too.

... and, yes, that's the person for whom a quine in programming is named. Thanks to Douglas Hofstadter for coining the term and for giving us programming nuts a puzzle to solve in every new language we learn.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Patterns, Personal

February 26, 2018 3:55 PM

Racket Love

Racket -- "A Programmable Programming Language" -- is the cover story for next month's Communications of the ACM. The new issue is already featured on the magazine's home page, including a short video in which Matthias Felleisen explains the idea of code as more than a machine artifact.

My love of Racket is no surprise to readers of this blog. Still one of my favorite old posts here is The Racket Way, a write-up of my notes from Matthew Flatt's talk of the same name at StrangeLoop 2012. As I said in that post, this was a deceptively impressive talk. I think that's especially fitting, because Racket is a deceptively impressive language.

One last little bit of love from a recent message to the Racket users mailing list... Stewart Mackenzie describes his feelings about the seamless interweaving of Racket and Typed Racket via a #lang directive:

So far my dive into Racket has positive. It's magical how I can switch from untyped Racket to typed Racket simply by changing #lang. Banging out my thoughts in a beautiful lisp 1, wave a finger, then finger crack to type check. Just sublime.

That's what you get when your programming language is as programmable as your application.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

February 21, 2018 3:38 PM

Computer Programs Aren't Pure Abstractions. They Live in the World.

Guile Scheme guru Andy Wingo recently wrote a post about langsec, the idea that we can bake system security into our programs by using languages that support proof of correctness. Compilers can then be tools for enforcing security. Wingo is a big fan of the langsec approach but, in light of the Spectre and Meltdown vulnerabilities, is pessimistic that it really matters anymore. If bad actors can exploit the hardware that executes our programs, then proving that the code is secure doesn't do much good.

I've read a few blog posts and tweets that say Wingo is too pessimistic, that efforts to make our languages produce more secure code will still pay off. I think my favorite such remark, though, is a comment on Wingo's post itself, by Thomas Dullien:

I think this is too dark a post, but it shows a useful shock: Computer Science likes to live in proximity to pure mathematics, but it lives between EE and mathematics. And neglecting the EE side is dangerous - which not only Spectre showed, but which should have been obvious at the latest when Rowhammer hit.

There's actual physics happening, and we need to be aware of it.

It's easy for academics, and even programmers who work atop an endless stack of frameworks, to start thinking of programs as pure abstractions. But computer programs, unlike mathematical proofs, come into contact with real, live hardware. It's good to be reminded sometimes that computer science isn't math; it lives somewhere between math and engineering. That is good in so many ways, but it also has its downsides. We should keep that in mind.

Posted by Eugene Wallingford | Permalink | Categories: Computing

February 16, 2018 2:54 PM

Old Ideas and New Words

In this Los Angeles Review of Books interview, novelist Jenny Offill says:

I was reading a poet from the Tang dynasty... One of his lines from, I don't know, page 812, was "No new feelings". When I read that I laughed out loud. People have been writing about the same things since the invention of the written word. The only originality comes from the language itself.

After a week revising lecture notes and rewriting a recruiting talk intended for high school students and their parents, I know just what Offill and that Tang poet mean. I sometimes feel the same way about code.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

January 14, 2018 9:24 AM


This was posted on the Racket mailing list recently:

"The Little Schemer" starts slow for people who have programmed before, but seeing that I am only half-way through and already gained some interesting knowledge from it, one should not underestimate the acceleration in this book.

The Little Schemer is the only textbook I assign in my Programming Languages course. These students usually have only a little experience: often three semesters, two in Python and one in Java; sometimes just the two in Python. A few of the students who work in the way the authors intend have an A-ha! experience while reading it. Or maybe they are just lucky... Other students have only a WTF? experience.

Still, I assign the book, with hope. It's relatively inexpensive and so worth a chance that a few students can use it to grok recursion, along with a way of thinking about writing functions that they haven't seen in courses or textbooks before. The book accelerates from the most basic ideas of programming to "interesting" knowledge in a relatively short number of pages. Students who buy in to the premise, hang on for the ride, and practice the ideas in their own code soon find that they, too, have accelerated as programmers.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 28, 2017 8:46 AM

You Have to Learn That It's All Beautiful

In this interview with Adam Grant, Walter Isaacson talks about some of the things he learned while writing biographies of Benjamin Franklin, Albert Einstein, Steve Jobs, and Leonardo da Vinci. A common theme is that all four were curious and interested in a wide range of topics. Toward the end of the interview, Isaacson says:

We of humanities backgrounds are always doing the lecture, like, "We need to put the 'A' in 'STEM', and you've got to learn the arts and the humanities." And you get big applause when you talk about the importance of that.

But we also have to meet halfway and learn the beauty of math. Because people tell me, "I can't believe somebody doesn't know the difference between Mozart and Haydn, or the difference between Lear and Macbeth." And I say, "Yeah, but do you know the difference between a resistor and a transistor? Do you know the difference between an integral and a differential equation?" They go, "Oh no, I don't do math, I don't do science." I say, "Yeah, but you know what, an integral equation is just as beautiful as a brush stroke on the Mona Lisa." You've got to learn that they're all beautiful.

Appreciating that beauty made Leonardo a better artist and Jobs a better technologist. I would like for the students who graduate from our CS program to know some literature, history, and art and appreciate their beauty. I'd also like for the students who graduate from our university with degrees in literature, history, art, and especially education to have some knowledge of calculus, the Turing machine, and recombinant DNA, and appreciate their beauty.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

December 21, 2017 2:42 PM

A Writer with a Fondness for Tech

I've not read either of Helen DeWitt's novels, but this interview from 2011 makes her sound like a technophile. When struggling to write, she finds inspiration in her tools:

What is to be done?

Well, there are all sorts of technical problems to address. So I go into Illustrator and spend hours grappling with the pen tool. Or I open up the statistical graphics package R and start setting up plots. Or (purists will be appalled) I start playing around with charts in Excel.

... suddenly I discover a brilliant graphic solution to a problem I've been grappling with for years! How to display poker hands graphically in a way that sets a series of strong hands next to the slightly better hands that win.

Other times she feels the need for a prop, à la Olivier:

I may have a vague idea about a character -- he is learning Japanese at an early age, say. But I don't know how to make this work formally, I don't know what to do with the narrative. I then buy some software that lets me input Japanese within my word-processing program. I start playing around, I come up with bits of Japanese. And suddenly I see that I can make visible the development of the character just by using a succession of kanji! I don't cut out text -- I have eliminated the need for 20 pages of text just by using this software.

Then she drops a hint about a work in progress, along with a familiar name:

Stolen Luck is a book about poker using Tuftean information design to give readers a feel for both the game and the mathematics.

DeWitt sounds like my kind of person. I wonder if I would like her novels. Maybe I'll try Lightning Rods first; it sounds like an easier read than The Last Samurai.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

November 24, 2017 12:30 PM

Thousand-Year Software

I recently read an old conversation between Neil Gaiman and Kazuo Ishiguro that started out as a discussion of genre but covered a lot of ground, including how stories mutate over time, and that the time scale of stories is so much larger than that of human lives. Here are a few of the passages about stories and time:

NG   Stories are long-lived organisms. They're bigger and older than we are.

NG   You sit there reading Pepys, and just for a minute, you kind of get to be 350, 400 years older than you are.

KI   There's an interesting emotional tension that comes because of the mismatch of lifespans in your work, because an event that might be tragic for one of us may not be so for the long-lived being.

KI   I'm often asked what my attitude is to film, theatrical, radio adaptations of my novels. It's very nice to have my story go out there, and if it's in a different form, I want the thing to mutate slightly. I don't want it to be an exact translation of my novel. I want it to be slightly different, because in a very vain kind of way, as a storyteller, I want my story to become like public property, so that it gains the status where people feel they can actually change it around and use it to express different things.

This last comment by Ishiguro made me think of open-source software. It can be adapted by anyone for almost any use. When we fork a repo and adapt it, how often does it grow into something new and considerably different? I often tell my compiler students about the long, mutated life of P-code, which was related by Chris Clark in a 1999 SIGPLAN Notices article:

P-code is an example [compiler intermediate representation] that took on a life of its own. It was invented by Niklaus Wirth as the IL for the ETH Pascal compiler. Many variants of that compiler arose [Nel79], including the UCSD Pascal compiler that was used at Stanford to define an optimizer [Cho83]. Chow's compiler evolved into the MIPS compiler suite, which was the basis for one of the DEC C compilers -- acc. That compiler did not parse the same language nor use any code from the ETH compiler, but the IL survived.

That's not software really, but a language processed by several generations of software. What are other great examples of software and languages that mutated and evolved?

We have no history with 100-year-old software yet, of course, let alone 300- or 1000-year-old software. Will we ever? Software is connected to the technology of a given time in ways that stories are not. Maybe, though, an idea that is embodied in a piece of software today could mutate and live on in new software or new technology many decades from now? The internet is a system of hardware and software that is already evolving into new forms. Will the world wide web continue to have life in a mutated form many years hence?

The Gaiman/Ishiguro conversation turned out to be more than I expected when I first found it. Good stuff. Oh, and as I wrap up this post, this passage resonates with me:

NG   I know that when I create a story, I never know what's going to work. Sometimes I will do something that I think was just a bit of fun, and people will love it and it catches fire, and sometimes I will work very hard on something that I think people will love, and it just fades: it never quite finds its people.

Been there, done that, my friend. This pretty well describes my experience blogging and tweeting all these years, and even writing for my students. I am a less reliable predictor of what will connect with readers than my big ego would ever have guessed.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

November 15, 2017 4:03 PM

A Programming Digression: Kaprekar Numbers

Earlier this week I learned about Kaprekar numbers when someone re-tweeted this my way:

Kaprekar numbers are numbers whose square in that base can be split into 2 parts that add up to the original number

So, 9 is a Kaprekar number, because 9 squared is 81 and 8+1 equals 9. 7777 is, too, because 7777 squared is 60481729 and 6048 + 1729 equals 7777.

This is the sort of numerical problem that is well-suited for the language my students are writing a compiler for this semester. I'm always looking out for fun little problems that I can use to test their creations. In previous semesters, I've blogged about computing Farey sequences and excellent numbers for just this purpose.

Who am I kidding? I just like to program, even in a small language that feels like integer assembly language, and these problems are fun!

So I sat down and wrote Klein functions to determine if a given number is a Kaprekar number and to generate all of the Kaprekar numbers less than a given number. I made one small change to the definition, though: I consider only numbers whose squares consist of an even number of digits and thus can be split in half, à la excellent numbers.

Until we have a complete compiler for our class language, I always like to write a reference program in a language such as Python so that I can validate my logic. I had a couple of meetings this morning, which gave just the time I needed to port my solution to a Klein-like subset of Python.
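For the curious, here is a minimal Python sketch of the modified definition. It is not my Klein-like reference program; the function names are made up for this illustration, and it sticks to integer arithmetic and a recursive digit count only in the spirit of Klein.

```python
def num_digits(n):
    """Count the decimal digits of n recursively, with integer arithmetic only."""
    return 1 if n < 10 else 1 + num_digits(n // 10)


def is_kaprekar(n):
    """True if n*n has an even number of digits and its two halves sum to n."""
    square = n * n
    length = num_digits(square)
    if length % 2 != 0:
        return False  # the square cannot be split in half
    left, right = divmod(square, 10 ** (length // 2))
    return left + right == n


def kaprekars_below(limit):
    """All numbers below limit that satisfy the modified definition."""
    return [n for n in range(1, limit) if is_kaprekar(n)]


print(is_kaprekar(9), is_kaprekar(7777))
print(kaprekars_below(10000))
```

Note that 297 is a Kaprekar number under the usual definition (297 squared is 88209, and 88 + 209 equals 297), but it is excluded here because its square has five digits.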

When I finished my program, I still had a few meeting minutes available, so I started generating longer and longer Kaprekar numbers. I noticed that there are a bunch more 6-digit Kaprekar numbers than at any previous length:

 1: 1
 2: 3
 3: 2
 4: 5
 5: 4
 6: 24

Homer Simpson says, 'D'oh!'

I started wondering why that might be... and then realized that there are a lot more 6-digit numbers overall than 5-digit -- ten times as many, of course. (D'oh!) My embarrassing moment of innumeracy didn't kill my curiosity, though. How does that 24 compare to the trend line of Kaprekar numbers by length?

 1: 1    of        9  0.11111111
 2: 3    of       90  0.03333333
 3: 2    of      900  0.00222222
 4: 5    of     9000  0.00055555
 5: 4    of    90000  0.00004444
 6: 24   of   900000  0.00002666

There is a recognizable drop-off at each length up to six, where the percentage is an order of magnitude different than expected. Are 6-digit numbers a blip or a sign of a change in the curve? I ran another round. This took much longer, because my Klein-like Python program has to compute operations like length recursively and has no data structures for caching results. Eventually, I had a count:

 7: 6    of  9000000  0.00000066

A big drop, back in line with the earlier trend. One more round, even slower.

 8: 21   of 90000000  0.00000023

Another blip in the rate of decline. This calls for some more experimentation... There is a bit more fun to have the next time I have a couple of meetings to fill.
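The ratios in the tables above are easy to recompute. This short Python sketch takes the counts reported in this post as given and divides each by the number of k-digit numbers, 9 * 10^(k-1); the function names are hypothetical.

```python
# Counts of (modified-definition) Kaprekar numbers by length, from this post.
counts = {1: 1, 2: 3, 3: 2, 4: 5, 5: 4, 6: 24, 7: 6, 8: 21}


def numbers_of_length(k):
    """How many positive integers have exactly k decimal digits."""
    return 9 * 10 ** (k - 1)


def density(k, count):
    """Fraction of k-digit numbers accounted for by count."""
    return count / numbers_of_length(k)


for k, count in sorted(counts.items()):
    print(f"{k:2}: {count:<4} of {numbers_of_length(k):>9}  {density(k, count):.8f}")
```

The printed ratios are rounded to eight places, so the last digit can differ slightly from the truncated values shown above.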

Image: courtesy of the Simpsons wiki.

Posted by Eugene Wallingford | Permalink | Categories: Computing

October 04, 2017 3:56 PM

A Stroll Through the Gates CS Building

sneaking up on the Gates-Hillman Complex at Carnegie Mellon, from Forbes St., Pittsburgh, PA

I had a couple of hours yesterday between the end of the CS education summit and my shuttle to the airport. Rather than sit in front of a computer for two more hours, I decided to take advantage of my location, wander over to the Carnegie Mellon campus, and take a leisurely walk through the Gates Center for Computer Science. I'm glad I did.

At the beginning of my tour, I was literally walking in circles, from the ground-level entrance shown in its Wikipedia photo up to where the CS offices seem to begin, up on the fourth floor. This is one of those buildings that looks odd from the outside and is quite confusing on the inside, at least to the uninitiated. But everyone inside seemed to feel at home, so maybe it works.

It didn't take long before my mind was flooded by memories of my time as a doctoral student. Michigan State's CS program isn't as big as CMU's, but everywhere I looked I saw familiar images: Students sitting in their labs or in their offices, one or two or six at a time, hacking code on big monitors, talking shop, or relaxing. The modern world was on display, too, with students lounging in comfy chairs or sitting in a little coffee shop, laptops open and earbuds in place. That was never my experience as a student, but I know it now as a faculty member.

I love to wander academic halls, in any department, really, and read what is posted on office doors and hallway walls. At CMU, I encountered the names of several people whose work I know and admire. They came from many generations... David Touretzky, whose Lisp textbook taught me a few things about programming. Jean Yang, whose work on programming languages I find cool. (I wish I were going to SPLASH later this month...) Finally, I stumbled across the office of Manuel Blum, the 1995 Turing Award winner. There were a couple of posters outside his door showing the work of his students on problems of cryptography and privacy, and on the door itself were several comic strips. The punchline of one read, "I'll retire when it stops being fun." On this, even I am in sync with a Turing Award winner.

Everywhere I turned, something caught my eye. A pointer to the Newell/Simon bridge... Newell-and-Simon, the team, were like the Pied Piper to me when I began my study of AI. A 40- or 50-page printout showing two old researchers (Newell and Simon?) playing chess. Plaques in recognition of big donations that had paid for classrooms, labs, and auditoria, made by Famous People who were either students or faculty in the school.

CMU is quite different from my school, of course, but there are many other schools that give off a similar vibe. I can see why people want to be at an R-1, even if they aspire to be teachers more than research faculty. There is so much going on. People, labs, sub-disciplines, and interdisciplinary projects. Informal talks, department seminars, and outside speakers. Always something going on. Ideas. Energy.

On the ride to the airport later in the day, I sat in some slow, heavy traffic going one direction and saw slower, heavier traffic going in the other. As much as I enjoyed the visit, I was glad to be heading home.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

October 03, 2017 12:23 PM

Why Do CS Enrollment Surges End?

The opening talk of the CS Education Summit this week considered the challenges facing CS education in a time of surging enrollments and continued concerns about the diversity of the CS student population. In the session that followed, Eric Roberts and Jodi Tims presented data that puts the current enrollment surge into perspective, in advance of a report from the National Academy of Science.

In terms of immediate takeaway, Eric Roberts's comments were gold. Eric opened with Stein's Law: If something is unsustainable, it will stop. Stein was an economist whose eponymous law expresses one of those obvious truths we all seem to forget about in periods of rapid change: If something cannot go on forever, it won't. You don't have to create a program to make it stop. A natural corollary is: If it can't go on for long, you don't need a program to deal with it. It will pass soon.

Why is that relevant to the summit? Even without continued growth, current enrollments in CS majors are unsustainable for many schools. If the past is any guide, we know that many schools will deal with unsustainable growth by limiting the number of students who start or remain in their major.

Roberts has studied the history of CS boom-and-bust cycles over the last thirty years, and he's identified a few common patterns:

  • Limiting enrollments is how departments respond to enrollment growth. They must: the big schools can't hire faculty fast enough, and most small schools can't hire new faculty at all.

  • The number of students graduating with CS degrees drops because we limit enrollments. Students do not stop enrolling because the number of job opportunities goes down or for any other reason.

    After the dot-com bust, there was a lot of talk about offshoring and automation, but the effects of that were short-term and rather small. Roberts's data shows that enrollment crashes do not follow crashes in job openings; they follow enrollment caps. Enrollments remain strong wherever they are not strictly limited.

  • When we limit enrollments, the effect is bigger on women and members of underserved communities. These students are more likely to suffer from impostor syndrome, stereotype bias, and other fears, and the increased competitiveness among students for fewer openings combines with those fears to discourage them from continuing.

So the challenge of booming enrollments exacerbates the challenge to increase diversity. The boom might decrease diversity, but when it ends -- and it will, if we limit enrollments -- our diversity rarely recovers. That's the story of the last three booms.

In order to grow capacity, the most immediate solution is to hire more professors. I hope to write more about that soon, but for now I'll mention only that the problem of hiring enough faculty to teach all of our students has at least two facets. The first is that many schools simply don't have the money to hire more faculty right now. The second is that there aren't enough CS PhDs to go around. Roberts reported that, of last year's PhD grads, 83% took positions at R1 schools. That leaves 17% for the rest of us. "Non-R1 schools can expect to hire a CS PhD every 27 years." Everyone laughed, but I could see anxiety on more than a few faces.

The value of knowing this history is that, when we go to our deans and provosts, we can do more than argue for more resources. We can show the effect of not providing the resources needed to teach all the students coming our way. We won't just be putting the brakes on local growth; we may be helping to create the next enrollment crash. At a school like mine, if we teach the people of our state that we can't handle their CS students, then the people of our state will send their students elsewhere.

The problem for any one university, of course, is that it can act only based on its own resources and under local constraints. My dean and provost might care a lot about the global issues of demand for CS grads and need for greater diversity among CS students. But their job is to address local issues with their own (small) pool of money.

I'll have to re-read the papers Roberts has written about this topic. His remarks certainly gave us plenty to think about, and he was as engaging as ever.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 02, 2017 12:16 PM

The Challenge Facing CS Education

Today and tomorrow, I am at a CS Education Summit in Pittsburgh. I've only been to Pittsburgh once before, for ICFP 2002 (the International Conference on Functional Programming) and am glad to be back. It's a neat city.

The welcome address for the summit was given by Dr. Farnam Jahanian, the interim president at Carnegie Mellon University. Jahanian is a computer scientist, with a background in distributed computing and network security. His resume includes a stint as chair of the CS department at the University of Michigan and a stint at the NSF.

Welcome addresses for conferences and workshops vary in quality. Jahanian gave quite a good talk, putting the work of the summit into historical and cultural context. The current boom in CS enrollments is happening at a time when computing, broadly defined, is having an effect in seemingly all disciplines and all sectors of the economy. What does that mean for how we respond to the growth? Will we see that the current boom presages a change to the historical cycle of enrollments in coming years?

Jahanian made three statements in particular that for me capture the challenge facing CS departments everywhere and serve as a backdrop for the summit:

  • "We have to figure out how to teach all of these students."

    Unlike many past enrollment booms, "all of these students" this time comprises two very different subsets: CS majors and non-majors. We have plenty of experience teaching CS majors, but how do you structure your curriculum and classes when you have three times as many majors? When numbers go up far enough fast enough, many schools have a qualitatively different problem.

    Most departments have far less experience teaching computer science (not "literacy") to non-majors. How do you teach all of these students, with different backgrounds and expectations and needs? What do you teach them?

  • "This is an enormous responsibility."

    Today's graduates will have careers for 45 years or more. That's a long time, especially in a world that is changing ever more rapidly, in large part due to our own discipline. How different are the long-term needs of CS majors and non-majors? Both groups will be working and living for a long time after they graduate. If computing remains a central feature of the world in the future, how we respond to enrollment growth now will have an outsized effect on every graduate. An enormous responsibility, indeed.

  • "We in CS have to think about impending cultural changes..."

    ... which means that we computer science folks will need to have education, knowledge, and interests much broader than just CS. People talk all the time about the value of the humanities in undergraduate education. This is a great example of why. One bit of good news: as near as I can tell, most of the CS faculty in this room, at this summit, do have interests and education bigger than just computer science (*). But we have to find ways to work these issues into our classrooms, with both majors and non-majors.

Thus the idea of a CS education summit. I'm glad to be here.

(*) In my experience, it is much more likely to find a person with a CS or math PhD and significant educational background in the humanities than to find a person with a humanities PhD and significant educational background in CS or math (or any other science, for that matter). One of my hopes for the current trend of increasing interest in CS among non-CS majors is that we can close this gap. All of the departments on our campuses, and thus all of our university graduates, will be better for it.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

September 26, 2017 3:58 PM

Learn Exceptions Later

Yesterday, I mentioned rewriting the rules for computing FIRST and FOLLOW sets using only "plain English". As I was refactoring my descriptions, I realized that one of the reasons students have difficulty with many textbook treatments of the algorithms is that the books give complete and correct definitions of the sets upfront. The presence of X := ε rules complicates the construction of both sets, but they are unnecessary to understanding the commonsense ideas that motivate the sets. Trying to deal with ε too soon can interfere with the students learning what they need to learn in order to eventually understand ε!

When I left the ε rules out of my descriptions, I ended up with what I thought was an approachable set of rules:

  • The FIRST set of a terminal contains only the terminal itself.

  • To compute FIRST for a non-terminal X, find all of the grammar rules that have X on the lefthand side. Add to FIRST(X) all of the items in the FIRST set of the first symbol of each righthand side.

  • The FOLLOW set of the start symbol contains the end-of-stream marker.

  • To compute FOLLOW for a non-terminal X, find all of the grammar rules that have X on the righthand side. If X is followed by a symbol in the rule, add to FOLLOW(X) all of the items in the FIRST set of that symbol. If X is the last symbol in the rule, add to FOLLOW(X) all of the items in the FOLLOW set of the symbol on the rule's lefthand side.
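The four rules above translate almost directly into code. Here is a minimal Python sketch, not a definitive implementation. The toy grammar, the "$" end-of-stream marker, and the function names are all assumptions made up for illustration. The repeat-until-nothing-changes loop is also my own addition: the rules say what belongs in each set, not the order in which to apply them.

```python
# Epsilon-free FIRST and FOLLOW sets, following the four plain-English rules.
# The toy grammar, the "$" end marker, and all names here are illustrative
# assumptions, not taken from any particular textbook or tool.

GRAMMAR = {
    "S": [["A", "B"]],
    "A": [["a", "A"], ["a"]],
    "B": [["b", "C"], ["c"]],
    "C": [["d"]],
}
START = "S"
END = "$"


def first_sets(grammar):
    # Rule 1: the FIRST set of a terminal contains only the terminal itself.
    symbols = {sym for rules in grammar.values() for rhs in rules for sym in rhs}
    first = {sym: {sym} for sym in symbols if sym not in grammar}
    first.update({nonterminal: set() for nonterminal in grammar})

    # Rule 2: FIRST(X) gains the FIRST set of the first symbol of each
    # righthand side. Repeat until no set grows.
    changed = True
    while changed:
        changed = False
        for lhs, rules in grammar.items():
            for rhs in rules:
                size = len(first[lhs])
                first[lhs] |= first[rhs[0]]
                changed = changed or len(first[lhs]) > size
    return first


def follow_sets(grammar, start, first):
    follow = {nonterminal: set() for nonterminal in grammar}

    # Rule 3: FOLLOW of the start symbol contains the end-of-stream marker.
    follow[start].add(END)

    # Rule 4: look at every occurrence of a non-terminal on a righthand side.
    changed = True
    while changed:
        changed = False
        for lhs, rules in grammar.items():
            for rhs in rules:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue  # FOLLOW is computed only for non-terminals
                    size = len(follow[sym])
                    if i + 1 < len(rhs):
                        # X is followed by a symbol: add that symbol's FIRST set.
                        follow[sym] |= first[rhs[i + 1]]
                    else:
                        # X is last in the rule: add FOLLOW of the lefthand side.
                        follow[sym] |= follow[lhs]
                    changed = changed or len(follow[sym]) > size
    return follow


first = first_sets(GRAMMAR)
follow = follow_sets(GRAMMAR, START, first)
for nonterminal in GRAMMAR:
    print(nonterminal, sorted(first[nonterminal]), sorted(follow[nonterminal]))
```

For this toy grammar, FOLLOW(A) picks up b and c from FIRST(B), while B and C inherit the end marker from the symbols on their rules' lefthand sides. Those are the two halves of the last rule at work.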

These rules are incomplete, but they have offsetting benefits. Each of these cases is easy to grok with a simple example or two. They also account for a big chunk of the work students need to do in constructing the sets for a typical grammar. As a result, they can get some practice building sets before diving into the gnarlier details of ε, which affects both of the main rules above in a couple of ways.

This seems like a two-fold application of the Concrete, Then Abstract pattern. The first is the standard form: we get to see and work with accessible concrete examples before formalizing the rules in mathematical notation. The second involves the nature of the problem itself. The rules above are the concrete manifestation of FIRST and FOLLOW sets; students can master them before considering the more abstract ε cases. The abstract cases are the ones that benefit most from using formal notation.

I think this is an example of another pattern that works well when teaching. We might call it "Learn Exceptions Later", "Handle Exceptions Later", "Save Exceptions For Later", or even "Treat Exceptions as Exceptions". (Naming things is hard.) It is often possible to learn a substantial portion of an idea without considering exceptions at all, and doing so prepares students for learning the exceptions anyway.

I guess I now have at least one idea for my next PLoP paper.

Ironically, writing this post brings to mind a programming pattern that puts exceptions up top, which I learned during the summer Smalltalk taught me OOP. Instead of writing code like this:

    if normal_case(x) then
       // a bunch
       // of lines
       // of code
       // processing x
    else
       // handle the exceptional case

you can write:

    if abnormal_case(x) then
       // handle the exceptional case and return

    // a bunch
    // of lines
    // of code
    // processing x

This idiom brings the exceptional case to the top of the function and dispatches with it immediately. On the other hand, it also makes the normal case the main focus of the function, unindented and clear to the eye. It may look like this idiom violates the "Save Exceptions For Later" pattern, but code of this sort can be a natural outgrowth of following the pattern. First, we implement the function to do its normal business and make sure that it handles all of the usual cases. Only then do we concern ourselves with the exceptional case, and we build it into the function with minimal disruption to the code.
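In a language like Python, the idiom might look like the sketch below. The function `average` and its error message are hypothetical, chosen only to make the shape of the code concrete.

```python
def average(scores):
    # Exceptional case first: dispatch with it and get out of the way.
    if not scores:
        raise ValueError("cannot average an empty list of scores")

    # Normal case: unindented and the main focus of the function.
    total = sum(scores)
    return total / len(scores)


print(average([90, 80, 70]))
```

The normal-case code at the bottom could have been written and tested first. The guard at the top can be added afterward without touching any of it.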

This pattern has served me well over the years, far beyond Smalltalk.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

September 25, 2017 3:01 PM

A Few Thoughts on My Compilers Course

I've been meaning to blog about my compilers course for more than a month, but life -- including my compilers course -- has kept me busy. Here are three quick notes to prime the pump.

  • I recently came across Lindsey Kuper's My First Fifteen Compilers and thought again about this unusual approach to a compiler course: one compiler a week, growing last week's compiler with a new feature or capability, until you have a complete system. Long, long-time readers of this blog may remember me writing about this idea once over a decade ago.

    The approach still intrigues me. Kuper says that it was "hugely motivating" to have a working compiler at the end of each week. In the end I always shy away from the approach because (1) I'm not yet willing to adopt for my course the Scheme-enabled micro-transformation model for building a compiler and (2) I haven't figured out how to make it work for a more traditional compiler.

    I'm sure I'll remain intrigued and consider it again in the future. Your suggestions are welcome!

  • Last week, I mentioned on Twitter that I was trying to explain how to compute FIRST and FOLLOW sets using only "plain English". It was hard. Writing a textual description of the process made me appreciate the value of using and understanding mathematical notation. It is so expressive and so concise. The problem for students is that it is also quite imposing until they get it. Before then, the notation can be a roadblock on the way to understanding something at an intuitive level.

    My usual approach in class to FIRST and FOLLOW sets, as for most topics, is to start with an example, reason about it in commonsense terms, and only then to formalize. The commonsense reasoning often helps students understand the formal expression, thus removing some of its bite. It's a variant of the "Concrete, Then Abstract" pattern.

    Mathematical definitions such as these can motivate some students to develop their formal reasoning skills. Many people prefer to let students develop their "mathematical maturity" in math courses, but this is really just an avoidance mechanism. "Let the Math department fail them" may solve a practical problem, but sometimes we CS profs have to bite the bullet and help our students get better when they need it.

  • I have been trying to write more code for the course this semester, both for my enjoyment (and sanity) and for use in class. Earlier, I wrote a couple of toy programs such as a Fizzbuzz compiler. This weekend I took a deeper dive and began to implement my students' compiler project in full detail. It was a lot of fun to be deep in the mire of a real program again. I have already learned and re-learned a few things about Python, git, and bash, and I'm only a quarter of the way in! Now I just have to make time to do the rest as the semester moves forward.

In her post, Kuper said that her first compiler course was "a lot of hard work" but "the most fun I'd ever had writing code". I always tell my students that this course will be just like that for them. They are more likely to believe the first claim than the second. Diving in, I'm remembering those feelings firsthand. I think my students will be glad that I dove in. I'm reliving some of the challenges of doing everything that I ask them to do. This is already generating a new source of empathy for my students, which will probably be good for them come grading time.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

August 14, 2017 1:42 PM

Papert 1: Mathophobia, Affect, Physicality, and Love

I have finally started reading Mindstorms. I hope to write short reflections as I complete every few pages or anytime I come across something I feel compelled to write about in the moment. This is the first entry in the imagined series.

In the introduction, Papert says:

We shall see again and again that the consequences of mathophobia go far beyond obstructing the learning of mathematics and science. They interact with other endemic "cultural toxins", for example, with popular theories of aptitudes, to contaminate people's images of themselves as learners. Difficulty with school math is often the first step of an invasive intellectual process that leads us all to define ourselves as bundles of aptitudes and ineptitudes, as being "mathematical" or "not mathematical", "artistic" or "not artistic", "musical" or "not musical", "profound" or "superficial", "intelligent" or "dumb". Thus deficiency becomes identity, and learning is transformed from the early child's free exploration of the world to a chore beset by insecurities and self-imposed restrictions.

This invasive intellectual process has often deeply affected potential computer science students long before they reach the university. I would love to see Papert's dream made real early enough that young people can imagine being a computer scientist earlier. It's hard to throw off the shackles after they take hold.


The thing that sticks out as I read the first few pages of Mindstorms is its focus on the power of affect in learning. I don't recall conscious attention to my affect having much of a role in my education; it seems I was in a continual state of "cool, I get to learn something". I didn't realize at the time just what good fortune it was to have that as a default orientation.

I'm also struck by Papert's focus on the role of physicality in learning, how we often learn best when the knowledge has a concrete manifestation in our world. I'll have to think about this more... Looking back now, abstraction always seemed natural to me.

Papert's talk of love -- falling in love with the thing we learn about, but also with the thing we use to learn it -- doesn't surprise me. I know these feelings well, even from the earliest experiences I had in kindergarten.

An outside connection that I will revisit: Frank Oppenheimer's exploratorium, an aspiration I learned about from Alan Kay. What would a computational exploratorium look like?

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 28, 2017 2:02 PM

The Need for Apprenticeship in Software Engineering Education

In his conversation with Tyler Cowen, Ben Sasse talks a bit about how students learn in our schools of public policy, business, and law:

We haven't figured out in most professional schools how to create apprenticeship models where you cycle through different aspects of what doing this kind of work will actually look like. There are ways that there are tighter feedback loops at a med school than there are going to be at a policy school. There are things that I don't think we've thought nearly enough about ways that professional school models should diverge from traditional, theoretical, academic disciplines or humanities, for example.

We see a similar continuum in what works best, and what is needed, for learning computer science and learning software engineering. Computer science education can benefit from the tighter feedback loops and such that apprenticeship provides, but it also has a substantial theoretical component that is suitable for classroom instruction. Learning to be a software engineer requires a shift to the other end of the continuum: we can learn important things in the classroom, but much of the important learning happens in the trenches, making things and getting feedback.

A few universities have made big moves in how they structure software engineering instruction, but most have taken only halting steps. They are often held back by an institutional loyalty to the traditional academic model, or out of sheer curricular habit.

The one place you see apprenticeship models in CS is, of course, graduate school. Students who enter research work in the lab under the mentorship of faculty advisors and more senior grad students. It took me a year or so in graduate school to figure out that I needed to begin to place more focus on my research ideas than on my classes. (I hadn't committed to a lab or an advisor yet.)

In lieu of a changed academic model, internships of the sort I mentioned recently can be really helpful for undergrad CS students looking to go into software development. Internships create a weird tension for faculty... Most students come back from the workplace with a new appreciation for the academic knowledge they learn in the classroom, which is good, but they also come back to wonder why more of their schoolwork can't have the character of learning in the trenches. They know to want more!

Project-based courses are a way for us to bring the value of apprenticeship to the undergraduate classroom. I am looking forward to building compilers with ten hardy students this fall.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

July 11, 2017 3:17 PM

Blogging as "Loud Thinking"

This morning, I tweeted a quote from Sherry Turkle's Remembering Seymour Papert that struck a chord with a few people: "Seymour Papert saw that the computer would make it easier for thinking itself to become an object of thought." Here is another passage that struck a chord with me:

At the time of the juggling lesson, Seymour was deep in his experiments into what he called 'loud thinking'. It was what he was asking my grandfather to do. What are you trying? What are you feeling? What does it remind you of? If you want to think about thinking and the real process of learning, try to catch yourself in the act of learning. Say what comes to mind. And don't censor yourself. If this sounds like free association in psychoanalysis, it is. (When I met Seymour, he was in analysis with Greta Bibring.) And if it sounds like it could get you into personal, uncharted, maybe scary terrain, it could. But anxiety and ambivalence are part of learning as well. If not voiced, they block learning.

It occurred to me that I blog as a form of "loud thinking". I don't write many formal essays or finished pieces for my blog these days. Mostly I share thoughts as they happen and think out loud about them in writing. Usually, it's just me trying to make sense of ideas that cross my path and see where they fit in with the other things I'm learning. I find that helpful, and readers sometimes help me by sharing their own thoughts and ideas.

When I first read the phrase "loud thinking", it felt awkward, but it's already growing on me. Maybe I'll try to get my compiler students to do some loud thinking this fall.

By the way, Turkle's entire piece is touching and insightful. I really liked the way she evoked Papert's belief that we "love the objects we think with" and "think with the objects we love". (And not just because I'm an old Smalltalk programmer!) I'll let you read the rest of the piece yourself to appreciate both the notion and Turkle's storytelling.

Now, for a closing confession: I have never read Mindstorms. I've read so much about Papert and his ideas over the years, but the book has never made it to the top of my stack. I pledge to correct this egregious personal shortcoming and read it as soon as I finish the novel on my nightstand. Maybe I'll think out loud about it here soon.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 30, 2017 1:35 PM

Be Honest About Programming; The Stakes Are High

It's become commonplace of late to promote programming as fun! and something everyone will want to learn, if only they had a chance. Now, I love to program, but I've also been teaching long enough to know that not everyone takes naturally to programming. Sometimes, they warm up to it later in their careers, and sometimes, they never do.

This Quartz article takes the conventional wisdom to task as misleading:

Insisting on the glamour and fun of coding is the wrong way to acquaint kids with computer science. It insults their intelligence and plants the pernicious notion in their heads that you don't need discipline in order to progress. As anyone with even minimal exposure to making software knows, behind a minute of typing lies an hour of study.

The author does think that people should understand code and what it means to program, though not because they necessarily will program very much themselves:

In just a few years, understanding programming will be an indispensable part of active citizenship.

This is why it's important for people to learn about programming, and why it's so important not to sell it in a way that ambushes students when they encounter it for the first time. Software development is both technically and ethically challenging. All citizens will be better equipped to participate in the world if they understand these challenges at some level. Selling the challenges short makes it harder to attract people who might be interested in the ethical challenges and harder to retain people turned off by technical challenges they weren't expecting.

Posted by Eugene Wallingford | Permalink | Categories: Computing

June 25, 2017 10:05 AM

Work on Cool Hard Problems; It Pays Off Eventually

In The Secret Origin Story of the iPhone, we hear about the day Steve Jobs told Bas Ording, one of Apple's UI "wizards", ...

... to make a demo of scrolling through a virtual address book with multitouch. "I was super-excited," Ording says. "I thought, Yeah, it seems kind of impossible, but it would be fun to just try it." He sat down, "moused off" a phone-size section of his Mac's screen, and used it to model the iPhone surface. He and a scant few other designers had spent years experimenting with touch-based user interfaces -- and those years in the touchscreen wilderness were paying off.

I'm guessing that a lot of programmers understand what Ording felt in that moment: "It seems kind of impossible, but it would be fun to just try it..." But he and his team were ready. Sometimes, you work in the wilderness a while, and suddenly it all pays off.

Posted by Eugene Wallingford | Permalink | Categories: Computing

June 10, 2017 10:28 AM

98% of the Web in One Sentence

Via Pinboard's creator, the always entertaining Maciej Cegłowski:

Pinboard is not much more than a thin wrapper around some carefully tuned database queries.

You are ready to make your millions. Now all you need is an idea.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

June 08, 2017 12:10 PM

We Need a Course on Mundane Data Types

Earlier this month, James Iry tweeted:

Every CS degree covers fancy data structures. But what trips up more programmers? Times. Dates. Floats. Non-English text. Currencies.

I would like to add names to Iry's list. As a person who goes by his middle name and whose full legal name includes a suffix, I've seen my name mangled over the years in ways you might not imagine -- even by a couple of computing-related organizations that shall remain nameless. (Ha!) And my name presents only a scant few of the challenges available when we consider all the different naming conventions around the world.

This topic would make a great course for undergrads. We could call it "Humble Data Types" or "Mundane Data Types". My friends who program for a living know that these are practical data types, the ones that show up in almost all software and which consume an inordinate amount of time. That's why we see pages on the web about "falsehoods programmers believe" about time, names, and addresses -- another one for our list!
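A few lines of Python hint at the kind of surprise such a course might open with. This is only an illustrative sketch using the standard library; the particular values are made up, and a real course would dig much deeper into each type.

```python
import unicodedata
from datetime import date, timedelta
from decimal import Decimal

# Floats: binary floating point cannot represent 0.1 or 0.2 exactly.
float_total = 0.1 + 0.2
print(float_total == 0.3)

# Currencies: Decimal does exact base-10 arithmetic on amounts like these.
decimal_total = Decimal("0.10") + Decimal("0.20")
print(decimal_total == Decimal("0.30"))

# Dates: "thirty days later" is not the same thing as "a month later".
jan_31 = date(2017, 1, 31)
thirty_days_later = jan_31 + timedelta(days=30)
print(thirty_days_later)

# Non-English text: the same visible character can be stored two ways.
composed = "\u00e9"      # e-acute as a single code point
decomposed = "e\u0301"   # plain e followed by a combining acute accent
print(composed == decomposed)
print(unicodedata.normalize("NFC", decomposed) == composed)
```

None of these behaviors is a bug in the language. Each one follows from a design decision that programmers need to understand before they trust the data type with real money, real schedules, or real names.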

It might be hard to sell this course to faculty. They are notoriously reluctant to add new courses to the curriculum. (What would it displace?) Such conservatism is well-founded in a discipline that moves quickly through ideas, but this is a topic that has been vexing programmers for decades.

It would also be hard to sell the course to students, because it looks a little, well, mundane. I do recall a May term class a few years ago in which a couple of programmers spent days fighting with dates and times in Ruby while building a small accounting system. That certainly created an itch, but I'm not sure most students have enough experience with such practical problems before they graduate.

Maybe we could offer the course as continuing education for programmers out in the field. They are the ones who would appreciate it the most.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 07, 2017 1:43 PM

Write a Program, Not a Slide Deck

From Compress to Impress, on Jeff Bezos's knack for encoding important strategies in concise, memorable form:

As a hyper intelligent person, Jeff didn't want lossy compression or lazy thinking, he wanted the raw feed in a structured form, and so we all shifted to writing our arguments out as essays that he'd read silently in meetings. Written language is a lossy format, too, but it has the advantage of being less forgiving of broken logic flows than slide decks.

Ask any intro CS student: Even less forgiving of broken logic than prose is the computer program.

Programs are not usually the most succinct way to express an idea, but I'm often surprised by how little work it takes to express an idea about a process in code. When a program is a viable medium for communicating an idea, it provides value in many dimensions. You can run a program, which makes the code's meaning observable in its behavior. A program lays bare logic and assumptions, making them observable, too. You can tinker with a program, looking at variations and exploring their effects.

The next time you have an idea about a process, try to express it in code. A short bit of prose may help, too.

None of this is intended to diminish the power of using rhetorical strategies to communicate at scale and across time, as described in the linked post. It's well worth a read. From the outside looking in, Bezos seems to be a remarkable leader.

Posted by Eugene Wallingford | Permalink | Categories: Computing

June 06, 2017 2:39 PM

Using Programs and Data Analysis to Improve Writing, World Bank Edition

Last week I read a tweet that linked to an article by Paul Romer. He is an economist currently working at the World Bank, on leave from his chair at NYU. Romer writes well, so I found myself digging deeper and reading a couple of his blog articles. One of them, Writing, struck a chord with me both as a writer and as a computer scientist.


The quality of written prose should be higher in documents that will have many readers.

This is true of code, too. If a piece of code will be read many times, whether by one person or several, then each minute spent making it shorter and clearer improves reading comprehension every single time. That's even more important in code than in text, because so often we read code in order to change it. We need to understand it at an even deeper level to ensure that our changes have the intended effect. Time spent making code better repays itself many times over.

Romer caused a bit of a ruckus when he arrived at the World Bank by insisting, to some of his colleagues' displeasure, that everyone in his division write clearer, more concise reports. His goal was admirable: He wanted more people to be able to read and understand these reports, because they deal with policies that matter to the public.

He also wanted people to trust what the World Bank was saying by being able more readily to see that a claim was true or false. His article looks at two different examples that make a claim about the relationship between education spending and GDP per capita. He concludes his analysis of the examples with:

In short, no one can say that the author of the second claim wrote something that is false because no one knows what the second claim means.

In science, writing clearly builds trust. This trust is essential for communicating results to the public, of course, because members of the public do not generally possess the scientific knowledge they need to assess the truth of a claim directly. But it is also essential for communicating results to other scientists, who must understand the claims at a deeper level in order to support, falsify, and extend them.

In the second half of the article, Romer links to a study of the language used in the World Bank's yearly reports. It looks at patterns such as the frequency of the word "and" in the reports and the ratio of nouns to verbs. (See this Financial Times article for a fun little counterargument on the use of "and".)
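The simplest of these measurements takes only a few lines of code. Here is a minimal sketch in Python of computing the share of words in a text that are "and". The function name `word_share`, the tokenizing rule, and the sample sentence are my own assumptions for illustration; this is not the code used in the study Romer links to.

```python
import re


def word_share(text, word):
    """Return the fraction of words in `text` that are `word`, ignoring case."""
    # Crude tokenizer: runs of letters and apostrophes count as words.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return words.count(word.lower()) / len(words)


# A made-up sentence, not text from a World Bank report.
sample = "Growth and trade and health and education matter."
print(f"{word_share(sample, 'and'):.3f}")  # prints 0.375 (3 of 8 words)
```

The ratio of nouns to verbs would take more machinery, such as a part-of-speech tagger, so I leave it out of this sketch.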

Romer wants this sort of analysis to be easier to do, so that it can be used more easily to check and improve the World Bank's reports. After looking at some other patterns of possible interest, he closes with this:

To experiment with something like this, researchers in the Bank should be able to spin up a server in the cloud, download some open-source software and start experimenting, all within minutes.

Wonderful: a call for storing data in easy-to-access forms and a call for using (and writing) programs to analyze text, all in the name not of advancing economics technically but of improving its ability to communicate its results. Computing becomes a tool integrated into the process of the World Bank doing its designated job. We need more leaders in more disciplines thinking this way. Fortunately, we hear reports of such folks more often these days.

Alas, data and programs were not used in this way when Romer arrived at the World Bank:

When I arrived, this was not possible because people in ITS did not trust people from DEC and, reading between the lines, were tired of dismissive arrogance that people from DEC displayed.

One way to create more trust is to communicate better. Not being dismissively arrogant is, too, though calling that sort of behavior out may be what got Romer in so much hot water with the administrators and economists at the World Bank in the first place.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development

May 21, 2017 10:07 AM

Computer Programs Have Much to Learn, and Much to Teach Us

In his recent interview with Tyler Cowen, Garry Kasparov talks about AI, chess, politics, and the future of creativity. In one of the more intriguing passages, he explains that building databases for chess endgames has demonstrated how little we understand about the game and offers insight into how we know that chess-playing computer programs -- now so far beyond humans that even the world champion can only score occasionally against commodity programs -- still have a long way to improve.

He gives as an example a particular position with a king, two rooks, and a knight on one side versus a king and two rooks on the other. Through the retrograde analysis used to construct endgame databases, we know that, with ideal play by both sides, the stronger side can force checkmate in 490 moves. Yes, 490. Kasparov says:

Now, I can tell you that -- even being a very decent player -- for the first 400 moves, I could hardly understand why these pieces moved around like a dance. It's endless dance around the board. You don't see any pattern, trust me. No pattern, because they move from one side to another.

At certain points I saw, "Oh, but white's position has deteriorated. It was better 50 moves before." The question is -- and this is a big question -- if there are certain positions in these endgames, like seven-piece endgames, that take, by the best play of both sides, 500 moves to win the game, what does it tell us about the quality of the game that we play, which is an average 50 moves? [...]

Maybe with machines, we can actually move our knowledge much further, and we can understand how to play decent games at much greater lengths.

But there's more. Do chess-playing computer programs, so much superior to even the best human players, understand these endgames either? I don't mean "understand" in the human sense, but only in the sense of being able to play games of that quality. Kasparov moves on to his analysis of games between the best programs:

I think you can confirm my observations that there's something strange in these games. First of all, they are longer, of course. They are much longer because machines don't make the same mistakes [we do] so they could play 70, 80 moves, 100 moves. [That is] way, way below what we expect from perfect chess.

That tells us that [the] machines are not perfect. Most of those games are decided by one of the machines suddenly. Can I call it losing patience? Because you're in a position that is roughly even. [...] The pieces are all over, and then suddenly one machine makes a, you may call, human mistake. Suddenly it loses patience, and it tries to break up without a good reason behind it.

That also tells us [...] that machines also have, you may call it, psychology, the pattern and the decision-making. If you understand this pattern, we can make certain predictions.

Kasparov is heartened by this, and it's part of the reason that he is not as pessimistic about the near-term prospects of AI as some well-known scientists and engineers are. Even with so-called deep learning, our programs are only beginning to scratch the surface of complexity in the universe. There is no particular reason to think that the opaque systems evolved to drive our cars and fly our drones will be any more perfect in their domains than our game-playing programs, and we have strong evidence from the domain of games that programs are still far from perfect.

On a more optimistic note, advances in AI give us an opportunity to use programs to help us understand the world better and to improve our own judgment. Kasparov sees this in chess, in the big gaps between the best human play, the best computer play, and perfect play in even relatively simple positions; I wrote wistfully about this last year, prompted by AlphaGo's breakthrough. But the opportunity is much more valuable when we move beyond playing games, as Cowen alluded to in an aside during Kasparov's explanation: Imagine how bad our politics will look in comparison to computer programs that do it well! We have much to learn.

As always, this episode of Conversations with Tyler was interesting and evocative throughout. If you are a chess player, there is a special bonus. The transcript includes a pointer to Kasparov's Immortal Game against Veselin Topalov at Wijk aan Zee in 1999, along with a discussion of some of Kasparov's thoughts on the game beginning with the pivotal move 24. Rxd4. This game, an object of uncommon beauty, will stand as an eternal reminder of why, even in the face of advancing AI, it will always matter that people play and compete and create.


If you enjoyed this entry, you might also like Old Dreams Live On. It looks more foresightful now that AlphaGo has arrived.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

April 28, 2017 11:27 AM

Data Compression and the Complexity of Consciousness

Okay, so this is cool:

Neuroscientists stimulate the brain with brief pulses of energy and then record the echoes that bounce back. Dreamless sleep and general anaesthesia return simple echoes; brains in conscious states produce more complex patterns. Then comes a little inspiration from data compression:

Excitingly, we can now quantify the complexity of these echoes by working out how compressible they are, similar to how simple algorithms compress digital photos into JPEG files. The ability to do this represents a first step towards a "consciousness-meter" that is both practically useful and theoretically motivated.

This made me think of Chris Ford's StrangeLoop 2015 talk about using compression to understand music. Using compressibility as a proxy for complexity gives us a first opportunity to understand all sorts of phenomena about which we are collecting data. Kolmogorov complexity is a fun tool for programmers to wield.

The passage above is from an Aeon article on the study of consciousness. I found it an interesting read from beginning to end.
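The idea of using compressibility as a proxy for complexity is easy to try at home. Here is a minimal sketch in Python, assuming the standard `zlib` module as the compressor; the two sample signals and the name `compression_ratio` are hypothetical, chosen only for illustration. Kolmogorov complexity itself is uncomputable, so the size of a compressed file gives only an upper-bound estimate of it.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size; lower means more regular."""
    return len(zlib.compress(data, 9)) / len(data)


# A highly regular signal: the same two bytes repeated 5000 times.
simple = b"ab" * 5000

# An irregular signal: 10,000 pseudo-random bytes (seeded so runs repeat).
rng = random.Random(42)
noisy = bytes(rng.randrange(256) for _ in range(10_000))

print(f"simple: {compression_ratio(simple):.3f}")
print(f"noisy:  {compression_ratio(noisy):.3f}")
```

The regular signal shrinks to a small fraction of its original size, while the pseudo-random one barely shrinks at all.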

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

March 29, 2017 4:16 PM

Working Through A Problem Manually

This week, I have been enjoying Eli Bendersky's two-article series "Adventures in JIT Compilation".

Next I'll follow his suggestion and read the shorter How to JIT - An Introduction.

Bendersky is a good teacher, at least in the written form, and I am picking up a lot of ideas for my courses in programming languages and compilers. I recommend his articles and his code highly.

In Part 2, Bendersky says something that made me think of my students:

One of my guiding principles through the field of programming is that before diving into the possible solutions for a problem (for example, some library for doing X) it's worth working through the problem manually first (doing X by hand, without libraries). Grinding your teeth over issues for a while is the best way to appreciate what the shrinkwrapped solution/library does for you.

The presence or absence of this attitude is one of the crucial separators among CS students. Some students come into the program with this mindset already in place, and they are often the ones who advance most quickly in the early courses. Other students don't have this mindset, either by interest or by temperament. They prefer to solve problems quickly using canned libraries and simple patterns. These students are often quite productive, but they sometimes hit a wall in their learning before long. When students ride along the surface of what they are told in class, never digging deeper, they tend to have a shallow knowledge of how things work in their own programs. Again, this can lead to a high level of productivity, but it also produces brittle knowledge. When something changes, or the material gets more difficult, they begin to struggle. A few of the students eventually develop new habits and move nicely into the group of students who like to grind. The ones who don't make the transition continue to struggle and begin to enjoy their courses less.

There is a rather wide variation among undergrad CS students, both in their goals and in their preferred styles of working and learning. This variation is one of the challenges facing profs who hope to reach the full spectrum of students in their classes. And helping students to develop new attitudes toward learning and doing is always a challenge.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 22, 2017 4:50 PM

Part of the Fun of Programming

As I got ready for class yesterday morning, I decided to refactor a piece of code. No big deal, right? It turned out to be a bigger deal than I expected. That's part of the fun of programming.

The function in question is a lexical addresser for a little language we use as a specimen in my Programming Languages course. My students had been working on a design, and it was time for us to build a solution as a group. Looking at my code from the previous semester, I thought that changing the order of two cases would make for a better story in class. The cases are independent, so I swapped them and ran my tests.

The change broke my code. It turns out that the old "else" clause had been serving as a convenient catch-all and was only working correctly due to an error in another function. Swapping the cases exposed the error.

Ordinarily, this wouldn't be a big deal, either. I would simply fix the code and give my students a correct solution. Unfortunately, I had less than an hour before class, so I now found myself in a scramble to find the bug, fix it, and make the changes to my lecture notes that had motivated the refactor in the first place. Making changes like this under time pressure is rarely a good idea... I was tempted to revert to the previous version, teach class, and make the change after class. But I am a programmer, dogged and often foolhardy, so I pressed on. With a few minutes to spare, I closed the editor on my lecture notes and synced the files to my teaching machine. I was tired and still had a little nervous energy coursing through me, but I felt great. That's part of the fun of programming.

I will say this: Boy, was I glad to have my test suite! It was incomplete, of course, because I found an error in my program. But the tests I did have helped me to know that my bug fix had not broken something else unexpectedly. The error I found led to several new tests that make the test suite stronger.
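For readers who have not met one, a lexical addresser replaces each variable reference with the depth and position at which the variable is bound. Here is a minimal sketch in Python, not the code from my course: the tuple-based expression format, the name `lexical_address`, and the `"free"` marker are all assumptions made for illustration. Note that it ends by raising an error rather than with a catch-all "else" clause, so an unexpected case cannot slip through quietly.

```python
def lexical_address(exp, scopes=()):
    """Annotate variable references with (name, depth, position).

    Assumed expression format (hypothetical, for illustration only):
      "x"                              a variable reference
      ("lambda", ("x", "y"), body)     a function with parameters
      ("app", operator, operand, ...)  an application
    """
    if isinstance(exp, str):
        # Search from the innermost scope outward.
        for depth, params in enumerate(scopes):
            if exp in params:
                return (exp, depth, params.index(exp))
        return (exp, "free")
    if isinstance(exp, tuple) and exp and exp[0] == "lambda":
        _, params, body = exp
        inner = (tuple(params),) + scopes
        return ("lambda", params, lexical_address(body, inner))
    if isinstance(exp, tuple) and exp and exp[0] == "app":
        return ("app",) + tuple(lexical_address(e, scopes) for e in exp[1:])
    # No catch-all case: anything unrecognized is reported loudly.
    raise ValueError(f"unrecognized expression: {exp!r}")


example = ("lambda", ("x",), ("lambda", ("y",), ("app", "x", "y", "z")))
print(lexical_address(example))
```

In the example, `x` is found one scope out at position 0, `y` in the innermost scope at position 0, and `z` is marked as free.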

This experience was fresh in my mind this morning when I read "Physics Was Paradise", an interview with Melissa Franklin, a distinguished experimental particle physicist at Harvard. At one point, Franklin mentioned taking her first physics course in high school. The interviewer asked if physics immediately stood out as something she would dedicate her life to. Franklin responded:

Physics is interesting, but it didn't all of a sudden grab me because introductory physics doesn't automatically grab people. At that time, I was still interested in being a writer or a philosopher.

I took my first programming class in high school and, while I liked it very much, it did not cause me to change my longstanding intention to major in architecture. After starting in the architecture program, I began to sense that, while I liked architecture and had much to learn from it, computer science was where my future lay. Maybe somewhere deep in my mind was a memory of an experience like the one I had yesterday, battling a piece of code and coming out with a sense of accomplishment and a desire to do battle again. I didn't feel the same way when working on problems in my architecture courses.

Intro CS, like intro physics, doesn't always snatch people away from their goals and dreams. But if you enjoy the fun of programming, eventually it sneaks up on you.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Software Development

March 18, 2017 11:42 AM

Hidden Figures, Douglas Engelbart Edition

At one point in this SOHP interview, Douglas Engelbart describes his work at the Ames Research Center after graduating from college. He was an electrical engineer, building and maintaining wind tunnels, paging systems, and other special electronics. Looking to make a connection between this job and his future work, the interviewer asked, "Did they have big computers running the various operations?" Engelbart said:

I'll tell you what a computer was in those days. It was an underpaid woman sitting there with a hand calculator, and they'd have rooms full of them, that's how they got their computing done. So you'd say, "What's your job?" "I'm a computer."

Later in the interview, Engelbart talks about how his experience working with radar in the Navy contributed to his idea for a symbol-manipulating system that could help people deal with complexity and urgency. He viewed the numeric calculations done by the human computers at Ames as being something different. Still, I wonder how much this model of parallel computing contributed to his ideas, if only implicitly.

Posted by Eugene Wallingford | Permalink | Categories: Computing

March 17, 2017 9:27 AM

What It Must Feel Like to be Ivan Sutherland

In The Victorian Internet, Tom Standage says this about the unexpected challenge facing William Cooke and Samuel Morse, the inventors of the telegraph:

[They] had done the impossible and constructed working telegraphs. Surely the world would fall at their feet. Building the prototypes, however, turned out to be the easy part. Convincing people of their significance was far more of a challenge.

That must be what it feels like to be Ivan Sutherland. Or Alan Kay, for that matter.

Posted by Eugene Wallingford | Permalink | Categories: Computing

March 16, 2017 8:50 AM

Studying Code Is More Like Natural Science Than Reading

A key passage from Peter Seibel's 2014 essay, Code Is Not Literature:

But then it hit me. Code is not literature and we are not readers. Rather, interesting pieces of code are specimens and we are naturalists. So instead of trying to pick out a piece of code and reading it and then discussing it like a bunch of Comp Lit. grad students, I think a better model is for one of us to play the role of a 19th century naturalist returning from a trip to some exotic island to present to the local scientific society a discussion of the crazy beetles they found: "Look at the antenna on this monster! They look incredibly ungainly but the male of the species can use these to kill small frogs in whose carcass the females lay their eggs."

The point of such a presentation is to take a piece of code that the presenter has understood deeply and for them to help the audience understand the core ideas by pointing them out amidst the layers of evolutionary detritus (a.k.a. kludges) that are also part of almost all code. One reasonable approach might be to show the real code and then to show a stripped down reimplementation of just the key bits, kind of like a biologist staining a specimen to make various features easier to discern.

My scientist friends often like to joke that CS isn't science, even as they admire the work that computer scientists and programmers do. I think Seibel's essay expresses nicely one way in which studying software really is like what natural scientists do. True, programs are created by people; they don't exist in the world as we find it. (At least programs in the sense of code written by humans to run on a computer.) But they are created under conditions that look a lot more like biological evolution than, say, civil engineering.

As Hal Abelson says in the essay, most real programs end up containing a lot of stuff just to make them work in the complex environments in which they operate. The extraneous stuff enables the program to connect to external APIs and plug into existing frameworks and function properly in various settings. But the extraneous stuff is not the core idea of the program.

When we study code, we have to slash our way through the brush to find this core. When dealing with complex programs, this is not easy. The evidence of adaptation and accretion obscures everything we see. Many people do what Seibel does when they approach a new, hairy piece of code: they refactor it, decoding the meaning of the program and re-coding it in a structure that communicates their own understanding. Who knows; the original program may well have looked like this simple core once, before it evolved strange appendages in order to adapt to the conditions in which it needed to live.

The folks who helped to build the software patterns community recognized this. They accepted that every big program "in the wild" is complex and full of cruft. But they also asserted that we can study such programs and identify the recurring structures that enable complex software both to function as intended and to be open to change and growth at the hands of programmers.

One of the holy grails of software engineering is to find a way to express the core of a system in a clear way, segregating the extraneous stuff into modules that capture the key roles that each piece of cruft plays. Alas, our programs usually end up more like the human body: a mass of kludges that intertwine to do some very cool stuff just well enough to succeed in a harsh world.

And so: when we read code, we really do need to bring the mindset and skills of a scientist to our task. It's not like reading People magazine.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

February 19, 2017 10:42 AM

Bret Victor's Advice for Reading Alan Kay

In Links 2013, Bret Victor offers these bits of advice for how to read Alan Kay's writings and listen to his talks:

As you read and watch Alan Kay, try not to think about computational technology, but about a society that is fluent in thinking and debating in the dimensions opened up by the computational medium.

Don't think about "coding" (that's ink and metal type, already obsolete), and don't think about "software developers" (medieval scribes only make sense in an illiterate society).

I have always been inspired and challenged by Kay's work. One of the second-order challenges I face is to remember that his vision is not ultimately about people like me writing programs. It's about a culture in which every person can use computational media the way we all use backs of envelopes, sketch books, newspapers, and books today. Computation can change the way we think and exchange ideas.

Then again, it's hard enough to teach CS students to program. That is a sign that we still have work to do in understanding programming better, and also in thinking about the kind of tools we build and use. In terms of Douglas Engelbart's work -- also prominently featured among Victor's 2013 influences -- we need to build tools to improve how we program before we can build tools to "improve the improving".

Links 2013 could be the reading list for an amazing seminar. There are no softballs there.

Posted by Eugene Wallingford | Permalink | Categories: Computing

February 09, 2017 4:25 PM

Knowing, Doing, and Ubiquitous Information

I was recently reading an old bit-player entry on computing number factoids when I ran across a paragraph that expresses an all-too-common phenomenon of the modern world:

If I had persisted in my wrestling match, would I have ultimately prevailed? I'll never know, because in this era of Google and MathOverflow and StackExchange, a spoiler lurks around every cybercorner. Before I could make any further progress, I stumbled upon pointers to the work of Ira Gessel of Brandeis, who neatly settled the matter ... more than 40 years ago, when he was an undergraduate at Harvard.

The matter in this case was recognizing whether an arbitrary n is a Fibonacci number or not, but it could have been just about anything. If you need an answer to almost any question these days, it's already out there, right at your fingertips.
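Spoiler alert, for readers who would rather wrestle with the problem themselves. The characterization usually credited to Gessel is that a non-negative integer n is a Fibonacci number exactly when 5n^2 + 4 or 5n^2 - 4 is a perfect square. Here is a minimal sketch of that test in Python; the function names are my own, and I treat 0 as a Fibonacci number.

```python
from math import isqrt


def is_perfect_square(m):
    """True if m is the square of some non-negative integer."""
    if m < 0:
        return False
    r = isqrt(m)  # exact integer square root, safe for big integers
    return r * r == m


def is_fibonacci(n):
    """True if n is a Fibonacci number, using the 5n^2 +/- 4 test."""
    if n < 0:
        return False
    return is_perfect_square(5 * n * n + 4) or is_perfect_square(5 * n * n - 4)


print([n for n in range(60) if is_fibonacci(n)])
# prints [0, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```

Using `isqrt` rather than a floating-point square root keeps the test exact even for very large n.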

Google and StackExchange and MathOverflow are a boon for knowing, but not so much for doing. Unfortunately, doing often leads to a better kind of knowing. Jumping directly to the solution can rob us of some important learning. As Hayes reminds us in his articles, it can also deprive us of a lot of fun.

You can still learn by doing and have a lot of fun doing it today -- if you can resist the temptation to search. After you struggle for a while and need some help, having answers at your fingertips becomes a truly magnificent resource that can help you get over humps you could never have gotten over so quickly in even the not-so-distant past.

The new world puts a premium on curiosity, the desire to find answers for ourselves. It also values self-denial, the ability to delay gratification while working hard to find an answer that we might be able to look up. I fear that this creates a new gap for us to worry about in our education systems. Students who are curious and capable of self-denial are a new kind of "haves". They have always had a leg up in schools, but ubiquitous information magnifies the gap.

Being curious, asking questions, and wanting to create (not just look up) answers have never been more important to learning.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 11, 2017 2:22 PM

An Undergraduate Research Project for Spring

Coming into the semester, I didn't have any students doing their undergraduate research under my supervision. That frees up some time each week, which is nice, but leaves my semester a bit poorer. Working with students one-on-one is one of the best parts of this job, even more so in relief against administrative duties. Working on these projects makes my weeks better, even when I don't have as much time to devote to them as I'd like.

Yesterday, a student walked in with a project that makes my semester a little busier -- and much more interesting. Last summer, he implemented some ideas on extensible effects in Haskell and has some ideas for ways to make the system more efficient.

This student knows a lot more about extensible effects and Haskell than I do, so I have some work to do just to get ready to help. I'll start with Extensible Effects: An Alternative to Monad Transformers, the paper by Oleg Kiselyov and his colleagues that introduced the idea to the wider computing community. This paper builds on work by Cartwright and Felleisen, published over twenty years ago, which I'll probably look at, too. The student has a couple of other things for me to read, which will appear in his more formal proposal this week. I expect that these papers will make my brain hurt, in the good way, and am looking forward to diving in.

In the big picture, most undergrad projects in my department are pretty minor as research goes. They are typically more D than R, with students developing something that goes beyond what they learn in any course and doing a little empirical analysis. The extensible effects project is much more ambitious. It builds on serious academic research. It works on a significant problem and proposes something new. That makes the project much more exciting for me as the supervisor.

I hope to report more later, as the semester goes on.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 10, 2017 4:07 PM

Garbage Collection -- and the Tempting Illusion of Big Breakthroughs

Good advice in this paragraph, paraphrased lightly from Modern Garbage Collection:

Garbage collection is a hard problem, really hard, one that has been studied by an army of computer scientists for decades. Be very suspicious of supposed breakthroughs that everyone else missed. They are more likely to just be strange or unusual tradeoffs in disguise, avoided by others for reasons that may only become apparent later.

It's wise always to be on the lookout for "strange or unusual tradeoffs in disguise".

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

January 08, 2017 9:48 AM

Finding a Balance Between Teaching and Doing

In the Paris Review's The Art of Fiction No. 183, the interviewer asks Tobias Wolff how he balances writing with university teaching. Wolff figures that teaching is a pretty good deal:

When I think about the kinds of jobs I've had and the ways I've lived, and still managed to get work done--my God, teaching in a university looks like easy street. I like talking about books, and I like encountering other smart, passionate readers, and feeling the friction of their thoughts against mine. Teaching forces me to articulate what otherwise would remain inchoate in my thoughts about what I read. I find that valuable, to bring things to a boil.

That reflects how I feel, too, as someone who loves to do computer science and write programs. As a teacher, I get to talk about cool ideas every day with my students, to share what I learn as I write software, and to learn from them as they ask the questions I've stopped asking myself. And they pay me. It's a great deal, perhaps the optimal point in the sort of balance that Derek Sivers recommends.

Wolff immediately followed those sentences with a caution that also strikes close to home:

But if I teach too much it begins to weigh on me--I lose my work. I can't afford to do that anymore, so I keep a fairly light teaching schedule.

One has to balance creative work with the other parts of life that feed the work. Professors at research universities, such as Wolff at Stanford, have different points of equilibrium available to them than profs at teaching universities, where course loads are heavier and usually harder to reduce.

I only teach one course a semester, which really does help me to focus creative energies around a smaller set of ideas than a heavier load does. Of course, I also have the administrative duties of a department head. They suffocate time and energy in a much less productive way than teaching does. (That's the subject of another post.)

Why can't Wolff afford to teach too many courses anymore? I suspect the answer is time. When you reach a certain age, you realize that time is no longer an ally. There are only so many years left, and Wolff probably feels the need to write more urgently. This sensation has been seeping into my mind lately, too, though I fear perhaps a bit too slowly.


(I previously quoted Wolff from the same interview in a recent entry about writers who give advice that reminds us that there is no right way to write all programs. A lot of readers seemed to like that one.)

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

December 26, 2016 8:38 AM

Learn By Programming

The latest edition of my compiler course has wrapped, with grades submitted and now a few days' distance between us and the work. The course was successful in many ways, even though not all of the teams were able to implement the entire compiler. That mutes the students' sense of accomplishment sometimes, but it's not unusual for at least some of the teams to have trouble implementing a complete code generator. A compiler is a big project. Fifteen weeks is not a lot of time. In that time, students learn a lot about compilers, and also about how to work as a team to build a big program using some of the tools of modern software development. In general, I was quite proud of the students' efforts and progress. I hope they were proud of themselves.

One of the meta-lessons students tend to learn in this course is one of the big lessons of any project-centered course:

... making something is a different learning experience from remembering something.

I think that a course like this one also helps most of them learn something else even more personal:

... the discipline in art-making is exercised from within rather than without. You quickly realize that it's your own laziness, ignorance, and sloppiness, not somebody else's bad advice, that are getting in your way. No one can write your [program] for you. You have to figure out a way to write it yourself. You have to make a something where there was a nothing.

"Laziness", "ignorance", and "sloppiness" seem like harsh words, but really they aren't. They are simply labels for weaknesses that almost all of us face when we first learn to create things on our own. Anyone who has written a big program has probably encountered them in some form.

I learned these lessons as a senior, too, in my university's two-term project course. It's never fun to come up short of our hopes or expectations. But most of us do it occasionally, and never more reliably than when we are first learning how to make something significant. It is good for us to realize early on our own responsibility for how we work and what we make. It empowers us to take charge of our behavior.

Black Mountain College's Lake Eden campus

The quoted passages are, with the exception of the word "program", taken from Learn by Painting, a New Yorker article about "Leap Before You Look: Black Mountain College, 1933-1957", an exhibit at the Institute of Contemporary Art in Boston. Black Mountain was a liberal arts college with a curriculum built on top of an unusual foundation: making art. Though the college lasted less than a quarter century, its effects were felt across most art disciplines in the twentieth century. But its mission was bigger: to educate citizens, not artists, through the making of art. Making something is a different learning experience from remembering something, and BMC wanted all of its graduates to have this experience.

The article was a good read throughout. It closes with a comment on Black Mountain's vision that touches on computer science and reflects my own thinking about programming. This final paragraph begins with a slight indignity to us in CS but turns quickly into an admiration:

People who teach in the traditional liberal-arts fields today are sometimes aghast at the avidity with which undergraduates flock to courses in tech fields, like computer science. Maybe those students see dollar signs in coding. Why shouldn't they? Right now, tech is where value is being created, as they say. But maybe students are also excited to take courses in which knowing and making are part of the same learning process. Those tech courses are hands-on, collaborative, materials-based (well, virtual materials), and experimental -- a digital Black Mountain curriculum.

When I meet with prospective students and their parents, I stress that, while computer science is technical, it is not vocational. It's more. Many high school students sense this already. What attracts them to the major is a desire to make things: games and apps and websites and .... Earning potential appeals to some of them, of course, but students and parents alike seem more interested in something else that CS offers them: the ability to make things that matter in the modern world. They want to create.

The good news suggested in "Learn by Painting", drawing on the Black Mountain College experiment, is that learning by making things is more than just that. It is a different and, in most ways, more meaningful way to learn about the world. It also teaches you a lot about yourself.

I hope that at least a few of my students got that out of their project course with me, in addition to whatever they learned about compilers.


IMAGE. The main building of the former Black Mountain College, on the grounds of Camp Rockmont, a summer camp for boys. Courtesy of Wikipedia. Public domain.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 21, 2016 2:31 PM

Retaining a Sense of Wonder

A friend of mine recently shared a link to Radio Garden on a mailing list (remember those?), and in the ensuing conversation, another friend wrote:

I remember when I was a kid playing with my Dad's shortwave radio and just being flabbergasted when late one night I tuned in a station from Peru. Today you can get on your computer and communicate instantly with any spot on the globe, and that engenders no sense of wonder at all.

Such is the nature of advancing technology. Everyone becomes acclimated to amazing new things, and pretty soon they aren't even things any more.

Teachers face a particularly troublesome version of this phenomenon. Teach a subject for a few years, and pretty soon it loses its magic for you. It's all new to your students, though, and if you can let them help you see it through their eyes, you can stay fresh. The danger, though, is that it starts to look pretty ordinary to you, even boring, and you have a hard time helping them feel the magic.

If you read this blog much, you know that I'm pretty easy to amuse and pretty easy to make happy. Even so, I have to guard against taking life and computer science for granted.

Earlier this week, I was reading one of the newer tutorials in Matthew Butterick's Beautiful Racket, Imagine a language: wires. In it, he builds a DSL to solve one of the problems in the 2015 edition of Advent of Code, Some Assembly Required. The problem is fun, specifying a circuit in terms of a small set of operations for wires and gates. Butterick's approach to solving it is fun, too: creating a DSL that treats the specification of a circuit as a program to interpret.

This is no big deal to a jaded old computer scientist, but remember -- or imagine -- what this solution must seem like to a non-computer scientist or to a CS student encountering the study of programming languages for the first time. With a suitable interpreter, every dataset is a program. If that isn't amazing enough, some wires datasets introduce sequencing problems, because the inputs to a gate are defined in the program after the gate. Butterick uses a simple little trick: define wires and gates as functions, not data. This simple little trick is really a big idea in disguise: Functions defer computation. Now circuit programs can be written in any order and executed on demand.
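To make "functions defer computation" concrete, here is a minimal sketch in Python rather than Racket. It is not Butterick's code: the `circuit` dictionary and the three wire names are made up for illustration, and the circuit is written out of order on purpose.

```python
# A hypothetical three-wire circuit, defined out of order on purpose:
# wire "d" is written before the wires it reads from.
circuit = {}

# Each wire is a zero-argument function, so nothing is computed at definition time.
circuit["d"] = lambda: circuit["x"]() & circuit["y"]()   # AND gate over two wires
circuit["x"] = lambda: 123                               # constant input signal
circuit["y"] = lambda: 456                               # constant input signal

# Calling a wire runs it on demand; by now every wire it needs exists.
result = circuit["d"]()
print(result)  # 72
```

If the wires were stored as plain values instead, the first assignment would fail with a KeyError, because `x` and `y` would not exist yet.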

Even after all these years, computing's most mundane ideas can still astonish me sometimes. I am trying to keep my sense of wonder high and to renew it whenever it starts to flag. This is good for me, and good for my students.


P.S. As always, I recommend Beautiful Racket, and Matthew Butterick's work more generally, quite highly. He has a nice way of teaching useful ideas in a way that appreciates their beauty.

P.P.S. The working title of this entry was "Paging Louis C.K., Paging Louis C.K." That reference may be a bit dated by now, but still it made me smile.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 16, 2016 2:14 PM

Language and Thinking

Earlier this week, Rands tweeted:

Tinkering is a deceptively high value activity.

... to which I followed up:

Which is why a language that enables tinkering is a deceptively high value tool.

I thought about these ideas a couple of days later when I read The Running Conversation in Your Head and came across this paragraph:

The idea is not that you need language for thinking but that when language comes along, it sure is useful. It changes the way you think, it allows you to operate in different ways because you can use the words as tools.

This is how I think about programming in general and about new, and better, programming languages in particular. A programmer can think quite well in just about any language. Many of us cut our teeth in BASIC, and simply learning how to think computationally allowed us to think differently than we did before. But then we learn a radically different or more powerful language, and suddenly we are able to think new thoughts, thoughts we didn't even conceive of in quite the same way before.

It's not that we need the new language in order to think, but when it comes along, it allows us to operate in different ways. New concepts become new tools.

I am looking forward to introducing Racket and functional programming to a new group of students this spring semester. First-class functions and higher-order functions can change how students think about the most basic computations such as loops and about higher-level techniques such as OOP. I hope to do a better job this time around helping them see the ways in which it really is different.
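As a small illustration of that shift, here is the same computation written twice. The sketch is in Python rather than Racket, and the `prices` data is invented for the example: first as a loop that mixes selecting, transforming, and accumulating, then as a pipeline of higher-order functions that names each step.

```python
from functools import reduce

prices = [3, 8, 12, 5]  # made-up data for the example

# Loop version: selection, transformation, and accumulation are interleaved.
total_loop = 0
for p in prices:
    if p > 4:
        total_loop += p * 2

# Higher-order version: each step is a function passed to filter, map, or reduce.
selected = filter(lambda p: p > 4, prices)
doubled = map(lambda p: p * 2, selected)
total_hof = reduce(lambda acc, p: acc + p, doubled, 0)

print(total_loop, total_hof)  # 50 50
```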

To echo the Running Conversation article again, when we learn a new programming style or language, "Something really special is created. And the thing that is created might well be unique in the universe."

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

December 12, 2016 3:15 PM

Computer Science Is Not That Special

I'm reminded of a student I met with once who told me that he planned to go to law school, and then a few minutes later, when going over a draft of a lab report, said "Yeah... Grammar isn't really my thing." Explaining why I busted up laughing took a while.

When I ask prospective students why they decided not to pursue a CS degree, they often say things to the effect of "Computer science seemed cool, but I heard getting a degree in CS was a lot of work." or "A buddy of mine told me that programming is tedious." Sometimes, I meet these students as they return to the university to get a second degree -- in computer science. Their reasons for returning vary from the economic (a desire for better career opportunities) to the personal (a desire to do something that they have always wanted to do, or to pursue a newfound creative interest).

After you've been in the working world a while, a little hard work and some occasional tedium don't seem like deal breakers any more.

Such conversations were on my mind as I read physicist Chad Orzel's recent Science Is Not THAT Special. In this article, Orzel responds to the conventional wisdom that becoming a scientist and doing science involve a lot of hard work that is unlike the exciting stuff that draws kids to science in the first place. Then, when kids encounter the drudgery and hard work, they turn away from science as a potential career.

Orzel's takedown of this idea is spot on. (The quoted passage above is one of the article's lighter moments in confronting the stereotype.) Sure, doing science involves a lot of tedium, but this problem is not unique to science. Getting good at anything requires a lot of hard work and tedious attention to detail. Every job, every area of expertise, has its moments of drudgery. Even the rare few who become professional athletes and artists, with careers generally thought of as dreams that enable people to earn a living doing the thing they love, spend endless hours engaged in the drudgery of practicing technique and automatizing physical actions that become their professional vocabulary.

Why do we act as if science is any different, or should be?

Computer science gets this rap, too. What could be worse than fighting with a compiler to accept a program while you are learning to code? Or plowing through reams of poorly documented API descriptions to plug your code into someone's e-commerce system?

Personally, I can think of lots of things that are worse. I am under no illusion, however, that other professionals are somehow shielded from such negative experiences. I just prefer my pains to theirs.

Maybe some people don't like certain kinds of drudgery. That's fair. Sometimes we gravitate toward the things whose drudgery we don't mind, and sometimes we come to accept the drudgery of the things we love to do. I'm not sure which explains my fascination with programming. I certainly enjoy the drudgery of computer science more than that of most other activities -- or at least I suffer it more gladly.

I'm with Orzel. Let's be honest with ourselves and our students that getting good at anything takes a lot of hard work and, once you master something, you'll occasionally face some tedium in the trenches. Science, and computer science in particular, are not that much different from anything else.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

December 09, 2016 1:54 PM

Two Quick Hits with a Mathematical Flavor

I've been wanting to write a blog entry or two lately about my compiler course and about papers I've read recently, but I've not managed to free up much time as the semester winds down. That's one of the problems with having Big Ideas to write about: they seem daunting and, at the very least, take time to work through.

So instead here are two brief notes about articles that crossed my newsfeed recently and planted themselves in my mind. Perhaps you will enjoy them even without much commentary from me.

A Student's Unusual Proof Might Be A Better Proof

I asked a student to show that between any two rationals is a rational.

She did the following: if x < y are rational then take δ << y-x and rational and use x+δ.

I love the student's two proofs in the article! Student programmers are similarly creative. Their unusual solutions often expose biases in my thinking and give me a new way to think about a problem. If nothing else, they help me to understand better how students think about ideas that I take for granted.
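The student's construction is easy to try out with exact rational arithmetic. The Python sketch below assumes one particular choice of δ, half the gap between x and y; the quoted proof only requires some small rational δ, so any rational strictly between 0 and y - x would do. The function name `rational_between` is made up for the example.

```python
from fractions import Fraction

def rational_between(x, y):
    """Return x + delta for a rational delta with 0 < delta < y - x."""
    if not x < y:
        raise ValueError("need x < y")
    delta = (y - x) / 2   # an assumed choice; any rational in (0, y - x) works
    return x + delta

x, y = Fraction(1, 3), Fraction(1, 2)
z = rational_between(x, y)
print(z)  # 5/12
```

Because `Fraction` arithmetic is closed under subtraction, division by 2, and addition, the result is guaranteed to be rational, which is the heart of the argument.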

Numberless Word Problems

Some girls entered a school art competition. Fewer boys than girls entered the competition.

She projected her screen and asked, "What math do you see in this problem?"

Pregnant pause.

"There isn't any math. There aren't any numbers."

I am fascinated by the possibility of adapting this idea to teaching students to think like a programmer. In an intro course, for example, students struggle with computational ideas such as loops and functions even though they have a lot of experience with these ideas embodied in their daily lives. Perhaps the language we use gets in the way of them developing their own algorithmic skills. Maybe I could use computationless word problems to get them started?

I'm giving serious thought to ways I might use this approach to help students learn functional programming in my Programming Languages course this spring. The author describes how to write numberless word problems, and I'm wondering how I might bring the philosophy to computer science. If you have any ideas, please share!

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

November 03, 2016 3:53 PM

TIL Today: The Power of Limiting Random Choices

This morning I read this blog post by Dan Luu on the use of randomized algorithms in cache eviction. He mentions a paper by Mitzenmacher, Richa, and Sitaraman called The Power of Two Random Choices that explains a cool effect we see when we restrict a choice among many options to a small random sample of them. Luu summarizes the result:

The mathematical intuition is that if we (randomly) throw n balls into n bins, the maximum number of balls in any bin is O(log n / log log n) with high probability, which is pretty much just O(log n). But if (instead of choosing randomly) we choose the least loaded of k random bins, the maximum is O(log log n / log k) with high probability, i.e., even with two random choices, it's basically O(log log n) and each additional choice only reduces the load by a constant factor.

Luu used this result to create a cache eviction policy that outperforms random eviction across the board and competes closely with the traditional LRU policy. It chooses two pages at random and evicts the least-recently used of the two. This so-called 2-random algorithm slightly outperforms LRU in larger caches and slightly underperforms LRU in smaller caches. This trade-off may be worth making because, unlike LRU, random eviction policies degrade gracefully as loops get large.
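The policy is short enough to sketch in a few lines of Python. This is an illustration under stated assumptions, not Luu's implementation: the class name `TwoRandomCache`, the logical clock used to track recency, and the requirement that capacity be at least 2 are all choices made for this example.

```python
import random

class TwoRandomCache:
    """Sketch of 2-random eviction: pick two cached keys at random, evict the older."""

    def __init__(self, capacity, rng=None):
        if capacity < 2:
            raise ValueError("this sketch needs room for at least two entries")
        self.capacity = capacity
        self.rng = rng if rng is not None else random.Random()
        self.last_used = {}   # key -> logical time of most recent access
        self.clock = 0

    def access(self, key):
        """Touch key; return True on a hit, False on a miss."""
        self.clock += 1
        hit = key in self.last_used
        if not hit and len(self.last_used) >= self.capacity:
            # Choose two distinct candidates and evict the least recently used one.
            a, b = self.rng.sample(list(self.last_used), 2)
            victim = a if self.last_used[a] < self.last_used[b] else b
            del self.last_used[victim]
        self.last_used[key] = self.clock
        return hit

cache = TwoRandomCache(2, random.Random(0))
hits = [cache.access(k) for k in ["a", "b", "a", "c"]]
print(hits)                      # [False, False, True, False]
print(sorted(cache.last_used))   # ['a', 'c']
```

With a capacity of 2, the two random candidates are always the whole cache, so this tiny demo behaves exactly like LRU. The policies only diverge when the cache holds more entries than the sample size.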

The power of two random choices has potential application in any context that fits the balls-and-bins model, including load balancing. Luu mentions less obvious application areas, too, such as circuit routing.
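The balls-and-bins effect is easy to try out in a program before reading the proof. The Python simulation below is a rough sketch: the function name `max_load`, the seed, and the problem size are arbitrary choices for this example, and a single run only illustrates the trend; it proves nothing.

```python
import random

def max_load(n, k, rng):
    """Throw n balls into n bins, sending each ball to the least loaded of k random bins."""
    bins = [0] * n
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(k)]
        target = min(candidates, key=lambda i: bins[i])
        bins[target] += 1
    return max(bins)

rng = random.Random(2016)
n = 100_000
one_choice = max_load(n, 1, rng)    # purely random placement
two_choices = max_load(n, 2, rng)   # best of two random bins
print(one_choice, two_choices)
```

With k = 1 the function reduces to purely random placement, so the two printed numbers compare the two regimes from the quoted passage directly.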

Very cool indeed. I'll have to look at Mitzenmacher et al. to see how the math works, but first I may try the idea out in some programs. For me, the programming is even more fun...

Posted by Eugene Wallingford | Permalink | Categories: Computing

November 01, 2016 4:04 PM

An Adventure with C++ Compilers

I am a regular reader of John Regehr's blog, which provides a steady diet of cool compiler conversation. One of Regehr's frequent topics is undefined behavior in programming languages, and what that means for implementing and testing compilers. A lot of those blog entries involve C and C++, which I don't use all that often any more, so reading them is more spectator sport than contact sport.

This week, I got to see how capricious C++ compilers can feel up close.

My students are implementing a compiler for a simple subset of a Pascal-like language. We call the simplest program in this language print-one:

    $ cat print-one.flr
    program main();
        return 1

One of the teams is writing their compiler in C++. The team completed its most recent version, a parser that validates its input or reports an error that renders its input invalid. They were excited that it finally worked:

    $ g++ -std=c++14 compiler.cpp -o compiler
    $ compiler print-one.flr 
    Valid flair program

They had tested their compiler on two platforms:

  • a laptop running OS X v10.11.5 and gcc v4.9.3
  • a desktop running Ubuntu v14.04 and gcc v4.8.4

I sat down at my desktop computer to exercise their compiler.

    $ g++ compiler.cpp -o compiler
    In file included from compiler.cpp:7:
    In file included from ./parser.cpp:3:
    In file included from ./ast-utilities.cpp:4:
    ./ast-utilities.hpp:7:22: warning: in-class initialization of non-static data
          member is a C++11 extension [-Wc++11-extensions]
        std::string name = "Node";
    24 warnings generated.

Oops, I forgot the -std=c++14 flag. Still, it compiled, and all of the warnings come from a part of the code that has no effect on program validation. So I tried the executable:

    $ compiler print-one.flr 
    ERROR at line #3 -- unexpected <invalid>  1
    Invalid flair program

Hmm. The warnings are unrelated to the part of the executable that I am testing, but maybe they are creating a problem. So I recompile with the flag:

    $ g++ -std=c++14 compiler.cpp -o compiler
    error: invalid value 'c++14' in '-std=c++14'

What? I check my OS and compiler specs:

    $ sw_vers -productVersion
    $ g++ --version
    Configured with: --prefix=/Applications/ --with-gxx-include-dir=/usr/include/c++/4.2.1
    Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)

Oh, right, Apple doesn't ship gcc any more; it ships clang and links gcc to the clang exe. I know my OS is a bit old, but it still seems odd that the -std=c++14 flag isn't supported. I google for an answer (thanks, StackOverflow!) and find that I need to use -std=c++1y. Okay:

    $ g++ -std=c++1y compiler.cpp -o compiler
    $ compiler print-one.flr 
    ERROR at line #3 -- unexpected <invalid>  1
    Invalid flair program

Now the student compiler compiles but gives incorrect, or at least unintended, behavior. I'm surprised that both my clang and the students' gcc compile their compiler yet produce executables that give different answers. I know that gcc and clang aren't 100% compatible, but my students are using a relatively small part of C++. How can this be?

Maybe it has something to do with how clang processes the c++1y standard flag. So I backed up to the previous standard:

    $ g++ -std=c++0x compiler.cpp -o compiler
    $ compiler print-one.flr 
    ERROR at line #3 -- unexpected <invalid>  1
    Invalid flair program

Yes, that's c++0x, not c++0y. The student compiler still compiles and still gives incorrect or unintended behavior. Maybe it is a clang problem? I upload their code to our student server, which runs Linux and gcc:

    $ cat /etc/debian_version 
    $ g++ --version
    gcc version 4.7.2 (Debian 4.7.2-5)

This version of gcc doesn't support either c++14 or c++1?, so I fell back to c++0x:

    $ g++ -std=c++0x compiler.cpp -o compiler
    $ compiler print-one.flr 
    Valid flair program

Hurray! I can test their code.

I'm curious. I have a Macbook Pro running a newer version of OS X. Maybe...

    $ sw_vers -productVersion
    ProductName:Mac OS X
    $ g++ --version
    Configured with: --prefix=/Applications/ --with-gxx-include-dir=/Applications/
    Apple LLVM version 7.0.2 (clang-700.1.81)

    $ g++ -std=c++14 compiler.cpp -o compiler
    $ compiler print-one.flr 
    Valid flair program

Now, the c++14 flag works, and it produces a compiler that produces the correct behavior -- or at least the intended behavior.

I am curious about this anomaly, but not curious enough to research the differences between clang and gcc, the differences between the different versions of clang, or what Apple or Debian are doing. I'm also not curious enough to figure out which nook of C++ my students have stumbled into that could expose a rift in the behavior of these various C++ compilers, all of which are standard tools and pretty good.

At least now I remember what it's like to program in a language with undefined behavior and can read Regehr's blog posts with a knowing nod of the head.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 22, 2016 2:00 PM

Competence and Creating Conditions that Minimize Mistakes

I enjoyed this interview with Atul Gawande by Ezra Klein. When talking about making mistakes, Gawande notes that humans have enough knowledge to cut way down on errors in many disciplines, but we do not always use that knowledge effectively. Mistakes come naturally from the environments in which we work:

We're all set up for failure under the conditions of complexity.

Mistakes are often more a matter of discipline and attention to detail than a matter of knowledge or understanding. Klein captures the essence of Gawande's lesson in one of his questions:

We have this idea that competence is not making mistakes and getting everything right. [But really...] Competence is knowing you will make mistakes and setting up a context that will help reduce the possibility of error but also help deal with the aftermath of error.

In my experience, this is a hard lesson for computer science students to grok. It's okay to make mistakes, but create conditions where you make as few as possible and in which you can recognize and deal with the mistakes as quickly as possible. High-discipline practices such as test-first and pair programming, version control, and automated builds make a lot more sense when you see them from this perspective.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development, Teaching and Learning

October 17, 2016 4:10 PM

Using Programming to Learn Math, Even at the University

There is an open thread on the SIGCSE mailing list called "Forget the language wars, try the math wars". Faculty are discussing how to justify math requirements on a CS major, especially for students who "just want to be programmers". Some people argue that math requirements, calculus in particular, are a barrier to recruiting students who can succeed in computer science.

Somewhere along the line, Cay Horstmann wrote a couple of things I agree with. First, he said that he didn't want to defend the calculus requirement because most calculus courses do not teach students how to think "but how to follow recipes". I have long had this complaint about calculus, especially as it's taught in most US high schools and universities. Then he wrote something more positive:

What I would like to see is teaching math alongside with programming. None of my students were able to tell me what sine and cosine were good for, but when they had to program a dial [in a user interface], they said "oh".

Couldn't that "oh" have come earlier in their lives? Why don't students do programming in middle school math? I am not talking large programs--just a few lines, so that they can build models and intuition.

I agree wholeheartedly. And even if students do not have such experiences in their K-12 math classes, the least we could do is help them have that "oh" experience earlier in their university studies.
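The dial example fits in a handful of lines. The Python sketch below is a guess at the kind of program Horstmann has in mind, not his assignment: the function name `needle_tip` and the convention that the needle sweeps clockwise from 12 o'clock, with y pointing up, are assumptions made for illustration.

```python
import math

def needle_tip(value, max_value, radius):
    """Return (x, y) for the tip of a dial needle sweeping clockwise from 12 o'clock."""
    angle = 2 * math.pi * value / max_value   # fraction of a full turn, in radians
    # Sine gives the horizontal offset and cosine the vertical one,
    # because the angle is measured from the vertical axis.
    return (radius * math.sin(angle), radius * math.cos(angle))

for value in (0, 25, 50, 75):
    x, y = needle_tip(value, 100, 10.0)
    print(value, round(x, 3), round(y, 3))
```

A quarter of the way around, the needle points at 3 o'clock: all of the radius has moved from the cosine term to the sine term.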

My colleagues and I have been discussing our Discrete Structures course now for a few weeks, including expected outcomes, its role as a prerequisite to other courses, and how we teach it. I have suggested that one of the best ways to learn discrete math is to connect it with programs. At our university, students have taken at least one semester of programming (currently, in Python) before they take Discrete. We should use that to our advantage!

A program can help make an abstract idea concrete. When learning about set operations, why do only paper-and-pencil exercises when you can use simple Python expressions in the REPL? Yes, adding programming to the mix creates new issues to deal with, but if designed well, such instruction could both improve students' understanding of discrete structures -- as Horstmann says, helping them build models and intuition -- and give students more practice writing simple programs. An ancillary benefit might be to help students see that computer scientists can use computation to help them learn new things, thus preparing for habits that can extend to wider settings.
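For instance, a first set-operations exercise might look like the following Python session. The particular sets are invented for the example; the operators are Python's built-in set operators.

```python
universe = set(range(10))
evens = {n for n in universe if n % 2 == 0}
primes = {2, 3, 5, 7}

print(evens | primes)   # union
print(evens & primes)   # intersection: {2}
print(evens - primes)   # difference
print(evens ^ primes)   # symmetric difference

# Check one of De Morgan's laws on this universe:
# the complement of a union is the intersection of the complements.
lhs = universe - (evens | primes)
rhs = (universe - evens) & (universe - primes)
print(lhs == rhs)       # True
```

Checking a law on one example is not a proof, of course, but a failed check is a counterexample, and that distinction is itself a useful thing for students to bump into.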

Unfortunately, the most popular Discrete Structures textbooks don't help much. They do try to use CS-centric examples, but they don't seem otherwise to use the fact that students are CS majors. I don't really blame them. They are writing for a market in which students study many different languages in CS 1, so they can't (and shouldn't) assume any particular programming language background. Even worse, the Discrete Structures course appears at different places throughout the CS curriculum, which means that textbooks can't assume even any particular non-language CS experience.

Returning to Horstmann's suggestion to augment math instruction with programming in K-12, there is, of course, a strong movement nationally to teach computer science in high school. My state has been disappointingly slow to get on board, but we are finally seeing action. But most of the focus in this nationwide movement is on teaching CS qua CS, with less emphasis on integrating CS into math and other core courses.

For this reason, let us again take a moment to thank the people behind the Bootstrap project for leading the charge in this regard, helping teachers use programming in Racket to teach algebra and other core topics. They are even evaluating the efficacy of the work and showing that the curriculum works. This may not surprise us in CS, but empirical evidence of success is essential if we hope to get teacher prep programs and state boards of education to take the idea seriously.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 06, 2016 2:46 PM

Computers Shouldn't Need a Restart Button (Memories of Minix)

An oldie but goodie from Andrew Tanenbaum:

Actually, MINIX 3 and my research generally is **NOT** about microkernels. It is about building highly reliable, self-healing, operating systems. I will consider the job finished when no manufacturer anywhere makes a PC with a reset button. TVs don't have reset buttons. Stereos don't have reset buttons. Cars don't have reset buttons. They are full of software but don't need them. Computers need reset buttons because their software crashes a lot. I know that computer software is different from car software, but users just want them both to work and don't want lectures why they should expect cars to work and computers not to work. I want to build an operating system whose mean time to failure is much longer than the lifetime of the computer so the average user never experiences a crash.

I remember loving MINIX 1 (it was just called Minix then, of course) when I first learned it in grad school. I did not have any Unix experience coming out of my undergrad and had only begun to feel comfortable with BSD Unix in my first few graduate courses. Then I was assigned to teach the Operating Systems course, working with one of the CS faculty. He taught me a lot, but so did Tanenbaum -- through Minix. That is one of the first times I came to really understand that the systems we use (the OS, the compiler, the DBMS) were just programs that I could tinker with, modify, and even write.

Operating systems is not my area, and I have no expertise for evaluating the whole microkernel versus monolith debate. But I applaud researchers like Tanenbaum who are trying to create general computer systems that don't need to be rebooted. I'm a user, too.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

September 28, 2016 3:16 PM

Language, and What It's Like To Be A Bat

My recent post about the two languages resonated in my mind with an article I finished reading the day I wrote the post: Two Heads, about the philosophers Paul and Pat Churchland. The Churchlands have been on a forty-year quest to change the language we use to describe our minds, from popular terms based in intuition and introspection to terms based in the language of neuroscience. Changing language is hard under any circumstances, and it is made harder when the science they need is still in its infancy. Besides, maybe more traditional philosophers are right and we need our traditional vocabulary to make sense of what it feels like to be human?

The New Yorker article closes with these paragraphs, which sound as if they are part of a proposal for a science fiction novel:

Sometimes Paul likes to imagine a world in which language has disappeared altogether. We know that the two hemispheres of the brain can function separately but communicate silently through the corpus callosum, he reasons. Presumably, it will be possible, someday, for two separate brains to be linked artificially in a similar way and to exchange thoughts infinitely faster and more clearly than they can now through the muddled, custom-clotted, serially processed medium of speech. He already talks about himself and Pat as two hemispheres of the same brain. Who knows, he thinks, maybe in his children's lifetime this sort of talk will not be just a metaphor.

If, someday, two brains could be joined, what would be the result? A two-selved mutant like Joe-Jim, really just a drastic version of Siamese twins, or something subtler, like one brain only more so, the pathways from one set of neurons to another fusing over time into complex and unprecedented arrangements? Would it work only with similar brains, already sympathetic, or, at least, both human? Or might a human someday be joined to an animal, blending together two forms of thinking as well as two heads? If so, a philosopher might after all come to know what it is like to be a bat, although, since bats can't speak, perhaps he would be able only to sense its batness without being able to describe it.

(Joe-Jim is a character from a science fiction novel, Robert Heinlein's Orphans of the Sky.)

What a fascinating bit of speculation! Can anyone really wonder why kids are drawn to science fiction?

Let me add my own speculation to the mix: If we do ever find a way to figure out what it's like to be a bat, people will find a way to describe what it's like to be a bat. They will create the language they need. Making language is what we do.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Patterns

September 06, 2016 2:44 PM

"Inception" and the Simulation Argument

If Carroll's deconstruction of the simulation argument is right, then the more trouble we have explaining consciousness, the more that should push us to believe we're in a ground-level simulation. There's probably a higher-level version of physics in which consciousness makes sense. Our own consciousness is probably being run in a world that operates on that higher-level law. And we're stuck in a low-resolution world whose physics doesn't allow consciousness -- because if we weren't, we'd just keep recursing further until we were.

-- Scott Alexander, The View From Ground Level

two characters from the film 'Inception' walking in a dream world where space folds back on itself

In the latest installment of "You Haven't Seen That Yet?", I watched the film Inception yesterday. There was only one person watching, but still the film gets two hearty thumbs-up. All those Ellen Pages waking up, one after the other...

Over the last few years, I've heard many references to the idea from physics that we are living in a simulation, that our universe is a simulation created by beings in another universe. It seems that some physicists think and talk about this a lot, which seems odd to me. Empiricism can't help us much to unravel the problem; arguments pro and con come down to the sort of logical arguments favored by mathematicians and philosophers, abstracted away from observation of the physical world. It's a fun little puzzle, though. The computer science questions are pretty interesting, too.

Ideas like this are old hat to those of us who read a lot of science fiction growing up, in particular Philip K. Dick. Dick's stories were often predicated on suspending some fundamental attribute of reality, or our perception of it, and seeing what happened to our lives and experiences. Now that I have seen Memento (a long-time favorite of mine) and Inception, I'm pretty happy. What Philip K. Dick was with the written word to kids of my generation, Christopher Nolan is on film to a younger generation. I'm glad I've been able to experience both.


The photo above comes from Matt Goldberg's review of Inception. It shows Arthur, the character played by Joseph Gordon-Levitt, battling with a "projection" in three-dimensional space that folds back on itself. Such folding is possible in dream worlds and is an important element in designing dreams that enable all the cool mind games that are central to the film.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

August 22, 2016 4:18 PM

A New Way to Debug Our Election Systems

In The Security of Our Election Systems, Bruce Schneier says that we no longer have time to sound the alarm about security flaws in our election systems and hope that government and manufacturers will take action. Instead...

We must ignore the machine manufacturers' spurious claims of security, create tiger teams to test the machines' and systems' resistance to attack, drastically increase their cyber-defenses and take them offline if we can't guarantee their security online.

How about this:

The students in my department love to compete in cyberdefense competitions (CDCs), in which they are charged with setting up various systems and then defending them against attack from experts for some period, say, twenty-four hours. Such competitions are growing in popularity across the country.

Maybe we should run a CDC with the tables turned. Manufacturers are required to set up their systems and to run the full set of services they promise when they sell the systems to government agencies. Students across the US would then be given a window of twenty-four hours or more to try to crack the systems, with the manufacturers or even our election agencies trying to keep their systems up and running securely. Any vulnerabilities that the students find would be made public, enabling the manufacturers to fix them and the state agencies to create and set up new controls.

Great idea or crazy fantasy?

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

August 07, 2016 10:36 AM

Some Advice for How To Think, and Some Personal Memories

I've been reading a bunch of the essays on David Chapman's Meaningness website lately, after seeing a link to one on Twitter. (Thanks, @kaledic.) This morning I read How To Think Real Good, about one of Chapman's abandoned projects: a book of advice for how to think and solve problems. He may never write this book as he once imagined it, but I'm glad he wrote this essay about the idea.

First of all, it was a fun read, at least for me. Chapman is a former AI researcher, and some of the stories he tells remind me of things I experienced when I was in AI. We were even in school at about the same time, though in different parts of the country and different kinds of program. His work was much more important than mine, but I think at some fundamental level most people in AI share common dreams and goals. It was fun to listen as Chapman reminisced about knowledge and AI.

He also introduced me to the dandy portmanteau anvilicious. I keep learning new words! There are so many good ones, and people make up the ones that don't exist already.

My enjoyment was heightened by the fact that the essay stimulated the parts of my brain that like to think about thinking. Chapman includes a few of the heuristics that he intended to include in his book, along with anecdotes that illustrate or motivate them. Here are three:

All problem formulations are "false", because they abstract away details of reality.

Solve a simplified version of the problem first. If you can't do even that, you're in trouble.

Probability theory is sometimes an excellent way of dealing with uncertainty, but it's not the only way, and sometimes it's a terrible way.

He elaborates on the last of these, pointing out that probability theory tends to collapse many different kinds of uncertainty into a single value. This does not work all that well in practice, because different kinds of uncertainty often need to be handled in very different ways.

Chapman has a lot to say about probability. This essay was prompted by what he sees as an over-reliance of the rationalist community on a pop version of Bayesianism as its foundation for reasoning. But as an old AI researcher, he knows that an idea can sound good and fail in practice for all sorts of reasons. He has also seen how a computer program can make clear exactly what does and doesn't work.

Artificial intelligence has always played a useful role as a reality check on ideas about mind, knowledge, reasoning, and thought. More generally, anyone who writes computer programs knows this, too. You can make ambiguous claims with English sentences, but to write a program you really have to have a precise idea. When you don't have a precise idea, your program itself is a precise formulation of something. Figuring out what that is can be a way of figuring out what you were really thinking about in the first place.

This is one of the most important lessons college students learn from their intro CS courses. It's an experience that can benefit all students, not just CS majors.

Chapman also includes a few heuristics for approaching the problem of thinking, basically ways to put yourself in a position to become a better thinker. Two of my favorites are:

Try to figure out how people smarter than you think.

Find a teacher who is willing to go meta and explain how a field works, instead of lecturing you on its subject matter.

This really is good advice. Subject matter is much easier to come by than deep understanding of how the discipline works, especially in these days of the web.

The word meta appears frequently throughout this essay. (I love that the essay is posted on the metablog/ portion of his site!) Chapman's project is thinking about thinking, a step up the ladder of abstraction from "simply" thinking. An AI program must reason; an AI researcher must reason about how to reason.

This is the great siren of artificial intelligence, the source of its power and also its weaknesses: Anything you can do, I can do meta.

I think this gets at why I enjoyed this essay so much. AI is ultimately the discipline of applied epistemology, and most of us who are lured into AI's arms share an interest in what it means to speak of knowledge. If we really understand knowledge, then we ought to be able to write a computer program that implements that understanding. And if we do, how can we say that our computer program isn't doing essentially the same thing that makes us humans intelligent?

As much as I love computer science and programming, my favorite course in graduate school was an epistemology course I took with Prof. Rich Hall. It drove straight to the core curiosity that impelled me to study AI in the first place. In the first week of the course, Prof. Hall laid out the notion of justified true belief, and from there I was hooked.

A lot of AI starts with a naive feeling of this sort, whether explicitly stated or not. Doing AI research brings that feeling into contact with reality. Then things get serious. It's all enormously stimulating.

Ultimately Chapman left the field, disillusioned by what he saw as a fundamental limitation that AI's bag of tricks could never resolve. Even so, the questions that led him to AI still motivate him and his current work, which is good for all of us, I think.

This essay brought back a lot of pleasant memories for me. Even though I, too, am no longer in AI, the questions that led me to the field still motivate me and my interests in program design, programming languages, software development, and CS education. It is hard to escape the questions of what it means to think and how we can do it better. These remain central problems of what it means to be human.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

August 03, 2016 1:56 PM

Programming: Don't Knock It Till You Try It

We have a fair number of students on campus outside of CS who want to become web designers, but few of them think they should learn to program. Some give it a try when one of our communications profs tells them how exciting and liberating it can be. In general, though, it's a hard sell. Programming sounds boring to them, full of low-level details better left to techies over in computer science.

This issue pervades the web design community. In The Bomb in the Garden, Matthew Butterick does a great job of explaining why the web as a design medium is worth saving, and pointing to ways in which programming can release the creativity we need to keep it alive.

Which brings me to my next topic--what should designers know about programming?

And I know that some of you will think this is beating a dead horse. But when we talk about restoring creativity to the web, and expanding possibilities, we can't avoid the fact that just like the web is a typographic medium, it's also a programmable medium.

And I'm a designer who actually does a lot of programming in my work. So I read the other 322,000 comments about this on the web. I still think there's a simple and non-dogmatic answer, which is this:

You don't have to learn programming, but don't knock it till you try it.

It's fun for me when one of the web design students majoring in another department takes his or her first programming class and is sparked by the possibilities that writing a program opens up. And we in CS are happy to help them go deeper into the magic.

Butterick speaks truth when he says he's a designer who does a lot of programming in his work. Check out Pollen, the publishing system he created to write web-based books. Pollen's documentation says that it "helps authors make functional and beautiful digital books". That's true. It's a very nice package.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 28, 2016 2:37 PM

Functional Programming, Inlined Code, and a Programming Challenge

an example of the cover art for the Commander Keen series of games

I recently came across an old entry on Jonathan Blow's blog called John Carmack on Inlined Code. The bulk of the entry consists of an even older email message that Carmack, lead programmer on video games such as Doom and Quake, sent to a mailing list, encouraging developers to consider inlining function calls as a matter of style. This email message is the earliest explanation I've seen of Carmack's drift toward functional programming, seeking to gain as many of its benefits as possible even in the harshly real-time environment of game programming.

The article is a great read, with much advice born in the trenches of writing and testing large programs whose run-time performance is key to their success. Some of the ideas involve programming languages:

It would be kind of nice if C had a "functional" keyword to enforce no global references.

... while others are more about design style:

The function that is least likely to cause a problem is one that doesn't exist, which is the benefit of inlining it.

... and still others remind us to rely on good tools to help avoid inevitable human error:

I now strongly encourage explicit loops for everything, and hope the compiler unrolls it properly.

(This one may come in handy as I prepare to teach my compiler course again this fall.)

This message-within-a-blog-entry itself quotes another email message, by Henry Spencer, which contains the seeds of a programming challenge. Spencer described a piece of flight software written in a particularly limiting style:

It disallowed both subroutine calls and backward branches, except for the one at the bottom of the main loop. Control flow went forward only. Sometimes one piece of code had to leave a note for a later piece telling it what to do, but this worked out well for testing: all data was allocated statically, and monitoring those variables gave a clear picture of most everything the software was doing.

Wow: one big loop, within which all control flows forward. To me, this sounds like a devilish challenge to take on when writing even a moderately complex program like a scanner or parser, which generally contain many loops within loops. In this regard, it reminds me of the Polymorphism Challenge's prohibition of if-statements and other forms of selection in code. The goal of that challenge was to help programmers really grok how the use of substitutable objects can lead to an entirely different style of program than we tend to create with traditional procedural programming.
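To get a feel for the style Spencer describes, here is a small Python sketch of a scanner written as one loop in which control only flows forward. Everything in it is an assumption made for illustration: the name scan, the two token kinds, and the use of Python's built-in string and list methods as stand-ins for inlined code. It is not taken from Carmack's or Spencer's messages.

```python
def scan(text):
    """Split text into runs of digits and runs of letters using a single loop."""
    tokens = []
    kind = None   # the "note" one pass leaves for later passes: token in progress
    start = 0     # index where the token in progress began
    i = 0
    n = len(text)
    while i <= n:                           # the only backward branch in the program
        ch = text[i] if i < n else " "      # a trailing blank flushes the last token
        if ch.isdigit():
            here = "number"
        elif ch.isalpha():
            here = "word"
        else:
            here = None
        if here != kind:                    # a token boundary: emit, then start anew
            if kind is not None:
                tokens.append((kind, text[start:i]))
            kind = here
            start = i
        i += 1
    return tokens


print(scan("abc 123 x9"))
# [('word', 'abc'), ('number', '123'), ('word', 'x'), ('number', '9')]
```

The inner loops a scanner would normally use to consume a run of characters are replaced by the variables kind and start, which carry state from one trip through the main loop to the next. Watching those two variables gives a fairly complete picture of what the scanner is doing, which is the testing benefit Spencer points to.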

Even though Carmack knew that "a great deal of stuff that goes on in the aerospace industry should not be emulated by anyone, and is often self destructive", he thought that this idea might have practical value, so he tried it out. The experience helped him evolve his programming style in a promising direction. This is a great example of the power of the pedagogical pattern known as Three Bears: take an idea to its extreme in order to learn the boundaries of its utility. Sometimes, you will find that those boundaries lie beyond what you originally thought.

Carmack's whole article is worth a read. Thanks to Jonathan Blow for preserving it for us.


The image above is an example of the cover art for the "Commander Keen" series of video games, courtesy of Wikipedia. John Carmack was also the lead programmer for this series. What a remarkable oeuvre he has produced.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

July 20, 2016 9:16 AM

A Few Quick Lessons from Five Small Joy Programs

I recently wrote about some experiences programming in Joy, in which I use Joy to solve five problems that make up a typical homework assignment early in my Programming Languages course. These problems introduce my students to writing simple functions in a functional style, using Racket. Here is my code, if you care to check it out. I'm just getting back to stack programming, so this code can surely be improved. Feel free to email me suggestions or tweet me at @wallingf!

What did these problems teach me about Joy?

  • The language's documentation is sparse. Like my students, I had to find out which Joy primitives were available to me. It has a lot of the basic arithmetic operators you'd expect, but finding them meant searching through a plain-text file. I should write Racket-caliber documentation for the language to support my own work.

  • The number and order of the arguments to a function matter a lot. A function that takes several arguments can become more complicated than the corresponding Racket function, especially if you need to use them multiple times. I encountered this on my first day back to the language. In Racket, this problem requires a compound expression, but it is relatively straightforward, because arguments have names. With all its arguments on the stack, a Joy function has to do more work simply to access values, let alone replicate them for multiple uses.

  • A slight difference in task can lead to a large change in the code. For Problem 4, I implemented operators for modular addition, subtraction, and multiplication. +mod and *mod were elegant and straightforward. -mod was a different story. Joy has a rem operator that operates like Racket's remainder, but it has no equivalent to modulo. The fact that rem returns negative values means that I need a boolean expression and quoted programs and a new kind of thinking. This militates for a bigger shift in programming style right away.

  • I miss the freedom of Racket naming style. This isn't a knock on Joy, because most every programming language restricts severely the set of characters you can use in identifiers. But after being able to name functions +mod, in-range?, and int->char in Racket, the restrictions feel even more onerous.

  • As in most programming styles, the right answer in Joy is often to write the correct helpers. The difference in level of detail between +mod and *mod on the one hand and -mod on the other indicates that I am missing a solution. A better approach is to implement a modulo operator and use it to write all three mod operators. This will hide lower-level details in a general-purpose operator. modulo would make a nice addition to a library of math operators.

  • Even simple problems can excite me about the next step. Several of these solutions, especially the mod operators, cry out for higher-order operators. In Racket, we can factor out the duplication in these operators and create a function that generates these functions for us. In Joy, we can do it, too, using quoted programs of the sort you see in the code for -mod. I'll be moving on to quoted programs in more detail soon, and I can't wait... I know that they will push me farther along the transition to the unique style of stack programming.
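The helper-first idea in the last two bullets can be sketched in Python rather than in Joy or Racket. The names rem, modulo, and make_mod_op are hypothetical. rem imitates a truncating remainder of the kind described above, whose result takes the sign of the dividend, and modulo is built on top of it. This is one possible factoring, not the code from my solutions.

```python
import operator


def rem(a, n):
    """Truncating remainder: the result has the sign of the dividend a."""
    r = abs(a) % abs(n)
    return r if a >= 0 else -r


def modulo(a, n):
    """Remainder whose sign follows the divisor n, built from rem."""
    r = rem(a, n)
    if r != 0 and (r < 0) != (n < 0):
        r += n
    return r


def make_mod_op(op):
    """Generate a modular operator from any two-argument arithmetic function."""
    return lambda a, b, n: modulo(op(a, b), n)


add_mod = make_mod_op(operator.add)
sub_mod = make_mod_op(operator.sub)
mul_mod = make_mod_op(operator.mul)

print(rem(3 - 5, 7))      # -2
print(sub_mod(3, 5, 7))   # 5
```

Once modulo exists, the three operators have the same shape, and the subtraction case no longer needs special handling for a negative intermediate result.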

It's neat for me to be reminded that even the simplest little functions raise interesting design questions. In Joy, use of a stack for all data values means that identifying the most natural order for the arguments we make available to an operator can have a big effect on the ability to read and write code. In what order will arguments generally appear "in the wild"?

In the course of experimenting and trying to debug my code (or, even more frustrating, trying to understand why the code I wrote worked), I even wrote my first general utility operator:

    DEFINE clear  == [] unstack.

It clears the stack so that I can see exactly what the code I'm about to run creates and consumes. It's the first entry in my own little user library.

Fun, fun, fun. Have I ever said that I like to write programs?

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

July 19, 2016 10:32 AM

What Is The Function Of School Today?

Painter Adolph Gottlieb was dismissive of art school in the 1950s:

I'd have done what I'm doing now twenty years ago if I hadn't had to go through that crap. What is the function of the art school today? To confuse the student. To make a living for the teacher. The painter can learn from museums -- probably it is the only way he can learn. All artists have to solve their problems in the context of their own civilization, painting what their time permits them to paint, extending the boundaries a little further.

It isn't much of a stretch to apply this to computer programming in today's world. We can learn so much these days from programs freely available on GitHub and elsewhere on the web. A good teacher can help, but in general is there a better way to learn how to make things than to study existing works and to make our own?

Most importantly, today's programmers-to-be have to solve their problems in the context of their own civilization: today's computing, whether that's mobile or web or Minecraft. University courses have a hard time keeping up with constant change in the programming milieu. Instead, they often rely on general principles that apply across most environments but which seem lifeless in their abstraction.

I hope that, as a teacher, I add some value for the living I receive. Students with interests and questions and goals help keep me and my courses alive. At least I can set a lower bound of not confusing my students any more than necessary.


(The passage above is quoted from Conversations with Artists, Selden Rodman's delightful excursion through the art world of the 1950s, chatting with many of the great artists of the era. It's not an academic treatise, but rather more an educated friend chatting with creative friends. I would thank the person who recommended this, but I have forgotten whose tweet or article did.)

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 18, 2016 12:47 PM

Getting a Sense of Joy Style Through the Stack

Last time, I wrote about returning to the programming language Joy, without giving too many details of the language or its style. Today, I'll talk a bit more about what it means to say that Joy is a "stack language" and show how I sometimes think about the stack as I am writing a simple program. The evolution of the stack is helping me to think about how data types work in this language and to get a sense of why languages such as Joy are called concatenative.

Racket and other Lisp-like languages use prefix notation, unlike the more common infix notation we use in C, Java, and most mainstream languages. A third alternative is postfix notation, in which operators follow their operands. For example, this postfix expression is a valid Joy program:

    2 3 +

It computes 2 + 3.

Postfix expressions are programs in a stack-based language. They correspond to a postorder traversal of a program tree. Where is the stack? It is used to interpret the program:

  • 2 is pushed onto the stack.
  • 3 is pushed onto the stack.
  • + is an operator. An operator pops its arguments from the stack, computes a result, and pushes the result on to the stack. + is a two-argument function, so it pops two arguments from the stack, computes their sum, and pushes the 5 back onto the stack.

Longer programs work the same way. Consider 2 3 + 5 *:

  • 2 is pushed onto the stack.
  • 3 is pushed onto the stack.
  • + pops the 2 and the 3, computes a 5, and pushes it on to the stack.
  • 5 is pushed onto the stack.
  • * pops the two 5s, computes a 25, and pushes it on to the stack.

This program is equivalent to 2 + 3 * 5, er, make that (2 + 3) * 5. As long as we know the arity of each procedure, postfix notation requires no rules for the precedence of operators.
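The interpretation steps listed above can be sketched as a few lines of Python. This is a toy evaluator for integer literals and four binary operators, not Joy itself. The name run and the choice of floor division for / are assumptions of the sketch.

```python
import operator

# Each operator pops two arguments and pushes one result.
BINARY = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.floordiv,  # assumption: integer division, enough for this sketch
}


def run(program, trace=False):
    """Evaluate a postfix program given as a string and return the final stack."""
    stack = []
    for word in program.split():
        if word in BINARY:
            right = stack.pop()   # the top of the stack is the second argument
            left = stack.pop()
            stack.append(BINARY[word](left, right))
        else:
            stack.append(int(word))
        if trace:
            print(f"{word:>3}  {stack}")
    return stack


print(run("2 3 + 5 *"))   # [25]
```

Calling run("2 3 + 5 *", trace=True) prints the stack after each word, which mirrors the bulleted walk-through step for step.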

The result of the program can be the entire stack or the top value on the stack. At the end of a program, the stack could finish with more than one value on it:

    2 3 + 5 * 6 3 /

leaves a stack consisting of 25 2.

Adding an operator to the end of that program can return us to a single-value stack:

    2 3 + 5 * 6 3 / -

leaves a stack of one value, 23.

This points out an interesting feature of this style of programming: we can compose programs simply by concatenating their source code. 2 * is a program: "double my argument", where all arguments are read from the stack. If we place it at the end of 2 3 + 5 * 6 3 / -, we get a new program, one that computes the value 46.

This gives us our first hint as to why this style is called concatenative programming.
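One way to see why concatenation works is to treat every program as a function from stack to stack, so that gluing two programs together is just function composition. Here is a Python sketch of that idea. The helper names push, binary, and concat are hypothetical and exist only for this illustration.

```python
from functools import reduce
from operator import add, floordiv, mul, sub


def push(value):
    """A program that pushes one literal onto the stack."""
    return lambda stack: stack + [value]


def binary(fn):
    """A program that replaces the top two values with fn(left, right)."""
    def word(stack):
        *rest, left, right = stack
        return rest + [fn(left, right)]
    return word


def concat(*programs):
    """Concatenation: run the programs left to right on the same stack."""
    return lambda stack: reduce(lambda s, p: p(s), programs, stack)


# 2 3 + 5 * 6 3 / -
first = concat(push(2), push(3), binary(add), push(5), binary(mul),
               push(6), push(3), binary(floordiv), binary(sub))
# 2 *
double = concat(push(2), binary(mul))

print(first([]))                   # [23]
print(concat(first, double)([]))   # [46]
```

Note that concat treats a one-word program and a ten-word program the same way. Any two programs can be joined, as long as the first leaves on the stack what the second expects to find there.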

As I am re-learning Joy's style, I think a lot about the stack, which holds the partial results of the computation I am trying to program. I often find myself simulating the state of the stack to help me keep track of what to do next, whether on paper or in a comment on my code.

As an example, think back to the wind chill problem I discussed last time: given the air temperature T and the wind speed V, compute the wind chill index T'.

    T' = 35.74 + 0.6215·T - 35.75·V^0.16 + 0.4275·T·V^0.16

This program expects to find T and V on top of the stack:

    T V

The program 0.16 pow computes x^0.16 for the 'x' on top of the stack, so it leaves the stack in this state:

    T V'

In Joy documentation, the type of a program is often described with a before/after picture of the stack. The dup2 operator we saw last time is described as:

    dup2 :: a b -> a b a b

because it assumes two arbitrary values on top of the stack and leaves the stack with those values duplicated. In this fashion, we could describe the program 0.16 pow using

    0.16 pow :: N -> N

It expects a number N to be on top of the stack and leaves the stack with another number on top.

When I'm using these expressions to help me follow the state of my program, I sometimes use problem-specific names or simple modifiers to indicate changes in value or type. For the wind-chill program, I thought of the type of 0.16 pow as

    T V -> T V'

because the values on the stack were a temperature and a velocity-based number.

Applied to this stack, dup2 converts the stack as

    T V' -> T V' T V'

because it copies both the temperature and the velocity-based number.

If we concatenate these two programs and evaluate them against the original stack, we get:

    [initial stack]          T V
    0.16 pow              -> T V'
    dup2                  -> T V' T V'

I've been preserving these stack traces in my documentation, for me and any students who might end up reading my code. Here is the definition of wind-chill, taken directly from the source file:

    DEFINE wind-chill == 0.16 pow        (* -> T V'        *)
                         dup2            (* -> T V' T V'   *)
                         * 0.4275 *      (* -> T V' t3     *)
                         rollup          (* -> t3 T V'     *)
                         35.75 *         (* -> t3 T t2     *)
                         swap            (* -> t3 t2 T     *)
                         0.6215 *        (* -> t3 t2 t1    *)
                         35.74           (* -> t3 t2 t1 t0 *)
                         + swap - +.     (* -> T'          *)

After the dup2 step, the code alternates between computing a term of the formula and rearranging the stack for the next computation.

Notice the role concatenation plays. I can solve each substep in almost any order, paste the resulting little programs together, and boom! I have the final solution.
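As a sanity check on the stack comments in that definition, here is a Python sketch that simulates just the handful of operators the program uses and compares the result with the formula computed directly. The simulator is my own stand-in, not Joy. The names run, WIND_CHILL, and wind_chill_formula are hypothetical, and rollup is assumed to turn x y z into z x y, which is what the comments in the definition show.

```python
import math
import operator

BINARY = {"+": operator.add, "-": operator.sub, "*": operator.mul, "pow": math.pow}


def run(program, stack):
    """Run a space-separated stack program against an initial stack."""
    stack = list(stack)
    for word in program.split():
        if word in BINARY:
            right = stack.pop()
            left = stack.pop()
            stack.append(BINARY[word](left, right))
        elif word == "dup2":          # a b -> a b a b
            stack.extend(stack[-2:])
        elif word == "swap":          # a b -> b a
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif word == "rollup":        # x y z -> z x y (assumed, see above)
            top = stack.pop()
            stack.insert(-2, top)
        else:
            stack.append(float(word))
    return stack


WIND_CHILL = "0.16 pow dup2 * 0.4275 * rollup 35.75 * swap 0.6215 * 35.74 + swap - +"


def wind_chill_formula(t, v):
    """The formula written as an ordinary compound expression."""
    v16 = v ** 0.16
    return 35.74 + 0.6215 * t - 35.75 * v16 + 0.4275 * t * v16


result = run(WIND_CHILL, [30.0, 10.0])
print(result)
print(math.isclose(result[0], wind_chill_formula(30.0, 10.0)))   # True
```

The two computations add the terms in a different order, so the comparison uses math.isclose rather than exact equality.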

I don't know if this kind of documentation helps anyone but me, or if I will think it is as useful after I feel more comfortable in the style. But for now, I find it quite helpful. I do wonder whether thinking in terms of stack transformation may be helpful as I prepare to look into what type theory means for a language like Joy. We'll see.

I am under no pretense that this is a great Joy program, even in the context of a relative novice figuring things out. I'm simply learning, and having fun doing it.

Posted by Eugene Wallingford | Permalink | Categories: Computing

July 16, 2016 2:52 PM

A First Day Back Programming in Joy

I learned about the programming language Joy in the early 2000s, when I was on sabbatical familiarizing myself with functional programming. It drew me quickly into its web. Joy is a different sort of functional language, one in which function composition replaces function application as the primary focus. At that time, I wrote a bunch of small Joy programs and implemented a simple interpreter in PLT Scheme. After my sabbatical, though, I got pulled in a lot of different directions and lost touch with Joy. I saw it only every two or three semesters when I included it in my Programming Languages course. (The future of programming might not look like you expect it to....)

This spring, I felt Joy's call again and decided to make time to dive back into the language. Looking back over my notes from fifteen years ago, I'm surprised at some of the neat thoughts I had back then and at some of the code I wrote. Unfortunately, I have forgotten most of what I learned, especially about higher-order programming in Joy. I never reached the level of a Joy master anyway, but I feel like I'm starting from scratch. That's okay.

On Thursday, I sat down to write solutions in Joy for a few early homework problems from my Programming Languages course. These problems are intended to help my students learn the basics of functional programming and Racket. I figured they could help me do the same in Joy before I dove deeper, while also reminding me of the ways that programming in Joy diverges stylistically from more traditional functional style. As a bonus, I'd have a few more examples to show my students in class next time around.

It didn't take me long to start having fun. I'll talk more in upcoming posts about Joy, this style of programming, and -- I hope -- some of my research. For now, though, I'd like to tell you about one experience I had on my first day without getting into too many details.

In one homework problem, we approximate the wind chill index using this formula:

    T' = 35.74 + 0.6215·T - 35.75·V^0.16 + 0.4275·T·V^0.16

where T' is the wind chill index in degrees Fahrenheit, T is the air temperature in degrees Fahrenheit, and V is the wind speed in miles/hour. In Racket, this computation gives students a chance to write a compound expression and, if adventurous, to create a local variable to hold V^0.16.

In Joy, we don't pass arguments to functions as in most other languages. Its operators pop their arguments from a common data stack and push their results back on to the stack. Many of Joy's operators manipulate the data stack: creating, rearranging, and deleting various items. For example, the dup operator makes a copy of the item on top of the stack, the swap operator swaps the top two items on the stack, and the rolldown operator moves the top two items on the stack below the third.

A solution to the wind-chill problem will expect to find T and V on top of the stack:

    T V
After computing V' = V^0.16, the stack looks like this:
    T V'

The formula uses these values twice, so I really need two copies of each:

    T V' T V'

With a little work, I found that this sequence of operations does the trick:

    swap dup rolldown dup rolldown swap

From there, it didn't take long to find a sequence of operators that consumed these four values and left T' on the stack.

As I looked back over my solution, I noticed the duplication of dup rolldown in the longer expression shown above and thought about factoring it out. Giving that sub-phrase a name is hard, though, because it isn't all that meaningful on its own. However, the whole phrase is meaningful, and probably useful in a lot of other contexts: it duplicates the top two items on the stack. So I factored the whole phrase out and named it dup2:

    DEFINE dup2 == swap dup rolldown dup rolldown swap.
My first refactoring in Joy!

As soon as my fingers typed "dup2", though, my mind recognized it. Surely I had seen it before... So I went looking for "dup2" in Joy's documentation. It is not a primitive operator, but it is defined in the language's initial library, a prelude that defines syntactic sugar in Joy itself:

    dup2 ==  dupd dup swapd

This definition uses two other stack operators, dupd and swapd, which are themselves defined using the higher-order operator dip. I'll be getting to higher-order operators soon enough, I thought; for now I was happy with my longer but conceptually simpler solution.
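The two definitions of dup2 can be checked against each other with a small Python model of the stack. In this sketch every operator is a function from list to list. The stack effects for rolldown, dupd, and swapd are written out directly as list operations, which is an assumption of the sketch and not how Joy's library defines them.

```python
def dup(s):        # a -> a a
    return s + [s[-1]]


def swap(s):       # a b -> b a
    return s[:-2] + [s[-1], s[-2]]


def rolldown(s):   # a b c -> b c a
    return s[:-3] + [s[-2], s[-1], s[-3]]


def dupd(s):       # a b -> a a b
    return s[:-2] + [s[-2], s[-2], s[-1]]


def swapd(s):      # a b c -> b a c
    return s[:-3] + [s[-2], s[-3], s[-1]]


WORDS = {"dup": dup, "swap": swap, "rolldown": rolldown, "dupd": dupd, "swapd": swapd}


def run(program, stack):
    """Apply each word of a space-separated program to the stack in turn."""
    for word in program.split():
        stack = WORDS[word](stack)
    return stack


mine = run("swap dup rolldown dup rolldown swap", ["T", "V'"])
library = run("dupd dup swapd", ["T", "V'"])
print(mine)      # ['T', "V'", 'T', "V'"]
print(library)   # ['T', "V'", 'T', "V'"]
```

Both programs leave the same stack, so the longer phrase and the library's three-word version are interchangeable for a two-item stack like this one.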

I was not surprised to find dup2 already defined in Joy. It defines a lot of cool stack operators, the sorts of operations every programmer needs to build more interesting programs in the language. But I was a little disappointed that my creation wasn't new, in the way that only a beginner can be disappointed when he learns that his discovery isn't new. My disappointment was more than offset by the thought that I had recognized an operator that the language's designer thought would be useful. I was already starting to feel like a Joy programmer again.

It was a fun day, and a much-needed respite from administrative work. I intend for it to be only the first day of many more days programming with Joy.

Posted by Eugene Wallingford | Permalink | Categories: Computing

July 07, 2016 2:01 PM

Oberon: GoogleMaps as Desktop UI

Oberon is a software stack created by Niklaus Wirth and his lab at ETH Zürich. Lukas Mathis describes some of what makes Oberon unusual, including the fact that its desktop is "an infinitely large two-dimensional space on which windows ... can be arranged":

It's incredibly easy to move around on this plane and zoom in and out of it. Instead of stacking windows, hiding them behind each other (which is possible in modern versions of Oberon), you simply arrange them next to each other and zoom out and in again to switch between them. When people held presentations using Oberon, they would arrange all slides next to each other, zoom in on the first one, and then simply slide the view one screen size to the right to go to the next slide.

This sounds like interacting with Google Maps, or any other modern map app. I wonder if anyone else is using this as a model for user interaction on the desktop?

Check out Mathis's article for more. The section "Everything is a Command Line" reminds me of workspaces in Smalltalk. I used to have several workspaces open, with useful snippets of code just waiting for me to do it. Each workspace window was like a custom text-based menu.

I've always liked the idea of Oberon and even considered using the programming language in my compilers course. (I ended up using a variant.) A version of Compiler Construction is now available free online, so everyone can see how Wirth's clear thinking led to a sparse, complete, elegant whole. I may have to build the latest installment of Oberon and see what all they have working these days.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

July 06, 2016 12:43 PM

Computational Research Lets Us Ask New Questions, Political Science Edition

The latest example comes from 538, an organization that is built on the idea of data-driven approaches to old problems:

Almost all data about politics that you encounter comes from polls and surveys of individuals or else from analysis of geographic units such as precincts, counties and states. Individual data and geographic data do not capture the essential networks in which we all live -- households and friendships and communities. But other and newer kinds of data -- such as voter files that connect individuals to their households or network data that capture online connections -- revolutionize how we understand politics. By the end of this election cycle, expect to see many more discoveries about the social groupings that define our lives.

Computational research enables us to ask entirely new questions -- both ones that were askable before but not feasible to answer and ones that would not have been conceived before. Even if the question begins as whimsically as "How often do Democrats and Republicans marry one another?"

Back in December 2007, I commented on this in the context of communications studies and public relations. One of our CS master's students, Sergey Golitsynskiy, had just defended an M.A. thesis in communications studies that investigated the blogosphere's impact on a particular dust-up in the public relations world. Such work has the potential to help refine the idea of "the general public" within public relations, and even its nature of "publics". (Sergey is now a tenure-track professor here in communications studies.)

Data that encodes online connections enables us to explore network effects that we can't even see with simpler data. As the 538 piece says, this will revolutionize how we understand politics, along with so many other disciplines.

Posted by Eugene Wallingford | Permalink | Categories: Computing

June 27, 2016 2:12 PM

Looking Back to Chris Ford's StrangeLoop 2015 Talk

StrangeLoop 2015 is long in the books for most people, but I occasionally still think about some of the things I learned there. Chris Ford's recent blog post reminded me that I had a draft about his talk waiting to be completed and posted. Had I published this post closer to the conference, I would have called it Kolmogorov Music: Compression and Understanding.

(This is the second follow-up post about a StrangeLoop 2015 talk that is still on my mind. The previous follow-up was about Peter Alvaro's talk on languages and distributed systems.)

the opening screen for the Kolmogorov Music talk

Chris Ford stepped to the podium in front of an emacs buffer. "Imagine a string of g's," he said, "infinitely long, going in both directions." This is an infinite string, he pointed out, with an 11-word description. That's the basic idea of Kolmogorov complexity, and the starting point for his talk.

I first read about Kolmogorov complexity in a couple of papers by Gregory Chaitin that I found on the early web back in the 1990s. It fascinated me then, and I went into "Kolmogorov Music", Ford's talk, with high hopes. It more than delivered. The talk was informative, technically clear, and entertaining.

Ford uses Clojure for this work in order to write macros. They allow him to talk about code at two levels: source and expansion. The source macro is his description of some piece of music, and the expansion is the music itself, "executable" by an interpreter.

He opened by demo'ing some cool music, including a couple of his own creations. Then he began his discussion of how complex a piece of music is. His measure of complexity is the ratio of the length of the evaluated data (the music) to the length of the macro (the program that generates it). This means that complexity is relative, in part, to the language of expression. If we used a language other than Clojure, the ratios would be different.

Once we settle on a programming language, we can compare the relative complexity of two pieces of music. This also gives rise to cool ideas such as conditional complexity, based on the distance between the programs that encode two pieces of music.

Compression algorithms do something quite similar: exploit our understanding of data to express it in fewer bytes. Ford said that he based his exploration on the paper Analysis by Compression by David Meredith, a "musicologist with a computational bent". Meredith thinks of listening to music as a model-building process that can be described using algorithms.
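The connection between compression and complexity can be tried out directly. What follows is a minimal sketch, not Ford's code or Meredith's method: it swaps in the JDK's general-purpose Deflater as a stand-in for "the shortest program", and the class and method names are made up for illustration. It computes the ratio of raw length to compressed length for a run of g's and for pseudo-random bytes.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.zip.Deflater;

class CompressionRatio {
    // Number of bytes the JDK's Deflater needs to encode the data.
    static int compressedSize(byte[] data) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(data);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    // Raw length divided by compressed length: a crude, compressor-dependent
    // proxy for the expansion-to-description ratio described in the talk.
    static double ratio(byte[] data) {
        return (double) data.length / compressedSize(data);
    }

    public static void main(String[] args) {
        byte[] gs = new byte[10_000];
        Arrays.fill(gs, (byte) 'g');          // a long string of g's

        byte[] noise = new byte[10_000];
        new Random(42).nextBytes(noise);      // pseudo-random bytes, fixed seed

        System.out.println("g's:   " + ratio(gs));
        System.out.println("noise: " + ratio(noise));
    }
}
```

The run of g's should compress to a tiny fraction of its length, while the noise should barely compress at all. Just as the ratios in the talk depend on Clojure, these depend on the compressor chosen.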

Programs have more expressive power than traditional music notation. Ford gave as an example clapping music that falls farther and farther behind itself as accompaniment continues. It's much easier to write this pattern using a programming language with repetition and offsets than using musical notation.

Everything has been cool so far. Ford pushed on to more coolness.

A minimalist idea can be described briefly. As Borges reminds us in The Library of Babel, a simple thing can contain things that are more complex than itself. Ford applied this idea to music. He recalled Carl Sagan's novel Contact, in which the constant pi was found to contain a hidden message. Inspired by Sagan, Ford looked to the Champernowne constant, a number created by concatenating the positive integers in succession -- 0.12345678910111213141516... -- and turned it into music. Then he searched it for patterns.
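The constant itself takes only a few lines to generate. Here is a minimal sketch in Java, with made-up class and method names, that produces the first n digits after the decimal point and looks for a digit pattern among them. How Ford mapped digits to notes is not described here, so the sketch stops at the digits.

```java
class Champernowne {
    // First n digits after the decimal point: 1, 2, ..., 9, 1, 0, 1, 1, ...
    static String digits(int n) {
        StringBuilder sb = new StringBuilder();
        for (int k = 1; sb.length() < n; k++) {
            sb.append(k);
        }
        sb.setLength(n);   // trim the last integer if it overshot n
        return sb.toString();
    }

    // Zero-based position of the first occurrence of pattern within the
    // first n digits, or -1 if it does not occur there.
    static int firstIndexOf(String pattern, int n) {
        return digits(n).indexOf(pattern);
    }

    public static void main(String[] args) {
        System.out.println("0." + digits(20));             // 0.12345678910111213141
        System.out.println(firstIndexOf("910", 20));       // 8
    }
}
```

Every finite digit string eventually shows up as some integer in the sequence, which is why the search is guaranteed to succeed if you look far enough.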

Ford found something that sounded an awful lot like "Blurred Lines", a pop hit by Robin Thicke in 2013, and played it for us. He cheekily noted that his Champernowne song infringes the copyright on Thicke's song, which is quite humorous given the controversial resemblance of Thicke's song to "Got to Give It Up", a Marvin Gaye tune from 1977. Of course, Ford's song is infinitely long, so it likely infringes the copyright of every song ever written! The good news for him is that it also subsumes every song to be written in the future, offering him the prospect of a steady income as an IP troll.

Even more than usual, my summary of Ford's talk cannot possibly do it justice, because he shows code and plays music! Let me echo what was a common refrain on Twitter immediately after his talk at StrangeLoop: Go watch this video. Seriously. You'll get to see him give a talk using only emacs and a pair of speakers, and hear all of the music, too. Then check out Ford's raw material. All of his references, music, and code are available on his GitHub site.

After that, check out his latest blog entry. More coolness.

Posted by Eugene Wallingford | Permalink | Categories: Computing

June 13, 2016 2:47 PM

Patterns Are Obvious When You Know To Look For Them...

normalized counts of consecutive 8-digit primes (mod 31)

In Prime After Prime (great title!), Brian Hayes boils down into two sentences the fundamental challenge that faces people doing research:

What I find most surprising about the discovery is that no one noticed these patterns long ago. They are certainly conspicuous enough once you know how to look for them.

It would be so much easier to form hypotheses and run tests if interesting hypotheses were easier to find.

Once found, though, we can all see patterns. When they can be computed, we can all write programs to generate them! After reading a paper about the strong correlations among pairs of consecutive prime numbers, Hayes wrote a bunch of programs to visualize the patterns and to see what other patterns he might find. A lot of mathematicians did the same.

Evidently that was a common reaction. Evelyn Lamb, writing in Nature, quotes Soundararajan: "Every single person we've told this ends up writing their own computer program to check it for themselves."
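In that spirit, here is a minimal sketch of such a check in Java. It is not Hayes's code or the code from the paper; the class and method names are invented for illustration. It sieves the primes up to a limit and counts how often a prime with residue a (mod m) is immediately followed by a prime with residue b (mod m).

```java
class PrimePairs {
    // Sieve of Eratosthenes: composite[i] is true when i is not prime (for i >= 2).
    static boolean[] sieve(int limit) {
        boolean[] composite = new boolean[limit + 1];
        for (int i = 2; (long) i * i <= limit; i++) {
            if (!composite[i]) {
                for (int j = i * i; j <= limit; j += i) {
                    composite[j] = true;
                }
            }
        }
        return composite;
    }

    // counts[a][b] = number of consecutive prime pairs (p, q) with
    // p, q <= limit, p mod m == a, and q mod m == b.
    static long[][] pairCounts(int limit, int modulus) {
        boolean[] composite = sieve(limit);
        long[][] counts = new long[modulus][modulus];
        int prev = -1;
        for (int p = 2; p <= limit; p++) {
            if (composite[p]) {
                continue;
            }
            if (prev != -1) {
                counts[prev % modulus][p % modulus]++;
            }
            prev = p;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] lastDigits = {1, 3, 7, 9};
        long[][] counts = pairCounts(1_000_000, 10);
        for (int a : lastDigits) {
            for (int b : lastDigits) {
                System.out.printf("%d -> %d: %d%n", a, b, counts[a][b]);
            }
        }
    }
}
```

With modulus 10 the table shows how often each final digit follows each other final digit. Passing a different modulus, such as 31, gives the raw counts behind a heat map like the one above.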

Being able to program means being able to experiment with all kinds of phenomena, even those that seemingly took genius to discover in the first place.

Actually, though, Hayes's article gives a tour of the kind of thinking we all can do that can yield new insights. Once he had implemented some basic ideas from the research paper, he let his imagination roam. He tried different moduli. He visualized the data using heat maps. When he noticed some symmetries in his tables, he applied a cyclic shift to the data (which he termed a "twist") to see if some patterns were easier to identify in the new form.

Being curious and asking questions like these is one of the ways that researchers manage to stumble upon new patterns that no one has noticed before. Genius may be one way to make great discoveries, but it's not a reliable one for those of us who aren't geniuses. Exploring variations on a theme is a tactic we mortals can use.

Some of the heat maps that Hayes generates are quite beautiful. The image above is a heat map of the normalized counts of consecutive eight-digit primes, taken modulo 31. He has more fun making images of his twists and with other kinds of primes. I recommend reading the entire article, for its math, for its art, and as an implicit narration of how a computational scientist approaches a cool result.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

June 12, 2016 10:33 AM

The Tension Between Free and Expensive

Yesterday, William Stein's talk about the origins of SageMath spread rapidly through certain neighborhoods of Twitter. It is a thorough and somewhat depressing discussion of how hard it is to develop open source software within an academic setting. Writing code is not part of the tenure reward system or the system for awarding grants. Stein has tenure at the University of Washington but has decided that he has to start a company, SageMath, and work for it full-time in order to create a viable open source alternative to the "four 'Ma's": Mathematica, Matlab, Maple, and Magma.

Stein's talk reminded me of something I read earlier this year, from a talk by Matthew Butterick:

"Information wants to be expensive, because it's so valuable ... On the other hand, information wants to be free, because the cost of getting it out is getting lower ... So you have these two fighting against each other."

This was said by a guy named Stewart Brand, way back in 1984.

So what's the message here? Information wants to be free? No, that's not the message. The message is that there are two forces in tension. And the challenge is how to balance the forces.

Proponents of open source software -- and I count myself one -- are often so glib with the mantra "information wants to be free" that we forget about the opposing force. Wolfram et al. have capitalized quite effectively on information's desire to be expensive. This force has an economic power that can overwhelm purely communitarian efforts in many contexts, to the detriment of open work. The challenge is figuring out how to balance the forces.

In my mind, Mozilla stands out as the modern paradigm of seeking a way to balance the forces between free and expensive, creating a non-profit shell on top of a commercial foundation. It also seeks ways to involve academics in the process. It will be interesting to see whether this model is sustainable.

Oh, and Stewart Brand. He pointed out this tension thirty years ago. I recently recommended How Buildings Learn to my wife and thought I should look back at the copious notes I took when I read it twenty years ago. But I should read the book again myself; I hope I've changed enough since then that reading it anew brings new ideas to mind.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 08, 2016 2:48 PM

Happy Birthday Programming

Yesterday, I wrote me some Java. It was fun.

A few days ago, I started wondering if there was something unique I could send my younger daughter for her birthday today. My daughters and I were all born in presidential election years, which is a neat little coincidence. This year's election is special for the birthday girl: it is her first opportunity to vote for the president. She has participated in the process throughout, which has seen both America's most vibrant campaign for a progressive candidate in at least forty years and the first nomination of a woman by a major party. Both of these are important to her.

In the spirit of programming and presidential politics, I decided to write a computer program to convert images into the style of Shepard Fairey's iconic Obama "Hope" poster and then use it to create a few images for her.

I dusted off Dr. Java and fired up some code I wrote when I taught media computation in our intro course many years ago. It had been a long time since I had written any Java at all, but it came back just like riding a bike. More than a decade of writing code in a language burns some pretty deep grooves in the mind.

I found RGB values to simulate the four colors in Fairey's poster in an old message to the mediacomp mailing list:

    Color darkBlue  = new Color(0, 51, 76);
    Color lightBlue = new Color(112, 150, 158);
    Color red       = new Color(217, 26, 33);
    Color yellow    = new Color(252, 227, 166);

Then came some experimentation...

  • First, I tried turning each pixel into the Fairey color to which it was closest. That gave an image that was grainy and full of lines, almost like a negative.

  • Then I calculated the intensity of each pixel (the average of its RGB values) and translated the pixel into one of the four colors depending on which quartile it was in. If the intensity was less than 256/4, it went dark blue; if it was less than 256/2, it went red; and so on. This gave a better result, but some images ended up having way too much of one or two of the colors.

  • Attempt #2 skews the outputs because in many images (most?) the intensity values are not distributed evenly over the four quartiles. So I wrote a function to "normalize" the quartiles. I recorded all of the intensity values in the image and divided them evenly across the four colors. The result was an image with an equal number of pixels assigned to each of the four colors.
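The second and third attempts can be sketched roughly as follows. This is a minimal sketch and not the program I ran: the class and method names are invented for illustration, it works on plain arrays of per-pixel RGB averages rather than on images, and the order of the last two colors (light blue, then yellow) is an assumption, since only the first two are spelled out above.

```java
import java.util.Arrays;

class ObamifySketch {
    // The four RGB triples listed above, ordered dark to light.
    // The positions of light blue and yellow are an assumption.
    static final int[][] PALETTE = {
        {0, 51, 76},      // dark blue
        {217, 26, 33},    // red
        {112, 150, 158},  // light blue
        {252, 227, 166}   // yellow
    };

    // Average of a pixel's three channels, in the range 0..255.
    static int average(int r, int g, int b) {
        return (r + g + b) / 3;
    }

    // Attempt 2: fixed cut points at 64, 128, and 192.
    static int fixedBucket(int avg) {
        return Math.min(avg / 64, 3);
    }

    // Attempt 3: rank the pixels by average and give each color an equal
    // share of the ranks, so the cut points adapt to the image.
    static int[] normalizedBuckets(int[] avgs) {
        int n = avgs.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Integer.compare(avgs[a], avgs[b]));
        int[] buckets = new int[n];
        for (int rank = 0; rank < n; rank++) {
            buckets[order[rank]] = (int) ((long) rank * 4 / n);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // A tiny "image" with four dark pixels and four bright ones.
        int[] avgs = {10, 20, 30, 40, 200, 210, 220, 230};
        int[] fixed = new int[avgs.length];
        for (int i = 0; i < avgs.length; i++) {
            fixed[i] = fixedBucket(avgs[i]);
        }
        System.out.println(Arrays.toString(fixed));                    // [0, 0, 0, 0, 3, 3, 3, 3]
        System.out.println(Arrays.toString(normalizedBuckets(avgs)));  // [0, 0, 1, 1, 2, 2, 3, 3]
    }
}
```

Each bucket number is an index into PALETTE. On this tiny input the fixed cut points use only two of the four colors, while the rank-based version uses all four equally, which is the skew that the normalization step was meant to remove.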

I liked the outputs of this third effort quite a bit, at least for the photos I gave it as input. Two of them worked out especially well. With a little doctoring in Photoshop, they would have an even more coherent feel to them, like an artist might produce with a keener eye. Pretty good results for a few fun minutes of programming.

Now, let's hope my daughter likes them. I don't think she's ever received a computer-generated present before, at least not generated by a program her dad wrote!

The images I created were gifts to her, so I'll not share them here. But if you've read this far, you deserve a little something, so I give you these:

Eugene obamified by his own program, version 1
Eugene obamified by his own program, version 2

Now that is change we can all believe in.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

June 03, 2016 3:07 PM

The Lisp 1.5 Universal Interpreter, in Racket

John McCarthy presents Lisp from on High
"John McCarthy presents Recursive
Functions of Symbolic Expressions
and Their Computation by Machine,
Part I"
courtesy of Classic Programmer Paintings

Earlier this week, Alan Kay was answering questions on Hacker News and mentioned Lisp 1.5:

This got deeper if one was aware of how Lisp 1.5 had been implemented with the possibility of late bound parameter evaluation ...

Kay mentions Lisp, and especially Lisp 1.5, often whenever he is talking about the great ideas of computing. He sometimes likens McCarthy's universal Lisp interpreter to Maxwell's equations in physics -- a small, simple set of equations that capture a huge amount of understanding and enable a new way of thinking. Late-bound evaluation of parameters is one of the neat ideas you can find embedded in that code.

The idea of a universal Lisp interpreter is pretty simple: McCarthy defined the features of Lisp in terms of the language features themselves. The interpreter consists of two main procedures:

  • a procedure that evaluates an expression, and
  • a procedure that applies a procedure to its arguments.

These procedures recurse mutually to evaluate a program.

This is one of the most beautiful ideas in computing, one that we take for granted today.

The syntax and semantics of Lisp programs are so sparse and so uniform that McCarthy's universal Lisp interpreter consisted of about one page of Lisp code. Here that page is: Page 13 of the Lisp 1.5 Programmer's Manual, published in 1962.

Page 13 of the Lisp 1.5 Programmer's Manual

You may see this image passed around Twitter and the web these days whenever Lisp 1.5 is mentioned. But the universal Lisp interpreter is a program. Why settle for a JPG image?

While preparing for the final week of my programming languages course this spring, I sat down and implemented the Lisp interpreter on Page 13 of the Lisp 1.5 manual in universal-lisp-interpreter.rkt, using Racket.

I tried to reproduce the main procedures from the manual as faithfully as I could. You see the main two functions underlying McCarthy's idea: "evaluate an expression" and "apply a function to its arguments". The program assumes the existence of only a few primitive forms from Racket:

  • the functions cons, car, cdr, atom, and eq?
  • the form lambda, for creating functions
  • the special forms quote and cond
  • the values 't and nil

't means true, and nil means both false and the empty list. My Racket implementation uses #t and #f internally, but they do not appear in the code for the interpreter.

Notice that this interpreter implements all of the language features that it uses: the same five primitive functions, the same two special forms, and lambda. It also defines label, a way to create recursive functions. (label offers a nice contrast to the ways we talk about implementing recursive functions in my course.)

The interpreter uses a few helper functions, which I also define as in the manual. evcon evaluates a cond expression, and evlis evaluates a list of arguments. assoc looks up the value for a key in an "association list", and pairlis extends an existing association list with new key/value pairs. (In my course, assoc and pairlis correspond to basic operations on a finite function, which we use to implement environments.)
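To make the environment idea concrete, here is a minimal sketch of the two association-list helpers in Java rather than Racket. It is not a transcription of the manual or of my Racket file: the class name is invented, keys are plain strings, and this assoc returns the bound value directly, which is a simplification.

```java
import java.util.Arrays;
import java.util.List;

class AssocList {
    // One immutable node: a key/value pair plus the rest of the list.
    // The empty list is represented by null.
    final String key;
    final Object value;
    final AssocList rest;

    AssocList(String key, Object value, AssocList rest) {
        this.key = key;
        this.value = value;
        this.rest = rest;
    }

    // pairlis: put new key/value pairs in front of an existing list.
    static AssocList pairlis(List<String> keys, List<?> values, AssocList base) {
        AssocList result = base;
        for (int i = keys.size() - 1; i >= 0; i--) {
            result = new AssocList(keys.get(i), values.get(i), result);
        }
        return result;
    }

    // assoc: the first matching key wins, so newer bindings shadow older ones.
    static Object assoc(String key, AssocList list) {
        for (AssocList node = list; node != null; node = node.rest) {
            if (node.key.equals(key)) {
                return node.value;
            }
        }
        throw new IllegalArgumentException("unbound: " + key);
    }

    public static void main(String[] args) {
        AssocList global = pairlis(Arrays.asList("x", "y"), Arrays.asList(1, 2), null);
        AssocList local = pairlis(Arrays.asList("x"), Arrays.asList(10), global);
        System.out.println(assoc("x", local));   // 10
        System.out.println(assoc("y", local));   // 2
        System.out.println(assoc("x", global));  // 1
    }
}
```

Because pairlis only adds to the front and never mutates, the older list is still intact after a function call returns. That is the behavior an interpreter needs when it binds a function's parameters to its arguments.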

I enjoyed walking through this code briefly with my students. After reading this code, I think they appreciated anew the importance of meaningful identifiers...

The code works. Open it up in Racket and play with a Lisp from the dawn of time!

It really is remarkable how much can be built out of so little. I sometimes think of the components of this program as the basic particles out of which all computation is built, akin to an atomic theory of matter. Out of these few primitives, all programs can be built.

Posted by Eugene Wallingford | Permalink | Categories: Computing

June 02, 2016 3:10 PM

Restoring Software's Good Name with a Humble Script

In a Startup School interview earlier this year, Paul Graham reminds software developers of an uncomfortable truth:

Still, to this day, one of the big things programmers do not get is how traumatized users have been by bad and hard-to-use software. The default assumption of users is, "This is going to be really painful, and in the end, it's not going to work."

I have encountered this trauma even more since beginning to work with administrators on campus a decade ago. "Campus solutions" track everything from enrollments to space usage. So-called "business-to-business software" integrates purchasing with bookkeeping. Every now and then the university buys and deploys a new system, to manage hiring, say, or faculty travel. In almost every case, interacting with the software is painful for the user, and around the edges it never quite seems to fit what most users really want.

When administrators or faculty relate their latest software-driven pain, I try to empathize while also bringing a little perspective to their concerns. These systems address large issues, and trying to integrate them into a coherent whole is a very real challenge, especially for an understaffed group of programmers. Sometimes, the systems are working exactly as they should to implement an inconvenient policy. Unfortunately, users don't see the policy on a daily basis; they see the inconvenient and occasionally incomplete software that implements it.

Yet there are days when even I have to complain out loud. Using software can be painful.

Today, though, I offer a story of nascent redemption.

After reviewing some enrollment data earlier this spring, my dean apologized in advance for any errors he had made in the reports he sent to the department heads. Before he can analyze the data, he or one of the assistant deans has to spend many minutes scavenging through spreadsheets to eliminate rows that are not part of the review. They do this several times a semester, which adds up to hours of wasted time in the dean's office. The process is, of course, tedious and error-prone.

I'm a programmer. My first thought was, "A program can do this filtering almost instantaneously and never make an error."

In fact, a few years ago, I wrote a simple Ruby program to do just this sort of filtering for me, for a different purpose. I told the dean that I would be happy to adapt it for use in his office to process data for all the departments in the college. My primary goal was to help the dean; my ulterior motive was self-improvement. On top of that, this was a chance to put my money where my mouth is. I keep telling people that a few simple programs can make our lives better, and now I could provide a concrete example.

Last week, I whipped up a new Python script. This week, I demoed it to the dean and an assistant dean. The dean's first response was, "Wow, this will help us a lot." The rest of the conversation focused on ways that the program could help them even more. Like all users, once they saw what was possible, they knew even better what they really wanted.

I'll make a few changes and deliver a more complete program soon. I'll also help the users as they put it to work and run into any bugs that remain. It's been fun. I hope that this humble script is an antidote, however small, to the common pain of software that is hard to use and not as helpful as it should be. Many simple problems can be solved by simple programs.

Posted by Eugene Wallingford | Permalink | Categories: Computing

May 27, 2016 1:38 PM

Brilliance Is Better Than Magic, Because You Get To Learn It

Brent Simmons has recently suggested that Swift would be better if it were more dynamic. Some readers have interpreted his comments as an unwillingness to learn new things. In Oldie Complains About the Old Old Ways, Simmons explains that new things don't bother him; he simply hopes that we don't lose access to what we learned in the previous generation of improvements. The entry is short and worth reading in its entirety, but the last sentence of this particular paragraph deserves to be etched in stone:

It seemed like magic, then. I later came to understand how it worked, and then it just seemed like brilliance. (Brilliance is better than magic, because you get to learn it.)

This gets close to the heart of why I love being a computer scientist.

So many of the computer programs I use every day seem like magic. This might seem odd coming from a computer scientist, who has learned how to program and who knows many of the principles that make complex software possible. Yet that complexity takes many forms, and even a familiar program can seem like magic when I'm not thinking about the details under its hood.

As a computer scientist, I get to study the techniques that make these programs work. Sometimes, I even get to look inside the new program I am using, to see the algorithms and data structures that bring to life the experience that feels like magic.

Looking under the hood reminds me that it's not really magic. It isn't always brilliance either, though. Sometimes, it's a very cool idea I've never seen or thought about before. Other times, it's merely a bunch of regular ideas, competently executed, woven together in a way that gives an illusion of magic. Regular ideas, competently executed, have their own kind of beauty.

After I study a program, I know the ideas and techniques that make it work. I can use them to make my own programs.

This fall, I will again teach a course in compiler construction. I will tell a group of juniors and seniors, in complete honesty, that every time I compile and execute a program, the compiler feels like magic to me. But I know it's not. By the end of the semester, they will know what I mean; it won't feel like magic to them any more, either. They will have learned how their compilers work. And that is even better than the magic, which will never go away completely.

After the course, they will be able to use the ideas and techniques they learn to write their own programs. Those programs will probably feel like magic to the people who use them, too.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Software Development

May 23, 2016 1:55 PM

Your Occasional Reminder to Use Plain Text Whenever Possible

The authors of Our Lives, Encoded found that they had lost access to much of their own digital history to the proprietary data formats of long-dead applications:

Simple text files have proven to be the only format interoperable through the decades. As a result, they are the only artifacts that remain from my own digital history.

I myself have lost access to many WordPerfect files from the '80s in their original form, though I have been migrating their content to other formats over the years. I was fortunate, though, to do most of my early work in VMS and Unix, so a surprising number of my programs and papers from that era are still readable as they were then. (Occasionally, this requires me to dust off troff to see what I intended for them to look like then.)

However, the world continues to conspire against us. Even when we are doing something that is fundamentally plain text, the creators of networks and apps build artificial barriers between their services.

One cannot simply transfer a Twitter profile over to Facebook, or message a Snapchat user with Apple's iMessage. In the sense that they are all built to transmit text and images, these platforms aren't particularly novel, they're just designed to be incompatible.

This is one more reason that you will continue to find me consorting in the ancient technology of email. Open protocol, plain text. Plenty of goodness there, even with the spam.

Posted by Eugene Wallingford | Permalink | Categories: Computing

May 15, 2016 9:36 AM

An Interview about Encryption

A local high school student emailed me last week to say that he was writing a research paper about encryption and the current conversation going on regarding its role in privacy and law enforcement. He asked if I would be willing to answer a few interview questions, so that he could have a few expert quotes for his paper. I'm always glad when our local schools look to the university for expertise, and I love to help young people, so I said yes.

I have never written anything here about my take on encryption, Edward Snowden, or the FBI case against Apple, so I figured I'd post my answers. Keep in mind that my expertise is in computer science. I am not a lawyer, a political scientist, or a philosopher. But I am an informed citizen who knows a little about how computers work. What follows is a lightly edited version of the answers I sent the student.

  1. Do you use encryption? If so, what do you use?

    Yes. I encrypt several disk images that hold sensitive financial data. I use encrypted files to hold passwords and links to sensitive data. My work laptop is encrypted to protect university-related data. And, like everyone else, I happily use https: when it encrypts data that travels between me and my bank and other financial institutions on the web.

  2. In light of the recent news on groups like ISIS using encryption, and the Apple v. Department of Justice, do you support legislation that eliminates or weakens powerful encryption?

    I oppose any legislation that weakens strong encryption for ordinary citizens. Any effort to weaken encryption so that the government can access data in times of need weakens encryption for all people at all times and against all intruders.

  3. Do you think the general good of encryption (protection of data and security of users) outweighs or justifies its usage when compared to the harmful aspects of it (being used by terrorists groups or criminals)?

    I do. Encryption is one of the great gifts that computer science has given humanity: the ability to be secure in one's own thoughts, possessions, and communication. Any tool as powerful as this one can be misused, or used for evil ends.

    Encryption doesn't protect us from only the U.S. government acting in good faith. It protects people from criminals who want to steal our identities and our possessions. It protects people from the U.S. government acting in bad faith. And it protects people from other governments, including governments that terrorize their own people. If I were a citizen of a repressive regime in the Middle East, Africa, Southeast Asia, or anywhere else, I would want the ability to communicate without intrusion from my government.

    Those of us who are lucky to live in safer, more secure circumstances owe this gift to the people who are not so lucky. And weakening it for anyone weakens it for everyone.

  4. What is your response to someone who justifies government suppression of encryption with phrases like "What are you hiding?" or "I have nothing to hide."?

    I think that most people believe in privacy even when they have nothing to hide. As a nation, we do not allow police to enter our homes at any time for any reason. Most people lock their doors at night. Most people pull their window shades down when they are bathing or changing clothes. Most people do not have intimate relations in public view. We value privacy for many reasons, not just when we have something illegal to hide.

    We do allow the police to enter our homes when executing a search warrant, after the authorities have demonstrated a well-founded reason to believe it contains material evidence in an investigation. Why not allow the authorities to enter our digital devices under similar circumstances? There are two reasons.

    First, as I mentioned above, weakening encryption so that the government can access data in times of legitimate need weakens encryption for everyone all the time and makes them vulnerable against all intruders, including bad actors. It is simply not possible to create entry points only for legitimate government uses. If the government suppresses encryption in order to assist law enforcement, there will be disastrous unintended side effects to essential privacy of our data.

    Second, our digital devices are different than our homes and other personal property. We live in our homes and drive our cars, but our phones, laptops, and other digital devices contain fundamental elements of our identity. For many, they contain the entirety of our financial and personal information. They also contain programs that enact common behaviors and would enable law enforcement to recreate past activity not stored on the device. These devices play a much bigger role in our lives than a house.

  5. In 2013 Edward Snowden leaked documents detailing surveillance programs that overstepped boundaries spying on citizens. Do you think Snowden became "a necessary evil" to protect citizens that were unaware of surveillance programs?

    Initially, I was unsympathetic to Snowden's attempt to evade detainment by the authorities. The more I learned about the programs that Snowden had uncovered, the more I came to see that his leak was an essential act of whistleblowing. The American people deserve to know what their government is doing. Indeed, citizens cannot direct their government if they do not know what their elected officials and government agencies are doing.

  6. In 2013 to now, the number of users that are encrypting their data has significantly risen. Do you think that Snowden's whistleblowing was the action responsible for a massive rise in Americans using encryption?

    I don't know. I would need to see some data. Encryption is a default in more software and on more devices now. I also don't know what the trend line for user encryption looked like before his release of documents.

  7. Despite recent revelations on surveillance, millions of users still don't voluntarily use encryption. Do you believe it is fear of being labeled a criminal or the idea that encryption is unpatriotic or makes them an evil person?

    I don't know. I expect that there are a number of bigger reasons, including apathy and ignorance.

  8. Encryption defaults on devices like iPhones, where the device is encrypted while locked with a passcode is becoming a norm. Do you support the usage of default encryption and believe it protects users who aren't computer savvy?

    I like encryption by default on my devices. It comes with risks: if I lose my password, I lose access to my own data. I think that users should be informed that encryption is turned on by default, so that they can make informed choices.

  9. Should default encryption become required by law or distributed by the government to protect citizens from foreign governments or hackers?

    I think that we should encourage people to encrypt their data. At this point, I am skeptical of laws that would require it. I am not a legal scholar and do not know that the government has the authority to require it. I also don't know if that is really what most Americans want. We need to have a public conversation about this.

  10. Do you think other foreign countries are catching up or have caught up to the United States in terms of technical prowess? Should we be concerned?

    People in many countries have astonishing technical prowess. Certainly individual criminals and other governments are putting that prowess to use. I am concerned, which is one reason I encrypt my own data and encourage others to do so. I hope that the U.S. government and other American government agencies are using encryption in an effort to protect us. This is one reason I oppose the government mandating weakness in encryption mechanisms for its own purposes.

  11. The United States government disclosed that it was hacked and millions of employees' information was compromised. Target suffered a breach that resulted in credit card information being stolen. Should organizations and companies be legally responsible for breaches like these? What reparations should they make?

    I am not a lawyer, but... Corporations and government agencies should take all reasonable precautions to protect the sensitive data they store about their customers and citizens. I suspect that corporations are already subject to civil suit for damages caused by data breaches, but that places burdens on people to recover damages for losses due to breached data. This is another area where we as a people need to have a deeper conversation so that we can decide to what extent we want to institute safeguards into the law.

  12. Should the US begin hacking into other countries infrastructures and businesses to potentially damage that country in the future or steal trade secrets similar to what China has done to us?

    I am not a lawyer or military expert, but... In general, I do not like the idea of our government conducting warfare on other peoples and other governments when we are not in a state of war. The U.S. should set a positive moral example of how a nation and a people should behave.

  13. Should the US be allowed to force companies and corporations to create backdoors for the government? What do you believe would be the fallout from such an event?

    No. See the third paragraph of my answer to #4.

As I re-read my answers, I realize that, even though I have thought a lot about some of these issues over the years, I have a lot more thinking to do. One of my takeaways from the interview is that the American people need to think about these issues and have public conversations in order to create good public policy and to elect officials who can effectively steward the government in a digital world. In order for this to happen, we need to teach everyone enough math and computer science that they can participate effectively in these discussions and in their own governance. This has big implications for our schools and science journalism.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

May 11, 2016 2:45 PM

Why Do Academic Research in Programming Languages?

When people ask Jean Yang this question, she reminds them that most of the features in mainstream languages follow decades of research:

Yes, Guido Van Rossum was a programmer and not a researcher before he became the Benevolent Dictator of Python. But Python's contribution is not in innovating in terms of any particular paradigm, but in combining well features like object orientation (Smalltalk, 1972, and Clu, 1975), anonymous lambda functions (the lambda calculus, 1937), and garbage collection (1959) with an interactive feel (1960s).

You find a similar story with Matz and Ruby. Many other languages were designed by people working in industry but drawing explicitly on things learned by academic research.

I don't know what percentage of mainstream languages were designed by people in industry rather than academia, but the particular number is beside the point. The same is true in other areas of research, such as databases and networks. We need some research to look years and decades into the future in order to figure out what is and isn't possible. That research maps the terrain that makes more applied work, whether in industry or academia, possible.

Without academics betting years of their career on crazy ideas, we are doomed to incremental improvements of old ideas.

Posted by Eugene Wallingford | Permalink | Categories: Computing

May 05, 2016 1:45 PM


In her 1942 book Philosophy in a New Key, philosopher Susanne Langer wrote:

A question is really an ambiguous proposition; the answer is its determination.

This sounds like something a Prolog programmer might say in a philosophical moment. Langer even understood how tough it can be to write effective Prolog queries:

The way a question is asked limits and disposes the ways in which any answer to it -- right or wrong -- may be given.

Try sticking a cut somewhere and see what happens...

It wouldn't be too surprising if a logical philosopher reminded me of Prolog, but Langer's specialties were consciousness and aesthetics. Now that I think about it, though, this connection makes sense, too.

Prolog can be a lot of fun, though logic programming always felt more limiting to me than most other styles. I've been fiddling again with Joy, a language created by a philosopher, but every so often I think I should earmark some time to revisit Prolog someday.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development

May 02, 2016 4:30 PM

A Pawn and a Move

So, a commodity chess program is now giving odds of a pawn and a move to a world top-ten player -- and winning?

The state of computer chess certainly has changed since the fall of 1979, when I borrowed Mike Jeffers's Chess Challenger 7 and played it over and over and over. I was a rank novice, really just getting my start as a player, yet after a week or so I was able to celebrate my first win over the machine, at level 3. You know what they say about practice...

My mom stopped by our study room several times during that week, trying to get me to stop playing. It turns out that she and my dad had bought me a Chess Challenger 7 for Christmas, and she didn't want me to tire of my present before I had even unwrapped it. She didn't know just how not tired I would get of that computer. I wore it out.

When I graduated with my Ph.D., my parents bought me Chess Champion 2150L, branded in the name of world champion Garry Kasparov. The 2150 in the computer's name was a rough indication that it played expert-level chess, much better than my CC7 and much better than me. I could beat it occasionally in a slow game, but in speed chess it pounded me mercilessly. I no longer had the time or inclination to play all night, every night, in an effort to get better, so it forever remained my master.

Now US champ Hikaru Nakamura and world champ Magnus Carlsen know how I feel. The days of any human defeating even the programs you can buy at Radio Shack have long passed.

Two pawns and move odds against grandmasters, and a pawn and a move odds against the best players in the world? Times have changed.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

April 29, 2016 3:30 PM

A Personal Pantheon of Programming Books

Michael Fogus, in the latest issue of Read-Eval-Print-λove, writes:

The book in question was Thinking Forth by Leo Brodie (Brodie 1987) and upon reading it I immediately put it into my own "personal pantheon" of influential programming books (along with SICP, AMOP, Object-Oriented Software Construction, Smalltalk Best Practice Patterns, and Programmers Guide to the 1802).

Mr. Fogus has good taste. Programmers Guide to the 1802 is new to me. I guess I need to read it.

The other five books, though, are in my own pantheon of influential programming books. Some readers may be unfamiliar with these books or the acronyms, or unaware that so many of them are available free online. Here are a few links and details:

  • Thinking Forth teaches us how to program in Forth, a concatenative language in which programs run against a global stack. As Fogus writes, though, Brodie teaches us so much more. He teaches a way to think about programs.

  • SICP is Structure and Interpretation of Computer Programs, hailed by many as the greatest book on computer programming ever written. I am sympathetic to this claim.

  • AMOP is The Art of the Metaobject Protocol, a gem of a book that far too few programmers know about. It presents a very different and more general kind of OOP than most people learn, the kind possible in a language like Common Lisp. I don't know of an authorized online version of this book, but there is an HTML copy available.

  • Object-Oriented Software Construction is Bertrand Meyer's opus on OOP. It did not affect me as deeply as the other books on this list, but it presents the most complete, internally consistent software engineering philosophy of OOP that I know of. Again, there seems to be an unauthorized version online.

  • I love Smalltalk Best Practice Patterns and have mentioned it a couple of times over the years [ 1 | 2 ]. Ounce for ounce, it contains more practical wisdom for programming in the trenches than any book I've read. Don't let "Smalltalk" in the title fool you; this book will help you become a better programmer in almost any language and any style. I have a PDF of a pre-production draft of SBPP, and Stephane Ducasse has posted a free online copy, with Kent's blessing.

Paradigms of Artificial Intelligence Programming

There is one book on my own list that Fogus did not mention: Paradigms of Artificial Intelligence Programming, by Peter Norvig. It holds perhaps the top position in my personal pantheon. Subtitled "Case Studies in Common Lisp", this book teaches Common Lisp, AI programming, software engineering, and a host of other topics in a classical case studies fashion. When you finish working through this book, you are not only a better programmer; you also have working versions of a dozen classic AI programs and a couple of language interpreters.

Reading Fogus's paragraph of λove for Thinking Forth brought to mind how I felt when I discovered PAIP as a young assistant professor. I once wrote a short blog entry praising it. May these paragraphs stand as a greater testimony of my affection.

I've learned a lot from other books over the years, both books that would fit well on this list (in particular, A Programming Language by Kenneth Iverson) and others that belong on a different list (say, Gödel, Escher, Bach -- an almost incomparable book). But I treasure certain programming books in a very personal way.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Software Development, Teaching and Learning

April 11, 2016 2:53 PM

A Tax Form is Really a Program

I finally got around to preparing my federal tax return this weekend. As I wrote a decade ago, I'm one of those dinosaurs who still does taxes by hand, using pencil and paper. Most of this work involves gathering data from various sources and entering numbers on a two-page Form 1040. My family's finances are relatively simple, I'm reasonably well organized, and I still enjoy the annual ritual of filling out the forms.

For supporting forms such as Schedules A and B, which enumerate itemized deductions and interest and dividend income, I reach into my books. My current accounting system consists of a small set of Python programs that I've been developing over the last few years. I keep all data in plain text files. These files are amenable to grep and simple Python programs, which I use to create lists and tally numbers to enter into forms. I actually enjoy the process and, unlike some people, enjoy reflecting once each year about how I support "we, the people" in carrying out our business. I also reflect on the Rube Goldberg device that is US federal tax code.

However, every year there is one task that annoys me: computing the actual tax I owe. I don't mind paying the tax, or the amount I owe. But I always forget how annoying the Qualified Dividends and Capital Gain Tax Worksheet is. In case you've never seen it, or your mind has erased its pain from your memory in an act of self-defense, here it is:

Qualified Dividends and Capital Gain Tax Worksheet--Line 44

It may not seem so bad at this moment, but look at that logic. It's a long sequence of "Enter the smaller of line X or line Y" and "Add lines Z and W" instructions, interrupted by an occasional reference to an entry on another form or a case statement to select a constant based on your filing status. By the time I get to this logic puzzle each year, I am starting to tire and just want to be done. So I plow through this mess by hand, and I start making mistakes.

This year I made a mistake in the middle of the form, comparing the wrong numbers when instructed to choose the smaller. I realized my mistake when I got to a line where the error resulted in a number that made no sense. (Fortunately, I was still alert enough to notice that much!) I started to go back and refigure from the line with the error, when suddenly sanity kicked in.

This worksheet is a program written in English, being executed by a tired, error-prone computer: me. I don't have to put up with this; I'm a programmer. So I turned the worksheet into a Python program.

This is what the Qualified Dividends and Capital Gain Tax Worksheet for Line 44 of Form 1040 (Page 44 of the 2015 instruction book) could be, if we weren't still distributing everything as dead PDF:

line   = [None] * 28

line[ 0] = 0.00                 # unused
line[ 1] = XXXX                 # 1040 line 43
line[ 2] = XXXX                 # 1040 line 9b
line[ 3] = XXXX                 # 1040 line 13
line[ 4] = line[ 2] + line[ 3]
line[ 5] = XXXX                 # 4952 line 4g
line[ 6] = line[ 4] - line[ 5]
line[ 7] = line[ 1] - line[ 6]
line[ 8] = XXXX                 # from worksheet
line[ 9] = min(line[ 1],line[ 8])
line[10] = min(line[ 7],line[ 9])
line[11] = line[9] - line[10]
line[12] = min(line[ 1],line[ 6])
line[13] = line[11]
line[14] = line[12] - line[13]
line[15] = XXXX                 # from worksheet
line[16] = min(line[ 1],line[15])
line[17] = line[ 7] + line[11]
line[18] = line[16] - line[17]
line[19] = min(line[14],line[18])
line[20] = 0.15 * line[19]
line[21] = line[11] + line[19]
line[22] = line[12] - line[21]
line[23] = 0.20 * line[22]
line[24] = XXXX                 # from tax table
line[25] = line[20] + line[23] + line[24]
line[26] = XXXX                 # from tax table
line[27] = min(line[25],line[26])

i = 0
for l in line:
    print('{:>2} {:10.2f}'.format(i, l))
    i += 1

This is a quick-and-dirty first cut, just good enough for what I needed this weekend. It requires some user input, as I have to manually enter values from other forms, from the case statements, and from the tax table. Several of these steps could be automated, with only a bit more effort or a couple of input statements. It's also not technically correct, because my smaller-of tests don't guard for a minimum of 0. Maybe I'll add those checks soon, or next year if I need them.
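Here is a rough sketch of what the next cut might look like, with the manual entries turned into function parameters and a guard so that no subtraction drops below 0. The helper name floor_sub, the function name worksheet, and the parameter names are made up for illustration, and applying the guard to every subtraction step is an assumption on my part; the printed worksheet is the authority on which lines need it. The numbers in the usage lines are invented, not real tax figures.

```python
def floor_sub(a, b):
    """Subtract b from a, entering 0 if the result would be negative."""
    return max(0.0, a - b)


def worksheet(line1, line2, line3, line5, line8, line15, line24, line26):
    """Run the worksheet steps and return the list of line values.

    The parameters are the entries that were typed in by hand (XXXX) in
    the quick-and-dirty version.
    """
    line = [0.0] * 28                  # line[0] is unused
    line[1] = line1                    # 1040 line 43
    line[2] = line2                    # 1040 line 9b
    line[3] = line3                    # 1040 line 13
    line[4] = line[2] + line[3]
    line[5] = line5                    # 4952 line 4g
    line[6] = floor_sub(line[4], line[5])
    line[7] = floor_sub(line[1], line[6])
    line[8] = line8                    # from worksheet
    line[9] = min(line[1], line[8])
    line[10] = min(line[7], line[9])
    line[11] = floor_sub(line[9], line[10])
    line[12] = min(line[1], line[6])
    line[13] = line[11]
    line[14] = floor_sub(line[12], line[13])
    line[15] = line15                  # from worksheet
    line[16] = min(line[1], line[15])
    line[17] = line[7] + line[11]
    line[18] = floor_sub(line[16], line[17])
    line[19] = min(line[14], line[18])
    line[20] = 0.15 * line[19]
    line[21] = line[11] + line[19]
    line[22] = floor_sub(line[12], line[21])
    line[23] = 0.20 * line[22]
    line[24] = line24                  # from tax table
    line[25] = line[20] + line[23] + line[24]
    line[26] = line26                  # from tax table
    line[27] = min(line[25], line[26])
    return line


# Invented inputs, purely to exercise the code.
result = worksheet(50000, 1000, 2000, 0, 70000, 400000, 6000, 7000)
print('{:>2} {:10.2f}'.format(27, result[27]))
```

Passing the hand-entered values as arguments keeps the worksheet logic separate from the data, so the same function can be reused next year with new numbers.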

Wouldn't it be nice, though, if our tax code were written as computer code, or if we could at least download worksheets and various forms as simple programs? I know I can buy commercial software to do this, but I shouldn't have to. There is a bigger idea at play here, and a principle. Computers enable so much more than sharing PDF documents and images. They can change how we write many ideas down, and how we think. Most days, we barely scratch the surface of what is possible.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

April 08, 2016 4:27 PM

"Algorithm" is not a Postmodern Concept

Ken Perlin's The Romantic View of Programming opens:

I was attending a panel recently on the topic of new media art that seemed to be culturally split. There were panelists who were talking about abstract concepts like "algorithm" as though these were arcane postmodern terms, full of mysterious power and potential menace.

I encounter this sort of thinking around my own campus. Not everyone needs to learn a lot about computer science, but it would be nice if we could at least alleviate this misunderstanding.

An algorithm is not a magic incantation, even when its implementation seems to perform magic. For most people, as for Perlin, an algorithm is ultimately a means toward a goal. The article closes with the most important implication of the author's romantic view of programming: "And if some software tool you need doesn't exist, you build it." That can be as true for a new media artist as it is for a run-of-the-mill programmer like me.

Posted by Eugene Wallingford | Permalink | Categories: Computing

March 31, 2016 2:00 PM

TFW Your Students Get Abstraction

A colleague sent me the following exchange from his class, with the tag line "Best comments of the day." His students were working in groups to design a Java program for Conway's Game of Life.

Student 1: I can't comprehend what you are saying.

Student 2: The board doesn't have to be rectangular, does it?

Instructor: In Conway's design, it was. But abstractly, no.

Student 3: So you could have a board of different shapes, or you could even have a three-dimensional "board". Each cell knows its neighbors even if we can't easily display it to the user.

Instructor: Sure, "neighbor" is an abstract concept that you can implement differently depending on your need.

Student 2: I knew there was a reason I took linear algebra.

Student 1: Ok. So let's only allow rectangular boards.

Maybe Student 1 still can't comprehend what everyone is saying... or perhaps he or she understands perfectly well and is a pragmatist. YAGNI for the win!
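Student 3's idea is easy to sketch. The class was working in Java; this is a small Python illustration, not their design, and the names grid_neighbors and step are made up for the example. It assumes an unbounded board stored as a set of live cells, and it takes the notion of "neighbor" as a function argument, so the same step code works for two dimensions, three dimensions, or any other shape of board.

```python
from itertools import product


def grid_neighbors(cell):
    """Cells that differ from `cell` by at most 1 in every coordinate.

    Works for a tuple of any length, so it covers 2-D and 3-D boards.
    """
    offsets = product((-1, 0, 1), repeat=len(cell))
    return {
        tuple(c + d for c, d in zip(cell, offset))
        for offset in offsets
        if any(offset)                 # skip the all-zero offset (the cell itself)
    }


def step(live, neighbors, birth=(3,), survive=(2, 3)):
    """Compute the next generation from a set of live cells.

    `neighbors` maps a cell to the set of its neighbors; the defaults for
    `birth` and `survive` are Conway's usual rules.
    """
    counts = {}
    for cell in live:
        for n in neighbors(cell):
            counts[n] = counts.get(n, 0) + 1
    return {
        cell
        for cell, k in counts.items()
        if k in birth or (cell in live and k in survive)
    }


# A horizontal "blinker" on a 2-D board flips to vertical after one step.
blinker = {(0, 1), (1, 1), (2, 1)}
print(sorted(step(blinker, grid_neighbors)))
```

The step function never mentions rows, columns, or rectangles. Only the neighbors function knows the geometry, which is the abstraction the instructor was pointing at.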

It always makes me happy when a student encounters a situation in which linear algebra is useful and recognizes its applicability unprompted.

I salute all three of these students, and the instructor who is teaching the class. A good day.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 14, 2016 5:27 PM

Can AlphaGo Teach Us to Play Go Better?

All Systems Go -- the cover of Nature magazine
Nature heralds AlphaGo's arrival
courtesy of the American Go Association

In Why AlphaGo Matters, Ben Kamphaus writes:

AlphaGo recognises strong board positions by first recognizing visual features in the board. It's connecting movements to shapes it detects. Now, we can't see inside AlphaGo unless DeepMind decides they want to share some of the visualizations of its intermediate representations. I hope they do, as I bet they'd offer a lot of insight into both the game of Go and how AlphaGo specifically is reasoning about it.

I'm not sure seeing visualizations of AlphaGo's intermediate representations would offer much insight into either the game of Go or how AlphaGo reasons about it, but I would love to find out.

One of the things that drew me to AI when I was in high school and college was the idea that computer programs might be able to help us understand the world better. At the most prosaic level, I thought this might happen in what we had to learn in order to write an intelligent program, and in how we structured the code that we wrote. At a more interesting level, I thought that we might have a new kind of intelligence with which to interact, and this interaction would help us to learn more about the domain of the program's expertise.

Alas, computer chess advanced mostly by making computers that were even faster at applying the sort of knowledge we already have. In other domains, neural networks and then statistical approaches led to machines capable of competent or expert performance, but their advances were opaque. The programs might shed light on how to engineer systems, but the systems themselves didn't have much to say to us about their domains of expertise or competence.

Intelligent programs, but no conversation. Even when we play thousands of games against a chess computer, the opponent seems otherworldly, with no new principles emerging. Perhaps new principles are there, but we cannot see them. Unfortunately, chess computers cannot explain their reasoning to us; they cannot teach us. The result is much less interesting to me than my original dreams for AI.

Perhaps we are reaching a point now where programs such as AlphaGo can display the sort of holistic, integrated intelligence that enables them to teach us something about the game -- even if only by playing games with us. If it turns out that neural nets, which are essentially black boxes to us, are the only way to achieve AI that can work with us at a cognitive level, I will be chagrined. And most pleasantly surprised.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

March 07, 2016 5:12 PM

Solving a Fun Little Puzzle with Analysis and Simulation

I'm on a mailing list of sports fans who happen also to be geeks of various kinds, including programmers and puzzle nuts. Last Friday, one of my friends posted this link and puzzle to the list:

Two players go on a hot new game show called "Higher Number Wins". The two go into separate booths, and each presses a button, and a random number between zero and one appears on a screen. (At this point, neither knows the other's number, but they do know the numbers are chosen from a standard uniform distribution.) They can choose to keep that first number, or to press the button again to discard the first number and get a second random number, which they must keep. Then, they come out of their booths and see the final number for each player on the wall. The lavish grand prize -- a case full of gold bouillon -- is awarded to the player who kept the higher number. Which number is the optimal cutoff for players to discard their first number and choose another? Put another way, within which range should they choose to keep the first number, and within which range should they reject it and try their luck with a second number?

From there, the conversation took off quickly with a lot of intuition and analysis. There was immediate support for the intuitive threshold of 0.5, which a simple case analysis shows to give the maximum expected value for a player, 0.625. Some head-to-head analysis of various combinations, however, showed other values winning more often than 0.5, with values around 0.6 doing the best.

What was up? One of these analyses was wrong, but we weren't sure which. One list member, who had built a quick model in Excel, said,

I think the optimum may be somewhere around .61, but I'm damned if I know why.

Another said,

I can't help thinking we're ignoring something game theoretical. I'm expecting we've all arrived at the most common wrong answer.

To confirm his suspicions, this person went off and wrote a C program -- a "terrible, awful, ugly, horrible C program" -- to compute all the expected values for all possible head-to-head cases. He announced,

We have a winner, according to my program. ... 61 wins.

He posted a snippet of the output from his program, which showed a steady rise in the win percentages for cutoffs up to a threshold of 0.61, which beat the 99 other cutoffs, with a steady decline for cutoffs thereafter.

Before reading the conversation on the mailing list, I discussed the puzzle with my wife. We were partial to 0.5, too, but that seemed too simple... So I sat down and wrote a program of my own.

My statistics skills are not as strong as those of many of my friends, and for this reason I like to write programs that simulate the situation at hand. My Racket program creates players who use all possible thresholds, plays matches of 1,000,000 games between each pair, and tallies up the results. Like the C program written by my buddy, my program is quick and dirty; it replays all hundred combinations on each pass, without taking advantage of the fact that the tournament matrix is symmetric. It's slower than it needs to be, but it gets the job done.

Player 57 defeats 95 opponents.
Player 58 defeats 96 opponents.
Player 59 defeats 99 opponents.
Player 60 defeats 100 opponents.
Player 61 defeats 98 opponents.
Player 62 defeats 96 opponents.
Player 63 defeats 93 opponents.

The results of my simulation mirrored the results of the brute-force case analysis. In simulation, 0.6 won, with 0.59 and 0.61 close behind. The two approaches gave similar enough results that it's highly likely there are bugs in neither program -- or both!
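The simulation is easy to sketch. This is not the Racket program described above, just a minimal Python version of the same idea; the names play, win_rate, and tournament are invented for the example. Unlike my quick-and-dirty program, it plays each pair only once, and it uses far fewer games per match so that it finishes quickly, which also makes close matches noisy.

```python
import random


def play(threshold, rng):
    """Return the number a player keeps: reroll once if the first is below threshold."""
    first = rng.random()
    return first if first >= threshold else rng.random()


def win_rate(t, s, games, rng):
    """Fraction of games in which a threshold-t player beats a threshold-s player."""
    wins = sum(play(t, rng) > play(s, rng) for _ in range(games))
    return wins / games


def tournament(cutoffs, games, rng):
    """For each cutoff, count how many other cutoffs it beats head to head."""
    defeated = {c: 0 for c in cutoffs}
    for i, a in enumerate(cutoffs):
        for b in cutoffs[i + 1:]:      # each pair is played once
            rate = win_rate(a, b, games, rng)
            if rate > 0.5:
                defeated[a] += 1
            elif rate < 0.5:
                defeated[b] += 1
    return defeated


rng = random.Random(2016)              # fixed seed so runs are repeatable
cutoffs = [k / 10 for k in range(11)]  # 0.0, 0.1, ..., 1.0
for cutoff, count in tournament(cutoffs, 20_000, rng).items():
    print('Player {:.1f} defeats {} opponents.'.format(cutoff, count))
```

With only 20,000 games per match, neighboring cutoffs such as 0.5 and 0.6 can come out either way. Separating them reliably takes many more games, which is why the matches described above ran to 1,000,000.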

Once my friends were confident that 0.5 was not the winner, they were able to diagnose the error in the reasoning that made us think it was the best we could do: Although a player's choice of strategies is independent of the other player's choice, we cannot treat the other player's value as a uniform distribution over [0..1]. That is true only when they choose a threshold of 0 or 1.
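The diagnosis can also be checked without any randomness. The sketch below is mine, not something from the mailing list, and the function names are invented. It relies on one fact that follows from the rules of the game: a player with threshold t ends up with a number at most x with probability t*x, plus (x - t) when x is above t. From that, the head-to-head win probability is an integral, which the code approximates with a simple midpoint rule.

```python
def expected_value(t):
    """Expected final number for a player who rerolls below threshold t."""
    # With probability t the player rerolls and averages 1/2; otherwise the
    # kept first number is uniform on [t, 1] and averages (1 + t) / 2.
    return t * 0.5 + (1 - t) * (1 + t) / 2


def density(x, t):
    """Density of the final number for a threshold-t player."""
    # A reroll can land anywhere (weight t); a kept number adds weight 1 on [t, 1].
    return t + (1.0 if x >= t else 0.0)


def cdf(x, s):
    """Probability that a threshold-s player's final number is at most x."""
    return s * x + max(0.0, x - s)


def win_probability(t, s, steps=100_000):
    """Approximate P(threshold-t player beats threshold-s player)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h              # midpoint of the i-th slice of [0, 1]
        total += density(x, t) * cdf(x, s)
    return total * h


print('{:.3f}'.format(expected_value(0.5)))        # 0.625
print('{:.3f}'.format(win_probability(0.6, 0.5)))  # 0.505
```

A threshold of 0.5 does maximize the expected value, at 0.625, yet a player using 0.6 beats it a bit more than half the time. Maximizing your own average is not the same as maximizing your chance of holding the higher number, because the opponent's final number is not uniform.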

In retrospect, this seems obvious, and maybe it was obvious to my mathematician friends right off the bat. But none of us on the mailing list is a professional statistician. I'm proud that we all stuck with the problem until we understood what was going on.

I love how we can cross-check our intuitions about puzzles like this with analysis and simulation. There is a nice interplay between theory and empirical investigation here. A simple theory, even if incomplete or incorrect, gives us a first approximation. Then we run a test and use the results to go back and re-think our theory. The data helped us see the holes in our thinking. What works for puzzles also works for hairier problems out in the world.

And we created the data we needed by writing a computer program. You know how much I like to do that. This is the sort of situation we see when writing chess-playing programs and machine learning programs: We can write programs that are smarter than we are by starting from much simpler principles that we know and understand.

This experience is also yet another reminder of why, if I ever go freelance as a consultant or developer, I plan to team up with someone who is a better mathematician than I am. Or at least find such a person to whom I can sub-contract a sanity check.

Posted by Eugene Wallingford | Permalink | Categories: Computing

February 24, 2016 2:36 PM

Computer Science is the Discipline of Reinvention

The quote of the day comes courtesy of the inimitable Matthias Felleisen, on the Racket mailing list:

Computer science is the discipline of reinvention. Until everyone who knows how to write 10 lines of code has invented a programming language and solved the Halting Problem, nothing will be settled :-)

One of the great things about CS is that we can all invent whatever we want. One of the downsides is that we all do.

Sometimes, making something simply for the sake of making it is a wonderful thing, edifying and enjoyable. Other times, we should heed the advice carved above the entrance to the building that housed my first office as a young faculty member: Do not do what has already been done. Knowing when to follow each path is a sign of wisdom.

Posted by Eugene Wallingford | Permalink | Categories: Computing

February 14, 2016 11:28 AM

Be Patient, But Expect Better. Then Make Better.

In Reversing the Tide of Declining Expectations Matthew Butterick exhorts designers to expect more from themselves, as well as from the tools they use. When people expect more, other people sometimes tell them to be patient. There is a problem with being patient:

[P]atience is just another word for "let's make it someone else's problem." ... Expectations count too. If you have patience, and no expectations, you get nothing.

But what if you find the available tools lacking and want something better?

Scientists often face this situation. My physicist friends seem always to be rigging up some new apparatus in order to run the experiments they want to run. For scientists and so many other people these days, though, if they want a new kind of tool, they have to write a computer program.

Butterick tells a story that shows designers can do the same:

Let's talk about type-design tools. If you've been at the conference [TYPO Berlin, 2012], maybe you saw Petr van Blokland and Frederick Berlaen talking about RoboFont yesterday. But that is the endpoint of a process that started about 15 years ago when Erik and Petr van Blokland, and Just van Rossum (later joined by many others) were dissatisfied with the commercial type-design tools. So they started building their own. And now, that's a whole ecosystem of software that includes code libraries, a new font-data format called UFO, and applications. And these are not hobbyist applications. These are serious pieces of software being used by professional type designers.

What makes all of this work so remarkable is that there are no professional software engineers here. There's no corporation behind it all. It's a group of type designers who saw what they needed, so they built it. They didn't rely on patience. They didn't wait for someone else to fix their problems. They relied on their expectations. The available tools weren't good enough. So they made better.

That is fifteen years of patience. But it is also patience and expectation in action.

To my mind, this is the real goal of teaching more people how to program: programmers don't have to settle. Authors and web designers create beautiful, functional works. They shouldn't have to settle for boring or cliché type on the web, in their ebooks, or anywhere else. They can make better. Butterick illustrates this approach to design himself with Pollen, his software for writing and publishing books. Pollen is a testimonial to the power of programming for authors (as well as a public tribute to the expressiveness of a programming language).

Empowering professionals to make better tools is a first step, but it isn't enough. Until programming as a skill becomes part of the culture of a discipline, better tools will not always be used to their full potential. Butterick gives an example:

... I was speaking to a recent design-school graduate. He said, "Hey, I design fonts." And I said, "Cool. What are you doing with RoboFab and UFO and Python?" And he said, "Well, I'm not really into programming." That strikes me as a really bad attitude for a recent graduate. Because if type designers won't use the tools that are out there and available, type design can't make any progress. It's as if we've built this great spaceship, but none of the astronauts want to go to Mars. "Well, Mars is cool, but I don't want to drive a spaceship. I like the helmet, though." Don't be that guy. Go the hell to Mars.

Don't be that person. Go to Mars. While you are at it, help the people you know to see how much fun programming can be and, more importantly, how it can help them make things better. They can expect more.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

February 12, 2016 3:34 PM

Computing Everywhere: Detecting Gravitational Waves

a linearly-polarized gravitational wave
Wikimedia Commons (CC BY-SA 3.0 US)

This week the world is excitedly digesting news that the interferometer at LIGO has detected gravitational waves being emitted by the merger of two black holes. Gravitational waves were predicted by Einstein one hundred years ago in his theory of General Relativity. Over the course of the last century, physicists have amassed plenty of indirect evidence that such waves exist, but this is the first time they have detected them directly.

The physics world is understandably quite excited by this discovery. We all should be! This is another amazing moment in science: Build a model. Make a falsifiable prediction. Wait for 100 years to have the prediction confirmed. Wow.

We in computer science can be excited, too, for the role that computation played in the discovery. As physicist Sabine Hossenfelder writes in her explanation of the gravitational wave story:

Interestingly, even though it was long known that black hole mergers would emit gravitational waves, it wasn't until computing power had increased sufficiently that precise predictions became possible. ... General Relativity, though often praised for its beauty, does leave you with one nasty set of equations that in most cases cannot be solved analytically and computer simulations become necessary.

As with so many cool advances in the world these days, whether in the sciences or the social sciences, computational modeling and simulation were instrumental in helping to confirm the existence of Einstein's gravitational waves.

So, fellow computer scientists, celebrate a little. Then, help a young person you know to see why they might want to study CS, alone or in combination with some other discipline. Computing is one of the fundamental tools we need these days in order to contribute to the great tableau of human knowledge. Even Einstein can use a little computational help now and then.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

January 29, 2016 3:43 PM

Marvin Minsky and the Irony of AlphaGo

Semantic Information Processing on my bookshelf
a portion of my bookshelf
(CC BY 3.0 US)

Marvin Minsky, one of the founders of AI, died this week. His book Semantic Information Processing made a big impression on me when I read it in grad school, and his paper Why Programming is a Good Medium for Expressing Poorly Understood and Sloppily-Formulated Ideas remains one of my favorite classic AI essays. The list of his students contains many of the great names from decades of computer science; several of them -- Daniel Bobrow, Bertram Raphael, Eugene Charniak, Patrick Henry Winston, Gerald Jay Sussman, Benjamin Kuipers, and Luc Steels -- influenced my work. Winston wrote one of my favorite AI textbooks ever, one that captured the spirit of Minsky's interest in cognitive AI.

It seems fitting that Minsky left us the same week that Google published the paper Mastering the Game of Go with Deep Neural Networks and Tree Search, which describes the work that led to AlphaGo, a program strong enough to beat an expert human Go player. (This brief article describes the accomplishment and program at a higher level.) One of the key techniques at the heart of AlphaGo is neural networks, an area Minsky pioneered in his mid-1950s doctoral dissertation and continued to work in throughout his career.

In 1969, he and Seymour Papert published a book, Perceptrons, which showed the limitations of a very simple kind of neural network. Stories about the book's claims were quickly exaggerated as they spread to people who had never read the book, and the resulting pessimism stifled neural network research for more than a decade. It is a great irony that, in the week he died, one of the most startling applications of neural networks to AI was announced.

Researchers like Minsky amazed me when I was young, and I am more amazed by them and their lifelong accomplishments as I grow older. If you'd like to learn more, check out Stephen Wolfram's personal farewell to Minsky. It gives you a peek into the wide-ranging mind that made Minsky such a force in AI for so long.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

January 28, 2016 2:56 PM

Remarkable Paragraphs: "Everything Is Computation"

Edge.org's 2016 question for sophisticated minds is, What do you consider the most interesting recent [scientific] news? What makes it important? Joscha Bach's answer is: Everything is computation. Read his essay, which contains some remarkable passages.

Computation changes our idea of knowledge: instead of treating it as justified true belief, knowledge describes a local minimum in capturing regularities between observables.

Epistemology was one of my two favorite courses in grad school (cognitive psych was the other), and "justified true belief" was the starting point for many interesting ideas of what constitutes knowledge. I don't see Bach's formulation as a replacement for justified true belief as a starting point, but rather as a specification of what beliefs are most justified in a given context. Still, Bach's way of using computation in such a concrete way to define "knowledge" is marvelous.

Knowledge is almost never static, but progressing on a gradient through a state space of possible world views. We will no longer aspire to teach our children the truth, because like us, they will never stop changing their minds. We will teach them how to productively change their minds, how to explore the never ending land of insight.

Knowledge is a never-ending process of refactoring. The phrase "how to productively change their minds" reminds me of Jon Udell's recent blog post on liminal thinking at scale. From the perspective that knowledge is a function, "changing one's mind intelligently" is the dynamic computational process that keeps the mind at a local minimum.

A growing number of physicists understand that the universe is not mathematical, but computational, and physics is in the business of finding an algorithm that can reproduce our observations. The switch from uncomputable, mathematical notions (such as continuous space) makes progress possible. Climate science, molecular genetics, and AI are computational sciences. Sociology, psychology, and neuroscience are not: they still seem to be confused by the apparent dichotomy between mechanism (rigid, moving parts) and the objects of their study. They are looking for social, behavioral, chemical, neural regularities, where they should be looking for computational ones.

This is a strong claim, and one I'm sympathetic with. However, I think that the apparent distinction between the computational sciences and the non-computational ones is a matter of time, not a difference in kind. It wasn't that long ago that most physicists thought of the universe in mathematical terms, not computational ones. I suspect that with a little more time, the orientation in other disciplines will begin to shift. Neuroscience and psychology are positioned well for such a phase shift.

In any case, Bach's response points our attention in a direction that has the potential to re-define every problem we try to solve. This may seem unthinkable to many, though perhaps not to computer scientists, especially those of us with an AI bent.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

January 15, 2016 4:02 PM

This Week's Edition of "Amazed by Computers"

As computer scientists get older, we all find ourselves reminiscing about the computers we knew in the past. I sometimes tell my students about using 5.25" floppies with capacities listed in kilobytes, a unit for which they have no frame of reference. It always gets a laugh.

In a recent blog entry, Daniel Lemire reminisces about the Cray 2, "the most powerful computer that money could buy" when he was in high school. It took up more space than an office desk (see some photos here), had 1 GB of memory, and provided a peak performance of 1.9 gigaflops. In contrast, a modern iPhone fits in a pocket, has 1 GB of memory, too, and contains a graphics processing unit that provides more gigaflops than the Cray 2.

I saw Lemire's post a day after someone tweeted this image of a 64 GB memory card from 2016 next to a 2 GB Western Digital hard drive from 1996:

a 64 GB memory card (2016), a 2 GB hard drive (1996)

The youngest students in my class this semester were born right around 1996. Showing them a 1996 hard drive is like my college professors showing me magnetic cores: ancient history.

This sort of story is old news, of course. Even so, I occasionally remember to be amazed by how quickly our hardware gets smaller and faster. I only wish I could improve my ability to make software just as fast. Alas, we programmers must deal with the constraints of human minds and human organizations. Hardware engineers do battle only with the laws of the physical universe.

Lemire goes a step beyond reminiscing to close his entry:

And what if, today, I were to tell you that in 40 years, we will be able to fit all the computational power of your phone into a nanobot that can live in your blood stream?

Imagine the problems we can solve and the beauty we can make with such hardware. The citizens of 2056 are counting on us.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

January 12, 2016 3:58 PM

Peter Naur and "Datalogy"

Peter Naur died early this year at the age of 87. Many of you may know Naur as the "N" in BNF notation. His contributions to CS were much broader and deeper than BNF, though. He received the 2005 Turing Award in recognition of his contributions to programming language and compiler design, including his involvement in the definition of Algol 60. I have always been a huge fan of his essay Programming as Theory Building, which I share with anyone I think might enjoy it.

When Michael Caspersen sent a note to the SIGCSE mailing list, I learned something new about Naur: he coined the term datalogy for "the science of the nature and use of data" and suggested that it might be a suitable replacement for the term "computer science". I had to learn more...

It turns out that Naur coined this term in a letter to the Communications of the ACM, which ran in the July 1966 issue under the headline "The Science of Datalogy". This letter is available online through the ACM digital library. Unfortunately, this is behind a paywall for many of you who might be interested. For posterity, here is an excerpt from that page:

This is to advocate that the following new words, denoting various aspects of our subject, be considered for general adoption (the stress is shown by an accent):
  • datálogy, the science of the nature and use of data,
  • datamátics, that part of datalogy which deals with the processing of data by automatic means,
  • datámaton, an automatic device for processing data.

In this terminology much of what is now referred to "data processing" would be datamatics. In many cases this will be a gain in clarity because the new word includes the important aspect of data representations, while the old one does not. Datalogy might be a suitable replacement for "computer science."

The objection that possibly one of these words has already been used as a proper name of some activity may be answered partly by saying that of course the subject of datamatics is written with a lower case d, partly by remembering that the word "electronics" is used doubly in this way without inconvenience.

What also speaks for these words is that they will transfer gracefully into many other languages. We have been using them extensively in my local environment for the last few months and have found them a great help.

Finally I wish to mention that datamatics and datamaton (Danish: datamatik and datamat) are due to Paul Lindgreen and Per Brinch Hansen, while datalogy (Danish: datalogi) is my own invention.

I also learned from Caspersen's email that Naur was named the first Professor in Datalogy in Denmark, and held that title at the University of Copenhagen until he retired in 1998.

Naur was a pioneer of computing. We all benefit from his work every day.

Posted by Eugene Wallingford | Permalink | Categories: Computing

January 07, 2016 1:52 PM

Parsimony and Obesity on the Web

Maciej Cegłowski is in fine form in his talk The Website Obesity Crisis. In it, he mentions recent projects from Facebook and Google to help people create web pages that load quickly, especially for users of mobile devices. Then he notes that their announcements do not practice what the projects preach:

These comically huge homepages for projects designed to make the web faster are the equivalent of watching a fitness video where the presenter is just standing there, eating pizza and cookies.

There is even more irony in creating special subsets of HTML "designed to be fast on mobile devices".

Why not just serve regular HTML without stuffing it full of useless crap?

William Howard Taft, a president of girth
Wikipedia photo
(photographer not credited)

Indeed. Cegłowski offers a simple way to determine whether the non-text elements of your page are useless, which he dubs the Taft Test:

Does your page design improve when you replace every image with William Howard Taft?

(Taft was an American president and chief justice widely known for his girth.)

My blog is mostly text. I should probably use more images, to spice up the visual appearance and to augment what the text says, but doing so takes more time and skill than I often have at the ready. When I do use images, they tend to be small. I am almost certainly more parsimonious than I need to be for most Internet connections in the 2010s, even wifi.

You will notice that I never embed video, though. I dug into the documentation for HTML and found a handy alternative to use in its place: the web link. It is small and loads fast.

Posted by Eugene Wallingford | Permalink | Categories: Computing

December 11, 2015 2:59 PM

Looking Backward and Forward

Jon Udell looks forward to a time when looking backward digitally requires faithful reanimation of born-digital artifacts:

Much of our cultural heritage -- our words, our still and moving pictures, our sounds, our data -- is born digital. Soon almost everything will be. It won't be enough to archive our digital artifacts. We'll also need to archive the software that accesses and renders them. And we'll need systems that retrieve and animate that software so it, in turn, can retrieve and animate the data.

We already face this challenge. My hard drive is littered with files I have a hard time opening, if I am able to open them at all.

Tim Bray reminds us that many of those "born-digital" artifacts will probably live on someone else's computer, including ones owned by his employer, as computing moves to a utility model:

Yeah, computing is moving to a utility model. Yeah, you can do all sorts of things in a public cloud that are too hard or too expensive in your own computer room. Yeah, the public-cloud operators are going to provide way better up-time, security, and distribution than you can build yourself. And yeah, there was a Tuesday in last week.

I still prefer to have original versions of my documents live on my hardware, even when using a cloud service. Maybe one day I'll be less skeptical, when it really is as unremarkable as Tuesday next week. But then, plain text still seems to me to be the safest way to store most data, so what do I know?

Posted by Eugene Wallingford | Permalink | Categories: Computing

December 09, 2015 2:54 PM

What Is the Best Way to Promote a Programming Language?

A newcomer to the Racket users mailing list asked which was the better way to promote the language: start a discussion on the mailing list, or ask questions on Stack Overflow. After explaining that neither was likely to promote Racket, Matthew Butterick gave some excellent advice:

Here's one good way to promote the language:
  1. Make something impressive with Racket.
  2. When someone asks "how did you make that?", give Racket all the credit.

Don't cut corners in Step 1.

This technique applies to all programming languages.

Butterick has made something impressive with Racket: Practical Typography, an online book. He wrote the book using a publishing system named Pollen, which he created in Racket. It's a great book and a joy to read, even if typography is only a passing interest. Check it out. And he gives Racket and the Racket team a lot of credit.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

December 08, 2015 3:55 PM

A Programming Digression: Generating Excellent Numbers


Whenever I teach my compiler course, it seems as if I run across a fun problem or two to implement in our source language. I'm not sure if that's because I'm looking or because I'm just lucky to read interesting blogs and Twitter feeds.

Farey sequences as Ford circles

For example, during a previous offering, I read on John Cook's blog about Farey's algorithm for approximating real numbers with rational numbers. This was a perfect fit for the sort of small language that my students were writing a compiler for, so I took a stab at implementing it. Because our source language, Klein, was akin to an integer assembly language, I had to unravel the algorithm's loops and assignment statements into function calls and if statements. The result was a program that computed an interesting result and that tested my students' compilers in a meaningful way. The fact that I had great fun writing it was a bonus.

This Semester's Problem

Early this semester, I came across the concept of excellent numbers. A number m is "excellent" if, when you split the sequence of its digits into two halves, a and b, b² - a² equals m. 48 is the only two-digit excellent number (8² - 4² = 48), and 3468 is the only four-digit excellent number (68² - 34² = 3468). Working with excellent numbers requires only integers and arithmetic operations, which makes them a perfect domain for our programming language.
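To make the definition concrete, here is a minimal sketch of a checker that uses only integer arithmetic. The function name is_excellent is a hypothetical helper for illustration, not something from the posts cited here.

```python
def is_excellent(m):
    """Return True if m has an even number of digits and b*b - a*a == m,
    where a is the front half of m's digits and b is the back half."""
    # Count the digits of m using only integer division.
    digits = 0
    rest = m
    while rest > 0:
        digits += 1
        rest //= 10
    if digits == 0 or digits % 2 != 0:
        return False
    # Split m into its front half a and its back half b.
    half = 10 ** (digits // 2)
    a = m // half
    b = m % half
    return b * b - a * a == m


print(is_excellent(48))    # 8*8 - 4*4 = 64 - 16 = 48
print(is_excellent(3468))  # 68*68 - 34*34 = 4624 - 1156 = 3468
print(is_excellent(49))    # 9*9 - 4*4 = 65, which is not 49
```

A brute-force search with this checker gets slow quickly as the length grows, which is why the reduction in search space described next matters.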

My first encounter with excellent numbers was Brian Foy's Computing Excellent Numbers, which discusses ways to generate numbers of this form efficiently in Perl. Foy uses some analysis by Mark Jason Dominus, written up in An Ounce of Theory Is Worth a Pound of Search, that drastically reduces the search space for candidate a's and b's. A commenter on the Programming Praxis article uses the same trick to write a short Python program to solve that challenge. Here is an adaptation of that program which prints all of the 10-digit excellent numbers:

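    # a is the five-digit front half; solve for the back half b
    for a in range(10000, 100000):
        b = ((4*a**2 + 400000*a + 1)**0.5 + 1) / 2.0
        # keep a only when b comes out as a whole number
        if b == int(b):
            print(int(str(a) + str(int(b))))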

I can't rely on strings or real numbers to implement this in Klein, but I could see some alternatives... Challenge accepted!

My Standard Technique

We do not yet have a working Klein compiler in class, so I prefer not to write complex programs directly in the language. It's too hard to get subtle semantic issues correct without being able to execute the code. What I usually do is this:

  • Write a solution in Python.
  • Debug it until it is correct.
  • Slowly refactor the program until it uses only a Klein-like subset of Python.

This produces what I hope is a semantically correct program, using only primitives available in Klein.

Finally, I translate the Python program into Klein and run it through my students' Klein front-ends. This parses the code to ensure that it is syntactically correct and type-checks the code to ensure that it satisfies Klein's type system. (Manifest types is the one feature Klein has that Python does not.)

As mentioned above, Klein is something like integer assembly language, so converting to a Klein-like subset of Python means giving up a lot of features. For example, I have to linearize each loop into a sequence of one or more function calls, recursing at some point back to the function that kicks off the loop. You can see this at play in my Farey's algorithm code from before.
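As a rough illustration of that linearization, here is a sketch in Python. The example problem (summing squares) and all of the function names are hypothetical, chosen only to show the shape of the transformation. The second version restricts itself to integers, if expressions, and function calls.

```python
# Ordinary Python: a loop with assignment statements.
def sum_squares_loop(lo, hi):
    total = 0
    i = lo
    while i <= hi:
        total = total + i * i
        i = i + 1
    return total


# Klein-like subset: the loop variables become parameters, and each
# iteration becomes a call back to the function that drives the loop.
def sum_squares(lo, hi):
    return sum_squares_from(lo, hi, 0)


def sum_squares_from(i, hi, total):
    if i > hi:
        return total
    else:
        return sum_squares_from(i + 1, hi, total + i * i)


print(sum_squares_loop(1, 10))
print(sum_squares(1, 10))
```

Every loop iteration now costs a stack frame in Python, so the recursive form runs out of room on long loops unless the implementation handles tail calls.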

I also have to eliminate all data types other than booleans and integers. For the program to generate excellent numbers, the most glaring hole is a lack of real numbers. The algorithm shown above depends on taking a square root, getting a real-valued result, and then coercing a real to an integer. What can I do instead?

the iterative step in Newton's method

Not to worry. sqrt is not a primitive operator in Klein, but we have a library function. My students and I implement useful utility functions whenever we encounter the need and add them to a file of definitions that we share. We then copy these utilities into our programs as needed.

sqrt was one of the first complex utilities we implemented, years ago. It uses Newton's method to approximate the square root of an integer. For perfect squares, it returns the argument's true square root. For all other integers, it returns the largest integer less than or equal to the true root.
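The course's actual Klein utility is not shown here, so the following is only a sketch of how an integer-only Newton's method with that behavior might look, written in the Klein-like subset of Python. The names isqrt_newton and newton_step are assumptions made for illustration.

```python
def isqrt_newton(n):
    """Largest integer r with r*r <= n, for n >= 0, using integer arithmetic only."""
    if n < 2:
        return n
    else:
        return newton_step(n, n)


def newton_step(n, guess):
    # Newton's update for f(x) = x*x - n, done with integer division.
    # Starting from a guess at or above the true root, the guesses
    # decrease until they stop improving; that guess is the floor.
    if (guess + n // guess) // 2 >= guess:
        return guess
    else:
        return newton_step(n, (guess + n // guess) // 2)


print(isqrt_newton(16))  # a perfect square
print(isqrt_newton(15))  # not a perfect square, so the result is rounded down
```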

With this answer in hand, we can change the Python code that checks whether a purported square root b is an integer using type coercion:

    b == int(b)

into Klein code that checks whether the square of a square root equals the original number:

    isSquareRoot(r : integer, n : integer) : boolean
      n = r*r

(Klein is a pure functional language, so the return statement is implicit in the body of every function. Also, without assignment statements, Klein can use = as a boolean operator.)

Generating Excellent Numbers in Klein

I now have all the Klein tools I need to generate excellent numbers of any given length. Next, I need to generalize the formula at the heart of the Python program to work for lengths other than 10.

For any given desired length, let n = length/2. We can write any excellent number m in two ways:

  • a·10ⁿ + b (which defines it as the concatenation of its front and back halves)
  • b² - a² (which defines it as excellent)

If we set the two m's equal to one another and solve for b, we get:

    b = (1 + sqrt[4a² + 4(10ⁿ)a + 1]) / 2

Now, as in the algorithm above, we loop through all values for a with n digits and find the corresponding value for b. If b is an integer, we check to see if m = ab is excellent.
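The generalized search can be sketched in ordinary Python as follows. This is an illustration under stated assumptions, not the program described in this post: the name excellent_numbers is hypothetical, and it uses the standard library's math.isqrt (Python 3.8 and later) in place of the Klein sqrt utility so that everything stays in integers.

```python
from math import isqrt


def excellent_numbers(length):
    """Return all excellent numbers with the given (even) number of digits."""
    if length < 2 or length % 2 != 0:
        raise ValueError("length must be a positive even number")
    n = length // 2
    scale = 10 ** n
    found = []
    # a ranges over every n-digit front half.
    for a in range(scale // 10, scale):
        # b = (1 + sqrt(4a² + 4·10ⁿ·a + 1)) / 2, so test the value under the root.
        disc = 4 * a * a + 4 * scale * a + 1
        r = isqrt(disc)
        if r * r == disc:           # the same test as isSquareRoot
            b = (1 + r) // 2        # disc is odd, so r is odd and 1 + r is even
            if b < scale:           # b must fit in the back n digits
                found.append(a * scale + b)
    return found


print(excellent_numbers(2))
print(excellent_numbers(4))
```

Building m as a * scale + b instead of concatenating strings keeps the computation within integer arithmetic, which is the constraint Klein imposes.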

The Python loop shown above works plenty fast, but Klein doesn't have loops. So I refactored the program into one that uses recursion. This program is slower, but it works fine for numbers up to length 6:

    > python3.4 6

Unfortunately, this version blows out the Python call stack for length 8. I set the recursion limit to 50,000, which helps for a while...

    > python3.4 8
    Segmentation fault: 11


Next Step: See Spot Run

The port to an equivalent Klein program was straightforward. My first version had a few small bugs, which my students' parsers and type checkers helped me iron out. Now I await their full compilers, due at the end of the week, to see it run. I wonder how far we will be able to go in the Klein run-time system, which sits on top of a simple virtual machine.

If nothing else, this program will repay any effort my students make to implement the proper handling of tail calls! That will be worth a little extra-credit...

This programming digression has taken me several hours spread out over the last few weeks. It's been great fun! The purpose of Klein is to help my students learn to write a compiler. But the programmer in me has fun working at this level, trying to find ways to implement challenging algorithms and then refactoring them to run deeper or faster. I'll let you know the results soon.

I'm either a programmer or crazy. Probably both.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 01, 2015 4:38 PM

A Non-Linear Truth about Wealth and Happiness

This tweet has been making the rounds again the last few days. It pokes good fun at the modern propensity to overuse the phrase 'exponential growth', especially in situations that aren't exponential at all. This usage has even invaded the everyday speech of many of my scientist friends, and I'm probably guilty more than I'd like to admit.

In The Day I Became a Millionaire, David Heinemeier Hansson avoids this error when commenting on something he's learned about wealth and happiness:

The best things in life are free. The second best things are very, very expensive. -- Coco Chanel
While the quote above rings true, I'd add that the difference between the best things and the second best things is far, far greater than the difference between the second best things and the twentieth best things. It's not a linear scale.

I started to title this post "A Power Law of Wealth and Happiness" before realizing that I was falling into a similar trap common among computer scientists and software developers these days: calling every function with a steep end and a long tail "a power law". DHH does not claim that the relationship between cost and value is exponential, let alone that it follows a power law. I reined in my hyperbole just in time. "A Non-Linear Truth ..." may not have quite the same weight of power law academic-speak, but it sounds just fine.

By the way, I agree with DHH's sentiment. I'm not a millionaire, but most of the things that contribute to my happiness would scarcely be improved by another zero or two in my bank account. A little luck at birth afforded me almost all of what I need in life, as it has many other people. The rest is an expectations game that is hard to win by accumulating more.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

November 26, 2015 11:04 AM

I Am Thankful for Programming

I smiled a big smile when I read this passage in an interview with Victoria Gould, a British actor and mathematician:

And just as it did when she was at school, maths still brings Victoria relief and reassurance. "When teaching or acting becomes stressful, I retreat to maths a lot for its calmness and its patterns. I'll quite often, in a stressful time, go off and do a bit of linear algebra or some trigonometric identities. They're hugely calming for me." Maths as stress relief? "Absolutely, it works every time!"

It reminded me of a former colleague, a mathematician who now works at Ohio University. He used to say that he had pads and pencils scattered on tables and counters throughout his house, because "I never know when I'll get the urge to do some math."

Last night, I came home after a couple of days of catching up on department work and grading. Finally, it was time to relax for the holiday. What did I do first? I wrote a fun little program in Python to reverse an integer, using only arithmetic operators. Then I watched a movie with my wife. Both relaxed me.
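For the curious, a program of that kind might look something like the sketch below. It is not the program I wrote that night, just one way to do it with integer division and remainder, and the function names are made up for this example.

```python
def reverse_integer(n):
    """Reverse the decimal digits of a non-negative integer using only arithmetic."""
    return reverse_into(n, 0)


def reverse_into(rest, acc):
    # Peel the last digit off rest and push it onto the end of acc.
    if rest == 0:
        return acc
    else:
        return reverse_into(rest // 10, acc * 10 + rest % 10)


print(reverse_integer(1234))
print(reverse_integer(120))  # trailing zeros vanish when the digits are reversed
```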

I was fortunate as a child to find solace in fiddling with numbers and patterns. Setting up a system of equations and solving it algebraically was fun. I could while away many minutes playing with the square root key on my calculator, trying to see how long it would take me to drive a number to 1.

Then in high school I discovered programming, my ultimate retreat.

On this day, I am thankful for many people and many things, of course. But Gould's comments remind me that I am also thankful for the privilege of knowing how to program, and for the way it allows me to escape into a world away from stress and distraction. This is a gift.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

November 20, 2015 6:02 PM

The Scientific Value of Reading Old Texts

In Numbers, Toys and Music, the editors of Plus Magazine interview Manjul Bhargava, who won a 2014 Fields Medal for his work on a problem involving a certain class of square numbers.

Bhargava talked about getting his start on problems of this sort not by studying Gauss's work from the nineteenth century, but by reading the work of the seventh-century mathematician Brahmagupta in the original Sanskrit. He said it was exciting to read original texts and old translations of original texts from at least two perspectives. Historically, you see an idea as it is encountered and discovered. It's an adventure story. Mathematically, you see the idea as it was when it was discovered, before it has been reinterpreted over many years by more modern mathematicians, using newer, fancier, and often more complicated jargon than was available to the original solver of the problem.

He thinks this is an important step in making a problem your own:

So by going back to the original you can bypass the way of thinking that history has somehow decided to take, and by forgetting about that you can then take your own path. Sometimes you get too influenced by the way people have thought about something for 200 years, that if you learn it that way that's the only way you know how to think. If you go back to the beginning, forget all that new stuff that happened, go back to the beginning. Think about it in a totally new way and develop your own path.

Bhargava isn't saying that we can ignore the history of math since ancient times. In his Fields-winning work, he drew heavily on ideas about hyperelliptic curves that were developed over the last century, as well as computational techniques unavailable to his forebears. He was prepared with experience and deep knowledge. But by going back to Brahmagupta's work, he learned to think about the problem in simpler terms, unconstrained by the accumulated expectations of modern mathematics. Starting from a simpler set of ideas, he was able to make the problem his own and find his own way toward a solution.

This is good advice in computing as well. When CS researchers tell us to read the work of McCarthy, Newell and Simon, Sutherland, and Engelbart, they are channeling the same wisdom that helped Manjul Bhargava discover new truths about the structure of square numbers.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 19, 2015 2:45 PM

Hope for the Mature Researcher

In A Primer on Graph Isomorphism, Lance Fortnow puts László Babai's new algorithm for the graph isomorphism problem into context. To close, he writes:

Also we think of theory as a young person's game, most of the big breakthroughs coming from researchers early in their careers. Babai is 65, having just won the Knuth Prize for his lifetime work on interactive proofs, group algorithms and communication complexity. Babai uses his extensive knowledge of combinatorics and group theory to get his algorithm. No young researcher could have had the knowledge base or maturity to be able to put the pieces together the way that Babai did.

We often hear that research, especially research aimed at solving our deepest problems, is a young person's game. Great work takes a lot of stamina. It often requires a single-minded focus that comes naturally to a young person but which is a luxury unavailable to someone with a wider set of obligations beyond work. Babai's recent breakthrough reminds us that other forces are at play, that age and broad experience can be advantages, too.

This passage serves as a nice counterweight to Garrison Keillor's The slow rate of learning... line, quoted in my previous post. Sometimes, slow and steady are what it takes to get a big job done.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

October 22, 2015 4:22 PM

Aramaic, the Intermediate Language of the Ancient World

My compiler course is making the transition from the front end to the back end. Our attention is on static analysis of abstract syntax trees and will soon turn to other intermediate representations.

In the compiler world, an "intermediate representation" or intermediate language is a notation used as a stepping stone between the abstract syntax tree and the machine language that is ultimately produced. Such a stepping stone allows the compiler to take smaller steps in the translation process and makes it easier to improve the code before getting down into the details of machine language.
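To make the idea concrete, here is a minimal sketch in Python of one common kind of intermediate representation, three-address code. The tuple-based AST format, the function name lower, and the temporary names t1, t2, ... are assumptions made for illustration, not the representation used in my course.

```python
from itertools import count

def lower(ast):
    """Translate a nested-tuple AST into three-address instructions.

    A node is a number, a variable name (str), or a tuple (op, left, right).
    Returns (list_of_instructions, name_holding_the_result).
    """
    code = []
    temps = count(1)

    def walk(node):
        # Leaves need no instruction; they are already simple operands.
        if isinstance(node, (int, str)):
            return str(node)
        op, left, right = node
        l = walk(left)
        r = walk(right)
        # Each interior node becomes one small step into a fresh temporary.
        t = f"t{next(temps)}"
        code.append(f"{t} = {l} {op} {r}")
        return t

    return code, walk(ast)

# (a + b) * (c - 4)
ast = ("*", ("+", "a", "b"), ("-", "c", 4))
code, result = lower(ast)
for line in code:
    print(line)
```

Each instruction does one small thing, which is what makes later improvement passes and machine-code generation easier to write.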

We sometimes see intermediate languages in the "real world", too. They tend to arise as a result of cultural and geopolitical forces and, while they usually serve different purposes in human affairs than in compiler affairs, they still tend to be practical stepping stones to another language.

Consider the case of Darius I, whose Persian armies conquered most of the Middle East around 500 BC. As John McWhorter writes in The Atlantic, at the time of Darius's conquest,

... Aramaic was so well-entrenched that it seemed natural to maintain it as the new empire's official language, instead of using Persian. For King Darius, Persian was for coins and magnificent rock-face inscriptions. Day-to-day administration was in Aramaic, which he likely didn't even know himself. He would dictate a letter in Persian and a scribe would translate it into Aramaic. Then, upon delivery, another scribe would translate the letter from Aramaic into the local language. This was standard practice for correspondence in all the languages of the empire.

For sixty years, many compiler writers have dreamed of a universal intermediate language that would ease the creation of compilers for new languages and new machines, to no avail. But for several hundred years, Aramaic was the intermediate representation of choice for a big part of the Western world! Alas, Greek and Arabic later came along to supplant Aramaic, which now seems to be on a path to extinction.

This all sounds a lot like the world of programming, in which languages come and go as we develop new technologies. Sometimes a language, human or computer, takes root for a while as the result of historical or technical forces. Then a new regime or a new culture rises, or an existing culture gains in influence, and a different language comes to dominate.

McWhorter suggests that English may have risen to prominence at just the right moment in history to entrench itself as the world's intermediate language for a good long run. We'll see. Human languages and computer languages may operate on different timescales, but history treats them much the same.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

October 18, 2015 10:42 AM

What a Tiny Language Can Teach Us About Gigantic Systems

StrangeLoop is long in the books for most people, but I'm still thinking about some of the things I learned there. This is the first of what I hope to be a few more posts on talks and ideas still on my mind.

The conference opened with a keynote address by Peter Alvaro, who does research at the intersection of distributed systems and programming languages. The talk was titled "I See What You Mean", but I was drawn in more by his alternate title: "What a Tiny Language Can Teach Us About Gigantic Systems". Going in, I had no idea what to expect from this talk and so, in an attitude whose pessimism surprised me, I expected very little. Coming out, I had been surprised in the most delightful way.

Alvaro opened with the confounding trade-off of all abstractions: Hiding the distracting details of a system can illuminate the critical details (yay!), but the boundaries of an abstraction lock out the people who want to work with the system in a different way (boo!). He illustrated the frustration felt by those who are locked out with a tweet from @pxlplz:

SELECT bs FROM table WHERE sql="arrgh" ORDER BY hate

From this base, Alvaro moved on to his personal interests: query languages, semantics, and distributed systems. When modeling distributed systems, we want a language that is resilient to failure and tolerant of a loose ordering on the execution of operations. But we also need a way to model what programs written in the language mean. The common semantic models express a common split in computing:

  • operational semantics: a program means what it does
  • model-theoretic semantics: a program means the set of facts that makes it true

With query languages, we usually think of programs in terms of the databases of facts that make them true. In many ways, the streaming data of a distributed system is a dual to the database query model. In the latter, program control flows down to fixed data. In distributed systems, data flows down to fixed control units. If I understood Alvaro correctly, his work seeks to find a sweet spot amid the tension between these two models.

Alvaro walked through three approaches to applicative programming. In the simplest form, we have three operators: select (σ), project (Π), and join (⋈). The database language SQL adds to this set negation (¬). The Prolog subset Datalog makes computation of the least fixed point a basic operation. Datalog is awesome, says Alvaro, but not if you add ¬! That creates a language with too much power to allow the kind of reasoning we want to do about a program.
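A rough sketch of these operators in Python may help. It treats a relation as a set of tuples and computes a least fixed point the way a Datalog engine might evaluate a recursive rule naively. The function names and the edge/path example are my own assumptions for illustration; they do not come from the talk.

```python
def select(rel, pred):
    """Keep only the rows that satisfy pred."""
    return {row for row in rel if pred(row)}

def project(rel, *cols):
    """Keep only the named column positions of each row."""
    return {tuple(row[c] for c in cols) for row in rel}

def join(r, s):
    """Join r and s where the last column of r equals the first column of s."""
    return {a + b[1:] for a in r for b in s if a[-1] == b[0]}

def least_fixed_point(step):
    """Apply step repeatedly, starting from no facts, until nothing new appears."""
    current = set()
    while True:
        nxt = current | step(current)
        if nxt == current:
            return current
        current = nxt

edge = {("a", "b"), ("b", "c"), ("c", "d")}

# In Datalog notation:
#   path(X, Y) :- edge(X, Y).
#   path(X, Z) :- path(X, Y), edge(Y, Z).
def step(path):
    return edge | project(join(path, edge), 0, 2)

path = least_fixed_point(step)
from_a = select(path, lambda row: row[0] == "a")
```

Because the step function only ever adds facts, the loop must reach a point where nothing changes. Negation breaks that guarantee, since a new fact can invalidate an old conclusion.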

Declarative programs don't have assignment statements, because they introduce time into a model. An assignment statement effectively partitions the past (in which an old value holds) from the present (characterized by the current value). In a program with state, there is a hidden clock inside the program.

We all know the difficulty of managing state in a standard system. Distributed systems create a new challenge. They need to deal with time, but a relativistic time in which different programs seem to be working on their own timelines. Alvaro gave a couple of common examples:

  • a sender crashes, then restarts and begins to replay a set of transactions
  • a receiver enters garbage collection, then comes back to life and begins to respond to queued messages

A language that helps us write better distributed systems must give us a way to model relativistic time without a hidden universal clock. The rest of the talk looked at some of Alvaro's experiments aimed at finding such languages for distributed systems, building on the ideas he had introduced earlier.

The first was Dedalus, billed as "Datalog in time and space". In Dedalus, knowledge is local and ephemeral. It adds two temporal operators to the set found in SQL: @next, for making assertions about the future, and @async, for making assertions of independence between operations. Computation in Dedalus is rendezvous between data and control. Program state is a deduction.

But what of semantics? Alas, a Dedalus program has an infinite number of models, each model itself infinite. The best we can do is to pull at all of the various potential truths and hope for quiescence. That's not comforting news if you want to know what your program will mean while operating out in the world.

Dedalus as the set of operations {σ, Π, ⋈, ¬, @next, @async} takes us back to the beginning of the story: too much power for effective reasoning about programs.

However, Dedalus minus ¬ seems to be a sweet spot. As an abstraction, it hides state representation and control flow and illuminates data, change, and uncertainty. This is the direction Alvaro and his team are moving in now. One result is Bloom, a small new language founded on the Dedalus experiment. Another is Blazes, a program analysis framework that identifies potential inconsistencies in a distributed program and generates the code needed to ensure coordination among the components in question. Very interesting stuff.

Alvaro closed by returning to the idea of abstraction and the role of programming language. He is often asked why he creates new programming languages rather than working in existing languages. In either approach, he points out, he would be creating abstractions, whether with an API or a new syntax. And he would have to address the same challenges:

  • Respect users. We are they.
  • Abstractions leak. Accept that and deal with it.
  • It is better to mean well than to feel good. Programs have to do what we need them to do.

Creating a language is an act of abstraction. But then, so is all of programming. Creating a language specific to distributed systems is a way to make very clear what matters in the domain and to provide both helpful syntax and clear, reliable semantics.

Alvaro admits that this answer hides the real reason that he creates new languages:

Inventing languages is dope.

At the end of this talk, I understood its title, "I See What You Mean", better than I did before it started. The unintended double entendre made me smile. This talk showed how language interacts with problems in all areas of computing, the power language gives us as well as the limits it imposes. Alvaro delivered a most excellent keynote address and opened StrangeLoop on a high note.

Check out the full talk to learn about all of this in much greater detail, with the many flourishes of Alvaro's story-telling.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 15, 2015 8:18 AM

Perfection Is Not A Pre-Requisite To Accomplishing Something Impressive

In Not Your Typical Role Model, mathematician Hannah Fry tells us some of what she learned about Ada Lovelace, "the 19th century programmer", while making a film about her. Not all of it was complimentary. She concludes:

Ada was very, very far from perfect, but perfection is not a pre-requisite to accomplishing something impressive. Our science role models shouldn't always be there to celebrate the unachievable.

A lot of accomplished men of science were far from perfect role models, too. In the past, we've often been guilty of covering up bad behavior to protect our heroes. These days, we sometimes rush to judge them. Neither inclination is healthy.

By historical standards, it sounds like Lovelace's imperfections were all too ordinary. She was human, like us all. Lovelace thought some amazing things and wrote them down for us. Let's celebrate that.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

September 29, 2015 4:01 PM

StrangeLoop: Pixie, big-bang, and a Mix Tape

The second day of StrangeLoop was as eventful as the first. Let me share notes on a few more of the talks that I enjoyed.

the opening screen for the Pixie talk

In the morning, I learned more about Pixie, a small, lightweight version of Lisp at the talk by Timothy Baldridge, its creator. Why another Lisp, especially when its creator already knows and enjoys Clojure? Because he wanted to explore ideas he had come across over the years. Sometimes, there is no better way to do that than to create your own language. This is what programmers do.

Pixie uses Clojure-like syntax and semantics, but it diverges enough from the Clojure spec that he didn't want to be limited by calling it a variant of Clojure. "Lisp" is a generic term these days and doesn't carry as much specific baggage.

Baldridge used RPython and the PyPy tool chain to create Pixie, which has its own virtual machine and its own bytecode format. It also has garbage collection. (Implementors of RPython-based languages don't have to write their own GC; they write hints to a GC generator, which layers the GC code into the generated C of the interpreter.) Pixie also offers strong interoperation with C, which makes it possible to speed up the hot spots in a program even further.

For me, the most interesting part of Pixie is its tracing just-in-time compiler. A just-in-time compiler, or "JIT", generates target code at run time, when a specific program provides more information to the translator than just the language grammar. A tracing JIT records frequently executed operations, especially in and around loops, in order to get the information the code generator needs to produce its output. Tracing JITs are an attractive idea for implementing a Lisp, in which programs tend to consist of many small functions. A tracing JIT can dive deep through all those calls and generate small, tight code.
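Here is a toy sketch in Python of the recording half of that idea: an interpreter counts how often each backward jump is taken and, once a loop is hot, records the instructions executed during one pass through it. The instruction set, the HOT threshold, and the function name run are all invented for illustration. This is not how Pixie or PyPy is implemented, and a real tracing JIT would go on to compile the trace to machine code with guards.

```python
HOT = 3  # assumed threshold: a loop is "hot" after this many back-edges

def run(program, regs):
    """Interpret program, returning the final registers and any recorded traces."""
    pc = 0
    counts = {}       # loop header pc -> number of backward jumps taken to it
    traces = {}       # loop header pc -> instructions executed in one iteration
    recording = None  # (header_pc, ops_so_far) while a trace is being recorded
    while pc < len(program):
        ins = program[pc]
        if recording is not None:
            recording[1].append(ins)
        op = ins[0]
        if op == "add":        # dst = a + b, all registers
            _, dst, a, b = ins
            regs[dst] = regs[a] + regs[b]
            pc += 1
        elif op == "addi":     # dst = src + constant
            _, dst, src, k = ins
            regs[dst] = regs[src] + k
            pc += 1
        elif op == "jlt":      # jump to target if a < b
            _, a, b, target = ins
            taken = regs[a] < regs[b]
            if taken and target <= pc:  # a backward jump closes a loop
                if recording is not None and recording[0] == target:
                    traces[target] = recording[1]  # one full iteration captured
                    recording = None
                counts[target] = counts.get(target, 0) + 1
                if counts[target] == HOT and target not in traces:
                    recording = (target, [])       # start tracing next iteration
            pc = target if taken else pc + 1
        else:
            raise ValueError(f"unknown instruction: {ins!r}")
    return regs, traces

# total = 0 + 1 + ... + (n - 1)
program = [
    ("add", "total", "total", "i"),
    ("addi", "i", "i", 1),
    ("jlt", "i", "n", 0),
]
regs, traces = run(program, {"i": 0, "total": 0, "n": 10})
```

The trace is a straight line of operations with no calls or branches left in it, which is what lets a code generator produce small, tight code.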

the opening screen for the 'mix tape' talk

Rather than give a detailed talk about a specific language, David Nolen and Michael Bernstein gave a talk they dubbed "a mix tape to lovers of programming languages". They presented an incomplete and very personal look at the history of programming languages, and at each point on the timeline they offered artists whose songs reflected a similar development in the music world. Most of the music was out of the mainstream, where connoisseurs such as Nolen and Bernstein find and appreciate hidden gems. Some of it sounded odd to a modern ear, and at one point Bernstein gently assured the audience, "It's okay to laugh."

The talk even taught me one bit of programming history that I didn't know before. Apparently, John Backus got his start by trying to make a hi-fi stereo system for listening to music! This got him into radios and electronics, and then into programming computers at the hardware level. Bernstein quoted Backus as saying, "I figured there had to be a better way." This adds a new dimension to something I wrote on the occasion of Backus's passing. Backus eventually found himself writing programs to compute missile trajectories on the IBM 701 mainframe. "Much of my work has come from being lazy," Backus said, so "I started work on a programming system to make it easier to write programs." The result was Fortran.

Nolen and Bernstein introduced me to some great old songs, as well as several new artists. Songs by jazz pianist Cecil Taylor and jazz guitarist Sonny Sharrock made particular impressions on me, and I plan to track down more of their work.

it all started with a really big-bang

Matthias Felleisen combined history and the details of a specific system in his talk about big-bang, the most recent development in a 20-year journey to find ways to integrate programming with mathematical subjects in a way that helps students learn both topics better. Over that time, he and the PLT team have created a sequence of Racket libraries and languages that enable students to begin with middle-school pre-algebra and progress in smooth steps all the way up to college-level software engineering. He argued near the end of his talk that the progression does not stop there, as extension of big-bang has led to bona fide CS research into network calculus.

All of this programming is done in the functional style. This is an essential constraint if we wish to help students learn and do real math. Felleisen declared boldly "I don't care about programming per se" when it comes to programming in the schools. Even students who never write another program again should have learned something valuable in the content area.

The meat of the talk demonstrated how big-bang makes it possible for students to create interactive, graphical programs using nothing but algebraic expressions. I can't really do justice to Matthias's story or storytelling here, so you should probably watch the video. I can say, though, that the story he tells here meshes neatly with The Racket Way as part of a holistic vision of computing unlike most anything you will find in the computing world. It's impressive both in the scope of its goals and in the scope of the tools it has produced.
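For readers who have not seen the style, here is a loose Python analogue of the idea, not Racket's actual API. The parameter names on_tick, to_draw, and stop_when echo the clause names that Racket's big-bang uses, but this driver loop, the falling-ball world, and the text rendering are my own assumptions for illustration.

```python
def big_bang(world, on_tick, to_draw, stop_when):
    """Drive a world through clock ticks, returning every rendered frame.

    The three handlers are pure functions: none of them mutates anything.
    """
    frames = [to_draw(world)]
    while not stop_when(world):
        world = on_tick(world)
        frames.append(to_draw(world))
    return frames

# The world is just a number: the height of a falling ball.
def fall(height):
    return max(height - 3, 0)

def draw(height):
    return "." * height + "o"

def landed(height):
    return height == 0

frames = big_bang(10, fall, draw, landed)
for frame in frames:
    print(frame)
```

The student writes only fall, draw, and landed, each an ordinary algebraic function of the current world. All of the state and time lives in the driver.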

More notes on StrangeLoop soon.


PHOTO. I took both photos above from my seats in the audience at StrangeLoop. Please pardon my unsteady hand. CC BY-SA.

Posted by Eugene Wallingford | Permalink | Categories: Computing

September 27, 2015 6:56 PM

StrangeLoop is in the Books

a plaque outside the St. Louis Public Library

The conference came and went far too quickly, with ideas enough for many more days. As always, Alex Miller and his team put on a first-class program with the special touches and the vibe that make me want to come back every year.

Most of the talks are already online. I will be writing up my thoughts on some of the talks that touched me deepest in separate entries over the next few days. For now, let me share notes on a few other talks that I really enjoyed.

Carin Meier talked about her tinkering with the ideas of chemical computing, in which we view molecules and reactions as a form of computation. In her experiments, Meier encoded numbers and operations as molecules, put them in a context in which they could react with one another, and then visualized the results. While this sort of computation may seem rather inefficient next to a more direct algorithm, it may give us a way to let programs discover ideas by letting simple concepts wander around and bump into one another. This talk reminded me of AM and Eurisko, AI programs from the late 1970s which always fascinated me. I plan to try Meier's ideas out in code.

Jan Paul Posma gave us a cool look at some Javascript tools for visualizing program execution. His goal is to make it possible to shift from ordinary debugging, which follows something like the scientific method to uncover hidden errors and causes, to "omniscient debugging", in which everything we need to understand how our code runs is present in the system. Posma's code and demos reminded me of Bret Victor's work, such as learnable programming.

Caitie McCaffrey's talk on building scalable, stateful services and Camille Fournier's talk on hope and fear in distributed system design taught me a little about a part of the computing world I don't know much about. Both emphasized the importance of making trade-offs among competing goals and forces. McCaffrey's talk had a more academic feel, with references to techniques such as distributed hash tables with nondeterministic placement, whereas Fournier took a higher-level look at how context drives the balance between scale and fault tolerance. From each talk I took at least one take-home lesson for me and my students:

  • McCaffrey asked, "Should you read research papers?" and immediately answered "Yes." A lot of the ideas we need today appear in the database literature of the '60s, '70s, and '80s. Study!
  • Fournier says that people understand asynchrony and changing data better than software designers seem to think. If we take care of the things that matter most to them, such as not charging their credit cards more than once, they will understand the other effects of asynchrony as simply one of the costs of living in a world that gives them amazing possibilities.

Fournier did a wonderful job stepping in to give the Saturday keynote address on short notice. She was lively, energetic, and humorous -- just what the large audience needed after a long day of talks and a long night of talking and carousing. Her command of the room was impressive.

More notes soon.


PHOTO. One of the plaques on the outside wall of the St. Louis Public Library, just a couple of blocks from the Peabody Opera House and StrangeLoop. Eugene Wallingford, 2015. Available under a CC BY-SA license.

Posted by Eugene Wallingford | Permalink | Categories: Computing

September 24, 2015 9:04 PM

Off to StrangeLoop

StrangeLoop 2010 logo

StrangeLoop 2015 starts tomorrow, and after a year's hiatus, I'm back. The pre-conference workshops were today, and I wish I could have been here in time for the Future of Programming workshop. Alas, I have a day job and had to teach class before hitting the road. My students knew I was eager to get away and bid me a quick goodbye as soon as we wrapped up our discussion of table-driven parsing. (They may also have been eager to finish up the scanners for their compiler project...)

As always, the conference line-up consists of strong speakers and intriguing talks throughout. Tomorrow, I'm looking forward to talks by Philip Wadler and Gary Bernhardt. Wadler is Wadler, and if anyone can shed new light in 2015 on the 'types versus unit tests' conflagration and make it fun, it's probably Bernhardt.

On Saturday, my attention is honed in on David Nolen's and Michael Bernstein's A History of Programming Languages for 2 Voices. I've been a big fan of their respective work for years, swooning on Twitter and reading their blogs and papers, and now I can see them in person. I doubt I'll be able to get close, though; they'll probably be swamped by groupies. Immediately after that talk, Matthias Felleisen is giving a talk on Racket's big-bang, showing how we can use pure functional programming to teach algebra to middle school students and fold the network into the programming language.

Saturday was to begin with a keynote by Kathy Sierra, whom I last saw many years ago at OOPSLA. I'm sad that she won't be able to attend after all, but I know that Camille Fournier's talk about hopelessness and confidence in distributed systems design will be an excellent lead-off talk for the day.

I do plan one change for this StrangeLoop: my laptop will stay in its shoulder bag during all of the talks. I'm going old school, with pen and a notebook in hand. My mind listens differently when I write notes by hand, and I have to be more frugal in the notes I take. I'm also hoping to feel a little less stress. No need to blog in real time. No need to google every paper the speakers mention. No temptation to check email and do a little work. StrangeLoop will have my full attention.

The last time I came to StrangeLoop, I read Raymond Queneau's charming and occasionally disorienting "Exercises in Style", in preparation for Crista Lopes's talk about her exercises in programming style. Neither the book nor the talk disappointed. This year, I am reading The Little Prince -- for the first time, if you can believe it. I wonder if any of this year's talks draw their inspiration from Saint-Exupéry? At StrangeLoop, you can never rule that kind of connection out.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

September 22, 2015 2:57 PM

"Good Character" as an Instance of Postel's Law

Mike Feathers draws an analogy I'd never thought of before in The Universality of Postel's Law: what we think of as "good character" can be thought of as an application of Postel's Law to ordinary human relations.

Societies often have the notion of 'good character'. We can attempt all sorts of definitions but at its core, isn't good character just having tolerance for the foibles of others and being a person people can count on? Accepting wider variation at input and producing less variation at output? In systems terms that puts more work on the people who have that quality -- they have to have enough control to avoid 'going off' on people when others 'go off on them', but they get the benefit of being someone people want to connect with. I argue that those same dynamics occur in physical systems and software systems that have the Postel property.

These days, most people talk about Postel's Law as a social law, and criticisms of it even in software design refer to it as creating moral hazards for designers. But Postel coined this "principle of robustness" as a way to talk about implementing TCP, and most references I see to it now relate to HTML and web browsers. I think it's pretty cool when a software design principle applies more broadly in the design world, or can even be useful for understanding human behavior far removed from computing. That's the sign of a valuable pattern -- or anti-pattern.
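In software terms, "accepting wider variation at input and producing less variation at output" might look like the following Python sketch. The header-line format and the function names parse_header and format_header are hypothetical examples of my own, not code from any TCP or HTTP implementation.

```python
def parse_header(line):
    """Accept liberally: tolerate odd case and stray or missing whitespace."""
    name, sep, value = line.partition(":")
    if not sep:
        raise ValueError(f"not a header line: {line!r}")
    # Normalize the name to lower case and collapse runs of whitespace in the value.
    return name.strip().lower(), " ".join(value.split())

def format_header(name, value):
    """Send conservatively: always emit one canonical spelling."""
    canonical = "-".join(part.capitalize() for part in name.split("-"))
    return f"{canonical}: {value}"

sloppy_inputs = [
    "Content-Type: text/html",
    "  content-TYPE:text/html  ",
    "CONTENT-type :   text/html",
]
outputs = {format_header(*parse_header(line)) for line in sloppy_inputs}
print(outputs)
```

Three different spellings go in and one comes out. The moral-hazard criticism is visible here, too: the senders of the sloppy lines never learn that they were sloppy.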

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Patterns, Software Development

September 19, 2015 11:56 AM

Software Gets Easier to Consume Faster Than It Gets Easier to Make

In What Is the Business of Literature?, Richard Nash tells a story about how the ideas underlying writing, books, and publishing have evolved over the centuries, shaped by the desires of both creators and merchants. One of the key points is that technological innovation has generally had a far greater effect on the ability to consume literature than on the ability to create it.

But books are just one example of this phenomenon. It is, in fact, a pattern:

For the most part, however, the technical and business-model innovations in literature were one-sided, far better at supplying the means to read a book than to write one. ...

... This was by no means unique to books. The world has also become better at allowing people to buy a desk than to make a desk. In fact, from medieval to modern times, it has become easier to buy food than to make it; to buy clothes than to make them; to obtain legal advice than to know the law; to receive medical care than to actually stitch a wound.

One of the neat things about the last twenty years has been the relatively rapid increase in the ability for ordinary people to write and disseminate creative works. But an imbalance remains.

Over a shorter time scale, this one-sidedness has been true of software as well. The fifty or sixty years of the Software Era have given us seismic changes in the availability, ubiquity, and backgrounding of software. People often overuse the word 'revolution', but these changes really have had an immense effect in how and when almost everyone uses software in their lives.

Yet creating software remains relatively difficult. The evolution of our tools for writing programs hasn't kept pace with the evolution in platforms for using them. Neither has the growth in our knowledge of how to make great software.

There is, of course, a movement these days to teach more people how to program and to support other people who want to learn on their own. I think it's wonderful to open doors so that more people have the opportunity to make things. I'm curious to see if the current momentum bears fruit or is merely a fad in a world that goes through fashions faster than we can comprehend them. It's easier still to toss out a fashion that turns out to require a fair bit of work.

Writing software is still a challenge. Our technologies have not changed that fact. But this is also true, as Nash reminds us, of writing books, making furniture, and a host of other creative activities. He also reminds us that there is hope:

What we see again and again in our society is that people do not need to be encouraged to create, only that businesses want methods by which they can minimize the risk of investing in the creation.

The urge to make things is there. Give people the resources they need -- tools, knowledge, and, most of all, time -- and they will create. Maybe one of the new programmers can help us make better tools for making software, or lead us to new knowledge.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Patterns, Software Development

September 11, 2015 3:55 PM

Search, Abstractions, and Big Epistemological Questions

Andy Soltis is an American grandmaster who writes a monthly column for Chess Life called "Chess to Enjoy". He has also written several good books, both recreational and educational. In his August 2015 column, Soltis talks about a couple of odd ways in which computers interact with humans in the chess world, ways that raise bigger questions about teaching and the nature of knowledge.

As most people know, computer programs -- even commodity programs one can buy at the store -- now play chess better than the best human players. Less than twenty years ago, Deep Blue first defeated world champion Garry Kasparov in a single game. A year later, Deep Blue defeated Kasparov in a closely contested six-game match. By 2005, computers were crushing Top Ten players with regularity. These days, world champion Magnus Carlsen is no match for his chess computer.

a position in which humans see the win, but computers don't

Yet there are still moments where humans shine through. Soltis opens with a story in which two GMs were playing a game the computers thought Black was winning, when suddenly Black resigned. Surprised journalists asked the winner, GM Vassily Ivanchuk, what had happened. It was easy, he said: it only looked like Black was winning. Well beyond the computers' search limits, it was White that had a textbook win.

How could the human players see this? Were they searching deeper than the computers? No. They understood the position at a higher level, using abstractions such as "being in the square" and passed pawns splitting a King like "pants". (We chessplayers are an odd lot.)

When you can define 'flexibility' in 12 bits,
it will go into the program.

Attempts to program computers to play chess using such abstract ideas did not work all that well. Concepts like king safety and piece activity proved difficult to implement in code, but eventually found their way into the programs. More abstract concepts like "flexibility", "initiative", and "harmony" have proven all but impossible to implement. Chess programs got better -- quickly -- when two things happened: (1) programmers began to focus on search, implementing metrics that could be applied rapidly to millions of positions, and (2) computer chips got much, much faster.
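A small Python sketch shows the shape of such a search and why depth matters so much. It uses a stone-taking game rather than chess: players alternate taking one to three stones, and whoever takes the last stone wins. The game, the placeholder evaluate function, and the names negamax and best_move are assumptions chosen to keep the example tiny. Real chess programs use far richer metrics and many pruning tricks.

```python
def evaluate(stones):
    """Cheap static metric applied at the search horizon.

    Placeholder: this toy knows nothing about the game, so it reports "unknown".
    """
    return 0

def negamax(stones, depth):
    """Score a position for the player to move: 1 win, -1 loss, 0 unknown."""
    if stones == 0:
        return -1  # the opponent just took the last stone
    if depth == 0:
        return evaluate(stones)
    return max(-negamax(stones - take, depth - 1)
               for take in (1, 2, 3) if take <= stones)

def best_move(stones, depth):
    """Return (stones_to_take, score), preferring the first move on ties."""
    best = None
    for take in (1, 2, 3):
        if take > stones:
            break
        score = -negamax(stones - take, depth - 1)
        if best is None or score > best[1]:
            best = (take, score)
    return best

shallow = best_move(10, 3)  # horizon too close to see how the game ends
deep = best_move(10, 5)     # sees that leaving 8 stones wins by force
print(shallow, deep)
```

With ten stones, the three-ply search cannot tell the moves apart and takes one stone by default, while the five-ply search finds the winning move of taking two. A human who knows the abstraction "leave your opponent a multiple of four" needs no search at all.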

Pawn Structure Chess, by Andy Soltis

The result is that chess programs can beat us by seeing farther down the tree of possibilities than we do. They make moves that surprise us, puzzle us, and even offend our sense of beauty: "Fischer or Tal would have played this move; it is much more elegant." But they win, easily -- except when they don't. Then we explain why, using ideas that express an understanding of the game that even the best chessplaying computers don't seem to have.

This points out one of the odd ways computers relate to us in the world of chess. Chess computers crush us all, including grandmasters, using moves we wouldn't make and many of us do not understand. But good chessplayers do understand why moves are good or bad, once they figure it out. As Soltis says:

And we can put the explanation in words. This is why chess teaching is changing in the computer age. A good coach has to be a good translator. His students can get their machine to tell them the best move in any position, but they need words to make sense of it.

Teaching computer science at the university is affected by a similar phenomenon. My students can find on the web code samples to solve any problem they have, but they don't always understand them. This problem existed in the age of the book, too, but the web makes available so much material, often undifferentiated and unexplained, so, so quickly.

The inverse of computers making good moves we don't understand brings with it another oddity, one that plays to a different side of our egos. When a chess computer loses -- gasp! -- or fails to understand why a human-selected move is better than the moves it recommends, we explain it using words that make sense of the human move. These are, of course, the same words and concepts that fail us most of the time when we are looking for a move to beat the infernal machine. Confirmation bias lives on.

Soltis doesn't stop here, though. He realizes that this strange split raises a deeper question:

Maybe it's one that only philosophers care about, but I'll ask it anyway:

Are concepts like "flexibility" real? Or are they just artificial constructs, created by and suitable only for feeble, carbon-based minds?

(Philosophers are not the only ones who care. I do. But then, the epistemology course I took in grad school remains one of my two favorite courses ever. The second was cognitive psychology.)


We can implement some of our ideas about chess in programs, and those ideas have helped us create machines we can no longer defeat over the board. But maybe some of our concepts are simply fictions, "just so" stories we tell ourselves when we feel the need to understand something we really don't. I don't think so, but the pragmatist in me keeps pushing for better evidence.

Back when I did research in artificial intelligence, I always chafed at the idea of neural networks. They seemed to be a fine model of how our brains worked at the lowest level, but the results they gave did not satisfy me. I couldn't ask them "why?" and receive an answer at the conceptual level at which we humans seem to live. I could not have a conversation with them in words that helped me understand their solutions, or their failures.

Now we live in a world of "deep learning", in which Google Translate can do a dandy job of translating a foreign phrase for me but never tell me why it is right, or explain the subtleties of choosing one word instead of another. Add more data, and it translates even better. But I still want the sort of explanation that Ivanchuk gave about his win or the sort of story Soltis can tell about why a computer program only drew a game because it saddled itself with inflexible pawn structure.

Perhaps we have reached the limits of my rationality. More likely, though, is that we will keep pushing forward, bringing more human concepts and abstractions within the bounds of what programs can represent, do, and say. Researchers like Douglas Hofstadter continue the search, and I'm glad. There are still plenty of important questions to ask about the nature of knowledge, and computer science is right in the middle of asking and answering them.


IMAGE 1. The critical position in Ivanchuk-Jobava, Wijk aan Zee 2015, the game to which Soltis refers in his story. Source: Chess Life, August 2015, Page 17.

IMAGE 2. The cover of Andy Soltis's classic Pawn Structure Chess. Source: the book's page at

IMAGE 3. A bust of Aristotle, who confronted Plato's ideas about the nature of ideals. Source: Classical Wisdom Weekly.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

September 03, 2015 3:26 PM

Compilers and the Universal Machine

I saw Eric Normand's The Most Important Idea in Computer Science a few days ago and enjoyed it. I almost always enjoy watching a programmer have fun writing a little interpreter and then share that fun with others.

In class this week, my students and I spent a few minutes playing with T-diagrams to illustrate techniques for porting, bootstrapping, and optimizing compilers, and Normand's post came to mind. So I threw a little purple prose into my classroom comments.

All these examples of building compilers by feeding programs for new compilers into old compilers ultimately depend on a single big idea from the theory of computer science: that a certain kind of machine can simulate anything -- including itself. As a result, this certain kind of machine, the Turing machine, is the very definition of computability. But this big idea also means that, whatever problem we want to solve with information, we can solve it with a program. No additional hardware needed. We can emulate any hardware we might need, new or old, in software.

This is truly one of the most important ideas in computer science. But it's also an idea that changes how we approach problems in nearly every other discipline. Philosophically, it was a monumental achievement in humanity's ongoing quest to understand the universe and our place in it.

In this course, you will learn some of the intricacies of writing programs that simulate and translate other programs. At times, that will be a challenge. When you are deep in the trenches some night, trying to find an elusive error in your code, keep the big idea in mind. Perhaps it will comfort you.
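The big idea is easy to see in miniature: one fixed program can run any machine that is handed to it as data. The following is a minimal sketch of a Turing machine simulator, assuming a transition table stored in a dictionary. The state names, the `_` blank symbol, and the bit-flipping example machine are all illustrative choices, not anything from the course materials.

```python
def run(table, tape, state="start", blank="_", max_steps=1000):
    """Run the machine described by `table` on `tape` and return the final tape.

    `table` maps (state, symbol) to (symbol_to_write, "L" or "R", next_state).
    The loop stops at the "halt" state, or after `max_steps` as a safety net.
    """
    cells = dict(enumerate(tape))
    pos = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(pos, blank)
        write, move, state = table[(state, symbol)]
        cells[pos] = write
        pos += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells)).strip(blank)


# A machine is just data: this one flips every bit, then halts at the first blank.
flip = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run(flip, "0110"))
```

Swapping in a different table changes the machine without touching `run`, which is the sense in which the simulator needs no additional hardware.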

Oh, and I am teaching my compilers course again after a two-year break. Yay!

Posted by Eugene Wallingford | Permalink | Categories: Computing

August 23, 2015 10:12 AM

Science Students Should Learn How to Program, and Do Research

Physicist, science blogger, and pop science author Chad Orzel offered some advice for prospective science students in a post on his Forbes blog last week. Among other things, he suggests that science students learn to program. Orzel is among many physics profs who integrate computer simulations into their introductory courses, using the Matter and Interactions curriculum (which you may recall reading about here in a post from 2007).

I like the way Orzel explains the approach to his students:

When we start doing programming, I tell students that this matters because there are only about a dozen problems in physics that you can readily solve exactly with pencil and paper, and many of them are not that interesting. And that goes double, maybe triple for engineering, where you can't get away with the simplifying spherical-cow approximations we're so fond of in physics. Any really interesting problem in any technical field is going to require some numerical simulation, and the sooner you learn to do that, the better.

This advice complements Astrachan's Law and its variants, which assert that we should not ask students to write a program if they can do the task by hand. Conversely, if they can't solve their problems by hand, then they should get comfortable writing programs that can. (Actually, that's the contrapositive of Astrachan, but "contrapositively" doesn't sound as good.) Programming is a medium for scientists, just as math is, and it becomes more important as they try to solve more challenging problems.

Orzel and Astrachan both know that the best way to learn to program is to have a problem you need a computer to solve. Curricula such as Matter and Interactions draw on this motivation and integrate computing directly into science courses. This is good news for us in computer science. Some of the students who learn how to program in their science courses find that they like it and want to learn more. We have just the courses they need to go deeper.

I concur with all five of Orzel's suggestions for prospective science students. They apply as well to computer science students as to those interested in the physical sciences. When I meet with prospective CS students and their families, I emphasize especially that students should get involved in research. Here is Orzel's take:

While you might think you love science based on your experience in classes, classwork is a pale imitation of actual science. One of my colleagues at Williams used a phrase that I love, and quote all the time, saying that "the hardest thing to teach new research students is that this is not a three-hour lab."

CS students can get involved in empirical research, but they also have the ability to write their own programs to explore their own ideas and interests. The world of open source software enables them to engage the discipline in ways that preceding generations could only have dreamed of. By doing empirical CS research with a professor or working on substantial programs that have users other than the creators, students can find out what computer science is really about -- and find out what they want to devote their lives to.

As Orzel points out, this is one of the ways in which small colleges are great for science students: undergrads can more readily become involved in research with their professors. This advantage extends to smaller public universities, too. In the past year, we have had undergrads do some challenging work on bioinformatics algorithms, motion virtual manipulatives, and system security. These students are having a qualitatively different learning experience than students who are only taking courses, and it is an experience that is open to all undergrad students in CS and the other sciences here.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

August 12, 2015 10:09 AM

Graphic Art: Links in Jewish Literature

"Genesis 1:1 is the Kevin Bacon of Sefaria."

This morning I finally read Sefaria in Gephi: Seeing Links in Jewish Literature, which had been in my reading list for a few months. In it, Liz Shayne introduces a collaborative project to visualize the relationships among 100,000+ sections of Jewish literature encoded in Sefaria, an online library of Jewish texts. It's a cool project, and the blog entries about it remind us how beautiful visualizations of graphs can be. I love this basic image, in which nodes represent sections of text, color indicates the type of text, and size corresponds to the degree of the node:

a graph of relationships in the Sefaria

This is suitable for framing and would make a fine piece of art on my office wall.

Images like this can help us to understand a large dataset at a high level more easily than simply looking at the data themselves. Of course, creating the image requires some initial understanding, too. There is a give-and-take between analyzing the data and visualizing it that mutually reinforces our understanding.
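The one computation behind the image's sizing rule is small enough to sketch: count how many links touch each node, then scale. The edge list below is made up for illustration and is not taken from the Sefaria data; the `base` and `per_link` parameters are likewise assumptions, and a tool such as Gephi does this work for you.

```python
from collections import Counter

# Hypothetical links between sections of text (not real Sefaria data).
links = [
    ("Genesis 1:1", "Commentary A"),
    ("Genesis 1:1", "Commentary B"),
    ("Genesis 1:1", "Commentary C"),
    ("Commentary A", "Commentary B"),
]


def degrees(edges):
    """Count the number of links touching each node in an undirected graph."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return counts


def node_sizes(edges, base=10, per_link=5):
    """Map each node to a drawing size that grows with its degree."""
    return {node: base + per_link * d for node, d in degrees(edges).items()}


for node, size in sorted(node_sizes(links).items()):
    print(node, size)
```

Even in this tiny example, the most-linked section stands out as the largest node, which is the effect the quote at the top of this entry describes.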

As I mentioned in a December 2004 post, sometimes a computer scientist can produce a beautiful picture without intending to. One of my grad students, Nate Labelle, studied package dependencies in Linux as part of a project on power laws and open-source software. He created this image that shows the dependencies among one hundred randomly selected packages:

Linux package dependencies as art

Unlike the neat concentric Sefaria image above, Nate's image has a messy asymmetry that reflects the more decentralized nature of the Linux ecosystem. It evokes for me a line drawing of a book whose pages are being riffled. After all these years, I still think it's an attractive image.

I have not read the rest of the Sefaria blog series, but peeking ahead I saw a neat example in Sefaria III: Comparative Graphing that shows the evolution of the crowd-sourced Sefaria dataset over the course of four months:

evolution of the Sefaria dataset over time

These images look almost like a time-lapse photograph of a supernova exploding (video). They are pretty as art, and perhaps instructive about how the Sefaria community operates.

The Ludic Analytics site has links to two additional entries for the project [ II | IV ], but the latest is dated the end of 2014. I hope that Shayne or others involved with the project write more about their use of visualizations to understand the growing dataset. If nothing else, they may create more art for my walls.

Posted by Eugene Wallingford | Permalink | Categories: Computing

July 27, 2015 2:23 PM

The Flip Side to "Programming for All"

a thin volume of William Blake

We all hear the common refrain these days that more people should learn to program, not just CS majors. I agree. If you know how to program, you can make things. Even if you don't write many programs yourself, you are better prepared to talk to the programmers who make things for you. And even if you don't need to talk to programmers, you have expanded your mind a bit to a way of thinking that is changing the world we live in.

But there are two sides to this equation, as Chris Crawford laments in his essay, Fundamentals of Interactivity:

Why is it that our entertainment software has such primitive algorithms in it? The answer lies in the people creating them. The majority are programmers. Programmers aren't really idea people; they're technical people. Yes, they use their brains a great deal in their jobs. But they don't live in the world of ideas. Scan a programmer's bookshelf and you'll find mostly technical manuals plus a handful of science fiction novels. That's about the extent of their reading habits. Ask a programmer about Rabelais, Vivaldi, Boethius, Mendel, Voltaire, Churchill, or Van Gogh, and you'll draw a blank. Gene pools? Grimm's Law? Gresham's Law? Negentropy? Fluxions? The mind-body problem? Most programmers cannot be troubled with such trivia. So how can we expect them to have interesting ideas to put into their algorithms? The result is unsurprising: the algorithms in most entertainment products are boring, predictable, uninformed, and pedestrian. They're about as interesting in conversation as the programmers themselves.

We do have some idea people working on interactive entertainment; more of them show up in multimedia than in games. Unfortunately, most of the idea people can't program. They refuse to learn the technology well enough to express themselves in the language of the medium. I don't understand this cruel joke that Fate has played upon the industry: programmers have no ideas and idea people can't program. Arg!

My office bookshelf occasionally elicits a comment or two from first-time visitors, because even here at work I have a complete works of Shakespeare, a thin volume of William Blake (I love me some Blake!), several philosophy books, and "The Britannica Book of Usage". I really should have some Voltaire here, too. I do cover one of Crawford's bases: a recent blog entry made a software analogy to Gresham's Law.

In general, I think you're more likely to find a computer scientist who knows some literature than you are to find a literary professional who knows much CS. That's partly an artifact of our school system and partly a result of the wider range historically of literature and the humanities. It's fun to run into a colleague from across campus who has read deeply in some area of science or math, but rare.

However, we are all prone to fall into the chasm of our own specialties and miss out on the well-roundedness that makes us better at whatever specialty we practice. That's one reason that, when high school students and their parents ask me what students should take to prepare for a CS major, I tell them: four years of all the major subjects, including English, math, science, social science, and the arts; plus whatever else interests them, because that's often where they will learn the most. All of these topics help students to become better computer scientists, and better people.

And, not surprisingly, better game developers. I agree with Crawford that more programmers should learn enough other stuff to be idea people, too. Even if they don't make games.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

July 26, 2015 10:03 AM

A Couple of Passages on Disintermediation

"Disintermediation" is just a fancy word for getting other people out of the space between the people who create things and the people who read or listen to those things.

1. In What If Authors Were Paid Every Time Someone Turned a Page?, Peter Wayner writes:

One latter-day Medici posted a review of my (short) book on Amazon complaining that even 99 cents was too expensive for what was just a "blog post". I've often wondered if he was writing that comment in a Starbucks, sipping a $6 cup of coffee that took two minutes to prepare.

Even in the flatter world of ebooks, Amazon has the power to shape the interactions of creators and consumers and to influence strongly who makes money and what kind of books we read.

2. Late last year, Steve Albini spoke on the surprisingly sturdy state of the music industry:

So there's no reason to insist that other obsolete bureaux and offices of the lapsed era be brought along into the new one. The music industry has shrunk. In shrinking it has rung out the middle, leaving the bands and the audiences to work out their relationship from the ends. I see this as both healthy and exciting. If we've learned anything over the past 30 years it's that left to its own devices bands and their audiences can get along fine: the bands can figure out how to get their music out in front of an audience and the audience will figure out how to reward them.

Most of the authors and bands who aren't making a lot of money these days weren't making a lot of money -- or any money at all -- in the old days, either. They had few effective ways to distribute their writings or their music.

Yes, there are still people in between bands and their fans, and writers and their readers, but Albini reminds us how much things have improved for creators and audiences alike. I especially like his takedown of the common lament, "We need to figure out how to make this work for everyone." That sentence has always struck me as the reactionary sentiment of middlemen who no longer control the space between creators and audiences and thus no longer get their cut of the transaction.

I still think often about what this means for universities. We need to figure out how to make this internet thing work for everyone...

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 24, 2015 2:07 PM

Sentences of the Day

Three sentences stood out from the pages of my morning reading. The first two form an interesting dual around power and responsibility.

The Power to Name Things

Among the many privileges of the center, for example, is the power to name things, one of the greatest powers of all.

Costica Bradatan writes this in Change Comes From the Margins, a piece on social change. We programmers know quite well the power of good names, and thus the privilege we have in being able to create them and the responsibility we have to do that well.

The Avoidance of Power as Irresponsibility

Everyone's sure that speech acts and cultural work have power but no one wants to use power in a sustained way to create and make, because to have power persistently, in even a small measure, is to surrender the ability to shine a virtuous light on one's own perfected exclusion from power.

This sentence comes from the heart of Timothy Burke's All Grasshoppers, No Ants, his piece on one of the conditions he thinks ails our society as a whole. Burke's essay is almost an elaboration of Teddy Roosevelt's well-known dismissal of critics, but with an insightful expression of how and why rootless critics damage society as a whole.

Our Impotence in the Face of Depression

Our theories about mental health are often little better than Phlogiston and Ether for the mind.

Quinn Norton gives us this sentence in Descent, a personally-revealing piece about her ongoing struggle with depression. Like many of you, I have watched friends and loved ones fight this battle, which demonstrates all too readily the huge personal costs of civilization's being in such an early stage of understanding this disease, its causes, and its effective treatment.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 21, 2015 3:02 PM

'Send' Is The Universal Verb

In the mid-1980s, Ray Ozzie left IBM with the idea of creating an all-in-one software platform for business collaboration, based on his experience using the group messaging system in the seminal computer-assisted instruction system Plato. Ozzie's idea eventually became Lotus Notes. This platform lives on today in an IBM product, but it never had the effect that Ozzie envisioned for it.

In Office, Messaging, and Verbs, Benedict Evans tells us that Ozzie's idea is alive and well and finally taking over the world -- in the form of Facebook:

But today, Facebook's platform on the desktop is pretty much Ray Ozzie's vision built all over again but for consumers instead of enterprise and for cat pictures instead of sales forecasts -- a combination of messaging with embedded applications and many different data types and views for different tasks.

"Office, Messaging, and Verbs" is an engaging essay about how collaborative work and the tools we use to do it co-evolve, changing each other in turn. You need a keyboard to do the task at hand... But is the task at hand your job, or is it merely the way you do your job today? The answer depends on where you are on the arc of evolution.

Alas, most days I need to create or consume a spreadsheet or two. Spreadsheets are not my job, but they are the way people in universities and most other corporate entities do too many of their jobs these days. So, like Jack Lemmon in The Apartment, I compute my cell's function and pass it along to the next person in line.

I'm ready for us to evolve further down the curve.


Note: I added the Oxford comma to Evans's original title. I never apologize for inserting an Oxford comma.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 20, 2015 2:59 PM

Rethinking Accounting Software and Interfaces in the 1980s

In Magic Ink: Information Software and the Graphical Interface, Bret Victor reminds us that the dominant style of user interface today was created long before today's computers:

First, our current UI paradigm was invented in a different technological era. The initial Macintosh, for example, had no network, no mass storage, and little inter-program communication. Thus, it knew little of its environment beyond the date and time, and memory was too precious to record significant history. Interaction was all it had, so that's what its designers used. And because the computer didn't have much to inform anyone of, most of the software at the time was manipulation software -- magic versions of the typewriter, easel, and ledger-book. Twenty years and an internet explosion later, software has much more to say, but an inadequate language with which to say it.

William McCarthy, creator of the REA model of accounting

Victor's mention of the accounting ledger brings to mind the work being done since the early 1980s by Bill McCarthy, an accounting professor at Michigan State. McCarthy is motivated by a similar set of circumstances. The techniques by which we do financial accounting were created long before computers came along, and the constraints that made them necessary no longer exist. But he is looking deeper than simply the interaction style of accounting software; he is interested in upending the underlying model of accounting data.

McCarthy proposed the resources, events, agents (REA) model -- essentially an application of database theory from CS -- as an alternative to traditional accounting systems. REA takes advantage of databases and other computing ideas to create a more accurate model of a business and its activity. It eliminates many of the artifacts of double-entry bookkeeping, including debits, credits, and placeholder accounts such as accounts receivable and payable, because they can be generated in real time from more fine-grained source data. An REA model of a business enables a much wider range of decision support than the traditional accounting model while still allowing the firm to produce all the artifacts of traditional accounting as a side effect.
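To make the "generated in real time" claim concrete, here is a drastically simplified sketch. It is not McCarthy's formulation of REA: the `Event` record, the two event kinds, and the customer names are all assumptions made for illustration. The point is only that a placeholder account such as accounts receivable can be a query over stored events rather than a balance that is itself stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str      # "sale" or "cash_receipt" (hypothetical event kinds)
    agent: str     # the customer involved in the event
    resource: str  # what changed hands
    amount: int    # value in whole dollars


# Hypothetical fine-grained source data.
events = [
    Event("sale", "Acme", "widgets", 500),
    Event("cash_receipt", "Acme", "cash", 200),
    Event("sale", "Bolt", "widgets", 300),
]


def accounts_receivable(events, agent=None):
    """Derive receivables on demand: sales not yet matched by cash receipts."""
    total = 0
    for e in events:
        if agent is not None and e.agent != agent:
            continue
        if e.kind == "sale":
            total += e.amount
        elif e.kind == "cash_receipt":
            total -= e.amount
    return total


print(accounts_receivable(events))          # the whole firm
print(accounts_receivable(events, "Acme"))  # one customer
```

Because the events keep their agents and resources, the same data can answer questions a ledger balance cannot, such as what a particular customer owes.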

(I had the good fortune to work with McCarthy during my graduate studies and even helped author a conference paper on the development of expert systems from REA models. He also served on my dissertation committee.)

In the early years, many academic accountants reacted with skepticism to the idea of REA. They feared losing the integrity of the traditional accounting model, which carried a concomitant risk to the trust placed by the public in audited financial statements. Most of these concerns were operational, not theoretical. However, a few people viewed REA as somehow dissing the system that had served the profession so well for so long.

Victor includes a footnote in Magic Ink that anticipates a similar concern from interaction designers to his proposals:

Make no mistake, I revere GUI pioneers such as Alan Kay and Bill Atkinson, but they were inventing rules for a different game. Today, their windows and menus are like buggy whips on a car. (Although Alan Kay clearly foresaw today's technological environment, even in the mid-'70s. See "A Simple Vision of the Future" in his fascinating Early History of Smalltalk (1993).)

"They were inventing rules for a different game." This sentence echoes how I have always felt about Luca Pacioli, the inventor of double-entry bookkeeping. It was a remarkable technology that helped to enable the growth of modern commerce by creating a transparent system of accounting that could be trusted by insiders and outsiders alike. But he was inventing rules for a different game -- 500 years ago. Half a century dwarfs the forty or fifty year life of windows, icons, menus, and pointing and clicking.

I sometimes wonder what might have happened if I had pursued McCarthy's line of work more deeply. It dovetails quite nicely with software patterns and would have been well-positioned for the more recent re-thinking of financial support software in the era of ubiquitous mobile computing. So many interesting paths...

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 13, 2015 2:51 PM

Thinking in Code

A conversation this morning with a student reminded me of a story one of our alumni, a local entrepreneur, told me about his usual practice whenever he has an idea for a new system or a new feature for an existing system.

The alum starts by jotting the idea down in Java, Scala, or some other programming language. He puts this sketch into a git repository and uses the file to document his thought process. He also records there links to related systems, links to papers on implementation techniques, and any other resources he thinks might be handy. The code itself can be at varying levels of completeness. He allows himself to work out some of the intermediate steps in enough detail to make code work, while leaving other parts as skeletons.

This approach helps him talk to technical customers about the idea. The sketch shows what the idea might look like at a high level, perhaps with some of the intermediate steps running in some useful way. The initial draft helps him identify key development issues and maybe even a reasonable first estimate for how long it would take to flesh out a complete implementation. By writing code and making some of it work, the entrepreneur in him begins to see where the opportunities for business value lie.

If he decides that the idea is worth a deeper look, he passes the idea on to members of his team in the form of his git repo. The file includes links to relevant reading and his initial thoughts about the system and its design. The code conveys ideas more clearly and compactly than a natural language description would. Even if his team decides to use none of the code -- and he expects they won't -- they start from something more expressive than a plain text document.

This isn't quite a prototype or a spike, but it has the same spirit. The code sketch is another variation on how programming is a medium for expressing ideas in a way that other media can't fully capture.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 29, 2015 1:58 PM

Bridging the Gap Between Learning and Doing

a sketch of bridging the gap

I recently learned about the work of Amelia McNamara via this paper published as Research Memo M-2014-002 by the Viewpoints Research Institute. McNamara is attacking an important problem: the gap between programming tools for beginners and programming tools for practitioners. In Future of Statistical Programming, she writes:

The basic idea is that there's a gap between the tools we use for teaching/learning statistics, and the tools we use for doing statistics. Worse than that, there's no trajectory to make the connection between the tools for learning statistics and the tools for doing statistics. I think that learners of statistics should also be doers of statistics. So, a tool for statistical programming should be able to step learners from learning statistics and statistical programming to truly doing data analysis.

"Learners of statistics should also be doers of statistics." -- yes, indeed. We see the same gap in computer science. People who are learning to program are programmers. They are just working at a different level of abstraction and complexity. It's always a bit awkward, and often misleading, when we give novice programmers a different set of tools than we give professionals. Then we face a new learning barrier when we ask them to move up to professional tools.

That doesn't mean that we should turn students loose unprotected in the wilds of C++, but it does require that we have a pedagogically sound trajectory for making the connection between novice languages and tools and those used by more advanced programmers.

It also doesn't mean that we can simply choose a professional language that is in some ways suitable for beginners, such as Python, and not think any more about the gap. My recent experience reminds me that there is still a lot of complexity to help our students deal with.

McNamara's Ph.D. dissertation explored some of the ways to bridge this gap in the realm of statistics. It starts from the position that the gap should not exist and suggests ways to bridge it, via both better curricula and better tools.

Whenever I experience this gap in my teaching or see researchers trying to make it go away, I think back to Alan Kay's early vision for Smalltalk. One of the central tenets of the Smalltalk agenda was to create a language flexible and rich enough that it could accompany the beginner as he or she grew in knowledge and skill, opening up to a new level each time the learner was ready for something more powerful. Just as a kindergartener learns the same English language used by Shakespeare and Joyce, a beginning programmer might learn the same language as Knuth and Steele, one that opens up to a new level each time the learner is ready.

We in CS haven't done an especially good job at this over the years. Matthias Felleisen and the How to Design Programs crew have made perhaps the most successful effort thus far. (See *SL, Not Racket for a short note on the idea.) But this project has not made a lot of headway yet in CS education. Perhaps projects such as McNamara's can help make inroads for domain-specific programmers. Alan Kay may harbor a similar hope; he served as a member of McNamara's Ph.D. committee.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 12, 2015 2:39 PM

A Cool Example of Turning Data into Program: TempleOS

Hyperlinks that point to and execute code, not transfer us to a data file:

In a file from the TempleOS source code, one line contains the passage "Several other routines include a ...", where the "other routines" part is a hyperlink. Unlike in HTML, where that ... may lead to a page listing those other routines, here a DolDoc macro is used so that a grep is actually performed when you click on it. While the HTML version could become stale if no-one updated it, this is always up-to-date.

This comes from Richard Milton's A Constructive Look At TempleOS, which highlights some of the unusual features of an OS I had never heard of until I ran across his article. As I read it, I thought of Alan Kay's assertion that a real programming language should eliminate the need to have an operating system at all. The language should give programmers access to whatever they need to access and marshall the resources of the computer. Smalltalk is a language that aspired to this goal. Today, the best example of this idea is probably Racket, which continues to put more of the underlying system into the hands of programmers via the language itself. That is an essential element of the Racket Way.

TempleOS comes at this idea from the other side, as an operating system that puts as much computing as it can in the hands of the user. This includes programming, in the form of HolyC, a homegrown variant of C. TempleOS is written in HolyC, but HolyC is also the scripting language of the system's REPL. It's odd to talk about programming TempleOS at all, though. As Milton points out, like Xerox Alto, Oberon, and Plan 9, TempleOS "blurs the lines between programs and documents". Writing a program is like creating a document of any other sort, and creating a document of any sort is a form of programming.
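The difference between a link that points at data and a link that runs code can be sketched without TempleOS at all. The following is not DolDoc or HolyC. It is a Python illustration with a made-up, in-memory "source tree", and every file name and routine name in it is hypothetical.

```python
import re

# Hypothetical in-memory source tree; names and contents are made up.
sources = {
    "a.src": "routine PrintA\nroutine Helper",
    "b.src": "routine PrintB",
}


def grep(pattern, files):
    """Return (file, line) pairs for every line that matches `pattern` right now."""
    rx = re.compile(pattern)
    return [
        (name, line)
        for name, text in sorted(files.items())
        for line in text.splitlines()
        if rx.search(line)
    ]


# A link to data: the list is computed once and can go stale.
static_link = grep(r"Print", sources)


# A link to code: the search runs again on every "click".
def live_link():
    return grep(r"Print", sources)


# Someone adds a new routine after the document was written.
sources["c.src"] = "routine PrintC"

print(len(static_link))  # still reflects the old tree
print(len(live_link()))  # sees the new routine
```

The static list is a document that someone has to maintain. The live link is a tiny program embedded in the document, which is the blurring Milton describes.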

Trading data for code creates a different kind of barrier for new users of TempleOS. It also pays dividends by injecting a tempting sort of dynamism into the system.
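The difference between a link that stores a list and a link that runs a search can be sketched in a few lines of Python. This is only an illustration of the idea, not DolDoc or HolyC: the file names, file contents, and the functions grep and live_link are all made up for the example.

```python
import re

# Hypothetical in-memory "source tree": file name -> contents.
# The contents are invented text, not real TempleOS code.
SOURCES = {
    "a.txt": "routine PrintHello\nroutine PrintBye",
    "b.txt": "routine Main calls PrintHello",
}

def grep(pattern, sources):
    """Return (file, line) pairs for every line that matches pattern."""
    rx = re.compile(pattern)
    return [(name, line)
            for name, text in sorted(sources.items())
            for line in text.splitlines()
            if rx.search(line)]

# A static link is data: a snapshot computed once, when the page is written.
static_link = grep(r"^routine Print", SOURCES)

# A live link is code: it reruns the search every time it is "clicked".
def live_link():
    return grep(r"^routine Print", SOURCES)

# Someone adds a new routine after the page was written.
SOURCES["c.txt"] = "routine PrintDate"

print(len(static_link))   # the snapshot still lists the two old routines
print(len(live_link()))   # the live link also finds the new one
```

The snapshot goes stale as soon as the sources change, while the live link pays the cost of a search on every click in exchange for always being current.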

In any case, programmers of a certain age will feel a kinship with the kind of experience that TempleOS seeks to provide. We grew up in an age when every computer was an open laboratory, just waiting for us to explore it at every level. TempleOS has the feel -- and, perhaps unfortunately, the look -- of the 1970s and 1980s.

Hurray for crazy little operating systems like TempleOS. Maybe we can learn something useful from them. That's how the world of programming languages works, too. If not, the creator can have a lot of fun making a new world, and the rest of us can share in the fun vicariously.

Posted by Eugene Wallingford | Permalink | Categories: Computing

June 04, 2015 2:33 PM

If the Web is the Medium, What is the Message?

How's this for a first draft:

History may only be a list of surprises, but you sure as heck don't want to lose the list.

That's part of the message in Bret Victor's second 'Web of Alexandria' post. He puts it in starker terms:

To forget the past is to destroy the future. This is where Dark Ages come from.

Those two posts followed a sobering observation:

60% of my fav links from 10 yrs ago are 404. I wonder if Library of Congress expects 60% of their collection to go up in smoke every decade.

But it's worse than that, Victor tells us in his follow-up. As his tweet notes, the web has turned out to be unreliable as a publication medium. We publish items because we want them to persist in the public record, but they rarely persist for very long. However, the web has turned out to be a pernicious conversational medium as well. We want certain items shared on the web to be ephemeral, yet often those items are the ones that last forever. At one time, this may have seemed like only an annoyance, but now we know it to be dangerous.

The problem isn't that the web is a bad medium. In one sense, the web isn't really a medium at all; it's an infrastructure that enables us to create new kinds of media with historically uncharacteristic ease. The problem is that we are using web-based media for many different purposes, without understanding how each medium determines "the social and temporal scope of its messages".

The same day I read Victor's blog post, I saw this old Vonnegut quote fly by on Twitter:

History is merely a list of surprises. ... It can only prepare us to be surprised yet again.

Alas, on the web, history appears to be a list of cat pictures and Tumblr memes, with all the important surprises deleted when the author changed internet service providers.

In a grand cosmic coincidence, on the same day I read Victor's blog post and saw the Vonnegut quote fly by, I also read a passage from Marshall McLuhan in a Farnam Street post. It ends:

The modern world abridges all historical times as readily as it reduces space. Everywhere and every age have become here and now. History has been abolished by our new media.

The internet certainly amplifies the scale of McLuhan's worry, but the web has created a unique form of erasure. I'm sure McLuhan would join Victor in etching an item on history's list of surprises:

Protect the past.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

June 02, 2015 1:46 PM

"I Just Need a Programmer", Screenplay Edition

Noted TV writer, director, producer, and blogger Ken Levine takes on a frequently-asked question in the latest edition of his "Friday Questions" feature:

I have a great idea for a movie, but I'm not a writer, I'm not in show biz, and I don't live in New York or LA. What do I do with this great idea? (And I'm sure you've never heard this question before, right?)

Levine is gentle in response:

This question does come up frequently. I wish I had a more optimistic answer. But the truth is execution is more valued than ideas. ...

Is there any domain where this isn't true? Yet professionals in every domain seem to receive this question all the time. I certainly receive the "I just need a programmer..." phone call or e-mail every month. If I went to cocktail parties, maybe I'd hear it at them, too.

The bigger the gap between idea and product, the more valuable, relatively speaking, execution is than having ideas. For many app ideas, executing the idea is not all that far beyond the reach of many people. Learn a little Objective C, and away you go. In three or four years, you'll be set! By comparison, writing a screenplay that anyone in Hollywood will look at (let alone turn into a blockbuster film) seems like Mount Everest.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

May 29, 2015 11:20 AM

Fulfill God's Plan. Write a Computer Program.

In his entry for The Harvard Guide to Influential Books, psychologist Jerome Kagan recommends the novel The Eternal Smile by Pär Lagerkvist. He focuses his recommendation on a single sentence:

After an interminably long search, a large group of dead people find God and the leader steps forward and asks him what purpose he had in creating human beings. God replies, "I only intended that you need never be content with nothing."

Kagan sees this sentence as capturing a thematic idea about the historical conditions that shape humanity's conception of morality. He is probably right; he's a deeply read and highly respected scholar.

When I read it, though, I thought about how lucky I am that I know how to program. When you can write a computer program, you never need to be content with the status quo in any situation that involves information and a problem to solve. You can write a program and reshape a little part of the world.

So, in a way, computer programming is a part of how humanity achieves its destiny in the universe. I hope that isn't too much hubris for a Friday morning.

Posted by Eugene Wallingford | Permalink | Categories: Computing

May 09, 2015 9:28 AM

A Few Thoughts on Graduation Day

Today is graduation day for the Class of 2015 at my university. CS students head out into the world, most with a job in hand or nearly so, ready to apply their hard-earned knowledge and skills to all variety of problems. It's an exciting time for them.

This week also brought two other events that have me thinking about the world in which my students will live and the ways in which we have prepared them. First, on Thursday, the Technology Association of Iowa organized a #TechTownHall on campus, where the discussion centered on creating and retaining a pool of educated people to participate in, and help grow, the local tech sector. I'm a little concerned that the TAI blog says that "A major topic was curriculum and preparing students to provide immediate value to technology employers upon graduation." That's not what universities do best. But then, that is often what employers want and need.

Second, over the last two mornings, I read James Fallows's classic The Case Against Credentialism, from the archives of The Atlantic. Fallows gives a detailed account of the "professionalization" of many lines of work in the US and the role that credentials, most prominently university degrees, have played in the movement. He concludes that our current approach is biased heavily toward evaluating the "inputs" to the system, such as early success in school and other demonstrations of talent while young, rather than assessing the outputs, namely, how well people actually perform after earning their credentials.

Two passages toward the end stood out for me. In one, Fallows wonders if our professionalized society creates the wrong kind of incentives for young people:

An entrepreneurial society is like a game of draw poker; you take a lot of chances, because you're rarely dealt a pat hand and you never know exactly what you have to beat. A professionalized society is more like blackjack, and getting a degree is like being dealt nineteen. You could try for more, but why?

Keep in mind that this article appeared in 1985. Entrepreneurship has taken a much bigger share of the public conversation since then, especially in the tech world. Still, most students graduating from college these days are likely thinking of ways to convert their nineteens into steady careers, not ways to risk it all on the next Amazon or Über.

Then this quote from "Steven Ballmer, a twenty-nine-year-old vice-president of Microsoft", on how the company looked for new employees:

We go to colleges not so much because we give a damn about the credential but because it's hard to find other places where you have large concentrations of smart people and somebody will arrange the interviews for you. But we also have a lot of walk-on talent. We're looking for programming talent, and the degree is in no way, shape, or form very important. We ask them to send us a program they've written that they're proud of. One of our superstars here is a guy who literally walked in off the street. We talked him out of going to college and he's been here ever since.

Who would have guessed in 1985 the visibility and impact that Ballmer would have over the next twenty years? Microsoft has since evolved from the entrepreneurial upstart to the staid behemoth, and now is trying to reposition itself as an important player in the new world of start-ups and mobile technology.

Attentive readers of this blog may recall that I fantasize occasionally about throwing off the shackles of the modern university, which grow more restrictive every year as the university takes on more of the attributes of corporate and government bureaucracy. In one of my fantasies, I organize a new kind of preparatory school for prospective software developers, one with a more modern view of learning to program but also an attention to developing the whole person. That might not satisfy corporate America's need for credentials, but it may well prepare students better for a world that needs poker players as much as it needs blackjack players. But where would the students come from?

So, on a cloudy graduation day, I think about Fallows's suggestion that more focused vocational training is what many grads need, about the real value of a liberal university education to both students and society, and about how we can best prepare CS students to participate in the world. It is a world that needs not only their technical skills but also their understanding of what tech can and cannot do. As a society, we need them to take a prominent role in civic and political discourse.

One final note on the Fallows piece. It is a bit long, dragging a bit in the middle like a college research paper, but opens and closes strongly. With a little skimming through parts of less interest, it is worth a read. Thanks to Brian Marick for the recommendation.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

April 30, 2015 6:00 PM

Software is a Means of Communication, Just Like a Research Paper

I can't let my previous post be my only comment on Software in Scientific Research. Hinsen's bigger point is worth a post of its own.

Software is a means of communication, just like papers or textbooks.

... much like the math that appears in a paper or a textbook -- except that, done properly, a computer program runs and provides a dynamic demonstration of an idea.

The main questions asked about scientific software [qua software] are "What does it do?" and "How efficient is it?" When considering software as a means of communication, we would ask questions such as "Is it well-written, clear, elegant?", "How general is the formulation?", or "Can I use it as the basis for developing new science?".

This shift requires a different level of understanding of programs and programming than many scientists (and other people who do not program for a living) have. But it is a shift that needs to take place, so we should do all we can to help scientists and others become more fluent. (Hey to Software Carpentry and like-minded efforts.)

We take for granted that all researchers are responsible for being able to produce and, more importantly, understand the other essential parts of scientific communication:

We actually accept as normal that the scientific contents of software, i.e., the models implemented by it, are understandable only to software specialists, meaning that for the majority of users, the software is just a black box. Could you imagine this for a paper? "This paper is very obscure, but the people who wrote it are very smart, so let's trust them and base our research on their conclusions." Did you ever hear such a claim? Not me.

This is a big part of the challenge we face in getting faculty across the university to see the vital role that computing should play in modern education -- as well as the roles it should not play. The same is true in the broader culture. We'll see if efforts such as can make a dent in this challenge.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

April 29, 2015 1:52 PM

Beautiful Sentences: Scientific Data as Program

On the way to making a larger point about the role of software in scientific research, Konrad Hinsen writes these beautiful sentences:

Software is just data that can be interpreted as instructions for a computer. One could conceivably write some interpreter that turns previously generated data into software by executing it.

They express one side of one of the great ideas of computer science, the duality of program and data:

  • Every program is data to some other program, and
  • every set of data is a program to some machine.
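A minimal sketch in Python can make both halves of the duality concrete. The instruction names and the little stack machine are invented for illustration; they do not come from Hinsen's article.

```python
def run(program):
    """Interpret a list of tuples as instructions for a small stack machine."""
    stack = []
    for instruction in program:
        op = instruction[0]
        if op == "push":
            stack.append(instruction[1])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction: {op!r}")
    return stack.pop()

# To Python, this is just data: a list of tuples...
data = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]

# ...but to the machine defined by run(), it is a program
# that computes (2 + 3) * 4.
result = run(data)

# Because the program is also data, another program can inspect it.
pushes = len([i for i in data if i[0] == "push"])

print(result, pushes)
```

The same list plays both roles: run() treats it as a program, while the last few lines treat it as plain data to be counted.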

This is one of the reasons why it is so important for CS students to study the principles of programming languages, create languages, and build interpreters. These activities help bring this great idea to life and prepare those who understand it to solve problems in ways that are otherwise hard to imagine.

Besides, the duality is a thing of beauty. We don't have to use it as a tool in order to appreciate this sublime truth.

As Hinsen writes, few people outside of computer science (and, sadly, too many within CS) appreciate "the particular status of software as both an information carrier and a tool". The same might be said for our appreciation of data, and the role that language plays in bridging the gap between the two.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

April 20, 2015 4:02 PM

"Disjunctive Inference" and Learning to Program

Over the weekend, I read Hypothetical Reasoning and Failures of Disjunctive Inference, a well-sourced article on the problems people have making disjunctive inferences. It made me think about some of the challenges students have learning to program.

Disjunctive inference is reasoning that requires us to consider hypotheticals. A simple example from the article is "the married problem":

Jack is looking at Ann, but Ann is looking at George. Jack is married, but George is not. Is a married person looking at an unmarried person?
  1. Yes.
  2. No.
  3. Cannot be determined.

The answer is yes, of course, which is obvious if we consider the two possible cases for Ann. Most people, though, stop thinking as soon as they realize that the answer hinges on Ann's status. They don't know her status, so they can't know the answer to the question. Even so, most everyone understands the answer as soon as the reasoning is explained to them.
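The case analysis can be written out as a short Python sketch. The function name and the dictionary encoding of the puzzle are assumptions made for this example, not anything from the article.

```python
def married_looks_at_unmarried(married):
    """Is any married person looking at an unmarried person?"""
    looking_at = [("Jack", "Ann"), ("Ann", "George")]
    return any(married[a] and not married[b] for a, b in looking_at)

# We don't know Ann's status, so consider both hypotheticals.
answers = {
    ann: married_looks_at_unmarried(
        {"Jack": True, "Ann": ann, "George": False})
    for ann in (True, False)
}

# If Ann is married, she is looking at unmarried George.
# If Ann is unmarried, married Jack is looking at her.
print(answers)
print(all(answers.values()))
```

The answer is the same in both cases, so it does not depend on Ann's status at all. Enumerating the cases is exactly the step most people skip.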

The reasons behind our difficulties handling disjunctive inferences are complex, including both general difficulties we have with hypotheticals and a cognitive bias sometimes called cognitive miserliness: we seek to apply the minimum amount of effort to solving problems and making decisions. This is a reasonable evolutionary bias in many circumstances, but here it is maladaptive.

The article is fascinating and well worth a full read. It points to a number of studies in cognitive psychology that seek to understand how humans behave in the face of disjunctive inferences, and why. It closes with some thoughts on improving disjunctive reasoning ability, though there are no quick fixes.

As I read the article, it occurred to me that learning to program places our students in a near-constant state of hypothetical reasoning and disjunctive inference. Tracing code that contains an if statement asks them to think about alternative paths and alternative outcomes. To understand what is true after the if statement executes is disjunctive inference.

Something similar may be true for a for loop, which executes once each for multiple values of a counter, and a while loop, which runs an indeterminate number of times. These aren't disjunctive inferences, but they do require students to think hypothetically. I wonder if the trouble many of my intro CS students had last semester learning function calls involved failures of hypothetical reasoning as much as it involved difficulties with generalization.

And think about learning to debug a program.... How much of that process involves hypotheticals and even full-on disjunctive inference? If most people have trouble with this sort of reasoning even on simple tasks, imagine how much harder it must be for young people who are learning a programming language for the first time and trying to reason about programs that are much more complex than "the married problem"?

Thinking explicitly about this flaw in human thinking may help us teachers do a better job helping students to learn. In the short term, we can help them by giving more direct prompts for how to reason. Perhaps we can also help them learn to prompt themselves when faced with certain kinds of problems. In the longer term, we can perhaps help them to develop a process for solving problems that mitigates the bias. This is all about forming useful habits of thought.

If nothing else, reading this article will help me be slower to judge my students' work ethic. What looks like laziness is more likely a manifestation of a natural bias to exert the minimum amount of effort to solving problems. We are all cognitive misers to a certain extent, and that serves us well. But not always when we are writing and debugging programs.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 30, 2015 3:33 PM

Reminiscing on the Effects of Photoshop

Thomas Knoll, one of the creators of Adobe Photoshop, reminisces on the insight that gave rise to the program. His brother, John, worked on analog image composition at Industrial Light and Magic, where they had just begun to experiment with digital processing.

[ILM] had a scanner that could scan in frames from a movie, digitally process them, and then write the images out to film again.

My brother saw that and had a revelation. He said, "If we convert the movie footage into numbers, and we can convert the numbers back into movie footage, then once it's in the numerical form we could do anything to it. We'd have complete power."

I bought my first copy of Photoshop in the summer of 1992, as part of my start-up package for new faculty. In addition to the hardware and software I needed to do my knowledge-based systems research, we also outfitted the lab with a number of other tools, including Aldus Persuasion, a LaCie digital scanner, OmniPage Pro software for OCR, Adobe Premiere, and Adobe Photoshop. I felt like I could do anything I wanted with text, images, and video. It was a great power.

In truth, I barely scratched the surface of what was possible. Others took Photoshop and went places that even Adobe didn't expect them to go. The Knoll brothers sensed what was possible, but it must have been quite something to watch professionals and amateurs alike use the program to reinvent our relationship with images. Here is Thomas Knoll again:

Photoshop has so many features that make it extremely versatile, and there are artists in the world who do things with it that are incredible. I suppose that's the nature of writing a versatile tool with some low-level features that you can combine with anything and everything else.

Digital representation opens new doors for manipulation. When you give users control at both the highest levels and the lowest, who knows what they will do. Stand back and wait.

Posted by Eugene Wallingford | Permalink | Categories: Computing

March 13, 2015 3:07 PM

Two Forms of Irrelevance

When companies become irrelevant to consumers.
From The Power of Marginal, by Paul Graham:

The big media companies shouldn't worry that people will post their copyrighted material on YouTube. They should worry that people will post their own stuff on YouTube, and audiences will watch that instead.

You mean Grey's Anatomy is still on the air? (Or, as today's teenagers say, "Grey's what?")

When people become irrelevant to intelligent machines.
From Outing A.I.: Beyond the Turing Test, by Benjamin Bratton:

I argue that we should abandon the conceit that a "true" Artificial Intelligence must care deeply about humanity -- us specifically -- as its focus and motivation. Perhaps what we really fear, even more than a Big Machine that wants to kill us, is one that sees us as irrelevant. Worse than being seen as an enemy is not being seen at all.

Our new computer overlords indeed. This calls for a different sort of preparation than studying lists of presidents and state capitals.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

March 04, 2015 3:28 PM

Code as a Form of Expression, Even Spreadsheets

Even formulas in spreadsheets, even back in the early 1980s:

Spreadsheet models have become a form of expression, and the very act of creating them seem to yield a pleasure unrelated to their utility. Unusual models are duplicated and passed around; these templates are sometimes used by other modelers and sometimes only admired for their elegance.

People love to make and share things. Computation has given us another medium in which to work, and the things people make with it are often very cool.

The above passage comes from Steven Levy's A Spreadsheet Way of Knowledge, which appeared originally in Harper's magazine in November 1984. He re-published it on Medium this week in belated honor of Spreadsheet Day last October 17, which was the 35th anniversary of VisiCalc, "the Apple II program that started it all". It's a great read, both as history and as a look at how new technologies create unexpected benefits and dangers.

Posted by Eugene Wallingford | Permalink | Categories: Computing

February 27, 2015 3:37 PM

Bad Habits and Haphazard Design

With an expressive type system for its teaching
languages, HtDP could avoid this problem to some
extent, but adding such rich types would also take
the fun out of programming.

As we approach the midpoint of the semester, Matthias Felleisen's Turing Is Useless strikes a chord in me. My students have spent the last two months learning a little Racket, a little functional programming, and a little about how to write data-driven recursive programs. Yet bad habits learned in their previous courses, or at least unchecked by what they learned there, have made the task harder for many of them than it needed to be.

The essay's title plays off the Church-Turing thesis, which asserts that all programming languages have the same expressive power. This powerful claim is not good news for students who are learning to program, though:

Pragmatically speaking, the thesis is completely useless at best -- because it provides no guideline whatsoever as to how to construct programs -- and misleading at worst -- because it suggests any program is a good program.

With a Turing-universal language, a clever student can find a way to solve any problem with some program. Even uninspired but persistent students can tinker their way to a program that produces the right answers. Unfortunately, they don't understand that the right answers aren't the point; the right program is. Trolling StackOverflow will get them a program, but too often the students don't understand whether it is a good or bad program in their current situation. It just works.

I have not been as faithful to the HtDP approach this semester as I probably should have been, but I share its desire to help students to design programs systematically. We have looked at design patterns that implement specific strategies, not language features. Each strategy focuses on the definition of the data being processed and the definition of the value being produced. This has great value for me as the instructor, because I can usually see right away why a function isn't working for the student the way he or she intended: they have strayed from the data as defined by the problem.

This is also of great value to some of my students. They want to learn how to program in a reliable way, and having tools that guide their thinking is more important than finding yet another primitive Racket procedure to try. For others, though, "garage programming" is good enough; they just want to get the job done right now, regardless of which muscles they use. Design is not part of their attitude, and that's a hard habit to break. How use doth breed a habit in a student!

Last semester, I taught intro CS from what Felleisen calls a traditional text. Coupling that experience with my experience so far this semester, I'm thinking a lot these days about how we can help students develop a design-centered attitude at the outset of their undergrad courses. I have several blog entries in draft form about last semester, but one thing that stands out is the extent to which every step in the instruction is driven by the next cool programming construct. Put them all on the table, fiddle around for a while, and you'll make something that works. One conclusion we can draw from the Church-Turing thesis is that this isn't surprising. Unfortunately, odds are any program created this way is not a very good program.


(The sentence near the end that sounds like Shakespeare is. It's from The Two Gentlemen of Verona, with a suitable change in noun.)

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

February 06, 2015 3:11 PM

What It Feels Like To Do Research

In one sentence:

Unless you tackle a problem that's already solved, which is boring, or one whose solution is clear from the beginning, mostly you are stuck.

This is from Alec Wilkinson's The Pursuit of Beauty, about mathematician Yitang Zhang, who worked a decade on the problem of bounded gaps between prime numbers. As another researcher says in the article,

When you try to prove a theorem, you can almost be totally lost to knowing exactly where you want to go. Often, when you find your way, it happens in a moment, then you live to do it again.

Programmers get used to never feeling normal, but tackling the twin prime problem is on a different level altogether. The same is true for any deep open question in math or computing.

I strongly recommend Wilkinson's article. It describes what life for untenured mathematicians is like, and how a single researcher can manage to solve an important problem.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

January 28, 2015 3:38 PM

The Relationship Between Coding and Literacy

Many people have been discussing Chris Granger's recent essay Coding is not the New Literacy, and most seem to approve of his argument. Reading it brought to my mind this sentence from Alan Kay in VPRI Memo M-2007-007a, The Real Computer Revolution Hasn't Happened Yet:

Literacy is not just being able to read and write, but being able to deal fluently with the kind of ideas that are important enough to write about and discuss.

Literacy requires both the low-level skills of reading and writing and the higher-order capacity for using them on important ideas.

That is one thing that makes me uneasy about Granger's argument. It is true that teaching people only low-level coding skills won't empower them if they don't know how to use them fluently to build models that matter. But neither will teaching them how to build models without giving them access to the programming skills they need to express their ideas beyond what some tool gives them.

Like Granger, though, I am also uneasy about many of the learn-to-code efforts. Teaching people enough Javascript or Ruby to implement a web site out of the box skips past the critical thinking skills that people need to use computation effectively in their world. They may be "productive" in the short term, but they are also likely to hit a ceiling pretty soon. What then? My guess: they become frustrated and stop coding altogether.

We sometimes do a better job introducing programming to kids, because we use tools that allow students to build models they care about and can understand. In the VPRI memo, Kay describes experiences teaching elementary school students to use eToys to model physical phenomena. In the end, they learn physics and the key ideas underlying calculus. But they also learn the fundamentals of programming, in an environment that opens up into Squeak, a flavor of Smalltalk.

I've seen teachers introduce students to Scratch in a similar way. Scratch is a drag-and-drop programming environment, but it really is an open-ended and lightweight modeling tool. Students can learn low-level coding skills and higher-level thinking skills in tandem.

That is the key to making Granger's idea work in the best way possible. We need to teach people how to think about and build models in a way that naturally evolves into programming. I am reminded of another quote from Alan Kay that I heard back in the 1990s. He reminded us that kindergarteners learn and use the same language that Shakespeare used. It is possible for their fluency in the language to grow to the point where they can comprehend some of the greatest literature ever created -- and, if they possess some of Shakespeare's genius, to write their own great literature. English starts small for children, and as they grow, it grows with them. We should aspire to do the same thing for programming.

Granger reminds us that literacy is really about composition and comprehension. But it doesn't do much good to teach people how to solidify their thoughts so that they can be written if they don't know how to write. You can't teach composition until your students know basic reading and writing.

Maybe we can find a way to teach people how to think in terms of models and how to implement models in programs at the same time, in a language system that grows along with their understanding. Granger's latest project, Eve, may be a step in that direction. There are plenty of steps left for us to take in the direction of languages like Scratch, too.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 18, 2015 10:26 AM

The Infinite Horizon

In Mathematics, Live: A Conversation with Laura DeMarco and Amie Wilkinson, Amie Wilkinson recounts the pivotal moment when she knew she wanted to be a mathematician. Insecure about her abilities in mathematics, unsure about what she wanted to do for a career, and with no encouragement, she hadn't applied to grad school. So:

I came back home to Chicago, and I got a job as an actuary. I enjoyed my work, but I started to feel like there was a hole in my existence. There was something missing. I realized that suddenly my universe had become finite. Anything I had to learn for this job, I could learn eventually. I could easily see the limits of this job, and I realized that with math there were so many things I could imagine that I would never know. That's why I wanted to go back and do math. I love that feeling of this infinite horizon.

After having written software for an insurance company during the summers before and after my senior year in college, I knew all too well the "hole in my existence" that Wilkinson talks about, the shrinking universe of many industry jobs. I was deeply interested in the ideas I had found in Gödel, Escher, Bach, and in the idea of creating an intelligent machine. There seemed no room for those ideas in the corporate world I saw.

I'm not sure when the thought of graduate school first occurred to me, though. My family was blue collar, and I didn't have much exposure to academia until I got to Ball State University. Most of my friends went out to get jobs, just like Wilkinson. I recall applying for a few jobs myself, but I never took the job search all that seriously.

At least some of the credit belongs to one of my CS professors, Dr. William Brown. Dr. Brown was an old IBM guy who seemed to know so much about how to make computers do things, from the lowest-level details of IBM System/360 assembly language and JCL up to the software engineering principles needed to write systems software. When I asked him about graduate school, he talked to me about how to select a school and a Ph.D. advisor. He also talked about the strengths and weaknesses of my preparation, and let me know that even though I had some work to do, I would be able to succeed.

These days, I am lucky even to have such conversations with my students.

For Wilkinson, DeMarco and me, academia was a natural next step in our pursuit of the infinite horizon. But I now know that we are fortunate to work in disciplines where a lot of the interesting questions are being asked and answered by people working in "the industry". I watch with admiration as many of my colleagues do amazing things while working for companies large and small. Computer science offers so many opportunities to explore the unknown.

Reading Wilkinson's recollection brought a flood of memories to mind. I'm sure I wasn't alone in smiling at her nod to finite worlds and infinite horizons. We have a lot to be thankful for.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

January 16, 2015 2:59 PM

Programming Language As Artistic Medium

Says Ramsey Nasser:

I have always been fascinated by esolangs. They are the such an amazing intersection of technical and formal rigor on one hand and nerdy inside humor on the other. The fact that they are not just ideas, but *actual working languages* is incredible. Its something that could only exist in a field as malleable and accessible as code. NASA engineers cannot build a space station as a joke.

Because we can create programming languages as a joke, or for any other reason, a programming language can be both message and medium.

a Hello, World program in Piet

Esolang is enthusiast shorthand for esoteric programming language. I'm not an enthusiast on par with many, but I've written a few Ook! interpreters and played around with others. Piet is the most visually appealing of the esoteric languages I've encountered. The image to the right is a "Hello, World" program written in Piet, courtesy of the Wikimedia Commons.

Recently I have been reading more about the work of Nasser, a computer scientist and artist formerly at the Eyebeam Art + Technology Center. In 2010, he created the Zajal programming language as his MFA thesis project at the Parsons School of Design. Zajal was inspired by Processing and runs on top of Ruby. A couple of years ago, he received widespread coverage for Qalb, a language with Arabic script characters and a Scheme-like syntax. Zajal enables programmers to write programs with beautiful output; Qalb enables programmers to write programs that are themselves quite beautiful.

I wouldn't call Zajal or Qalb esoteric programming languages. They are, in an important way, quite serious, exploring the boundary between "creative vision" and software. As he says at the close of the interview quoted above, we now live in a world in which "code runs constantly in our pockets":

Code is a driving element of culture and politics, which means that code that is difficult to reason about or inaccessible makes for a culture and politics that are difficult to reason about and inaccessible. The conversation about programming languages has never been more human than it is now, and I believe this kind of work will only become more so as software spreads.

As someone who teaches computer science students to think more deeply about programming languages, I would love to see more and different kinds of people entering the conversation.

Posted by Eugene Wallingford | Permalink | Categories: Computing

January 12, 2015 10:26 AM

WTF Problems and Answers for Questions Unasked

Dan Meyer quotes Scott Farrand in WTF Math Problems:

Anything that makes students ask the question that you plan to answer in the lesson is good, because answering questions that haven't been asked is inherently uninteresting.

My challenge this semester: getting students to ask questions about the programming languages they use and how they work. I myself have many questions about languages! My experience teaching our intro course last semester reminded me that what interests me (and textbook authors) doesn't always interest my students.

If you have any WTF? problems for a programming languages course, please share.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 09, 2015 3:40 PM

Computer Science Everywhere, Military Edition

Military Operations Orders are programs that are executed by units. Code re-use and other software engineering principles applied regularly to these.

An alumnus of my department, a CS major-turned-military officer, wrote those lines in an e-mail responding to my recent post, A Little CS Would Help a Lot of College Grads. Contrary to what many people might imagine, he has found what he learned in computer science to be quite useful to him as an Army captain. And he wasn't even a programmer:

One of the biggest skills I had over my peers was organizing information. I wasn't writing code, but I was handling lots of data and designing systems for that data. Organizing information in a way that was easy to present to my superiors was a breeze and having all the supporting data easily accessible came naturally to me.

Skills and principles from software engineering and project development apply to systems other than software. They also provide a vocabulary for talking about ideas that non-programmers encounter every day:

I did introduce my units to the terms border cases, special cases, and layers of abstraction. I cracked a smile every time I heard those terms used in a meeting.

Excel may not be a "real programming language", but knowing the ways in which it is a language can make managers of people and resources more effective at what they do.

For more about how a CS background has been useful to this officer, check out CS Degree to Army Officer, a blog entry that expands on his experiences.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

December 31, 2014 10:15 AM

Reinventing Education by Reinventing Explanation

One of the more important essays I read in 2014 was Michael Nielsen's Reinventing Explanation. In it, Nielsen explores how we might design media that help us explain scientific ideas better than we are able with our existing tools.

... it's worth taking non-traditional media seriously not just as a vehicle for popularization or education, which is how they are often viewed, but as an opportunity for explanations which can be, in important ways, deeper.

This essay struck me deep. Nielsen wants us to consider how we might take what we have learned using non-traditional media to popularize and educate and use it to think about how to explain more deeply. I think that learning how to use non-traditional media to explain more deeply will help us change the way we teach and learn.

In too many cases, new technologies are used merely as substitutes for old technology. The web has led to an explosion of instructional video aimed at all levels of learners. No matter how valuable these videos are, most merely replace reading a textbook or a paper. But computational technology enables us to change the task at hand and even redefine what we do. Alan Kay has been telling this story for decades, pointing us to the work of Ivan Sutherland and many others from the early days of computing.

Nielsen points to Bret Victor as an example of someone trying to develop tools that redefine how we think. As Victor himself says, he is following in the grand tradition of Kay, Sutherland, et al. Victor's An Ill-Advised Personal Note about "Media for Thinking the Unthinkable" is an especially direct telling of his story.

Vi Hart is another. Consider her recent Parable of the Polygons, created with Nicky Case, which explains dynamically how local choices can create systemic bias. This simulation uses computation to help people think differently about an idea they might not understand as viscerally from a traditional explanation. Hart has a long body of work using visualization to explain differently, and the introduction of computing extends the depth of her approach.

Over the last few weeks, I have felt myself being pulled by Nielsen's essay and the example of people such as Victor and Hart to think more about how we might design media that help us to teach and explain scientific ideas more deeply. Reinventing explanation might help us reinvent education in a way that actually matters. I don't have a research agenda yet, but looking again at Victor's work is a start.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 28, 2014 11:12 AM

A Little CS Would Help a Lot of College Grads

I would love to see more CS majors, but not everyone should major in CS. I do think that most university students could benefit from learning a little programming. There are plenty of jobs not only for CS and math grads, but also for other majors who have CS and math skills:

"If you're an anthropology major and you want to get a marketing job, well, guess what? The toughest marketing jobs to fill require SQL skills," Sigelman says. "If you can ... along the peripheries of your academic program accrue some strong quantitative skills, you'll still have the advantage [in the job market]." Likewise, some legal occupations (such as intellectual property law) and maintenance and repair jobs stay open for long periods of time, according to the Brookings report, if they require particular STEM skills.

There is much noise these days about the importance of STEM, both for educated citizens and for jobs, jobs, jobs. STEM isn't an especially cohesive category, though, as the quoted Vox article reminds us, and even when we look just at economic opportunity, it misleads. We don't need more college science graduates from every STEM discipline. We do need more people with the math and CS skills that now pervade the workplace, regardless of discipline. As Kurtzleben says in the article, "... characterizing these skill shortages as a broad STEM crisis is misleading to students, and has distorted the policy debate."

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 27, 2014 8:47 AM

Let's Not Forget: CS 1 Is Hard For Most Students

... software is hard. It's harder than
anything else I've ever had to do.
-- Donald Knuth

As students were leaving my final CS 1 lab session of the semester, I overheard two talking about their future plans. One student mentioned that he was changing his major to actuarial science. I thought, wow, that's a tough major. How is a student who is struggling with basic programming going to succeed there?

When I checked on his grades, though, I found that he was doing fine in my course, about average. I also remembered that he had enjoyed best the programming exercises that computed terms of infinite arithmetic series and other crazy mathematical values that his classmates often found impenetrable. Maybe actuarial science, even with some hard math, will be a good fit for him.

It really shouldn't surprise us that some students try computer science and decide to major in something else, even something that looks hard to most people. Teaching CS 1 again this semester after a long break reminded me just how much we expect from the students in our introductory course:

  • Details. Lots and lots of details. Syntax. Grammar. Vocabulary, both in a programming language and about programming more generally. Tools for writing, editing, compiling, and running programs.

  • Experimentation. Students have to design and execute experiments in order to figure out how language constructs work and to debug the programs they write. Much of what they learn is by trial and error, and most students have not yet developed skills for doing that in a controlled fashion.

  • Design. Students have to decompose problems and combine parts into wholes. They have to name things. They have to connect the names they see with ideas from class, the text, and their own experience.

  • Abstraction. Part of the challenge in design comes from abstraction, but abstract ideas are everywhere in learning about CS and how to program. Variables, choices, loops and recursion, functions and arguments and scope, ... all come not just as concrete forms but also as theoretical notions. These notions can sometimes be connected to the students' experience of the physical world, but the computing ideas are often just different enough to disorient the student. Other CS abstractions are so different as to appear unique.

In a single course, we expect students to perform tasks in all three of these modes, while mastering a heavy load of details. We expect them to learn by deduction, induction, and abduction, covering many abstract ideas and many concrete details. Many disciplines have challenging first courses, but CS 1 requires an unusual breadth of intellectual tools.

Yes, we can improve our students' experience with careful pedagogy. Over the last few decades we've seen many strong efforts. And yes, we can help students through the process with structural support, emotional support, and empathy. In the end, though, we must keep this in mind: CS 1 is going to be a challenge for most students. For many, the rewards will be worth the struggle, but that doesn't mean it won't take work, patience, and persistence along the way -- by both the students and the teachers.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 26, 2014 8:32 AM

Editing and the Illusion of Thought

Martin Amis, in The Paris Review, The Art of Fiction No. 151:

By the way, it's all nonsense about how wonderful computers are because you can shift things around. Nothing compares with the fluidity of longhand. You shift things around without shifting them around--in that you merely indicate a possibility while your original thought is still there. The trouble with a computer is that what you come out with has no memory, no provenance, no history--the little cursor, or whatever it's called, that wobbles around the middle of the screen falsely gives you the impression that you're thinking. Even when you're not.

My immediate reaction was that Mr. Amis needs version control, but there is something more here.

When writing with pencil and paper, we work on an artifact that embodies the changes it has gone through. We see the marks and erasures; we see the sentence where it once was at the same time we see the arrow telling us where it now belongs. When writing in a word processor, our work appears complete, even timeless, though we know it isn't. Mark-up mode lets us see some of the document's evolution, but the changes feel more distant from our minds. They live out there.

I empathize with writers like Amis, whose experience predates the computer. Longhand feels different. Teasing out what was valuable, even essential, in previous experience and what was merely the limitation of our tools is one of the great challenges of any time. How do we make new tools that are worth the change, that enable us to do more and better?

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

December 24, 2014 2:05 PM

Computer Science Everywhere, Christmas Eve Edition

Urmson says Google is better positioned than a traditional automaker to crack the riddle of self-driving, because it's more about software than hardware: "When you look at what we're doing, on the surface, you see a vehicle. But the heart of it is computer science."

That is Chris Urmson, the head of Google's self-driving car program, quoted in this article. (Apparently, senior citizens are a natural market for driverless cars.)

Everywhere we look these days, we see gadgets. Increasingly, though, at the heart of them is computer science.

Posted by Eugene Wallingford | Permalink | Categories: Computing

November 25, 2014 1:43 PM

Concrete Play Trumps All

Areschenko-Johannessen, Bundesliga 2006-2007

One of the lessons taught by the computer is that concrete play trumps all.

This comment appeared in the review of a book of chess analysis [ paywalled ]. The reviewer is taking the author to task for talking about the positional factors that give one player "a stable advantage" in a particular position, when a commercially-available chess program shows the other player can equalize easily, and perhaps even gain an advantage.

It is also a fitting comment on our relationship with computers these days more generally. In areas such as search and language translation, Google helped us see that conventional wisdom can often be upended by a lot of data and many processors. In AI, statistical techniques and neural networks solve problems in ways that models of human cognition cannot. Everywhere we turn, it seems, big data and powerful computers are helping us to redefine our understanding of the world.

We humans need not lose all hope, though. There is still room for building models of the world and using them to reason, just as there is room for human analysis of chess games. In chess, computer analysis is pushing grandmasters to think differently about the game. The result is a different kind of understanding for the more ordinary of us, too. We just have to be careful to check our abstract understanding against computer analysis. Concrete play trumps all, and it tests our hypotheses. That's good science, and good thinking.


(The chess position is from Areschenko-Johannessen 2006-2007, used as an example in Chess Training for Post-Beginners by Yaroslav Srokovski and cited in John Hartmann's review of the book in the November 2014 issue of Chess Life.)

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

November 23, 2014 8:50 AM

Supply, Demand, and K-12 CS

When I meet with prospective students and their parents, we often end up discussing why most high schools don't teach computer science. I tell them that, when I started as a new prof here, about a quarter of incoming freshmen had taken a year of programming in high school, and many other students had had the opportunity to do so. My colleagues and I figured that this percentage would go way up, so we began to think about how we might structure our first-year courses when most or all students already knew how to program.

However, the percentage of incoming students with programming experience didn't go up. It went way down. These days, about 10% of our freshmen know how to program when they start our intro course. Many of those learned what they know on their own. What happened, today's parents ask?

A lot of things happened, including the dot-com bubble, a drop in the supply of available teachers, a narrowing of the high school curriculum in many districts, and the introduction of high-stakes testing. I'm not sure how much each contributed to the change, or whether other factors may have played a bigger role. Whatever the causes, the result is that our intro course still expects no previous programming experience.

Yesterday, I saw a post by a K-12 teacher on the Racket users mailing list that illustrates the powerful pull of economics. He is leaving teaching for the software development industry, though reluctantly. "The thing I will miss the most," he says, "is the enjoyment I get out of seeing youngsters' brains come to life." He also loves seeing them succeed in the careers that knowing how to program makes possible. But in that success lies the seed of his own career change:

Speaking of my students working in the field, I simply grew too tired of hearing about their salaries which, with a couple of years experience, was typically twice what I was earning with 25+ years of experience. Ultimately that just became too much to take.

He notes that college professors probably know the feeling, too. The pull must be much stronger on him and his colleagues, though; college CS professors are generally paid much better than K-12 teachers. A love of teaching can go only so far. At one level, we should probably be surprised that anyone who knows how to program well enough to teach thirteen- or seventeen-year-olds to do it stays in the schools. If not surprised, we should at least be deeply appreciative of the people who do.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

November 20, 2014 3:23 PM

When I Procrastinate, I Write Code

I procrastinated one day with my intro students in mind. This is the bedtime story I told them as a result. Yes, I know that I can write shorter Python code to do this. They are intro students, after all.


Once upon a time, a buddy of mine, Chad, sent out a tweet. Chad is a physics prof, and he was procrastinating. How many people would I need to have in class, he wondered, to have a 50-50 chance that my class roster will contain people whose last names start with every letter of the alphabet?


This is a lot like the old trivia about how we only need to have 23 people in the room to have a 50-50 chance that two people share a birthday. The math for calculating that is straightforward enough, once you know it. But last names are much more unevenly distributed across the alphabet than birthdays are across the days of the year. To do this right, we need to know rough percentages for each letter of the alphabet.
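For the curious, the birthday math can be sketched in a few lines of Python. This is only a sketch, assuming every day of a 365-day year is equally likely; the function name is made up for illustration. It multiplies together the chances that each new person misses all of the birthdays already taken, then subtracts from 1.

```python
def chance_of_shared_birthday(people, days=365):
    """Chance that at least two of `people` share a birthday.

    Assumes all days are equally likely and ignores leap years.
    """
    chance_all_different = 1.0
    for i in range(people):
        # person number i must avoid the i birthdays already taken
        chance_all_different *= (days - i) / days
    return 1 - chance_all_different

for people in (22, 23):
    print(people, '->', chance_of_shared_birthday(people))
```

With 22 people the chance is a little under one half; with 23 it is a little over.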

I can procrastinate, too. So I surfed over to the US Census Bureau, rummaged around for a while, and finally found a page on Frequently Occurring Surnames from the Census 2000. It provides a little summary information and then links to a couple of data files, including a spreadsheet of data on all surnames that occurred at least 100 times in the 2000 census. This should, I figure, cover enough of the US population to give us a reasonable picture of how people's last names are distributed across the alphabet. So I grabbed it.

(We live in a wonderful time. Between open government, open research, and open source projects, we have access to so much cool data!)

The spreadsheet has columns with these headers:

    name,rank,count,prop100k,cum_prop100k,      \
                    pctwhite,pctblack,pctapi,   \

The first and third columns are what we want. After thirteen weeks, we know how to compute the percentages we need: Use the running total pattern to count the number of people whose name starts with 'a', 'b', ..., 'z', as well as how many people there are altogether. Then loop through our collection of letter counts and compute the percentages.

Now, how should we represent the data in our program? We need twenty-six counters for the letter counts, and one more for the overall total. We could make twenty-seven unique variables, but then our program would be so-o-o-o-o-o long, and tedious to write. We can do better.

For the letter counts, we might use a list, where slot 0 holds a's count, slot 1 holds b's count, and so on, through slot 25, which holds z's count. But then we would have to translate letters into slots, and back, which would make our code harder to write. It would also make our data harder to inspect directly.

    ----  ----  ----  ...  ----  ----  ----    slots in the list
     0     1     2    ...   23    24    25     indices into the list
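To make that translation concrete, here is a minimal sketch of what going from letters to slots and back might look like. The helper names are hypothetical, and the sketch assumes lowercase letters 'a' through 'z' only.

```python
# Hypothetical helpers, for illustration only.
def slot_for(letter):
    """Translate 'a'..'z' into a list index 0..25."""
    return ord(letter) - ord('a')

def letter_for(slot):
    """Translate a list index 0..25 back into 'a'..'z'."""
    return chr(slot + ord('a'))

letter_counts = [0] * 26              # one counter per letter
letter_counts[slot_for('c')] += 1     # count one name that starts with 'c'
print(letter_for(2), '->', letter_counts[2])
```

Every read and every update has to pass through one of these helpers, which is the extra work we would like to avoid.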

The downside of this approach is that lists are indexed by integer values, while we are working with letters. Python has another kind of data structure that solves just this problem, the dictionary. A dictionary maps keys onto values. The keys and values can be of just about any data type. What we want to do is map letters (characters) onto numbers of people (integers):

    ----  ----  ----  ...  ----  ----  ----    slots in the dictionary
    'a'   'b'   'c'   ...  'x'   'y'   'z'     indices into the dictionary

With this new tool in hand, we are ready to solve our problem. First, we build a dictionary of counters, initialized to 0.

    count_all_names = 0
    total_names = {}
    for letter in 'abcdefghijklmnopqrstuvwxyz':
        total_names[letter] = 0

(Note two bits of syntax here. We use {} for dictionary literals, and we use the familiar [] for accessing entries in the dictionary.)

Next, we loop through the file and update the running total for the corresponding letter, as well as the counter of all names.

    source = open('app_c.csv', 'r')
    source.readline()                    # skip the row of column headers
    for entry in source:
        field  = entry.split(',')        # split the line
        name   = field[0].lower()        # pull out lowercase name
        letter = name[0]                 # grab its first character
        count  = int( field[2] )         # pull out number of people
        total_names[letter] += count     # update letter counter
        count_all_names     += count     # update global counter
    source.close()                       # done reading the file

Finally, we print the letter → count pairs.

    for (letter, count_for_letter) in total_names.items():
        print(letter, '->', count_for_letter/count_all_names)

(Note the items method for dictionaries. It returns a collection of key/value tuples. Recall that tuples are simply immutable lists.)

We have converted the data file into the percentages we need.

    q -> 0.002206197888442366
    c -> 0.07694634659082318
    h -> 0.0726864447688946
    f -> 0.03450702533438715
    x -> 0.0002412718532764804
    k -> 0.03294646311104032

(The entries are not printed in alphabetical order. Can you find out why?)
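One way to get alphabetical output without leaving Python is to loop over the sorted keys of the dictionary. A small sketch, using made-up counts in place of the totals computed from the census file:

```python
# Made-up counts, standing in for the totals computed from the census file.
total_names = {'q': 2, 'c': 70, 'h': 66, 'a': 31}
count_all_names = sum(total_names.values())

# sorted() returns the dictionary's keys in alphabetical order
for letter in sorted(total_names):
    print(letter, '->', total_names[letter] / count_all_names)
```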

I dumped the output to a text file and used Unix's built-in sort to create my final result. I tweet Chad, Here are your percentages. You do the math.
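For anyone who would rather not do the math, a simulation is another way to attack Chad's question. The sketch below is an assumption about how one might do it, not Chad's method: draw many random rosters of first letters, weighted by the percentages, and count how often every letter shows up. The function names are hypothetical, and the equal weights in the usage are placeholders for the census percentages.

```python
import random

def covers_alphabet(roster_size, letters, weights, rng):
    """Draw one random roster of first letters; True if every letter appears."""
    drawn = rng.choices(letters, weights=weights, k=roster_size)
    return set(drawn) == set(letters)

def chance_of_full_alphabet(roster_size, weights_by_letter, trials=2000, seed=0):
    """Estimate the chance that a roster of this size covers every letter."""
    rng = random.Random(seed)
    letters = list(weights_by_letter)
    weights = [weights_by_letter[letter] for letter in letters]
    hits = 0
    for _ in range(trials):
        if covers_alphabet(roster_size, letters, weights, rng):
            hits += 1
    return hits / trials

# Placeholder weights: every letter equally likely. Real surnames are far
# less even, so the census percentages should replace these.
uniform = {letter: 1 for letter in 'abcdefghijklmnopqrstuvwxyz'}
print(chance_of_full_alphabet(100, uniform))
```

Trying larger and larger roster sizes until the estimate crosses 0.5 gives a rough answer. With the real percentages, rare letters such as q and x will push that size well above what equal weights suggest.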

Hey, I'm a programmer. When I procrastinate, I write code.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

November 11, 2014 7:53 AM

The Internet Era in One Sentence

I just love this:

When a 14-year-old kid can blow up your business in his spare time, not because he hates you but because he loves you, then you have a problem.

Clay Shirky attributes it to Gordy Thompson, who managed internet services at the New York Times in the early 1990s. Back then, it was insightful prognostication; today, it serves as an epitaph for many an old business model.

Are 14-year-old kids making YouTube videos to replace me yet?

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

October 31, 2014 2:52 PM

Ada Lovelace, AI Visionary

We hear a lot about Ada Lovelace being the first computer programmer, but that may not be her most impressive computing first. When I read Steven Johnson's The Tech Innovators of the Victorian Age I learned that she may have been the first modern person to envision the digital computer as a vehicle for an intelligent machine.

Though I have heard about Ada's work with Charles Babbage before, I didn't know any of the details. An engineer had written an essay about the Analytical Engine in Italian, and Lovelace set out to translate it into English. But she also added her own comments to the text as footnotes. It was in a footnote that she recorded "a series of elemental instruction sets that could be used to direct the calculations of the Analytical Engine". When people say Lovelace was the first computer programmer, they are referring to this footnote.

Some people contend that Lovelace did not write this program; rather, that Babbage had outlined some procedures and that she refined them. If that is true, then Lovelace and Babbage still conspired on a noteworthy act: they were the first people to collaborate on a program. How fitting that the first computer program was a team effort.

That is only the beginning. Writes Johnson,

But her greatest contribution lay not in writing instruction sets but, rather, in envisioning a range of utility for the machine that Babbage himself had not considered. "Many persons," she wrote, "imagine that because the business of the engine is to give its results in numerical notation, the nature of its processes must consequently be arithmetical and numerical, rather than algebraical and analytical. This is an error. The engine can arrange and combine its numerical quantities exactly as if they were letters or any other general symbols."

Lovelace foresaw the use of computation for symbol manipulation, analytical reasoning, and even the arts:

"Supposing, for instance, that the fundamental relations of pitched sounds in the science of harmony and musical composition were susceptible of such expressions and adaptations, the Engine might compose elaborate and scientific pieces of music of any degree of complexity or extent."

The Analytical Engine could be used to simulate intelligent behavior. Lovelace imagined artificial intelligence.

Johnson calls this perhaps the most visionary footnote in the history of print. That may be a bit over the top, but can you blame him? Most people of the 19th century could hardly conceive of the idea of a programmable computer. By the middle of the 20th century, many people understood that computers could implement arithmetic processes that would change many areas of life. But for most people, the idea of an "intelligent machine" was fantastic, not realistic.

In 1956, a group of visionary scientists organized the Dartmouth conferences to brainstorm from the belief that "every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it". The Dartmouth summer project may have been a seminal event in the history of AI. However, over a century earlier, Ada Lovelace saw the potential that a computing machine could partake in language and art. That may have been the first seminal moment in AI history.

Posted by Eugene Wallingford | Permalink | Categories: Computing

October 29, 2014 3:56 PM

Computing Future and Computing Past:

Administrative and teaching duties have been keeping me busy of late, but I've enjoyed following along with a throwback, shell-based Unix community started by Paul Ford and blogged about by him on his ~ford page there. It feels like 1986 to me, or maybe 2036. In one sense, it is much less than today's social networks. In many other ways, it is so much more. The spirit of learning and adventure and connecting is more important there than glitzy interface and data analytics and posturing for a public that consists of hundreds of Facebook 'friends' and Twitter 'followers'.

Ford mentions the trade-off in his long Medium article:

It's not like you can build the next Facebook or Twitter or Google on top of a huge number of Internet-connected Linux servers. Sure, Facebook, Twitter, and Google are built on top of a huge number of loosely connected Linux servers. But you know what I mean.

This project brings to mind a recent interview with writer William Gibson, in which he talks about the future and the past. In particular, this passage expresses a refreshingly different idea of what knowledge from the future would be most interesting -- and useful -- today:

If there were somehow a way for me to get one body of knowledge from the future -- one volume of the great shelf of knowledge of a couple of hundred years from now -- I would want to get a history. I would want to get a history book. I would want to know what they think of us.

I often wonder what the future will think of this era of computing, in which we dream too small and set the bar of achievement too low. We can still see the 1960s and 1970s in our rearview mirror, yet the dreams and accomplishments of that era are forgotten by so many people today -- even computer scientists, who rarely ever think about that time at all. Ford's community is the sort of project that looks backward and yet enables us to look forward. Eliminate as much noise as possible and see what evolves next. I'm curious to see where it goes.

Posted by Eugene Wallingford | Permalink | Categories: Computing

October 17, 2014 3:05 PM

Assorted Quotes

... on how the world evolves.

On the evolution of education in the Age of the Web. Tyler Cowen, in Average Is Over, via The Atlantic:

It will become increasingly apparent how much of current education is driven by human weakness, namely the inability of most students to simply sit down and try to learn something on their own.

I'm curious whether we'll ever see a significant change in the number of students who can and do take the reins for themselves.

On the evolution of the Web. Jon Udell, in A Web of Agreements and Disagreements:

The web works as well as it does because we mostly agree on a set of tools and practices. But it evolves when we disagree, try different approaches, and test them against one another in a marketplace of ideas. Citizens of a web-literate planet should appreciate both the agreements and the disagreements.

Some disagreements are easier to appreciate after they fade into history.

On the evolution of software. Nat Pryce on the Twitter, via The Problematic Culture of "Worse is Better":

Eventually a software project becomes a small amount of useful logic hidden among code that copies data between incompatible JSON libraries

Not all citizens of a web-literate planet appreciate disagreements between JSON libraries. Or Ruby gems.

On the evolution of start-ups. Rands, in The Old Guard:

... when [the Old Guard] say, "It feels off..." what they are poorly articulating is, "This process that you're building does not support one (or more) of the key values of the company."

I suspect the presence of incompatible JSON libraries means that our software no longer supports the key values of our company.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Managing and Leading, Software Development, Teaching and Learning

October 16, 2014 3:54 PM

For Programmers, There Is No "Normal Person" Feeling

I see this in the lab every week. One minute, my students sit peering at their monitors, their heads buried in their hands. They can't do anything right. The next minute, I hear shouts of exultation and turn to see them, arms thrust in the air, celebrating their latest victory over the Gods of Programming. Moments later I look up and see their heads again in their hands. They are despondent. "When will this madness end?"

Last week, I ran across a tweet from Christina Cacioppo that expresses nicely a feeling that has been vexing so many of my intro CS students this semester:

I still find programming odd, in part, because I'm either amazed by how brilliant or how idiotic I am. There's no normal-person feeling.

Christina is no beginner, and neither am I. Yet we know this feeling well. Most programmers do, because it's a natural part of tackling problems that challenge us. If we didn't bounce between feeling puzzlement and exultation, we wouldn't be tackling hard-enough problems.

What seems strange to my students, and even to programmers with years of experience, is that there doesn't seem to be a middle ground. It's up or down. The only time we feel like normal people is when we aren't programming at all. (Even then, I don't have many normal-person feelings, but that's probably just me.)

I've always been comfortable with this bipolarity, which is part of why I have always felt comfortable as a programmer. I don't know how much of this comfort is natural inclination -- a personality trait -- and how much of it is learned attitude. I am sure it's a mixture of both. I've always liked solving puzzles, which inspired me to struggle with them, which helped me get better at struggling with them.

Part of the job in teaching beginners to program is to convince them that this is a habit they can learn. Whatever their natural inclination, persistence and practice will help them develop the stamina they need to stick with hard problems and the emotional balance they need to handle the oscillations between exultation and despondency.

I try to help my students see that persistence and practice are the answer to most questions involving missing skills or bad habits. A big part of helping them see this is coaching and cheerleading, not teaching programming language syntax and computational concepts. Coaching and cheerleading are not always tasks that come naturally to computer science PhDs, who are often most comfortable with syntax and abstractions. As a result, many CS profs are uncomfortable performing them, even when that's what our students need most. How do we get better at performing them? Persistence and practice.

The "no normal-person feeling" feature of programming is an instance of a more general feature of doing science. Martin Schwartz, a microbiologist at the University of Virginia, wrote a marvelous one-page article called The importance of stupidity in scientific research that discusses this element of being a scientist. Here's a representative sentence:

One of the beautiful things about science is that it allows us to bumble along, getting it wrong time after time, and feel perfectly fine as long as we learn something each time.

Scientists get used to this feeling. My students can, too. I already see the resilience growing in many of them. After the moment of exultation passes following their latest conquest, they dive into the next task. I see a gleam in their eyes as they realize they have no idea what to do. It's time to bury their heads in their hands and think.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

October 15, 2014 3:54 PM

Maybe We Just Need to Teach Better

A couple of weeks ago, I wrote Skills We Can Learn in response to a thread on the SIGCSE mailing list. Mark Guzdial has now written a series of posts in response to that thread, most recently Teaching Computer Science Better To Get Better Results. Here is one of the key paragraphs in his latest piece:

I watch my children taking CS classes, along with English, Chemistry, Physics, and Biology classes. In the CS classes, they code. In the other classes, they do on-line interactive exercises, they write papers, they use simulations, they solve problems by-hand. Back in CS, the only activity is coding with feedback. If we only have one technique for teaching, we shouldn't be surprised if it doesn't always work.

Mark then offers a reasonable hypothesis: We get poor results because we use ineffective teaching methods.

That's worthy of a new maxim of the sort found in my previous post: If things aren't going well in my course, it's probably my fault. Mark's hypothesis sounds more professional.

A skeptic might say that learning to program is like learning to speak a new human language, and when we learn new human languages we spend most of our time reading, writing, and speaking, and getting feedback from these activities. In an introductory programming course, the programming exercises are where students read, write, and get feedback. Isn't that enough?

For some students, yes, but not for all. This is also true in introductory foreign language courses, which is why teachers in those courses usually include games and other activities to engage the students and provide different kinds of feedback. Many of us do more than just programming exercises in computer science courses, too. In courses with theory and analysis, we give homework that asks students to solve problems, compute results, or give proofs for assertions about computation.

In my algorithms course, I open most days with a game. Students play the game for a while, and then we discuss strategies for playing the game well. I choose games whose playing strategies illustrate some algorithm design technique we are studying. This is a lot more fun than yet another Design an algorithm to... exercise. Some students seem to understand the ideas better, or at least differently, when they experience the ideas in a wider context.

I'm teaching our intro course right now, and over the last few weeks I have come to appreciate the paucity of different teaching techniques and methods used by a typical textbook. This is my first time to teach the course in ten years, and I'm creating a lot of my own materials from scratch. The quality and diversity of the materials are limited by my time and recent experience, with the result being... a lot of reading and writing of code.

What of the other kinds of activities that Mark mentions? Some code reading can be turned into problems that the students solve by hand. I have tried a couple of debugging exercises that students seemed to find useful. I'm only now beginning to see the ways in which those exercises succeeded and failed, as the students take on bigger tasks.

I can imagine all sorts of on-line interactive exercises and simulations that would help in this course. In particular, a visual simulator for various types of loops could help students see a program's repetitive behavior more immediately than watching the output of a simple program. Many of my students would likely benefit from a Bret Victor-like interactive document that exposes the internal working of, say, a for loop. Still others could use assistance with even simpler concepts, such as sequences of statements, assignment to variables, and choices.
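A text-only sketch of that idea, far short of a visual simulator, might record the state of a for loop after every iteration so that students can read the repetition step by step. The function name `trace_for_loop` and the sample values are hypothetical, chosen only for illustration:

```python
def trace_for_loop(values):
    """Sum the values, recording the loop's state after each iteration."""
    total = 0
    history = []
    for step, value in enumerate(values, start=1):
        total += value
        # Snapshot the iteration number, loop variable, and running total.
        history.append((step, value, total))
    return total, history


total, history = trace_for_loop([3, 1, 4])
for step, value, running_total in history:
    print(f"iteration {step}: value = {value}, total = {running_total}")
```

Each printed line shows one trip through the loop body, which is the behavior a richer interactive tool would animate.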

In any case, I second Mark's calls to action. We need to find more and better methods for teaching CS topics. We need to find better ways to make proven methods available to CS instructors. Most importantly, we need to expect more of ourselves and demand more from our profession.

When things go poorly in my classroom, it's usually my fault.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 06, 2014 4:02 PM

A New Programming Language Can Inspire Us

In A Fresh Look at Rust, Armin Ronacher tells us some of what inspires him about Rust:

For me programming in Rust is pure joy. Yes I still don't agree with everything the language currently forces me to do but I can't say I have enjoyed programming that much in a long time. It gives me new ideas how to solve problems and I can't wait for the language to get stable.

Rust is inspiring for many reasons. The biggest reason I like it is because it's practical. I tried Haskell, I tried Erlang and neither of those languages spoke "I am a practical language" to me. I know there are many programmers that adore them, but they are not for me. Even if I could love those languages, other programmers would never do and that takes a lot of enjoyment away.

I enjoy reading personal blog entries from people excited by a new language, or newly excited by a language they are visiting again after a while away. I've only read Rust code, not written it, but I know just how Ronacher feels. These two paragraphs touch on several truths about how languages excite us:

  • Programmers are often most inspired when a language shows them new ideas how to solve problems.
  • Even if we love a language, we won't necessarily love every feature of the language.
  • What inspires us is personal. Other people can be inspired by languages that do not excite us.
  • Community matters.

Many programmers make a point of learning a new language periodically. When we do, we are often most struck by a language that teaches us new ways to think about problems and how to solve them. These are usually the languages that have the most to teach us at the moment.

As Kevin Kelly says, progress sometimes demands that we let go of problems. We occasionally have to seek new problems, in order to be excited by new ways to answer them.

This is all very context-specific. How wonderful it is to live in a time with so many languages available to learn from. Let them all flourish, I say.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 02, 2014 3:46 PM

Skills We Can Learn

In a thread on motivating students on the SIGCSE mailing list, a longtime CS prof and textbook author wrote:

Over the years, I have come to believe that those of us who can become successful programmers have different internal wiring than most in the population. We know you need problem solving, mathematical, and intellectual skills but beyond that you need to be persistent, diligent, patient, and willing to deal with failure and learn from it.

These are necessary skills, indeed. Many of our students come to us without these skills and struggle to learn how to think like a computer scientist. And without persistence, diligence, patience, and a willingness to deal with failure and learn from it, anyone will likely have a difficult time learning to program.

Over time, it's natural to begin to think that these attributes are prerequisites -- things a person must have before he or she can learn to write programs. But I think that's wrong.

As someone else pointed out in the thread, too many people believe that to succeed in certain disciplines, one must be gifted, to possess an inherent talent for doing that kind of thing. Science, math, and computer science fit firmly in that set of disciplines for most people. Carol Dweck has shown that having a "fixed" mindset of this sort prevents many people from sticking with these disciplines when they hit challenges, or even trying to learn them in the first place.

The attitude expressed in the quote above is counterproductive for teachers, whose job it is to help students learn things even when the students don't think they can.

When I talk to my students, I acknowledge that, to succeed in CS, you need to be persistent, diligent, patient, and willing to deal with failure and learn from it. But I approach these attributes from a growth mindset:

Persistence, diligence, patience, and willingness to learn from failure are habits anyone can develop with practice. Students can develop these habits regardless of their natural gifts or their previous education.

Aristotle said that excellence is not an act, but a habit. So are most of the attributes we need to succeed in CS. They are habits, not traits we are born with or actions we take.

Donald Knuth once said that only about 2 per cent of the population "resonates" with programming the way he does. That may be true. But even if most of us will never be part of Knuth's 2%, we can all develop the habits we need to program at a basic level. And a lot more than 2% are capable of building successful careers in the discipline.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

September 23, 2014 4:37 PM

The Obstacles in the Way of Teaching More Students to Program

All students should learn to program? Not so fast, says Larry Cuban in this Washington Post blog entry. History, including the Logo movement, illustrates several ways in which such a requirement can fail. I've discussed Cuban's article with a couple of colleagues, and all are skeptical. They acknowledge that he raises important issues, but in the end they offer a "yeah, but...". It is easy to imagine that things are different now, and the result will be similarly different.

I am willing to believe that things may be different this time. They always are. I've written favorably here in the past of the value of more students learning to program, but I've also been skeptical of requiring it. Student motivations change when they "have to take that class". And where will all the teachers come from?

In any case, it is wise to be alert to how efforts to increase the reach of programming instruction have fared. Cuban reminds us of some of the risks. One line in his article expresses what is, to my mind, the biggest challenge facing this effort:

Traditional schools adapt reforms to meet institutional needs.

Our K-12 school system is a big, complex organism (actually, fifty-one of them). It tends to keep moving in the direction of its own inertia. If a proposed reform fits its needs, the system may well adopt it. If it doesn't, but external forces push the new idea onto the system, the idea is adapted -- assimilated into what the institution already wants to be, not what the reform actually promises.

We see this in the university all the time, too. Consider accountability measures such as student outcomes assessment. Many schools have adopted the language of SOA, but rarely do faculty and programs change all that much how they behave. They just find ways to generate reports that keep the external pressures at bay. The university and its faculty may well care about accountability, but they tend to keep on doing it the way they want to do it.

So, how can we maximize the possibility of substantive change in the effort to teach more students how to program, and not simply create a new "initiative" with frequent mentions in brochures and annual reports? Mark Guzdial has been pointing us in the right direction. Perhaps the most effective way to change K-12 schools is to change the teachers we send into the schools. We teach more people to be computing teachers, or prepare more teachers in the traditional subjects to teach computing. We prepare them to recognize opportunities to introduce computing into their courses and curricula.

In this sense, universities have an irreplaceable role to play in the revolution. We teach the teachers.

Big companies can fund programs that help us reach younger students directly. But that isn't enough. Google's CS4HS program has been invaluable in helping universities reach current K-12 teachers, but they are a small percentage of the installed base of teachers. In our schools of education, we can reach every future teacher -- if we all work together within and across university boundaries.

Of course, this creates a challenge at the meta-level. Universities are big, complex organisms, too. They tend to keep moving in the direction of their own inertia. Simply pushing the idea of programming instruction onto the system from the outside is more likely to result in harmless assimilation than in substantive change. We are back to Cuban's square one.

Still, against all these forces, many people are working to make a change. Perhaps this time will be different after all.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

September 22, 2014 10:33 AM

Strange Loop 2014 Videos Are Up

Wow. Strange Loop just ended Friday evening, and already videos of nearly all the talks are available on a YouTube channel. (A few have been delayed at the speaker's request.)

I regret missing the conference this year. I've been a regular attendee over the years and much enjoyed last year's edition. But it's probably just as well that the tickets sold out before I bought mine. My intro course has kept me pedaling full speed since school started, and I would have regretted missing a lab day and a class session just as we are getting to the meat of the course. I followed along with the conference on Twitter as time permitted.

The video titles foreshadow the usual treasure trove of Strange Loop content. It would be easier to list the talks I don't want to watch than the ones I do. A few I'll watch early on include Stephen Kell's "Liberating the Smalltalk Lurking in C and Unix", Stefanie Schirmer's "Dynamic Programming At Ease", Mark Allen's "All Of This Has Happened Before, and It Will All Happen Again", Julia Evans's "You Can Be a Kernel Hacker!", and Michael Nygard's "Simulation Testing".

An underrated advantage of actually attending a conference is not being able to be in two places at one time. Having to make a choice is sometimes a good thing; it helps us to preserve limited resources. The downside to the wonderfulness of having all the videos available on-line, for viewing at my leisure, is that I want to watch them all -- and I don't have enough leisure!

Posted by Eugene Wallingford | Permalink | Categories: Computing

September 12, 2014 1:49 PM

The Suffocating Gerbils Problem

I had never heard of the "suffocating gerbils" problem until I ran across this comment in a Lambda the Ultimate thread on mixing declarative and imperative approaches to GUI design. Peter Van Roy explained the problem this way:

A space rocket, like the Saturn V, is a complex piece of engineering with many layered subsystems, each of which is often pushed to the limits. Each subsystem depends on some others. Suppose that subsystem A depends on subsystem B. If A uses B in a way that was not intended by B's designers, even though formally B's specification is being followed by A, then we have a suffocating gerbils problem. The mental image is that B is implemented by a bunch of gerbils running to exhaustion in their hoops. A is pushing them to do too much.

I first came to appreciate the interrelated and overlapping functionality of engineered subsystems in graduate school, when I helped a fellow student build a software model of the fuel and motive systems of an F-18 fighter plane. It was quite a challenge for our modeling language, because the functions and behaviors of the systems were intertwined and did not follow obviously from the specification of components and connections. This challenge motivated the project. McDonnell Douglas was trying to understand the systems in a new way, in order to better monitor performance and diagnose failures. (I'm not sure how the project turned out...)

We suffocate gerbils at the university sometimes, too. Some functions depend on tenure-track faculty teaching occasional overloads, or the hiring of temporary faculty as adjuncts. When money is good, all is well. As budgets tighten, we find ourselves putting demands on these subsystems to meet other essential functions, such as advising, recruiting, and external engagement. It's hard to anticipate looming problems before they arrive in full failure; everything is being done according to specification.

Now there's a mental image: faculty gerbils running to exhaustion.

If you are looking for something new to read, check out some of Van Roy's work. His Concepts, Techniques, and Models of Computer Programming offers all kinds of cool ideas about programming language design and use. I happily second the sentiment of this tweet:

Note to self: read all Peter Van Roy's LtU comments in chronological order and build the things that don't exist yet:

There are probably a few PhD dissertations lurking in those comments.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

September 04, 2014 3:32 PM

Language Isn't Just for Experts

Stephen Ramsay wrote The Mythical Man-Finger, in defense of an earlier piece on the virtues of the command line. The gist of his argument is this:

... the idea that language is for power users and pictures and index fingers are for those poor besotted fools who just want toast in the morning is an extremely retrograde idea from which we should strive to emancipate ourselves.

Ramsay is an English professor who works in digital humanities. From the writings posted on his web site, it seems that he spends nearly as much time teaching and doing computing these days as he spends on the humanities. This opens him to objections from his colleagues, some of whom minimize the relevance of his perspective for other humanists by reminding him that he is a geek. He is one of those experts who can't see past his own expertise. We see this sort of rhetorical move in the tech world all the time.

I think the case is quite the opposite. Ramsay is an expert on language. He knows that language is powerful, that language is more powerful than the alternatives in many contexts. When we hide language from our users, we limit them. Other tools can optimize for a small set of particular use cases, but they generally make it harder to step outside of those lines drawn by the creator of the tools: to combine tasks in novel ways, to extend them, to integrate them with other tools.

Many of my intro students are just beginning to see what knowing a programming language can mean. Giving someone language is one of the best ways to empower them, and also a great way to help them even see what is possible.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

August 30, 2014 7:43 AM

A Monad Sighting in Pop Literature

Lab experiments are invaluable in the hard sciences, in part because neutrinos and monads don't change their behavior when they are being watched; but humans do.

Several things ran through my mind when I read this sentence.

  • "Monads don't change their behavior when watched." Wow. The authors of this book must know a little functional programming.

  • Monads mentioned in the same sentence as neutrinos, which are fundamental particles of the universe? Oh, no. This will only make the smug functional programming weenies more smug.

  • Monads are part of the "hard sciences"? These authors really do get functional programming!

  • This sentence appears in a chapter called "The Three Hardest Words in the English Language". That joke writes itself.

  • Maybe I shouldn't be surprised to see this sentence. The book is called Think Like a Freak.

I kid my monad-loving friends; I kid. The rest of the book is pretty good, too.

Posted by Eugene Wallingford | Permalink | Categories: Computing

July 21, 2014 10:52 AM

Wesley's Quoted Quote

My recent post Burn All Your Sermons was triggered by a quote taken out of context. Theologian John Wesley did not say:

Once in seven years I burn all my sermons...

He said:

"Once in seven years I burn all my sermons..."

Those "" make all the difference. Wesley wasn't saying that he himself burns all his sermons every seven years; he was talking about the practice of doing so. Imagine the assistant of Wesley who, upon seeing this passage in the theologian's diary, burned all of Wesley's old sermons in an effort to ingratiate himself with the boss, only later to find out that Wesley very much intended to use them again. Fiery furnace, indeed.

This sort of indirection isn't important only for human communication. It is a key idea in computing. I wrote a blog post last year about such quotations and how this distinction is an important element in Jon Udell's notion of "thinking like the web". Thinking like the web isn't always foreign to the way most of us already think and work; sometimes it simply emphasizes a particular human practice that until now has been less common.

Studying a little computer science can help, though. Programmers have multiple ways of speaking indirectly about an action such as "burn all the sermons". In Scheme, I might express the program to burn all the sermons in a collection as:

(burn sermons)

We can quote this program, in much the same way that the "" above do, as:

'(burn sermons)

This is actually shorthand for (quote (burn sermons)). The result is a piece of data, much like Wesley's quotation of another person's utterance, that we can manipulate in a variety of ways.

This sort of quotation trades on the distinction between data and process. In a post a few years back, I talked a bit about how this distinction is only a matter of perspective, that at a higher level data and program are two sides of the same coin.

However, we can also "quote" our sermon-burning program in a way that stays on the side of process. Consider this program:

(lambda () (burn sermons))

The result is a program that, when executed, will execute the sermon-burning program. Like the data version of the quote, it turns the original statement into something that we can talk about, pass around as a value, and manipulate in a variety of ways. But it does so by creating another program.

This technique, quite simple at its heart, plays a helpful role in the way many computer language processors work.
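For readers who don't speak Scheme, here is a rough Python analogue of the two kinds of quotation, offered as a sketch rather than a faithful translation. The names `sermons`, `burned`, and `burn` are hypothetical stand-ins, and a tuple of strings is only an approximation of a quoted Scheme list:

```python
# Hypothetical stand-ins for illustration; none of these come from a library.
sermons = ["The Good Steward", "The Great Assize", "The Use of Money"]
burned = []


def burn(collection):
    """Move every sermon from the collection into the ashes."""
    while collection:
        burned.append(collection.pop())


# Quotation as data: like '(burn sermons), this tuple only describes the action.
quoted = ("burn", "sermons")

# Quotation as process: like (lambda () (burn sermons)), a zero-argument
# function that burns the sermons only when someone calls it.
thunk = lambda: burn(sermons)

# Nothing has been burned yet; both forms merely talk about burning.
print(len(sermons), len(burned))  # 3 0

# The data form can be taken apart and inspected without running anything.
operator, operand = quoted
print(operator, operand)  # burn sermons

# Calling the thunk removes the indirection and executes the action.
thunk()
print(len(sermons), len(burned))  # 0 3
```

Until `thunk()` is called, the sermons are safe, which is exactly the protection Wesley's assistant needed.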

Both techniques insert a level of indirection between a piece of advice -- burn all your sermons -- and its execution. That is a crucial distinction when we want to talk about an idea without asserting the idea's truth at that moment. John Wesley knew that, and so should we.

Posted by Eugene Wallingford | Permalink | Categories: Computing

July 16, 2014 2:11 PM

Burn All Your Sermons

Marketers and bridge players have their Rules of Seven. Teachers and preachers might, too, if they believe this old saw:

Once in seven years I burn all my sermons; for it is a shame if I cannot write better sermons now than I did seven years ago.

I don't have many courses in which I lecture uninterrupted for long periods of time. Most of my courses are a mixture of short lectures, student exercises, and other activities that explore or build upon whatever we are studying. Even when I have a set of materials I really like, which have been successful for me and my students in the past, I am forever reinventing them, tweaking and improving as we move through the course. This is in the same spirit as the rule of seven: surely I can make something better since the last time I taught the course.

Having a complete set of materials for a course to start from can be a great comfort. It can also be a straitjacket. The high-level structure of a course design limits how we think about the essential goals and topics of the course. The low-level structure generally optimizes for specific transitions and connections, which limits how easily we can swap in new examples and exercises.

Even as an inveterate tinkerer, I occasionally desire to break out of the straitjacket of old material and make a fresh start. Burn it all and start over. Freedom! What I need to remember will come back to me.

The adage quoted above tells us to do this regularly even if we don't feel the urge. The world changes around us. Our understanding grows. Our skills as writers and storytellers grow. We can do better.

Of course, starting over requires time. It's a lot quicker to prep a course by pulling a prepped course out of an old directory of courses and cleaning it up around the edges. When I decide to redesign a course from bottom up, I usually have to set aside part of a summer to allow for long hours writing from scratch. This is a cost you have to take into account any time you create a new course.

Being in computer science makes it easier to force ourselves to start from scratch. While many of the principles of CS remain the same across decades, the practices and details of the discipline change all the time. And whatever we want to say about timeless principles, the undergrads in my courses care deeply about having some currency when they graduate.

In Fall 2006, I taught our intro course. The course used Java, which was the first language in our curriculum at that time. Before that, the last time I had taught the course, our first language was Pascal. I had to teach an entirely new course, even though many of the principles of programming I wanted to teach were the same.

I'm teaching our intro course again this fall for the first time since 2006. Python is the language of choice now. I suppose I could dress my old Java course in a Python suit, but that would not serve my students well. It also wouldn't do justice to the important ideas of the course, or Python. Add to this that I am a different -- and I hope better -- teacher and programmer now than I was eight years ago, and I have all the reasons I need to design a new course.

So, I am getting busy. Burn all the sermons.

Of course, we should approach the seven-year advice with some caution. The above passage is often attributed to theologian John Wesley. And indeed he did write it. However, as is so often the case, it has been taken out of context. This is what Wesley actually wrote in his journal:

Tuesday, September 1.--I went to Tiverton. I was musing here on what I heard a good man say long since--"Once in seven years I burn all my sermons; for it is a shame if I cannot write better sermons now than I could seven years ago." Whatever others can do, I really cannot. I cannot write a better sermon on the Good Steward than I did seven years ago; I cannot write a better on the Great Assize than I did twenty years ago; I cannot write a better on the Use of Money, than I did nearly thirty years ago; nay, I know not that I can write a better on the Circumcision of the Heart than I did five-and-forty years ago. Perhaps, indeed, I may have read five or six hundred books more than I had then, and may know a little more history, or natural philosophy, than I did; but I am not sensible that this has made any essential addition to my knowledge in divinity. Forty years ago I knew and preached every Christian doctrine which I preach now.

Note that Wesley attributes the passage to someone else -- and then proceeds to deny its validity in his own preaching! We may choose to adopt the Rule of Seven in our teaching, but we cannot do so with Wesley as our prophet.

I'll stick with my longstanding practice of building on proven material when that seems best, and starting from scratch whenever the freedom to tell a new story outweighs the value of what has worked for me and my students in the past.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 10, 2014 3:08 PM

The Passing of the Postage Stamp

In this New York Times article on James Baldwin's ninetieth birthday, scholar Henry Louis Gates laments:

On one hand, he's on a U.S. postage stamp; on the other hand, he's not in the Common Core.

I'm not qualified to comment on Baldwin and his place in the Common Core. In the last few months, I read several articles about and including Baldwin, and from those I have come to appreciate better his role in twentieth-century literature. But I also empathize with anyone trying to create a list of things that every American should learn in school.

What struck me in Gates's comment was the reference to the postage stamp. I'm old enough to have grown up in a world where the postage stamp held a position of singular importance in our culture. It enabled communication at a distance, whether geographical or personal. Stamps were a staple of daily life.

In such a world, appearing on a stamp was an honor. It indicated a widespread acknowledgment of a person's (or organization's, or event's) cultural impact. In this sense, the Postal Service's decision to include James Baldwin on a stamp was a sign of his importance to our culture, and a way to honor his contributions to our literature.

Alas, this would have been a much more significant and visible honor in the 1980s or even the 1990s. In the span of the last decade or so, the postage stamp has gone from relevant and essential to archaic.

When I was a boy, I collected stamps. It was a fun hobby. I still have my collection, even if it's many years out of date now. Back then, stamp collecting was a popular activity with a vibrant community of hobbyists. For all I know, that's still true. There's certainly still a vibrant market for some stamps!

But these days, whenever I use a new stamp, I feel as if I'm holding an anachronism in my hands. Computing technology played a central role in the obsolescence of the stamp, at least for personal and social communication.

Sometimes people say that we in CS need to do a better job helping potential majors see the ways in which our discipline can be used to effect change in the world. We never have to look far to find examples. If a young person wants to be able to participate in how our culture changes in the future, they can hardly do better than to know a little computer science.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

July 03, 2014 2:13 PM

Agile Moments: Conspicuous Progress and Partial Value

Dorian Taylor, in Toward a Theory of Design as Computation:

You can scarcely compress the time it takes to do good design. The best you can do is arrange the process so that progress is conspicuous and the partially-completed result has its own intrinsic value.

Taylor's piece is about an idea much bigger than simply software methodology, but this passage leapt off the page at me. It seems to embody two of the highest goals of the various agile approaches to making software: progress that is conspicuous and partial results that have intrinsic value to the user.

If you like ambitious attempts to create a philosophy of design, check out the whole essay. Taylor connects several disparate sources:

  • Edwin Hutchins and Cognition in the Wild,
  • Donald Norman and Things That Make Us Smart, and
  • Douglas Hofstadter and Gödel, Escher, Bach
with the philosophy of Christopher Alexander, in particular Notes on the Synthesis of Form and The Nature of Order. Ambitious it is.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

July 02, 2014 4:31 PM

My Jacket Blurb for "Exercises in Programming Style"

On Monday, my copy of Crista Lopes's new book, Exercises in Programming Style, arrived. After I blogged about the book last year, Crista asked me to review some early chapters. After I did that, the publisher graciously offered me a courtesy copy. I'm glad it did! The book goes well beyond Crista's talk at StrangeLoop last fall, with thirty-three styles grouped loosely into nine categories. Each chapter includes historical notes and a reading list for going deeper. Readers of this blog know that I often like to go deeper.

I haven't had a chance to study any of the chapters deeply yet, so I don't have a detailed review. For now, let me share the blurb I wrote for the back cover. It gives a sense of why I was so excited by the chapters I reviewed last summer and by Crista's talk last fall:

It is difficult to appreciate a programming style until you see it in action. Cristina's book does something amazing: it shows us dozens of styles in action on the same program. The program itself is simple. The result, though, is a deeper understanding of how thinking differently about a problem gives rise to very different programs. This book not only introduced me to several new styles of thinking; it also taught me something new about the styles I already know well and use every day.

The best way to appreciate a style is to use it yourself. I think Crista's book opens the door for many programmers to do just that with many styles most of us don't use very often.

As for the blurb itself: it sounds a little stilted as I read it now, but I stand by the sentiment. It is very cool to see my blurb and name alongside blurbs from James Noble and Grady Booch, two people whose work I respect so much. Very cool. Leave it to James to sum up his thoughts in a sentence!

While you are waiting for your copy of Crista's book to arrive, check out her recent blog entry on the evolution of CS papers in publication over the last 50+ years. It presents a lot of great information, with some nice images of pages from a few classics. It's worth a read.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 27, 2014 3:55 PM

Beautiful Words, File Format Edition

In The Great Works of Software, Paul Ford tells us that the Photoshop file format is

a fascinating hellish palimpsest.

"Palimpsest" is one of those words I seem always have to look up whenever I run across it. What a lyrical word.

After working with a student a few summers ago on a translator from Photoshop PSD format to HTML/CSS (mentioned in the first paragraph of this essay), I can second the assertion that PSD is fascinating and hellish. Likewise, however often it has changed over time, it looks in several places as if it is held together with baling wire.

Ford said it better than I could have, though.

Posted by Eugene Wallingford | Permalink | Categories: Computing

June 25, 2014 2:03 PM

You Shouldn't Need a License to Program

In Generation Liminal, Dorian Taylor recalls how the World Wide Web arrived at the perfect time in his life:

It's difficult to appreciate this tiny window of opportunity unless you were present for it. It was the World-Wild West, and it taught me one essential idea: that I can do things. I don't need a license, and I don't need credentials. I certainly don't need anybody telling me what to do. I just need the operating manual and some time to read it. And with that, I can bring some amazing -- and valuable -- creations to life.

I predate the birth of the web. But when we turned on the computers at my high school, BASIC was there. We could program, and it seemed the natural thing to do. These days, the dominant devices are smart phones and iPads and tablets. Users begin their experience far away from the magic of creating. It is a user experience for consumers.

One day many years ago, my older daughter needed to know how many words she had written for a school assignment. I showed her wc. She was amazed by its simplicity; it looked like nothing else she'd ever seen. She still uses it occasionally.

I spent several days last week watching middle schoolers -- play. They consumed other people's creations, including some tools my colleagues set up for them. They have creative minds, but for the most part it doesn't occur to them that they can create things, too.

We need to let them know they don't need our permission to start, or credentials defined by anyone else. We need to give them the tools they need, and the time to play with them. And, sometimes, we need to give them a little push to get started.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 23, 2014 3:13 PM

The Coder's High Beats The Rest

At least David Auerbach thinks so. One of the reasons is that programming has a self-perpetuating cycle of creation, implementation, repair, and new birth:

"Coding" isn't just sitting down and churning out code. There's a fair amount of that, but it's complemented by large chunks of testing and debugging, where you put your code through its paces and see where it breaks, then chase down the clues to figure out what went wrong. Sometimes you spend a long time in one phase or another of this cycle, but especially as you near completion, the cycle tightens -- and becomes more addictive. You're boosted by the tight feedback cycle of coding, compiling, testing, and debugging, and each stage pretty much demands the next without delay. You write a feature, you want to see if it works. You test it, it breaks. It breaks, you want to fix it. You fix it, you want to build the next piece. And so on, with the tantalizing possibility of -- just maybe! -- a perfect piece of code gesturing at you in the distance.

My experience is similar. I can get lost for hours in code, and come out tired but mentally energized. Writing has never given me that kind of high, but then I've not written a really long piece of prose in a long time. Perhaps writing fiction could give me the sort of high I experience when deep in a program.

What about playing games? Back in my younger days, I experienced incredible flow while playing chess for long stretches. I never approached master level play, but a good game could still take my mind to a different level of consciousness. That high differed from a coder's high, though, in that it left me tired. After a three-round day at a chess tournament, all I wanted to do was sleep.

Getting lost in a computer game gives me a misleading feeling of flow, but it differs from the chess high. When I come out of a session lost in most computer games, I feel destroyed. The experience doesn't give life the way coding does, or the way I imagine meditation does. I just end up feeling tired and used. Maybe that's what drug addiction feels like.

I was thinking about computer games even before reading Auerbach's article. Last week, I was sitting next to one of the more mature kids in our summer camp after he had just spent some time gaming, er, collecting data for our study of internet traffic. We had an exchange that went something like this:

Student: I love this feeling. I'd like to create a game like this some day.

Eugene: You can!

Student: Really? Where?

Eugene: Here. A group of students in my class last month wrote a computer game next door. And it's way cooler than playing a game.

I was a little surprised to find that this young high schooler had no idea that he could learn computer programming at our university. Or maybe he didn't make the connection between computer games and computer programs.

In any case, this is one of the best reasons for us CS profs to get out of our university labs and classrooms and interact with younger students. Many of them have no way of knowing what computer science is, what they can do with computer science, or what computer science can do for them -- unless we show them!

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 20, 2014 1:27 PM

Programming Everywhere, Business Edition

Q: What do you call a company that has staff members with "programmer" or "software developer" in their titles?

A: A company.

Back in 2012, Alex Payne wrote What Is and Is Not A Technology Company to address a variety of issues related to the confounding of companies that sell technology with companies that merely use technology to sell something else. Even then, developing technology in house was a potential source of competitive advantage for many businesses, whether that involved modifying existing software or writing new.

The competitive value in being able to adapt and create software has only grown larger and more significant in the last two years. Not having someone on staff with "programmer" in the title is almost a red flag even for non-tech companies these days.

Those programmers aren't likely to have been CS majors in college, though. We don't produce enough. So we need to find a way to convince more non-majors to learn a little programming.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 19, 2014 2:11 PM

Yet Another Version of Alan Kay's Definition of "Object-Oriented"

In 2003, Stefan Ram asked Alan Kay to explain some of the ideas and history behind the term "object-oriented". Ram posted Kay's responses for all to see. Here is how Kay responded to the specific question, "What does 'object-oriented [programming]' mean to you?":

OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things.

Messaging and extreme late-binding have been consistent parts of Kay's answer to this question over the years. He has also always emphasized the encapsulated autonomy of objects, with analogy to cells from biology and nodes on the internet. As Kay has said many times, in his conception, the basic unit of computation is a whole computer.

For some reason, I really like the way Kay phrased the encapsulated autonomy clause in this definition: local retention and protection and hiding of state-process. It's not poetry or anything, but it has a rhythm.

Kay's e-mail mentions another of his common themes, that most computer scientists didn't take full advantage of the idea of objects. Instead, we stayed too close to the dominant data-centric perspective. I often encounter this with colleagues who confound object-oriented programming with abstract data types. A system designed around ADTs will not offer the same benefits that Kay envisions for objects defined by their interactions.

In some cases, the words we adopted for OO concepts may have contributed to the remaining bias toward data, even if unintentionally. For example, Kay thinks that the term "polymorphism" hews too closely to the standard concept of a function to convey the somewhat different notion of an object as embodying multiple algebras.

Kay's message also mentions two projects I need to learn more about. I've heard of Robert Balzer's Dataless Programming paper but never read it. I've heard of GEDANKEN, a programming language project by John Reynolds, but never seen any write-up. This time I downloaded GEDANKEN: A Simple Typeless Language Which Permits Functional Data Structures and Coroutines, Reynolds's tech report from Argonne National Lab. Now I am ready to become a little better informed than I was this morning.

The messages posted by Ram are worth a look. They serve as a short precursor to (re-)reading Kay's history of Smalltalk paper. Enjoy!

Posted by Eugene Wallingford | Permalink | Categories: Computing

June 17, 2014 2:38 PM

Cookies, Games, and Websites: A Summer Camp for Kids

Today is the first day of Cookies, Games, and Websites, a four-day summer camp for middle-school students being offered by our department. A colleague of mine developed the idea for a workshop that would help kids of that age group understand better what goes on when they play games on their phones and tablets. I have been helping, as a sounding board for ideas during the prep phase and now as a chaperone and helper during the camp. A local high school student has been providing much more substantial help, setting up hardware and software and serving as a jack-of-all-trades.

The camp's hook is playing games. To judge from this diverse group of fifteen students from the area, kids this age already know very well how to download, install, and play games. Lots of games. Lots and lots of games. If they had spent as much time learning to program as they seem to have spent playing games, they would be true masters of the internet.

The first-order lesson of the camp is privacy. Kids this age play a lot of games, but they don't have a very good idea how much network traffic a game like Cut the Rope 2 generates, or how much traffic accessing Instagram generates. Many of their apps and social websites allow them to exercise some control over who sees what in their space, but they don't always know what that means. More importantly, they don't realize how important all this is, because they don't know how much traffic goes on under the hood when they use their mobile devices -- and even when they don't!

The second-order lesson of the camp, introduced as a means to an end, is computing: the technology that makes communication on the web possible, and some of the tools they can use to look at and make sense of the network traffic. We can use some tools they already know and love, such as Google maps, to visualize the relevant data.

This is a great idea: helping young people understand better the technology they use and why concepts like privacy matter to them when they are using that technology. If the camp is successful, they will be better-informed users of on-line technology, and better prepared to protect their identities and privacy. The camp should be a lot of fun, too, so perhaps one or two of them will be interested in diving deeper into computer science after the camp is over.

This morning, the campers learned a little about IP addresses and domain names, mostly through interactive exercises. This afternoon, they are learning a little about watching traffic on the net and then generating traffic by playing some of their favorite games. Tomorrow, we'll look at all the traffic they generated playing, as well as all the traffic generated while their tablets were idle overnight.

We are only three-fourths of the way through Day 1, and I have already learned my first lesson: I really don't want to teach middle school. The Grinch explains why quite succinctly: noise, noise, NOISE! One thing seems to be true of any room full of fifteen middle-school students: several of them are talking at any given time. They are fun people to be around, but they are wearing me out...

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 05, 2014 2:45 PM

Choosing the Right Languages for Early CS Instruction is Important

In today's ACM interview, Donald Knuth identifies one of the problems he has with computer science instruction:

Similarly, the most common fault in computer classes is to emphasize the rules of specific programming languages, instead of to emphasize the algorithms that are being expressed in those languages. It's bad to dwell on form over substance.

I agree. The challenges are at least two in number:

  • ... finding the right level of support for the student learning his or her first language. It is harder for students to learn their first language than many people realize until after they've tried to teach them.

  • ... helping students develop the habit and necessary skills to learn new languages on their own with some facility. For many, this involves overcoming the fear they feel until they have done it on their own a time or two.

Choosing the right languages can greatly help in conquering Challenges 1 and 2. Choosing the wrong languages can make overcoming them almost impossible, if only because we lose students before they cross the divide.

I guess that makes choosing the right languages Challenge 3.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 29, 2014 2:14 PM

Invention, One Level Down

Brent Simmons wrote a blog entry on his time at UserLand. After describing a few of the ideas that founder Dave Winer created and extended, such as RSS and blogging, Simmons said this about Winer:

The tech was his invention too: he built the thing he needed to be able to build other things.

This is among the highest praise one can bestow on an inventor. It's also one of the things I like about computer science. The hallmark of so many interesting advances in computing is the creation of a technology or language that makes the advance possible. Sometimes the enabling technology turns out to be pretty important in its own right. Sometimes, it's a game changer. But even when it is only a scaffold to something bigger, it needed to be created.

Posted by Eugene Wallingford | Permalink | Categories: Computing

May 28, 2014 4:20 PM

Programming for Everyone, Intro Physics Edition

Rhett Allain asked his intro physics students to write a short bit of Python code to demonstrate some idea from the course, such as the motion of an object with a constant force, or projectile motion with air resistance. Apparently, at least a few complained: "Wait! I'm not a computer scientist." That caused Allain to wonder...

I can just imagine the first time a physics faculty told a class that they needed to draw a free body diagram of the forces on an object for the physics solutions. I wonder if a student complained that this was supposed to be a physics class and not an art class.

As Allain points out, the barriers that used to prevent students from doing numerical calculations in computer programs have begun to disappear. We have more accessible languages now, such as Python, and powerful computers are everywhere, capable of running VPython and displaying beautiful visualizations.

About all that remains is teaching all physics students, even the non-majors, a little programming. The programs they write are simply another medium through which they can explore physical phenomena and perhaps come to understand them better.
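
For readers who have not seen this kind of program, here is a minimal sketch of the sort of short Python code an intro physics student might write for projectile motion with air resistance. It is not taken from Allain's course or from Matter and Interactions. The function name simulate_projectile, the quadratic drag model, and every parameter value are assumptions made only for illustration, and it uses plain Python rather than VPython so that it runs anywhere.

```python
import math


def simulate_projectile(v0, angle_deg, drag_coeff=0.0, mass=1.0, dt=0.001, g=9.8):
    """Step a projectile forward in time until it lands; return its range in meters.

    Hypothetical helper for illustration. Drag is modeled as F = -c * |v| * v,
    and the motion is advanced in small time steps of size dt.
    """
    angle = math.radians(angle_deg)
    x, y = 0.0, 0.0
    vx, vy = v0 * math.cos(angle), v0 * math.sin(angle)
    while True:
        speed = math.hypot(vx, vy)
        # Acceleration from drag opposes the velocity; gravity pulls down.
        ax = -(drag_coeff / mass) * speed * vx
        ay = -g - (drag_coeff / mass) * speed * vy
        # Update velocity first, then position, one small step at a time.
        vx += ax * dt
        vy += ay * dt
        x += vx * dt
        y += vy * dt
        if y <= 0.0:
            return x


if __name__ == "__main__":
    print("range without drag:", simulate_projectile(20.0, 45.0))
    print("range with drag:   ", simulate_projectile(20.0, 45.0, drag_coeff=0.05))
```

With drag_coeff left at zero, the result should land close to the textbook range formula, which gives students a quick way to check their program against the math they already know.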

Allain is exactly right. You don't have to be an artist to draw simple diagrams or a mathematician to evaluate an integral. All students accept, if grudgingly, that people might reasonably expect them to present an experiment orally in class.

Students don't have to be "writers", either, in order for teachers or employers to reasonably expect them to write an essay about physics or computer science. Even so, you might be surprised how many physics and computer science students complain if you ask them to write an essay. And if you dare expect them to spell words correctly, or to write prose somewhat more organized than Faulkner stream of consciousness -- stand back.

(Rant aside, I have been quite lucky this May term. I've had my students write something for me every night, whether a review of something they've read or a reflection on the practices they are struggling to learn. There's been nary a complaint, and most of their writings have been organized, clear, and enjoyable to read.)

You don't have to be a physicist to like physics. I hope that most educated adults in the 21st century understand how the physical world works and appreciate the basic mechanisms of the universe. I dare to hope that many of them are curious enough to want to learn more.

You also don't have to be a computer programmer, let alone a computer scientist, to write a little code. Programs are simply another medium through which we can create and express ideas from across the spectrum of human thought. Hurray to Allain for being in the vanguard.


Note. Long-time readers of this blog may recognize the ideas underlying Allain's approach to teaching introductory physics. He uses Matter and Interactions, a textbook and set of supporting materials created by Ruth Chabay and Bruce Sherwood. Six years ago, I wrote about some of Chabay's and Sherwood's ideas in an entry on creating a dialogue between science and CS and mentioned the textbook project in an entry on scientists who program. These entries were part of a report on my experiences attending SECANT, a 2007 NSF workshop on the intersection of science, computation, and education.

I'm glad to see that the Matter and Interactions project continued to fruition and has begun to seep into university physics instruction. It sounds like a neat way to learn physics. It's also a nice way to pick up a little "stealth programming" along the way. I can imagine a few students creating VPython simulations and thinking, "Hey, I'd like to learn more about this programming thing..."

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 25, 2014 12:03 PM

CS Prof From Iowa Was a 'Heroine of Computing' -- and a Nun

While cleaning up the house recently for a family visit, I came across a stack of newspaper articles I'd saved from last fall. Among them was an article about a September 7, 2013, exhibition at The National Museum of Computing in Bletchley Park, Milton Keynes, England. The exhibition was titled "Celebrating the Heroines of Computing". That alone would have made the article worth clipping, but it had a closer connection to me: it featured a CS professor from the state of Iowa, who was also a Catholic nun.

Sister Mary Kenneth Keller, with Paul Laube, MD, undated

Sister Mary Kenneth Keller was a professed member of the Sisters of Charity of the Blessed Virgin Mary, an order of nuns based in Dubuque, Iowa. If you have had the privilege of working or studying with nuns, you know that they are often amazing people. Sister Mary Kenneth certainly was. She was also a trailblazer who studied computer science before it was a thing and helped to create a CS department:

As the first person to receive a Ph.D. in computer science from the University of Wisconsin-Madison, she was a strong advocate for women entering the field of computer science. For nearly 20 years she served as chair of the newly-created computer science department at Clarke University and was among the first to recognize the future importance of computers in the sciences, libraries and business. Under her leadership at Clarke, a master's degree program in computer applications in education was included.

Claims that some individual was the "first person to receive a Ph.D. in computer science" have been relatively common over the years. The Department of Computer Science at Wisconsin has a page listing Ph.D.'s conferred, 1965-1970, which lists Sister Mary Kenneth first, for a dissertation titled "Inductive Inference on Computer Generated Patterns". But that wasn't her only first; this ACM blog piece by Ralph London asserts that Keller is the first woman to receive a Ph.D. in CS anywhere in the US, and one of the first two US CS Ph.D.s overall.

This bit of history is only a small part of Keller's life in academia and computing. She earned a master's degree in math at DePaul University in the early 1950s. In 1958, she worked at the Dartmouth College Computer Center as part of an NSF workshop, during which time she participated in the development of the BASIC programming language. She wrote four books on computing and served as consultant for a group of business and government organizations that included the city of Dubuque and the state of Illinois.

Sister Mary Kenneth spent her career on the faculty of Clarke University, apparently chairing the Department of Computer Science until her retirement. The university's computer center is named the Keller Computer Center and Information Service in her honor, as is a scholarship for students of computing.

I'd been in Iowa twenty years before I first heard this story of an Iowan's role in the history of computing. Her story also adds to the history of women in computing and, for me, creates a whole new area in the history of computing: women religious. A pretty good find for cleaning up the house.


The passage quoted above comes from an article by Jody Iler, "BVM to be Featured as One of the 'Heroines of Computing'", which ran some time last fall in The Witness, the newspaper of the Archdiocese of Dubuque. I found substantially the same text on a news archive page on the web site of the Sisters of Charity, BVM. There is, of course, a Wikipedia page for Sister Mary Kenneth that reports many of the same details of her life.

The photo above, which appears both in Iler's article and on the web site, shows Sister Mary Kenneth with Dr. Paul Laube, a flight surgeon from Dubuque who consulted with her on some computing matter. (Laube's obituary indicates he lived an interesting life as well.) In the article, the photo is credited to Clarke University.

Posted by Eugene Wallingford | Permalink | Categories: Computing

May 07, 2014 3:39 PM

Thinking in Types, and Good Design

Several people have recommended Pat Brisbin's Thinking in Types for programmers with experience in dynamically-typed languages who are looking to grok Haskell-style typing. He wrote it after helping one of his colleagues get unstuck with a program that "seemed conceptually simple but resulted in a type error" in Haskell when implemented in a way similar to a solution in a language such as Python or Ruby.

This topic is of current interest to me at a somewhat higher level. Few of our undergrads have a chance to program in Haskell as a part of their coursework, though a good number of them learn Scala while working at a local financial tech company. However, about two-thirds of undergrads now start with one or two semesters of Python, and types are something of a mystery to them. This affects their learning of Java and colors how they think about types if they take my course on programming languages.

So I read this paper. I have two comments.

First, let me say that I agree with my friends and colleagues who are recommending this paper. It is a clear, concise, and well-written description of how to use Haskell's types to think about a problem. It uses examples that are concrete enough that even our undergrads could implement them with a little help. I may use this as a reading in my languages course next spring.

Second, I think this paper does more than simply teach people about types in a Haskell-like language. It also gives a great example of how thinking about types can help programmers create better designs for their programs, even if they are working in an object-oriented language! Further, it hits right at the heart of the problem we face these days, with students who are used to working in scripting languages that provide high-level but very generic data structures.

The problem that Brisbin addresses happens after he helps his buddy create type classes and two instance classes, and they reach this code:

    renderAll [ball, playerOne, playerTwo]

renderAll takes a list of values that are Render-able. Unfortunately, in this case, the arguments come from two different classes... and Haskell does not allow heterogeneous lists. We could try to work around this feature of Haskell and "make it fit", but as Brisbin points out, doing so would cause you to lose the advantages of using Haskell in the first place. The compiler wouldn't be able to find errors in the code.

The Haskell way to solve the problem is to replace the generic list of stuff we pass to renderAll with a new type. With a new Game type that composes a ball with two players, we are able to achieve several advantages at once:

  • create a polymorphic render method for Game that passes muster with the type checker
  • allow the type checker to ensure that this element of our program is correct
  • make the program easier to extend in a type-safe way
  • and, perhaps most importantly, express the intent of the program more clearly

It's this last win that jumped off the page for me. Creating a Game class would give us a better object-oriented design in his colleague's native language, too!

Students who become accustomed to programming in languages like Python and Ruby often become accustomed to using untyped lists, arrays, hashes, and tuples as their go-to collections. They are oh so handy, often the quickest route to a program that works on the small examples at hand. But those very handy data structures promote sloppy design, or at least enable it; they make it easy not to see very basic objects living in the code.

Who needs a Game class when a Python list or Ruby array works out of the box? I'll tell you: you do, as soon as you try to do almost anything else in your program. Otherwise, you begin working around the generality of the list or array, writing code to handle special cases that really aren't special cases at all. They are simply unbundled objects running wild in the program.
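To make the point concrete, here is a minimal Python sketch of the same move. It is not Brisbin's Haskell code; the class names, fields, and the text-based render methods are all assumptions made up for illustration. The Game class names the bundle that a bare list would leave implicit.

```python
from dataclasses import dataclass


@dataclass
class Ball:
    x: int
    y: int

    def render(self) -> str:
        return f"ball at ({self.x}, {self.y})"


@dataclass
class Player:
    name: str
    y: int

    def render(self) -> str:
        return f"{self.name} paddle at y={self.y}"


@dataclass
class Game:
    """Replaces the anonymous list [ball, player_one, player_two]."""
    ball: Ball
    player_one: Player
    player_two: Player

    def render(self) -> str:
        # Each part renders itself; Game only decides the order.
        parts = (self.ball, self.player_one, self.player_two)
        return "\n".join(part.render() for part in parts)


game = Game(Ball(40, 12), Player("one", 10), Player("two", 14))
print(game.render())
```

Code that needs "the second player" now asks for game.player_two rather than remembering that it lives at index 2 of a list.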

Good design is good design. Most of the features of a good design transcend any particular programming style or language.

So: This paper is a great read! You can use it to learn better how to think like a Haskell programmer. And you can use it to learn even if thinking like a Haskell programmer is not your goal. I'm going to use it, or something like it, to help my students become better OO programmers.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

April 09, 2014 3:26 PM

Programming Everywhere, Vox Edition

In a report on the launch of Vox Media, we learn that the line between software developers and journalists at Vox is blurred, as writers and reporters work together "to build the tools they require".

"It is thrilling as a journalist being able to envision a tool and having it become a real thing," Mr. Topolsky said. "And it is rare."

It will be less rare in the future. Programming will become a natural part of more and more people's toolboxes.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

April 04, 2014 12:43 PM

Creative Recombination of Existing Ideas

In a post on his move to California, Jon Udell notes that he may be out of step with the dominant view of the tech industry there:

And I think differently about innovation than Silicon Valley does. I don't think we lack new ideas. I think we lack creative recombination of proven tech, and the execution and follow-through required to surface its latent value.

As he found with the Elm City project, sometimes a good idea doesn't get traction quickly, even with sustained effort. Calendar aggregation seems like such a win even for a university the size of mine, yet a lot of folks don't get it. It's hard to know whether the slowness results from the idea, the technology, or the general resistance of communities to change how they operate.

In any case, Udell is right: there is a lot of latent value in the "creative recombination" of existing ideas. Ours is a remix culture, too. That's why it's so important to study widely in and out of computing, to build the base of tools needed to have a great idea and execute on it.

Posted by Eugene Wallingford | Permalink | Categories: Computing

March 31, 2014 3:21 PM

Programming, Defined and Re-imagined

By Chris Granger of Light Table fame:

Programming is our way of encoding thought such that the computer can help us with it.

Read the whole piece, which recounts Granger's reflection after the Light Table project left him unsatisfied and he sought answers. He concludes that we need to re-center our idea of what programming is and how we can make it accessible to more people. Our current idea of programming doesn't scale because, well,

It turns out masochism is a hard sell.

Every teacher knows this. You can sell masochism to a few proud souls, but not to anyone else.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 12, 2014 3:55 PM

Not Content With Content

Last week, the Chronicle of Higher Ed ran an article on a new joint major at Stanford combining computer science and the humanities.

[Students] might compose music or write a short story and translate those works, through code, into something they can share on the web.

"For students it seems perfectly natural to have an interest in coding," [the program's director] said. "In one sense these fields might feel like they're far apart, but they're getting closer and closer."

The program works in both directions, by also engaging CS students in the societal issues created by ubiquitous networks and computing power.

We are doing something similar at my university. A few years ago, several departments began to collaborate on a multidisciplinary program called Interactive Digital Studies, which went live in 2012. In the IDS program, students complete a common core of courses from the Communication Studies department and then take "bundles" of coursework involving digital technology from at least two different disciplines. These areas of emphasis enable students to explore the interaction of computing with various topics in media, the humanities, and culture.

Like Stanford's new major, most of the coursework is designed to work at the intersection of disciplines, rather than pursuing disciplines independently, "in parallel".

The initial version of the computation bundle consists of an odd mix of application tools and opportunities to write programs. Now that the program is in place, we are finding that students and faculty alike desire more depth of understanding about programming and development. We are in the process of re-designing the bundle to prepare students to work in a world where so many ideas become web sites or apps, and in which data analytics plays an important role in understanding what people do.

Both our IDS program and Stanford's new major focus on something that we are seeing increasingly at universities these days: the intersections of digital technology and other disciplines, in particular the humanities. Computational tools make it possible for everyone to create more kinds of things, but only if people learn how to use new tools and think about their work in new ways.

Consider this passage by Jim O'Loughlin, a UNI English professor, from a recent position statement on the "digital turn" of the humanities:

We are increasingly unlikely to find writers who only provide content when the tools for photography, videography and digital design can all be found on our laptops or even on our phones. It is not simply that writers will need to do more. Writers will want to do more, because with a modest amount of effort they can be their own designers, photographers, publishers or even programmers.

Writers don't have to settle for producing "content" and then relying heavily on others to help bring the content to an audience. New tools enable writers to take greater control of putting their ideas before an audience. But...

... only if we [writers] are willing to think seriously not only about our ideas but about what tools we can use to bring our ideas to an audience.

More tools are within the reach of more people now than ever before. Computing makes that possible, not only for writers, but also for musicians and teachers and social scientists.

Going further, computer programming makes it possible to modify existing tools and to create new tools when the old ones are not sufficient. Writers, musicians, teachers, and social scientists may not want to program at that level, but they can participate in the process.

The critical link is preparation. This digital turn empowers only those who are prepared to think in new ways and to wield a new set of tools. Programs like our IDS major and Stanford's new joint major are among the many efforts hoping to spread the opportunities available now to a larger and more diverse set of people.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

March 08, 2014 10:18 AM

Sometimes a Fantasy

This week I saw a link to The Turing School of Software & Design, "a seven-month, full-time program for people who want to become professional developers". It reminded me of Neumont University, a ten-year-old school that offers a B.S. degree program in computer science that students can complete in two and a half years.

While riding the bike, I occasionally fantasize about doing something like this. With the economics of universities changing so quickly [ 1 | 2 ], there is an opportunity for a new kind of higher education. And there's something appealing about being able to work closely with a cadre of motivated students on the full spectrum of computer science and software development.

This could be an accelerated form of traditional CS instruction, without the distractions of other things, or it could be something different. Traditional university courses are pretty confining. "This course is about algorithms. That one is about programming languages." It would be fun to run a studio in which students serve as apprentices making real stuff, all of us learning as we go along.

A few years ago, one of our ChiliPLoP hot topic groups conducted a greenfield thought experiment to design an undergrad CS program outside of the constraints of any existing university structure. Student advancement was based on demonstrating professional competencies, not completing packaged courses. It was such an appealing idea! Of course, there was a lot of hard work to be done working out the details.

My view of university is still romantic, though. I like the idea of students engaging the great ideas of humanity that lie outside their major. These days, I think it's conceivable to include the humanities and other disciplines in a new kind of CS education. In a recent blog entry, Hollis Robbins floats the idea of Home College for the first year of a liberal arts education. The premise is that there are "thousands of qualified, trained, energetic, and underemployed Ph.D.s [...] struggling to find stable teaching jobs". Hiring a well-rounded tutor could be a lot less expensive than a year at a private college, and more lucrative for the tutor than adjuncting.

Maybe a new educational venture could offer more than targeted professional development in computing or software. Include a couple of humanities profs, maybe a social scientist, and it could offer a more complete undergraduate education -- one that is economical both in time and money.

But the core of my dream is going broad and deep in CS without the baggage of a university. Sometimes a fantasy is all you need. Other times...

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

March 01, 2014 11:35 AM

A Few Old Passages

I was looking over a couple of files of old notes and found several quotes that I still like, usually from articles I enjoyed as well. They haven't found their way into a blog entry yet, but they deserve to see the light of day.

Evidence, Please!

From a short note on the tendency even among scientists to believe unsubstantiated claims, both in and out of the professional context:

It's hard work, but I suspect the real challenge will lie in persuading working programmers to say "evidence, please" more often.

More programmers and computer scientists are trying to collect and understand data these days, but I'm not sure we've made much headway in getting programmers to ask for evidence.

Sometimes, Code Before Math

From a discussion of The Expectation Maximization Algorithm:

The code is a lot simpler to understand than the math, I think.

I often understand the language of code more quickly than the language of math. Reading, or even writing, a program sometimes helps me understand a new idea better than reading the math. Theory is, however, great for helping me to pin down what I have learned more formally.

Grin, Wave, Nod

From Iteration Inside and Out, a review of the many ways we loop over stuff in programs:

Right now, the Rubyists are grinning, the Smalltalkers are furiously waving their hands in the air to get the teacher's attention, and the Lispers are just nodding smugly in the back row (all as usual).

As a Big Fan of all three languages, I am occasionally conflicted. Grin? Wave? Nod? Look like the court jester by doing all three simultaneously?

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

February 21, 2014 3:35 PM

Sticking with a Good Idea

My algorithms students and I recently considered the classic problem of finding the closest pair of points in a set. Many of them were able to produce a typical brute-force approach, such as:

    minimum ← ∞
    for i ← 1 to n do
        for j ← i+1 to n do
            distance ← sqrt((x[i] - x[j])² + (y[i] - y[j])²)
            if distance < minimum then
                minimum ← distance
                first   ← i
                second  ← j
    return (first, second)

Alas, this is an O(n²) process, so we considered whether we might do better with a divide-and-conquer approach. It did not look promising, though. Divide-and-conquer doesn't let us solve the sub-problems independently. What if the closest pair straddles two partitions?

This is a common theme in computing, and problem solving more generally. We try a technique only to find that it doesn't quite work. Something doesn't fit, or a feature of the domain violates a requirement of the technique. It's tempting in such cases to give up and settle for something less.

Experienced problem solvers know not to give up too quickly. Many of the great advances in computing came under conditions just like this. Consider Leonard Kleinrock and the theory of packet switching.

In a Computing Conversations podcast published last year, Kleinrock talks about his Ph.D. research. He was working on the problem of how to support a lot of bursty network traffic on a shared connection. (You can read a summary of the podcast in an IEEE Computer column also published last year.)

His wonderful idea: apply the technique of time sharing from multi-user operating systems. The system could break all messages into "packets" of a fixed size, let messages take turns on the shared line, then reassemble each message on the receiving end. This would give every message a chance to move without making short messages wait too long behind large ones.
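The turn-taking idea can be sketched in a few lines of Python. This is only an illustration of the description above, not Kleinrock's model or any real network protocol; the packet size, the function names, and the tuple layout are assumptions.

```python
from collections import defaultdict, deque

PACKET_SIZE = 4  # hypothetical packet size, in characters


def packetize(msg_id, text, size=PACKET_SIZE):
    """Break one message into (msg_id, sequence number, chunk) packets."""
    return deque(
        (msg_id, seq, text[start:start + size])
        for seq, start in enumerate(range(0, len(text), size))
    )


def share_line(messages, size=PACKET_SIZE):
    """Yield packets in the order they cross the shared line: one per message per turn."""
    queues = deque(q for q in (packetize(m, t, size) for m, t in messages.items()) if q)
    while queues:
        queue = queues.popleft()
        yield queue.popleft()
        if queue:  # more packets left: go to the back of the line
            queues.append(queue)


def reassemble(packets):
    """Rebuild each message on the receiving end from its numbered packets."""
    chunks = defaultdict(list)
    for msg_id, seq, chunk in packets:
        chunks[msg_id].append((seq, chunk))
    return {m: "".join(c for _, c in sorted(parts)) for m, parts in chunks.items()}


messages = {"long": "abcdefghijkl", "short": "hi"}
line = list(share_line(messages))
print(line)
print(reassemble(line) == messages)
```

In this run the short message crosses the line second, after only one packet of the long message, rather than waiting behind all three.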

Thus was born the idea of packet switching. But there was a problem. Kleinrock says:

I set up this mathematical model and found it was analytically intractable. I had two choices: give up and find another problem to work on, or make an assumption that would allow me to move forward. So I introduced a mathematical assumption that cracked the problem wide open.

His "independence assumption" made it possible for him to complete his analysis and optimize the design of a packet-switching network. But an important question remained: Was his simplifying assumption too big a cheat? Did it skew the theoretical results in such a way that his model was no longer a reasonable approximation of how networks would behave in the real world?

Again, Kleinrock didn't give up. He wrote a program instead.

I had to write a program to simulate these networks with and without the assumption. ... I simulated many networks on the TX-2 computer at Lincoln Laboratories. I spent four months writing the simulation program. It was a 2,500-line assembly language program, and I wrote it all before debugging a single line of it. I knew if I didn't get that simulation right, I wouldn't get my dissertation.

High-stakes programming! In the end, Kleinrock was able to demonstrate that his analytical model was sufficiently close to real-world behavior that his design would work. Every one of us reaps the benefit of his persistence every day.

Sometimes, a good idea poses obstacles of its own. We should not let those obstacles beat us without a fight. Often, we just have to find a way to make it work.

This lesson applies quite nicely to using divide-and-conquer on the closest pairs problem. In this case, we don't make a simplifying assumption; we solve the sub-problem created by our approach:

After finding a candidate for the closest pair, we check to see if there is a closer pair straddling our partitions. The distance between the candidate points constrains the area we have to consider quite a bit, which makes the post-processing step feasible. The result is an O(n log n) algorithm that improves significantly on brute force.
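Here is a compact Python sketch of both approaches, under a few stated assumptions: points are (x, y) tuples, there are at least two of them, and the function names are mine. For brevity this version re-sorts the strip by y at every level, which gives O(n log² n) rather than the O(n log n) bound; reaching that bound requires merging the halves by y instead of sorting.

```python
from math import dist, inf


def brute_force(points):
    """Check every pair: O(n^2), as in the pseudocode above."""
    best = (inf, None, None)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = dist(points[i], points[j])
            if d < best[0]:
                best = (d, points[i], points[j])
    return best


def closest_pair(points):
    """Divide and conquer; returns (distance, point, point)."""
    return _closest(sorted(points))  # sort by x once, up front


def _closest(px):
    if len(px) <= 3:
        return brute_force(px)

    mid = len(px) // 2
    mid_x = px[mid][0]
    # Solve each half independently and keep the better candidate.
    best = min(_closest(px[:mid]), _closest(px[mid:]), key=lambda t: t[0])

    # Post-processing: only points closer to the dividing line than the
    # candidate distance can form a closer pair that straddles the halves.
    strip = sorted((p for p in px if abs(p[0] - mid_x) < best[0]),
                   key=lambda p: p[1])
    for i in range(len(strip)):
        for j in range(i + 1, len(strip)):
            if strip[j][1] - strip[i][1] >= best[0]:
                break  # everything further up the strip is too far away
            d = dist(strip[i], strip[j])
            if d < best[0]:
                best = (d, strip[i], strip[j])
    return best


points = [(0, 0), (5, 4), (3, 1), (9, 9), (3.5, 1.2), (7, 2), (1, 8)]
d, p, q = closest_pair(points)
print(p, q, d)
```

The inner loop's early break is where the candidate distance "constrains the area we have to consider": each point in the strip is compared with only a handful of neighbors.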

This algorithm, like packet switching, comes from sticking with a good idea and finding a way to make it work. This is a lesson every computer science student and novice programmer needs to learn.

There is a complementary lesson to be learned, of course: knowing when to give up on an idea and move on to something else. Experience helps us tell the two situations apart, though never with perfect accuracy. Sometimes, we just have to follow an idea long enough to know when it's time to move on.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

February 19, 2014 4:12 PM

Teaching for the Perplexed and the Traumatized

On the need for empathy when writing about math for the perplexed and the traumatized, Steven Strogatz says:

You have to help them love the questions.

Teachers learn this eventually. If students love the questions, they will do an amazing amount of work searching for answers.

Strogatz is writing about writing, but everything he says applies to teaching as well, especially teaching undergraduates and non-majors. If you teach only grad courses in a specialty area, you may be able to rely on the students to provide their own curiosity and energy. Otherwise, having empathy, making connections, and providing Aha! moments are a big part of being successful in the classroom. Stories trump formal notation.

This semester, I've been trying a particular angle on achieving this trifecta of teaching goodness: I try to open every class session with a game or puzzle that the students might care about. From there, we delve into the details of algorithms and theoretical analysis. I plan to write more on this soon.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

February 02, 2014 5:19 PM

Things That Make Me Sigh

In a recent article unrelated to modern technology or the so-called STEM crisis, a journalist writes:

Apart from mathematics, which demands a high IQ, and science, which requires a distinct aptitude, the only thing that normal undergraduate schooling prepares a person for is... more schooling.


On the one hand, this seems to presume that one needs neither a high IQ nor any particular aptitude to excel in any number of non-math and science disciplines.

On the other, it seems to say that if one does not have the requisite characteristics, which are limited to a lucky few, one need not bother with computer science, math or science. Best become a writer or go into public service, I guess.

I actually think that the author is being self-deprecating, at least in part, and that I'm probably reading too much into one sentence. It's really intended as a dismissive comment on our education system, the most effective outcome of which often seems to be students who are really good at school.

Unfortunately, the attitude expressed about math and science is far too prevalent, even in our universities. It demeans our non-scientists as well as our scientists and mathematicians. It also makes it even harder to convince younger students that, with a little work and practice, they can achieve satisfying lives and careers in technical disciplines.

Like I said, sigh.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 27, 2014 11:39 AM

The Polymath as Intellectual Polygamist

Carl Djerassi, quoted in The Last Days of the Polymath:

Nowadays people [who] are called polymaths are dabblers -- are dabblers in many different areas. I aspire to be an intellectual polygamist. And I deliberately use that metaphor to provoke with its sexual allusion and to point out the real difference to me between polygamy and promiscuity.

On this view, a dilettante is merely promiscuous, making no real commitment to any love interest. A polymath has many great loves, and loves them all deeply, if not equally.

We tend to look down on dilettantes, but they can perform a useful service. Sometimes, making a connection between two ideas at the right time and in the right place can help spur someone else to "go deep" with the idea. Even when that doesn't happen, dabbling can bring great personal joy and provide more substantial entertainment than a lot of pop culture.

Academics are among the people these days with a well-defined social opportunity to explore at least two areas deeply and seriously: their chosen discipline and teaching. This is perhaps the most compelling reason to desire a life in academia. It even offers a freedom to branch out into new areas later in one's career that is not so easily available to people who work in industry.

These days, it's hard to be a polymath even inside one's own discipline. To know all sub-areas of computer science, say, as well as the experts in those sub-areas is a daunting challenge. I think back to the effort my fellow students and I put in over the years that enabled us to take the Ph.D. qualifying exams in CS. I did quite well across the board, but even then I didn't understand operating systems or programming languages as well as experts in those areas. Many years later, despite continued reading and programming, the gap has only grown.

I share the vague sense of loss, expressed by the author of the article linked to above, of a time when one human could master multiple areas of discourse and make fundamental advances to several. We are certainly better off for collectively understanding the world so much better, but the result is a blow to a certain sort of individual mind and spirit.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

January 26, 2014 3:05 PM

One Reason We Need Computer Programs

Code bridges the gap between theory and data. From A few thoughts on code review of scientific code:

... there is a gulf of unknown size between the theory and the data. Code is what bridges that gap, and specifies how edge cases, weird features of the data, and unknown unknowns are handled or ignored.

I learned this lesson the hard way as a novice programmer. Other activities, such as writing and doing math, exhibit the same characteristic, but it wasn't until I started learning to program that the gap between theory and data really challenged me.

Since learning to program myself, I have observed hundreds of CS students encounter this gap. To their credit, they usually buckle down, work hard, and close the gap. Of course, we have to close the gap for every new problem we try to solve. The challenge doesn't go away; it simply becomes more manageable as we become better programmers.

In the passage above, Titus Brown is talking to his fellow scientists in biology and chemistry. I imagine that they encounter the gap between theory and data in a new and visceral way when they move into computational science. Programming has that power to change how we think.

There is an element of this, too, in how techies and non-techies alike sometimes lose track of how hard it is to create a successful start-up. You need an idea, you need a programmer, and you need a lot of hard work to bridge the gap between idea and executed idea.

Whether doing science or starting a company, the code teaches us a lot about our theory. The code makes our theory better.

As Ward Cunningham is fond of saying, it's all talk until the tests run.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

January 23, 2014 4:14 PM

The CS Mindset

Chad Orzel often blogs about the physics mindset, the default orientation that physicists tend to have toward the world, and the way they think about and solve problems. It is fun to read a scientist talking about doing science.

Earlier this week I finally read this article about the popularity of CS50, an intro CS course at Harvard. It's all about how Harvard "is righting an imbalance in academia" by finally teaching practical skills to its students. When I read:

"CS50 made me look at Harvard with new eyes," Guimaraes said.

That is a sea change from what Harvard represented to the outside world for decades: the guardian of a classic education, where the value of learning is for its own sake.

I sighed audibly, loud enough for the students on the neighboring exercise equipment to hear. A Harvard education used to be about learning only for its own sake, but now students can learn practical stuff, too. Even computer programming!

As I re-read the article now, I see that it's not as blunt as that throughout. Many Harvard students are learning computing because of the important role it plays in their futures, whatever their major, and they understand the value of understanding it better. But there are plenty of references to "practical ends" and Harvard's newfound willingness to teach practical skills it once considered beneath it.

Computer programming is surely one of those topics old Charles William Eliot would deem unworthy of inclusion in Harvard's curriculum.

I'm sensitive to such attitudes because I think computer science is and should be more. If you look deeper, you will see that the creators of CS50 think so, too. On its Should I take CS50? FAQ page, we find:

More than just teach you how to program, this course teaches you how to think more methodically and how to solve problems more effectively. As such, its lessons are applicable well beyond the boundaries of computer science itself.

The next two sentences muddy the water a bit, though:

That the course does teach you how to program, though, is perhaps its most empowering return. With this skill comes the ability to solve real-world problems in ways and at speeds beyond the abilities of most humans.

With this skill comes something else, something even more important: a discipline of thinking and a clarity of thought that are hard to attain when you learn "how to think more methodically and how to solve problems more effectively" in the abstract or while doing almost any other activity.

Later the same day, I was catching up on a discussion taking place on the PLT-EDU mailing list, which is populated by the creators, users, and fans of the Racket programming language and the CS curriculum designed in tandem with it. One poster offered an analogy for talking to HS students about how and why they are learning to program. A common theme in the discussion that ensued was to take the conversation off of the "vocational track". Why encourage students to settle for such a limiting view of what they are doing?

One snippet from Matthias Felleisen (this link works only if you are a member of the list) captured my dissatisfaction with the tone of the Globe article about CS50:

If we require K-12 students to take programming, it cannot be justified (on a moral basis) with 'all of you will become professional programmers.' I want MDs who know the design recipe, I want journalists who write their articles using the design recipe, and so on.

The "design recipe" is a thinking tool students learn in Felleisen's "How to Design Programs" curriculum. It is a structured way to think about problems and to create solutions. Two essential ideas stand out for me:

  • Students learn the design recipe in the process of writing programs. This isn't an abstract exercise. Creating a working computer program is tangible evidence that the student has understood the problem and created a clear solution.
  • This way of thinking is valuable for everyone. We will all be better off if our doctors, lawyers, and journalists are able to think this way.
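For readers who have not seen it, here is a rough sketch of what following the recipe's steps might look like for one small function. HtDP teaches the recipe in Racket-based teaching languages; this Python transliteration, the step labels in the comments, and the temperature example are my own assumptions, not material from the curriculum.

```python
# 1. Data definition: a temperature is a number of degrees,
#    in either Fahrenheit or Celsius.

# 2. Signature, purpose statement, and header.
def fahrenheit_to_celsius(f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    # 3. Examples, worked by hand before writing the body:
    #    32 -> 0, 212 -> 100, -40 -> -40
    # 4-5. Template and definition: the input is atomic, so the body
    #      is a single formula.
    return (f - 32) * 5 / 9


# 6. Tests: turn the hand-worked examples into checks that run.
assert fahrenheit_to_celsius(32) == 0
assert fahrenheit_to_celsius(212) == 100
assert fahrenheit_to_celsius(-40) == -40
```

The examples come before the code, and the tests come from the examples. That ordering is the discipline.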

This is one of my favorite instantiations of the vague term computational thinking so many people use without much thought. It is a way of thinking about problems both abstractly and concretely, that leads to solutions that we have verified with tests.

You might call this the CS mindset. It is present in CS50 independent of any practical ends associated with tech start-ups and massaging spreadsheet data. It is practical on a higher plane. It is also present in the HtDP curriculum and especially in the Racket Way.

It is present in all good CS education, even the CS intro courses that more students should be taking -- even if they are only going to be doctors, lawyers, or journalists.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 16, 2014 4:22 PM

Another Take on "Know the Past"

Ted Nelson offers a particularly stark assessment of how well we fulfill our obligation to know the past in his eulogy for Douglas Engelbart:

To quote Joan of Arc, from Shaw's play about her: "When will the world be ready to receive its saints?"

I think we know the answer -- when they are dead, pasteurized and homogenized and simplified into stereotypes, and the true depth and integrity of their ideas and initiatives are forgotten.

Nelson's position is stronger yet, because he laments the way in which Engelbart and his visions of the power of computing were treated throughout his career. How, he wails, could we have let this vision slip through our hands while Engelbart lived among us?

Instead, we worked on another Java IDE or a glue language for object-relational mapping. All the while, as Nelson says, "the urgent and complex problems of mankind have only grown more urgent and more complex."

This teaching is difficult; who can accept it?

Posted by Eugene Wallingford | Permalink | Categories: Computing

January 08, 2014 3:06 PM

"I'm Not a Programmer"

In The Exceptional Beauty of Doom 3's Source Code, Shawn McGrath first says this:

I've never really cared about source code before. I don't really consider myself a 'programmer'.

Then he says this:

Dyad has 193k lines of code, all C++.

193,000 lines of C++? Um, dude, you're a programmer.

Even so, the point is worth thinking about. For most people, programming is a means to an end: a way to create something. Many CS students start with a dream program in mind and only later, like McGrath, come to appreciate code for its own sake. Some of our graduates never really get there, and appreciate programming mostly for what they can do with it.

If the message we send from academic CS is "come to us only if you already care about code for its own sake", then we may want to fix our message.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

January 07, 2014 4:09 PM

Know The Past

In this profile of computational geneticist Jason Moore, the scientist explains how his work draws on work from the 1910s, which may offer computational genetics a better path forward than the work that catapulted genetics forward in the 1990s and 2000s.

Yet despite his use of new media and advanced technology, Moore spends a lot of time thinking about the past. "We have a lot to learn from early geneticists," he says. "They were smart people who were really thinking deeply about the problem."

Today, he argues, genetics students spend too much time learning to use the newest equipment and too little time reading the old genetics literature. Not surprisingly, given his ambivalent attitude toward technology, Moore believes in the importance of history. "Historical context is so important for what we do," he says. "It provides a grounding, a foundation. You have to understand the history in order ... to understand your place in the science."

Anyone familiar with the history of computing knows there is another good reason to know your history: Sometimes, we dream too small these days, and settle for too little. We have a lot to learn from early computer scientists.

I intend to make this a point of emphasis in my algorithms course this spring. I'd like to expose students to important new ideas outside the traditional canon (more on that soon), while at the same time exposing them to some of the classic work that hasn't been topped.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 24, 2013 11:35 AM

Inverting the Relationship Between Programs and Literals

This chapter on quasi-literals in the E language tells the story of Scott Kim teaching the author that Apple's HyperCard was "powerful in a way most of us had not seen before". This unexpected power led many people to misunderstand its true importance.

Most programs are written as text, sequences of characters. In this model, a literal is a special form of embedded text. When the cursor is inside the quotes of a literal string, we are "effectively no longer in a program editor, but in a nested text editor". Kim calls this a 'pun': Instead of writing text to be evaluated by another program, we are creating output directly.

What if we turn things inside out and embed our program in literal text?

Hypercard is normally conceived of as primarily a visual application builder with an embedded silly programming language (Hypertalk). Think instead of a whole Hypercard stack as a mostly literal program. In addition to numbers and strings, the literals you can directly edit include bitmaps, forms, buttons, and menus. You can literally assemble most things, but where you need dynamic behavior, there's a hole in the literal filled in by a Hypertalk script. The program editor for this script is experienced as being nested inside the direct manipulation user interface editor.

There is a hole in the literal text, where a program goes, instead of a hole in the program, where literal text goes.

HyperTalk must surely have seemed strange to most programmers in 1987. Lisp programmers had long used macros and so knew the power of nesting code to be eval'ed inside of literal text. Of course, the resulting text was then passed on to eval to be treated again as a program!

The inside-out idea of HyperCard is alive today in the form of languages such as PHP, which embed code in HTML text:

      <?php echo $_SERVER['HTTP_USER_AGENT']; ?>

This is a different way to think about programming, one perhaps suitable for bringing experts in some domains toward the idea of writing code gradually from documents in their area of expertise.

I sometimes have the students in my compiler course implement a processor for a simple Mustache-like template language as an early warm-up homework assignment. I do not usually require them to go as far as Turing-complete embedded code, but they create a framework that makes it possible. I think I'll look for ways to bring more of this idea into the next offering of our more general course on programming languages.
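To make the "hole in the literal text" idea concrete, here is a minimal sketch in Python of the kind of template processor I have in mind. It is not the assignment's actual specification: the `{{ name }}` syntax, the `render` function, and the choice to render missing names as empty text are all assumptions made for illustration.

```python
import re

# Matches a {{ name }} hole in otherwise literal text (assumed syntax).
HOLE = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def render(template, context):
    """Copy literal text through unchanged and fill each hole from context."""
    def fill(match):
        # Assumption: a name missing from the context renders as empty text.
        return str(context.get(match.group(1), ""))
    return HOLE.sub(fill, template)

page = render("<p>Hello, {{ name }}! You have {{ count }} new messages.</p>",
              {"name": "Ada", "count": 3})
print(page)
```

The literal text is the program here, and the only "code" is a variable lookup. Replacing `fill` with a call to an expression evaluator is the step that leads toward Turing-complete embedded code.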

(HyperCard really was more than many people realized at the time. The people who got it became Big Fans, and the program still has an ardent following. Check out this brief eulogy, which rhapsodizes on "the mystically-enchanting mantra" at the end of the application's About box: "A day of acquaintance, / And then the longer span of custom. / But first -- / The hour of astonishment.")

Posted by Eugene Wallingford | Permalink | Categories: Computing

December 16, 2013 2:20 PM

More Fun with Integer "Assembly Language": Brute-Forcing a Function Minimum

Or: Irrational Exuberance When Programming

My wife and daughter laughed at me yesterday.

A few years ago, I blogged about implementing Farey sequences in Klein, a language for which my students at the time were writing a compiler. Klein was a minimal functional language with few control structures, few data types, and few built-in operations. Computing rational approximations using Farey's algorithm was a challenge in Klein that I likened to "integer assembly programming".

I clearly had a lot of fun with that challenge, especially when I had the chance to watch my program run using my students' compilers.

This semester, I am again teaching the compiler course, and my students are writing a compiler for a new version of Klein.

Last week, while helping my daughter with a little calculus, I ran across a fun new problem to solve in Klein:

the task of optimizing cost across the river

There are two stations on opposite sides of a river. The river is 3 miles wide, and the stations are 5 miles apart along the river. We need to lay pipe between the stations. Pipe laid on land costs $2.00/foot, and pipe laid across the river costs $4.00/foot. What is the minimum cost of the project?

This is the sort of optimization problem one often encounters in calculus textbooks. The student gets to construct a couple of functions, differentiate one, and find a maximum or minimum by setting f' to 0 and solving.

Solving this problem in Klein creates some challenges. Among them are that ideally it involves real numbers, which Klein doesn't support, and that it requires a square root function, which Klein doesn't have. But these obstacles are surmountable. We already have tools for computing roots using Newton's method in our collection of test programs. Over a 3mi-by-5mi grid, an epsilon of a few feet approximates square roots reasonably well.

My daughter's task was to use the derivative of the cost function but, after talking about the problem with her, I was interested more in "visualizing" the curve to see how the cost drops as one moves in from either end and eventually bottoms out for a particular length of pipe on land.

So I wrote a Klein program that "brute-forces" the minimum. It loops over all possible values in feet for land pipe and compares the cost at each value to the previous value. It's easy to fake such a loop with a recursive function call.

The programmer's challenge in writing this program is that Klein has no local variables other than function parameters. So I had to use helper functions to simulate caching temporary variables. This allowed me to give a name to a value, which makes the code more readable, but most importantly it allowed me to avoid having to recompute expensive values in what was already a computationally-expensive program.

This approach creates another, even bigger challenge for my students, the compiler writers. My Klein program is naturally tail recursive, but tail call elimination was left as an optional optimization in our class project. With activation records for all the tail calls stored on the stack, a compiler has to use a lot of space for its run-time memory -- far more than is available on our default target machine.

How many frames do we need? Well, we need to compute the cost at every foot along a (5 miles x 5280 feet/mile) rectangle, for a total of 26,400 data points. There will, of course, be other activation records while computing the last value in the loop.
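For readers who don't speak Klein, the brute-force search described above can be sketched in Python. This is not my Klein program: it uses an ordinary loop where Klein needs a tail-recursive function, and every name in it is made up for illustration. It also assumes the usual textbook reading of the problem, in which the pipe runs along the bank for some distance and then crosses the river in a straight line to the other station. Like the Klein version, it works only with integers and computes square roots with Newton's method.

```python
FEET_PER_MILE = 5280
RIVER_WIDTH = 3 * FEET_PER_MILE     # feet across the river
SHORE_LENGTH = 5 * FEET_PER_MILE    # feet between the stations along the river
LAND_COST = 2                       # dollars per foot on land
RIVER_COST = 4                      # dollars per foot across the river

def isqrt_newton(n):
    """Floor of the square root of a non-negative integer, by Newton's method."""
    if n < 2:
        return n
    x = n
    while True:
        y = (x + n // x) // 2       # integer Newton step
        if y >= x:                  # estimates stopped decreasing: x is the floor
            return x
        x = y

def cost(land_feet):
    """Cost of laying land_feet of pipe on the bank, then crossing diagonally."""
    remaining = SHORE_LENGTH - land_feet
    river_feet = isqrt_newton(remaining * remaining + RIVER_WIDTH * RIVER_WIDTH)
    return LAND_COST * land_feet + RIVER_COST * river_feet

def brute_force_minimum():
    """Try every whole-foot length of land pipe and keep the cheapest."""
    best_feet, best_cost = 0, cost(0)
    for land_feet in range(1, SHORE_LENGTH + 1):
        c = cost(land_feet)
        if c < best_cost:
            best_feet, best_cost = land_feet, c
    return best_feet, best_cost

best_feet, best_cost = brute_force_minimum()
print(best_feet, best_cost)
```

Because the square root is truncated to whole feet, the cost at each point can be off by a few dollars, so the search lands near the calculus answer rather than exactly on it.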

Will I be able to see the answer generated by my program using my students' compilers? Only if one or more of the teams optimized tail calls away. We'll see soon enough.

So, I spent an hour or so writing Klein code and tinkering with it yesterday afternoon. I was so excited by the time I finished that I ran upstairs to tell my wife and daughter all about it: my excitement at having written the code, and the challenge it sets for my students' compilers, and how we could compute reasonable approximations of square roots of large integers even without real numbers, and how I implemented Newton's method in lieu of a sqrt, and...

That's when my wife and daughter laughed at me.

That's okay. I am a programmer. I am still excited, and I'd do it again.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

December 11, 2013 12:01 PM

"Costs $20" is Functionally Indistinguishable from Gone

In his write-up on the origin of zero-based indexing in computing, Mike Hoye comments on the difficulties he had tracking down original sources:

Part of the problem is access to the historical record, of course. I was in favor of Open Access publication before, but writing this up has cemented it: if you're on the outside edge of academia, $20/paper for any research that doesn't have a business case and a deep-pocketed backer is completely untenable, and speculative or historic research that might require reading dozens of papers to shed some light on longstanding questions is basically impossible. There might have been a time when this was OK and everyone who had access to or cared about computers was already an IEEE/ACM member, but right now the IEEE -- both as a knowledge repository and a social network -- is a single point of a lot of silent failure. "$20 for a forty-year-old research paper" is functionally indistinguishable from "gone", and I'm reduced to emailing retirees to ask them what they remember from a lifetime ago because I can't afford to read the source material.

I'm an academic. When I am on campus, I have access to the ACM Digital Library. When I go home, I do not. I could pay for a personal subscription, but that seems an unnecessary expense when I am on campus so much.

I never have access to IEEE Xplore, Hoye's "single point of silent failure". Our university library chose to drop its institutional subscription a few years ago, and for good reason: it is ridiculously expensive, especially relative to the value we receive from it university-wide. (We don't have an engineering college.) We inquired about sharing a subscription with our sister schools, as we are legally under a single umbrella, but at least at that time, IEEE didn't allow such sharing.

What about non-academics, such as Hoye? We are blessed in computing with innumerable practitioners who study our history, write about it, and create new ideas. Some are in industry and may have access to these resources, or an expense account. Many others, though, work on their own as independent contractors and researchers. They need access to materials, and $20 a pop is an unacceptable expense.

Their loss is our loss. If Hoye had not written his article on the history of zero-based indexing, most of us wouldn't know the full story.

As time goes by, I hope that open access to research publications continues to grow. We really shouldn't have to badger retired computer scientists with email asking what they remember now about a topic they wrote an authoritative paper on forty years ago.

Posted by Eugene Wallingford | Permalink | Categories: Computing

December 10, 2013 3:33 PM

Your Programming Language is Your Raw Material, Too

Recently someone I know retweeted this familiar sentiment:

If carpenters were hired like programmers:
"Must have at least 5 years experience with the Dewalt 18V 165mm Circular Saw"

This meme travels around the world in various forms all the time, and every so often it shows up in one of my inboxes. And every time I think, "There is more to the story."

In one sense, the meme reflects a real problem in the software world. Job ads often use lists of programming languages and technologies as requirements, when what the company presumably really wants is a competent developer. I may not know the particular technologies on your list, or be expert in them, but if I am an experienced developer I will be able to learn them and become an expert.

Understanding and skill run deeper than a surface list of tools.

But. A programming language is not just a tool. It is a building material, too.

Suppose that a carpenter uses a Dewalt 18V 165mm circular saw to add a room to your house. When he finishes the project and leaves your employ, you won't have any trace of the Dewalt in his work product. You will have a new room.

He might have used another brand of circular saw. He may not have used a power tool at all, preferring the fine craftsmanship of a handsaw. Maybe he used no saw of any kind. (What a magician!) You will still have the same new room regardless, and your life will proceed in the very same way.

Now suppose that a programmer uses the Java programming language to add a software module to your accounting system. When she finishes the project and leaves your employ, you will have the results of running her code, for sure. But you will have a trace of Java in her work product. You will have a new Java program.

If you intend to use the program again, to generate a new report from new input data, you will need an instance of the JVM to run it. If you want to modify the program to work differently, then you will also need a Java compiler to create the byte codes that run in the JVM. If you want to extend the program to do more, then you again will need a Java compiler and interpreter.

Programs are themselves tools, and we use programming languages to build them. So, while the language itself is surely a tool at one level, at another level it is the raw material out of which we create other things.

To use a particular language is to introduce a slew of other dependencies to the overall process: compilers, interpreters, libraries, and sometimes even machine architectures. In the general case, to use a particular language is to commit at least some part of the company's future attention to both the language and its attendant tooling.

So, while I am sympathetic to the sentiment behind our recurring meme, I think it's important to remember that a programming language is more than just a particular brand of power tool. It is the stuff programs are made of.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

December 04, 2013 3:14 PM

Agile Moments, "Why We Test" Edition

Case 1: Big Programs.

This blog entry tells the sad story of a computational biologist who had to retract six published articles. Why? Their conclusions depended on the output of a computer program, and that program contained a critical error. The writer of the entry, who is not the researcher in question, concludes:

What this should flag is the necessity to aggressively test all the software that you write.

Actually, you should have tests for any program you use to draw important conclusions, whether you wrote it or not. The same blog entry mentions that a grad student in the author's previous lab had found several bugs in a molecular dynamics program used by many computational biologists. How many published results were affected before they were found?

Case 2: Small Scripts.

Titus Brown reports finding bugs every time he reused one of his Python scripts. Yet:

Did I start doing any kind of automated testing of my scripts? Hell no! Anyone who wants to write automated tests for all their little scriptlets is, frankly, insane. But this was one of the two catalysts that made me personally own up to the idea that most of my code was probably somewhat wrong.

Most of my code has bugs but, hey, why write tests?

Didn't a famous scientist define insanity as doing the same thing over and over but expecting different results?

I consider myself insane, too, but mostly because I don't write tests often enough for my small scripts. We say to ourselves that we'll never reuse them, so we don't need tests. But we don't throw them away, and then we do reuse them, perhaps with a tweak here or there.

We all face time constraints. When we run a script the first time, we may well pay enough attention to the output that we are confident it is correct. But perhaps we can all agree that the second time we use a script, we should write tests for it if we don't already have them.

There are only three numbers in computing, 0, 1, and many. The second time we use a program is a sign from the universe that we need the added confidence provided by tests.
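Tests for a small script need not be elaborate. Here is a minimal sketch in Python, using a made-up helper, `mean_read_length`, that stands in for whatever a real scriptlet computes. It checks the same three numbers: zero inputs, one input, and many.

```python
def mean_read_length(lengths):
    """Hypothetical helper from a small analysis script."""
    if not lengths:
        raise ValueError("no reads to average")
    return sum(lengths) / len(lengths)

def test_mean_read_length():
    # One input.
    assert mean_read_length([100]) == 100
    # Many inputs.
    assert mean_read_length([50, 150, 100]) == 100
    # Zero inputs should fail loudly, not return a misleading number.
    try:
        mean_read_length([])
    except ValueError:
        pass
    else:
        raise AssertionError("empty input should be rejected")

test_mean_read_length()
```

A handful of assertions like these at the bottom of a script costs a few minutes, and they run every time the script is reused.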

To be fair, Brown goes on to offer some good advice, such as writing tests for code after you find a bug in it. His article is an interesting read, as is almost everything he writes about computation and science.

Case 3: The Disappointing Trade-Off.

Then there's this classic from Jamie Zawinski, as quoted in Coders at Work:

I hope I don't sound like I'm saying, "Testing is for chumps." It's not. It's a matter of priorities. Are you trying to write good software or are you trying to be done by next week? You can't do both.

Sigh. If you don't have good software by next week, maybe you aren't done yet.

I understand that the real world imposes constraints on us, and that sometimes worse is better. Good enough is good enough, and we rarely need a perfect program. I also understand that Zawinski was trying to be fair to the idea of testing, and that he was surely producing good enough code before releasing.

Even still, the pervasive attitude that we can either write good programs or get done on time, but not both, makes me sad. I hope that we can do better.

And I'm betting that the computational biologist referred to in Case 1 wishes he had had some tests to catch the simple error that undermined five years' worth of research.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 30, 2013 9:45 AM

The Magic at the Heart of AI

This paragraph from The Man Who Would Teach Machines to Think expresses a bit of my uneasiness with the world of AI these days:

As our machines get faster and ingest more data, we allow ourselves to be dumber. Instead of wrestling with our hardest problems in earnest, we can just plug in billions of examples of them. Which is a bit like using a graphing calculator to do your high-school calculus homework -- it works great until you need to actually understand calculus.

I understand the desire to solve real problems and the resulting desire to apply opaque mathematics to large data sets. Like most everyone, I revel in what Google can do for me and watch in awe when Watson defeats the best human Jeopardy! players ever. But for me, artificial intelligence was about more than just getting the job done.

Over the years teaching AI, my students often wanted to study neural networks in much greater detail than my class tended to go. But I was more interested in approaches to AI and learning that worked at a more conceptual level. Often we could find a happy middle ground while studying genetic algorithms, which afforded them the magic of something-for-nothing and afforded me the potential for studying ideas as they evolved over time.

(Maybe my students were simply exhibiting Astrachan's Law.)

When I said goodbye to AAAI a few years ago, I mentioned Hofstadter's work as one of my early inspirations -- Gödel, Escher, Bach and the idea of self-reference, with its "intertwining worlds of music, art, mathematics, and computers". That entry said I was leaving AAAI because my own work had moved in a different direction. But it left unstated a second truth, which The Man Who Would Teach Machines to Think asserts as Hofstadter's own reason for working off the common path: the world of AI had moved in a different direction, too.

For me, as for Hofstadter, AI has always meant more than engineering a solution. It was about understanding scientifically something that seemed magical, something that is both deeply personal and undeniably universal to human experience, about how human consciousness seems to work. My interest in AI will always lie there.


If you enjoy the article about Hofstadter and his work linked above, perhaps you will enjoy a couple of entries I wrote after he visited my university last year:

Posted by Eugene Wallingford | Permalink | Categories: Computing

November 24, 2013 10:54 AM

Teaching Algorithms in 2014

This spring, I will be teaching the undergraduate algorithms course for the first time in nine years, since the semester before I became department head. I enjoy this course. It gives both the students and me opportunities to do a little theory, a little design, and a little programming. I also like to have some fun, using what we learn to play games and solve puzzles.

Nine years is a long time in computing, even in an area grounded in well-developed theory. I will need to teach a different sort of course. At the end of this entry, I ask for your help in meeting this challenge.

Algorithms textbooks don't look much different now than they did in the spring of 2005. Long-time readers of this blog know that I face the existential crisis of selecting a textbook nearly every semester. Picking a textbook requires balancing several forces, including the value they give to the instructor, the value they give to the student during and after the course, and the increasing expense to students.

My primary focus in these decisions is always on net value to the students. I like to write my own material anyway. When time permits, I'd rather write as much as I can for students to read than outsource that responsibility (and privilege) to a textbook author. Writing my lecture notes in depth lets me weave a lot of different threads together, including pointers into primary and secondary sources. Students benefit from learning to read non-textbook material, the sort they will encounter throughout their careers.

My spring class brings a new wrinkle to the decision, though. Nearly fifty students are enrolled, with the prospect of a few more to come. This is a much larger group than I usually work with, and large classes carry a different set of risks than smaller courses. In particular, when something goes wrong in a small section, it is easier to recover through one-on-one remediation. That option is not so readily available for a fifty-person course.

There is more risk in writing new lecture material than in using a textbook that has been tested over time. A solid textbook can be a security blanket as much for the instructor as for the student. I'm not too keen on selecting a security blanket for myself, but the predictability of a text is tempting. There is one possible consolation in such a choice: perhaps subordinating my creative impulses to the design of someone else's textbook will make me more creative as a result.

But textbook selection is a fairly ordinary challenge for me. The real question is: Which algorithms should we teach in this course, circa 2014? Surely the rise of big data, multi-core processors, mobile computing, and social networking requires a fresh look at the topics we teach undergrads.

Perhaps we need only adjust the balance of topics that we currently teach. Or maybe we need to add a new algorithm or data structure to the undergraduate canon. If we teach a new algorithm, or a new class of algorithms, which standard material should be de-emphasized, or displaced altogether? (Alas, the semester is still but fifteen weeks long.)

Please send me your suggestions! I will write up a summary of the ideas you share, and I will certainly use your suggestions to design a better algorithms course for my students.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 19, 2013 4:49 PM

First Model, Then Improve

Not long ago, I read Unhappy Truckers and Other Algorithmic Problems, an article by Tom Vanderbilt that looks at efforts to optimize delivery schedules at UPS and similar companies. At the heart of the challenge lies the traveling salesman problem. However, in practice, the challenge brings companies face-to-face with a bevy of human issues, from personal to social, psychological to economic. As a result, solving this TSP is more complex than what we see in the algorithms courses we take in our CS programs.

Yet, in the face of challenges both computational and human, the human planners working at these companies do a pretty good job. How? Over the course of time, researchers figured out that finding optimal routes shouldn't be their main goal:

"Our objective wasn't to get the best solution," says Ted Gifford, a longtime operations research specialist at Schneider. "Our objective was to try to simulate what the real world planners were really doing."

This is a lesson I learned the hard way, too, back in graduate school, when my advisor's lab was trying to build knowledge-based systems for real clients, in chemical engineering, aeronautics, business, and other domains. We were working with real people who were solving hard problems under serious constraints.

At the beginning I was a typically naive programmer, armed with fancy AI techniques and unbounded enthusiasm. I soon learned that, if you walk into a workplace and propose to solve all the people's problems with a program, things don't go as smoothly as the programmer might hope.

First of all, this impolitic approach generally creates immediate pushback. These are people, with personal investment in the way things work now. They tend to bristle when a 20-something grad student walks in the door promoting the wonder drug for all their ills. Some might even fear that you are right, and success for your program will mean negative consequences for them personally. We see this dynamic in Vanderbilt's article.

There's a deeper reason that things don't go so smoothly, though, and it's the real lesson of Vanderbilt's piece. Until you implement the existing solution to the problem, you don't really understand the problem yet.

These problems are complex, often with many more constraints than typical theoretical solutions have dealt with. The humans solving the problem often have many years of experience contributing to their approach. They have deep knowledge of the domain, but also repeated exposure to the exceptions and edge cases that sometimes confound theoretical solutions. They use heuristics that are hard to tease apart or articulate.

I learned that it's easy to solve a problem if you are solving the wrong one.

A better way to approach these challenges is: First, model the existing system, including the extant solution. Then, look for ways to improve on the solution.

This approach often gives everyone involved greater confidence that the programmers understand -- and so are solving -- the right problem. It also enables the team to make small, incremental changes to the system, with a correspondingly higher probability of success. Together, these two outcomes greatly increase the chance of human buy-in from the current workers. This makes it easier for the whole team to recognize the need for larger-scale changes to the process, and to support and contribute to an improved solution.

Vanderbilt tells a similarly pragmatic story. He writes:

When I suggest to Gifford that he's trying to understand the real world, mathematically, he concurs, but adds: "The word 'understand' is too strong--we are happy to get positive outcomes."

Positive outcomes are what the company wants. Fortunately for the academics who work on such problems in industry, achieving good outcomes is often an effective way to test theories, encounter their shortcomings, and work on improvements. That, too, is something I learned in grad school. It was a valuable lesson.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 14, 2013 2:55 PM

Toward A New Data Science Culture in Academia

Fernando Perez has a nice write-up, An Ambitious Experiment in Data Science, describing a well-funded new project in which teams at UC Berkeley, the University of Washington, and NYU will collaborate to "change the culture of universities to create a data science culture". A lot of people have been quoting Perez's entry for its colorful assessment of academic incentives and reward structures. I like this piece for the way Perez defines and outlines the problem, in terms of both data science across disciplines and academic culture in general.

For example:

Most scientists are taught to treat computation as an afterthought. Similarly, most methodologists are taught to treat applications as an afterthought.

Methodologists here includes computer scientists, who are often more interested in new data structures, algorithms, and protocols.

This "mirror" disconnect is a problem for a reason many people already understand well:

Computation and data skills are all of a sudden everybody's problem.

(Here are a few past entries of mine that talk about how programming and the nebulous "computational thinking" have spread far and wide: 1 | 2 | 3 | 4.)

Perez rightly points out that open-source software, while imperfect, often embodies the principles of science and scientific collaboration better than the academy. It will be interesting to see how well this data science project can inject OSS attitudes into big research universities.

He is concerned because, as I have noted before, universities are, as a whole, a conservative lot. Perez says this in a much more entertaining way:

There are few organizations more proud of their traditions and more resistant to change than universities (churches and armies might be worse, but that's about it).

I think he gives churches and armies more credit than they deserve.

The good news is that experiments of the sort being conducted in the Berkeley/UW/NYU project are springing up on a smaller scale around the world. There is some hope for big change in academic culture if a lot of different people at a lot of different institutions experiment, learn, and create small changes that can grow together as they bump into one another.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

October 29, 2013 3:49 PM

My PLoP 2013 Retrospective

wrapper of the Plopp candy bar I received from Rebecca Rikner

PLoP 2013 was as much fun and as invigorating as I had hoped it would be. I hadn't attended in eight years, but it didn't take long to fall back into the rhythm of writers' workshops interspersed among invited talks, focus group sessions, BoFs, mingling, and (yes) games.

I was lead moderator for Workshop 010010, which consisted of pedagogical patterns papers. The focus of all of them was interactivity, whether among students building LEGO Mindstorms robots or among students and instructor on creative projects. The idea of the instructor as an active participant, even "generative" in the sense meant by Christopher Alexander, dominated our discussion. I look forward to seeing published versions of the papers we discussed.

The other featured events included invited talks by Jenny Quillien and Ward Cunningham and a 20-year retrospective panel featuring people who were present at the beginning of PLoP, the Hillside Group, and software patterns.

Quillien spent six years working with Alexander during the years he created The Nature of Order. Her talk shared some of the ways that Alexander was disappointed in the effect his seminal "A Pattern Language" had on the world, both as a result of people misunderstanding it and as a result of the book's inherent faults. Along the way, she tried to give pragmatic advice to people trying to document patterns of software. I may try to write up some of her thoughts, and some of my own in response, in the coming weeks.

Cunningham presented his latest work on federated wiki, the notion of multiple, individual wikis "federated" in relationships that share and present information for a common good. Unlike the original wiki, in which collaboration happened in a common document, federated wiki has a fork button on every page. Anyone can copy, modify, and share pages, which are then visible to everyone and available for merging back into the home wikis.

the favicon for my federated wiki on Ward's server

Ward set me up with a wiki in the federation on his server before I left on Saturday. I want to play with it a bit before I say much more than this: Federated wiki could change how communities share and collaborate in much the same way that wiki did.

I also had the pleasure of participating in one other structured activity while at PLoP. Takashi Iba and his students at Keio University in Japan are making a documentary about the history of the patterns community. Takashi invited me to sit for an interview about pedagogical patterns and their history within the development of software patterns. I was happy to help. It was a fun challenge to explain my understanding of what a pattern language is, and to think about what my colleagues and I struggled with in trying to create small pattern languages to guide instruction. Of course, I strayed off to the topic of elementary patterns as well, and that led to more interesting discussion with Takashi. I look forward to seeing their film in the coming years.

More so than even other conferences, unstructured activity plays a huge role in any PLoP conference. I skipped a few meals so that I could walk the extensive gardens and grounds of Allerton Park (and also so that I would not gain maximum pounds from the plentiful and tasty meals that were served). I caught up with old friends such as Ward, Kyle Brown, Bob Hanmer, and Ralph Johnson, and made too many new friends to mention here. All the conversation had my mind swirling with new projects and old... Forefront in my mind is exploring again the idea of design and implementation patterns of functional programming. The time is still right, and I want to help.

Now, to write my last entry or two from StrangeLoop...


Image 1. A photo of the wrapper of a Plopp candy bar, which I received as a gift from Rebecca Rikner. PLoP has a gifting tradition, and I received a box full of cool tools, toys, mementoes, and candy. Plopp is a Swedish candy bar, which made it a natural gift for Rebecca to share from her native land. (It was tasty, too!)

Image 2. The favicon for my federated wiki on Ward's server. I like the color scheme it gave me -- and I'm glad to be early enough an adopter that I could claim my first name as the name of my wiki. The rest of the Eugenes in the world will have to settle for suffix numbers and all the other contortions that come with arriving late to the dance.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

October 19, 2013 7:38 AM

The Proto Interpreter for J

(Update: Josh Grams took my comment about needing a week of work to grok this code as a challenge. He figured it out much more quickly than that and wrote up an annotated version of the program as he went along.)

I like finding and reading about early interpreters for programming languages, such as the first Lisp interpreter or Smalltalk-71, which grew out of a one-page proof of concept written by Dan Ingalls on a bet with Alan Kay.

So I was quite happy recently to run across Appendix A from An Implementation of J, from which comes the following code. Arthur Whitney whipped up this one-page interpreter fragment for the AT&T 3B1 one weekend in 1989 to demonstrate his idea for a new APL-like language. Roger Hui studied this interpreter for a week before writing the first implementation of J.

typedef char C;typedef long I;
typedef struct a{I t,r,d[3],p[2];}*A;
#define P printf
#define R return
#define V1(f) A f(w)A w;
#define V2(f) A f(a,w)A a,w;
#define DO(n,x) {I i=0,_n=(n);for(;i<_n;++i){x;}}
I *ma(n){R(I*)malloc(n*4);}mv(d,s,n)I *d,*s;{DO(n,d[i]=s[i]);}
tr(r,d)I *d;{I z=1;DO(r,z=z*d[i]);R z;}
A ga(t,r,d)I *d;{A z=(A)ma(5+tr(r,d));z->t=t,z->r=r,mv(z->d,d,r);
 R z;}
V1(iota){I n=*w->p;A z=ga(0,1,&n);DO(n,z->p[i]=i);R z;}
V2(plus){I r=w->r,*d=w->d,n=tr(r,d);A z=ga(0,r,d);
 DO(n,z->p[i]=a->p[i]+w->p[i]);R z;}
V2(from){I r=w->r-1,*d=w->d+1,n=tr(r,d);
 A z=ga(w->t,r,d);mv(z->p,w->p+(n**a->p),n);R z;}
V1(box){A z=ga(1,0,0);*z->p=(I)w;R z;}
V2(cat){I an=tr(a->r,a->d),wn=tr(w->r,w->d),n=an+wn;
 A z=ga(w->t,1,&n);mv(z->p,a->p,an);mv(z->p+an,w->p,wn);R z;}
V2(rsh){I r=a->r?*a->d:1,n=tr(r,a->p),wn=tr(w->r,w->d);
 A z=ga(w->t,r,a->p);mv(z->p,w->p,wn=n>wn?wn:n);
 if(n-=wn)mv(z->p+wn,z->p,n);R z;}
V1(sha){A z=ga(0,1,&w->r);mv(z->p,w->d,w->r);R z;}
V1(id){R w;}V1(size){A z=ga(0,0,0);*z->p=w->r?*w->d:1;R z;}
pi(i){P("%d ",i);}nl(){P("\n");}
pr(w)A w;{I r=w->r,*d=w->d,n=tr(r,d);DO(r,pi(d[i]));nl();
 if(w->t)DO(n,P("< ");pr(w->p[i]))else DO(n,pi(w->p[i]));nl();}

C vt[]="+{~<#,";
A(*vd[])()={0,plus,from,find,0,rsh,cat},
 (*vm[])()={0,id,size,iota,box,sha,0};
I st[26]; qp(a){R a>='a'&&a<='z';}qv(a){R a<'a';}
A ex(e)I *e;{I a=*e;
 if(qp(a)){if(e[1]=='=')R st[a-'a']=ex(e+2);a= st[ a-'a'];}
 R qv(a)?(*vm[a])(ex(e+1)):e[1]?(*vd[e[1]])(a,ex(e+2)):(A)a;}
noun(c){A z;if(c<'0'||c>'9')R 0;z=ga(0,0,0);*z->p=c-'0';R z;}
verb(c){I i=0;for(;vt[i];)if(vt[i++]==c)R i;R 0;}
I *wd(s)C *s;{I a,n=strlen(s),*e=ma(n+1);C c;
 DO(n,e[i]=(a=noun(c=s[i]))?a:(a=verb(c))?a:c);e[n]=0;R e;}

main(){C s[99];while(gets(s))pr(ex(wd(s)));}

I think it will take me a week of hard work to grok this code, too. Whitney's unusually spare APL-like C programming style is an object worthy of study in its own right.
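One way to get a foothold is to paraphrase a small piece of the program in a more familiar notation. The Python below is only a rough sketch of the array representation and three of the verbs (iota, plus, and sha), based on a reading of the C above. The class name Array and the use of Python lists are assumptions made for illustration; this is not a faithful port, and it ignores memory layout entirely.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class Array:
    """Loose analogue of the C struct: t is a type flag, d the shape, p the data."""
    t: int                                   # 0 for numbers, 1 for boxed values
    d: list = field(default_factory=list)    # dimensions; len(d) plays the role of rank r
    p: list = field(default_factory=list)    # items, stored flat


def tr(d):
    """Number of items implied by a shape: the product of its dimensions."""
    return prod(d)                           # prod([]) is 1, so a scalar holds one item


def iota(w):
    """Monadic ~ : the integers 0..n-1, where n is the first item of w."""
    n = w.p[0]
    return Array(0, [n], list(range(n)))


def plus(a, w):
    """Dyadic + : item-by-item sum, with the result taking its shape from w."""
    n = tr(w.d)
    return Array(0, list(w.d), [a.p[i] + w.p[i] for i in range(n)])


def sha(w):
    """Monadic # : the shape of w, returned as a rank-1 array."""
    return Array(0, [len(w.d)], list(w.d))


three = Array(0, [], [3])                    # the scalar 3
doubled = plus(iota(three), iota(three))     # items 0 2 4
print(doubled.d, doubled.p)
```

Even this much shows the core idea: every value is a shape plus a flat run of items, and each verb builds a new array from the shapes and items of its arguments.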

By the way, Hui's Appendix A bears the subtitle Incunabulum, a word that means a work of art or of industry of an early period. So, I not only discovered a new bit of code this week; I also learned a cool new word. That's a good week.

Posted by Eugene Wallingford | Permalink | Categories: Computing

October 16, 2013 11:38 AM

Poetry as a Metaphor for Software

I was reading Roger Hui's Remembering Ken Iverson this morning on the elliptical, and it reminded me of this passage from A Conversation with Arthur Whitney. Whitney is a long-time APL guru and the creator of the A, K, and Q programming languages. The interviewer is Bryan Cantrill.

BC: Software has often been compared with civil engineering, but I'm really sick of people describing software as being like a bridge. What do you think the analog for software is?

AW: Poetry.

BC: Poetry captures the aesthetics, but not the precision.

AW: I don't know, maybe it does.

A poet's use of language is quite precise. It must balance forces in many dimensions, including sound, shape, denotation, and connotation. Whitney seems to understand this. Richard Gabriel must be proud.

Brevity is a value in the APL world. Whitney must have a similar preference for short language names. I don't know the source of his names A, K, and Q, but I like Hui's explanation of where J's name came from:

... on Sunday, August 27, 1989, at about four o'clock in the afternoon, [I] wrote the first line of code that became the implementation described in this document.

The name "J" was chosen a few minutes later, when it became necessary to save the interpreter source file for the first time.

Beautiful. No messing around with branding. Gotta save my file.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 11, 2013 1:42 PM

The Tang of Adventure, and a Lively Appreciation

"After you've learned the twelve times table," John Yarnelle asks, "what else is there to do?"

The concepts of modern mathematics give the student something else to do in great abundance and variety at all levels of his development. Not only may he discover unusual types of mathematical structures where, believe it or not, two and two does not equal four, but he may even be privileged to invent a new system virtually on his own. Far from a sense of stagnation, there is the tang of adventure, the challenge of exploration; perhaps also a livelier appreciation of the true nature of mathematical activity and mathematical thought.

Not only the tang of adventure; students might also come to appreciate what math really is. That's an admirable goal for any book or teacher.

This passage comes from Yarnelle's Finite Mathematical Structures, a 1964 paperback that teaches fields, groups, and algebras with the prose of a delighted teacher. I picked this slender, 66-page gem up off a pile of books being discarded by a retired math professor a decade ago. How glad I am that none of the math profs who walked past that pile bothered to claim it before I happened by.

We could use a few CS books like this, too.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 07, 2013 12:07 PM

StrangeLoop: Exercises in Programming Style

[My notes on StrangeLoop 2013: Table of Contents]

Crista Lopes

I had been looking forward to Crista Lopes's StrangeLoop talk since May, so I made sure I was in the room well before the scheduled time. I even had a copy of the trigger book in my bag.

Crista opened with something that CS instructors have learned the hard way: Teaching programming style is difficult and takes a lot of time. As a result, it's often not done at all in our courses. But so many of our graduates go into software development for their careers, where they come into contact with many different styles. How can they understand them -- well, quickly, or at all?

To many people, style is merely the appearance of code on the screen or printed. But it's not. It's more, and something entirely different. Style is a constraint. Lopes used images of a few stylistic paintings to illustrate the idea. If an artist limits herself to pointillism or cubism, how can she express important ideas? How does the style limit the message, or enhance it?

But we know this is true of programming as well. The idea has been a theme in my teaching for many years. I occasionally write about the role of constraints in programming here, including Patterns as a Source of Freedom, a few programming challenges, and a polymorphism challenge that I've run as a workshop.

Lopes pointed to a more universal example, though: the canonical The Elements of Programming Style. Drawing on this book and other work in software, she said that programming style ...

  • is a way to express tasks
  • exists at all scales
  • recurs at multiple scales
  • is codified in programming languages

For me, the last bullet ties back most directly to the idea of style as constraint. A language makes some things easier to express than others. It can also make some things harder to express. There is a spectrum, of course. For example, some OO languages make it easy to create and use objects; others make it hard to do anything else! But the language is an enabler and enforcer of style. It is a proxy for style as a constraint on the programmer.

Back to the talk. Lopes asked, Why is it so important that we understand programming style? First, a style provides the reader with a frame of reference and a vocabulary. Knowing different styles makes us more effective consumers of code. Second, one style can be more appropriate for a given problem or context than another style. So, knowing different styles makes us more effective producers of code. (Lopes did not use the producer-consumer distinction in the talk, but it seems to me a nice way to crystallize her idea.)

the cover of Raymond Queneau's Exercises in Style

Then, Lopes said, I came across Raymond Queneau's playful little book, "Exercises in Style". Queneau constrains himself in many interesting ways while telling essentially the same story. Hmm... We could apply the same idea to programming! Let's do it.

Lopes picked a well-known problem, the common word problem famously solved in a Programming Pearls column more than twenty-five years ago. This is a fitting choice, because Jon Bentley included in that column a critique of Knuth's program by Doug McIlroy, who considered both engineering concerns and program style in his critique.

The problem is straightforward: identify and print the k most common terms that occur in a given text document, in decreasing order. For the rest of the talk, Lopes presented several programs that solve the problem, each written in a different style, showing code and highlighting its shape and boundaries.
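For readers who want the problem in concrete form, here is a minimal Python sketch of one plain solution. It is not one of Lopes's programs. The function name top_k_terms and the tokenizing rule (lowercase runs of letters, no stop-word list) are assumptions made for illustration, since the problem statement above does not pin them down.

```python
import re
from collections import Counter


def top_k_terms(text, k):
    """Return the k most common words in text as (word, count) pairs, most frequent first."""
    words = re.findall(r"[a-z]+", text.lower())   # assumption: a word is a run of letters
    return Counter(words).most_common(k)


sample = "the cat and the hat and the bat"
for word, count in top_k_terms(sample, 2):
    print(word, count)
```

Every style in the talk computes this same result; what changes is how the work is carved up and what the code is allowed to do along the way.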

Python was her language of choice for the examples. She was looking for a language that many readers would be able to follow and understand, and Python has the feel of pseudo-code about it. (I tell my students that it is the Pascal of their time, though I may as well be speaking of hieroglyphics.) Of course, Python has strengths and weaknesses that affect its fit for some styles. This is an unavoidable complication for all communication...

Also, Lopes did not give formal names to the styles she demonstrated. Apparently, at previous versions of this talk, audience members had wanted to argue over the names more than the styles themselves! Vowing not to make that mistake again, she numbered her examples for this talk.

That's what programmers do when they don't have good names.

In lieu of names, she asked the crowd to live-tweet to her what they thought each style is or should be called. She eventually did give each style a fun, informal name. (CS textbooks might be more evocative if we used her names instead of the formal ones.)

I noted eight examples shown by Lopes in the talk, though there may have been more:

  • monolithic procedural code -- "brain dump"
  • a Unix-style pipeline -- "code golf"
  • procedural decomposition with a sequential main -- "cook book"
  • the same, only with functions and composition -- "Willy Wonka"
  • functional decomposition, with a continuation parameter -- "crochet"
  • modules containing multiple functions -- "the kingdom of nouns"
  • relational style -- (didn't catch this one)
  • functional with decomposition and reduction -- "multiplexer"
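To make "style as constraint" concrete, here is a small Python sketch of the continuation-parameter idea from the list above. It is my own illustration, not code from the talk: the constraint is that no function returns a useful value; each one hands its result to the function it is given. All names here are hypothetical.

```python
import re
from collections import Counter


def split_words(text, then):
    # Do one piece of work, then pass the result forward instead of returning it.
    then(re.findall(r"[a-z]+", text.lower()))


def count_words(words, then):
    then(Counter(words))


def take_top(counts, k, then):
    then(counts.most_common(k))


def top_k_terms(text, k, then):
    # The nesting of continuations spells out the order of the steps.
    split_words(text,
                lambda words: count_words(words,
                                          lambda counts: take_top(counts, k, then)))


results = []
top_k_terms("the cat and the hat and the bat", 2, results.append)
print(results[0])
```

The program computes the same answer as a direct version would, but the constraint changes its shape: control flow lives in the arguments rather than in return values.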

Lopes said that she hopes to produce solutions using a total of thirty or so styles. She asked the audience for help with one in particular: logic programming. She said that she is not a native speaker of that style, and Python does not come with a logic engine built-in to make writing a solution straightforward.

Someone from the audience suggested she consider yet another style: using a domain-specific language. That would be fun, though perhaps tough to roll from scratch in Python. By that time, my own brain was spinning away, thinking about writing a solution to the problem in Joy, using a concatenative style.

Sometimes, it's surprising just how many programming styles and meaningful variations people have created. The human mind is an amazing thing.

The talk was, I think, a fun one for the audience. Lopes is writing a book based on the idea. I had a chance to review an early draft, and now I'm looking forward to the finished product. I'm sure I'll learn something new from it.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

October 04, 2013 3:12 PM

StrangeLoop: Rich Hickey on Channels and Program Design

[My notes on StrangeLoop 2013: Table of Contents]

Rich Hickey setting up for his talk

Rich Hickey spoke at one of the previous StrangeLoops I attended, but this was my first time to attend one of his talks in person. I took the shaky photo seen at the right as proof. I must say, he gives a good talk.

The title slide read "Clojure core.async Channels", but Hickey made a disclaimer upfront: this talk would be about what channels are and why Clojure has them, not the details of how they are implemented. Given that there were plenty of good compiler talks elsewhere at the conference, this was a welcome change of pace. It was also a valuable one, because many more people will benefit from what Hickey taught about program design than would have benefited from staring at screens full of Clojure macros. The issues here are important ones, and ones that few programmers understand very well.

The fundamental problem is this: Reactive programs need to be machines, but functions make bad machines. Even sequences of functions.

The typical solution to this problem these days is to decompose the system logic into a set of response handlers. Alas, this leads to callback hell, a modern form of spaghetti code. Why? Even though the logic has been decomposed into pieces, it is still "of a piece", essentially a single logical entity. When this whole is implemented across multiple handlers, we can't see it as a unit, or talk about it easily. We need to, though, because we need to design the state machine that it comprises.

Clojure's solution to the problem, in the form of core.async, is the channel. This is an implementation of Tony Hoare's communicating sequential processes (CSP). One of the reasons that Hickey likes this approach is that it lets a program work equally well in fully threaded apps and in apps with macro-generated inversion of control.
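I don't have core.async code to show, but the shape of the idea can be sketched with Python's standard library. In the sketch below, a bounded queue.Queue stands in for a channel and threads stand in for processes. This is only a loose analogy and says nothing about how core.async is implemented; the DONE sentinel and the function names are assumptions made for illustration.

```python
import queue
import threading

DONE = object()   # sentinel marking the end of the stream (an assumption of this sketch)


def producer(out, items):
    for item in items:
        out.put(item)            # blocks while the bounded buffer is full
    out.put(DONE)


def doubler(inp, out):
    # The logic of this stage reads top to bottom; it is not scattered across handlers.
    while True:
        item = inp.get()         # blocks until a value is available
        if item is DONE:
            out.put(DONE)
            return
        out.put(item * 2)


def run_pipeline(items):
    a = queue.Queue(maxsize=2)   # bounded on purpose
    b = queue.Queue(maxsize=2)
    threads = [
        threading.Thread(target=producer, args=(a, items)),
        threading.Thread(target=doubler, args=(a, b)),
    ]
    for t in threads:
        t.start()
    results = []
    while True:
        item = b.get()
        if item is DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


print(run_pipeline([1, 2, 3]))
```

Each stage knows only about the queues it reads and writes, not about the other stages, which is the decoupling that channels are meant to provide.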

Hickey then gave some examples of code using channels and talked a bit about the implications of the implementation for system design. For instance, the language provides handy put! and take! operators for integrating channels with code at the edge of non-core.async systems. I don't have much experience with Clojure, so I'll have to study a few examples in detail to really appreciate this.

For me, the most powerful part of the talk was an extended discussion of communication styles in programs. Hickey focused on the trade-offs between direct communication via shared state and indirect communication via channels. He highlighted six or seven key distinctions between the two and how these affect the way a system works. I can't do this part of the talk justice, so I suggest you watch the video of the talk. I plan to watch it again myself.

I had always heard that Hickey was eminently quotable, and he did not disappoint. Here are three lines that made me smile:

  • "Friends don't let friends put logic in handlers."
  • "Promises and futures are the one-night stands" of asynchronous architecture.
  • "Unbounded buffers are a recipe for a bad program. 'I don't want to think about this bug yet, so I'll leave the buffer unbounded.'"

That last one captures the indefatigable optimism -- and self-delusion -- that characterizes so many programmers. We can fix that problem later. Or not.

In the end, this talk demonstrates how a good engineer approaches a problem. Clojure and its culture reside firmly in the functional programming camp. However, Hickey recognizes that, for the problem at hand, a sequence of functional calls is not the best solution. So he designs a solution that allows programmers to do FP where it fits best and to do something else where FP doesn't. That's a pragmatic way to approach problems.

Still, this solution is consistent with Clojure's overall design philosophy. The channel is a first-class object in the language. It converts a sequence of functional calls into data, whereas callbacks implement the sequence in code. As code, we see the sequence only at run-time. As data, we see it in our program and can use it in all the ways we can use any data. This consistent focus on making things into data is an attractive part of the Clojure language and the ecosystem that has been cultivated around it.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 28, 2013 12:17 PM

StrangeLoop: This and That, Volume 2

[My notes on StrangeLoop 2013: Table of Contents]

I am at a really good talk and look around the room. So many people are staring at their phones, scrolling away. So many others are staring at their laptops, typing away. The guy next to me: doing both at the same time. Kudos, sir. But you may have missed the point.


Conference talks are a great source of homework problems. Sometimes, the talk presents a good problem directly. Others, watching the talk sets my subconscious mind in motion, and it creates something useful. My students thank you. I thank you.


Jenny Finkel talked about the difference between two kinds of recommenders: explorers, who forage for new content, and exploiters, who want to see what's already popular. The former discovers cool new things occasionally but fails occasionally, too. The latter is satisfied most of the time but rarely surprised. As a conference goer, I felt this distinction at play in my own head this year. When selecting the next talk to attend, I have to take a few risks if I ever hope to find something unexpected. But when I fail, a small regret tugs at me.


We heard a lot of confident female voices on the StrangeLoop stages this year. Some of these speakers have advanced academic degrees, or at least experience in grad school.


The best advice I received on Day 1 perhaps came not from a talk but from the building:

The 'Do not Climb on Bears' sign on a Peabody statue

"Please do not climb on bears." That sounds like a good idea most everywhere, most all the time.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

September 27, 2013 4:26 PM

StrangeLoop: Add All These Things

[My notes on StrangeLoop 2013: Table of Contents]

I took a refreshing walk in the rain over the lunch hour on Friday. I managed to return late and, as a result, missed the start of Avi Bryant's talk on algebra and analytics. Only a few minutes, though, which is good. I enjoyed this presentation.

Bryant didn't talk about the algebra we study in eighth or ninth grade, but the mathematical structure math students encounter in a course called "abstract" or "modern" algebra. A big chunk of the talk focused on an even narrower topic: why +, and operators like it, are cool.

One reason is that grouping doesn't matter. You can add 1 to 2, and then add 4 to the result, and have the same answer as if you added 4 to 1, and then added 2 to the result. This is, of course, the associative property.

Another is that order doesn't matter. 1 + 2 is the same as 2 + 1. That's the commutative property.

Yet another is that, if you have nothing to add, you can add nothing and have the same value you started with. 4 + 0 = 4. 0 is the identity element for addition.

Finally, when you add two numbers, you get a number back. This is not quite as true in computers as in math, because an operation can cause an overflow or underflow and create an error. But looked at through fuzzy lenses, this is true in our computers, too. This is the closure property for addition of integers and real numbers.

Addition isn't the only operation on numbers that has these properties. Finding the maximum value in a set of numbers does, too. The maximum of two numbers is a number. max(x,y) = max(y,x), and if we have three or more numbers, it doesn't matter how we group them; max will find the maximum among them. The identity value is tricky -- there is no smallest number... -- but in practice we can finesse this by using the smallest number of a given data type, or even allowing max to take "nothing" as a value and return its other argument.

When we see a pattern like this, Bryant said, we should generalize:

  • We have a function f that takes two values from a set and produces another member of the same set.
  • The order of f's arguments doesn't matter.
  • The grouping of f's arguments doesn't matter.
  • There is some identity value, a conceptual "zero", that doesn't matter, in the sense that f(i,zero) for any i is simply i.

There is a name for this pattern. When we have such a set and operation, we have a commutative monoid.

     S ⊕ S → S
     x ⊕ y = y ⊕ x
     x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z
     x ⊕ 0 = x

I learned about this and other such patterns in grad school when I took an abstract algebra course for kicks. No one told me at the time that I'd be seeing them again as soon as someone created the Internet and it unleashed a torrent of data on everyone.

Just why we are seeing the idea of a commutative monoid again was the heart of Bryant's talk. When we have data coming into our company from multiple network sources, at varying rates of usage and data flow, and we want to extract meaning from the data, it can be incredibly handy if the meaning we hope to extract -- the sum of all the values, or the largest -- can be computed using a commutative monoid. You can run multiple copies of your function at the entry point of each source, and combine the partial results later, in any order.
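Here is a minimal Python sketch of that idea, not Bryant's code. Each "source" reduces its own data to a partial result, and the partial results are then combined in a shuffled order to show that arrival order does not matter. The shard data, the function names, and the use of negative infinity as the identity for max are assumptions made for illustration.

```python
import random
from functools import reduce


def combine_all(op, identity, values):
    """Fold values together with op, starting from the identity element."""
    return reduce(op, values, identity)


def aggregate(op, identity, shards):
    # One partial result per source, computed independently of the others.
    partials = [combine_all(op, identity, shard) for shard in shards]
    random.shuffle(partials)     # partial results may arrive in any order
    return combine_all(op, identity, partials)


shards = [[3, 1, 4], [1, 5], [], [9, 2, 6]]   # the empty shard relies on the identity

total = aggregate(lambda x, y: x + y, 0, shards)
largest = aggregate(max, float("-inf"), shards)
print(total, largest)
```

The empty shard is where the identity element earns its keep: a source with nothing to report still produces a partial result that combines harmlessly with the rest.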

Bryant showed this much more attractively than that, using cute little pictures with boxes. But then, there should be an advantage to going to the actual talk... With pictures and fairly straightforward examples, he was able to demystify the abstract math and deliver on his talk's abstract:

A mathematician friend of mine tweeted that anyone who doesn't understand abelian groups shouldn't build analytics systems. I'd turn that around and say that anyone who builds analytics systems ends up understanding abelian groups, whether they know it or not.

That's an important point. Just because you haven't studied group theory or abstract algebra doesn't mean you shouldn't do analytics. You just need to be prepared to learn some new math when it's helpful. As programmers, we are all looking for opportunities to capitalize on patterns and to generalize code for use in a wider set of circumstances. When we do, we may re-invent the wheel a bit. That's okay. But also look for opportunities to capitalize on patterns recognized and codified by others already.

Unfortunately, not all data analysis is as simple as summing or maximizing. What if I need to find an average? The average operator doesn't form a commutative monoid with numbers. It falls short in almost every way. But, if you switch from the set of numbers to the set of pairs [n, c], where n is a running sum and c is a count of how many numbers have gone into it, then you are back in business. Counting is addition.

So, we save the average operation itself as a post-processing step on a set of number/count pairs. This turns out to be a useful lesson, as finding the average of a set is a lossy operation: it loses track of how many numbers you've seen. Lossy operations are often best saved for presenting data, rather than building them directly into the system's computation.
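A minimal Python sketch of the trick, reading each pair as (running sum, count). The function names are hypothetical, and this is an illustration of the idea rather than anything shown in the talk.

```python
from functools import reduce

ZERO = (0, 0)   # identity: nothing summed, nothing counted


def to_pair(x):
    """Lift a single number into the space of (sum, count) pairs."""
    return (x, 1)


def add_pairs(a, b):
    """Combine two pairs by adding sums and adding counts."""
    return (a[0] + b[0], a[1] + b[1])


def average(pair):
    """The lossy step, saved for the very end."""
    total, count = pair
    return total / count if count else None


left = reduce(add_pairs, map(to_pair, [2, 4]), ZERO)        # (6, 2)
right = reduce(add_pairs, map(to_pair, [6, 8, 10]), ZERO)   # (24, 3)
print(average(add_pairs(left, right)))                      # 6.0
```

Averaging the two partial averages directly (3.0 and 8.0) would give 5.5, which is wrong; keeping the counts around until the end is what makes the partial results combinable.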

Likewise, finding the top k values in a set of numbers (a generalized form of maximum) can be handled just fine as long as we work on lists of numbers, rather than numbers themselves.
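As a sketch of how that might look in Python: each number becomes a one-item list, the empty list serves as the identity, and combining two lists keeps only the k largest items. The helper name make_top_k is hypothetical, and heapq.nlargest is just one convenient way to do the trimming.

```python
import heapq
from functools import reduce


def make_top_k(k):
    """Build a combining function over lists that keeps only the k largest items."""
    def combine(a, b):
        return heapq.nlargest(k, a + b)
    return combine


top2 = make_top_k(2)
readings = [5, 1, 9, 3, 7]
as_lists = [[n] for n in readings]      # transform numbers into one-item lists
print(reduce(top2, as_lists, []))       # [9, 7]
```

With k set to 1, this collapses back to ordinary max, which is why top-k can be seen as a generalization of it.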

This is actually one of the Big Ideas of computer science. Sometimes, we can use a tool or technique to solve a problem if only we transform the problem into an equivalent one in a different space. CS theory courses hammer this home, with oodles of exercises in which students are asked to convert every problem under the sun into 3-SAT or the clique problem. I look for chances to introduce my students to this Big Idea when I teach AI or any programming course, but the lesson probably gets lost in the noise of regular classwork. Some students seem to figure it out by the time they graduate, though, and the ones who do are better at solving all kinds of problems (and not by converting them all to 3-SAT!).

Sorry for the digression. Bryant didn't talk about 3-SAT, but he did demonstrate several useful problem transformations. His goal was more practical: how can we use this idea of a commutative monoid to extract as many interesting results from the stream of data as possible?

This isn't just an academic exercise, either. When we can frame several problems in this way, we are able to use a common body of code for the processing. He called this body of code an aggregator, comprising three steps:

  • prepare the data by transforming it into the space of a commutative monoid
  • reduce the data to a single value in that space, using the appropriate operator
  • present the result by transforming it back into its original space

In practice, transforming the problem into the space of a monoid presents challenges in the implementation. For example, it is straightforward to compute the number of unique values in a collection of streams by transforming each item into a set of size one and then using set union as the operator. But union requires unbounded space, and this can be inconvenient when dealing with very large data sets.
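The three steps, applied to the unique-count example, might look something like the Python sketch below. The Aggregator class and its field names are my own invention for illustration; they are not taken from Bryant's slides or from any particular library.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable


@dataclass
class Aggregator:
    prepare: Callable[[Any], Any]        # item -> value in the monoid's space
    combine: Callable[[Any, Any], Any]   # the monoid's operator
    zero: Any                            # the monoid's identity value
    present: Callable[[Any], Any]        # monoid value -> result for the reader

    def partial(self, items):
        """Reduce one stream to a single value in the monoid's space."""
        return reduce(self.combine, map(self.prepare, items), self.zero)

    def merge(self, partials):
        """Combine partial results from several streams and present the answer."""
        return self.present(reduce(self.combine, partials, self.zero))


unique_count = Aggregator(
    prepare=lambda x: frozenset([x]),    # each item becomes a set of size one
    combine=lambda a, b: a | b,          # set union
    zero=frozenset(),
    present=len,
)

stream_a = ["ann", "bob", "ann"]
stream_b = ["bob", "cy"]
partials = [unique_count.partial(stream_a), unique_count.partial(stream_b)]
print(unique_count.merge(partials))      # 3
```

The space problem is visible here: each partial result is a set that grows with the number of distinct items, so nothing bounds its size.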

One approach is to compute an estimated number of uniques using a hash function and some fancy arithmetic. We can make the expected error in the estimate smaller and smaller by using more and more hash functions. (I hope to write this up in simple code and blog about it soon.)
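As a taste of how such an estimate can work, here is a Python sketch of one well-known technique, a k-minimum-values estimator. I am swapping it in for illustration; it is not necessarily the method Bryant described. It uses a single hash function and improves accuracy by keeping more hash values rather than by adding more hash functions. The constant K and all function names are assumptions.

```python
import hashlib
from functools import reduce

K = 256   # how many hash values to keep; larger K means a smaller expected error


def hash01(item):
    """Map an item to a pseudo-random float in [0, 1) using SHA-1."""
    digest = hashlib.sha1(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def prepare(item):
    return [hash01(item)]


def combine(a, b):
    """Keep the K smallest distinct hash values seen on either side."""
    return sorted(set(a) | set(b))[:K]


def present(sketch):
    if len(sketch) < K:
        return len(sketch)                # under K distinct hashes: exact, barring collisions
    return int((K - 1) / sketch[-1])      # the K-th smallest hash reveals how densely packed they are


items = [i % 10000 for i in range(20000)]            # 10,000 distinct values, each seen twice
sketch = reduce(combine, map(prepare, items), [])
print(present(sketch))                                # an estimate in the neighborhood of 10000
```

The point for this discussion is that combine is associative and commutative with the empty list as its identity, so the estimator fits the same aggregator mold while never holding more than K values per partial result.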

Bryant looked at one more problem, computing frequencies, and then closed with a few more terms from group theory: semigroup, group, and abelian group. Knowing these terms -- actually, simply knowing that they exist -- can be useful even for the most practical of practitioners. They let us know that there is more out there, should our problems become harder or our needs become larger.

That's a valuable lesson to learn, too. You can learn all about abelian groups in the trenches, but sometimes it's good to know that there may be some help out there in the form of theory. Reinventing wheels can be cool, but solving the problems you need solved is even cooler.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

September 24, 2013 4:38 PM

StrangeLoop: Compilers, Compilers, Compilers

[My notes on StrangeLoop 2013: Table of Contents]

I went to a lot of talks about compilation. There seemed to be more this year than last, but perhaps I was suffering from a perception bias. I'm teaching compilers this semester and have been reading a bit about V8 and Crankshaft on the elliptical of late.

Many of the talks I saw revolved around a common theme: dynamic run-time systems. Given the prominence these days of JavaScript, Python, Ruby, Lua, and their like, it's not surprising that finding better ways to organize dynamic run-times and optimize their performance is receiving a lot of attention.

The problem of optimizing dynamic run-time systems is complicated by the wide range of tasks being performed dynamically: type checking, field access, function selection, and the most common whipping horse of my static-language friends, garbage collection. Throw in eval, which allows the execution of arbitrary code, possibly changing even the definition of core classes, and it's amazing that our dynamic languages can run in human time at all. That's a tribute to the people who have been creating compilers for us over the years.

As I listened to these talks, my ears were tuned to ideas and topics that I need to learn more about. That's what my notes captured best. Here are a few ideas that stood out.

The JavaScript interpreter, interpreted. Martha Girdler gave a quick, jargon-free tour of how JavaScript works, using JavaScript code to illustrate basic ideas like contexts. This sort of talk can help relatively inexperienced developers understand the common "pain points" of the language, such as variable hoisting.

Fast and Dynamic. Maxime Chevalier-Boisvert went a level deeper, tracing some of the fundamental ideas used to implement run-time systems from their historical roots in Lisp, Smalltalk, and Self up to research prototypes such as Chevalier-Boisvert's own Higgs compiler.

Many of the ideas are familiar to anyone who has had an undergrad compiler course, such as type tagging and microcoded instructions. Others are simple extensions of such ideas, such as inline caching, which is a workhorse in any dynamic compiler. Still others have entered popular discussion only recently. Maps, which are effectively hidden classes, originated in Self and are now being applied and extended in a number of interesting ways.
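To make two of these terms concrete, here is a toy Python sketch of maps (hidden classes) and a one-entry inline cache. It is a teaching model under my own simplifying assumptions, with hypothetical class names, and does not reflect how Self, V8, or Higgs actually implement either idea.

```python
class Map:
    """A hidden class: records which slot holds each property name."""

    def __init__(self, slots=None):
        self.slots = slots or {}
        self.transitions = {}

    def with_property(self, name):
        # Objects that add the same properties in the same order end up sharing a map.
        if name not in self.transitions:
            slots = dict(self.slots)
            slots[name] = len(slots)
            self.transitions[name] = Map(slots)
        return self.transitions[name]


EMPTY = Map()


class Obj:
    """A dynamic object: a pointer to its map plus a flat list of values."""

    def __init__(self):
        self.map = EMPTY
        self.values = []

    def set(self, name, value):
        if name in self.map.slots:
            self.values[self.map.slots[name]] = value
        else:
            self.map = self.map.with_property(name)
            self.values.append(value)


class GetSite:
    """One property-access site in a program, with a one-entry inline cache."""

    def __init__(self, name):
        self.name = name
        self.cached_map = None
        self.cached_slot = None
        self.hits = 0
        self.misses = 0

    def get(self, obj):
        if obj.map is self.cached_map:        # fast path: same map as last time
            self.hits += 1
            return obj.values[self.cached_slot]
        self.misses += 1                      # slow path: look up, then remember
        self.cached_map = obj.map
        self.cached_slot = obj.map.slots[self.name]
        return obj.values[self.cached_slot]


points = []
for x in range(3):
    p = Obj()
    p.set("x", x)
    p.set("y", x * 10)
    points.append(p)

site = GetSite("y")
print([site.get(p) for p in points], site.hits, site.misses)
```

Because all three points were built the same way, they share one map, so only the first access at the site pays for a name lookup; the rest are a pointer comparison and an indexed load.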

Two ideas from this talk that I would like to learn more about are hybrid type inference, which Chevalier-Boisvert mentioned in the context of Chrome and Firefox, and basic block versioning, a technique being explored in the Higgs compiler.

In closing, the speaker speculated on the better compilers of the future. Some of the advances will come from smarter CPUs, which might execute potential future paths in parallel, and more principled language design. But many will come from academic research that discovers new techniques and improves existing ones.

Some of the ideas of the future are probably already available and simply haven't caught on yet. Chevalier-Boisvert offered three candidates: direct manipulation of the AST, pattern matching, and the persistent image. I certainly hear a lot of people talking about the first of these, but I've yet to see a compelling implementation.

Ruby Doesn't Have to Be Slow. In this session, Alex Gaynor explained why dynamic languages don't have to be slow. Though Ruby was his working example, everything he said applies to JavaScript, Python, Lua, and other dynamic languages. He then talked about how he is putting these ideas to work in Topaz, a fast Ruby interpreter written in RPython. Topaz uses a number of advanced techniques, including a tracing JIT, type-specialized field look-up, maps, quasi-immutable fields, and escape analysis. It supports a subset of Ruby, though much of what is missing now is simply standard library classes and methods.

Two of the more interesting points of this talk for me were about meta-issues. First, he opened with an elaboration of the claim, "Ruby is slow", which he rightfully rejected as too imprecise to be meaningful. What people probably mean is something like, "Code written in Ruby executes CPU-bound tasks slower than other languages." I would add that, for many of my CS colleagues, the implicit benchmark is compiled C.

Further, Ruby users tend to respond to this claim poorly. Rather than refute it, they accept its premise and dance around its edges. Saddest, he says, is when they say, "If it turns out to matter, we can rewrite the program in some serious language." The compiler nerd in him says, "We can do this." Topaz is, in part, an attempt to support that claim.

Second, in response to an audience question, he claimed that the people responsible for Java got something right fifteen years ago: they convinced people to abandon their C extensions. If the Ruby world followed suit and moved away from external dependencies that restrict what the compiler and run-time system can know, then many performance improvements would follow.

Throughout this talk, I kept coming back to JRuby in my mind...

The Art of Finding Shortcuts. Vyacheslav "@mraleph" Egorov's talk was ostensibly about an optimizing compiler for Dart, but like most of the compiler talks this year, it presented ideas of value for handling any dynamic language. Indeed, this talk gave a clear introduction to what an optimizing compiler does, what in-line caching is, and different ways that the compiler might capitalize on them.

According to Egorov, writing an optimizing compiler for a language like Dart is the art of finding -- and taking -- shortcuts. The three key issues to address are representation, resolution, and redundancy. You deal with representation when you design your run-time system. The other two fall to the optimizing compiler.

Resolution is fundamentally a two-part question. Given the expression obj.prop,

  • What is obj?
  • Where is prop?

In-line caches eliminate redundancy by memoizing where/what pairs. The goal is to use the same hidden class maps to resolve property access whenever possible. Dart's optimizer uses in-line caching to give type feedback for use in improving the performance of loads and stores.
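
To make the where/what memoization concrete, here is a small Python sketch of a single-entry in-line cache sitting at one obj.prop site. It is my own toy, not Dart's implementation: the `HiddenClass`, `Obj`, and `InlineCache` names and the hit/miss counters are all assumptions made for illustration.

```python
class HiddenClass:
    """A map: records which slot each property lives in."""

    def __init__(self, offsets):
        self.offsets = offsets          # property name -> slot index


class Obj:
    """An object is a pointer to its hidden class plus a flat list of slots."""

    def __init__(self, hidden_class, slots):
        self.hidden_class = hidden_class
        self.slots = slots


class InlineCache:
    """Caches one (hidden class, offset) pair for a single obj.prop site."""

    def __init__(self, prop):
        self.prop = prop
        self.cached_class = None
        self.cached_offset = None
        self.hits = 0
        self.misses = 0

    def load(self, obj):
        # Hit: "what is obj?" has the same answer as last time,
        # so "where is prop?" does too.
        if obj.hidden_class is self.cached_class:
            self.hits += 1
            return obj.slots[self.cached_offset]
        # Miss: do the full lookup, then remember the answer.
        self.misses += 1
        offset = obj.hidden_class.offsets[self.prop]
        self.cached_class = obj.hidden_class
        self.cached_offset = offset
        return obj.slots[offset]


point = HiddenClass({"x": 0, "y": 1})
site = InlineCache("y")
values = [site.load(Obj(point, [i, i * 10])) for i in range(3)]
```

After the loop, the site has missed once and hit twice. The record of which hidden classes a site has seen is the kind of type feedback an optimizer can use later.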

Egorov was one of the most quotable speakers I heard at StrangeLoop this year. In addition to "the art of finding shortcuts", I noted several other pithy sayings that I'll probably steal at some point, including:

  • "If all you have is an in-line cache, then everything looks like an in-line cache stub."
  • "In-lining is a Catch-22." You can't know if you will benefit from inlining unless you try, but trying (and undoing) is expensive.

Two ideas I plan to read more about after hearing this talk are allocation sinking and load forwarding.


I have a lot of research to do now.

Posted by Eugene Wallingford | Permalink | Categories: Computing

September 23, 2013 4:22 PM

StrangeLoop: This and That, Volume 1

[My notes on StrangeLoop 2013: Table of Contents]

the Peabody Opera House's Broadway series poster

I'm working on a post about the compiler talks I attended, but in the meantime here are a few stray thoughts, mostly from Day 1.

The Peabody Opera House really is a nice place to hold a conference of this size. If StrangeLoop were to get much larger, it might not fit.

I really don't like the word "architected".

The talks were scheduled pretty well. Only once in two days did I find myself really wanting to go to two talks at the same time. And only once did I hear myself thinking, "I don't want to hear any of these...".

My only real regret from Day 1 was missing Scott Vokes's talk on data compression. I enjoyed the talk I went to well enough, but I think I would have enjoyed this one more.

What a glorious time to be a programming language theory weenie. Industry practitioners are going to conferences and attending talks on dependent types, continuations, macros, immutable data structures, and functional reactive programming.

Moon Hooch? Interesting name, interesting sound.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

September 22, 2013 3:51 PM

StrangeLoop: Jenny Finkel on Machine Learning at Prismatic

[My notes on StrangeLoop 2013: Table of Contents]

The conference opened with a talk by Jenny Finkel on the role machine learning plays at Prismatic, the customized newsfeed service. It was a good way to start the conference, as it introduced a few themes that would recur throughout, had a little technical detail but not too much, and reported a few lessons from the trenches.

Prismatic is trying to solve the discovery problem: finding content that users would like to read but otherwise would not see. This is more than simply a customized newsfeed from a singular journalistic source, because it draws from many sources, including other readers' links, and because it tries to surprise readers with articles that may not be explicitly indicated by their profiles.

The scale of the problem is large, but different from the scale of the raw data facing Twitter, Facebook, and the like. Finkel said that Prismatic is processing only about one million timely docs at a time, with the set of articles turning over roughly weekly. The company currently uses 5,000 categories to classify the articles, though that number will soon go up to the order of 250,000.

The complexity here comes from the cross product of readers, articles, and categories, along with all of the features used to try to tease out why readers like the things they do and don't like the others. On top of this are machine learning algorithms that are themselves exponentially expensive to run. And with articles turning over roughly weekly, they have to be amassing data, learning from it, and moving on constantly.

The main problem at the heart of a service like this is: What is relevant? Everywhere one turns in AI, one sees this question, or its more general cousin, Is this similar? In many ways, this is the problem at the heart of all intelligence, natural and artificial.

Prismatic's approach is straight from AI, too. They construct a feature vector for each user/article pair and then try to learn weights that, when applied to the values in a given vector, will rank desired articles high and undesired articles low. One of the key challenges when doing this kind of work is to choose the right features to use in the vector. Finkel mentioned a few used by Prismatic, including "Does the user follow this topic?", "How many times has the reader read an article from this publisher?", and "Does the article include a picture?"
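
A minimal Python sketch of this kind of ranking might look like the following. The three features echo the ones Finkel mentioned, but the weights, the article names, and the feature values are numbers I made up; a real system would learn the weights from data, and I am not claiming this is how Prismatic's ranker is written.

```python
def score(weights, features):
    """Dot product of learned weights with one user/article feature vector."""
    return sum(w * x for w, x in zip(weights, features))


def rank(weights, candidates):
    """Return article names ordered from highest to lowest score."""
    return sorted(
        candidates,
        key=lambda name: score(weights, candidates[name]),
        reverse=True,
    )


# Hypothetical features for one user:
# [follows topic?, past reads from this publisher, has picture?]
weights = [2.0, 0.5, 0.25]
candidates = {
    "article_a": [1, 0, 1],
    "article_b": [0, 3, 0],
    "article_c": [1, 4, 1],
}
ranking = rank(weights, candidates)
```

With these made-up numbers, article_c scores 4.25, article_a scores 2.25, and article_b scores 1.5, so that is the order in which they are ranked.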

With a complex algorithm, lots of data, and a need to constantly re-learn, Prismatic has to make adjustments and take shortcuts wherever possible in order to speed up the process. This is a common theme at a conference where many speakers are from industry. First, learn your theory and foundations; then learn the pragmatics and heuristics needed to turn basic techniques into the backbone of practical applications.

Finkel shared one pragmatic idea of this sort that Prismatic uses. They look for opportunities to fold user-specific feature weights into user-neutral features. This enables their program to compute many user-specific dot products using a static vector.

She closed the talk with five challenges that Prismatic has faced that other teams might be on the lookout for:

Bugs in the data. In one case, one program was updating a data set before another program could take a snapshot of the original. With the old data replaced by the new, they thought their ranker was doing better than it actually was. As Finkel said, this is pretty typical for an error in machine learning. The program doesn't crash; it just gives the wrong answer. Worse, you don't even have reason to suspect something is wrong in the offending code.

Presentation bias. Readers tend to look at more of the articles at the top of a list of suggestions, even if they would have enjoyed something further down the list. This is a feature of the human brain, not of computer programs. Any time we write programs that interact with people, we have to be aware of human psychology and its effects.

Non-representative subsets. When you are creating a program that ranks things, its whole purpose is to skew a set of user/article data points toward the subset of articles that the reader most wants to read. But this subset probably doesn't have the same distribution as the full set, which hampers your ability to use statistical analysis to draw valid conclusions.

Statistical bleeding. Sometimes, one algorithm looks better than it is because it benefits from the performance of another. Consider two ranking algorithms, one an "explorer" that seeks out new content and one an "exploiter" that recommends articles that have already been found to be popular. When we compare their performances, the exploiter will tend to look better than it is because it benefits from the successes of the explorer without being penalized for its failures. It is crucial to recognize when one feature you measure is not independent of another. (Thanks to Christian Murphy for the prompt!)

Simpson's Paradox. The iPhone and the web have different clickthrough rates. They once found themselves in a situation where one recommendation algorithm performed worse than another on both platforms, yet better overall. This can really disorient teams who follow up experiments by assessing the results. The issue here is usually a hidden variable that is confounding the results.
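
The arithmetic is easy to reproduce. In the Python sketch below, every count is invented by me purely to show the effect; Finkel did not share Prismatic's actual numbers. Algorithm A is worse than B on each platform, yet better overall, because most of A's traffic happens to come from the high-clickthrough platform.

```python
def rate(clicks, views):
    return clicks / views


# Made-up (clicks, views) counts, chosen only to exhibit the paradox.
data = {
    "A": {"iphone": (800, 1000), "web": (10, 100)},
    "B": {"iphone": (90, 100), "web": (200, 1000)},
}


def overall(algo):
    """Clickthrough rate for one algorithm, pooled across platforms."""
    clicks = sum(c for c, _ in data[algo].values())
    views = sum(v for _, v in data[algo].values())
    return rate(clicks, views)


for platform in ("iphone", "web"):
    a = rate(*data["A"][platform])
    b = rate(*data["B"][platform])
    print(f"{platform}: A={a:.2f} B={b:.2f}")
print(f"overall: A={overall('A'):.2f} B={overall('B'):.2f}")
# iphone: A=0.80 B=0.90
# web: A=0.10 B=0.20
# overall: A=0.74 B=0.26
```

The hidden variable in this toy is the traffic mix: the platform determines most of the clickthrough rate, and the two algorithms were not shown to the same mix of platforms.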

(I remember discussing this classic statistical illusion with a student in my early years of teaching, when we encountered a similar illusion in his grade. I am pretty sure that I enjoyed our discussion of the paradox more than he did...)

This part of a talk is of great value to me. Hearing about another team's difficulties rarely helps me avoid the same problems in my own projects, but it often does help me recognize those problems when they occur and begin thinking about ways to work around them. This was a good way for me to start the conference.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

September 22, 2013 10:27 AM

Back from StrangeLoop 2013

the front of my StrangeLoop 2013 badge

I'm back home from StrangeLoop 2013. It was, again, an outstanding conference: a great location, excellent amenities, fun side events, and -- most importantly -- a solid set of talks: diverse topics, strong technical content, and some very good speakers. Alex Miller and his team put on a good conference.

This year, I went to the talks old school: with a steno notebook and no technology but a camera. As a result, a couple of things are different about how I'm blogging the conference. First, I did not write or post any entries during the event itself. Second, my notes are a bit shorter than usual and will need to be typed up before they become blog entries. I'll write my thoughts up over the next week or so and post the entries as they emerge.

This entry will serve as a table of contents for my StrangeLoop posts, a home base for readers who might stumble onto one post and care to read more. For now, I'll list a few entries I expect to write, but I'll only know what belongs here after I have written them.

Primary entries:

Ancillary entries:

Is it too early to start looking forward to StrangeLoop 2014?

Posted by Eugene Wallingford | Permalink | Categories: Computing

September 10, 2013 3:40 PM

A Laugh at My Own Expense

This morning presented a short cautionary tale for me and my students, from a silly mistake I made in a procmail filter.

Back story: I found out recently that I am still subscribed to a Billy Joel fan discussion list from the 1990s. The list has been inactive for years, or I would have been filtering its messages to a separate mailbox. Someone has apparently hacked the list, as a few days ago it started spewing hundreds of spam messages a day.

I was on the road for a few days after the deluge began and was checking mail through a shell connection to the mail server. Because I was busy with my trip and checking mail infrequently, I just deleted the messages by hand. When I got back, my mail client soon learned they were junk and filtered them away for me. But the spam was still hitting my inbox on the mail server, where I read my mail occasionally even on campus.

After a session on the server early this morning, I took a few minutes to procmail them away. Every message from the list has a common pattern in the Subject: line, so I copied it and pasted it into a new procmail recipe to send all list traffic to /dev/null:

    * ^Subject.*[billyjoel]

Do you see the problem? Of course you do.

I didn't at the time. My blindness probably resulted from a combination of the early hour, a rush to get over to the gym, and the tunnel vision that comes from focusing on a single case. It all looked obvious.

This mistake offers programming lessons at several different levels.

The first is at the detailed level of the regular expression. Pay attention to the characters in your regex -- all of them. Those brackets really are in the Subject: line, but by themselves mean something else in the regex. I need to escape them:

    * ^Subject.*\[billyjoel\]

This relates to a more general piece of problem-solving advice. Step back from the individual case you are solving and think about the code you are writing more generally. Focused on the annoying messages from the list, the brackets are just characters in a stream. Looked at from the perspective of the file of procmail recipes, they are control characters.

The second is at the level of programming practice. Don't /dev/null something until you know it's junk. Much better to send the offending messages to a junk mbox first:

    * ^Subject.*\[billyjoel\]

Once I see that all and only the messages from the list are being matched by the pattern, I can change that line to send list traffic where it belongs. That's a specific example of the sort of defensive programming that we all should practice. Don't commit to solutions too soon.

This, too, relates to more general programming advice about software validation and verification. I should have exercised a few test cases to validate my recipe before turning it loose unsupervised on my live mail stream.
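
Here is the sort of quick check I have in mind, sketched in Python rather than in procmail. Python's `re` module is not procmail's matcher (procmail, for one thing, ignores case by default), and the sample headers are hypothetical, but the character-class mistake behaves the same way in both.

```python
import re

buggy = re.compile(r"^Subject.*[billyjoel]")     # brackets form a character class
fixed = re.compile(r"^Subject.*\[billyjoel\]")   # brackets match literally

# Hypothetical headers, each paired with whether the recipe should catch it.
samples = {
    "Subject: [billyjoel] cheap watches": True,   # list traffic: should match
    "Subject: Re: committee meeting": False,      # ordinary mail: should not
    "Subject: lunch today?": False,
}


def false_positives(pattern):
    """Headers the pattern matches even though they are not list traffic."""
    return [h for h, wanted in samples.items()
            if pattern.search(h) and not wanted]


def false_negatives(pattern):
    """List-traffic headers the pattern fails to match."""
    return [h for h, wanted in samples.items()
            if wanted and not pattern.search(h)]
```

The buggy pattern flags both ordinary messages, because each contains at least one of the letters b, i, l, y, j, o, or e somewhere after "Subject". The escaped pattern flags neither and still catches the list traffic.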

I teach my students this mindset and program that way myself, at least most of the time. Of course, the time you most need test cases will be the time you don't write them.

The day provided a bit of irony to make the story even better. The topic of today's session in my compilers course? Writing regular expressions to describe the tokens in a language. So, after my mail admin colleague and I had a good laugh at my expense, I got to tell the story to my students, and they did, too.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

August 31, 2013 11:32 AM

A Good Language Conserves Programmer Energy

Game programmer Jeff Wofford wrote a nice piece on some of the lessons he learned by programming a game in forty-eight hours. One of the recurring themes of his article is the value of a high-powered scripting language for moving fast. That's not too surprising, but I found his ruminations on this phenomenon to be interesting. In particular:

A programmer's chief resource is the energy of his or her mind. Everything that expends or depletes that energy makes him or her less effective, more tired, and less happy.

A powerful scripting language sitting atop the game engine is one of the best ways to conserve programmer energy. Sometimes, though, a game programmer must work hard to achieve the performance required by users. For this reason, Wofford goes out of his way not to diss C++, the tool of choice for many game programmers. But C++ is an energy drain on the programmer's mind, because the programmer has to be in a constant state of awareness of machine cycles and memory consumption. This is where the trade-off with a scripting language comes in:

When performance is of the essence, this state of alertness is an appropriate price to pay. But when you don't have to pay that price -- and in every game there are systems that have no serious likelihood of bottlenecking -- you will gain mental energy back by essentially ignoring performance. You cannot do this in C++: it requires an awareness of execution and memory costs at every step. This is another argument in favor of never building a game without a good scripting language for the highest-level code.

I think this is true of almost every large system. I sure wish that the massive database systems at the foundation of my university's operations had scripting languages sitting on top. I even want to script against the small databases that are the lingua franca of most businesses these days -- spreadsheets. The languages available inside the tools I use are too clunky or not powerful enough, so I turn to Ruby.

Unfortunately, most systems don't come with a good scripting language. Maybe the developers aren't allowed to provide one. Too many CS grads don't even think of "create a mini-language" as a possible solution to their own pain.

Fortunately for Wofford, he has both the skills and the inclination. One of his to-dos after the forty-eight hour experience is all about language:

Building a SWF importer for my engine could work. Adding script support to my engine and greatly refining my tools would go some of the distance. Gotta do something.

Gotta do something.

I'm teaching our compiler course again this term. I hope that the dozen or so students in the course leave the university knowing that creating a language is often the right next action and having the skills to do it when they feel compelled to do something.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

August 29, 2013 4:31 PM

Asimov Sees 2014, Through Clear Eyes and Foggy

Isaac Asimov, circa 1991

A couple of years ago, I wrote Psychohistory, Economics, and AI, in which I mentioned Isaac Asimov and one way that he had influenced me. I never read Asimov or any other science fiction expecting to find accurate predictions of the future. What drew me in was the romance of the stories, dreaming "what if?" for a particular set of conditions. Ultimately, I was more interested in the relationships among people under different technological conditions than I was in the technology itself. Asimov was especially good at creating conditions that generated compelling human questions.

Some of the scenarios I read in Asimov's SF turned out to be wildly wrong. The world today is already more different from the 1950s than the world of the Foundation, set thousands of years in the future. Others seem eerily on the mark. Fortunately, accuracy is not the standard by which most of us judge good science fiction.

But what of speculation about the near future? A colleague recently sent me a link to Visit to the World's Fair of 2014, an article Asimov wrote in 1964 speculating about the world fifty years hence. As I read it, I was struck by just how far off he was in some ways, and by how close he was in others. I'll let you read the story for yourself. Here are a few selected passages that jumped out at me.

General Electric at the 2014 World's Fair will be showing 3-D movies of its "Robot of the Future," neat and streamlined, its cleaning appliances built in and performing all tasks briskly. (There will be a three-hour wait in line to see the film, for some things never change.)

3-D movies are now common. Housecleaning robots are not. And while some crazed fans will stand in line for many hours to see the latest comic-book blockbuster, going to a theater to see a movie has become a much less important part of the culture. People stream movies into their homes and into their hands. My daughter teases me for caring about the time any TV show or movie starts. "It's on Hulu, Dad." If it's not on Hulu or Netflix or the open web, does it even exist?

Any number of simultaneous conversations between earth and moon can be handled by modulated laser beams, which are easy to manipulate in space. On earth, however, laser beams will have to be led through plastic pipes, to avoid material and atmospheric interference. Engineers will still be playing with that problem in 2014.

There is no one on the moon with whom to converse. Sigh. The rest of this passage sounds like fiber optics. Our world is rapidly becoming wireless. If your device can't connect to the world wireless web, does it even exist?

In many ways, the details of technology are actually harder to predict correctly than the social, political, and economic implications of technological change. Consider:

Not all the world's population will enjoy the gadgety world of the future to the full. A larger portion than today will be deprived and although they may be better off, materially, than today, they will be further behind when compared with the advanced portions of the world. They will have moved backward, relatively.

Spot on.

When my colleague sent me the link, he said, "The last couple of paragraphs are especially relevant." They mention computer programming and a couple of its effects on the world. In this regard, Asimov's predictions meet with only partial success.

The world of A.D. 2014 will have few routine jobs that cannot be done better by some machine than by any human being. Mankind will therefore have become largely a race of machine tenders. Schools will have to be oriented in this direction. ... All the high-school students will be taught the fundamentals of computer technology will become proficient in binary arithmetic and will be trained to perfection in the use of the computer languages that will have developed out of those like the contemporary "Fortran" (from "formula translation").

The first part of this paragraph is becoming truer every day. Many people husband computers and other machines as they do tasks we used to