February 29, 2024 3:45 PM

Finding the Torture You're Comfortable With

At some point last week, I found myself pointed to this short YouTube video of Jerry Seinfeld talking with Howard Stern about work habits. Seinfeld told Stern that he was essentially always thinking about making comedy. Whatever situation he found himself in, even with family and friends, he was thinking about how he could mine it for new material. Stern told him that sounded like torture. Jerry said, yes, it was, but...

Your blessing in life is when you find the torture you're comfortable with.

This is something I talk about with students a lot.

Sometimes it's a current student who is worried that CS isn't for them because too often the work seems hard, or boring. Shouldn't it be easy, or at least fun?

Sometimes it's a prospective student, maybe a high school student on a university visit or a college student thinking about changing their major. They worry that they haven't found an area of study that makes them happy all the time. Other people tell them, "If you love what you do, you'll never work a day in your life." Why can't they find that?

I tell them all that I love what I do -- studying, teaching, and writing about computer science -- and even so, some days feel like work.

I don't use torture as an analogy the way Seinfeld does, but I certainly know what he means. Instead, I usually think of this phenomenon in terms of drudgery: all the grunt work that comes with setting up tools, and fiddling with test cases, and formatting documentation, and ... the list goes on. Sometimes we can automate one bit of drudgery, but around the corner awaits another.

And yet we persist. We have found the drudgery we are comfortable with, the grunt work we are willing to do so that we can be part of the thing it serves: creating something new, or understanding one little corner of the world better.

I experienced the disconnect between the torture I was comfortable with and the torture that drove me away during my first year in college. As I've mentioned here a few times, most recently in my post on Niklaus Wirth, from an early age I had wanted to become an architect (the kind who design houses and other buildings, not software). I spent years reading about architecture and learning about the profession. I even took two drafting courses in high school, including one in which we designed a house and did a full set of plans, with cross-sections of walls and eaves.

Then I got to college and found two things. One, I still liked architecture in the same way as I always had. Two, I most assuredly did not enjoy the kind of grunt work that architecture students had to do, nor did I relish the torture that came with not seeing a path to a solution for a thorny design problem.

That was so different from the feeling I had writing BASIC programs. I would gladly bang my head on the wall for hours to get the tiniest detail just the way I wanted it, either in the code or in the output. When the torture ended, the resulting program made all the pain worth it. Then I'd tackle a new problem, and it started again.

Many of the students I talk with don't yet know this feeling. Even so, it comforts some of them to know that they don't have to find The One Perfect Major that makes all their boredom go away.

However, a few others understand immediately. They are often the ones who learned to play a musical instrument or who ran cross country. The pianists remember all the boring finger exercises they had to do; the runners remember all the wind sprints and all the long, boring miles they ran to build their base. These students stuck with the boredom and worked through the pain because they wanted to get to the other side, where satisfaction and joy are.

Like Seinfeld, I am lucky that I found the torture I am comfortable with. It has made this life a good one. I hope everyone finds theirs.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Running, Software Development, Teaching and Learning

February 09, 2024 3:45 PM

Finding Cool Ideas to Play With

In a recent post on Computational Complexity, Bill Gasarch wrote up the solution to a fun little dice problem he had posed previously. Check it out. After showing the solution, he answered some meta-questions. I liked this one:

How did I find this question, and its answer, at random? I intentionally went to the math library, turned my cell phone off, and browsed some back issues of the journal Discrete Mathematics. I would read the table of contents and decide what article sounded interesting, read enough to see if I really wanted to read that article. I then SAT DOWN AND READ THE ARTICLES, taking some notes on them.

He points out that turning off his cell phone isn't the secret to his method.

It's allowing yourself the freedom to NOT work on a paper for the next ... conference and just read math for FUN without thinking in terms of writing a paper.

Slack of this sort used to be one of the great attractions of the academic life. I'm not sure it is as much a part of the deal as it once was. The pace of the university seems faster these days. Many of the younger faculty I follow out in the world seem always to be hustling for the next conference acceptance or grant proposal. They seem truly joyous when an afternoon turns into a serendipitous session of debugging or reading.

Gasarch's advice is wise, if you can follow it: Set aside time to explore, and then do it.

It's not always easy fun; reading some articles is work. But that's the kind of fun many of us signed up for when we went into academia.

~~~~~

I haven't made enough time to explore recently, but I did get to re-read an old paper unexpectedly. A student came to me to discuss possible undergrad research projects. He had recently been noodling around, implementing his own neural network simulator. I've never been much of a neural net person, but that reminded me of this paper on PushForth, a concatenative language in the spirit of Forth and Joy designed as part of an evolutionary programming project. Genetic programming has always interested me, and concatenative languages seem like a perfect fit...

I found the paper in a research folder and made time to re-read it for fun. This is not the kind of fun Gasarch is talking about, as it had potential use for a project, but I enjoyed digging into the topic again nonetheless.

The student looked at the paper and liked the idea, too, so we embarked on a little project -- not quite serendipity, but a project I hadn't planned to work on at the turn of the new year. I'll take it!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

January 06, 2024 10:41 AM

end.

[Photo: a man in a suit, behind a microphone and a bottle of water. Source: Wikipedia, unrestricted.]

My social media feed this week has included many notes and tributes on the passing of Niklaus Wirth, including his obituary from ETH Zurich, where he was a professor. Wirth was, of course, a Turing Award winner for his foundational work designing a sequence of programming languages.

Wirth's death reminded me of END DO, my post on the passing of John Backus, and before that a post on the passing of Kenneth Iverson. I have many fond memories related to Wirth as well.

Pascal

Pascal was, I think, the fifth programming language I learned. After that, my language-learning history starts to speed up and blur. (I do think APL and Lisp came soon after.)

I learned BASIC first, as a junior in high school. This ultimately changed the trajectory of my life, as it planted the seeds for me to abandon a lifelong dream to be an architect.

Then at university, I learned Fortran in CS 1, PL/I in Data Structures (you want pointers!), and IBM 360/370 assembly language in a two-quarter sequence that also included JCL. Each of these languages expanded my mind a little.

Pascal was the first language I learned "on my own". The fall of my junior year, I took my first course in algorithms. On Day 1, the professor announced that the department had decided to switch to Pascal in the intro course, so that's what we would use in this course.

"Um, prof, that's what the new CS majors are learning. We know Fortran and PL/I." He smiled, shrugged, and turned to the chalkboard. Class began.

After class, several of us headed immediately to the university library, checked out one Pascal book each, and headed back to the dorms to read. Later that week, we were all using Pascal to implement whatever classical algorithm we learned first in that course. Everything was fine.

I've always treasured that experience, even if it was a little scary for a week or so. And don't worry: That professor turned out to be a good guy with whom I took several courses. He was a fellow chess player and ended up being the advisor on my senior project: a program to run the Swiss system commonly used in chess tournaments. I wrote that program in... Pascal. Up to that point, it was the largest and most complex program I had ever written solo. I still have the code.

The first course I taught as a tenure-track prof was my university's version of CS 1 -- using Pascal.

Fond memories all. I miss the language.

Wirth sightings in this blog

I did a quick search and found that Wirth has made an occasional appearance in this blog over the years.

• January 2006: Just a Course in Compilers

This was written at the beginning of my second offering of our compiler course, which I have taught and written about many times since. I had considered using as our textbook Wirth's Compiler Construction, a thin volume that builds a compiler for a subset of Wirth's Oberon programming language over the course of sixteen short chapters. It's a "just the facts and code" approach that appeals to me most days.

I didn't adopt the book for several reasons, not least of which that at the time Amazon showed only four copies available, starting at $274.70 each. With two decades of experience teaching the course now, I don't think I could ever really use this book with my undergrads, but it was a fun exercise for me to work through. It helped me think about compilers and my course.

Note: A PDF of Compiler Construction has been posted on the web for many years, but every time I link to it, the link ultimately disappears. I decided to mirror the files locally, so that the link will last as long as this post lasts:
[ Chapters 1-8 | Chapters 9-16 ]

• September 2007: Hype, or Disseminating Results?

... in which I quote Wirth's thoughts on why Pascal spread widely in the world but Modula and Oberon didn't. The passage comes from a short historical paper he wrote called "Pascal and its Successors". It's worth a read.

• April 2012: Intermediate Representations and Life Beyond the Compiler

This post mentions how Wirth's P-code IR ultimately lived on in the MIPS compiler suite long after the compiler which first implemented P-code.

• July 2016: Oberon: GoogleMaps as Desktop UI

... which notes that the Oberon spec defines the language's desktop as "an infinitely large two-dimensional space on which windows ... can be arranged".

• November 2017: Thousand-Year Software

This is my last post mentioning Wirth before today's. It refers to the same 1999 SIGPLAN Notices article that tells the P-code story discussed in my April 2012 post.

I repeat myself. Some stories remain evergreen in my mind.

The Title of This Post

I titled my post on the passing of John Backus END DO in homage to his intimate connection to Fortran. I wanted to do something similar for Wirth.

Pascal has a distinguished sequence to end a program: "end.". It seems a fitting way to remember the life of the person who created it and who gave the world so many programming experiences.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

December 31, 2023 1:35 PM

"I Want to Find Something to Learn That Excites Me"

In his year-end wrap-up, Greg Wilson writes:

I want to find something to learn that excites me. A new musical instrument is out because of my hand; I've thought about reviving my French, picking up some Spanish, diving into Postgres or machine learning (yeah, yeah, I know, don't hate me), but none of them are making my heart race.

What he said. I want to find something to learn that excites me.

I just spent six months immersed in learning more about HTML, CSS, and JavaScript so that I could work with novice web developers. Picking up that project was one part personal choice and one part professional necessity. It worked out well. I really enjoyed studying the web development world and learned some powerful new tools. I will continue to use them as time and energy permit.

But I can't say that I am excited enough by the topic to keep going in this area. Right now, I am still burned out from the semester on a learning treadmill. I have a followup post to my early reactions about the course's JavaScript unit in the hopper, waiting for a little desire to finish it.

What now? There are parallels between my state and Wilson's.

  • After my first-ever trip to Europe in 2019, for a Dagstuhl seminar (brief mention here), my wife and I talked about a return trip, with a focus this time on Italy. Learning Italian was part of the nascent plan. Then came COVID, along with a loss of energy for travel. I still have learning Italian in my mind.
  • In the fall of 2020, the first full semester of the pandemic, I taught a database course for the first time (bookend posts here and here). I still have a few SQL projects and learning goals hanging around from that time, but none are calling me right now.
  • LLMs are the main focus of so many people's attention these days, but they still haven't lit me up. In some ways, I envy David Humphrey, who fell in love with AI this year. Maybe something about LLMs will light me up one of these days. (As always, you should read David's stuff. He does neat work and shares it with the world.)

Unlike Wilson, I do not play a musical instrument. I did, however, learn a little basic piano twenty-five years ago when I was a Suzuki piano parent with my daughters. We still have our piano, and I harbor dreams of picking it back up and going farther some day. Right now doesn't seem to be that day.

I have several other possibilities on the back burner, particularly in the area of data analytics. I've been intrigued by the work on data-centric computing in education being done by Kathi Fisler and Shriram Krishnamurthi at Brown. I will also be reading a couple of their papers on program design and plan composition in the coming weeks as I prepare for my programming languages course this spring. Fisler and Krishnamurthi are coming at these topics from the side of CS education, but the topics are also related to my grad-school work in AI. Maybe these papers will ignite a spark.

Winter break is coming to an end soon. Like others, I'm thinking about 2024. Let's see what the coming weeks bring.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

August 23, 2023 12:31 PM

Do Machines Have a Right to Read?

Yes, according to Jeff Jarvis. That is one of his unpopular opinions about AI:

Machines should have the same right to learn as humans; to say otherwise is to set a dangerous precedent for humans. If we say that a machine is not allowed to learn, to read, to extract knowledge from existing content and adapt it to other uses, then I fear it would not be a long leap to declare what we as humans are not allowed to read, see, or know some things. This puts us in the odd position of having to defend the machine's rights so as to protect our own.

I have been staying mostly out of the AI debate these days, except as devil's advocate inside my head to most every argument I see. However, I must admit that Jarvis's assertion seems on the right track to me. The LLMs aren't memorizing the text they process so much as breaking it down into little bits and distributing those bits across elements that enable a level of synthesis different from regurgitation. That's why they get so many facts wrong. That sounds a lot like what people do when they read and assimilate new material into their heads. (Sadly, we get a lot of facts wrong, too.)

My support for this assertion rests on something that Jarvis says at the end of his previous bullet:

I am no lawyer but I believe training machines on any content that is lawfully acquired so it can be inspired to produce new content is not a violation of copyright. Note my italics.

I, too, am not a lawyer. Good read from Jarvis, as usual.


Posted by Eugene Wallingford | Permalink | Categories: Computing

May 01, 2023 4:21 PM

Disconcerted by a Bank Transaction

I'm not sure what to think of the fact that my bank says it received my money NaN years ago:

[Screen capture from my bank's web site, showing that my transaction was received "NaN years ago".]

At least NaN hasn't shown up as my account balance yet! I suppose that if it were the result of an overflow, I'd at least know what it's like to be fabulously wealthy.

(For my non-technical readers, NaN stands for "Not a Number" and is used in computing to represent a value that is not defined or not representable. You may be able to imagine why seeing this in a bank transaction would be disconcerting to a programmer!)
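
The bank's page was almost certainly produced by JavaScript, but the behavior is easy to reproduce in Python:

    import math

    elapsed = float("nan")                       # what a botched date calculation can produce
    print(f"received {elapsed:.0f} years ago")   # received nan years ago
    print(elapsed == elapsed)                    # False -- NaN is not even equal to itself
    print(math.isnan(elapsed))                   # True -- the only reliable way to test for it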


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

April 24, 2023 2:54 PM

PyCon Day 3

The last day of a conference is often a wildcard. If it ends early enough, I can often drive or fly home afterward, which means that I can attend all of the conference activities. If not, I have to decide between departing the next day or cutting out early. When possible, I stay all day. Sometimes, as with StrangeLoop, I can stay all day, skip only the closing keynote, and drive home into the night.

With virtual PyCon, I decided fairly early on that I would not be able to attend most of the last day. This is Sunday, I have family visiting, and work looms on the horizon for Monday. An afternoon talk or two is all I will be able to manage.

Talk 1: A Pythonic Full-Text Search

The idea of this talk was to use common Python tools to implement full-text search of a corpus. It turns out that the talk focused on websites, and thus on tools common to Python web developers: PostgreSQL and Django. It also turns out that django.contrib.postgres provides lots of features for doing search out of the box. This talk showed how to use them.

This was an interesting talk, but not immediately useful to me. I don't work with Django, and I use SQLite more often than PostgreSQL for my personal work. There may be some search support for SQLite hiding in django.contrib somewhere, but the speaker said that I'd likely have to implement most of the search functionality myself. Even so, I enjoyed seeing what was possible with modules already available.
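
For what it's worth, SQLite itself ships with a full-text engine, FTS5, that the standard sqlite3 module can use directly, no Django required. A minimal sketch, assuming your SQLite build includes the FTS5 extension (most do) and using table and column names I made up for the example:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE posts USING fts5(title, body)")
    conn.executemany("INSERT INTO posts VALUES (?, ?)",
                     [("PyCon Day 3", "a Pythonic full-text search with Django and Postgres"),
                      ("PyCon Day 2", "desugaring Python's syntactic sugar")])

    # MATCH queries the full-text index; rank orders the hits by relevance
    for (title,) in conn.execute(
            "SELECT title FROM posts WHERE posts MATCH ? ORDER BY rank", ("search",)):
        print(title)

That is nowhere near what django.contrib.postgres gives you out of the box, but for a small personal project it might be plenty.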

Talk 2: Using Python to Help the Unhoused

I thought I was done for the conference, but I decided I could listen in on one more talk while making dinner. This non-technical session sounded interesting:

How a group of volunteers from around the globe use Python to help an NGO in Victoria, BC, Canada to help the unhoused. By building a tool to find social media activity on unhoused in the Capitol Region, the NGO can use a dashboard of results to know where to move their limited resources.

With my attention focused on the Sri Lankan dal with coconut-lime kale in my care, I didn't take detailed notes this time, but I did learn about the existence of Statistics Without Borders, which sounds like a cool public service group that needs to exist in 2023. Otherwise, the project involved scraping Twitter as a source of data about the needs of the homeless in Victoria, and using sentiment analysis to organize the data. Filtering the data to zero in on relevant data was their biggest challenge, as keyword filters passed through many false positives.

At this point, the developers have given their app to the NGO and are looking forward to receiving feedback, so that they can make any improvements that might be needed.

This is a nice project by folks giving back to their community, and a nice way to end the conference.

~~~~~

I was a PyCon first-timer, attending virtually. The conference talks were quite good. Thanks to everyone who organized the conference and created such a complete online experience. I didn't use all of the features available, but what I did use worked well, and the people were great. I ended up with links to several Python projects to try out, a few example scripts for PyScript and Mermaid that I cobbled together after the talks, and lots of syntactic abstractions to explore. Three days well spent.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

April 23, 2023 12:09 PM

PyCon Day 2

Another way that attending a virtual conference is like the in-person experience: you can oversleep, or lose track of time. After a long morning of activity before the time-shifted start of Day 2, I took a nap before the 11:45 talk, and...

Talk 1: Python's Syntactic Sugar

Grrr. I missed the first two-thirds of this talk, which I had greatly anticipated, because I slept longer than I planned. My body must have needed more than I was giving it.

I saw enough of the talk, though, to know I want to watch the video on YouTube when it shows up. This topic is one of my favorite topics in programming languages: What is the smallest set of features we need to implement the rest of the language? The speaker spent a couple of years implementing various Python features in terms of others, and arrived at a list of only ten that he could not translate away. The rest are sugar. I missed the list at the beginning of the talk, but I gathered a few of its members in the ten minutes I watched: while, raise, and try/except.

I love this kind of exercise: "How can you translate the statement if X: Y into one that uses only core features?" Here's one attempt the speaker gave:

    try:
        while X:
            Y
            raise _DONE
    except _DONE:
        None

I was today years old when I learned that Python's bool subclasses int, that True == 1, and that False == 0. That bit of knowledge was worth interrupting my nap to catch the end of the talk. Even better, this talk was based on a series of blog posts. Video is okay, but I love to read and play with ideas in written form. This series vaults to the top of my reading list for the coming days.
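
In case that sounds like trivia, a quick interaction shows what it means and why it is occasionally handy:

    >>> issubclass(bool, int)
    True
    >>> True == 1, False == 0
    (True, True)
    >>> True + True                        # bools behave like 1 and 0 in arithmetic
    2
    >>> sum(x > 2 for x in [1, 2, 3, 4])   # which makes counting matches a one-liner
    2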

Talk 2: Subclassing, Composition, Python, and You

Okay, so this guy doesn't like subclasses much. Fair enough, but... some of his concerns seem to be more about the way Python classes work (open borders with their super- and subclasses) than with the idea itself. He showed a lot of ways one can go wrong with arcane uses of Python subclassing, things I've never thought to do with a subclass in my many years doing OO programming. There are plenty of simpler uses of inheritance that are useful and understandable.
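
A classic example of the kind of pitfall he had in mind, my own sketch rather than one from the talk: override a hook method in a dict subclass and discover that a built-in method bypasses it, something a composition-based design avoids by construction.

    class UpperDict(dict):
        """Subclassing: try to force every key to uppercase."""
        def __setitem__(self, key, value):
            super().__setitem__(key.upper(), value)

    d = UpperDict()
    d["a"] = 1        # goes through our __setitem__
    d.update(b=2)     # dict.update() in CPython does not -- the hook is bypassed
    print(d)          # {'A': 1, 'b': 2}

    class UpperMap:
        """Composition: wrap a dict and control every path into it."""
        def __init__(self):
            self._data = {}
        def __setitem__(self, key, value):
            self._data[key.upper()] = value
        def update(self, **kwargs):
            for k, v in kwargs.items():
                self[k] = v               # every write funnels through __setitem__
        def __repr__(self):
            return repr(self._data)

    m = UpperMap()
    m["a"] = 1
    m.update(b=2)
    print(m)          # {'A': 1, 'B': 2}

(Pitfalls like this are why the standard library offers collections.UserDict.)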

Still, I liked this talk, and the speaker. He was honest about his biases, and he clearly cares about programs and programmers. His excitement gave the talk energy. The second half of the talk included a few good design examples, using subclassing and composition together to achieve various ends. It also recommended the book Architecture Patterns with Python. I haven't read a new software patterns book in a while, so I'll give this one a look.

Toward the end, the speaker referenced the line "Software engineering is programming integrated over time." Apparently, this is a Google thing, but it was new to me. Clever. I'll think on it.

Talk 3: How We Are Making CPython Faster -- Past, Present and Future

I did not realize that, until Python 3.11, efforts to make the interpreter faster had been rather limited. The speaker mentioned one improvement made in 3.7 to optimize the typical method invocation, obj.meth(arg), and one in 3.8 that sped up global variable access by using a cache. There are others, but nothing systematic.

At this point, the talk became mutually recursive with the Friday talk "Inside CPython 3.11's New Specializing, Adaptive Interpreter". The speaker asked us to go watch that talk and return. If I were more ambitious, I'd add a link to that talk now, but I'll let any of you who are interested visit yesterday's post and scroll down two paragraphs.

He then continued with improvements currently in the works, including:

  • efforts to optimize over larger regions, such as the different elements of a function call
  • use of partial evaluation when possible
  • specialization of code
  • efforts to speed up memory management and garbage collection

He also mentioned possible improvements related to C extension code, but I didn't catch the substance of that one. The speaker offered the audience a pithy takeaway from his talk: Python is always getting faster. Do the planet a favor and upgrade to the latest version as soon as you can. That's a nice hook.

There was lots of good stuff here. Whenever I hear compiler talks like this, I almost immediately start thinking about how I might work some of the ideas into my compiler course. To do more with optimization, we would have to move faster through construction of a baseline compiler, skipping some or all of the classic material. That's a trade-off I've been reluctant to make, given the course's role in our curriculum as a compilers-for-everyone experience. I remain tempted, though, and open to a different treatment.

Talk 4: The Lost Art of Diagrams: Making Complex Ideas Easy to See with Python

Early on, this talk contained a line that programmers sometimes need to remember: Good documentation shows respect for users. Good diagrams, said the speaker, can improve users' lives. The talk was a nice general introduction to some of the design choices available to us as we create diagrams, including the use of color, shading, and shapes (Venn diagrams, concentric circles, etc.). It then discussed a few tools one can use to generate better diagrams. The one that appealed most to me was Mermaid.js, which uses a Markdown-like syntax that reminded me of GraphViz's Dot language. My students and I use GraphViz, so picking up Mermaid might be a comfortable task.

~~~~~

My second day at virtual PyCon confirmed that attending was a good choice. I've seen enough language-specific material to get me thinking new thoughts about my courses, plus a few other topics to broaden the experience. A nice break from the semester's grind.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

April 22, 2023 6:38 PM

PyCon Day 1

One of the great benefits of a virtual conference is accessibility. I can hop from Iowa to Salt Lake City with the press of a button. The competing cost of a virtual conference is that I am accessible ... from elsewhere.

On the first day of PyCon, we had a transfer orientation session that required my presence in virtual Iowa from 10:00 AM-12:00 noon local time. That's 9:00-11:00 Mountain time, so I missed Ned Batchelder's keynote and the opening set of talks. The rest of the day, though, I was at the conference. Virtual giveth, and virtual taketh away.

Talk 1: Inside CPython 3.11's New Specializing, Adaptive Interpreter

As I said yesterday, I don't know Python -- tools, community, or implementation -- intimately. That means I have a lot to learn in any talk. In this one, Brandt Bucher discussed the adaptive interpreter that is part of Python 3.11, in particular how the compiler uses specialization to improve its performance based on run-time usage of the code.

Midway through the talk, he referred us to a talk on tomorrow's schedule. "You'll find that the two talks are not only complementary, they're also mutually recursive." I love the idea of mutually recursive talks! Maybe I should try this with two sessions in one of my courses. To make it fly, I will need to make some videos... I wonder how students would respond?

This online Python disassembler by @pamelafox@fosstodon.org popped up in the chat room. It looks like a neat tool I can use in my compiler course. (Full disclosure: I have been following Pamela on Twitter and Mastodon for a long time. Her posts are always interesting!)

Talk 2: Build Yourself a PyScript

PyScript is a JavaScript module that enables you to embed Python in a web page, via WebAssembly. This talk described how PyScript works and showed some of the technical issues involved in using it to write web apps.

Some of this talk was over my head. I also do not have deep experience programming for the web. It looks like I will end up teaching a beginning web development course this fall (more later), so I'll definitely be learning more about HTML, CSS, and JavaScript soon. That will prepare me to be more productive using tools like PyScript.

Talk 3: Kill All Mutants! (Intro to Mutation Testing)

Our test suites are often not strong enough to recognize changes in our code. This talk introduced mutation testing, which makes small changes to the code under test in order to test the test suite itself. I didn't take a lot of notes on this one, but I did make a note to try mutation testing out, maybe in Racket.
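
Before I try it in Racket, here is a toy sketch of the mechanism in Python, my own example rather than one from the talk: change one operator in the source, re-run the tests, and see whether they notice.

    import ast
    import textwrap

    SOURCE = textwrap.dedent("""
        def biggest(a, b):
            return a if a > b else b
    """)

    def test_suite(biggest):
        assert biggest(3, 2) == 3     # a stronger suite would also test ties

    class WeakenGt(ast.NodeTransformer):
        """One classic mutation operator: turn every > into >=."""
        def visit_Compare(self, node):
            self.generic_visit(node)
            node.ops = [ast.GtE() if isinstance(op, ast.Gt) else op
                        for op in node.ops]
            return node

    mutant_tree = WeakenGt().visit(ast.parse(SOURCE))
    ast.fix_missing_locations(mutant_tree)
    namespace = {}
    exec(compile(mutant_tree, "<mutant>", "exec"), namespace)

    try:
        test_suite(namespace["biggest"])
        print("mutant survived: the tests never exercise a tie")
    except AssertionError:
        print("mutant killed: the tests caught the change")

A real tool automates this over a whole code base with a library of mutation operators, but the mechanism is the same.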

Talk 4: Working with Time Zones: Everything You Wish You Didn't Need to Know

Dealing with time zones is one of those things that every software engineer seems to complain about. It's a thorny problem with both technical and social dimensions, which makes it really interesting for someone who loves software design to think about.

This talk opened with example after example of how time zones don't behave as straightforwardly as you might think, and then discussed pytz, a widely used Python time zone library.

My main takeaways from this talk: pytz looks useful, and I'm glad I don't have to deal with time zones on a regular basis.
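
Still, one flavor of the weirdness is easy to reproduce at home. A minimal sketch using the standard library's zoneinfo module (which may or may not be the library the talk featured): arithmetic on an aware datetime works on the wall clock, so crossing a daylight saving boundary gives two different answers to "how much time passed?"

    from datetime import datetime, timedelta, timezone
    from zoneinfo import ZoneInfo          # standard library since Python 3.9

    central = ZoneInfo("America/Chicago")

    # 11:00 PM on the night before US daylight saving time begins
    before = datetime(2023, 3, 11, 23, 0, tzinfo=central)
    after  = before + timedelta(hours=4)   # wall-clock arithmetic

    print(after)                           # 2023-03-12 03:00:00-05:00
    print(after - before)                  # 4:00:00 (same tzinfo, so naive subtraction)
    print(after.astimezone(timezone.utc)
          - before.astimezone(timezone.utc))   # 3:00:00 -- only three real hours passed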

Talk 5: Pythonic Functional (iter)tools for your Data Challenges

This is, of course, a topic after my heart. Functional programming is a big part of my programming languages course, and I like being able to show students Python analogues to the Racket ideas they are learning. There was not much new FP content for me here, but I did learn some new Python functions from itertools that I can use in class -- and in my own code. I enjoyed the Advent of Code segment of the talk, in which the speaker applied Python to some of the 2021 challenges. I use an Advent of Code challenge or two each year in class, too. The early days of the month usually feature fun little problems that my students can understand quickly. They know how to solve them imperatively in Python, but we tackle them functionally in Racket.

Most of the FP ideas needed to solve them in Python are similar, so it was fun to see the speaker solve them using itertools. Toward the end, the solutions got heavy quickly, which must be how some of my students feel when we are solving these problems in class.
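
To give a flavor of the lighter end, here is a sketch of what I remember as the first 2021 puzzle -- count how often a reading in a list of sonar depths increases -- using itertools.pairwise and the puzzle's sample data:

    from itertools import pairwise        # Python 3.10+

    depths = [199, 200, 208, 210, 200, 207, 240, 269, 260, 263]

    # Part 1: how many readings are larger than the previous one?
    print(sum(b > a for a, b in pairwise(depths)))                  # 7

    # Part 2: the same question for sums over a sliding window of three
    windows = [sum(w) for w in zip(depths, depths[1:], depths[2:])]
    print(sum(b > a for a, b in pairwise(windows)))                 # 5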

~~~~~

Between work in the morning and the conference afternoon and evening, this was a long day. I have a lot of new tools to explore.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

April 21, 2023 3:33 PM

Headed to PyCon, Virtually

Last night, I posted on Mastodon:

Heading off to #PyConUS in the morning, virtually. I just took my first tour of Hubilo, the online platform. There's an awful lot going on, but you need a lot of moving parts to produce the varied experiences available in person. I'm glad that people have taken up the challenge.

Why PyCon? I'm not an expert in the language, or as into the language as I once was with Ruby. Long-time readers may recall that I blogged about attending JRubyConf back in 2012. Here is a link to my first post from that conference. You can scroll up from there to see several more posts from the conference.

However, I do write and read a lot of Python code, because our students learn it early and use it quite a bit. Besides, it's a fun language and has a large, welcoming community. Like many language-specific conferences, PyCon includes a decent number of talks about interpreters, compilers, and tools, which are a big part of my teaching portfolio these days. The conference offers a solid virtual experience, too, which makes it attractive to attend while the semester is still going on.

My goals for attending PyCon this year include:

  • learning some things that will help me improve my programming languages and compiler courses,
  • learning some things that make me a better Python programmer, and
  • simply enjoying computer science and programming for a few days, after a long and occasionally tedious year. The work doesn't go away while I am at a conference, but my mind gets to focus on something else -- something I enjoy!

More about today's talks tomorrow.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 29, 2023 2:39 PM

Fighting Entropy in Class

On Friday, I wrote a note to myself about updating an upcoming class session:

Clean out the old cruft. Simplify, simplify, simplify! I want students to grok the idea and the implementation. All that Lisp and Scheme history is fun for me, but it gets in the students' way.

This is part of an ongoing battle for me. Intellectually, I know to design class sessions and activities focused on where students are and what they need to do in order to learn. Yet it continually happens that I strike upon a good approach for a session, and then over the years I stick a little extra in here and there; within a few iterations I have a big ball of mud. Or I fool myself that some bit of history that I find fascinating is somehow essential for students to learn about, too, so I keep it in a session. Over the years, the history grows more and more distant; the needs of the session evolve, but I keep the old trivia in there, filling up time and distracting my students.

It's hard.

The specific case in question here is a session in my programming languages course called Creating New Syntax. The session has the modest goal of introducing students to the idea of using macros and other tools to define new syntactic constructs in a language. My students come into the course with no Racket or Lisp experience, and only a few have enough experience with C/C++ that they may have seen its textual macros. My plan for this session is to expose them to a few ideas and then to demonstrate one of Racket's wonderful facilities for creating new syntax. Given the other demands of the course, we don't have time to go deep, only to get a taste.

[In my dreams, I sometimes imagine reorienting this part of my course around something like Matthew Butterick's Beautiful Racket... Maybe someday.]

Looking at my notes for this session on Friday, I remembered just how overloaded and distracting the session has become. Over the years, I've pared away the extraneous material on macros in Lisp and C, but now it has evolved to include too many ideas and incomplete examples of macros in Racket. Each by itself might make for a useful part of the story. Together, they pull attention here and there without ever closing the deal.

I feel like the story I've been telling is getting in the way of the one or two key ideas about this topic I want students to walk away from the course with. It's time to clean the session up -- to make some major changes -- and tell a more effective story.

The specific idea I seized upon on Friday is an old idea I've had in mind for a while but never tried: adding a Python-like for-loop:

    (for i in lst : (sqrt i))

[Yes, I know that Racket already has a fine set of for-loops! This is just a simple form that lets my students connect their fondness for Python with the topic at hand.]

This functional loop is equivalent to a Racket map expression:

    (map (lambda (i)
           (sqrt i))
         lst)

We can write a simple list-to-list translator that converts the loop to an equivalent map:

    (define for-to-map
      (lambda (for-exp)
        (let ((var (second for-exp))
              (lst (fourth for-exp))
              (exp (sixth for-exp)))
          (list 'map
                (list 'lambda (list var) exp)
                lst))))

This code handles only the surface syntax of the new form. To add it to the language, we'd have to recursively translate the form. But this simple function alone demonstrates the idea of translational semantics, and shows just how easy it can be to convert a simple syntactic abstraction into an equivalent core form.

Racket, of course, gives us better options! Here is the same transformer using the syntax-rules operator:

    (define-syntax for-p
      (syntax-rules (in :)
        ( (for-p var in lst : exp)
            (map (lambda (var) exp) lst) )  ))

So easy. So powerful. So clear. And this does more than translate surface syntax in the form of a Racket list; it enables the Racket language processor to expand the expression in place and execute the result:

    > (for-p i in (range 0 10):
        (sqrt i))
    '(0
      1
      1.4142135623730951
      ...
      2.8284271247461903
      3)

This small example demonstrates the key idea about macros and syntax transformers that I want students to take from this session. I plan to open the session with for-p, and then move on to range-case, a more complex operator that demonstrates more of syntax-rules's range and power.

This sets me up for a fun session in a little over a week. I'm excited to see how it plays with students. Renewed simplicity and focus should help.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 05, 2023 9:47 AM

Context Matters

In this episode of Conversations With Tyler, Cowen asks economist Jeffrey Sachs if he agrees with several other economists' bearish views on a particular issue. Sachs says they "have been wrong ... for 20 years", laughs, and goes on to say:

They just got it wrong time and again. They had failed to understand, and the same with [another economist]. It's the same story. It doesn't fit our model exactly, so it can't happen. It's got to collapse. That's not right. It's happening. That's the story of our time. It's happening.

"It doesn't fit our model, so it can't happen." But it is happening.

When your model keeps saying that something can't happen, but it keeps happening anyway, you may want to reconsider your model. Otherwise, you may miss the dominant story of the era -- not to mention being continually wrong.

Sachs spends much of his time with Cowen emphasizing the importance of context in determining which model to use and which actions to take. This is essential in economics because the world it studies is simply too complex for the models we have now, even the complex models.

I think Sachs's insight applies to any discipline that works with people, including education and software development.

The topic of education even comes up toward the end of the conversation, when Cowen asks Sachs how to "fix graduate education in economics". Sachs says that one of its problems is that they teach econ as if there were "four underlying, natural forces of the social universe" rather than studying the specific context of particular problems.

He goes on to highlight an approach that is affecting every discipline now touched by data analytics:

We have so much statistical machinery to ask the question, "What can you learn from this dataset?" That's the wrong question because the dataset is always a tiny, tiny fraction of what you can know about the problem that you're studying.

Every interesting problem is bigger than any dataset we build from it. The details of the problem matter. Again: context. Sachs suggests that we shouldn't teach econ like physics, with Maxwell's overarching equations, but like biology, with the seemingly arbitrary details of DNA.

In my mind, I immediately began thinking about my discipline. We shouldn't teach software development (or econ!) like pure math. We should teach it as a mix of principles and context, generalities and specific details.

There's almost always a tension in CS programs between timeless knowledge and the details of specific languages, libraries, and tools. Most of our students don't go on to become theoretical computer scientists; they go out to work in the world of messy details, details that keep evolving and morphing into something new.

That makes our job harder than teaching math or some sciences because, like economics:

... we're not studying a stable environment. We're studying a changing environment. Whatever we study in depth will be out of date. We're looking at a moving target.

That dynamic environment creates a challenge for those of us teaching software development or any computing as practiced in the world. CS professors have to constantly be moving, so as not to fall out of date. But they also have to try to identify the enduring principles that their students can count on as they go on to work in the world for several decades.

To be honest, that's part of the fun for many of us CS profs. But it's also why so many CS profs can burn out after 15 or 20 years. A never-ending marathon can wear anyone out.

Anyway, I found Cowen's conversation with Jeffrey Sachs to be surprisingly stimulating, both for thinking about economics and for thinking about software.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

March 01, 2023 2:26 PM

Have Clojure and Racket Overcome the Lisp Curse?

I finally read Rudolf Winestock's 2011 essay The Lisp Curse, which is summarized in one line:

Lisp is so powerful that problems which are technical issues in other programming languages are social issues in Lisp.

It seems to me that Racket and Clojure have overcome the curse. Racket was built by a small team that grew up in academia. Clojure was designed and created by an individual. Yet they are both 100% solutions, not the sort of one-off 80% personal solutions that tend to plague the Lisp world.

But the creators went further: They also attracted and built communities.

The Racket and Clojure communities consist of programmers who care about the entire ecosystem. The Racket community welcomes and helps newcomers. I don't move in Clojure circles, but I see and hear good things from people who do.

Clojure has made a bigger impact commercially, of course. Offering a high level of performance and running on the JVM have their advantages. I doubt either will ever displace Java or the other commercial behemoths, but they appear to have staying power. They earned that status by solving both technical issues and social issues.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

February 26, 2023 8:57 AM

"If I say no, are you going to quit?"

Poet Marvin Bell, in his contribution to the collection Writers on Writing:

The future belongs to the helpless. I am often presented that irresistible question asked by the beginning poet: "Do you think I am any good?" I have learned to reply with a question: "If I say no, are you going to quit?" Because life offers any of us many excuses to quit. If you are going to quit now, you are almost certainly going to quit later. But I have concluded that writers are people who you cannot stop from writing. They are helpless to stop it.

Reading that passage brought to mind Ted Gioia's recent essay on musicians who can't seem to retire. Even after accomplishing much, these artists seem never to want to stop doing their thing.

Just before starting Writers on Writing, I finished Kurt Vonnegut's Sucker's Portfolio, a slim 2013 volume of six stories and one essay not previously published. The book ends with an eighth piece: a short story unfinished at the time of Vonnegut's death. The story ends mid-sentence and, according to the book's editor, at the top of an unfinished typewritten page. In his mid-80s, Vonnegut was creating stories to the end.

I wouldn't mind if, when it's my time to go, folks find my laptop open to some fun little programming project I was working on for myself. Programming and writing are not everything there is to my life, but they bring me a measure of joy and satisfaction.

~~~~~

This week was a wonderful confluence of reading the Bell, Gioia, and Vonnegut pieces around the same time. So many connections... not least of which is that Bell and Vonnegut both taught at the Iowa Writers' Workshop.

There's also an odd connection between Vonnegut and the Gioia essay. Gioia used a quip attributed to the Roman epigrammist Martial:

Fortune gives too much to many, but enough to none.

That reminded me of a story Vonnegut told occasionally in his public talks. He and fellow author Joseph Heller were at a party hosted by a billionaire. Vonnegut asked Heller, "How does it make you feel to know that guy made more money yesterday than Catch-22 has made in all the years since it was published?" Heller answered, "I have something he'll never have: the knowledge that I have enough."

There's one final connection here, involving me. Marvin Bell was the keynote speaker at Camouflage: Art, Science & Popular Culture, an international conference organized by graphic design prof Roy Behrens at my university and held in April 2006. Participants really did come from all around the world, mostly artists or designers of some sort. Bell read a new poem of his and then spoke of:

the ways in which poetry is like camouflage, how it uses a common vocabulary but requires a second look in order to see what is there.

I gave a talk at the conference called NUMB3RS Meets The DaVinci Code: Information Masquerading as Art. (That title was more timely in 2006 than 2023...) I presented steganography as a computational form of camouflage: not quite traditional concealment, not quite dazzle, but a form of dispersion uniquely available in the digital world. I recall that audience reaction to the talk was better than I feared when I proposed it to Roy. The computer science topic meshed nicely with the rest of the conference lineup, and the artists and writers who saw the talk seemed to appreciate the analogy. Anyway, lots of connections this week.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

February 23, 2023 10:40 AM

Implementing Flavius Josephus's Sieve

The other night, I wanted to have some fun, so I wrote a few Racket functions to implement Flavius Josephus's sieve. Think the sieve of Eratosthenes, but with each pass applied to the numbers left after the previous pass:

    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ...
    1   3   5   7   9    11    13    15 ...  - after pass 1 (remove every 2nd)
    1   3       7   9          13    15 ...  - after pass 2 (remove every 3rd)
    1   3       7              13    15 ...  - after pass 3 (remove every 4th)
    ...

My code isn't true to the mathematical ideal, because it doesn't process the unbounded sequence of natural numbers. (Haskell, with lazy evaluation, might be a better choice for that!) Instead, it works on a range [1..n] for a given n. That's plenty good enough for me to see how the sequence evolves.

I ended up using a mix of higher-order functions and recursive functions. When it comes to time to filter out every kth item in the previous sequence, I generate a list of booleans with false in every kth position and true elsewhere. For example, '(#t #t #f #t #t #f ...) is the mask for the pass that filters out every third item. Then, this function:

    (define filter-with-mask
      (lambda (lon mask)
        (map car
             (filter (lambda (p) (cdr p))
                     (map cons lon mask)))))

applies a mask to a list of numbers lon and produces an updated list of numbers. The rest of the code is a recursive function that, on each pass, applies the next mask in sequence to the list remaining after the previous pass.

I won't be surprised if my mathematician friends, or readers who are better Racket programmers, can suggest a more elegant way to "strain" a list, but, hey, I'm a programmer. This is a simple enough approach that does the job. The code is reasonably fast and flexible.

In any case, it was a nice way to spend an hour after a long day on campus. My wife just smiled when I told her what I was going to do to relax that evening. She knows me well.


Posted by Eugene Wallingford | Permalink | Categories: Computing

February 20, 2023 10:18 AM

Commands I Use

Catching up on articles in my newsreader, I ran across Commands I Use by @gvwilson. That sounded like fun, and I was game:

    $ history | awk '{print $2}' | sort | uniq -c | sort -nr > commands.txt

The first four items on my list are essentially the same as Wilson's, and there are a lot of other similarities, too. I don't think this is surprising, given how Unix works and how much sense git makes for software developers to use.

  • git    - same caveat as Wilson. Next time, I may look at the next field (the subcommand) and flesh this out.
  • ll    - my shorthand for ls -al
  • emacs    - less of a cheat for me. I mostly edit in emacs.
  • cd
  • mv
  • rmbak    - my shorthand for rm *~, a form of Wilson's clean
  • cpsync    - my shorthand for copying a file to a folder for syncing to my office machine
  • popd
  • pushd    - I pushd and popd a lot...
  • open
  • dirs    - ... which means occasionally checking the stack
  • mvsync    - similar to cpsync but also moves the file (often from the desktop to its permanent home)
  • tgz    - a 5-line script that bundles the sync folder used by cpsync and mvsync
  • cp
  • cls
  • pwd
  • close-journal.py    - a substantial Python script; part of my homegrown family accounting system
  • rm    - aliased to ask for confirmation before deleting
  • cat
  • /bin/rm    - the unaliased command nukes a file with no shame
  • more
  • xattr
  • python3    - same caveat as Wilson
  • gzt    - an unbundling script
  • mkdir
  • gooffice    - my shorthand for sshing into my office machine

It's interesting to see that I use rm and /bin/rm in roughly even measure. I would have guessed that I used the guarded command in higher proportion.

At the bottom of the tally are a few items I don't use often, or don't generally launch from the command line:

  • touch
  • racket
  • npm
  • idle3
  • chmod

... and a bunch of typos, including:

  • pops
  • ce
  • nv
  • nl
  • mdir
  • emaxs, emavs, emacd,

That was fun! Thanks to Greg for the prompt.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

February 11, 2023 1:53 PM

What does it take to succeed as a CS student?

Today I received an email message similar to this:

I didn't do very well in my first semester, so I'm looking for ways to do better this time around. Do you have any ideas about study resources or tips for succeeding in CS courses?

As an advisor, I'm occasionally asked by students for advice of this sort. As department head, I receive even more queries, because early on I am the faculty member students know best, from campus visits and orientation advising.

When such students have already attempted a CS course or two, my first step is always to learn more about their situation. That way, I can offer suggestions suited to their specific needs.

Sometimes, though, the request comes from a high school student, or a high school student's parent: What is the best way to succeed as a CS student?

To be honest, most of the advice I give is not specific to a computer science major. At a first approximation, what it takes to succeed as a CS student is the same as what it takes to succeed as a student in any major: show up and do the work. But there are a few things a CS student does that are discipline-specific, most of which involve the tools we use.

I've decided to put together a list of suggestions that I can point our students to, and to which I can refer occasionally in order to refresh my mind. My advice usually includes one or all of these suggestions, with a focus on students at the beginning of our program:

  • Go to every class and every lab session. This one comes first because it should go without saying, but sometimes saying it helps. Students don't always have to go to their other courses every day in order to succeed.

  • Work steadily on a course. Do a little work on your course, both programming and reading or study, frequently -- every day, if possible. This gives your brain a chance to see patterns more often and learn more effectively. Cramming may help you pass a test, but it doesn't usually help you learn how to program or make software.

  • Ask your professor questions sooner rather than later. Send email. Visit office hours. This way, you get answers sooner and don't end up spinning your wheels while doing homework. Even worse, feeling confused can lead you to shy away from doing the work, which gets in the way of working steadily.

  • Get to know your programming environment. When programming in Python, simply feeling comfortable with IDLE, and with the file system where you store your programs and data, can make everything else seem easier. Your mind doesn't have to look up basic actions or worry about details, which enables you to make the most of your programming time: working on the assigned task.

  • Spend some of your study time with IDLE open. Even when you aren't writing a program, the REPL can help you! It lets you try out snippets of code from your reading, to see them work. You can run small experiments of your own, to see whether you understand syntax and operators correctly. You can make up your own examples to fill in the gaps in your understanding of the problem.

    Getting used to trying things out in the interactions window can be a huge asset; a short example follows this list. This is one of the touchstones of being a successful CS student.
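
To make that last suggestion concrete, here is the kind of two-minute experiment I have in mind, typed straight into the interactions window (error tracebacks trimmed):

    >>> "5" + 3                 # will Python convert the string for me?
    TypeError: can only concatenate str (not "int") to str
    >>> int("5") + 3            # no -- I have to convert it myself
    8
    >>> "hello"[0], "hello"[4], len("hello")    # how does string indexing work?
    ('h', 'o', 5)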

That's what came to mind at the end of a Friday, at the end of a long week, when I sat down to give advice to one student. I'd love to hear your suggestions for improving the suggestions in my list, or other bits of advice that would help our students. Email me your ideas, and I'll make my list better for anyone who cares to read it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

January 29, 2023 7:51 AM

A Thousand Feet of Computing

Cory Doctorow, in a recent New Yorker interview, reminisces about learning to program. The family had a teletype and a modem.

My mom was a kindergarten teacher at the time, and she would bring home rolls of brown bathroom hand towels from the kid's bathroom at school, and we would feed a thousand feet of paper towel into the teletype and I would get a thousand feet of computing after school at the end of the day.

Two things:

  • Tsk, tsk, Mom. Absconding with school supplies, even if for a noble cause! :-) Fortunately, the statute of limitations on pilfering paper hand towels has likely long since passed.

  • I love the idea of doing "a thousand feet of computing" each day. What a wonderful phrase. With no monitor, the teletype churns out paper for every line of code, and every line the code produces. You know what they say: A thousand feet a day makes a happy programmer.

The entire interview is a good read on the role of computing in modern society. The programmer in me also resonated with this quote from Doctorow's 2008 novel, Little Brother:

If you've never programmed a computer, there's nothing like it in the whole world. It's awesome in the truest sense: it can fill you with awe.

My older daughter recommended Little Brother to me when it first came out. I read many of her recommendations promptly, but for some reason this one sits on my shelf unread. (The PDF also sits in my to-read/ folder, unread.) I'll move it to the top of my list.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 18, 2023 2:46 PM

Prompting AI Generators Is Like Prompting Students

Ethan Mollick tells us how to generate prompts for programs like ChatGPT and DALL-E: give direct and detailed instructions.

Don't ask it to write an essay about how human error causes catastrophes. The AI will come up with a boring and straightforward piece that does the minimum possible to satisfy your simple demand. Instead, remember you are the expert and the AI is a tool to help you write. You should push it in the direction you want. For example, provide clear bullet points to your argument: write an essay with the following points: -Humans are prone to error -Most errors are not that important -In complex systems, some errors are catastrophic -Catastrophes cannot be avoided

But even the results from such a prompt are much less interesting than if we give a more explicit prompt. For instance, we might add:

use an academic tone. use at least one clear example. make it concise. write for a well-informed audience. use a style like the New Yorker. make it at least 7 paragraphs. vary the language in each one. end with an ominous note.

This reminds me of setting essay topics for students, either for long-form writing or for exams. If you give a bland uninteresting question, you will generally get a bland uninteresting answer. Such essays are hard to evaluate. A squooshy question allows the student to write almost anything in response. Students are usually unhappy in this scenario, too, because they don't know what you want them to write, or how they will be evaluated.

Asking a human a more specific question has downsides, though. It increases the cognitive load placed on them, because there are more things for them to be thinking about as they write. Is my tone right? Does this sound like the New Yorker? Did I produce the correct number of paragraphs? Is my essay factually accurate? (ChatGPT doesn't seem to worry about this one...) The tradeoff is clearer expectations. Many students prefer this trade, at least on longer form assignments when they have time to consider the specific requests. A good spec reduces uncertainty.

Maybe these AI programs are closer to human than we think after all. (Some people don't worry much about correctness either.)

~~~~

On a related note: As I wrote elsewhere, I refuse to call ChatGPT or any program "an AI". The phrase "artificial intelligence" is not a term for a specific agent; it is the name of an idea. Besides, none of these programs are I yet, only A.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 10, 2023 2:20 PM

Are Major Languages' Parsers Implemented By Hand?

Someone on Mastodon posted a link to a 2021 survey of how the parsers for major languages are implemented. Are they written by hand, or automated by a parser generator? The answer was mixed: a few are generated by yacc-like tools (some of which were custom built), but many are written by hand, often for speed.

My two favorite notes:

Julia's parser is handwritten but not in Julia. It's in Scheme!

Good for the Julia team. Scheme is a fine language in which to write -- and maintain -- a parser.

Not only [is Clang's parser] handwritten but the same file handles parsing C, Objective-C and C++.

I haven't clicked through to the source code for Clang yet but, wow, that must be some file.

Finally, this closing comment in the paper hit close to the heart:

Although parser generators are still used in major language implementations, maybe it's time for universities to start teaching handwritten parsing?

I have persisted in having my compiler students write table-driven parsers by hand for over fifteen years. As I noted in this post at the beginning of the 2021 semester, my course is compilers for everyone in our major, or potentially so. Most of our students will not write another compiler in their careers, and traditional skills like implementing recursive descent and building a table-driven program are valuable to them more generally than knowing yacc or bison. Any of my compiler students who do eventually want to use a parser generator are well prepared to learn how, and they'll understand what's going on when they do, to boot.
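
For readers who have never written a parser by hand, here is the flavor of the technique. This is a generic illustration I sketched for this post, in Python rather than anything from my course: a recursive descent evaluator for a tiny grammar of sums and products, with one function per grammar rule.

    # A generic sketch of handwritten recursive descent, not course code.
    # Grammar:  expr   -> term { '+' term }
    #           term   -> factor { '*' factor }
    #           factor -> NUMBER | '(' expr ')'
    # Tokens are whitespace-separated, to keep the sketch short.

    def parse(source):
        value, rest = expr(source.split())
        assert rest == [], "unexpected tokens at end of input"
        return value

    def expr(tokens):                    # expr -> term { '+' term }
        value, tokens = term(tokens)
        while tokens and tokens[0] == '+':
            right, tokens = term(tokens[1:])
            value += right
        return value, tokens

    def term(tokens):                    # term -> factor { '*' factor }
        value, tokens = factor(tokens)
        while tokens and tokens[0] == '*':
            right, tokens = factor(tokens[1:])
            value *= right
        return value, tokens

    def factor(tokens):                  # factor -> NUMBER | '(' expr ')'
        if tokens[0] == '(':
            value, tokens = expr(tokens[1:])
            assert tokens and tokens[0] == ')', "missing close paren"
            return value, tokens[1:]
        return int(tokens[0]), tokens[1:]

    # parse("2 + 3 * ( 4 + 1 )") evaluates to 17

Each function mirrors one rule of the grammar, which is a big part of what makes handwritten parsers so pleasant to read and to debug.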

My course is so old school that it's back to the forefront. I just had to be patient.

(I posted the seeds of this entry on Mastodon. Feel free to comment there!)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

January 05, 2023 12:15 PM

A Family of Functions from a Serendipitous Post, and Thoughts about Teaching

Yesterday, Ben Fulton posted on Mastodon:

TIL: C++ has a mismatch algorithm that returns the first non-equal pair of elements from two sequences. ...

C++'s mismatch was new to me, too, so I clicked through to the spec on cppreference.com to read a bit more. I learned that mismatch is an algorithm implemented as a template function with several different signatures. My thoughts turned immediately to my spring course, Programming Languages, which starts with an introduction to Racket and functional programming. mismatch would make a great example or homework problem for my students, as they learn to work with Racket lists and functions! I stopped working on what I was doing and used the C++ spec to draw up a family of functions for my course:

    ; Return the first mismatching pair of elements from two lists.
    ; Compare using eq?.
    ;   (mismatch lst1 lst2)
    ;
    ; Compare using a given binary predicate comp?.
    ;   (mismatch comp? lst1 lst2)
    ;
    ; Compare using a given binary predicate comp?,
    ; as a higher-order function.
    ;   ((make-mismatch comp?) lst1 lst2)
    ;
    ; Return the first mismatching pair of elements from two ranges,
    ; also as a higher-order function.
    ; If last2 is not provided, it denotes first2 + (last1 - first1).
    ;   (make-mismatch first1 last1 first2 [last2]) -> (f lst1 lst2)

Of course, this list is not exhaustive, only a start. With so many related possibilities, mismatch will make a great family of examples or homework problems for the course! What a fun distraction from the other work in my backlog.
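
For anyone who has not met the C++ algorithm, here is a rough sketch of the behavior behind the first two variants above, in Python rather than the Racket my students will use. It is only an illustration, not a solution to the homework.

    def mismatch(lst1, lst2, comp=None):
        # return the first pair of corresponding elements for which
        # comp (equality, by default) fails; None if no such pair exists
        same = comp if comp is not None else (lambda a, b: a == b)
        for a, b in zip(lst1, lst2):
            if not same(a, b):
                return (a, b)
        return None

    # mismatch([1, 2, 3], [1, 9, 3])                        => (2, 9)
    # mismatch([2, 4, 7], [2, 4, 6], lambda a, b: a <= b)   => (7, 6)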

Ben's post conveniently arrived in the middle of an email discussion with the folks who teach our intro course, about ChatGPT and the role it will play in Intro. I mentioned ChatGPT in a recent post suggesting that we all think about tools like ChatGPT and DALL-E from the perspective of cultural adaptation: how do we live with new AI tools knowing that we change our world to accommodate our technologies? In that post, I mentioned only briefly the effect that these tools will have on professors, their homework assignments, and the way we evaluate student competencies and performance. The team preparing to teach Intro this spring has to focus on these implications now because they affect how the course will work. Do we want to mitigate the effects of ChatGPT and, if so, how?

I think they have decided mostly to take a wait-and-see approach this semester. We always have a couple of students who do not write their own code, and ChatGPT offers them a new way not to do so. When we think students have not written the code they submitted, we talk with them. In particular, we discuss the code and ask the student to explain or reason about it.

Unless the presence of ChatGPT greatly increases the number of students submitting code they didn't write, this approach should continue to work. I imagine we will be fine. Most students want to learn; they know that writing code is where they learn the most. I don't expect that access to ChatGPT will change the number of students taking shortcuts, at least not by much. Let's trust our students as we keep a watchful eye out for changes in behavior.

The connection between mismatch and the conversation about teaching lies in the role that a family of related functions such as mismatch can play in building a course that is more resistant to the use of AI assistants in a way that harms student learning. I already use families of related function specs as a teaching tool in my courses, for purely pedagogical reasons. Writing different versions of the same function, or seeing related functions used to solve slightly different problems, is a good way to help students deepen understanding of an idea or to help them make connections among different concepts. My mismatches give me another way to help students in Programming Languages learn about processing lists, passing functions as arguments, returning functions as values, and accepting a variable number of arguments. I'm curious to see how this family of functions works for students.

A set of related functions also offers professors a tool for determining whether students have learned to write code. We already ask students in our intro course to modify code. Asking students to convert a function with one spec into a function with a slightly different spec, like writing different versions of the same function, gives them the chance to benefit from understanding the existing code. It is easier for a programmer to modify a function if they understand it. The existing code is a scaffold that enables the student to focus on the single feature or concept they need to write the new code.

Students who have not written code like the code they are modifying have a harder time reading and modifying the given code, especially when operating under any time or resource limitation. In a way, code modification exercises do something similar to asking students to explain code to us: the modification task exposes when students don't understand code they claim to have written.

Having ChatGPT generate a function for you won't be as valuable if you will soon be asked to explain the code in detail or to modify the code in a way that requires you to understand it. Increasing the use of modification tasks is one way to mitigate the benefits of a student having someone else write the code for them. Families of functions such as mismatch above are a natural source of modification tasks.

Beyond the coming semester, I am curious how our thinking about writing code will evolve in the presence of ChatGPT-like tools. Consider the example of auto-complete facilities in our editors. Few people these days think of using auto-complete as cheating, but when it first came out many professors were concerned that using auto-complete was a way for students not to learn function signatures and the details of standard libraries. (I'm old enough to still have a seed of doubt about auto-complete buried somewhere deep in my mind! But that's just me.)

If LLM-based tools become the new auto-complete, one level up from function signatures, then how we think about programming will probably change. Likewise how we think about teaching programming... or not. Did we change how we teach much as a result of auto-complete?

The existence of ChatGPT is a bit disconcerting for today's profs, but the long-term implications are kind of interesting.

In the meantime, coming across example generators like C++'s mismatch helps me deal with the new challenge and gives me unexpected fun writing code and problem descriptions.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 11, 2022 9:09 AM

Living with AI in a World Where We Change the World to Accommodate Our Technologies

My social media feeds are full of ChatGPT screenshots and speculation these days, as they have been with LLMs and DALL-E and other machine learning-based tools for many months. People wonder what these tools will mean for writers, students, teachers, artists, and anyone who produces ordinary text, programs, and art.

These are natural concerns, given their effect on real people right now. But if you consider the history of human technology, they miss a bigger picture. Technologies often eliminate the need for a certain form of human labor, but they just as often create a new form of human labor. And sometimes, they increase the demand for the old kind of labor! If we come to rely on LLMs to generate text for us, where will we get the text with which to train them? Maybe we'll need people to write even more replacement-level prose and code!

As Robin Sloan reminds us in the latest edition of his newsletter, A Year of New Avenues, we redesign the world to fit the technologies we create and adopt.

Likewise, here's a lesson from my work making olive oil. In most places, the olive harvest is mechanized, but that's only possible because olive groves have been replanted to fit the shape of the harvesting machines. A grove planted for machine harvesting looks nothing like a grove planted for human harvesting.

Which means that our attention should be on how programs like GPT-2 might lead us to redesign the world we live and work in better to accommodate these new tools:

For me, the interesting questions sound more like
  • What new or expanded kinds of human labor might AI systems demand?
  • What entirely new activities do they suggest?
  • How will the world now be reshaped to fit their needs?
That last question will, on the timescale of decades, turn out to be the most consequential, by far. Think of cars ... and of how dutifully humans have engineered a world just for them, at our own great expense. What will be the equivalent, for AI, of the gas station, the six-lane highway, the parking lot?

Many professors worry that ChatGPT makes their homework assignments and grading rubrics obsolete, which is a natural concern in the short run. I'm old enough that I may not live to work in a world with the AI equivalent of the gas station, so maybe that world seems too far in the future to be my main concern. But the really interesting questions for us to ask now revolve around how tools such as these will lead us to redesign our worlds to accommodate and even serve them.

Perhaps, with a little thought and a little collaboration, we can avoid engineering a world for them at our own great expense. How might we benefit from the good things that our new AI technologies can provide us while sidestepping some of the highest costs of, say, the auto-centric world we built? Trying to answer that question is a better long-term use of our time and energy than fretting about our "Hello, world!" assignments and advertising copy.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

December 04, 2022 9:18 AM

If Only Ants Watched Netflix...

In the essay "On Societies as Organisms", Lewis Thomas says that we "violate science" when we try to read human meaning into the structures and behaviors of insects. But it's hard not to:

Ants are so much like human beings as to be an embarrassment. They farm fungi, raise aphids as livestock, launch armies into wars, use chemical sprays to alarm and confuse enemies, capture slaves. The families of weaver ants engage in child labor, holding their larvae like shuttles to spin out the thread that sews the leaves together for their fungus gardens. They exchange information ceaselessly. They do everything but watch television.

I'm not sure if humans should be embarrassed for still imitating some of the less savory behaviors of insects, or if ants should be embarrassed for reflecting some of the less savory behaviors of humans.

Biology has never been my forte, so I've read and learned less about it than many other sciences. Enjoying chemistry a bit at least helped keep me within range of the life sciences. I was fortunate to grow up in the Digital Age.

But with many people thinking the 21st century will be the Age of Biology, I feel like I should get more in tune with the times. I picked up Thomas's now classic The Lives of a Cell, in which the quoted essay appears, as a brief foray into biological thinking about the world. I'm only a few pages in, but it is striking a chord. I can imagine so many parallels with computing and software. Perhaps I can be as at home in the 21st century as I was in the 20th.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

November 23, 2022 1:27 PM

I Can't Imagine...

I've been catching up on some items in my newsreader that went unread last summer while I rode my bike outdoors rather than inside. This passage from a blog post by Fred Wilson at AVC touched on a personal habit I've been working on:

I can't imagine an effective exec team that isn't in person together at least once a month.

I sometimes fall into a habit of saying or thinking "I can't imagine...". I'm trying to break that habit.

I don't mean to pick on Wilson, whose short posts I enjoy for insight into the world of venture capital. "I can't imagine" is a common trope in both spoken and written English. Some writers use it as a rhetorical device, not as a literal expression. Maybe he meant it that way, too.

For a while now, though, I've been trying to catch myself whenever I say or think "I can't imagine...". Usually my mind is simply being lazy, or too quick to judge how other people think or act.

It turns out that I usually can imagine, if I try. Trying to imagine how that thinking or behavior makes sense helps me see what other people might be thinking, what their assumptions or first principles are. Even when I end up remaining firm in my own way of thinking, trying to imagine usually puts me in a better position to work with the other person, or explain my own reasoning to them more effectively.

Trying to imagine can also give me insight into the limits of my own thinking. What assumptions am I making that lead me to have confidence in my position? Are those assumptions true? If yes, when might they not be true? If no, how do I need to update my thinking to align with reality?

When I hear someone say, "I can't imagine..." I often think of Russell and Norvig's textbook Artificial Intelligence: A Modern Approach, which I used for many years in class [1]. At the end of one of the early chapters, I think, they mention critics of artificial intelligence who can't imagine the field of AI ever accomplishing a particular goal. They respond cheekily to the effect of, "This says less about AI than it says about the critics' lack of imagination." I don't think I'd ever seen a textbook dunk on anyone before, and as a young prof and open-minded AI researcher, I very much enjoyed that line [2].

Instead of saying "I can't imagine...", I am trying to imagine. I'm usually better off for the effort.

~~~~

[1] The Russell and Norvig text first came out in 1995. I wonder if the subtitle "A Modern Approach" is still accurate... Maybe theirs is now a classical approach!

[2] I'll have to track that passage down when I am back in my regular office and have access to my books. (We are in temporary digs this fall due to construction.) I wonder if AI has accomplished the criticized goal in the time since Russell and Norvig published their book. AI has reached heights in recent years that many critics in 1995 could not imagine. I certainly didn't imagine a computer program defeating a human expert at Go in my lifetime, let alone learning to do so almost from scratch! (I wrote about AlphaGo and its intersection with my ideas about AI a few times over the years: [ 01/2016 | 03/2016 | 05/2017 | 05/2018 ].)


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

October 31, 2022 6:11 PM

The Inventor of Assembly Language

This weekend, I learned that Kathleen Booth, a British mathematician and computer scientist, invented assembly language. An October 29 obituary reported that Booth died on September 29 at the age of 100. By 1950, when she received her PhD in applied mathematics from the University of London, she had already collaborated on building at least two early digital computers. But her contributions weren't limited to hardware:

As well as building the hardware for the first machines, she wrote all the software for the ARC2 and SEC machines, in the process inventing what she called "Contracted Notation" and would later be known as assembly language.

Her 1958 book, Programming for an Automatic Digital Calculator, may have been the first one on programming written by a woman.

I love the phrase "Contracted Notation".

Thanks to several people in my Twitter feed for sharing this link. Here's hoping that Twitter doesn't become uninhabitable, or that a viable alternative arises; otherwise, I'm going to miss out on a whole lotta learning.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 23, 2022 9:54 AM

Here I Go Again: Carmichael Numbers in Graphene

I've been meaning to write a post about my fall compilers course since the beginning of the semester but never managed to set aside time to do anything more than jot down a few notes. Now we are at the end of Week 9, and I simply must write. Long-time readers know what motivates me most: a fun program to write in my students' source language!

TFW you run across a puzzle and all you want to do now is write a program to solve it. And teach your students about the process.
-- https://twitter.com/wallingf/status/1583841233536884737

Yesterday, it wasn't a puzzle so much as discovering a new kind of number, Carmichael numbers. Of course, I didn't discover them (neither did Carmichael, though); I learned of them from a Quanta article about a recent proof about these numbers that masquerade as primes. One way of defining this set comes from Korselt:

A positive composite integer n is a Carmichael number if and only if it has multiple prime divisors, no prime divisor repeats, and for each prime divisor p, p-1 divides n-1.

This definition is relatively straightforward, and I quickly imagined an imperative solution with a loop and a list. The challenge of writing a program to verify that a number is a Carmichael number in my compiler course's source language is that it has neither of these things. It has no data structures or even local variables; only basic integer and boolean arithmetic, if expressions, and function calls.

Challenge accepted. I've written many times over the years about the languages I ask my students to write compilers for and about my adventures programming in them, from Twine last year through Flair a few years ago to a recurring favorite, Klein, which features prominently in popular posts about Farey sequences and excellent numbers.

This year, I created a new language, Graphene, for my students. It is essentially a small functional subset of Google's new language Carbon. But like its predecessors, it is something of an integer assembly language, fully capable of working with statements about integers and primes. Korselt's description of Carmichael numbers is right in Graphene's sweet spot.

As I wrote in the post about Klein and excellent numbers, my standard procedure in cases like this is to first write a reference program in Python using only features available in Graphene. I must do this if I hope to debug and test my algorithm, because we do not have a working Graphene compiler yet! (I'm writing my compiler in parallel with my students, which is one of the key subjects in my phantom unwritten post.) I was proud this time to write my reference program in a Graphene-like subset of Python from scratch. Usually I write a Pythonic solution, using loops and variables, as a way to think through the problem, and then massage that code down to a program using a limited set of concepts. This time, I started with a short procedural outline:

    # walk up the primes to n
    #   - find a prime divisor p:
    #     - test if a repeat         (yes: fail)
    #     - test if p-1 divides n-1  (no : fail)
    # return # prime divisors > 1
and then implemented it in a single recursive function. The first few tests were promising. My algorithm rejected many small numbers not in the set, and it correctly found 561, the smallest Carmichael number. But when I tested all the numbers up to 561, I saw many false positives. A little print-driven debugging found the main bug: I was stopping too soon in my search for prime divisors, at sqrt(n), due to some careless thinking about factors. Once I fixed that, boom, the program correctly handled all n up to 3000. I don't have a proof of correctness, but I'm confident the code is correct. (Famous last words, I know.)
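
Here is a minimal sketch of the idea, not the reference program itself, written in a Graphene-like subset of Python: a single recursive helper, no loops, no data structures. The function names are mine.

    def carmichael(n):
        return n > 3 and check(n, 2, n, 0)

    def check(n, p, m, count):
        # n: candidate; p: trial divisor; m: unfactored part of n;
        # count: distinct prime divisors found so far
        if p * p > m:                              # m itself is prime
            return (n - 1) % (m - 1) == 0 and count + 1 > 1
        if m % p != 0:
            return check(n, p + 1, m, count)       # move to the next divisor
        if (m // p) % p == 0:
            return False                           # a repeated prime divisor
        if (n - 1) % (p - 1) != 0:
            return False                           # Korselt's condition fails
        return check(n, p + 1, m // p, count + 1)  # move to the next quotient

    # carmichael(561) is True; 561 = 3 * 11 * 17 is the smallest Carmichael number.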

As I tested the code, it occurred to me that my students have a chance to one-up standard Python. Its rather low default recursion limit prevented my program from reaching even n=1000. When I set sys.setrecursionlimit(5000), my program found the first five Carmichael numbers: 561, 1105, 1729, 2465, and 2821. Next come 6601 and 8911; I'll need a lot more stack frames to get there.

All those stack frames are unnecessary, though. My main "looping" function is beautifully tail recursive: two failure cases, the final answer case checking the number of prime divisors, and two tail-recursive calls that move either to the next prime as a potential factor or to the next quotient when we find one. If my students implement proper tail calls -- an optimization that is optional in the course but encouraged by their instructor with gusto -- then their compiler will enable us to solve for values up to the maximum integer in the language, 2^31 - 1. We'll be limited only by the speed of the target language's VM, and the speed of the target code the compiler generates. I'm pretty excited.

Now I have to resist the urge to regale my students with this entire story, and with more details about how I approach programming in a language like Graphene. I love to talk shop with students about design and programming, but our time is limited... My students are already plenty busy writing the compiler that I need to run my program!

This lark resulted in an hour or so writing code in Python, a few more minutes porting to Graphene, and an enjoyable hour writing this blog post. As the song says, it was a good day.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

July 31, 2022 8:54 AM

Caring about something whittles the world down to a more manageable size

In The Orchid Thief, there is a passage where author Susan Orlean describes a drive across south Florida on her way to a state preserve, where she'll be meeting an orchid hunter. She ends the passage this way:

The land was marble-smooth and it rolled without a pucker to the horizon. My eyes grazed across the green band of ground and the blue bowl of sky and then lingered on a dead tire, a bird in flight, an old fence, a rusted barrel. Hardly any cars came toward me, and I saw no one in the rearview mirror the entire time. I passed so many vacant acres and looked past them to so many more vacant acres and looked ahead and behind at the empty road and up at the empty sky; the sheer bigness of the world made me feel lonely to the bone. The world is so huge that people are always getting lost in it. There are too many ideas and things and people, too many directions to go. I was starting to believe that the reason it matters to care passionately about something is that it whittles the world down to a more manageable size. It makes the world seem not huge and empty but full of possibility. If I had been an orchid hunter I wouldn't have seen this space as sad-making and vacant--I think I would have seen it as acres of opportunity where the things I loved were waiting to be found.

John Laroche, the orchid hunter at the center of The Orchid Thief, comes off as obsessive, but I think many of us know that condition. We have found an idea or a question or a problem that grabs our attention, and we work on it for years. Sometimes, we'll follow a lead so far down a tunnel that it feels a lot like the swamps Laroche braves in search of the ghost orchid.

Even a field like computer science is big enough that it can feel imposing if a person doesn't have a specific something to focus their attention and energy on. That something doesn't have to be forever... Just as Laroche had cycled through a half-dozen obsessions before turning his energy to orchids, a computer scientist can work deeply in an area for a while and then move onto something else. Sometimes, there is a natural evolution in the problems one focuses on, while other times people choose to move into a completely different sub-area. I see a lot of people moving into machine learning these days, exploring how it can change the sub-field they used to focus exclusively on.

As a prof, I am fortunate to be able to work with young adults as they take their first steps in computer science. I get to watch many of them find a question they want to answer, a problem they want to work on for a few years, or an area they want to explore in depth until they master it. It's also sad, in a way, to work with a student who never quite finds something that sparks their imagination. A career in software, or anything, really, can look as huge and empty as Orlean's drive through south Florida if someone doesn't care deeply about something. When they do, the world seems not huge and empty, but full of possibility.

I'm about halfway through The Orchid Thief and am quite enjoying it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

July 06, 2022 4:13 PM

When Dinosaurs Walked the Earth

a view of UNI's campanile from just west of the building that houses Computer Science

Chad Orzel, getting ready to begin his 22nd year as a faculty member at Union College, muses:

It's really hard to see myself in the "grizzled veteran" class of faculty, though realistically, I'm very much one of the old folks these days. I am to a new faculty member starting this year as someone hired in 1980 would've been to me when I started, and just typing that out makes me want to crumble into dust.

I'm not the sort who likes to one-up another blogger, but... I can top this, and crumble into a bigger, or at least older, pile of dust.

In May, I finished my 30th year as a faculty member. I am as old to a 2022 hire as someone hired in 1962 would have been to me. Being in computer science, rather than physics or another of the disciplines older than CS, this is an even bigger gap culturally than it first appears. The first Computer Science department in the US was created in 1962. In 1992, my colleagues who started in the 1970s seemed pretty firmly in the old guard, and the one CS faculty member from the 1960s had just retired, opening the line into which I was hired.

Indeed, our Department of Computer Science only came into existence in 1992. Prior to that, the CS faculty had been offering CS degrees for a little over a decade as part of a combined department with Mathematics. (Our department even has a few distinguished alums who graduated pre-1981, with CS degrees that are actually Math degrees with a "computation emphasis".) A new department head and I were hired for the department's first year as a standalone entity, and then we hired two more faculty the next year to flesh out our offerings.

So, yeah. I know what Chad means when he says "just typing that out makes me want to crumble into dust", and then some.

On the other hand, it's kind of cool to see how far computer science has come as an academic discipline in the last thirty years. It's also cool that I am still excited to learn new things and to work with students as they learn them, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

July 03, 2022 9:00 AM

The Difference Between Medicine and Being a Doctor Is Like ...

Morgan Housel's recent piece on experts trying too hard includes a sentence that made me think of what I teach:

A doctor once told me the biggest thing they don't teach in medical school is the difference between medicine and being a doctor — medicine is a biological science, while being a doctor is often a social skill of managing expectations, understanding the insurance system, communicating effectively, and so on.

Most of the grads from our CS program go on to be software developers or to work in software-adjacent jobs like database administrator or web developer. Most of the rest work in system administration or networks. Few go on to be academic computer scientists. Just as Housel's doctor said about medicine, there is a difference between academic computer science and being a software developer.

The good news is, I think the CS profs at many schools are aware of this, and the schools have developed computer science programs that at least nod at the difference in their coursework. The CS program at my school has a course on software engineering that is more practical than theoretical, and another short course that teaches practical tools such as version control, automated testing, and build tools, and skills such as writing documentation. All of our CS majors complete a large project course in which students work in teams to create a piece of software or a secure system, and the practical task of working as a team to deliver a piece of working software is a focus of the course. On top of those courses, I think most of our profs try to keep their courses real for students they know will want to apply what they learn in Algorithms, say, or Artificial Intelligence in their jobs as developers.

Even so, there is always a tension in classes between building foundational knowledge and building practical skills. I encounter this tension in both Programming Languages and Translation of Programming Languages. There are a lot of cool things we could learn about type theory, some of which might turn out to be quite useful in a forty-year career as a developer. But any time we devote to going deeper on type theory is time we can't devote to the concrete language skills of a software developer, such as learning and creating APIs or the usability of programming languages.

So, we CS profs have to make design trade-offs in our courses as we try to balance the forces of students learning computer science and students becoming software developers. Fortunately, we learn a little bit about recognizing, assessing, and making trade-offs in our work both as computer scientists and as programmers. That doesn't make it easy, but at least we have some experience for thinking about the challenge.

The sentence quoted above reminds me that other disciplines face a similar challenge. Knowing computer science is different from being a software developer, or sysadmin. Knowing medicine is different from being a doctor. And, as Housel explains so well in his own work, knowing finance is different from being an investor, which is usually more about psychology and self-awareness than it is about net present value or alpha ratios. (The current stock market is a rather harsh reminder of that.)

Thanks to Housel for the prompt. The main theme of his piece — that experts make mistakes that novices can't make, which leads to occasional unexpected outcomes — is the topic for another post.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

April 16, 2022 2:32 PM

More Fun, Better Code: A Bug Fix for my Pair-as-Set Implementation

In my previous post, I wrote joyously of a fun bit of programming: implementing ordered pairs using sets.

Alas, there was a bug in my solution. Thanks to Carl Friedrich Bolz-Tereick for finding it so quickly:

Heh, this is fun, great post! I wonder what happens though if a = b? Then the set is {{a}}. first should still work, but would second fail, because the set difference returns the empty set?

Carl Friedrich had found a hole in my small set of tests, which sufficed for my other implementations because the data structures I used have separate cells for the first and second parts of the pair. A set will do that only if the first and second parts are different!

Obviously, enforcing a != b is unacceptable. My first thought was to guard second()'s behavior in code:

    if my formula finds a result
       then return that result
       else return (first p)

This feels like a programming hack. Furthermore, it results in an impure implementation: it uses a boolean value and an if expression. But it does seem to work. That would have to be good enough unless I could find a better solution.

Perhaps I could use a different representation of the pair. Helpfully, Carl Friedrich followed up with pointers to several blog posts by Mark Dominus from last November that looked at the set encoding of ordered pairs in some depth. One of those posts taught me about another possibility: Wiener pairs. The idea is this:

    (a,b) = { {{a},∅}, {{b}} }

Dominus shows how Wiener pairs solve the a == b edge case in Kuratowski pairs, which makes it a viable alternative.

Would I ever have stumbled upon this representation, as I did onto the Kuratowski pairs? I don't think so. The representation is more complex, with higher-order sets. Even worse for my implementation, the formulas for first() and second() are much more complex. That makes it a lot less attractive to me, even if I never want to show this code to my students. I myself like to have a solid feel for the code I write, and this is still at the fringe of my understanding.

Fortunately, as I read more of Dominus's posts, I found there might be a way to save my Kuratowski-style solution. It turns out that the if expression I wrote above parallels the set logic used to implement a second() accessor for Kuratowski pairs: a choice between the set that works for a != b pairs and a fallback to a one-set solution.

From this Dominus post, we see the correct set expression for second() is:

the correct set expression for the second() function

... which can be simplified to:

an expression for the second() function simplified to a logical statement

The latter expression is useful for reasoning about second(), but it doesn't help me implement the function using set operations. I finally figured out what the former equation was saying: if (∪ p) is the same as (∩ p), then the answer comes from (∩ p); otherwise, it comes from their difference.

I realized then that I could not write this function purely in terms of set operations. The computation requires the logic used to make this choice. I don't know where the boundary lies between pure set theory and the logic in the set comprehension, but a choice based on a set-empty? test is essential.

In any case, I think I can implement my understanding of the set expression for second() as follows. If we define union-minus-intersection as:

    (set-minus (apply set-union p)
               (apply set-intersect p))
then:
    (second p) = (if (set-empty? union-minus-intersection)
                     (set-elem (apply set-intersect p))
                     (set-elem union-minus-intersection))

The then clause is the same as the body of first(), which must be true: if the union of the sets is the same as their intersection, then the answer comes from the intersection, just as first()'s answer does.

It turns out that this solution essentially implements my first code idea above: if my formula from the previous blog entry finds a result, then return that result. Otherwise, return first(p). The circle closes.

Success! Or, I should say: Success!(?) After having a bug in my original solution, I need to stay humble. But I think this does it. It passes all of my original tests as well as tests for a == b, which is the main edge case in all the discussions I have now read about set implementations of pairs. Here is a link to the final code file, if you'd like to check it out. I include the two simple test scenarios, for both a != b and a == b, as Rackunit tests.
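
For anyone who wants to convince themselves of the logic without loading the Racket file, here is a quick sanity check of the encoding using Python frozensets. The names kpair, kfirst, and ksecond are made up for this sketch; it mirrors the set expressions above, not the course code.

    def kpair(a, b):
        # Kuratowski encoding: (a, b) = { {a}, {a, b} }
        return frozenset([frozenset([a]), frozenset([a, b])])

    def kfirst(p):
        (x,) = frozenset.intersection(*p)     # sole member of the intersection
        return x

    def ksecond(p):
        union = frozenset.union(*p)
        inter = frozenset.intersection(*p)
        diff = union - inter
        (x,) = diff if diff else inter        # empty difference means a == b
        return x

    assert kfirst(kpair(1, 2)) == 1 and ksecond(kpair(1, 2)) == 2
    assert kfirst(kpair(5, 5)) == 5 and ksecond(kpair(5, 5)) == 5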

So, all in all, this was a very good week. I got to have some fun programming, twice. I learned some set theory, with help from a colleague on Twitter. I was also reacquainted with Mark Dominus's blog, the RSS feed for which I had somehow lost track of. I am glad to have it back in my newsreader.

This experience highlights one of the selfish reasons I like for students to ask questions in class. Sometimes, they lead to learning and enjoyment for me as well. (Thanks, Henry!) It also highlights one of the reasons I like Twitter. The friends we make there participate in our learning and enjoyment. (Thanks, Carl Friedrich!)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

April 13, 2022 2:27 PM

Programming Fun: Implementing Pairs Using Sets

Yesterday's session of my course was a quiz preceded by a fun little introduction to data abstraction. As part of that, I use a common exercise. First, I define a simple API for ordered pairs: (make-pair a b), (first p), and (second p). Then I ask students to brainstorm all the ways that they could implement the API in Racket.

They usually have no trouble thinking of the data structures they've been using all semester. Pairs, sure. Lists, yes. Does Racket have a hash type? Yes. I remind my students about vectors, which we have not used much this semester. Most of them haven't programmed in a language with records yet, so I tell them about structs in C and show them Racket's struct type. This example has the added benefit of seeing that Racket generates constructor and accessor functions that do the work for us.

The big payoff, of course, comes when I ask them about using a Racket function -- the data type we have used most this semester -- to implement a pair. I demonstrate three possibilities: a selector function (which also uses booleans), message passing (which also uses symbols), and pure functions. Most students look on these solutions, especially the one using pure functions, with amazement. I could see a couple of them actively puzzling through the code. That is one measure of success for the session.
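
For readers who have not seen the trick, the essence of the pure-function version looks something like this quick sketch, in Python rather than the Racket we use in class: the pair is a closure that remembers a and b, and each accessor passes in a function that selects one of them.

    def make_pair(a, b):
        return lambda selector: selector(a, b)   # the pair is a closure over a and b

    def first(p):
        return p(lambda a, b: a)

    def second(p):
        return p(lambda a, b: b)

    # p = make_pair(3, 4)   =>   first(p) == 3, second(p) == 4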

This year, one student asked if we could use sets to implement a pair. Yes, of course, but I had never tried that... Challenge accepted!

While the students took their quiz, I set myself a problem.

The first question is how to represent the pair. (make-pair a b) could return {a, b}, but with no order we can't know which is first and which is second. My students and I had just looked at a selector function, (if arg a b), which can distinguish between the first item and the second. Maybe my pair-as-set could know which item is first, and we could figure out that the second is the other.

So my next idea was (make-pair a b) = {{a, b}, a}. Now the pair knows both items, as well as which comes first, but I have a type problem. I can't apply set operations to all members of the set and will have to test the type of every item I pull from the set. This led me to try (make-pair a b) = {{a}, {a, b}}. What now?

My first attempt at writing (first p) and (second p) started to blow up into a lot of code. Our set implementation provides a way to iterate over the members of a set using accessors named set-elem and set-rest. In fine imperative fashion, I used them to start whacking out a solution. But the growing complexity of the code made clear to me that I was programming around sets, but not with sets.

When teaching functional programming style this semester, I have been trying a new tactic. Whenever we face a problem, I ask the students, "What function can help us here?" I decided to practice what I was preaching.

Given p = {{a}, {a, b}}, what function can help me compute the first member of the pair, a? Intersection! I just need to retrieve the singleton member of ∩ p:

    (first p) = (set-elem (apply set-intersect p))

What function can help me compute the second member of the pair, b? This is a bit trickier... I can use set subtraction, {a, b} - {a}, but I don't know which element in my set is {a, b} and which is {a}. Serendipitously, I just solved the latter subproblem with intersection.

Which function can help me compute {a, b} from p? Union! Now I have a clear path forward: (∪ p) – (∩ p):

    (second p) = (set-elem
                   (set-minus (apply set-union p)
                              (apply set-intersect p)))

I implemented these three functions, ran the tests I'd been using for my other implementations... and they passed. I'm not a set theorist, so I was not prepared to prove my solution correct, but the tests going all green gave me confidence in my new implementation.

Last night, I glanced at the web to see what other programmers had done for this problem. I didn't run across any code, but I did find a question and answer on the mathematics StackExchange that discusses the set theory behind the problem. The answer refers to something called "the Kuratowski definition", which resembles my solution. Actually, I should say that my solution resembles this definition, which is an established part of set theory. From the StackExchange page, I learned that there are other ways to express a pair as a set, though the alternatives look much more complex. I didn't know the set theory but stumbled onto something that works.

My solution is short and elegant. Admittedly, I stumbled into it, but at least I was using the tools and thinking patterns that I've been encouraging my students to use.

I'll admit that I am a little proud of myself. Please indulge me! Department heads don't get to solve interesting problems like this during most of the workday. Besides, in administration, "elegant" and "clever" solutions usually backfire.

I'm guessing that most of my students will be unimpressed, though they can enjoy my pleasure vicariously. Perhaps the ones who were actively puzzling through the pure-function code will appreciate this solution, too. And I hope that all of my students can at least see that the tools and skills they are learning will serve them well when they run into a problem that looks daunting.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

March 16, 2022 2:52 PM

A Cool Project For Manipulating Images, Modulo Wordle

Wordle has been the popular game du jour for a while now. Whenever this sort of thing happens, CS profs think, "How can I turn this into an assignment?" I've seen a lot of discussion of ways to create a Wordle knock-off assignment for CS 1. None of this conversation has interested me much. First, I rarely teach CS 1 these days, and a game such as Wordle doesn't fit into the courses I do teach right now. Second, there's a certain element of Yogi Berra wisdom driving me away from Wordle: "It's so popular; no one goes there any more."

Most important, I had my students implementing Mastermind in my OO course back in the 2000-2005 era. I've already had most of the fun I can have with this game as an assignment. And it was a popular one with students, though challenging. The model of the game itself presents challenges for representation and algorithm, and the UI has the sort of graphic panache that tends to engage students. I remember receiving email from a student who had transferred to another university, asking me if I would still help her debug and improve her code for the assignment; she wanted to show it to her family. (Of course I helped!)

However I did see something recently that made me think of a cool assignment: 5x6 Art, a Twitter account that converts paintings and other images into minimalist 5 block-by-6 block abstract grids. The connection to Wordle is in the grid, but the color palette is much richer. Like any good CS prof, I immediately asked myself, "How can I turn this into an assignment?"

a screen capture of 5x6art from Twitter

I taught our intro course using the media computation approach popularized by Mark Guzdial back in 2006. In that course, my students processed images such as the artwork displayed above. They would have enjoyed this challenge! There are so many cool ways to think about creating a 5x6 abstraction of an input image. We could define a fixed palette of n colors, then map the corresponding region of the image onto a single block. But how to choose the color?

We could compute the average pixel value of the range and then choose the color in the palette closest to that value. Or we could create neighborhoods of different sizes around all of the palette colors so that we favor some colors for the grid over others. What if we simply compute the average pixel for the region and use that as the grid color? That would give us a much larger but much less distinct set of possible colors. I suspect that this would produce less striking outputs, but I'd really like to try the experiment and see the grids it produces.
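
A sketch of the first of those options might look something like this, assuming Pillow for image I/O. The palette here is made up, and the nearest-color rule is only one of the variations described above.

    from PIL import Image

    # a made-up six-color palette; 5x6 Art's actual palette is unknown to me
    PALETTE = [(0, 0, 0), (255, 255, 255), (200, 40, 40),
               (40, 90, 180), (230, 200, 60), (60, 140, 70)]

    def average_color(img, left, top, right, bottom):
        # average the RGB values of every pixel in the region
        pixels = [img.getpixel((x, y))
                  for x in range(left, right) for y in range(top, bottom)]
        n = len(pixels)
        return tuple(sum(p[i] for p in pixels) // n for i in range(3))

    def nearest(color):
        # palette entry with the smallest squared distance to color
        return min(PALETTE,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(c, color)))

    def grid_colors(filename, cols=5, rows=6):
        img = Image.open(filename).convert("RGB")
        w, h = img.size
        return [[nearest(average_color(img, c * w // cols, r * h // rows,
                                       (c + 1) * w // cols, (r + 1) * h // rows))
                 for c in range(cols)]
                for r in range(rows)]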

What if we allowed ourselves a bigger grid, for more granularity in our output images? There are probably many other dimensions we could play with. The more artistically inclined among you can surely think of interesting twists I haven't found yet.

That's some media computation goodness. I may assign myself to teach intro again sometime soon just so that I can develop and use this assignment. Or stop doing other work for a day or two and try it out on my own right now.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 14, 2022 11:55 AM

Taking Plain Text Seriously Enough

Or, Plain Text and Spreadsheets -- Giving Up and Giving In

One day a couple of weeks ago, a colleague and I were discussing a student. They said, "I think I had that student in class a few semesters ago, but I can't find the semester with his grade."

My first thought was, "I would just use grep to...". Then I remembered that my colleagues all use Excel for their grades.

The next day, I saw Derek Sivers' recent post that gives the advice I usually give when asked: Write plain text files.

Over my early years in computing, I lost access to a fair bit of writing that was done using various word processing applications. All stored data in proprietary formats. The programs are gone, or have evolved well beyond the version and OS I was using at the time, and my words are locked inside. Occasionally I manage to pull something useful out of one of those old files, but for the most part they are a graveyard.

No matter how old, the code and essays I wrote in plaintext are still open to me. I love being able to look at programs I wrote for my undergrad courses (including the first parser I ever wrote, in Pascal) and my senior honors project (an early effort to implement Swiss System pairings for chess tournaments). All those programs have survived the move from 5-1/4" floppies, through several different media, and still open just fine in emacs. So do the files I used to create our wedding invitations, which I wrote in troff(!).

The advice to write in plain text transfers nicely from proprietary formats on our personal computers to tools that run out on the web. The same week that Sivers posted his piece, a prolific Goodreads reviewer reported losing all his work when Goodreads had a glitch. The reviewer may have written in plain text, but his reviews are buried in someone else's system.

I feel bad for non-tech folks when they lose their data to a disappearing format or app. I'm more perplexed when a CS prof or professional programmer does. We know about plain text; we know the history of tools; we know that our plain text files will migrate into the future with us, usable in emacs and vi and whatever other plain text editors we have available there.

I am not blind, though, to the temptation. A spreadsheet program does a lot of work for us. Put some numbers here, add a formula or two over there, and boom! your grades are computed and ready for entry -- into the university's appalling proprietary system, where the data goes to die. (Try to find a student's grade from a forgotten semester within that system. It's a database, yet there are no affordances available to users for the simplest tasks...)

All of my grade data, along with most of what I produce, is in plain text. One cost of this choice is that I have to write my own code to process it. This takes a little time, but not all that much, to be honest. I don't need all of Numbers or Excel; all I need most of the time is the ability to do simple computations and a bit of sorting. If I use a comma-separated values format, all of my programming languages have tools to handle parsing, so I don't even have to do much input processing to get started. If I use Racket for my processing code, throwing a few parens into the mix enables Racket to read my files into lists that are ready for mapping and filtering to my heart's content.
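
The code involved really is small. Here is the flavor of it in Python, using the standard csv module; the file name and column layout are invented for the example.

    import csv

    def course_average(filename, student):
        # each row: a student's login followed by that student's scores
        with open(filename, newline="") as f:
            for row in csv.reader(f):
                if row and row[0] == student:
                    scores = [float(s) for s in row[1:]]
                    return sum(scores) / len(scores)
        return None   # not in this semester's file

    # course_average("grades-cs1.csv", "jdoe")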

Back when I started professoring, I wrote all of my grading software in whatever language we were using in the class in that semester. That seemed like a good way for me to live inside the language my students were using and remind myself what they might be feeling as they wrote programs. One nice side effect of this is that I have grading code written in everything from Cobol to Python and Racket. And data from all those courses is still searchable using grep, filterable using cut, and processable using any code I want to write today.

That is one advantage of plain text I value that Sivers doesn't emphasize: flexibility. Not only will plain text survive into the future... I can do anything I want with it. I don't often feel powerful in this world, but I feel powerful when I'm making data work for me.

In the end, I've traded the quick and easy power of Excel and its ilk for the flexibility and power of plain text, at a cost of writing a little code for myself. I like writing code, so this sort of trade is usually congenial to me. Once I've made the trade, I end up building a set of tools that I can reuse, or mold to a new task with minimal effort. Over time, the cost reaches a baseline I can live with even when I might wish for a spreadsheet's ease. And on the day I want to write a complex function to sort a set of records, one that befuddles Numbers's sorting capabilities, I remember why I like the trade. (That happened yet again last Friday.)

A recent tweet from Michael Nielsen quotes physicist Steven Weinberg as saying, "This is often the way it is in physics -- our mistake is not that we take our theories too seriously, but that we do not take them seriously enough." I think this is often true of plain text: we in computer science forget to take its value and power seriously enough. If we take it seriously, then we ought to be eschewing the immediate benefits of tools that lock away our text and data in formats that are hard or impossible to use, or that may disappear tomorrow at some start-up's whim. This applies not only to our operating system and our code but also to what we write and to all the data we create. Even if it means foregoing our commercial spreadsheets except for the most fleeting of tasks.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

January 13, 2022 2:02 PM

A Quick Follow-Up on What Next

My recent post on what language or tool I should dive into next got some engagement on Twitter, with many helpful suggestions. Thank you all! So I figure I should post a quick update to report what I'm thinking at this point.

In that post, I mentioned JavaScript and Pharo by name, though I was open to other ideas. Many folks pointed out the practical value of JavaScript, especially in a context where many of my students know and use it. Others offered lots of good ideas in the Smalltalk vein, both Pharo and several lighter-weight Squeaks. A couple of folks recommended Glamorous Toolkit (GToolkit), from @feenkcom, which I had not heard of before.

I took to mind several of the suggestions that commenters made about how to think about making the decision. For example, there is more overhead to studying Pharo and GToolkit than JavaScript or one of the lighter-weight Squeaks. Choosing one of the latter would make it easier to tinker. I think some of these comments had students in mind, but they are true even for my own study during the academic semester. Once I get into a term (my course begins one week from today), my attention gets pulled in many directions for fifteen or sixteen weeks. Being able to quickly switch contexts when jumping into a coding session means that I can jump more often and more productively.

Also, as Glenn Vanderburg pointed out, JavaScript and Pharo aren't likely to teach me much new. I have a lot of background with Smalltalk and, in many ways, JavaScript is just another language. The main benefit of working with either would be practical, not educational.

GToolkit might teach me something, though. As I looked into GToolkit, it became more tempting. The code is Smalltalk, because it is implemented in Pharo. But the project has a more ambitious vision of software that is "moldable": easier to understand, easier to figure out. GToolkit builds on Smalltalk's image in the direction of a computational notebook, which is an idea I've long wanted to explore. (I feel a little guilty that I haven't looked more into the work that David Schmüdde has done developing a notebook in Clojure.) GToolkit sounds like a great way for me to open several doors at once and learn something new. To do it justice, though, I need more time and focus to get started.

So I have decided on a two-pronged approach. I will explore JavaScript during the spring semester. This will teach me more about a language and ecosystem that are central to many of my students' lives. There is little overhead to picking it up and playing with it, even during the busiest weeks of the term. I can have a little fun and maybe make some connections to my programming languages course along the way. Then for summer, I will turn my attention to GToolkit, and perhaps a bigger research agenda.

I started playing with JavaScript on Tuesday. Having just read a blog post on scripting to compute letter frequencies in Perl, I implemented some of the same ideas in JavaScript. For the most part, I worked just as my students do: relying on vague memories of syntax and semantics and, when that failed, searching about for examples online.

A couple of hours working like this refreshed my memory on the syntax I knew from before and introduced me to some features that were new to me. It took a few minutes to re-internalize the need for those pesky semicolons at the end of every line... The resulting code is not much more verbose than Perl. I drifted pretty naturally to using functional programming style, as you might imagine, and it felt reasonably comfortable. Pretty soon I was thinking more about the tradeoff between clarity and efficiency in my code than about syntax, which is a good sign. I did run into one of JavaScript's gotchas: I used for...in twice instead of for...of and was surprised by the resulting behavior. Like any programmer, I banged my head on the wall for a few minutes and then recovered. But I have to admit that I had fun. I like to program.

I'm not sure what I will write next, or when I will move into the browser and play with interface elements. Suggestions are welcome!

I am pretty sure, though, that I'll start writing unit tests soon. I used SUnit briefly and have a lot of experience with JUnit. Is JSUnit a thing?


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

January 06, 2022 2:47 PM

A Fresh Encounter with Hexapawn

When I was in high school, the sponsor of our Math Club, Mr. Harpring, liked to give books as prizes and honors for various achievements. One time, he gave me Women in Mathematics, by Lynn Osen. It introduced me to Émilie du Châtelet, Sophie Germain, Emmy Noether, and a number of other accomplished women in the field. I also learned about some cool math ideas.

the initial position of a game of Hexapawn

Another time, I received The Unexpected Hanging and Other Mathematical Diversions, a collection of Martin Gardner's columns from Scientific American. One of the chapters was about Hexapawn, a simple game played with chess pawns on a 3x3 board. The chapter described an analog computer that learned how to play a perfect game of Hexapawn. I was amazed.

I played a lot of chess in high school and was already interested in computer chess programs. Now I began to wonder what it would be like to write a program that could learn to play chess... I suspect that Gardner's chapter planted one of the seeds that grew into my study of computer science in college. (It took a couple of years, though. From the time I was eight years old, I had wanted to be an architect, and that's where my mind was focused.)

As I wrote those words, it occurred to me that I may have written about the Gardner book before. Indeed I have, in a 2013 post on building the Hexapawn machine. Some experiences stay with you.

They also intersect with the rest of the world. This week, I read Jeff Atwood's recent post about his project to bring the 1973 book BASIC Computer Games into the 21st century. This book contains the source code of BASIC programs for 101 simple games. The earliest editions of this book used a version of BASIC before it included the GOSUB command, so there are no subroutines in any of the programs! Atwood started the project as a way to bring the programs in this book to a new audience, using modern languages and idioms.

You may wonder why he and other programmers feel fondly enough about BASIC Computer Games to reimplement its programs in Java or Ruby. They feel about these books the way I felt about The Unexpected Hanging. Books were the Github of the day, only in analog form. Many people in the 1970s and 1980s got their start in computing by typing these programs, character for character, into their computers.

I was not one of those people. My only access to a computer was in the high school, where I took a BASIC programming class my junior year. I had never seen a book like BASIC Computer Games, so I wrote all my programs from scratch. As mentioned in an old OOPSLA post from 2005, the first program I wrote out of passion was a program to implement a ratings system for our chess club. Elo ratings were a great application for a math student and beginning programmer.

Anyway, I went to the project's Github site to check out what was available and maybe play a game or two. And there it was: Hexapawn! Someone had already completed the port to Python, so I grabbed it and played a few games. The text interface is right out of 1973, as promised. But the program learns, also as promised, and eventually plays a perfect game. Playing it brings back memories of playing my matchbox computer from high school. I wonder now if I should write my own program that learns Hexapawn faster, hook it up with the program from the book, and let them duke it out.

Atwood's post brought to mind pleasant memories at a time when pleasant memories are especially welcome. So many experiences create who we are today, yet some seem to have made an outsized contribution. Learning BASIC and reading Martin Gardner's articles are two of those for me.

Reading that blog post and thinking about Hexapawn also reminded me of Mr. Harpring and the effect he had on me as a student of math and as a person. The effects of a teacher in high school or grade school can be subtle and easy to lose track of over time. But they can also be real and deep, and easy not to appreciate fully when we are living them. I wish I could thank Mr. Harpring again for the books he gave me, and for the gift of seeing a teacher love math.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

January 04, 2022 2:54 PM

Which Language Next?

I've never been one to write year-end retrospectives on my blog, or prospective posts about my plans for the new year. That won't change this year, this post notwithstanding.

I will say that 2021 was a weird year for me, as it was for many people. One positive was purchasing a 29" ultra-wide monitor for work at home, seen in this post from my Strange Loop series. Programming at home has been more fun since the purchase, as have been lecture prep, data-focused online meetings, and just about everything. The only downside is that it's in my basement office, which hides me away. When I want to work upstairs to be with family, it's back to the 15" laptop screen. First-world problems.

Looking forward, I'm feeling a little itchy. I'll be teaching programming languages again this spring and plan to inject some new ideas, but the real itch is: I am looking for a new project to work on, and a new language to study. This doesn't have to be a new language, just one I haven't gone deep on before. I have considered a few, including Swift, but right now I am thinking of Pharo and JavaScript.

Thinking about mastering JavaScript in 2022 feels like going backward. It's old, as programming languages go, and has been a dominant force in the computing world for well over a decade. But it's also the most common language many of my students know that I have never gone deep on. There is great value in studying languages for their novel ideas and academic interest, but there is also value in having expertise with a language and toolchain that my students already care about. Besides, I've really enjoyed reading about work on JIT compilation of JavaScript over the years, and it's been a long time since I wrote code in a prototype-based OO language. Maybe it's time to build something useful in JavaScript.

Studying Pharo would be going backward for me in a different way. Smalltalk always beckons. Long-time followers of this blog have read many posts about my formative experiences with Smalltalk. But it has been twenty years since I lived in an image every day. Pharo is a modern Smalltalk with a big class library and a goal of being suitable for mission-critical systems. I don't need much of a tug; Smalltalk always beckons.

My current quandary brings to mind a dinner at a Dagstuhl seminar in the summer of 2019 (*). It's been a while now, so I hope my memory doesn't fail me too badly. Mark Guzdial was talking about a Pharo MOOC he had recently completed and how he was thinking of using the language to implement a piece of software for his new research group at Michigan, or perhaps a class he was teaching in the fall. If I recall correctly, he was torn between using Pharo and... JavaScript. He laid out some of the pros and cons of each, with JavaScript winning out on several pragmatic criteria, but his heart was clearly with Pharo. Shriram Krishnamurthi gently encouraged Mark to follow his heart: programming should be joyful, and we should allow ourselves to build in languages that give us enjoyment. I seconded the (e)motion.

And here I sit mulling a similar choice.

Maybe I can make this a two-language year.

~~~~~

(*) Argh! I never properly blogged about this seminar, on the interplay between notional machines and programming language semantics, or the experience of visiting Europe for the first time. I did write one post that mentioned Dagstuhl, Paris, and Montenegro, with an expressed hope to write more. Anything I write now will be filtered through two and a half years of fuzzy memory, but it may be worth the time to get it down in writing before it's too late to remember anything useful. In the meantime: both the seminar and the vacation were wonderful! If you are ever invited to participate in a Dagstuhl seminar, consider accepting.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

October 10, 2021 1:53 PM

Strange Loop 3: This and That

The week after Strange Loop has been a blur of catching up with all the work I didn't do while attending the conference, or at least trying. That is actually good news for my virtual conference: despite attending Strange Loop from the comfort of my basement, I managed not to get sucked into the vortex of regular business going on here.

A few closing thoughts on the conference:

• Speaking of "the comfort of my basement", here is what my Strange Loop conference room looked like:

my Strange Loop 2021 home set-up, with laptop on the left, 29-inch monitor in the center, and a beverage to the right

The big screen is a 29" ultra-wide LG monitor that I bought last year on the blog recommendation of Robert Talbert, which has easily been my best tech purchase of the pandemic. On that screen you'll see vi.to, the streaming platform used by Strange Loop, running in Safari. To its right, I have emacs open on a file of notes and occasionally an evolving blog draft. There is a second Safari window open below emacs, for links picked up from the talks and the conference Slack channels.

On the MacBook Pro to the left, I am running Slack, another emacs shell for miscellaneous items, and a PDF of the conference schedule, marked up with the two talks I'm considering in each time slot.

That set-up served me well. I can imagine using it again in the future.

• Attending virtually has its downsides, but also its upsides. Saturday morning, one attendee wrote in the Slack #virtual-attendees channel:

Virtual FTW! Attending today from a campsite in upstate New York and enjoying the fall morning air

I was not camping, but I experienced my own virtual victories at lunch time, when I was able to go for a walk with my wife on our favorite walking trails.

• I didn't experience many technical glitches at the conference. There were some serious AV issues in the room during Friday's second slot. Being virtual, I was able to jump easily into and out of the room, checking in on another talk while they debugged on-site. In another talk, we virtual attendees missed out on seeing the presenter's slides. The speaker's words turned out to be enough for me to follow. Finally, Will Byrd's closing keynote seemed to drop its feed a few times, requiring viewers to refresh their browsers occasionally. I don't have any previous virtual conferences to compare to, but this all seemed pretty minor. In general, the video and audio feeds were solid and of high fidelity.

• One final note, not related to The Virtual Experience. Like many conferences, Strange Loop has so many good talks that I usually have to choose among two or three talks I want to see in each slot. This year, I kept track of alt-Strange Loop, the schedule of talks I didn't attend but really wanted to. Comparing this list to the list of talks I did attend gives a representative account of the choices I faced. It also would make for a solid conference experience in its own right:

  • FRI 02 -- Whoops! I Rewrote it in Rust (Brian Martin)
  • FRI 03 -- Keeping Your Open Source Project Accessible to All (Treva Williams)
  • FRI 04 -- Impacting Global Policy by Understanding Litter Data (Sean Doherty)
  • FRI 05 -- Morel, A Functional Query Language (Julian Hyde)
  • FRI 06 -- Software for Court Appointed Special Advocates (Linda Goldstein)
  • SAT 02 -- Asami: Turn your JSON into a Graph in 2 Lines (Paula Gearon)
  • SAT 03 -- Pictures Of You, Pictures Of Me, Crypto Steganography (Sean Marcia)
  • SAT 04 -- Carbon Footprint Aware Software Development (Tejas Chopra)
  • SAT 05 -- How Flutter Can Change the Future of Urban Communities (Edward Thornton)
  • SAT 06 -- Creating More Inclusive Tech Spaces: Paths Forward (Amy Wallhermfechtel)

There is a tie for the honor of "talk I most wanted to see but didn't": Wallhermfechtel on creating more inclusive tech spaces and Marcia on crypto steganography. I'll be watching these videos on YouTube some time soon!

As I mentioned in Day 1's post, this year I tried to force myself out of my usual zone, to attend a wider range of talks. Both lists of talks reflect this mix. At heart I am an academic with a fondness for programming languages. The tech talks generally lit me up more. Even so, I was inspired by some of the talks focused on community and the use of technology for the common good. I think I used my two days wisely.

That is all. Strange Loop sometimes gives me the sort of inspiration overdose that Molly Mielke laments in this tweet. This year, though, Strange Loop 2021 gave me something I needed after eighteen months of pandemic (and even more months of growing bureaucracy in my day job): a jolt of energy, and a few thoughts for the future.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

October 02, 2021 5:37 PM

Strange Loop 2: Day Two

I am usually tired on the second day of a conference, and today was no exception. But the day started and ended with talks that kept my brain alive.

• "Poems in an Accidental Language" by Kate Compton -- Okay, so that was a Strange Loop keynote. When the video goes live on YouTube, watch it. I may blog more about the talk later, but for now know only that it included:

  • "Evenings of Recreational Ontology" (I also learned about Google Sheet parties)
  • "fitting an octopus into an ontology"
  • "Contemplate the universe, and write an API for it."
Like I said, go watch this talk!

• Quantum computing is one of those technical areas I know very little about, maybe the equivalent of a 30-minute pitch talk. I've never been super-interested, but some of my students are. So I attended "Practical Quantum Computing Today" to see what's up these days. I'm still not interested in putting much of my time into quantum computing, but now I'm better informed.

• Before my lunch walk, I attended a non-technical talk on "tech-enabled crisis response". Emma Ferguson and Colin Schimmelfing reported on their experience doing something I'd like to be able to do: spin up a short-lived project to meet a critical need, using mostly free or open-source tools. For three months early in the COVID pandemic, their project helped deliver ~950,000 protective masks from 7,000 donors to 6,000 healthcare workers. They didn't invent new tech; they used existing tools and occasionally wrote some code to connect such tools.

My favorite quote from the talk came when Ferguson related the team's realization that they had grown too big for the default limits on Google Sheets and Gmail. "We thought, 'Let's just pay Google.' We tried. We tried. But we couldn't figure it out." So they built their own tool. It is good to be a programmer.

• After lunch, Will Crichton live-coded a simple API in Rust, using traits (Rust's version of interfaces) and aggressive types. He delivered almost his entire talk within emacs, including an ASCII art opening slide. It almost felt like I was back in grad school!

• In "Remote Workstations for Discerning Artists", Michelle Brenner from Netflix described the company's cloud-based infrastructure for the workstations used by the company's artists and project managers. This is one of those areas that is simply outside my experience, so I learned a bit. At the top level, though, the story is familiar: the scale of Netflix's goals requires enabling artists to work wherever they are, whenever they are; the pandemic accelerated a process that was already underway.

• Eric Gade gave another talk in the long tradition of Alan Kay and a bigger vision for computing. "Authorship Environments: In Search of the 'Personal' in Personal Computing" started by deconstructing Steve Jobs's "bicycle for the mind" metaphor (he's not a fan of what most people take as the meaning) and then moved on to the idea of personal computing as literacy: a new level at which to interrogate ideas, including one's own.

This talk included several inspirational quotes. My favorite was from Adele Goldberg:

There's all these layers in everything we do... We have to learn how to peel.

(I have long admired Goldberg and her work. See this post from Ada Lovelace Day 2009 for a few of my thoughts.)

As with most talks in this genre, I left feeling like there is so much more to be done, but frustrated at not knowing how to do it. We still haven't found a way to reach a wide audience with the empowering idea that there is more to computing than typing into a Google doc or clicking in a web browser.

• The closing keynote was delivered by Will Byrd. "Strange Dreams of Stranger Loops" took Douglas Hofstadter's Gödel, Escher, Bach as its inspiration, fitting both for the conference and for Byrd's longstanding explorations of relational programming. His focus today: generating quines in mini-Kanren, and discussing how quines enable us to think about programs, interpreters, and the strange loops at the heart of GEB.

As with the opening keynote I may blog more about this talk later. For now I give you two fun items:

  • Byrd expressed his thanks to D(a(d|n)|oug), a regular expression that matches on Byrd (his father), Friedman (his academic mentor), and Hofstadter (his intellectual inspiration).
  • While preparing his keynote, Byrd claims to have suffered from UDIS: Unintended Doug Intimidation Syndrome. Hofstadter is so cultured, so well-read, and so deep a thinker; how can the rest of us hope to contribute?
Rest assured: Byrd delivered. A great talk, as always.

Strange Loop 2021 has ended. I "hit the road" by walking upstairs to make dinner with my wife.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Software Development

October 01, 2021 5:46 PM

Strange Loop 1: Day One

On this first day of my first virtual conference, I saw a number of Strange Loop-y talks: several on programming languages and compilers, a couple by dancers, and a meta-talk speculating on the future of conferences.

• I'm not a security guy or a cloud guy, so the opening keynote "Why Security is the Biggest Benefit of Using the Cloud" by AJ Yawn gave me a chance to hear what people in this space think and talk about. Cool trivia: Yawn played a dozen college basketball games for Leonard Hamilton at Florida State. Ankle injuries derailed his college hoops experience, and now he's a computer security professional.

• Richard Marmorstein's talk, "Artisanal, Machine-Generated API Libraries", was right on topic with my compiler course this semester. My students would benefit from seeing how software can manipulate AST nodes when generating target code.

Marmorstein uttered two of the best lines of the day:

  • "I could tell you a lot about Stripe, but all you need to know is Stripe has an API."
  • "Are your data structures working for you?"

I've been working with students all week trying to help them see how an object in their compiler such as a token can help the compiler do its job -- and make the code simpler to boot. Learning software design is hard.

• I learned a bit about the Nim programming language from Aditya Siram. As you might imagine, a language designed at the nexus of Modula/Oberon, Python, and Lisp appeals to me!

• A second compiler-oriented talk, by Richard Feldman, demonstrated how opportunistic in-place mutation, a static optimization, can help a pure functional program outperform imperative code.

• After the talk "Dancing With Myself", an audience member complimented Mariel Pettee on "nailing the Strange Loop talk". The congratulations were spot-on. She hit the technical mark by describing the use of two machine learning techniques, variational autoencoders and graph neural networks. She hit the aesthetic mark by showing how computer models can learn and generate choreography. When the video for this talk goes live, you should watch.

Pettee closed with the expansive sort of idea that makes Strange Loop a must-attend conference. Dance has no universal language for "writing" choreography, and video captures only a single instance or implementation of a dance, not necessarily the full intent of the choreographer. Pettee had expected her projects to show how machine learning can support invention and co-creation, but now she sees how work like this might provide a means of documentation. Very cool. Perhaps CS can help to create a new kind of language for describing dance and movement.

• I attended Laurel Lawson's "Equitable Experiential Access: Audio Description" to learn more about ways in which videos and other media can provide a fuller, more equitable experience to everyone. Equity and inclusion have become focal points for so much of what we do at my university, and they apply directly to my work creating web-based materials for students. I have a lot to learn. I think one of my next steps will be to experience some of my web pages (session notes, assignments, resource pages) solely through a screen reader.

• Like all human activities, traditional in-person conferences offer value and extract costs. Crista Lopes used her keynote closing Day 1 to take a sober look at the changes in their value and their costs in the face of technological advances over the last thirty years.

If we are honest with ourselves, virtual conferences are already able to deliver most of the value of in-person conferences (and, in some ways, provide more value), at much lower cost. The technology of going virtual is the easy part. The biggest challenges are social.

~~~~~

A few closing thoughts as Day 1 closes.

As Crista said, "Taking paid breaks in nice places never gets old." My many trips to OOPSLA and PLoP provided me with many wonderful physical experiences. Being in the same place with my colleagues and friends was always a wonderful social experience. I like driving to St. Louis and going to Strange Loop in person; sitting in my basement doesn't feel the same.

With time, perhaps my expectations will change.

It turns out, though, that "virtual Strange Loop" is a lot like "in-person Strange Loop" in one essential way: several cool new ideas arrive every hour. I'll be back for Day Two.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

September 30, 2021 4:42 PM

Off to Strange Loop

the Strange Loop splash screen from the main hall, 2018

After a couple of years away, I am attending Strange Loop. 2018 seems so long ago now...

Last Wednesday morning, I hopped in my car and headed south to Strange Loop 2018. It had been a few years since I'd listened to Zen and the Art of Motorcycle Maintenance on a conference drive, so I popped it into the tape deck (!) once I got out of town and fell into the story. My top-level goal while listening to Zen was similar to my top-level goal for attending Strange Loop this year: to experience it at a high level; not to get bogged down in so many details that I lost sight of the bigger messages. Even so, though, a few quotes stuck in my mind from the drive down. The first is an old friend, one of my favorite lines from all of literature:

Assembly of Japanese bicycle require great peace of mind.

The other was the intellectual breakthrough that unified Phaedrus's philosophy:

Quality is not an object; it is an event.

This idea has been on my mind in recent months. It seemed a fitting theme, too, for Strange Loop.

There will be no Drive South in 2021. For a variety of reasons, I decided to attend the conference virtually. The persistence of COVID is certainly one of the big reasons. Alex and the crew at Strange Loop are taking all the precautions one could hope for to mitigate risk, but even so I will feel more comfortable online this year than in rooms full of people from across the country. I look forward to attending in person again soon.

Trying to experience the conference at a high level is again one of my meta-level goals for attending. The program contains so many ideas that are new to me; I think I'll benefit most by opening myself to areas I know little or nothing about and seeing where the talks lead me.

This year, I have a new meta-level goal: to see what it is like to attend a conference virtually. Strange Loop is using Vito as its hub for streaming video and conference rooms and Slack as its online community. This will be my first virtual conference, and I am curious to see how it feels. With concerns such as climate change, public health, and equity becoming more prominent as conference-organizing committees make their plans, I suspect that we will be running more and more of our conferences virtually in the future, especially in CS. I'm curious to see how much progress has been made in the last eighteen months and how much room we have to grow.

This topic is even on the program! Tomorrow's lineup concludes with Crista Lopes speaking on the future of conferences. She's been thinking about and helping to implement conferences in new ways for a few years, so I look forward to hearing what she has to say.

Whatever the current state of virtual conferences, I fully expect that this conference will be a worthy exemplar. It always is.

So, I'm off to Strange Loop for a couple of days. I'll be in my basement.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

August 27, 2021 3:30 PM

Back To My Compilers Course

Well, a month has passed. Already, the first week of classes is in the books. My compiler course is off to as good a start as one might hope.

Week 1 of the course is an orientation to the course content and project. Content-wise, Day 1 offers a bird's-eye view of what a compiler does, then Day 2 tries to give a bird's-eye view of how a compiler works. Beginning next week, we go deep on the stages of a compiler, looking at techniques students can use to implement their compiler for a small language. That compiler project is the centerpiece and focus of the course.

Every year, I think about ways to shake up this course. (Well, not last year, because we weren't able to offer it due to COVID.) As I prepared for the course, I revisited this summary of responses to a Twitter request from John Regehr: What should be taught in a modern undergrad compiler class? It was a lot of fun to look back through the many recommendations and papers linked there. In the end, though, the response that stuck with me came from Celeste Hollenbeck, who "noted the appeal of focusing on the basics over esoterica": compilers for the masses, not compilers for compiler people.

Our class is compilers for everyone in our major, or potentially so. Its main role in our curriculum is to be one of four so-called project courses, which serve as capstones for a broad set of electives. Many of the students in the course take it to satisfy their project requirement, others take it to satisfy a distribution requirement, and a few take it just because it sounds like fun.

The course is basic, and a little old-fashioned, but that works for us. The vast majority of our students will never write a compiler again. They are in the course to learn something about how compilers work conceptually and to learn what it is like to build a large piece of software with a team. We talk about modern compiler technology such as LLVM, but working with such complicated systems would detract from the more general goals of the course for our students. Some specific skills for writing scanners and parsers, a little insight into how compilers work, and experience writing a big program with others (and living with design decisions for a couple of months!) are solid outcomes for an undergrad capstone project.

That's not to say that some students don't go on to do more with compilers... Some do. A few years ago, one of our undergrads interviewed his way into an internship with Sony PlayStation's compiler team, where he now works full time. Other students have written compilers for their own languages, including one that was integrated as a scripting language into a gaming engine he had built. In that sense, the course seems to serve the more focused students well, too.

Once more unto the breach, dear friends, once more...
-- Henry V

So, we are off. I still haven't described the source language my students will be processing this semester, as promised in my last post. Soon. Since then, though, I wrote a bunch of small programs in the language just to get a feel for it. That's as much fun as a department head gets to have most days these days.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 19, 2021 3:07 PM

Today's Thinking Prompts, in Tweets

On teaching, via Robert Talbert:

Look at the course you teach most often. If you had the power to remove one significant topic from that course, what would it be, and why?

I have a high degree of autonomy in most of the courses I teach, so power isn't the limiting factor for me. Time is a challenge to making big changes, of course. Gumption is probably what I need most right now. Summer is a great time for me to think about this, both for my compiler course this fall and programming languages next spring.

On research, via Kris Micinski:

i remember back to Dana Scott's lecture on the history of the lambda calculus where he says, "If Turing were alive today, I don't know what he'd be doing, but it wouldn't be recursive function theory." I think about that a lot.

Now I am, too. Seriously. I'm no Turing, but I have a few years left and some energy to put into something that matters. Doing so will require some gumption to make other changes in my work life first. I am reaching a tipping point.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

April 30, 2021 1:56 PM

Good News at the End of a Long Year, v2.0

A couple of weeks ago, a former student emailed me after many years. Felix immigrated to the US from the Sudan back in the 1990s and wound up at my university, where he studied computer science. While in our program, he took a course or two with me, and I supervised his undergrad research project. He graduated and got busy with life, and we lost touch.

He emailed to let me know that he was about to defend his Ph.D. dissertation, titled "Efficient Reconstruction and Proofreading of Neural Circuits", at Harvard. After graduating from UNI, he programmed at DreamWorks Interactive and EA Sports, before going to grad school and working to "unpack neuroscience datasets that are almost too massive to wrap one's mind around". He defended his dissertation successfully this week.

Congratulations, Dr. Gonda!

Felix wrote initially to ask permission to acknowledge me in his dissertation and defense. As I told him, it is an honor to be remembered so fondly after so many years. People often talk about how teachers affect their students' futures in ways that are often hard to see. This is one of those moments for me. Arriving at the end of what has been a challenging semester in the classroom for me, Felix's note boosted my spirit and energizes me a bit going into the summer.

If you'd like to learn more about Felix and his research, here is his personal webpage. The Harvard School of Engineering also has a neat profile of Felix that shows you what a neat person he is.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

February 27, 2021 11:12 AM

All The Words

In a Paris Review interview, Fran Lebowitz joked about the challenge of writing:

Every time I sit at my desk, I look at my dictionary, a Webster's Second Unabridged with nine million words in it and think, All the words I need are in there; they're just in the wrong order.

Unfortunately, thinks this computer scientist, writing is a computationally more intense task than simply putting the words in the right order. We have to sample with replacement.

Computational complexity is the reason we can't have nice things.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

January 08, 2021 2:12 PM

My Experience Storing an Entire Course Directory in Git

Last summer, I tried something new: I stored the entire directory of materials for my database course in Git. This included all of my code, the course website, and everything else. It worked well.

The idea came from a post or tweet many years ago by Martin Fowler who, if I recall correctly, had put his entire home directory under version control. It sounded like the potential advantages might be worth the cost, so I made a note to try it myself sometime. I wasn't quite ready last summer to go all the way, so I took a baby step by creating my new course directory as a git repo and growing it file by file.

My context is pretty simple. I do almost all of my work on a personal MacBook Pro or a university iMac in my office. My main challenge is to keep my files in sync. When I make changes to a small number of files, or when the stakes of a missing file are low, copying files by hand works fine, with low overhead and no tooling necessary.

When I make a lot of changes in a short period of time, however, as I sometimes do when writing code or building my website, doing things by hand becomes more work. And the stakes of losing code or web pages are a lot higher than losing track of some planning notes or code I've been noodling with. To solve this problem, for many years I have been using rsync and a couple of simple shell scripts to manage code directories and my course web sites.

So, the primary goal for using Git in a new workflow was to replace rsync. Not being a Git guru, as many of you are, I figured that this would also force me to live with git more often and perhaps expand my tool set of handy commands.

My workflow for the semester was quite simple. When I worked in the office, there were four steps:

  1. git merge laptop
  2. [ do some work ]
  3. git commit
  4. git push

On my laptop, the opening and closing git commands changed:

  1. git pull origin main
  2. [ do some work ]
  3. git commit
  4. git push origin laptop

My work on a course is usually pretty straightforward. The most common task is to create files and record information with commit. Every once in a while, I had to back up a step with checkout.

You may say, "But you are not using git for version control!" You would be correct. The few times I checked out an older version of a file, it was usually to eliminate a spurious conflict, say, a .DS_Store file that was out of sync. Locally, I don't need a lot of version control, but using Git this way was a form of distributed version control, making sure that, wherever I was working, I had the latest version of every file.

I think this is a perfectly valid way to use Git. In some ways, Git is the new Unix. It provided me with a distributed filesystem and a file backup system all in one. The git commands ran effectively as fast as their Unix counterparts. My repo was not very much bigger than the directory would have been on its own, and I always had a personal copy of the entire repo with me wherever I went, even if I had to use another computer.

Before I started, several people reminded me that Git doesn't always work well with large images and binaries. That didn't turn out to be much of a problem for me. I had a couple of each in the repo, but they were not large and never changed. I never noticed a performance hit.

The most annoying hiccup all semester was working with OS X's .DS_Store files, which record screen layout information for OS X. I like to keep my windows looking neat and occasionally reorganize a directory layout to reflect what I'm doing. Unfortunately, OS X seems to update these files at odd times, after I've closed a window and pushed changes. Suddenly the two repos would be out of sync only because one or more .DS_Store files had changed after the fact. The momentary obstacle was quickly eliminated with a checkout or two before merging. Perhaps I should have left the .DS_Stores untracked...

All in all, I was pretty happy with the experience. I used more git, more often, than ever before and thus am now a bit more fluent than I was. (I still avoid the hairier corners of the tool, as all right-thinking people do whenever possible.) Even more, the repository contains a complete record of my work for the semester, false starts included, with occasional ruminations about troubles with code or lecture notes in my commit messages. I had a little fun after the semester ended looking back over some of those messages and making note of particular pain points.

The experiment went well enough that I plan to track my spring course in Git, too. This will be a bigger test. I've been teaching programming languages for many years and have a large directory of files, both current and archival. Not only are there more files, there are several binaries and a few larger images. I'm trying to decide if I should put the entire folder into git all at once upfront or start with an empty folder à la last semester and add files as I want or need them. The latter would be more work at early stages of development but might be a good way to clear out the clutter that has built up over twenty years.

If you have any advice on that choice, or any other, please let me know by email or on Twitter. You all have taught me a lot over the years. I appreciate it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 30, 2020 3:38 PM

That's a Shame

In the middle of an old post about computing an "impossible" integral, John Cook says:

In the artificial world of the calculus classroom, everything works out nicely. And this is a shame.

When I was a student, I probably took comfort in the fact that everything was supposed to work out nicely on the homework we did. There *was* a solution; I just had to find the pattern, or the key that turned the lock. I suspect that I was a good student largely because I was good at finding the patterns, the keys.

It wasn't until I got to grad school that things really changed, and even then course work was typically organized pretty neatly. Research in the lab was very different, of course, and that's where my old skills no longer served me so well.

In university programs in computer science, where many people first learn how to develop software, things tend to work out nicely. That is a shame, too. But it's a tough problem to solve.

In most courses, in particular introductory courses, we create assignments that have "closed form" solutions, because we want students to practice a specific skill or learn a particular concept. Having a fixed target can be useful in achieving the desired outcomes, especially if we want to help students build confidence in their abilities.

It's important, though, that we eventually take off the training wheels and expose students to messier problems. That's where they have an opportunity to build other important skills they need for solving problems outside the classroom, which aren't designed by a benevolent instructor to follow a pattern. As Cook says, neat problems can create a false impression that every problem has a simple solution.

Students who go on to use calculus for anything more than artificial homework problems may incorrectly assume they've done something wrong when they encounter an integral in the wild.

CS students need experience writing programs that solve messy problems. In more advanced courses, my colleagues and I all try to extend students' ability to solve less neatly-designed problems, with mixed results.

It's possible to design a coherent curriculum that exposes students to an increasingly messy set of problems, but I don't think many universities do this. One big problem is that doing so requires coordination across many courses, each of which has its own specific content outcomes. There's never enough time, it seems, to teach everything about, say, AI or databases, in the fifteen weeks available. It's easier to be sure that we cover another concept than it is to be sure students take a reliable step along the path from being able to solve elementary problems to being able to solve to the problems they'll find in the wild.

I face this set of competing forces every semester and do my best to strike a balance. It's never easy.

Courses that involve large systems projects are one place where students in my program have a chance to work on a real problem: writing a compiler, an embedded real-time system, or an AI-based system. These courses have closed form solutions of sorts, but the scale and complexity of the problems require students to do more than just apply formulas or find simple patterns.

Many students thrive in these settings. "Finally," they say, "this is a problem worth working on." These students will be fine when they graduate. Other students struggle when they have to do battle for the first time with an unruly language grammar or a set of fussy physical sensors. One of my challenges in my project course is to help this group of students move further along the path from "student doing homework" to "professional solving problems".

That would be a lot easier to do if we more reliably helped students take small steps along that path in their preceding courses. But that, as I've said, is difficult.

This post describes a problem in curriculum design without offering any solutions. I will think more about how I try to balance the forces between neat and messy in my courses, and then share some concrete ideas. If you have any approaches that have worked for you, or suggestions based on your experiences as a student, please email me or send me a message on Twitter. I'd love to learn how to do this better.

I've written a number of posts over the years that circle around this problem in curriculum and instruction. Here are three:

I'm re-reading these to see if past me has any ideas for present-day me. Perhaps you will find them interesting, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

November 28, 2020 11:04 AM

How Might A Program Help Me Solve This Problem?

Note: The following comes from the bottom of my previous post. It gets buried there beneath a lot of code and thinking out loud, but it's a message that stands on its own.

~~~~~

I demo'ed a variation of my database-driven passphrase generator to my students as we closed the course last week. It let me wrap up my time with them by reminding them that they are developing skills that can change how they see every problem they encounter in the future.

Knowing how to write programs gives you a new power. Whenever you encounter a problem, you can ask yourself, "How might a program help me solve this?"

The same is true for many more specialized CS skills. People who know how to create a language and implement an interpreter can ask themselves, "How might a language help me solve this problem?" That's one of the outcomes, I hope, of our course in programming languages.

The same is true for databases, too. Whenever you encounter a problem, you can ask yourself, "Can a database help me solve this?"

Computer science students can use the tools they learn each semester to represent and interpret information. That's a power they can use to solve many problems. It's easy to lose sight of that fact during a busy semester and worth reflecting on in calmer moments.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 25, 2020 12:15 PM

Three Bears-ing a Password Generator in SQLite

A long semester -- shorter in time than usual by more than a week, but longer psychologically than any in a long time -- is coming to an end. Teaching databases for the first time was a lot of fun, though the daily grind of preparing so much new material in real time wore me down. Fortunately, I like to program enough that there were moments of fun scattered throughout the semester as I played with SQLite for the first time.

In the spirit of the Three Bears pattern, I looked for opportunities all semester to use SQLite to solve a problem. When I read about the Diceware technique for generating passphrases, I found one.

Diceware is a technique for generating passphrases using dice to select words from the "Diceware Word List", in which each word is paired with a five-digit number. All of the digits are between one and six, so five dice rolls are all you need to select a word from the list. Choose a number of words for the passphrase, roll your dice, and select your words.

The Diceware Word List is a tab-separated file of dice rolls and words. SQLite can import TSV data directly into a table, so I almost have a database. I had to preprocess the file twice to make it importable. First, the file is wrapped as a PGP signed message, so I stripped the header and footer by hand, to create diceware-wordlist.txt.

Second, some of the words in this list contain quote characters. Like many applications, SQLite struggles with CSV and TSV files that contain embedded quote characters. There may be some way to configure it to handle these files gracefully, but I didn't bother looking for one. I just replaced the ' and " characters with _ and __, respectively:

    cat diceware-wordlist.txt \
      | sed "s/\'/_/g"        \
      | sed 's/\"/__/g'       \
      > wordlist.txt

Now the file is ready to import:

    sqlite> CREATE TABLE WordList(
       ...>    diceroll CHAR(5),
       ...>    word VARCHAR(30),
       ...>    PRIMARY KEY(diceroll)
       ...>    );

    sqlite> .mode tabs
    sqlite> .import 'wordlist.txt' WordList

    sqlite> SELECT * FROM WordList
       ...> WHERE diceroll = '11113';
    11113   a_s

That's one of the words that used to contain an apostrophe.

So, I have a dice roll/word table keyed on the dice roll. Now I want to choose words at random from the table. To do that, I needed a couple of SQL features we had not used in class: random numbers and string concatenation. The random() function returns a big integer. A quick web search showed me this code to generate a random base-10 digit:

    SELECT abs(random())%10
    FROM (SELECT 1);

which is easy to turn into a random die roll:

    SELECT 1+abs(random())%6
    FROM (SELECT 1);

I need to evaluate this query repeatedly, so I created a view that wraps the code in what acts, effectively, as a function:

    sqlite> CREATE VIEW RandomDie AS
       ...>   SELECT 1+abs(random())%6 AS n
       ...>   FROM (SELECT 1);

Aliasing the random value as 'n' is important because I need to string together a five-roll sequence. SQL's concatenation operator helps there:

    SELECT 'Eugene' || ' ' || 'Wallingford';

I can use the operator to generate a five-character dice roll by selecting from the view five times...

    sqlite> SELECT n||n||n||n||n FROM RandomDie;
    21311
    sqlite> SELECT n||n||n||n||n FROM RandomDie;
    63535

... and then use that phrase to select random words from the list:

    sqlite> SELECT word FROM WordList
       ...> WHERE diceroll =
       ...>         (SELECT n||n||n||n||n FROM RandomDie);
    fan

Hurray! Diceware defaults to three-word passphrases, so I need to do this three times and concatenate.

This won't work...

    sqlite> SELECT word, word, word FROM WordList
       ...> WHERE diceroll =
       ...>         (SELECT n||n||n||n||n FROM RandomDie);
    crt|crt|crt

... because the dice roll is computed only once. A view can help us here, too:

    sqlite> CREATE VIEW OneRoll AS
       ...>   SELECT word FROM WordList
       ...>   WHERE diceroll =
       ...>           (SELECT n||n||n||n||n FROM RandomDie);

OneRoll acts like a table that returns a random word:

    sqlite> SELECT word FROM OneRoll;
    howdy
    sqlite> SELECT word FROM OneRoll;
    scope
    sqlite> SELECT word FROM OneRoll;
    snip

Almost there. Now, this query generates three-word passphrases:

    sqlite> SELECT Word1.word || ' ' || Word2.word || ' ' || Word3.word FROM
       ...>   (SELECT * FROM OneRoll) AS Word1,
       ...>   (SELECT * FROM OneRoll) AS Word2,
       ...>   (SELECT * FROM OneRoll) AS Word3;
    eagle crab pinch

Yea! I saved this query in gen-password.sql and saved the SQLite database containing the table WordList and the views RandomDie and OneRoll as diceware.db. This lets me generate passphrases from the command line:

    $ sqlite3 diceware.db < gen-password.sql
    ywca maine over

Finally, I saved that command in a shell script named gen-password, and I now have a passphrase generator ready to use with a few keystrokes. Success.

Yes, this is a lot of work to get a simple job done. Maybe I could do better with Python and a CSV reader package, or some other tools. But that wasn't the point. I was revisiting SQL and learning SQLite with my students. By overusing the tools, I learned them both a little better and helped refine my sense of when they will and won't be helpful to me in the future. So, success.
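
For the curious, here is a rough sketch of what that Python alternative might look like, reusing the cleaned-up wordlist.txt file from above. This is only a sketch of the idea, not a polished script; the csv module and random.randint do all the work:

    # Sketch of a Diceware passphrase generator in Python.
    # Assumes wordlist.txt holds tab-separated lines: <five-digit roll> <word>.
    import csv
    import random

    with open('wordlist.txt', newline='') as f:
        words = dict(row for row in csv.reader(f, delimiter='\t') if len(row) == 2)

    def one_word():
        # five virtual dice rolls, concatenated into a key such as '11113'
        roll = ''.join(str(random.randint(1, 6)) for _ in range(5))
        return words[roll]

    print(' '.join(one_word() for _ in range(3)))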

~~~~~

I demo'ed a variation of this to my students on the last day of class. It let me wrap up my time with them by pointing out that they are developing skills which can change how they see every problem they encounter in the future.

Knowing how to write programs gives you a new power. Whenever you encounter a problem, you can ask yourself, "How might a program help me solve this?" I do this daily, both as faculty member and department head.

The same is true for many more specialized CS skills. People who know how to create a language and implement an interpreter can ask themselves, "How might a language help me solve this problem?" That's one of the outcomes, I hope, of our course in programming languages.

The same is true for databases. When I came across a technique for generating passphrases, I could ask myself, "How might a database help me build a passphrase generator?"

Computer science students can use the tools they learn each semester to represent and interpret information. That's a power they can use to solve many problems. It's easy to lose sight of this incredible power during a hectic semester, and worth reflecting on in calmer moments after the semester ends.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 19, 2020 2:27 PM

An "Achievement Gap" Is Usually A Participation Gap

People often look at the difference between the highest-rated male chess player in a group and the highest-rated female chess player in the same group and conclude that there is a difference between the abilities of men and women to play chess, despite the fact that there are usually many, many more men in the group than women. But that's not even good evidence that there is an achievement gap. From What Gender Gap in Chess?:

It's really quite simple. Let's say I have two groups, A and B. Group A has 10 people, group B has 2. Each of the 12 people gets randomly assigned a number between 1 and 100 (with replacement). Then I use the highest number in Group A as the score for Group A and the highest number in Group B as the score for Group B. On average, Group A will score 91.4 and Group B 67.2. The only difference between Groups A and B is the number of people. The larger group has more shots at a high score, so will on average get a higher score. The fair way to compare these unequally sized groups is by comparing their means (averages), not their top values. Of course, in this example, that would be 50 for both groups -- no difference!

I love this paragraph. It's succinct and uses only the simplest ideas from probability and statistics. It's the sort of statistics that I would hope our university students learn in their general education stats course. While learning a little math, students can also learn about an application that helps us understand something important in the world.

The experiment described is also simple enough for beginning programmers to code up. Over the years, I've used problems like this with intro programming students in Pascal, Java, and Python, and with students learning Scheme or Racket who need some problems to practice on. I don't know whether learning science supports my goal, but I hope that this sort of problem (with suitable discussion) can do double duty for learners: learn a little programming, and learn something important about the world.
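
The whole experiment fits in a few lines of code. Here is a rough sketch in Python, using the setup from the quoted paragraph (ten people versus two, scores drawn from 1 to 100, each group scored by its maximum); the trial count is arbitrary:

    # Simulate the two-group experiment described in the quote above.
    import random

    def group_score(size):
        # each person draws an integer from 1 to 100; the group keeps the max
        return max(random.randint(1, 100) for _ in range(size))

    trials = 100_000
    avg_a = sum(group_score(10) for _ in range(trials)) / trials
    avg_b = sum(group_score(2) for _ in range(trials)) / trials
    print(f"Group A (10 people): {avg_a:.1f}")   # about 91.4
    print(f"Group B ( 2 people): {avg_b:.1f}")   # about 67.2

Comparing means instead of maxima, as the quote suggests, brings both groups back to about 50.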

With educational opportunities like this available to us, we really should be able to turn out graduates who have a decent understanding of why so many of our naive conclusions about the world are wrong. Are we putting these opportunities to good use?


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

July 16, 2020 10:47 AM

Dreaming in Git

I recently read a Five Books interview about the best books on philosophical wonder. One of the books recommended by philosopher Eric Schwitzgebel was Diaspora, a science fiction novel by Greg Egan I've never read. The story unfolds in a world where people are able to destroy their physical bodies to upload themselves into computers. Unsurprisingly, this leads to some fascinating philosophical possibilities:

Well, for one thing you could duplicate yourself. You could back yourself up. Multiple times.

And then have divergent lives, as it were, in parallel but diverging.

Yes, and then there'd be the question, "do you want to merge back together with the person you diverged from?"

Egan wrote Diaspora before the heyday of distributed version control, before darcs and mercurial and git. With distributed VCS, a person could checkout a new personality, or change branches and be a different person every day. We could run diffs to figure out what makes one version of a self so different from another. If things start going too wrong, we could always revert to an earlier version of ourselves and try again. And all of this could happen with copies of the software -- ourselves -- running in parallel somewhere in the world.

And then there's Git. Imagine writing such a story now, with Git's complex model of versioning and prodigious set of commands and flags. Not only could people branch and merge, checkout and diff... A person could try something new without ever committing changes to the repository. We'd have to figure out what it means to push origin or reset --hard HEAD. We'd be able to rewrite history by rebasing, amending, and squashing. A Git guru can surely explain why we'd need to --force-with-lease or --unset-upstream, but even I can imagine the delightful possibilities of git stash in my personal improvement plan.

Perhaps the final complication in our novel would involve a merge so complex that we need a third-party diff tool to help us put our desired self back together. Alas, a Python library or Ruby gem required by the tool has gone stale and breaks an upgrade. Our hero must find a solution somewhere in her tree of blobs, or be doomed to live a forever splintered life.

If you ever see a book named Dreaming in Git or Bug Report on an airport bookstore's shelves, take a look. Perhaps I will have written the first of my Git fantasies.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development

June 17, 2020 3:53 PM

Doing Concatenative Programming in a Spreadsheet

a humble spreadsheet cell

It's been a long time since I was excited by a new piece of software the way I was excited by Loglo, Avi Bryant's new creation. Loglo is "LOGO for the Glowforge", an experimental programming environment for creating SVG images. That's not a problem I need to solve, but the way Loglo works drew me in immediately. It consists of a stack programming language and a set of primitives for describing vector graphics, integrated into a spreadsheet interface. It's the use of a stack language to program a spreadsheet that excites me so much.

Actually, it's the reverse relationship that really excites me: using a spreadsheet to build and visualize a stack-based program. Long-time readers know that I am interested in this style of programming (see Summer of Joy for a post from last year) and sometimes introduce it in my programming languages course. Students understand small examples easily enough, but they usually find it hard to grok larger programs and to fully appreciate how typing in such a language can work. How might Loglo help?

In Loglo, a cell can refer to the values produced by other cells in the familiar spreadsheet way, with an absolute address such as "a1" or "f2". But Loglo cells have two other ways to refer to other cells' values. First, any cell can access the value produced by the cell to its left implicitly, because Loglo leaves the result of a cell's computation sitting on top of the stack. Second, a cell can access the value produced by the cell above it by using the special variable "^". These last two features strike me as a useful way for programmers to see their computations grow over time, which can be an even more powerful mode of interaction for beginners who are learning this programming style.

Stack-oriented programming of this sort is concatenative: programs are created by juxtaposing other programs, with a stack of values implicitly available to every operator. Loglo uses the stack as leverage to enable programmers to build images incrementally, cell by cell and row by row, referring to values on the stack as well as to predecessor cells. The programmer can see in a cell the value produced by a cumulative bit of code that includes new code in the cell itself. Reading Bryant's description of programming in Loglo, it's easy to see how this can be helpful when building images. I think my students might find it helpful when learning how to write concatenative programs or learning how types and programs work in a concatenative language.

For example, here is a concatenative program that works in Loglo as well as other stack-based languages such as Forth and Joy:

 2 3 + 5 * 2 + 6 / 3 /

Loglo tells us that it computes the value 1.5:

a stack program in Loglo

This program consists of eleven tokens, each of which is a program in its own right. More interestingly, we can partition this program into smaller units by taking any subsequences of the program:

 2 3 + 5 *   2 + 6 /   3 /
 ---------   -------   ---
These are the programs in cells A1, B1, and C1 of our spreadsheet. The first computes 25, the second uses that value to compute 4.5, and the third uses the 4.5 to compute 1.5. Notice that the programs in cells B1 and C1 require an extra value to do their jobs. They are like functions of one argument. Rather than pass an argument to the function, Loglo allows it to read a value from the stack, produced by the cell to its left.

a partial function in Loglo
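
Here is a toy stack evaluator in Python -- mine, not Loglo's -- that mimics the cell-to-cell handoff: each call starts from whatever stack its left neighbor left behind.

    def run(tokens, stack=None):
        # evaluate a space-separated concatenative program, starting from
        # whatever stack the "cell to the left" produced
        stack = list(stack or [])
        ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
               "*": lambda a, b: a * b, "/": lambda a, b: a / b}
        for token in tokens.split():
            if token in ops:
                b, a = stack.pop(), stack.pop()
                stack.append(ops[token](a, b))
            else:
                stack.append(float(token))
        return stack

    a1 = run("2 3 + 5 *")      # [25.0]
    b1 = run("2 + 6 /", a1)    # [4.5]  -- consumes A1's result
    c1 = run("3 /", b1)        # [1.5]  -- consumes B1's result

That, as I understand the cell-to-cell stack passing described above, is what a row of Loglo cells is doing: each cell runs its fragment against the stack left by the cell before it, and the spreadsheet shows every intermediate result along the way.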

By making the intermediate results visible to the programmer, this interface might help programmers better see how pieces of a concatenative program work and learn what the type of a program fragment such as 2 + 6 / (in cell B1 above) or 3 / is. Allowing locally-relative references on a new row will, as Avi points out, enable an incremental programming style in which the programmer uses a transformation computed in one cell as the source for a parameterized version of the transformation in the cell below. This can give the novice concatenative programmer an interactive experience more supportive than the usual REPL. And Loglo is a spreadsheet, so changes in one cell percolate throughout the sheet on each update!

Am I the only one who thinks this could be a really cool environment for programmers to learn and practice this style of programming?

Teaching concatenative programming isn't a primary task in my courses, so I've never taken the time to focus on a pedagogical environment for the style. I'm grateful to Avi for demonstrating a spreadsheet model for stack programs and stimulating me to think more about it.

For now, I'll play with Loglo as much as time permits and think more about its use, or the use of a tool like it, in my courses. There are a couple of features I'll have to get used to. For one, it seems that a cell can access only one item left on the stack by its left neighbor, which limits the kind of partial functions we can write into cells. Another is that named functions such as rotate push themselves onto the stack by default and thus require a ! to apply them, whereas operators such as + evaluate by default and thus require quotation in a {} block to defer execution. (I have an academic's fondness for overarching simplicity.) Fortunately, these are the sorts of features one gets used to whenever learning a new language. They are part of the fun.

Thinking beyond Loglo, I can imagine implementing an IDE like this for my students that provides features that Loglo's use cases don't require. For example, it would be cool to enable the programmer to ctrl-click on a cell to see the type of the program it contains, as well as an option to see the cumulative type along the row or built on a cell referenced from above. There is much fun to be had here.

One sign of a really interesting project is how many tangential ideas flow out of it. For me, Loglo is teeming with ideas, and I'm not even in its target demographic. So, kudos to Avi!

Now, back to administrivia and that database course...


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

June 11, 2020 1:02 PM

Persistence Wins, Even For Someone Like You

There's value to going into a field that you find difficult to grasp, as long as you're willing to be persistent. Even better, others can benefit from your persistence, too.

In an old essay, James Propp notes that working in a field where you lack intuition can "impart a useful freedom from prejudice". Even better...

... there's value in going into a field that you find difficult to grasp, as long as you're willing to be really persistent, because if you find a different way to think about things, something that works even for someone like you, chances are that other people will find it useful too.

This reminded me of a passage in Bob Nystrom's post about his new book, Crafting Interpreters. Nystrom took a long time to finish the book in large part because he wanted the interpreter at the end of each chapter to compile and run, while at the same time growing into the interpreter discussed in the next chapter. But that wasn't the only reason:

I made this problem harder for myself because of the meta-goal I had. One reason I didn't get into languages until later in my career was because I was intimidated by the reputation compilers have as being only for hardcore computer science wizard types. I'm a college dropout, so I felt I wasn't smart enough, or at least wasn't educated enough to hack it. Eventually I discovered that those barriers existed only in my mind and that anyone can learn this.

Some students avoid my compilers course because they assume it must be difficult, or because friends said they found it difficult. Even though they are CS majors, they think of themselves as average programmers, not "hardcore computer science wizard types". But regardless of the caliber of the student at the time they start the course, the best predictor of success in writing a working compiler is persistence. The students who plug away, working regularly throughout the two-week stages and across the entire project, are usually the ones who finish successfully.

One of my great pleasures as a prof is seeing the pride in the faces of students who demo a working compiler at the end of the semester, especially in the faces of the students who began the course concerned that they couldn't hack it.

As Propp points out in his essay, this sort of persistence can pay off for others, too. When you have to work hard to grasp an idea or to make something, you sometimes find a different way to think about things, and this can help others who are struggling. One of my jobs as a teacher is to help students understand new ideas and use new techniques. That job is usually made easier when I've had to work persistently to understand the idea myself, or to find a better way to help the students who teach me the ways in which they struggle.

In Nystrom's case, his hard work to master a field he didn't grasp immediately pays off for his readers. I've been following the growth of Crafting Interpreters over time, reading chapters in depth whenever I was able. Those chapters were uniformly easy to read, easy to follow, and entertaining. They have me thinking about ways to teach my own course differently, which is probably the highest praise I can give as a teacher. Now I need to go back and read the entire book and learn some more.

Teaching well enough that students grasp what they thought was not graspable and do what they thought was not doable is a constant goal, rarely achieved. It's always a work in progress. I have to keep plugging away.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

May 14, 2020 2:35 PM

Teaching a New Course in the Fall

I blogged weekly about the sudden switch to remote instruction starting in late March, but only for three weeks. I stopped mostly because my sense of disorientation had disappeared. Teaching class over Zoom started to feel more normal, and my students and I got back into the usual rhythm. A few struggled in ways that affected their learning and performance, and a smaller few thrived. My experience was mostly okay: some parts of my work suffered as I learned how to use tools effectively, but not having as many external restrictions on my schedule offset the negatives. Grades are in, summer break is about to begin, and at least some things are right with the world.

Fall offers something new for me to learn. My fall compilers course had a lower enrollment than usual and, given the university's current financial situation, I had to cancel it. This worked out fine for the department, though, as one of our adjunct instructors asked to take next year off in order to deal with changes in his professional and personal lives. So there was a professor in need of a course, and a course in need of a professor: Database Systems.

Databases is one of the few non-systems CS courses that I have never taught as a prof or as a grad student. It's an interesting course, mixing theory and design with a lot of practical skills that students and employers prize. In this regard, it's a lot like our OO design and programming course in Java, only with a bit more visible theory. I'm psyched to give it a go. At the very least, I should be able to practice some of those marketable skills and learn some of the newer tools involved.

As with all new preps, this course has me looking for ideas. I'm aware of a few of the standard texts, though I am hoping to find a good open-source text online, or a set of online materials out of which to assemble the readings my students will need for the semester. I'm going to be scouting for all the other materials I need to teach the course as well, including examples, homework assignments, and projects. I tend to write a lot of my own stuff, but I also like to learn from good courses and good examples already out there. Not being a database specialist, I am keen to see what specialists think is important, beyond what we find in traditional textbooks.

Then there is the design of the course itself. Teaching a course I've never taught before means not having an old course design to fall back on. This means more work, of course, but it is a big win for a curious mind. Sometimes, it's fun to start from scratch. I have always found instructional design fascinating, much like any kind of design, and building a new course leaves open a lot of doors for me to learn and to practice some new skills.

COVID-19 is a big part of why I am teaching this course, but it is not done with us. We still do not know what fall semester will look like, other than to assume that it won't look like a normal semester. Will we be on campus all semester, online all semester, or a mix of both? If we do hold instruction on campus, as most universities are hoping to do, social distancing requirements will force us to do some things differently, such as meeting students in shifts every other day. This uncertainty suggests that I should design a course that depends less on synchronous, twice-weekly, face-to-face direct instruction and more on ... what?

I have a lot to learn about teaching this way. My university is expanding its professional development offerings this summer and, in addition to diving deep into databases and SQL, I'll be learning some new ways to design a course. It's exciting but also means a bit more teaching prep than usual for my summer.

This is the first entirely new prep I've taught in a while. I think the most recent was the fall of 2009, when I taught Software Engineering for the first and only time. Looking back at the course website reminds me that I created this delightful logo for the course:

course logo for Software Engineering, created using YUML

So, off to work I go. I could sure use your help. Do you know of model database courses that I should know about? What database concepts and skills should CS graduates in 2021 know? What tools should they be able to use? What has changed in the world since I last took database courses that must be reflected in today's database course? Do you know of a good online textbook for the course, or a print book that my students would find useful and be willing to pay for?

If you have any ideas to share, feel free to email me or contact me on Twitter. If not for me, do it for my students!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 10, 2020 2:16 PM

Software Can Make You Feel Alive, or It Can Make You Feel Dead

This week I read one of Craig Mod's old essays and found a great line, one that everyone who writes programs for other people should keep front of mind:

When it comes to software that people live in all day long, a 3% increase in fun should not be dismissed.

Working hard to squeeze a bit more speed out of a program, or to create even a marginally better interaction experience, can make a huge difference to someone who uses that program every day. Some people spend most of their professional days inside one or two pieces of software, which accentuates further the human value of Mod's three percent. With shelter-in-place and work-from-home the norm for so many people these days, we face a secondary crisis of software that is no fun.

I was probably more sensitive than usual to Mod's sentiment when I read it... This week I used Blackboard for the first time, at least my first extended usage. The problem is not Blackboard, of course; I imagine that most commercial learning management systems are little fun to use. (What a depressing phrase "commercial learning management system" is.) And it's not just LMSes. We use various PeopleSoft "campus solutions" to run the academic, administrative, and financial operations on our campus. I always feel a little of my life drain away whenever I spend an hour or three clicking around and waiting inside this large and mostly functional labyrinth.

It says a lot that my first thought after downloading my final exams on Friday morning was, "I don't have to login to Blackboard again for a good long while. At least I have that going for me."

I had never used our LMS until this week, and then only to create a final exam that I could reliably time after being forced into remote teaching with little warning. If we are in this situation again in the fall, I plan to have an alternative solution in place. The programmer in me always feels an urge to roll my own when I encounter substandard software. Writing an entire LMS is not one of my life goals, so I'll just write the piece I need. That's more my style anyway.

Later the same morning, I saw this spirit of writing a better program in a context that made me even happier. The Friday of finals week is my department's biennial undergrad research day, when students present the results of their semester- or year-long projects. Rather than give up the tradition because we couldn't gather together in the usual way, we used Zoom. One student talked about alternative techniques for doing parallel programming in Python, and another presented empirical analysis of using IR beacons for localization of FIRST Lego League robots. Fun stuff.

The third presentation of the morning was by a CS major with a history minor, who had observed how history profs' lectures are limited by the tools they had available. The solution? Write better presentation software!

As I watched this talk, I was proud of the student, whom I'd had in class and gotten to know a bit. But I was also proud of whatever influence our program had on his design skills, programming skills, and thinking. This project, I thought, is a demonstration of one thing every CS student should learn: We can make the tools we want to use.

This talk also taught me something non-technical: Every CS research talk should include maps of Italy from the 1300s. Don't dismiss 3% increases in fun wherever they can be made.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

April 04, 2020 11:37 AM

Is the Magic Gone?

This passage from Remembering the LAN recalls an earlier time that feels familiar:

My father, a general practitioner, used this infrastructure of cheap 286s, 386s, and 486s (with three expensive laser printers) to write the medical record software for the business. It was used by a dozen doctors, a nurse, and receptionist. ...
The business story is even more astonishing. Here is a non-programming professional, who was able to build the software to run their small business in between shifts at their day job using skills learned from a book.

I wonder how many hobbyist programmers and side-hustle programmers of this sort there are today. Does programming attract people the way it did in the '70s or '80s? Life is so much easier than typing programs out of Byte or designing your own BASIC interpreter from scratch. So many great projects out on GitHub and the rest of the web to clone, mimic, adapt. I occasionally hear a student talking about their own projects in this way, but it's rare.

As Crawshaw points out toward the end of his post, the world in which we program now is much more complex. It takes a lot more gumption to get started with projects that feel modern:

So much of programming today is busywork, or playing defense against a raging internet. You can do so much more, but the activation energy required to start writing fun collaborative software is so much higher you end up using some half-baked SaaS instead.

I am not a great example of this phenomenon -- Crawshaw and his dad did much more -- but even today I like to roll my own, just for me. I use a simple accounting system I've been slowly evolving for a decade, and I've cobbled together bits and pieces of my own tax software, not an integrated system, just what I need to scratch an itch each year. Then there are all the short programs and scripts I write for work to make Spreadsheet City more habitable. But I have multiple CS degrees and a lot of years of experience. I'm not a doctor who decides to implement what his or her office needs.

I suspect there are more people today like Crawshaw's father than I hear about. I wish it were more of a culture that we cultivated for everyone. Not everyone wants to bake their own bread, but people who get the itch ought to feel like the world is theirs to explore.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Software Development

March 15, 2020 9:35 AM

Things I've Been Reading

This was a weird week. It started with preparations for spring break and an eye on the news. It turned almost immediately into preparations for at least two weeks of online courses and a campus on partial hiatus. Of course, we don't know how the COVID-19 outbreak will develop over the next three weeks, so we may be facing the remaining seven weeks of spring semester online, with students at a distance.

Here are three pieces that helped me get through the week.

Even If You Believe

From When Bloom Filters Don't Bloom:

Advanced data structures are very interesting, but beware. Modern computers require cache-optimized algorithms. When working with large datasets that do not fit in L3, prefer optimizing for a reduced number of loads over optimizing the amount of memory used.

I've always liked the Bloom filter. It seems such an elegant idea. But then I've never used one in a setting where performance mattered. It still surprises me how well current architectures and compilers optimize performance for us in ways that our own efforts can only frustrate. The article is also worth reading for its link to a nice visualization of the interplay among the parameters of a Bloom Filter. That will make a good project in a future class.
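
For anyone who hasn't played with the data structure, here is a bare-bones Bloom filter in Python -- my own sketch, nothing like the cache-optimized variant the article is really about:

    import hashlib

    class BloomFilter:
        def __init__(self, num_bits, num_hashes):
            self.num_bits = num_bits
            self.num_hashes = num_hashes
            self.bits = bytearray(num_bits)     # one byte per "bit", for simplicity

        def _positions(self, item):
            # derive the k hash positions by salting SHA-256 with the hash index
            for i in range(self.num_hashes):
                digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
                yield int.from_bytes(digest[:8], "big") % self.num_bits

        def add(self, item):
            for pos in self._positions(item):
                self.bits[pos] = 1

        def might_contain(self, item):
            # False means definitely absent; True means only "probably present"
            return all(self.bits[pos] for pos in self._positions(item))

With num_bits = 10000 and num_hashes = 5, the filter answers membership queries for a thousand or so strings in about ten kilobytes here (a real implementation would pack the bits), at the price of occasional false positives -- and, as the article points out, at the price of memory loads once the bit array outgrows the caches.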

Even If You Don't Believe

From one of Tyler Cowen's long interviews:

Niels Bohr had a horseshoe at his country house across the entrance door, a superstitious item, and a friend asked him, "Why do you have it there? Aren't you a scientist? Do you believe in it?" You know what was Bohr's answer? "Of course I don't believe in it, but I have it there because I was told that it works, even if you don't believe in it."

You don't have to believe in good luck to have good luck.

You Gotta Believe

From Larry Tesler's annotated manual for the PUB document compiler:

In 1970, I became disillusioned with the slow pace of artificial intelligence research.

The commentary on the manual is like a mini-memoir. Tesler writes that he went back to the Stanford AI lab in the spring of 1971. John McCarthy sent him to work with Les Earnest, the lab's chief administrator, who had an idea for a "document compiler", à la RUNOFF, for technical manuals. Tesler had bigger ideas, but he implemented PUB as a learning exercise. Soon PUB had users, who identified shortcomings that were in sync with Tesler's own ideas.

The solution I favored was what we would now call a WYSIWYG interactive text editing and page layout system. I felt that, if the effect of any change was immediately apparent, users would feel more in control. I soon left Stanford to pursue my dream at Xerox PARC (1973-80) and Apple Computer (1980-1997).

Thus began the shift to desktop publishing. And here I sit, in 2020, editing this post using emacs.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

February 18, 2020 4:00 PM

Programming Feels Like Home

I saw Robin Sloan's An App Can Be a Home-Cooked Meal floating around Twitter a few days back. It really is quite good; give it a read if you haven't already. This passage captures a lot of the essay's spirit in only a few words:

The exhortation "learn to code!" has its foundations in market value. "Learn to code" is suggested as a way up, a way out. "Learn to code" offers economic leverage, a squirt of power. "Learn to code" goes on your resume.
But let's substitute a different phrase: "learn to cook." People don't only learn to cook so they can become chefs. Some do! But far more people learn to cook so they can eat better, or more affordably, or in a specific way. Or because they want to carry on a tradition. Sometimes they learn just because they're bored! Or even because -- get this -- they love spending time with the person who's teaching them.

Sloan expresses better than I ever have an idea that I blog about every so often. Why should people learn to program? Certainly it offers a path to economic gain, and that's why a lot of students study computer science in college, whether as a major, a minor, or a high-leverage class or two. There is nothing wrong with that. It is for many a way up, a way out.

But for some of us, there is more than money in programming. It gives you a certain power over the data and tools you use. I write here occasionally about how a small script or a relatively small program makes my life so much easier, and I feel bad for colleagues who are stuck doing drudge work that I jump past. Occasionally I'll try to share my code, to lighten someone else's burden, but most of the time there is such a mismatch between the worlds we live in that they are happier to keep plugging along. I can't say that I blame them. Still, if only they could program and use tools that enable them to improve their work environments...

But... There is more still. From the early days of this blog, I've been open with you all:

Here's the thing. I like to write code.

One of the things that students like about my classes is that I love what I do, and they are welcome to join me on the journey. Just today a student in my Programming Languages course drifted back to my office with me after class, where we ended up talking for half an hour and sketching code on a whiteboard as we deconstructed a vocabulary choice he made on our latest homework assignment. I could sense this student's own love of programming, and it raised my spirits. It makes me more excited for the rest of the semester.

I've had people come up to me at conferences to say that the reason they read my blog is because they like to see someone enjoying programming as much as they do. Many of them share links with their students as if to say, "See, we are not alone." I look forward to days when I will be able to write in this vein more often.

Sloan reminds us that programming can be -- is -- more than a line on a resume. It is something that everyone can do, and want to do, for a lot of different reasons. It would be great if programming "were marbled deeply into domesticity and comfort, nerdiness and curiosity, health and love" in the way that cooking is. That is what makes Computing for All really worth doing.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Software Development, Teaching and Learning

February 10, 2020 2:37 PM

Some Things I Read Recently

Campaign Security is a Wood Chipper for Your Hopes and Dreams

Practical campaign security is a wood chipper for your hopes and dreams. It sits at the intersection of 19 kinds of status quo, each more odious than the last. You have to accept the fact that computers are broken, software is terrible, campaign finance is evil, the political parties are inept, the DCCC exists, politics is full of parasites, tech companies are run by arrogant man-children, and so on.

This piece from last year has some good advice, plenty of sarcastic humor from Maciej, and one remark that was especially timely for the past week:

You will fare especially badly if you have written an app to fix politics. Put the app away and never speak of it again.

Know the Difference Between Neurosis and Process

In a conversation between Tom Waits and Elvis Costello from the late 1980s, Waits talks about tinkering too long with a song:

TOM: "You have to know the difference between neurosis and actual process, 'cause if you're left with it in your hands for too long, you may unravel everything. You may end up with absolutely nothing."

In software, when we keep code in our hands for too long, we usually end up with an over-engineered, over-abstracted boat anchor. Let the tests tell you when you are done, then stop.

Sometimes, Work is Work

People say, "if you love what you do you'll never work a day in your life." I think good work can be painful--I think sometimes it feels exactly like work.

Some weeks more than others. Trust me. That's okay. You can still love what you do.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development

January 22, 2020 3:54 PM

The Roots of TDD -- from 1957

In 1957, Dan McCracken published Digital Computer Programming, perhaps the first book on the new art of programming. His book shows that the roots of extreme programming run deep. In this passage, McCracken encourages both the writing of tests before the writing of code and the involvement of the customer in the software development process:

The first attack on the checkout problem may be made before coding is begun. In order to fully ascertain the accuracy of the answers, it is necessary to have a hand-calculated check case with which to compare the answers which will later be calculated by the machine. This means that stored program machines are never used for a true one-shot problem. There must always be an element of iteration to make it pay. The hand calculations can be done at any point during programming. Frequently, however, computers are operated by computing experts to prepare the problems as a service for engineers or scientists. In these cases it is highly desirable that the "customer" prepare the check case, largely because logical errors and misunderstandings between the programmer and customer may be pointed out by such procedure. If the customer is to prepare the test solution, it is best for him to start well in advance of actual checkout, since for any sizable problem it will take several days or weeks to calculate the test.

I don't have a copy of this book, but I've read a couple of other early books by McCracken, including one of his Fortran books for engineers and scientists. He was a good writer and teacher.

I had the great fortune to meet Dan at an NSF workshop in Clemson, South Carolina, back in the mid-1990s. We spent many hours in the evening talking shop and watching basketball on TV. (Dan was cheering his New York Knicks on in the NBA finals, and he was happy to learn that I had been a Knicks and Walt Frazier fan in the 1970s.) He was a pioneer of programming and programming education who was willing to share his experience with a young CS prof who was trying to figure out how to teach. We kept in touch by email thereafter. It was an honor to call him a friend.

You can find the above quotation in A History of Test-Driven Development (TDD), as Told in Quotes, by Rob Myers. That post includes several good quotes that Myers had to cut from his upcoming book on TDD. "Of course. How else could you program?"


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

December 26, 2019 1:55 PM

An Update on the First Use of the Term "Programming Language"

This tweet and this blog entry on the first use of the term "programming language" evoked responses from readers with some new history and some prior occurrences.

Doug Moen pointed me to the 1956 Fortran manual from IBM, Chapter 2 of which opens with:

Any programming language must provide for expressing numerical constants and variable quantities.

I was aware of the Fortran manual, which I link to in the notes for my compiler course, and its use of the term. But I had been linking to a document dated October 1957, and the file at fortran.com is dated October 15, 1956. That beats the January 1957 Newell and Shaw paper by a few months.

As Moen said in his email, "there must be earlier references, but it's hard to find original documents that are early enough."

The oldest candidate I have seen comes from @matt_dz. His tweet links to this 1976 Stanford tech report, "The Early Development of Programming Languages", co-authored by Knuth. On Page 26, it refers to work done by Arthur W. Burks in 1950:

In 1950, Burks sketched a so-called "Intermediate Programming Language" which was to be the step one notch above the Internal Program Language.

Unfortunately, though, this report's only passage from Burks refers to the new language as "Intermediate PL", which obscures the meaning of the 'P'. Furthermore, the title of Burks's paper uses "program" in the language's name:

Arthur W. Burks, "An intermediate program language as an aid in program synthesis", Engineering Research Institute, Report for Burroughs Adding Machine Company (Ann Arbor, Mich.: Univ. of Michigan, 1951), ii+15 pp.

The use of "program language" in this title is consistent with the terminology in Burks's previous work on an "Internal Program Language", to which Knuth et al. also refer.

Following up on the Stanford tech report, Douglas Moen found the book Mathematical Methods in Program Development, edited by Manfred Broy and Birgit Schieder. It includes a paper that attempts "to identify the first 'programming language', and the first use of that term". Here's a passage from Page 219, via Google Books:

There is not yet an indication of the term 'programming languages'. But around 1950 the term 'program' comes into wide use: 'The logic of programming electronic digital computers' (A. W. Burks 1950), 'Programme organization and initial orders for the EDSAC' (D. J. Wheeler 1950), 'A program composition technique' (H. B. Curry 1950), 'La conception du programme' (Corrado Bohm 1952), and finally 'Logical or non-mathematical programmes' (C. S. Strachey 1952).

And then, on Page 224, it comments specifically on Burks's work:

A. W. Burks ('An intermediate program language as an aid in program synthesis', 1951) was among the first to use the term program(ming) language.

The parenthetical in that phrase -- "the first to use the term program(ming) language" -- leads me to wonder if Burks may have used "program language" rather than "programming language" in his 1951 paper.

Is it possible that Knuth et al. retrofitted the use of "programming language" onto Burks's language? Their report documents the early development of PL ideas, not the history of the term itself. The authors may have used a term that was in common parlance by 1976 even if Burks had not. I'd really like to find an original copy of Burks's 1951 ERI report to see if he ever uses "programming language" when talking about his Intermediate PL. Maybe after the holiday break...

In any case, the use of "program language" by Burks and others circa 1950 seems to be the bridge between use of the terms "program" and "language" independently and the use of "programming language" that soon became standard. If Burks and his group never used the new term for their intermediate PL, it's likely that someone else did between 1951 and the release of the 1956 Fortran manual.

There is so much to learn. I'm glad that Crista Lopes tweeted her question on Sunday and that so many others have contributed to the answer!


Posted by Eugene Wallingford | Permalink | Categories: Computing

December 23, 2019 10:23 AM

The First Use of the Term "Programming Language"?

Yesterday, Crista Lopes asked a history question on Twitter:

Hey, CS History Twitter: I just read Iverson's preface of his 1962 book carefully, and suddenly this occurred to me: did he coin the term "programming language"? Was that the first time a programming language was called "programming language"?

In a follow-up, she noted that McCarthy's CACM paper on LISP from roughly the same time called Lisp a "programming system", not a programming language.

I had a vague recollection from my grad school days that Newell and Simon might have used the term. I looked up IPL, the Information Processing Language they created in the mid-1950s with Shaw. IPL pioneered the notion of list processing, though at the level of assembly language. I first learned of it while devouring Newell and Simon's early work on AI and reading everything I could find about programs such as the General Problem Solver and Logic Theorist.

That Wikipedia page has a link to this unexpected cache of documents on IPL from Newell, Simon, and Shaw's days at Rand. The oldest of these is a January 1957 paper, Programming the Logic Theory Machine, by Newell and Shaw that was presented at the Western Joint Computer Conference (WJCC) the next month. It details their efforts to build computer systems to perform symbolic reasoning, as well as the language they used to code their programs.

There it is on Page 5: a section titled "Requirements for the Programming Language". They even define what they mean by programming language:

We can transform these statements about the general nature of the program of LT into a set of requirements for a programming language. By a programming language we mean a set of symbols and conventions that allows a programmer to specify to the computer what processes he wants carried out.

Other than the gendered language, that definition works pretty well even today.

The fact that Newell and Shaw defined "programming language" in this paper indicates that the term probably was not in widespread use at the time. The WJCC was a major computing conference of the day. The researchers and engineers who attended it would likely be familiar with common jargon of the industry.

Reading papers about IPL is an education across a range of ideas in computing. Researchers at the dawn of computing had to contend with -- and invent -- concepts at multiple levels of abstraction and figure out how to implement them on machines with limited size and speed. What a treat these papers are.

I love to read original papers from the beginning of our discipline, and I love to learn about the history of words. A few of my students do, too. One student stopped in after the last day of my compilers course this semester to thank me for telling stories about the history of compilers occasionally. Next semester, I teach our Programming Languages and Paradigms course again, and this little story might add a touch of color to our first days together.

All this said, I am neither a historian of computer science nor a lexicographer. If you know of an earlier occurrence of the term "programming language" than Newell and Shaw's from January 1957, I would love to hear from you by email or on Twitter.


Posted by Eugene Wallingford | Permalink | Categories: Computing

December 20, 2019 1:45 PM

More Adventures in Constrained Programming: Elo Predictions

I like tennis. The Tennis Abstract blog helps me keep up with the game and indulge my love of sports stats at the same time. An entry earlier this month gave a gentle introduction to Elo ratings as they are used for professional tennis:

One of the main purposes of any rating system is to predict the outcome of matches--something that Elo does better than most others, including the ATP and WTA rankings. The only input necessary to make a prediction is the difference between two players' ratings, which you can then plug into the following formula:

1 - (1 / (1 + (10 ^ (difference / 400))))

This formula always makes me smile. The first computer program I ever wrote because I really wanted to was a program to compute Elo ratings for my high school chess club. Over the years I've come back to Elo ratings occasionally whenever I had an itch to dabble in a new language or even an old favorite. It's like a personal kata of variable scope.
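
In a language with floating-point numbers and a built-in exponentiation operator, the prediction is a one-liner. A quick sketch in Python, with a function name of my own choosing:

    def elo_win_probability(diff):
        # diff is the first player's rating minus the second player's;
        # the result is the expected probability that the first player wins
        return 1 - 1 / (1 + 10 ** (diff / 400))

    elo_win_probability(100)    # about 0.64: a 100-point favorite wins ~64% of the time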

I read the Tennis Abstract piece this week as my students were finishing up their compilers for the semester and as I was beginning to think of break. Playful me wondered how I might implement the prediction formula in my students' source language. It is a simple functional language with only two data types, integers and booleans; it has no loops, no local variables, no assignment statements, and no sequences. In another old post, I referred to this sort of language as akin to an integer assembly language. And, heaven help me, I love to program in integer assembly language.

To compute even this simple formula in Klein, I need to think in terms of fractions. The only division operator performs integer division, so 1/x for any x gives 0. I also need to think carefully about how to implement the exponentiation 10 ^ (difference / 400). The difference between two players' ratings is usually less than 400 and, in any case, almost never divisible by 400. So my program will have to take an arbitrary root of 10.

Which root? Well, I can use our gcd() function (implemented using Euclid's algorithm, of course) to reduce diff/400 to its lowest terms, n/d, and then compute the dth root of 10^n. Now, how to take the dth root of an integer for an arbitrary integer d?

Fortunately, my students and I have written code like this in various integer assembly languages over the years. For instance, we have a SQRT function that uses binary search to hone in on the integer closest to the square root of a given integer. Even better, one semester a student implemented a square root program that uses Newton's method:

   x_{n+1} = x_n - f(x_n) / f'(x_n)

That's just what I need! I can create a more general version of the function that uses Newton's method to compute an arbitrary root of an arbitrary base. Rather than work with floating-point numbers, I will implement the function to take its guess as a fraction, represented as two integers: a numerator and a denominator.
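
Here is the plan in rough outline, sketched in Python rather than Klein so that I can lean on the fractions module to stand in for my pairs of integers. The function names, the initial guess, and the stopping rule are all mine; a Klein version has to unroll the loop into recursion and pass the numerator and denominator around explicitly.

    from fractions import Fraction
    from math import gcd

    def tenth_power(diff):
        # approximate 10 ** (diff/400) for diff >= 0, using only rational arithmetic
        g = gcd(diff, 400)
        n, d = diff // g, 400 // g
        a = 10 ** n                          # we need the d-th root of this integer
        x = Fraction(10) ** (n // d + 1)     # crude first guess, always >= the true root
        while abs(x**d - a) * 1000 > a:      # stop once x**d is within 0.1% of a
            x = ((d - 1) * x + Fraction(a) / x**(d - 1)) / d    # Newton step
            x = x.limit_denominator(10**12)  # keep the integer pair from blowing up
        return x

    def predict(diff):
        # probability of a win for the player rated diff points higher
        p = 1 - Fraction(1) / (1 + tenth_power(abs(diff)))
        return float(p) if diff >= 0 else float(1 - p)

    predict(100)    # about 0.64, matching the formula quoted above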

This may seem like a lot of work, but that's what working in such a simple programming language is like. If I want my students' compilers to produce assembly language that predicts the result of a professional tennis match, I have to do the work.

This morning, I read a review of Francis Su's new popular math book, Mathematics for Human Flourishing. It reminds us that math isn't about rules and formulas:

Real math is a quest driven by curiosity and wonder. It requires creativity, aesthetic sensibilities, a penchant for mystery, and courage in the face of the unknown.

Writing my Elo rating program in Klein doesn't involve much mystery, and it requires no courage at all. It does, however, require some creativity to program under the severe constraints of a very simple language. And it's very much true that my little programming diversion is driven by curiosity and wonder. It's fun to explore ideas in a small space using limited tools. What will I find along the way? I'll surely make design choices that reflect my personal aesthetic sensibilities as well as the pragmatic sensibilities of a virtual machine that knows only integers and booleans.

As I've said before, I love to program.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

December 06, 2019 2:42 PM

OOP As If You Meant It

This morning I read an old blog post by Michael Feathers, The Flawed Theory Behind Unit Testing. It discusses what makes TDD and Clean Room software development so effective for writing code with fewer defects: they define practices that encourage developers to work in a continuous state of reflection about their code. The post holds up well ten years on.

The line that lit my mind up, though, was this one:

John Nolan gave his developers a challenge: write OO code with no getters.

Twenty-plus years after the movement of object-oriented programming into the mainstream, this still looks like a radical challenge to many people. "Whenever possible, tell another object to do something rather than ask for its data." This sort of behavioral abstraction is the heart of OOP and the source of its design power. Yet it is rare to find big Java or C++ systems where most classes don't provide public accessors. When you open that door, client code will walk in -- even if you are the person writing the client code.
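
A small, contrived illustration in Python -- mine, not Feathers's or Nolan's. The first version hands its data to the client and lets the client make the decision; the second keeps the decision behind a behavior, so no getter is needed:

    # Ask-for-data style: client code pulls the state out and makes the decision.
    class AccountRecord:
        def __init__(self, balance):
            self.balance = balance

    def withdraw_from(record, amount):
        # the rule lives outside the object; every caller that touches the
        # balance gets to reimplement it (or quietly break it)
        if record.balance >= amount:
            record.balance -= amount

    # Tell-don't-ask style: we tell the account what to do; no getter needed.
    class Account:
        def __init__(self, balance):
            self._balance = balance

        def withdraw(self, amount):
            # the account enforces its own rule and keeps its data to itself
            if amount > self._balance:
                raise ValueError("insufficient funds")
            self._balance -= amount

The difference looks cosmetic at this scale; the payoff comes when other objects can collaborate with Account knowing nothing beyond the services it provides.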

Whenever I look at a textbook intended for teaching undergraduates OOP, I look to see how it introduces encapsulation and the use of "getters" and "setters". I'm usually disappointed. Most CS faculty think doing otherwise would be too extreme for relative beginners. Once we open the door, though, it's a short step to using (gulp) instanceof to switch on kinds of objects. No wonder that some students are unimpressed and that too many students don't see much value in OO programming, which, as they learn it, doesn't feel much different from what they've done before but which puts new limitations on them.

To be honest, though, it is hard to go Full Metal OOP. Nolan was working with professional programmers, not second-year students, and even so programming without getters was considered a challenge for them. There are certainly circumstances in which the forces at play may drive us toward cracking the door a bit and letting an instance variable sneak out. Experienced programmers know how to decide when the trade-off is worth it. But our understanding of the downside of the trade-off improves after we know how to design independent objects that collaborate to solve problems without knowing anything about the other objects beyond the services they provide.

Maybe we need to borrow an idea from the TDD As If You Meant It crowd and create workshops and books that teach and practice OOP as if we really meant it. Nolan's challenge above would be one of the central tenets of this approach, along with the Polymorphism Challenge and other practices that look odd to many programmers but which are, in the end, the heart of OOP and the source of its design power.

~~~~~

If you like this post, you might enjoy The Summer Smalltalk Taught Me OOP. It isn't about OOP itself so much as about me throwing away systems until I got it right. But the reason I was throwing systems away was that I was still figuring out how to build an object-oriented system after years programming procedurally, and the reason I was learning so much was that I was learning OOP by building inside of Smalltalk and reading its standard code base. I'm guessing that code base still has a lot to teach many of us.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 10, 2019 11:06 AM

Three of the Hundred Falsehoods CS Students Believe

Jan Schaumann recently posted a list of one hundred Falsehoods CS Students (Still) Believe Upon Graduating. There is much good fun here, especially for a prof who tries to help CS students get ready for the world, and a fair amount of truth, too. I will limit my brief comments to three items that have been on my mind recently even before reading this list.

18. 'Email' and 'Gmail' are synonymous.

CS grads are users, too, and their use of Gmail, and systems modeled after it, contributes to the truths of modern email: top posting all the time, with never a thought of trimming anything. Two-line messages sitting atop icebergs of text which will never be read again, only stored in the seemingly infinite space given us for free.

Of course, some of our grads end up in corporate IT, managing email as merely one tool in a suite of lowest-common-denominator tools for corporate communication. The idea of email as a stream of text that can, for the most part, be read as such, is gone -- let alone the idea that a mail stream can be processed by programs such as procmail to great benefit.

I realize that most users don't ask for anything more than a simple Gmail filter to manage their mail experience, but I really wish it were easier for more users with programming skills to put those skills to good use. Alas, that does not fit into the corporate IT model, and not even the CS grads running many of these IT operations realize or care what is possible.

38. Employers care about which courses they took.

It's the time of year when students register for spring semester courses, so I've been meeting with a lot of students. (Twice as many as usual, covering for a colleague on sabbatical.) It's interesting to encounter students on both ends of the continuum between not caring at all what courses they take and caring a bit too much. The former are so incurious I wonder how they fell into the major at all. The latter are often more curious but sometimes are captive to the idea that they must, must, must take a specific course, even if it meets at a time they can't attend or is full by the time they register.

I do my best to help them get into these courses, either this spring or in a later semester, but I also try to do a little teaching along the way. Students will learn useful and important things in just about every course they take, if they want to, and taking any particular course does not have to be either the beginning or the end of their learning of that topic. And if the reason they think they must take a particular course is because future employers will care, they are going to be surprised. Most of the employers who interview our students are looking for well-rounded CS grads who have a solid foundation in the discipline and who can learn new things as needed.

90. Two people with a CS degree will have a very similar background and shared experience/knowledge.

This falsehood operates in a similar space to #38, but at the global level I reached at the end of my previous paragraph. Even students who take most of the same courses together will usually end their four years in the program with very different knowledge and experiences. Students connect with different things in each course, and these idiosyncratic memories build on one another in subsequent courses. They participate in different extracurricular activities and work different part-time jobs, both of which shape and augment what they learn in class.

In the course of advising students over two, three, or four years, I try to help them see that their studies and other experiences are helping them to become interesting people who know more than they realize and who are individuals, different in some respects from all their classmates. They will be able to present themselves to future employers in ways that distinguish them from everyone else. That's often the key to getting the job they desire now, or perhaps one they didn't even realize they were preparing for while exploring new ideas and building their skillsets.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

September 13, 2019 3:12 PM

How a government boondoggle paved the way for the expansion of computing

In an old interview at Alphachatterbox, economist Brad DeLong adds another programming tale to the annals of unintended consequences:

So the Sage Air Defense system, which never produced a single usable line of software running on any piece of hardware -- we spent more on the Sage Air Defense System than we did on the entire Manhattan Project. And it was in one sense the ultimate government Defense Department boondoggle. But on the other hand it trained a whole generation of computer programmers at a time when very little else was useful that computer programmers could exercise their skills on.
And by the time the 1960s rolled around we not only ... the fact that Sage had almost worked provided say American Airlines with the idea that maybe they should do a computer-driven reservations system for their air travel, which I think was the next big Manhattan Project-scale computer programming project.
And as that moved on the computer programmers began finding more and more things to do, especially after IBM developed its System 360.
And we were off and running.

As DeLong says earlier in the conversation, this development upended IBM president Thomas Watson's alleged claim that there was "a use for maybe five computers in the world". This famous quote is almost certainly an urban legend, but Watson would not have been as off-base as people claim even if he had said it. In the 1950s, there was not yet a widespread need for what computers did, precisely because most people did not yet understand how computing could change the landscape of every activity. Training a slew of programmers for a project that ultimately failed had the unexpected consequence of creating the intellectual and creative capital necessary to begin exploring the ubiquitous applications of computing. Money unexpectedly well spent.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 12, 2019 3:57 PM

Pain and Shame

Today's lecture notes for my course include a link to @KentBeck's article on Prune, which I still enjoy.

The line that merits its link in today's session is:

We wrote an ugly, fragile state machine for our typeahead, which quickly became a source of pain and shame.

My students will likely soon experience those emotions about the state machines they are building for the lexers in their semester-long compiler project. I reassure them: These emotions are normal for programmers.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

August 30, 2019 4:26 PM

Unknown Knowns and Explanation-Based Learning

Like me, you probably see references to this classic quote from Donald Rumsfeld all the time:

There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say, we know there are some things we do not know. But there are also unknown unknowns -- the ones we don't know we don't know.

I recently ran across it again in an old Epsilon Theory post that uses it to frame the difference between decision making under risk (the known unknowns) and decision-making under uncertainty (the unknown unknowns). It's a good read.

Seeing the passage again for the umpteenth time, it occurred to me that no one ever seems to talk about the fourth quadrant in that grid: the unknown knowns. A quick web search turns up a few articles such as this one, which consider unknown knowns from the perspective of others in a community: maybe there are other people who know something that you do not. But my curiosity was focused on the first-person perspective that Rumsfeld was implying. As a knower, what does it mean for something to be an unknown known?

My first thought was that this combination might not be all that useful in the real world, such as the investing context that Ben Hunt writes about in Epsilon Theory. Perhaps it doesn't make any sense to think about things you don't know that you know.

As a student of AI, though, I suddenly made an odd connection ... to explanation-based learning. As I described in a blog post twelve years ago:

Back when I taught Artificial Intelligence every year, I used to relate a story from Russell and Norvig when talking about the role knowledge plays in how an agent can learn. Here is the quote that was my inspiration, from Pages 687-688 of their 2nd edition:

Sometimes one leaps to general conclusions after only one observation. Gary Larson once drew a cartoon in which a bespectacled caveman, Zog, is roasting his lizard on the end of a pointed stick. He is watched by an amazed crowd of his less intellectual contemporaries, who have been using their bare hands to hold their victuals over the fire. This enlightening experience is enough to convince the watchers of a general principle of painless cooking.

I continued to use this story long after I had moved on from this textbook, because it is a wonderful example of explanation-based learning.

In a mathematical sense, explanation-based learning isn't learning at all. The new fact that the program learns follows directly from other facts and inference rules already in its database. In EBL, the program constructs a proof of a new fact and adds the fact to its database, so that it is ready-at-hand the next time it needs it. The program has compiled a new fact, but in principle it doesn't know anything more than it did before, because it could always have deduced that fact from things it already knows.
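
Here is a caricature of that mechanism in Python, entirely my own and heavily simplified: facts and rules live in a small knowledge base, and once a goal has been proved from the rules, it is added to the fact base so the chain of inference never has to be rebuilt.

    class KnowledgeBase:
        def __init__(self, facts, rules):
            self.facts = set(facts)    # known facts
            self.rules = rules         # list of (premises, conclusion) pairs, assumed acyclic

        def prove(self, goal):
            if goal in self.facts:     # already compiled: no inference needed
                return True
            for premises, conclusion in self.rules:
                if conclusion == goal and all(self.prove(p) for p in premises):
                    self.facts.add(goal)   # the EBL step: remember what we deduced
                    return True
            return False

    kb = KnowledgeBase(
        facts={"roasts-food-on-stick(zog)"},
        rules=[(("roasts-food-on-stick(zog)",), "cooks-without-burned-hands(zog)")])

    kb.prove("cooks-without-burned-hands(zog)")   # True -- and now cached as a plain fact

A real EBL system also generalizes the explanation -- Zog's watchers learn about pointed sticks in general, not just Zog's stick -- but the caching step is the part described above.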

As I read the Epsilon Theory article, it struck me that EBL helps a learner to surface unknown knowns by using specific experiences as triggers to combine knowledge it already has into a piece of knowledge that is usable immediately, without having to repeat the (perhaps costly) chain of inference ever again. Deducing deep truths every time you need them can indeed be quite costly, as anyone who has ever looked at the complexity of search in logical inference systems can tell you.

When I begin to think about unknown knowns in this way, perhaps it does make sense in some real-world scenarios to think about things you don't know you know. If I can figure it all out, maybe I can finally make my fortune in the stock market.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

August 08, 2019 2:42 PM

Encountering an Old Idea Three Times in Ten Days

I hope to eventually write up a reflection on my first Dagstuhl seminar, but for now I have a short story about how I encountered a new idea three times in ten days, purely by coincidence. Actually, the idea is over one hundred fifty years old but, as my brother often says, "Hey, it's new to me."

On the second day of Dagstuhl, Mark Guzdial presented a poster showing several inspirations for his current thinking about task-specific programming languages. In addition to displaying screenshots of two cool software tools, the poster included a picture of an old mechanical device that looked both familiar and strange. Telegraphy had been invented in the early 1840s, and telegraph operators needed some way to type messages. But how? The QWERTY keyboard was not created for the typewriter until the early 1870s, and no other such devices were in common use yet. To meet the need, Royal Earl House adapted a portion of a piano keyboard to create the input device for the "printing telegraph", or teleprinter. The photo on Mark's poster looked similar to the one on the Wikipedia page for the teleprinter.

There was a need for a keyboard thirty years before anyone designed a standard typing interface, so telegraphers adapted an existing tool to fit their needs. What if we are in that same thirty-year gap in the design of programming languages? This has been one of Mark's inspirations as he works with non-computer scientists on task-specific programming languages. I had never seen an 1870s teleprinter before and thought its keyboard to be a rather ingenious way to solve a very specific problem with a tool borrowed from another domain.

When Dagstuhl ended, my wife and I spent another ten days in Europe on a much-needed vacation. Our first stop was Paris, and on our first full day there we visited the museum of the Conservatoire National des Arts et Métiers. As we moved into the more recent exhibits of the museum, what should I see but...

a Hughes teleprinter with piano-style keyboard, circa 1875, in the CNAM museum, Paris

... a Hughes teleprinter with piano-style keyboard, circa 1875. Déjà vu! I snapped a photo, even though the device was behind glass, and planned to share it with Mark when I got home.

We concluded our vacation with a few days in Martinici, Montenegro, the hometown of a department colleague and his wife. They still have a lot of family in the old country and spend their summers there working and relaxing. On our last day in this beautiful country, we visited its national historical museum, which is part of the National Museum of Montenegro in the royal capital of Cetinje. One of the country's most influential princes was a collector of modern technology, and many of his artifacts are in the museum -- including:

a teleprinter with piano-style keyboard in the Historical Museum of Montenegro, Cetinje

This full-desk teleprinter was close enough to touch and examine up close. (I didn't touch!) The piano keyboard on the device shows the wear of heavy use, which brings to mind each of my laptops' keyboards after a couple of years. Again, I snapped a photo, this time in fading light, and made a note to pass it on.

In ten days, I went from never having heard much about a "printing telegraph" to seeing a photo of one, hearing how it is an inspiration for research in programming language design, and then seeing two such devices that had been used in the 19th-century heyday of telegraphy. It was an unexpected intersection of my professional and personal lives. I must say, though, that having heard Mark's story made the museum pieces leap into my attention in a way that they might not have otherwise. The coincidence added a spark to each encounter.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

August 02, 2019 2:48 PM

Programming is an Infinite Construction Kit

As he so often did, Marvin Minsky loved to tell us about the beauty of programming. Kids love to play with construction sets like Legos, TinkerToys, and Erector sets. Programming provides an infinite construction kit: you never run out of parts!

In the linked essay, which was published as a preface to a 1986 book about Logo, Minsky tells several stories. One of the stories relates that once, as a small child, he built a large tower out of TinkerToys. The grownups who saw it were "terribly impressed". He inferred from their reaction that:

some adults just can't understand how you can build whatever you want, so long as you don't run out of sticks and spools.

Kids get it, though. Why do so many of us grow out of this simple understanding as we get older? Whatever its cause, this gap between children's imaginations and the imaginations of adults around them creates a new sort of problem when we give the children a programming language such as Logo or Scratch. Many kids take to these languages just as they do to Legos and TinkerToys: they're off to the races making things, limited only by their expansive imaginations. The memory on today's computers is so large that children never run out of raw material for writing programs. But adults often don't possess the vocabulary for talking with the children about their creations!

... many adults just don't have words to talk about such things -- and maybe, no procedures in their heads to help them think of them. They just do not know what to think when little kids converse about "representations" and "simulations" and "recursive procedures". Be tolerant. Adults have enough problems of their own.

Minsky thinks there are a few key ideas that everyone should know about computation. He highlights two:

Computer programs are societies. Making a big computer program is putting together little programs.

Any computer can be programmed to do anything that any other computer can do--or that any other kind of "society of processes" can do.

He explains the second using ideas pioneered by Alan Turing and long championed in the popular sphere by Douglas Hofstadter. Check out this blog post, which reflects on a talk Hofstadter gave at my university celebrating the Turing centennial.

The inability of even educated adults to appreciate computing is a symptom of a more general problem. As Minsky says toward the end of his essay, people who don't appreciate how simple things can grow into entire worlds are missing something important. If you don't understand how simple things can grow into complex systems, it's hard to understand much at all about modern science, including how quantum mechanics accounts for what we see in the world and even how evolution works.

You can usually do well by reading Minsky; this essay is a fine example of that. It comes linked to an afterword written by Alan Kay, another computer scientist with a lot to say about both the beauty of computing and its essential role in a modern understanding of the world. Check both out.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

July 05, 2019 12:40 PM

A Very Good Reason to Leave Your Home and Move to a New Country

He applied to switch his major from mathematics to computer science, but the authorities forbade it. "That is what tipped me to accept the idea that perhaps Russia is not the best place for me," he says. "When they wouldn't allow me to study computer science."

-- Sergey Aleynikov, as told to Michael Lewis and reported in Chapter 5 of Flash Boys.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

June 21, 2019 2:35 PM

Computing Everywhere, Sea Mammal Edition

In The Narluga Is a Strange Beluga-Narwhal Hybrid, Ed Yong tells the story of a narluga, the offspring of a beluga father and a narwhal mother:

Most of its DNA was a half-and-half mix between the two species, but its mitochondrial DNA -- a secondary set that animals inherit only from their mothers -- was entirely narwhal.

This strange hybrid had a mouth and teeth unlike either of its parents, the product of an unexpected DNA computation:

It's as if someone took the program for creating a narwhal tusk and ran it in a beluga's mouth.

The analogy to software doesn't end there, though...

There's something faintly magical about that. This fluky merger between two species ended up with a mouth that doesn't normally exist in nature but still found a way of using it. It lived neither like a beluga nor a narwhal, but it lived nonetheless.

Fluky and abnormal; a one-off, yet it adapts and survives. That sounds like a lot of the software I've used over the years and, if I'm honest, like some of the software I've written, too.

That said, nature is amazing.


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 20, 2019 3:51 PM

Implementing a "Read Lines" Operator in Joy

I wasn't getting any work done today on my to-do list, so I decided to write some code.

One of my learning exercises to open the Summer of Joy is to solve the term frequency problem from Crista Lopes's Exercises in Programming Style. Joy is a little like Scheme: it has a lot of cool operations, especially higher-order operators, but it doesn't have much in the way of practical tools for basic tasks like I/O. To compute term frequencies on an arbitrary file, I need to read the file onto Joy's stack.

I played around with Joy's low-level I/O operators for a while and built a new operator called readfile, which expects the pathname for an input file on top of the stack:

    DEFINE readfile ==
        (* 1 *)  [] swap "r" fopen
        (* 2 *)  [feof not] [fgets swap swonsd] while
        (* 3 *)  fclose.

The first line leaves an empty list and an input stream object on the stack. Line 2 reads lines from the file and conses them onto the list until it reaches EOF, leaving a list of lines under the input stream object on the stack. The last line closes the stream and pops it from the stack.

This may not seem like a big deal, but I was beaming when I got it working. First of all, this is my first while in Joy, which requires two quoted programs. Second, and more meaningful to me, the loop body not only works in terms of the dip idiom I mentioned in my previous post, it even uses the higher-order swonsd operator to implement the idiom. This must be how I felt the first time I mapped an anonymous lambda over a list in Scheme.

readfile leaves a list of lines on the stack. Unfortunately, the list is in reverse order: the last line of the file is the front of the list. Besides, given that Joy is a stack-based language, I think I'd like to have the lines on the stack itself. So I noodled around some more and implemented the operator pushlist:

    DEFINE pushlist ==
        (* 1 *)  [ null not ] [ uncons ] while
        (* 2 *)  pop.

Look at me... I get one loop working, so I write another. The loop on Line 1 iterates over a list, repeatedly taking (head . tail) and pushing head and tail onto the stack in that order. Line 2 pops the empty list after the loop terminates. The result is a stack with the lines from the file in order, first line on top:

    line-n ... line-3 line-2 line-1

Put readfile and pushlist together:

    DEFINE fileToStack == readfile pushlist.
and you get fileToStack, something like Python's readlines() function, but in the spirit of Joy: the file's lines are on the stack ready to be processed.
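
If you don't read Joy, here is a rough Python analogue of the net effect -- only a sketch, with Joy's stack modeled as a Python list whose last element is the top:

    # A rough analogue of fileToStack, not a translation of the Joy code.
    def file_to_stack(path):
        with open(path) as f:
            lines = f.readlines()        # first line ... last line
        stack = []
        for line in reversed(lines):     # push the last line first ...
            stack.append(line)
        return stack                     # ... so the first line ends up on top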

I'll admit that I'm pleased with myself, but I suspect that this code can be improved. Joy has a lot of dandy higher-order operators. There is probably a better way to implement pushlist and maybe even readfile. I won't be surprised if there is a more idiomatic way to implement the two operations that makes them plug together with less rework. And I may find that I don't want to leave bare lines of text on the stack after all and would prefer having a list of lines. Learning whether I can improve the code, and how, is a task for another day.

My next job for solving the term frequency problem is to split the lines into individual words, canonicalize them, and filter out stop words. Right now, all I know is that I have two more functions in my toolbox, I learned a little Joy, and writing some code made my day better.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

June 11, 2019 3:04 PM

Summer of Joy

"Elementary" ideas are really hard & need to be revisited
& explored & re-revisited at all levels of mathematical
sophistication. Doing so actually moves math forward.

-- James Tanton

Three summers ago, I spent a couple of weeks re-familiarizing myself with the concatenative programming language Joy and trying to go a little deeper with the style. I even wrote a few blog entries, including a few quick lessons I learned in my first week with the language. Several of those lessons hold up, but please don't look at the code linked there; it is the raw code of a beginner who doesn't yet get the idioms of the style or the language. Then other duties at work and home pulled me away, and I never made the time to get back to my studies.

my Summer of Joy folder

I have dubbed this the Summer of Joy. I can't devote the entire summer to concatenative programming, but I'm making a conscious effort to spend a couple of days each week in real study and practice. After only one week, I have created enough forward momentum that I think about problems and solutions at random times of the day, such as while walking home or making dinner. I think that's a good sign.

An even better sign is that I'm starting to grok some of the idioms of the style. Joy is different from other concatenative languages like Forth and Factor, but it shares the mindset of using stack operators effectively to shape the data a program uses. I'm finally starting to think in terms of dip, an operator that enables a program to manipulate data just below the top of the stack. As a result, a lot of my code is getting smaller and beginning to look like idiomatic Joy. When I really master dip and begin to think in terms of other "dipping" operators, I'll know I'm really on my way.

One of my goals for the summer is to write a Joy compiler from scratch that I can use as a demonstration in my fall compiler course. Right now, though, I'm still in Joy user mode and am getting the itch for a different sort of language tool... As my Joy skills get better, I find myself refactoring short programs I've written in the past. How can I be sure that I'm not breaking the code? I need unit tests!

So my first bit of tool building is to implement a simple JoyUnit. As a tentative step in this direction, I created the simplest version of RackUnit's check-equal? function possible:

    DEFINE check-equal == [i] dip i =.
This operator takes two quoted programs (a test expression and an expected result), executes them, and compares the results. For example, this test exercises a square function:
    [ 2 square ] [ 4 ] check-equal.

This is, of course, only the beginning. Next I'll add a message to display when a test fails, so that I can tell at a glance which tests have failed. Eventually I'll want my JoyUnit to support tests as objects that can be organized into suites, so that their results can be tallied, inspected, and reported on. But for now, YAGNI. With even a few simple functions like this one, I am able to run tests and keep my code clean. That's a good feeling.

To top it all off, implementing JoyUnit will force me to practice writing Joy and push me to extend my understanding while growing the set of programming tools I have at my disposal. That's another good feeling, and one that might help me keep my momentum as a busy summer moves on.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

May 07, 2019 11:15 AM

A PL Design Challenge from Alan Kay

In an answer on Quora from earlier this year:

There are several modern APL-like languages today -- such as J and K -- but I would criticize them as being too much like the classic APL. It is possible to extract what is really great from APL and use it in new language designs without being so tied to the past. This would be a great project for some grad students of today: what does the APL perspective mean today, and what kind of great programming language could be inspired by it?

The APL perspective was even more radical twenty years ago, before MapReduce became a thing and before functional programming ascended. When I was an undergrad, though, it seemed otherworldly: setting up a structure, passing it through a sequence of operators that changed its shape, and then passing it through a sequence of operators that folded up a result. We knew we weren't programming in Fortran anymore.
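
To show what I mean by that dataflow feel, here is a small sketch in Python using NumPy -- my stand-in here, not APL -- that shapes a structure, transforms it with whole-array operators, and then folds up a result:

    # APL-flavored dataflow in NumPy: no element-at-a-time loops in sight.
    import numpy as np

    data   = np.arange(1, 25)        # a flat structure of 24 numbers ...
    table  = data.reshape(4, 6)      # ... reshaped into a 4-by-6 table
    scaled = table * table           # an operator applied to the whole array
    result = scaled.sum(axis=1)      # folded up: one sum per row
    print(result)                    # [ 91 559 1459 2791]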

I'm still fascinated by APL, but I haven't done a lot with it in the intervening years. These days I'm still thinking about concatenative programming in languages like Forth, Factor, and Joy, a project I reinitiated (and last blogged about) three summers ago. Most concatenative languages work with an implicit stack, which gives them a very different feel from APL's dataflow style. I can imagine, though, that working in the concision and abstraction of concatenative languages for a while will spark my interest in diving back into APL-style programming some day.

Kay's full answer is worth a read if only for the story in which he connects Iverson's APL notation, and its effect on how we understand computer systems, to the evolution of Maxwell's equations. Over the years, I've heard Kay talk about McCarthy's Lisp interpreter as akin to Maxwell's equations, too. In some ways, the analogy works even better with APL, though it seems that the lessons of Lisp have had a larger historical effect to date.

Perhaps that will change? Alas, as Kay says in the paragraph that precedes his challenge:

As always, time has moved on. Programming language ideas move much slower, and programmers move almost not at all.

Kay often comes off as pessimistic, but after all the computing history he has lived through (and created!), he has earned whatever pessimism he feels. As usual, reading one of his essays makes me want to buckle down and do something that would make him proud.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

April 29, 2019 2:42 PM

The Path to Nothing

Dick Gabriel writes, in Lessons From The Science of Nothing At All:

Nevertheless, the spreadsheet was something never seen before. A chart indicating the 64 greatest events in accounting and business history contains VisiCalc.

This reminds me of a line from The Tao of Pooh:

Take the path to Nothing, and go Nowhere until you reach it.

A lot of research is like this, but even more so in computer science, where the things we produce are generally made out of nothing. Often, like VisiCalc, they aren't really like anything we've ever seen or used before either.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

April 16, 2019 3:40 PM

The Importance of Giving Credit in Context

From James Propp's Prof. Engel's Marvelously Improbable Machines:

Chip-firing has been rediscovered independently in three different academic communities: mathematics, physics, and computer science. However, its original discovery by Engel is in the field of math education, and I strongly feel that Engel deserves credit for having been the first to slide chips around following these sorts of rules. This isn't just for Engel's sake as an individual; it's also for the sake of the kind of work that Engel did, blending his expertise in mathematics with his experience in the classroom. We often think of mathematical sophistication as something that leads practitioners to create concepts that can only be understood by experts, but at the highest reaches of mathematical research, there's a love of clarity that sees the pinnacle of sophistication as being the achievement of hard-won simplicity in settings where before there was only complexity.

First of all, Petri nets! I encountered Petri nets for the first time in a computer architecture course, probably as a master's student, and they immediately became my favorite thing about the course. I was never much into hardware and architecture, but Petri nets showed me a connection back to graph theory, which I loved. Later, I studied how to apply temporal logic to modeling hardware and found another way to appreciate my architecture courses.

But I really love the point that Propp makes in this paragraph and the section it opens. Most people think of research and teaching as being different sorts of activities. But the kind of thinking one does in one often crosses over into the other. The sophistication that researchers have and use helps us make sense of complex ideas and, at its best, helps us communicate that understanding to a wide audience, not just to researchers at the same level of sophistication. The focus that teachers put on communicating challenging ideas to relative novices can encourage us to seek new formulations for a complex idea and ways to construct more complex ideas out of the new formulations. Sometimes, that can lead to an insight we can use in research.

In recent years, my research has benefited a couple times from trying to explain and demonstrate concatenative programming, as in Forth and Joy, to my undergraduate students. These haven't been breakthroughs of the sort that Engel made with his probability machines, but they've certainly helped me grasp in new ways ideas I'd been struggling with.

Propp argues convincingly that it's important that we tell stories like Engel's and recognize that his breakthrough came as a result of his work in the classroom. This might encourage more researchers to engage as deeply with their teaching as with their research. Everyone will benefit.

Do you know any examples similar to the one Propp relates, but in the field of computer science? If so, I would love to hear about them. Drop me a line via email or Twitter.

Oh, and if you like Petri nets, probability, or fun stories about teaching, do read Propp's entire piece. It's good fun and quite informative.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 10, 2019 10:53 AM

Weekend Shorts

Andy Ko, in SIGCSE 2019 report:

I always have to warn my students before they attend SIGCSE that it's not a place for deep and nuanced discussions about learning, nor is it a place to get critical feedback about their ideas.
It is, however, a wonderful place to be immersed in the concerns of CS teachers and their perceptions of evidence.

I'm not sure I agree that one can't have deep, nuanced discussions about learning at SIGCSE, but it certainly is not a research conference. It is a great place to talk to and learn from people in the trenches teaching CS courses, with a strong emphasis on the early courses. I have picked up a lot of effective, creative, and inspiring ideas at SIGCSE over the years. Putting them onto sure scientific footing is part of my job when I get back.

~~~~~

Stephen Kell, in Some Were Meant for C (PDF), an Onward! 2017 essay:

Unless we can understand the real reasons why programmers continue to use C, we risk researchers continuing to solve a set of problems that is incomplete and/or irrelevant, while practitioners continue to use flawed tools.

For example,

... "faster safe languages" is seen as the Important Research Problem, not better integration.

... whereas Kell believes that C's superiority in the realm of integration is one of the main reasons that C remains a dominant, essential systems language.

Even with the freedom granted by tenure, academic culture tends to restrict what research gets done. One cause is a desire to publish in the best venues, which encourages work that is valued by certain communities. Another reason is that academic research tends to attract people who are interested in a certain kind of clean problem. CS isn't exactly "round, spherical chickens in a vacuum" territory, but... Language support for system integration, interop, and migration can seem like a grungier sort of work than most researchers envisioned when they went to grad school.

"Some Were Meant for C" is an elegant paper, just the sort of work, I imagine, that Richard Gabriel had when envisioned the essays track at Onward. Well worth a read.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

February 28, 2019 4:29 PM

Ubiquitous Distraction

This morning, while riding the exercise bike, I read two items within twenty minutes or so that formed a nice juxtaposition for our age. First came The Cost of Distraction, an old blog post by L.M. Sacasas that reconsiders Kurt Vonnegut's classic story, "Harrison Bergeron" (*). In the story, it is 2081, and the Handicapper General of the United States ensures equality across the land by offsetting any advantages any individual has over the rest of the citizenry. In particular, those of above-average intelligence are required to wear little earpieces that periodically emit high-pitched sounds to obliterate any thoughts in progress. The mentally- and physically-gifted Harrison rebels, to an ugly end.

Soon after came Ian Bogost's Apple's AirPods Are an Omen, an article from last year that explores the cultural changes that are likely to ensue as more and more people wear AirPods and their ilk. ("Apple's most successful products have always done far more than just make money, even if they've raked in a lot of it....") AirPods free the wearer in so many ways, but they also bind us to ubiquitous distraction. Will we ever have a free moment to think deeply when our phones and laptops now reside in our heads?

As Sacasas says near the end of his post,

In the world of 2081 imagined by Vonnegut, the distracting technology is ruthlessly imposed by a government agency. We, however, have more or less happily assimilated ourselves to a way of life that provides us with regular and constant distraction. We have done so because we tend to see our tools as enhancements.

Who needs a Handicapper General when we all walk down to the nearest Apple Store or Best Buy and pop distraction devices into our own ears?

Don't get me wrong. I'm a computer scientist, and I love to program. I also love the productivity my digital tools provide me, as well as the pleasure and comfort they afford. I'm not opposed to AirPods, and I may be tempted to get a pair someday. But there's a reason I don't carry a smart phone and that the only iPod I've ever owned is a 1GB first-gen Shuffle. Downtime is valuable, too.

(*) By now, even occasional readers know that I'm a big Vonnegut fan who wrote a short eulogy on the occasion of his death, nearly named this blog after one of his short stories, and returns to him frequently.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

December 29, 2018 4:41 PM

No Big Deal

I love this line from Organizational Debt:

So my proposal for Rust 2019 is not that big of a deal, I guess: we just need to redesign our decision making process, reorganize our governance structures, establish new norms of communication, and find a way to redirect a significant amount of capital toward Rust contributors.

A solid understatement usually makes me smile. Decision-making processes, governance structure, norms of communication, and compensation for open-source developers... no big deal, indeed. We all await the results. If the results come with advice that generalizes beyond a single project, especially the open-source compensation thing, all the better.

Communication is a big part of the recommendation for 2019. Changing how communication works is tough in any organization, let alone an organization with distributed membership and leadership. In every growing organization there eventually comes the time for intentional systems of communication:

But we've long since reached the point where coordinating our design vision by osmosis is not working well. We need an active and intentional circulatory system for information, patterns, and frameworks of decision making related to design.

I'm not a member of the Rust community, only an observer. But I know that the language inspires some programmers, and I learned a bit about its tool chain and community support a couple of years ago when an ambitious student used it successfully to implement his compiler in my course. It's the sort of language we need, being created in what looks to be an admirable way. I wish the Rust team well as they tackle their organizational debt and work through their growing pains.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

December 24, 2018 2:55 PM

Using a Text Auto-Formatter to Enhance Human Communication

More consonance with Paul Romer, via his conversation with Tyler Cowen: they were discussing how much harder it is to learn to read English than other languages, due to its confusing orthography and in particular the mismatch between sounds and their spellings. We could adopt a more rational way to spell words, but it's hard to change the orthography of a language spoken by a large, scattered population. Romer offered a computational solution:

It would be a trivial translation problem to let some people write in one spelling form, others in the other because it would be word-for-word translation. I could write you an email in rationalized spelling, and I could put it through the plug-in so you get it in traditional spelling. This idea that it's impossible to change spelling I think is wrong. It's just, it's hard, and we should -- if we want to consider this -- we should think carefully about the mechanisms.

This sounds similar to a common problem and solution in the software development world. Programmers working in teams often disagree about the orthography of code, not the spelling so much as its layout, the use of whitespace, and the placement of punctuation. Being programmers, we often address this problem computationally. Team members can stylize their code any way they see fit but, when they check it into the common repository, they run it through a language formatter. Often, these formatters are built into our IDEs. Nowadays, some languages even come with a built-in formatting tool, as Go does with gofmt.

Romer's email plug-in would play a similar role in human-to-human communication, enabling writers to use different spelling systems concurrently. This would make it possible to introduce a more rational way to spell words without having to migrate everyone to the new system all at once. There are still challenges to making such a big change, but they could be handled in an evolutionary way.
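
Romer is right that the core mechanism is nearly trivial. Here is a toy sketch in Python of the word-for-word idea; the spelling pairs are invented purely for illustration, not taken from any real proposal:

    # A toy respelling plug-in: word-for-word translation via a lookup table.
    # The spelling pairs below are invented for illustration only.
    RATIONALIZED_TO_TRADITIONAL = {
        "thru": "through",
        "tho": "though",
        "enuf": "enough",
    }

    def to_traditional(text):
        """Rewrite rationalized spellings into traditional ones, word by word."""
        return " ".join(RATIONALIZED_TO_TRADITIONAL.get(word, word)
                        for word in text.split())

    print(to_traditional("it was hard enuf to get thru the book"))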

Maybe Romer's study of Python is turning him into a computationalist! Certainly, being a programmer can help a person recognize the possibility of a computational solution.

Add this idea to his recent discovery of C.S. Peirce, and I am feeling some intellectual kinship to Romer, at least as much as an ordinary CS prof can feel kinship to a Nobel Prize-winning economist. Then, to top it all off, he lists Slaughterhouse-Five as one of his two favorite novels. Long-time readers know I'm a big Vonnegut fan and nearly named this blog for one of his short stories. Between Peirce and Vonnegut, I can at least say that Romer and I share some of the same reading interests. I like his tastes.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

December 23, 2018 10:45 AM

The Joy of Scholarship

This morning I read Tyler Cowen's conversation with Paul Romer. At one point, Romer talks about being introduced to C.S. Peirce, who had deep insights into "abstraction and how we use abstraction to communicate" (a topic Romer and Cowen discuss earlier in the interview). Romer is clearly enamored with Peirce's work, but he's also fascinated by the fact that, after a long career thinking about a set of topics, he could stumble upon a trove of ideas that he didn't even know existed:

... one of the joys of reading -- that's not a novel -- but one of the joys of reading, and to me slightly frightening thing, is that there's so much out there, and that a hundred years later, you can discover somebody who has so many things to say that can be helpful for somebody like me trying to understand, how do we use abstraction? How do we communicate clearly?

But the joy of scholarship -- I think it's a joy of maybe any life in the modern world -- that through reading, we can get access to the thoughts of another person, and then you can sample from the thoughts that are most relevant to you or that are the most powerful in some sense.

This process, he says, is the foundation for how we transmit knowledge within a culture and across time. It's how we grow and share our understanding of the world. This is a source of great joy for scholars and, really, for anyone who can read. It's why so many people love books.

Romer's interest in Peirce calls to mind my own fascination with his work. As Romer notes, Peirce had a "much more sophisticated sense about how science proceeds than the positivist sort of machine that people describe". I discovered Peirce through an epistemology course in graduate school. His pragmatic view of knowledge, along with William James's views, greatly influenced how I thought about knowledge. That, in turn, redefined the trajectory by which I approached my research in knowledge-based systems and AI. Peirce and James helped me make sense of how people use knowledge, and how computer programs might.

So I feel a great kinship with Romer in his discovery of Peirce, and the joy he finds in scholarship.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

October 02, 2018 4:04 PM

Strange Loop 5: Day Two

the video screen announcing Philip Wadler's talk

Friday was a long day, but a good one. The talks I saw were a bit more diverse than on Day One: a couple on language design (though even one of those covered a lot more ground than that), one on AI, one on organizations and work-life, and one on theory:

• "All the Languages Together", by Amal Ahmed, discussed a problem that occurs in multi-language systems: when code written in one language invalidates the guarantees made by code written in the other. Most languages are not designed with this sort of interoperability baked in, and their FFI escape hatches make anything possible within foreign code. As a potential solution, Ahmed offered principled escape hatches designed with specific language features in mind. The proposed technique seems like it could be a lot of work, but the research is in its early stages, so we will learn more as she and her students implement the idea.

This talk is yet another example of how so many of our challenges in software engineering are a result of programming language design. It's good to see more language designers taking issues like these seriously, but we have a long way to go.

• I really liked Ashley Williams's talk on the evolution of async in JavaScript and Rust. This kind of talk is right up my alley... Williams invoked philosophy, morality, and cognitive science as she reviewed how two different language communities incorporated asynchronous primitives into their languages. Programming languages are designed, to be sure, but they are also the result of "contingent turns of history" (à la Foucault). Even though this turned out to be more of a talk about the Rust community than I had expected, I enjoyed every minute. Besides, how can you not like a speaker who says, "Yes, sometimes I'll dress up as a crab to teach."?

(My students should not expect a change in my wardrobe any time soon...)

• I also enjoyed "For AI, by AI", by Connor Walsh. The talk's subtitle, "Freedom & Evolution of the Algopoetic Avant-Garde", was a bit disorienting, as was its cold open, but the off-kilter structure of the talk was easy enough to discern once Walsh got going: first, a historical review of humans making computers write poetry, followed by a look at something I didn't know existed... a community of algorithmic poets — programs — that write, review, and curate poetry without human intervention. It's a new thing, of Walsh's creation, that looks pretty cool to someone who became drunk on the promise of AI many years ago.

I saw two other talks the second day:

  • the after-lunch address by Philip Wadler, "Categories for the Working Hacker", which I wrote about separately
  • Rachel Krol's Some Things May Never Get Fixed, about how organizations work and how developers can thrive despite how they work

I wish I had more to say about the last talk but, with commitments at home, the long drive beckoned. So, I departed early, sadly, hopped in my car, headed west, and joined the mass exodus that is St. Louis traffic on a Friday afternoon. After getting past the main crush, I was able to relax a bit with the rest of Zen and the Art of Motorcycle Maintenance.

Even a short day at Strange Loop is a big win. This was the tenth Strange Loop, and I think I've been to five, or at least that's what my blog seems to tell me. It is awesome to have a conference like this in Middle America. We who live here benefit from the opportunities it affords us, and maybe folks in the rest of the world get a chance to see that not all great computing ideas and technology happen on the coasts of the US.

When is Strange Loop 2019?


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

October 01, 2018 7:12 PM

Strange Loop 4: The Quotable Wadler

Philip Wadler is a rockstar to the Strange Loop crowd. His 2015 talk on propositions as types introduced not a few developers to one of computer science's great unities. This year, he returned to add a third idea to what is really a triumvirate: categories. With a little help from his audience, he showed that category theory has elements which correspond directly to ...

  • logical 'and', which models the record (or tuple, or pair) data type
  • logical 'or', which models the union (or variant record) data type
  • a map, which models the function data type
What's more, the product/sum dual models De Morgan's laws, but with more structure, which enables it to model sets beyond the booleans!
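
To make those correspondences concrete, here is a small sketch using Python type hints rather than the Java or Haskell of the talk; the type names are mine and purely illustrative:

    # Product, sum, and function types in Python type hints.
    # The names are illustrative, not from Wadler's talk.
    from dataclasses import dataclass
    from typing import Callable, Tuple, Union

    @dataclass
    class Circle:
        radius: float

    @dataclass
    class Square:
        side: float

    Point = Tuple[float, float]       # product: a pair holds an x AND a y
    Shape = Union[Circle, Square]     # sum: a value is a Circle OR a Square
    Area  = Callable[[Shape], float]  # function: a map from Shape to float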

Wadler is an entertaining teacher; I recommend the video of his talk! But he is also as quotable as any CS prof I've encountered in a long while. Here is a smattering of his great lines from "Categories for the Working Hacker":

If you can use math to do something, do it. It will make your life better.

That's the great thing about math. It lets you see something obvious after only thirty or forty years.

Pick your favorite algebra. If you don't have one, get one.

Let's do that in Java. That's what you should always do when you learn a new idea: do it in Java.

That's what category theory is really about: avoiding traffic jams.

Sums are the secret origin of folds.

If you don't understand this, I don't mind, because it's Java.

While watching the presentation, I created a one-liner of my own: Surprise! If you do something that matches exactly what Haskell does, Haskell code will be much shorter than Java code.

This was a very good talk; I enjoyed it quite a bit. However, I also left the room with a couple of nagging questions. The talk was titled "Categories for the Working Hacker", and it did a nice job of presenting some basic ideas from category theory in a way that most any developer could understand, even one without much background in math. But... How does this knowledge make one a better hacker? Armed with this new, entertaining knowledge, what are software developers able to do that they couldn't do before?

I have my own ideas for answers to these questions, but I would love to hear Wadler's take.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 30, 2018 6:40 PM

Strange Loop 3: David Schmüdde and the Art of Misuse

the splash screen to open David Schmudde's talk, 'Misuse'

This talk, the first of the afternoon on Day 1, opened with a familiar image: René Magritte's "this is not a pipe" painting, next to a picture of an actual pipe from some e-commerce site. Throughout the talk, speaker David Schmüdde returned to the distinction between thing and referent as he looked at the phenomenon of software users who used -- misused -- software to do something other than intended by the designer. The things they did were, or became, art.

First, a disclaimer: David is a former student of mine, now a friend, and one of my favorite people in the world. I still have in my music carousel a CD labeled "Schmudde Music!!" that he made for me just before he graduated and headed off to a master's program in music at Northwestern.

I often say in my conference reports that I can't do a talk justice in a blog entry, but it's even more true of a talk such as this one. Schmüdde demonstrated multiple works of art, both static and dynamic, which created a vibe that loses most of its zing when linearized in text. So I'll limit myself here to a few stray observations and impressions from the talk, hoping that you'll be intrigued enough to watch the video when it's posted.

Art is a technological endeavor. Rembrandt and hip hop don't exist without advances in art-making technology.

Misuse can be a form of creative experimentation. Check out Jodi, a website created in 1995 and still available. In the browser, it seems to be a work of ASCII art, but show the page source... (That's a lot harder these days than it was in 1995.) Now that is ASCII art.

Schmüdde talked about another work from the same era, entitled Rain. It used slowness -- of the network, of the browser -- as a feature. Old HTML (or was it a bug in an old version of Netscape Navigator?) allowed one HEAD tag in a file with multiple BODY tags. The artist created such a document whose bodies, when loaded in sequence, gave the appearance of rain falling in the browser. Misusing the tools under the conditions of the day enabled the artist to create an animation before animated GIFs, Flash, and other forms of animation existed.

The talk followed with examples and demos of other forms of software misuse, which could:

  • find bugs in a system
  • lead to new system features
  • illuminate a system in ways not anticipated by the software's creator
Schmüdde wondered, when we fix bugs in this way, do we make the resulting system, or the resulting interaction, less human?

Accidental misuse is life. We expect it. Intentional misuse is, or can be, art. It can surprise us.

What does art preservation look like for these works? The original hardware and software systems often are obsolete or, more likely, gone. To me, this is one of the great things about computers: we can simulate just about anything. Digital art preservation becomes a matter of simulating the systems or interactions that existed at the time the art was created. We are back to Magritte's pipe... This is not a work of art; it is a pointer to a work of art.

It is, of course, harder to recreate the experience of the art from the time it was created, but isn't this true of all art? Each of us experiences a work of art anew each time we encounter it. Our experience is never the same as the experience of those who were present when the work was first unveiled. It's often not even the same experience we ourselves had yesterday.

Schmüdde closed with a gentle plea to the technologists in the room to allow more art into their process. This is a new talk, and he was a little concerned about his ending. He may find a less abrupt way to end in the future, but to be honest, I thought what he did this time worked well enough for the day.

Even taking my friendship with the speaker into account, this was the talk of the conference for me. It blended software, users, technology, ideas, programming, art, the making of things, and exploring software at its margins. These ideas may appear at the margin, but they often lie at the core of the work. And even when they don't, they surprise us or delight us or make us think.

This talk was a solid example of what makes Strange Loop a great conference every year. There were a couple of other talks this year that gave me a similar vibe, for example, Hannah Davis's "Generating Music..." talk on Day 1 and Ashley Williams's "A Tale of Two asyncs" talk on Day 2. The conference delivers top-notch technical content but also invites speakers who use technology, and explore its development, in ways that go beyond what you find in most CS classrooms.

For me, Day One of the conference ended better than most: over a beer with David at Flannery's with good conversation, both about ideas from his talk and about old times, families, and the future. A good day.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Personal

September 30, 2018 10:31 AM

Strange Loop 2: Simon Peyton Jones on Teaching CS in the Schools

Simon Peyton Jones discusses one of the myths of getting CS into the primary and secondary classroom: it's all about the curriculum

The opening keynote this year was by Simon Peyton Jones of Microsoft Research, well known in the programming languages community for Haskell and many other things. But his talk was about something considerably less academic: "Shaping Our Children's Education in Computing", a ten-year project to reform the teaching of computing in UK primary and secondary schools. It was a wonderful talk, full of history, practical advice, lessons learned, and philosophy of computing. Rather than try to summarize everything Peyton Jones said, I will let you watch the video when it is posted (which will be as early as next week, I think).

I would, though, like to highlight one particular part of the talk, the way he describes computer science to a non-CS audience. This is an essential skill for anyone who wants to introduce CS to folks in education, government, and the wider community who often see CS as either hopelessly arcane or as nothing more than a technology or a set of tools.

Peyton Jones characterized computing as being about information, computation, and communication. For each, he shared one or two ways to discuss the idea with an educated but non-technical audience. For example:

  • Information.   Show two images, say the Mona Lisa and a line drawing of a five-pointed star. Ask which contains more information. How can we tell? How can we compare the amounts? How might we write that information down?

  • Computation.   Use a problem that everyone can relate to, such as planning a trip to visit all the US state capitals in the fewest miles or sorting a set of numbers. For the latter, he used one of the activities from CS Unplugged on sorting networks as an example.

  • Communication.   Here, Peyton Jones used the idea underlying the Diffie-Hellman algorithm for establishing a shared secret as his primary example; a small sketch in Python follows this list. The idea is simple and elegant, yet it's not at all obvious to most people who don't already know it that the problem can be solved at all!
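
For the curious, here is a bare-bones sketch of that idea in Python, with tiny, insecure numbers chosen only to make the arithmetic easy to follow:

    # Diffie-Hellman in miniature: two people agree on a shared secret over a
    # public channel without ever sending the secret itself.  The numbers are
    # toy values for illustration, far too small to be secure.
    p, g = 23, 5                  # public: a prime modulus and a generator

    a = 6                         # Alice's private number
    b = 15                        # Bob's private number

    A = pow(g, a, p)              # Alice sends g^a mod p in the open
    B = pow(g, b, p)              # Bob sends g^b mod p in the open

    alice_secret = pow(B, a, p)   # (g^b)^a mod p
    bob_secret   = pow(A, b, p)   # (g^a)^b mod p
    print(alice_secret == bob_secret)   # True: both now hold the same secret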

In all three cases, it helps greatly to use examples from many disciplines and to ask questions that encourage the audience to ask their own questions, form their own hypotheses, and create their own experiments. The best examples and questions actually enable people to engage with computing through their own curiosity and inquisitiveness. We are fascinated by computing; other people can be, too.

There is a huge push in the US these days for everyone to learn how to program. This creates a tension among many of us computer scientists, who know that programming isn't everything that we do and that its details can obscure CS as much as they illuminate it. I thought that Peyton Jones used a very nice analogy to express the relationship between programming and CS more broadly: Programming is to computer science as lab work is to physics. Yes, you could probably take lab work out of physics and still have physics, but doing so would eviscerate the discipline. It would also take away a lot of what draws people to the discipline. So it is with programming and computer science. But we have to walk a thin line, because programming is seductive and can ultimately distract us from the ideas that make programming so valuable in the first place.

Finally, I liked Peyton Jones's simple summary of the reasons that everyone should learn a little computer science:

  • Everyone should be able to create digital media, not just consume it.
  • Everyone should be able to understand their tools, not just use them.
  • People should know that technology is not magic.
That last item grows increasingly important in a world where the seeming magic of computers redefines every sector of our lives.

Oh, and yes, a few people will get jobs that use programming skills and computing knowledge. People in government and business love to hear that part.

Regular readers of this blog know that I am a sucker for aphorisms. Peyton Jones dropped a few on us, most earnestly when encouraging his audience to participate in the arduous task of introducing and reforming the teaching of CS in the schools:

  • "If you wait for policy to change, you'll just grow old. Get on with it."
  • "There is no 'them'. There is only us."
(The second of these already had a home in my brain. My wife has surely tired of hearing me say something like it over the years.)

It's easy to admire great researchers who have invested so much time and energy into solving real-world problems, especially in our schools. As long as this post is, it covers only a few minutes from the middle of the talk. My selection and bare-bones outline don't do justice to Peyton Jones's presentation or his message. Go watch the talk when the video goes up. It was a great way to start Strange Loop.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

September 29, 2018 6:19 PM

Strange Loop 1: Day One

the Strange Loop splash screen from the main hall

Last Wednesday morning, I hopped in my car and headed south to Strange Loop 2018. It had been a few years since I'd listened to Zen and the Art of Motorcycle Maintenance on a conference drive, so I popped it into the tapedeck (!) once I got out of town and fell into the story. My top-level goal while listening to Zen was similar to my top-level goal for attending Strange Loop this year: to experience it at a high level; not to get bogged down in so many details that I lost sight of the bigger messages. Even so, though, a few quotes stuck in my mind from the drive down. The first is an old friend, one of my favorite lines from all of literature:

Assembly of Japanese bicycle require great peace of mind.

The other was the intellectual breakthrough that unified Phaedrus's philosophy:

Quality is not an object; it is an event.
This idea has been on my mind in recent months. It seemed a fitting theme, too, for Strange Loop.

On the first day of the conference, I saw mostly a mixture of compiler talks and art talks, including:

@mraleph's "Six Years of Dart", in which he reminisced on the evolution of the language, its ecosystem, and its JIT. I took at least one cool idea from this talk. When he compared the performance of two JITs, he gave a histogram comparing their relative performances, rather than an average improvement. A new system often does better on some programs and worse on others. An average not only loses information; it may mislead.

• Jason Dagit's "Your Secrets are Safe with Julia", about a system that explores the use of homomorphic encryption to to compile secure programs. In this context, the key element of security is privacy. As Dagit pointed out, "trust is not transitive", which is especially important when it comes to sharing a person's health data.

• I just loved Hannah Davis's talk on "Generating Music From Emotion". She taught me about data sonification and its various forms. She also demonstrated some of her attempts to tease multiple dimensions of human emotion out of large datasets and to use these dimensions to generate music that reflects the data's meaning. Very cool stuff. She also showed the short video Dragon Baby, which made me laugh out loud.

• I also really enjoyed "Hackett: A Metaprogrammable Haskell", by Alexis King. I've read about this project on the Racket mailing list for a few years and have long admired King's ability in posts there to present complex ideas clearly and logically. This talk did a great job of explaining that Haskell deserves a powerful macro system like Racket's, that Racket's macro system deserves a powerful type system like Haskell's, and that integrating the two is more challenging than simply adding a stage to the compiler pipeline.

I saw two other talks the first day:

  • the opening keynote address by Simon Peyton Jones, "Shaping Our Children's Education in Computing" [ link ]
  • David Schmüdde, "Misuser" [ link ]
My thoughts on these talks are more extensive and warrant short entries of their own, to follow.

I had almost forgotten how many different kinds of cool ideas I can encounter in a single day at Strange Loop. Thursday was a perfect reminder.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

September 20, 2018 4:44 PM

Special Numbers in a Simple Language

This fall I am again teaching our course in compiler development. Working in teams of two or three, students will implement from scratch a complete compiler for a simple functional language that consists of little more than integers, booleans, an if statement, and recursive functions. Such a language isn't suitable for much, but it works great for writing programs that do simple arithmetic and number theory. In the past, I likened it to an integer assembly language. This semester, my students are compiling a Pascal-like language of this sort that I call Flair.

If you've read my blog much in the falls over the last decade or so, you may recall that I love to write code in the languages for which my students write their compilers. It makes the language seem more real to them and to me, gives us all more opportunities to master the language, and gives us interesting test cases for their scanners, parsers, type checkers, and code generators. In recent years I've blogged about some of my explorations in these languages, including programs to compute Farey numbers and excellent numbers, as well as trying to solve one of my daughter's AP calculus problems.

When I run into a problem, I usually get an itch to write a program, and in the fall I want to write it in my students' language.

Yesterday, I began writing my first new Flair program of the semester. I ran across this tweet from James Tanton, which starts:

N is "special" if, in binary, N has a 1s and b 0s and a & b are each factors of N (so non-zero).

So, 10 is special because:

  • In binary, 10 is 1010.
  • 1010 contains two 1s and two 0s.
  • Two is a factor of 10.

9 is not special because its binary rep also contains two 1s and two 0s, but two is not a factor of 9. 3 is not special because its binary rep has no 0s at all.

My first thought upon seeing this tweet was, "I can write a Flair program to determine if a number is special." And that is what I started to do.

Flair doesn't have loops, so I usually start every new program by mapping out the functions I will need simply to implement the definition. This makes sure that I don't spend much time implementing loops that I don't need. I ended up writing headers and default bodies for three utility functions:

  • convert a decimal number to binary
  • count the number of times a particular digits occurs in a number
  • determine if a number x divides evenly into a number n

With these helpers, I was ready to apply the definition of specialness:

    return divides(count(1, to_binary(n)), n)
       and divides(count(0, to_binary(n)), n)

Calling to_binary twice on the same argument is wasteful, but Flair doesn't have local variables, either. So I added one more helper to implement the design pattern "Function Call as Variable Assignment", apply_definition:

    function apply_definition(binary_n : integer, n : integer) : boolean
and called it from the program's main:
    return apply_definition(to_binary(n), n)

This is only the beginning. I still have a lot of work to do to implement to_binary, count and divides, using recursive function calls to simulate loops. This is another essential design pattern in Flair-like languages.
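
Here is a quick sketch of the same design in Python -- not Flair, and the helper names only mirror the ones above -- that keeps the same constraint: no loops, only recursive helpers.

    # The specialness test, sketched in Python with the same shape as the
    # Flair program: loops are simulated with recursive functions.
    def to_binary(n):
        """Return the binary representation of n as a decimal-looking integer."""
        if n == 0:
            return 0
        return 10 * to_binary(n // 2) + (n % 2)

    def count(digit, n):
        """Count occurrences of digit among the digits of n."""
        if n == 0:
            return 0
        return (1 if n % 10 == digit else 0) + count(digit, n // 10)

    def divides(x, n):
        """True if x is a non-zero factor of n."""
        return x != 0 and n % x == 0

    def is_special(n):
        binary_n = to_binary(n)
        return divides(count(1, binary_n), n) and divides(count(0, binary_n), n)

    print(is_special(10), is_special(9), is_special(3))   # True False False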

As I prepared to discuss my new program in class today, I found a bug: My divides test was checking for factors of binary_n, not the decimal n. I also renamed a function and one of its parameters. Explaining my programs to students, a generalization of rubber duck debugging, often helps me see ways to make a program better. That's one of the reasons I like to teach.

Today I asked my students to please write me a Flair compiler so that I can run my program. The course is officially underway.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

September 05, 2018 3:58 PM

Learning by Copying the Textbook

Or: How to Learn Physics, Professional Golfer Edition

Bryson DeChambeau is a professional golfer, in the news recently for consecutive wins in the FedExCup playoff series. But he can also claim an unusual distinction as a student of physics:

In high school, he rewrote his physics textbook.

DeChambeau borrowed the textbook from the library and wrote down everything from the 180-page book into a three-ring binder. He explains: "My parents could have bought one for me, but they had done so much for me in golf that I didn't want to bother them in asking for a $200 book. ... By writing it down myself I was able to understand things on a whole comprehensive level."

I imagine that copying texts word-for-word was a more common learning strategy back when books were harder to come by, and perhaps it will become more common again as textbook prices rise and rise. There is certainly something to be said for it. Writing by hand takes time, and all the while our brains can absorb terms, make connections among concepts, and process the material into long-term memory. Zed Shaw argues for this as a great way to learn computer programming, implementing it as a pedagogical strategy in his "Learn <x> the Hard Way" series of books. (See Learn Python the Hard Way as an example.)

I don't think I've ever copied a textbook word-for-word, and I never copied computer programs from "Byte" magazine, but I do have similar experiences in note taking. I took elaborate notes all through high school, college, and grad school. In grad school, I usually rewrote all of my class notes -- by hand; no home PC -- as I reviewed them in the day or two after class. My clean, rewritten notes had other benefits, too. In a graduate graph algorithms course, they drew the attention of a classmate who became one of my best friends and were part of what attracted the attention of the course's professor, who asked me to consider joining his research group. (I was tempted... Graph algorithms was one of my favorite courses and research areas!)

I'm not sure many students these days benefit from this low-tech strategy. Most students who take detailed notes in my course seem to type rather than write, which, if what I've read is correct, has fewer cognitive advantages. But at least those students are engaging with the material consciously. So few students seem to take detailed notes at all these days, and that's a shame. Without notes, it is harder to review ideas, to remember what they found challenging or puzzling in the moment, and to rehearse what they encounter in class into their long-term memories. Then again, maybe I'm just having a "kids these days" moment.

Anyway, I applaud DeChambeau for saving his parents a few dollars and for the achievement of copying an entire physics text. He even realized, perhaps after the fact, that it was an excellent learning strategy.

(The above passage is from The 11 Most Unusual Things About Bryson DeChambeau. He sounds like an interesting guy.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

August 31, 2018 3:06 PM

Reflection on a Friday

If you don't sit facing the window, you could be in any town.

I read that line this morning in Maybe the Cumberland Gap just swallows you whole, where it is a bittersweet observation of the similarities among so many dying towns across Appalachia. It's a really good read, mostly sad but a little hopeful, that applies beyond one region or even one country.

My mind is self-centered, though, and immediately reframed the sentence in a way that cast light on my good fortune.

I just downloaded a couple of papers on return-oriented programming so that I can begin working with an undergraduate on an ambitious research project. I have a homework assignment to grade sitting in my class folder, the first of the semester. This weekend, I'll begin to revise a couple of lectures for my compiler course, on NFAs and DFAs and scanning text. As always, there is a pile of department work to do on my desk and in my mind.

I live in Cedar Falls, Iowa, but if I don't sit facing the window, I could be in Ames or Iowa City, East Lansing or Durham, Boston or Berkeley. And I like the view out of my office window very much, thank you, so I don't even want to trade.

Heading into a three-day weekend, I realize again how fortunate I am. Do I put my good fortune to good enough use?


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

August 17, 2018 2:19 PM

LangSec and My Courses for the Year

As a way to get into the right frame of mind for the new semester and the next iteration of my compiler course, I read Michael Hicks's Software Security is a Programming Languages Issue this morning. Hicks incorporates software security into his courses on the principles of programming languages, with two lectures on security before having students study and use Rust. The article has links to lecture slides and supporting material, which makes it a post worth bookmarking.

I started thinking about adding LangSec to my course late in the spring semester, as I brainstormed topics that might spice the rest of the course up for both me and my students. However, time was short, so I stuck with a couple of standalone sessions on topics outside the main outline: optimization and concatenative languages. They worked fine but left me with an itch for something new.

I think I'll use the course Hicks and his colleagues teach as a starting point for figuring out how I might add to next spring's course. Students are interested in security, it's undoubtedly an essential issue for today's grads, and it is a great way to demonstrate how the design of programming languages is more than just the syntax of a loop or the lambda calculus.

Hicks's discussion of Rust also connects with my fall course. Two years ago, an advanced undergrad used Rust as the implementation language for his compiler. He didn't know the language but wanted to pair it with Haskell in his toolbox. The first few weeks of the project were a struggle as he wrestled with mastering ownership and figuring out some new programming patterns. Eventually he hit a nice groove and produced a working compiler with only a couple of small holes.

I was surprised how easy it was for me to install the tools I needed to compile, test, and explore his code. That experience increased my interest in learning the language, too. Adding it to my spring course would give me the last big push I need to buckle down.

This summer has been a blur of administrative stuff, expected and unexpected. The fall semester brings the respite of work I really enjoy: teaching compilers and writing some code. Hurray!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

August 05, 2018 10:21 AM

Three Uses of the Knife

I just finished David Mamet's Three Uses of the Knife, a wide-ranging short book with the subtitle: "on the nature and purpose of drama". It is an extended essay on how we create and experience drama -- and how these are, in the case of great drama, the same journey.

Even though the book is only eighty or so pages, Mamet characterizes drama in so many ways that you'll have to either assemble a definition yourself or accept the ambiguity. Among them, he says that the job of drama and art is to "delight" us and that "the cleansing lesson of the drama is, at its highest, the worthlessness of reason."

Mamet clearly believes that drama is central to other parts of life. Here's a cynical example, about politics:

The vote is our ticket to the drama, and the politician's quest to eradicate "fill in the blank", is no different from the promise of the superstar of the summer movie to subdue the villain -- both promise us diversion for the price of a ticket and a suspension of disbelief.

As a reader, I found myself using the book's points to ruminate about other parts of life, too. Consider the first line of the second essay:

The problems of the second half are not the problems of the first half.

Mamet uses this to launch into a consideration of the second act of a drama, which he holds equally to be a consideration of writing the second act of a drama. But with fall semester almost upon us, my thoughts jumped immediately to teaching a class. The problems of teaching the second half of a class are quite different from the problems of teaching the first half. The start of a course requires the instructor to lay the foundation of a topic while often convincing students that they are capable of learning it. By midterm, the problems include maintaining the students' interest as their energy flags and the work of the semester begins to overwhelm them. The instructor's energy -- my energy -- begins to flag, too, which echoes Mamet's claim that the journey of the creator and the audience are often substantially the same.

A theme throughout the book is how people immerse themselves in story, suspending their disbelief, even creating story when they need it to soothe their unease. Late in the book, he connects this theme to religious experience as well. Here's one example:

In suspending their disbelief -- in suspending their reason, if you will -- for a moment, the viewers [of a magic show] were rewarded. They committed an act of faith, or of submission. And like those who rise refreshed from prayers, their prayers were answered. For the purpose of the prayer was not, finally, to bring about intercession in the material world, but to lay down, for the time of the prayer, one's confusion and rage and sorrow at one's own powerlessness.

This all makes the book sound pretty serious. It's a quick read, though, and Mamet writes with humor, too. It feels light even as it seems to be a philosophical work.

The following paragraph wasn't intended as humorous but made me, a computer scientist, chuckle:

The human mind cannot create a progression of random numbers. Years ago computer programs were created to do so; recently it has been discovered that they were flawed -- the numbers were not truly random. Our intelligence was incapable of creating a random progression and therefore of programming a computer to do so.

This reminded me of a comment that my cognitive psychology prof left on the back of an essay I wrote in class. He wrote something to the effect, "This paper gets several of the particulars incorrect, but then that wasn't the point. It tells the right story well." That's how I felt about this paragraph: it is wrong on a couple of important facts, but it advances the important story Mamet is telling ... about the human propensity to tell stories, and especially to create order out of our experiences.

Oh, and thanks to Anna Gát for bringing the book to my attention, in a tweet to Michael Nielsen. Gát has been one of my favorite new follows on Twitter in the last few months. She seems to read a variety of cool stuff and tweet about it. I like that.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

July 28, 2018 11:37 AM

Three Things I Read This Morning

Why I Don't Love Gödel, Escher, Bach

I saw a lot of favorable links to this post a while back and finally got around to it. Meh. I generally agree with the author: GEB is verbose in places, there's a lot of unnecessary name checking, and the dialogues that lead off each chapter are often tedious. I even trust the author's assertion that Hofstadter's forays beyond math, logic, and computers are shallow.

So what? Things don't have to be perfect for me to like them, or for them to make me think. GEB was a swirl of ideas that caused me to think and helped me make a few connections. I'm sure if I read the book now that I would feel differently about it, but reading it when I did, as an undergrad CS major thinking about AI and the future, it energized me.

I do thank the author for his pointer (in a footnote) to Vi Hart's wonderful Twelve Tones. You should watch it. Zombie Schonberg!

The Web Aesthetic

This post wasn't quite what I expected, but even though it is a few years old, it has something to say to web designers today.

Everything on the web ultimately needs to degrade down to plain text (images require alt text; videos require transcripts), so the text editor might just become the most powerful app in the designer's toolbox.

XP Challenge: Compilers

People outside the XP community often don't realize how seriously the popularizers of XP explored the limitations of their own ideas. This page documents one of several challenges that push XP values and practices to the limits: When do they break down? Can they be adapted successfully to the task? What are the consequences of applying them in such circumstances?

Re-reading this old wiki page was worth it if only for this great line from Ron Jeffries:

The point of XP is to win, not die bravely.

Yes.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

July 08, 2018 10:47 AM

Computing Everywhere: In the Dugout and On the Diamond

How's this for a job description: "The successful candidate will be able to hit a fungo, throw batting practice, and program in SQL."

We decided that in the minor leagues, we would hire an extra coach at each level. The requirements for that coach were that he had to be able to hit a fungo, throw batting practice, and program in SQL. It's a hard universe to find where those intersect, but we were able to find enough of them--players who had played in college that maybe played one year in the minors who had a technical background and could understand analytics.

The technical skills are not enough by themselves, though. In order to turn a baseball franchise into a data-informed enterprise, you have to change the culture of the team in the trenches, working with the people who have to change their own behavior. Management must take the time necessary to guide the organization's evolution.

The above passage is from How the Houston Astros are winning through advanced analytics. I picked it up expecting a baseball article, or perhaps a data analytics article, but it reads like a typical McKinsey Report piece. It was an interesting read, but for different reasons than I had imagined.


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 29, 2018 11:46 AM

Computer Science to the Second Degree

Some thoughts on studying computer science from Gian-Carlo Rota:

A large fraction of MIT undergraduates major in computer science or at least acquire extensive computer skills that are applicable in other fields. In their second year, they catch on to the fact that their required courses in computer science do not provide the whole story. Not because of deficiencies in the syllabus; quite the opposite. The undergraduate curriculum in computer science at MIT is probably the most progressive and advanced such curriculum anywhere. Rather, the students learn that side by side with required courses there is another, hidden curriculum consisting of new ideas just coming into use, new techniques that spread like wildfire, opening up unsuspected applications that will eventually be adopted into the official curriculum.

Keeping up with this hidden curriculum is what will enable a computer scientist to stay ahead in the field. Those who do not become computer scientists to the second degree risk turning into programmers who will only implement the ideas of others.

MIT is, of course, an exceptional school, but I think Rota's comments apply to computer science at most schools. So much learning of CS happens in the spaces between courses: in the lab, in the student lounge, at meetings of student clubs, at part-time jobs, .... That can sometimes be a challenge for students who don't have much curiosity, or who don't develop it as they are exposed to new topics.

As profs, we encourage students to be aware of all that is going on in computer science beyond the classroom and to take part in the ambient curriculum to the extent they are able. Students who become computer scientists only to the first degree can certainly find good jobs and professional success, but there are more opportunities open at the second degree. CS can also be a lot more fun there.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 17, 2018 9:52 AM

Sometimes, Evolution Does What No Pivot Can

From an old New Yorker article about Spotify:

YouTube, which is by far the largest streaming-music site in the world (it wasn't designed that way--that's just what it became)...

Companies starting in one line of business and evolving into something else is nothing new. I mean, The Connecticut Leather Company became Coleco and made video game consoles. But there's something about software that makes this sort of evolution seem so normal. We build a computer system to solve one problem and find that our users -- who have needs and desires that neither we nor they fully comprehend -- use it to solve a different problem. Interesting times. Don't hem yourself in, and don't hem your software in, or the people who use it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 16, 2018 3:58 PM

Computing Everywhere: The Traveling Salesman Problem and Paris Fashion Week

I just read Pops, Michael Chabon's recent book of essays on fatherhood. The first essay, which originally appeared as an article in GQ, includes this parenthetical about his tour of Paris fashion week with his son:

-- a special mapping algorithm seemed to have been employed to ensure that every show was held as far as possible from its predecessor and its successor on the schedule --

My first thought was to approach this problem greedily: Start with the first show, then select a second show that is as far away as possible, then select a third show that is as far away as possible from that one, and so on, until all of the shows had been scheduled. But then I figured that a schedule so generated might seem laborious to travel at first, when there are plenty of faraway shows to choose from, yet it might eventually start to seem pretty reasonable once the only shows left to schedule are relatively close.
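
That greedy idea is easy to express in code. A small sketch in Python, assuming d is an n-by-n matrix of travel times between show venues and we start at show 0:

    # greedily build a schedule by always going to the farthest
    # remaining show from wherever we are now
    def greedy_farthest_schedule(d, start=0):
        unvisited = set(range(len(d))) - {start}
        schedule = [start]
        while unvisited:
            here = schedule[-1]
            next_show = max(unvisited, key=lambda s: d[here][s])
            schedule.append(next_show)
            unvisited.remove(next_show)
        return schedule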

We can generate a more thoroughly unsatisfactory schedule by maximizing the total travel time of the circuit. That's the Traveling Salesman Problem, inverted. Taking this approach, our algorithm is quite simple. We start with the usual n-by-n matrix d, where d[i,j] equals the distance between shows i and j. Then:

  1. Replace the distance d[i,j] between every two show locations with its additive inverse, -(d[i,j]).
  2. Call your favorite TSP solver with the new graph.

Easy! I leave implementation of the individual steps as an exercise for the reader.
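
For anyone who wants to spoil the exercise, here is one way the two steps might look in Python, with a brute-force solver standing in for "your favorite TSP solver". Exhaustive search is fine for a week's worth of shows; the negation trick is the point:

    from itertools import permutations

    # brute-force TSP: the circuit through all shows, starting and
    # ending at show 0, with minimum total distance
    def tsp(d):
        def circuit_length(tour):
            return sum(d[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))
        return min(([0] + list(p) for p in permutations(range(1, len(d)))),
                   key=circuit_length)

    # step 1: negate every distance; step 2: minimize.
    # the minimum circuit on the negated distances is the maximum
    # circuit on the originals -- the most exhausting schedule possible
    def worst_schedule(d):
        return tsp([[-x for x in row] for row in d])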

(By the way, Chabon's article is a sweet story about an already appreciative dad coming to appreciate his son even better. If you like that sort of thing, give it a read.)


Posted by Eugene Wallingford | Permalink | Categories: Computing

May 29, 2018 3:41 PM

Software as Adaptation-Executer, Not Fitness-Maximizer

In Adaptation-Executers, not Fitness-Maximizers, Eliezer Yudkowsky talks about how evolution has led to human traits that may no longer be ideal in our current environment. He also talks about tools, though, and this summary sentence made me think of programs:

So the screwdriver's cause, and its shape, and its consequence, and its various meanings, are all different things; and only one of these things is found within the screwdriver itself.

I often fall victim to thinking that the meaning of software is at least somewhat inherent in its code, but its meaning really lies in what the designer intended as its use -- a mix of what Yudkowsky calls its cause and its consequence. These are things that exist only in the minds of the designer and the user, not in the computational constructs that constitute the code.

When approaching new software, especially a complex piece of code with many parts, it's helpful to remember that it doesn't really have objective meaning or consequences, only those intended by its designers and those exercised by its users. Over time, the users' conception tends to drive the designers' conception as they put the software to particular uses and even modify it to better suit these new uses.

Perhaps software is best thought of as an adaptation-executer, too, and not as a fitness-maximizer.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

May 27, 2018 10:20 AM

AI's Biggest Challenges Are Still To Come

Semantic Information Processing on my bookshelf

A lot of people I know have been discussing last week's NY Times op-ed about recent advances in neural networks and what they mean for AI. The article even sparked conversation among colleagues from my grad school research lab and among my PhD advisor's colleagues from when he was in grad school. It seems that many of us are frequently asked by non-CS folks what we think about recent advances in AI, from AlphaGo to voice recognition to self-driving cars. My answers sound similar to what some of my old friends say. Are we now afraid of AI being able to take over the world? Um, no. Do you think that the goals of AI are finally within reach? No. Much remains to be done.

I rate my personal interest in recent deep learning advances as meh. I'm not as down on the current work as the authors of the Times piece seem to be; I'm just not all that interested. It's wonderful as an exercise in engineering: building focused systems that solve a single problem. But, as the article points out, that's the key. These systems work in limited domains, to solve limited problems. When I want one of these problems to be solved, I am thankful that people have figured out how to solve it and made the solution commercially available for us to use. Self-driving cars, for instance, have the potential to make the world safer and to improve the quality of my own life.

My interest in AI, though, has always been at a higher level: understanding how intelligence works. Are there general principles that govern intelligent behavior, independent of hardware or implementation? One of the first things to attract me to AI was the idea of writing a program that could play chess. That's an engineering problem in a very narrow domain. But I soon found myself drawn to cognitive issues: problem-solving strategies, reflection, explanation, conversation, symbolic reasoning. Cognitive psychology was one of my favorite courses in grad school in large part because it tried to connect low-level behaviors in the human brain to the symbolic level. AlphaGo is exceedingly cool as a game player, but it can't talk to me about Go, and for me that's a lot of the fun of playing.

In an email message earlier this week, my quick take on all this work was: We've forgotten the knowledge level. And for me, the knowledge level is what's most interesting about AI.

That one-liner oversimplifies things, as most one-liners do. The AI world hasn't forgotten the knowledge level so much as moved away from it for a while in order to capitalize on advances in math and processing power. The results have been some impressive computer systems. I do hope that the pendulum swings back soon as AI researchers step back from these achievements and build some theories at the knowledge level. I understand that this may not be possible, but I'm not ready to give up on the dream yet.


Posted by Eugene Wallingford | Permalink | Categories: Computing

April 06, 2018 3:19 PM

Maps and Abstractions

I've been reading my way through Frank Chimero's talks online and ran across a great bit on maps and interaction design in What Screens Want. One of the paragraphs made me think about the abstractions that show up in CS courses:

When I realized that, a little light went off in my head: a map's biases do service to one need, but distort everything else. Meaning, they misinform and confuse those with different needs.

CS courses are full of abstractions and models of complex systems. We use examples, often simplified, to expose or emphasize a single facet of a system, as a way to help students cut through the complexity. For example, compilers and full-strength interpreters are complicated programs, so we start with simple interpreters operating over simple languages. Students get their feet wet without drowning in detail.

In the service of trying not to overwhelm students, though, we run the risk of distorting how they think about the parts we left out. Worse, we sometimes distort even their thinking about the part we're focusing on, because they don't see its connections to the more complete picture. There is an art to identifying abstractions, creating examples, and sequencing instruction. Done well, we can minimize the distortions and help students come to understand the whole with small steps and incremental increases in size and complexity.

At least that's what I think on my good days. There are days and even entire semesters when things don't seem to progress as smoothly as I hope or as smoothly as past experience has led me to expect. Those days, I feel like I'm doing violence to an idea when I create an abstraction or adopt a simplifying assumption. Students don't seem to be grokking the terrain, so I change the map: we try different problems or work through more examples. It's hard to find the balance sometimes between adding enough to help and not adding so much as to overwhelm.

The best teachers I've encountered know how to approach this challenge. More importantly, they seem to enjoy the challenge. I'm guessing that teachers who don't enjoy it must be frustrated a lot. I enjoy it, and even so there are times when this challenge frustrates me.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 29, 2018 3:05 PM

Heresy in the Battle Between OOP and FP

For years now, I've been listening to many people -- smart, accomplished people -- feverishly proclaim that functional programming is here to right the wrongs of object-oriented programming. For many years before that, I heard many people -- smart, accomplished people -- feverishly proclaim that object-oriented programming was superior to functional programming, an academic toy, for building real software.

Alas, I don't have a home in the battle between OOP and FP. I like and program in both styles. So it's nice whenever I come across something like Alan Kay's recent post on Quora, in response to the question, "Why is functional programming seen as the opposite of OOP rather than an addition to it?" He closes with a paragraph I could take on as my credo:

So: both OOP and functional computation can be completely compatible (and should be!). There is no reason to munge state in objects, and there is no reason to invent "monads" in FP. We just have to realize that "computers are simulators" and figure out what to simulate.

As in many things, Kay encourages us to go beyond today's pop culture of programming to create a computational medium that incorporates big ideas from the beginning of our discipline. While we work on those ideas, I'll continue to write programs in both styles, and to enjoy them both. With any luck, I'll bounce between mindsets long enough that I eventually attain enlightenment, like the venerable master Qc Na. (See the koan at the bottom of that link.)

Oh: Kay really closes his post with

I will be giving a talk on these ideas in July in Amsterdam (at the "CurryOn" conference).

If that's not a reason to go to Amsterdam for a few days, I don't know what is. Some of the other speakers look pretty good, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 12, 2018 3:43 PM

Technology is a Place Where We Live

Yesterday morning I read The Good Room, a talk Frank Chimero gave last month. Early on in the talk, Chimero says:

Let me start by stating something obvious: in the last decade, technology has transformed from a tool that we use to a place where we live.

This sentence jumped off the page both for the content of the assertion and for the decade time frame with which he bounds it. In the fall of 2003, I taught a capstone course for non-majors that is part of my university's liberal arts core. The course, titled "Environment, Technology, and Society", brings students from all majors on campus together near the end of their studies, to apply their general education and various disciplinary expertises to problems of some currency in the world. As you might guess from the title, the course focuses on problems at the intersection of the natural environment, technology, and people.

My offering of the course put a twist on the usual course content. We focused on the man-made environment we all live in, which even by 2003 had begun to include spaces carved out on the internet and web. The only textbook for the course was Donald Norman's The Design of Everyday Things, which I think every university graduate should have read. The topics for the course, though, had a decided IT flavor: the effect of the Internet on everyday life, e-commerce, spam, intellectual property, software warranties, sociable robots, AI in law and medicine, privacy, and free software. We closed with a discussion of what an educated citizen of the 21st century ought to know about the online world in which they would live in order to prosper as individuals and as a society.

The change in topic didn't excite everyone. A few came to the course looking forward to a comfortable "save the environment" vibe and were resistant to considering technology they didn't understand. But most were taking the course with no intellectual investment at all, as a required general education course they didn't care about and just needed to check off the list. In a strange way, their resignation enabled them to engage with the new ideas and actually ask some interesting questions about their future.

Looking back now after fifteen years, the course design looks pretty good. I should probably offer to teach it again, updated appropriately, of course, and see where young people of 2018 see themselves in the technological world. As Chimero argues in his talk, we need to do a better job building the places we want to live in -- and that we want our children to live in. Privacy, online peer pressure, and bullying all turned out differently than I expected in 2003. Our young people are worse off for those differences, though I think most have learned ways to live online in spite of the bad neighborhoods. Maybe they can help us build better places to live.

Chimero's talk is educational, entertaining, and quotable throughout. I tweeted one quote: "How does a city wish to be? Look to the library. A library is the gift a city gives to itself." There were many other lines I marked for myself, including:

  • Penn Station "resembles what Kafka would write about if he had the chance to see a derelict shopping mall." (I'm a big Kafka fan.)
  • "The wrong roads are being paved in an increasingly automated culture that values ease."
Check the talk out for yourself.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

March 06, 2018 4:11 PM

A Good Course in Epistemology

Theoretical physicist Marcelo Gleiser, in The More We Know, the More Mystery There Is:

But even if we did [bring the four fundamental forces together in a common framework], and it's a big if right now, this "unified theory" would be limited. For how could we be certain that a more powerful accelerator or dark matter detector wouldn't find evidence of new forces and particles that are not part of the current unification? We can't. So, dreamers of a final theory need to recalibrate their expectations and, perhaps, learn a bit of epistemology. To understand how we know is essential to understand how much we can know.

People are often surprised to hear that, in all my years of school, my favorite course was probably PHL 440 Epistemology, which I took in grad school as a cognate to my CS courses. I certainly enjoyed the CS courses I took as a grad student, and as an undergrad, too, but my study of AI was enhanced significantly by courses in epistemology and cognitive psychology. The prof for PHL 440, Dr. Rich Hall, became a close advisor to my graduate work and a member of my dissertation committee. Dr. Hall introduced me to the work of Stephen Toulmin, whose model of argument influenced my work immensely.

I still have the primary volume of readings that Dr. Hall assigned in the course. Looking back now, I'd forgotten how many of W.V.O. Quine's papers we'd read... but I enjoyed them all. The course challenged most of my assumptions about what it means "to know". As I came to appreciate different views of what knowledge might be and how we come by it, my expectations of human behavior -- and my expectations for what AI could be -- changed. As Gleiser suggests, to understand how we know is essential to understanding what we can know, and how much.

Gleiser's epistemology meshes pretty well with my pragmatic view of science: it is descriptive, within a particular framework and necessarily limited by experience. This view may be why I gravitated to the pragmatists in my epistemology course (Peirce, James, Rorty), or perhaps the pragmatists persuaded me better than the others.

In any case, the Gleiser interview is a delightful and interesting read throughout. His humble view of science may get you thinking about epistemology, too.

... and, yes, that's the person for whom a quine in programming is named. Thanks to Douglas Hofstadter for coining the term and for giving us programming nuts a puzzle to solve in every new language we learn.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Patterns, Personal

February 26, 2018 3:55 PM

Racket Love

Racket -- "A Programmable Programming Language" -- is the cover story for next month's Communications of the ACM. The new issue is already featured on the magazine's home page, including a short video in which Matthias Felleisen explains the idea of code as more than a machine artifact.

My love of Racket is no surprise to readers of this blog. Still one of my favorite old posts here is The Racket Way, a write-up of my notes from Matthew Flatt's talk of the same name at StrangeLoop 2012. As I said in that post, this was a deceptively impressive talk. I think that's especially fitting, because Racket is a deceptively impressive language.

One last little bit of love from a recent message to the Racket users mailing list... Stewart Mackenzie describes his feelings about the seamless interweaving of Racket and Typed Racket via a #lang directive:

So far my dive into Racket has positive. It's magical how I can switch from untyped Racket to typed Racket simply by changing #lang. Banging out my thoughts in a beautiful lisp 1, wave a finger, then finger crack to type check. Just sublime.

That's what you get when your programming language is as programmable as your application.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

February 21, 2018 3:38 PM

Computer Programs Aren't Pure Abstractions. They Live in the World.

Guile Scheme guru Andy Wingo recently wrote a post about langsec, the idea that we can bake system security into our programs by using languages that support proof of correctness. Compilers can then be tools for enforcing security. Wingo is a big fan of the langsec approach but, in light of the Spectre and Meltdown vulnerabilities, is pessimistic that it really matters anymore. If bad actors can exploit the hardware that executes our programs, then proving that the code is secure doesn't do much good.

I've read a few blog posts and tweets that say Wingo is too pessimistic, that efforts to make our languages produce more secure code will still pay off. I think my favorite such remark, though, is a comment on Wingo's post itself, by Thomas Dullien:

I think this is too dark a post, but it shows a useful shock: Computer Science likes to live in proximity to pure mathematics, but it lives between EE and mathematics. And neglecting the EE side is dangerous - which not only Spectre showed, but which should have been obvious at the latest when Rowhammer hit.

There's actual physics happening, and we need to be aware of it.

It's easy for academics, and even programmers who work atop an endless stack of frameworks, to start thinking of programs as pure abstractions. But computer programs, unlike mathematical proofs, come into contact with real, live hardware. It's good to be reminded sometimes that computer science isn't math; it lives somewhere between math and engineering. That is good in so many ways, but it also has its downsides. We should keep that in mind.


Posted by Eugene Wallingford | Permalink | Categories: Computing

February 16, 2018 2:54 PM

Old Ideas and New Words

In this Los Angeles Review of Books interview, novelist Jenny Offill says:

I was reading a poet from the Tang dynasty... One of his lines from, I don't know, page 812, was "No new feelings". When I read that I laughed out loud. People have been writing about the same things since the invention of the written word. The only originality comes from the language itself.

After a week revising lecture notes and rewriting a recruiting talk intended for high school students and their parents, I know just what Offill and that Tang poet mean. I sometimes feel the same way about code.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

January 14, 2018 9:24 AM

Acceleration

This was posted on the Racket mailing list recently:

"The Little Schemer" starts slow for people who have programmed before, but seeing that I am only half-way through and already gained some interesting knowledge from it, one should not underestimate the acceleration in this book.

The Little Schemer is the only textbook I assign in my Programming Languages course. These students usually have only a little experience: often three semesters, two in Python and one in Java; sometimes just the two in Python. A few of the students who work in the way the authors intend have an A-ha! experience while reading it. Or maybe they are just lucky... Other students have only a WTF? experience.

Still, I assign the book, with hope. It's relatively inexpensive and so worth a chance that a few students can use it to grok recursion, along with a way of thinking about writing functions that they haven't seen in courses or textbooks before. The book accelerates from the most basic ideas of programming to "interesting" knowledge in a relatively short number of pages. Students who buy in to the premise, hang on for the ride, and practice the ideas in their own code soon find that they, too, have accelerated as programmers.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 28, 2017 8:46 AM

You Have to Learn That It's All Beautiful

In this interview with Adam Grant, Walter Jacobson talks about some of the things he learned while writing biographies of Benjamin Franklin, Albert Einstein, Steve Jobs, and Leonardo da Vinci. A common theme is that all four were curious and interested in a wide range of topics. Toward the end of the interview, Jacobson says:

We of humanities backgrounds are always doing the lecture, like, "We need to put the 'A' in 'STEM', and you've got to learn the arts and the humanities." And you get big applause when you talk about the importance of that.

But we also have to meet halfway and learn the beauty of math. Because people tell me, "I can't believe somebody doesn't know the difference between Mozart and Haydn, or the difference between Lear and Macbeth." And I say, "Yeah, but do you know the difference between a resistor and a transistor? Do you know the difference between an integral and a differential equation?" They go, "Oh no, I don't do math, I don't do science." I say, "Yeah, but you know what, an integral equation is just as beautiful as a brush stroke on the Mona Lisa." You've got to learn that they're all beautiful.

Appreciating that beauty made Leonardo a better artist and Jobs a better technologist. I would like for the students who graduate from our CS program to know some literature, history, and art and appreciate their beauty. I'd also like for the students who graduate from our university with degrees in literature, history, art, and especially education to have some knowledge of calculus, the Turing machine, and recombinant DNA, and appreciate their beauty.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

December 21, 2017 2:42 PM

A Writer with a Fondness for Tech

I've not read either of Helen DeWitt's novels, but this interview from 2011 makes her sound like a technophile. When struggling to write, she finds inspiration in her tools:

What is to be done?

Well, there are all sorts of technical problems to address. So I go into Illustrator and spend hours grappling with the pen tool. Or I open up the statistical graphics package R and start setting up plots. Or (purists will be appalled) I start playing around with charts in Excel.

... suddenly I discover a brilliant graphic solution to a problem I've been grappling with for years! How to display poker hands graphically in a way that sets a series of strong hands next to the slightly better hands that win.

Other times she feels the need for a prop, à la Olivier:

I may have a vague idea about a character -- he is learning Japanese at an early age, say. But I don't know how to make this work formally, I don't know what to do with the narrative. I then buy some software that lets me input Japanese within my word-processing program. I start playing around, I come up with bits of Japanese. And suddenly I see that I can make visible the development of the character just by using a succession of kanji! I don't cut out text -- I have eliminated the need for 20 pages of text just by using this software.

Then she drops a hint about a work in progress, along with a familiar name:

Stolen Luck is a book about poker using Tuftean information design to give readers a feel for both the game and the mathematics.

DeWitt sounds like my kind of person. I wonder if I would like her novels. Maybe I'll try Lightning Rods first; it sounds like an easier read than The Last Samurai.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

November 24, 2017 12:30 PM

Thousand-Year Software

I recently read an old conversation between Neil Gaiman and Kazuo Ishiguro that started out as a discussion of genre but covered a lot of ground, including how stories mutate over time, and that the time scale of stories is so much larger than that of human lives. Here are a few of the passages about stories and time:

NG   Stories are long-lived organisms. They're bigger and older than we are.

NG   You sit there reading Pepys, and just for a minute, you kind of get to be 350, 400 years older than you are.

KI   There's an interesting emotional tension that comes because of the mismatch of lifespans in your work, because an event that might be tragic for one of us may not be so for the long-lived being.

KI   I'm often asked what my attitude is to film, theatrical, radio adaptations of my novels. It's very nice to have my story go out there, and if it's in a different form, I want the thing to mutate slightly. I don't want it to be an exact translation of my novel. I want it to be slightly different, because in a very vain kind of way, as a storyteller, I want my story to become like public property, so that it gains the status where people feel they can actually change it around and use it to express different things.

This last comment by Ishiguro made me think of open-source software. It can be adapted by anyone for almost any use. When we fork a repo and adapt it, how often does it grow into something new and considerably different? I often tell my compiler students about the long, mutated life of P-code, which was related by Chris Clark in a 1999 SIGPLAN Notices article:

P-code is an example [compiler intermediate representation] that took on a life of its own. It was invented by Nicklaus Wirth as the IL for the ETH Pascal compiler. Many variants of that compiler arose [Ne179], including the USCD Pascal compiler that was used at Stanford to define an optimizer [Cho83]. Chow's compiler evolved into the MIPS compiler suite, which was the basis for one of the DEC C compilers -- acc. That compiler did not parse the same language nor use any code from the ETH compiler, but the IL survived.

That's not software really, but a language processed by several generations of software. What are other great examples of software and languages that mutated and evolved?

We have no history with 100-year-old software yet, of course, let alone 300- or 1000-year-old software. Will we ever? Software is connected to the technology of a given time in ways that stories are not. Maybe, though, an idea that is embodied in a piece of software today could mutate and live on in new software or new technology many decades from now? The internet is a system of hardware and software that is already evolving into new forms. Will the world wide web continue to have life in a mutated form many years hence?

The Gaiman/Ishiguro conversation turned out to be more than I expected when I first found it. Good stuff. Oh, and as I wrap up this post, this passage resonates with me:

NG   I know that when I create a story, I never know what's going to work. Sometimes I will do something that I think was just a bit of fun, and people will love it and it catches fire, and sometimes I will work very hard on something that I think people will love, and it just fades: it never quite finds its people.

Been there, done that, my friend. This pretty well describes my experience blogging and tweeting all these years, and even writing for my students. I am a less reliable predictor of what will connect with readers than my big ego would ever have guessed.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

November 15, 2017 4:03 PM

A Programming Digression: Kaprekar Numbers

Earlier this week I learned about Kaprekar numbers when someone re-tweeted this my way:

Kaprekar numbers are numbers whose square in that base can be split into 2 parts that add up to the original number

So, 9 is a Kaprekar number, because 9 squared is 81 and 8+1 equals 9. 7777 is, too, because 7777 squared is 60481729 and 6048 + 1729 equals 7777.

This is the sort of numerical problem that is well-suited for the language my students are writing a compiler for this semester. I'm always looking out for fun little problems that I can use to test their creations. In previous semesters, I've blogged about computing Farey sequences and excellent numbers for just this purpose.

Who am I kidding? I just like to program, even in a small language that feels like integer assembly language, and these problems are fun!

So I sat down and wrote Klein functions to determine if a given number is a Kaprekar number and to generate all of the Kaprekar numbers less than a given number. I made one small change to the definition, though: I consider only numbers whose squares consist of an even number of digits and thus can be split in half, à la excellent numbers.

Until we have a complete compiler for our class language, I always like to write a reference program in a language such as Python so that I can validate my logic. I had a couple of meetings this morning, which gave me just the time I needed to port my solution to a Klein-like subset of Python.
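
The Klein-like version is long and slow, but the underlying logic fits in a few lines of ordinary Python. Here is a sketch of the reference check, using my restricted definition, plus a count-by-length helper of my own for the experiments that follow:

    # is n a Kaprekar number under the restricted definition: the square
    # has an even number of digits and its two halves sum to n
    def is_kaprekar(n):
        square = str(n * n)
        if len(square) % 2 != 0:
            return False
        half = len(square) // 2
        return int(square[:half]) + int(square[half:]) == n

    # e.g., 9*9 = 81 and 8+1 = 9;  7777*7777 = 60481729 and 6048+1729 = 7777

    # how many Kaprekar numbers have exactly the given number of digits
    def count_by_length(digits):
        return sum(1 for n in range(10 ** (digits - 1), 10 ** digits)
                   if is_kaprekar(n))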

When I finished my program, I still had a few meeting minutes available, so I started generating longer and longer Kaprekar numbers. I noticed that there are a bunch more 6-digit Kaprekar numbers than at any previous length:

 1: 1
 2: 3
 3: 2
 4: 5
 5: 4
 6: 24

Homer Simpson says, 'D'oh!'

I started wondering why that might be... and then realized that there are a lot more 6-digit numbers overall than 5-digit -- ten times as many, of course. (D'oh!) My embarrassing moment of innumeracy didn't kill my curiosity, though. How does that 24 compare to the trend line of Kaprekar numbers by length?

 1: 1    of        9  0.11111111
 2: 3    of       90  0.03333333
 3: 2    of      900  0.00222222
 4: 5    of     9000  0.00055555
 5: 4    of    90000  0.00004444
 6: 24   of   900000  0.00002666

There is a recognizable drop-off at each length up to six, where the percentage is an order of magnitude different than expected. Are 6-digit numbers a blip or a sign of a change in the curve? I ran another round. This took much longer, because my Klein-like Python program has to compute operations like length recursively and has no data structures for caching results. Eventually, I had a count:

 7: 6    of  9000000  0.00000066

A big drop, back in line with the earlier trend. One more round, even slower.

 8: 21   of 90000000  0.00000023

Another blip in the rate of decline. This calls for some more experimentation... There is a bit more fun to have the next time I have a couple of meetings to fill.

Image: courtesy of the Simpsons wiki.


Posted by Eugene Wallingford | Permalink | Categories: Computing

October 04, 2017 3:56 PM

A Stroll Through the Gates CS Building

sneaking up on the Gates-Hillman Complex
from Forbes St., Pittsburgh, PA

I had a couple of hours yesterday between the end of the CS education summit and my shuttle to the airport. Rather than sit in front of a computer for two more hours, I decided to take advantage of my location, wander over to the Carnegie Mellon campus, and take a leisurely walk through the Gates Center for Computer Science. I'm glad I did.

At the beginning of my tour, I was literally walking in circles, from the ground-level entrance shown in its Wikipedia photo up to where the CS offices seem to begin, up on the fourth floor. This is one of those buildings that looks odd from the outside and is quite confusing on the inside, at least to the uninitiated. But everyone inside seemed to feel at home, so maybe it works.

It didn't take long before my mind was flooded by memories of my time as a doctoral student. Michigan State's CS program isn't as big as CMU's, but everywhere I looked I saw familiar images: Students sitting in their labs or in their offices, one or two or six at a time, hacking code on big monitors, talking shop, or relaxing. The modern world was on display, too, with students lounging in comfy chairs or sitting in a little coffee shop, laptops open and earbuds in place. That was never my experience as a student, but I know it now as a faculty member.

I love to wander academic halls, in any department, really, and read what is posted on office doors and hallway walls. At CMU, I encountered the names of several people whose work I know and admire. They came from many generations... David Touretzky, whose Lisp textbook taught me a few things about programming. Jean Yang, whose work on programming languages I find cool. (I wish I were going to SPLASH later this month...) Finally, I stumbled across the office of Manuel Blum, the 1995 Turing Award winner. There were a couple of posters outside his door showing the work of his students on problems of cryptography and privacy, and on the door itself were several comic strips. The punchline of one read, "I'll retire when it stops being fun." On this, even I am in sync with a Turing Award winner.

Everywhere I turned, something caught my eye. A pointer to the Newell/Simon bridge... Newell-and-Simon, the team, were like the Pied Piper to me when I began my study of AI. A 40- or 50-page printout showing two old researchers (Newell and Simon?) playing chess. Plaques in recognition of big donations that had paid for classrooms, labs, and auditoria, made by Famous People who were either students or faculty in the school.

CMU is quite different from my school, of course, but there are many other schools that give off a similar vibe. I can see why people want to be at an R-1, even if they aspire to be teachers more than research faculty. There is so much going on. People, labs, sub-disciplines, and interdisciplinary projects. Informal talks, department seminars, and outside speakers. Always something going on. Ideas. Energy.

On the ride to the airport later in the day, I sat in some slow, heavy traffic going one direction and saw slower, heavier traffic going in the other. As much as I enjoyed the visit, I was glad to be heading home.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

October 03, 2017 12:23 PM

Why Do CS Enrollments Surges End?

The opening talk of the CS Education Summit this week considered the challenges facing CS education in a time of surging enrollments and continued concerns about the diversity of the CS student population. In the session that followed, Eric Roberts and Jodi Tims presented data that puts the current enrollment surge into perspective, in advance of a report from the National Academy of Science.

In terms of immediate takeaway, Eric Roberts's comments were gold. Eric opened with Stein's Law: If something is unsustainable, it will stop. Stein was an economist whose eponymous law expresses one of those obvious truths we all seem to forget about in periods of rapid change: If something cannot go on forever, it won't. You don't have to create a program to make it stop. A natural corollary is: If it can't go on for long, you don't need a program to deal with it. It will pass soon.

Why is that relevant to the summit? Even without continued growth, current enrollments in CS majors are unsustainable for many schools. If the past is any guide, we know that many schools will deal with unsustainable growth by limiting the number of students who start or remain in their major.

Roberts has studied the history of CS boom-and-bust cycles over the last thirty years, and he's identified a few common patterns:

  • Limiting enrollments is how departments respond to enrollment growth. They must: the big schools can't hire faculty fast enough, and most small schools can't hire new faculty at all.

  • The number of students graduating with CS degrees drops because we limit enrollments. Students do not stop enrolling because the number of job opportunities goes down, or for any other such reason.

    After the dot-com bust, there was a lot of talk about offshoring and automation, but the effects of that were short-term and rather small. Roberts's data shows that enrollment crashes do not follow crashes in job openings; they follow enrollment caps. Enrollments remain strong wherever they are not strictly limited.

  • When we limit enrollments, the effect is bigger on women and members of underserved communities. These students are more likely to suffer from impostor syndrome, stereotype bias, and other fears, and the increased competition among students for fewer openings combines with those fears to discourage them from continuing.

So the challenge of booming enrollments exacerbates the challenge to increase diversity. The boom might decrease diversity, but when it ends -- and it will, if we limit enrollments -- our diversity rarely recovers. That's the story of the last three booms.

In order to grow capacity, the most immediate solution is to hire more professors. I hope to write more about that soon, but for now I'll mention only that the problem of hiring enough faculty to teach all of our students has at least two facets. The first is that many schools simply don't have the money to hire more faculty right now. The second is that there aren't enough CS PhDs to go around. Roberts reported that, of last year's PhD grads, 83% took positions at R1 schools. That leaves 17% for the rest of us. "Non-R1 schools can expect to hire a CS PhD every 27 years." Everyone laughed, but I could see anxiety on more than a few faces.

The value of knowing this history is that, when we go to our deans and provosts, we can do more than argue for more resources. We can show the effect of not providing the resources needed to teach all the students coming our way. We won't just be putting the brakes on local growth; we may be helping to create the next enrollment crash. At a school like mine, if we teach the people of our state that we can't handle their CS students, then the people of our state will send their students elsewhere.

The problem for any one university, of course, is that it can act only based on its own resources and under local constraints. My dean and provost might care a lot about the global issues of demand for CS grads and need for greater diversity among CS students. But their job is to address local issues with their own (small) pool of money.

I'll have to re-read the papers Roberts has written about this topic. His remarks certainly gave us plenty to think about, and he was as engaging as ever.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 02, 2017 12:16 PM

The Challenge Facing CS Education

Today and tomorrow, I am at a CS Education Summit in Pittsburgh. I've only been to Pittsburgh once before, for ICFP 2002 (the International Conference on Functional Programming) and am glad to be back. It's a neat city.

The welcome address for the summit was given by Dr. Farnam Jahanian, the interim president at Carnegie Mellon University. Jahanian is a computer scientist, with a background in distributed computing and network security. His resume includes a stint as chair of the CS department at the University of Michigan and a stint at the NSF.

Welcome addresses for conferences and workshops vary in quality. Jahanian gave quite a good talk, putting the work of the summit into historical and cultural context. The current boom in CS enrollments is happening at a time when computing, broadly defined, is having an effect in seemingly all disciplines and all sectors of the economy. What does that mean for how we respond to the growth? Will we see that the current boom presages a change to the historical cycle of enrollments in coming years?

Jahanian made three statements in particular that for me capture the challenge facing CS departments everywhere and serve as a backdrop for the summit:

  • "We have to figure out how to teach all of these students."

    Unlike many past enrollment booms, "all of these students" this time comprises two very different subsets: CS majors and non-majors. We have plenty of experience teaching CS majors, but how do you structure your curriculum and classes when you have three times as many majors? When numbers go up far enough fast enough, many schools have a qualitatively different problem.

    Most departments have far less experience teaching computer science (not "literacy") to non-majors. How do you teach all of these students, with different backgrounds and expectations and needs? What do you teach them?

  • "This is an enormous responsibility."

    Today's graduates will have careers for 45 years or more. That's a long time, especially in a world that is changing ever more rapidly, in large part due to our own discipline. How different are the long-term needs of CS majors and non-majors? Both groups will be working and living for a long time after they graduate. If computing remains a central feature of the world in the future, how we respond to enrollment growth now will have an outsized effect on every graduate. An enormous responsibility, indeed.

  • "We in CS have to think about impending cultural changes..."

    ... which means that we computer science folks will need to have education, knowledge, and interests much broader than just CS. People talk all the time about the value of the humanities in undergraduate education. This is a great example of why. One bit of good news: as near as I can tell, most of the CS faculty in this room, at this summit, do have interests and education bigger than just computer science (*). But we have to find ways to work these issues into our classrooms, with both majors and non-majors.

Thus the idea of a CS education summit. I'm glad to be here.

(*) In my experience, it is much more likely to find a person with a CS or math PhD and significant educational background in the humanities than to find a person with a humanities PhD and significant educational background in CS or math (or any other science, for that matter). One of my hopes for the current trend of increasing interest in CS among non-CS majors is that we can close this gap. All of the departments on our campuses, and thus all of our university graduates, will be better for it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

September 26, 2017 3:58 PM

Learn Exceptions Later

Yesterday, I mentioned rewriting the rules for computing FIRST and FOLLOW sets using only "plain English". As I was refactoring my descriptions, I realized that one of the reasons students have difficulty with many textbook treatments of the algorithms is that the books give complete and correct definitions of the sets upfront. The presence of X := ε rules complicates the construction of both sets, but they are unnecessary to understanding the commonsense ideas that motivate the sets. Trying to deal with ε too soon can interfere with the students learning what they need to learn in order to eventually understand ε!

When I left the ε rules out of my descriptions, I ended up with what I thought was an approachable set of rules (a short code sketch follows the list):

  • The FIRST set of a terminal contains only the terminal itself.

  • To compute FIRST for a non-terminal X, find all of the grammar rules that have X on the lefthand side. Add to FIRST(X) all of the items in the FIRST set of the first symbol of each righthand side.

  • The FOLLOW set of the start symbol contains the end-of-stream marker.

  • To compute FOLLOW for a non-terminal X, find all of the grammar rules that have X on the righthand side. If X is followed by a symbol in the rule, add to FOLLOW(X) all of the items in the FIRST set of that symbol. If X is the last symbol in the rule, add to FOLLOW(X) all of the items in the FOLLOW set of the symbol on the rule's lefthand side.
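
Here is that sketch, in Python. It is only an illustration, not code from the course: the toy grammar and the names are arbitrary, and it assumes a grammar with no ε-rules, represented as a list of (lefthand side, righthand side) pairs.

    nonterminals = {"E", "T"}
    grammar = [                                # a toy grammar, for illustration only
        ("E", ["T", "+", "E"]),
        ("E", ["T"]),
        ("T", ["int"]),
        ("T", ["(", "E", ")"]),
    ]
    START, END_MARKER = "E", "$"

    def is_terminal(symbol):
        return symbol not in nonterminals

    def first_of(symbol, first):
        # Rule 1: the FIRST set of a terminal contains only the terminal itself.
        return {symbol} if is_terminal(symbol) else first[symbol]

    def first_and_follow(grammar):
        first = {x: set() for x in nonterminals}
        follow = {x: set() for x in nonterminals}
        follow[START].add(END_MARKER)          # Rule 3: FOLLOW(start) gets the end marker.
        changed = True
        while changed:                         # repeat until nothing new gets added
            changed = False
            for lhs, rhs in grammar:
                # Rule 2: add the FIRST set of the first righthand-side symbol to FIRST(lhs).
                n = len(first[lhs])
                first[lhs] |= first_of(rhs[0], first)
                changed |= len(first[lhs]) != n
                for i, sym in enumerate(rhs):
                    if is_terminal(sym):
                        continue
                    n = len(follow[sym])
                    if i + 1 < len(rhs):
                        # Rule 4a: sym is followed by a symbol; add that symbol's FIRST set.
                        follow[sym] |= first_of(rhs[i + 1], first)
                    else:
                        # Rule 4b: sym ends the rule; add FOLLOW of the lefthand side.
                        follow[sym] |= follow[lhs]
                    changed |= len(follow[sym]) != n
        return first, follow

    print(first_and_follow(grammar))

Running it on the toy grammar produces sets small enough to check by hand in a minute or two.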

These rules are incomplete, but they have offsetting benefits. Each of these cases is easy to grok with a simple example or two. They also account for a big chunk of the work students need to do in constructing the sets for a typical grammar. As a result, they can get some practice building sets before diving into the gnarlier details of ε, which affects both of the main rules above in a couple of ways.

This seems like a two-fold application of the Concrete, Then Abstract pattern. The first is the standard form: we get to see and work with accessible concrete examples before formalizing the rules in mathematical notation. The second involves the nature of the problem itself. The rules above are the concrete manifestation of FIRST and FOLLOW sets; students can master them before considering the more abstract ε cases. The abstract cases are the ones that benefit most from using formal notation.

I think this is an example of another pattern that works well when teaching. We might call it "Learn Exceptions Later", "Handle Exceptions Later", "Save Exceptions For Later", or even "Treat Exceptions as Exceptions". (Naming things is hard.) It is often possible to learn a substantial portion of an idea without considering exceptions at all, and doing so prepares students for learning the exceptions anyway.

I guess I now have at least one idea for my next PLoP paper.

Ironically, writing this post brings to mind a programming pattern that puts exceptions up top, which I learned during the summer Smalltalk taught me OOP. Instead of writing code like this:

    if normal_case(x) then
       // a bunch
       // of lines
       // of code
       // processing x
    else
       throw_an_error

you can write:

    if abnormal_case(x) then
       throw_an_error

    // a bunch
    // of lines
    // of code
    // processing x

This idiom brings the exceptional case to the top of the function and dispenses with it immediately. At the same time, it also makes the normal case the main focus of the function, unindented and clear to the eye. It may look like this idiom violates the "Save Exceptions For Later" pattern, but code of this sort can be a natural outgrowth of following the pattern. First, we implement the function to do its normal business and make sure that it handles all of the usual cases. Only then do we concern ourselves with the exceptional case, and we build it into the function with minimal disruption to the code.
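
Here is a tiny Python example of the two shapes. The function and its error check are made up purely for illustration; any guard condition would do.

    import math

    # Nested style: the normal case lives inside the conditional, the error at the end.
    def hypotenuse_nested(a, b):
        if a > 0 and b > 0:
            # ... a bunch of lines of code processing a and b ...
            return math.sqrt(a * a + b * b)
        else:
            raise ValueError("sides must be positive")

    # Guard-clause style: dispense with the abnormal case up top, then let
    # the normal case read straight down, unindented and clear to the eye.
    def hypotenuse_guarded(a, b):
        if a <= 0 or b <= 0:
            raise ValueError("sides must be positive")
        # ... a bunch of lines of code processing a and b ...
        return math.sqrt(a * a + b * b)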

This pattern has served me well over the years, far beyond Smalltalk.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

September 25, 2017 3:01 PM

A Few Thoughts on My Compilers Course

I've been meaning to blog about my compilers course for more than a month, but life -- including my compilers course -- has kept me busy. Here are three quick notes to prime the pump.

  • I recently came across Lindsey Kuper's My First Fifteen Compilers and thought again about this unusual approach to a compiler course: one compiler a week, growing last week's compiler with a new feature or capability, until you have a complete system. Long, long-time readers of this blog may remember me writing about this idea once over a decade ago.

    The approach still intrigues me. Kuper says that it was "hugely motivating" to have a working compiler at the end of each week. In the end I always shy away from the approach because (1) I'm not yet willing to adopt for my course the Scheme-enabled micro-transformation model for building a compiler and (2) I haven't figured out how to make it work for a more traditional compiler.

    I'm sure I'll remain intrigued and consider it again in the future. Your suggestions are welcome!

  • Last week, I mentioned on Twitter that I was trying to explain how to compute FIRST and FOLLOW sets using only "plain English". It was hard. Writing a textual description of the process made me appreciate the value of using and understanding mathematical notation. It is so expressive and so concise. The problem for students is that it is also quite imposing until they get it. Before then, the notation can be a roadblock on the way to understanding something at an intuitive level.

    My usual approach in class to FIRST and FOLLOW sets, as for most topics, is to start with an example, reason about it in commonsense terms, and only then to formalize. The commonsense reasoning often helps students understand the formal expression, thus removing some of its bite. It's a variant of the "Concrete, Then Abstract" pattern.

    Mathematical definitions such as these can motivate some students to develop their formal reasoning skills. Many people prefer to let students develop their "mathematical maturity" in math courses, but this is really just an avoidance mechanism. "Let the Math department fail them" may solve a practical problem, but sometimes we CS profs have to bite the bullet and help our students get better when they need it.

  • I have been trying to write more code for the course this semester, both for my enjoyment (and sanity) and for use in class. Earlier, I wrote a couple of toy programs such as a Fizzbuzz compiler. This weekend I took a deeper dive and began to implement my students' compiler project in full detail. It was a lot of fun to be deep in the mire of a real program again. I have already learned and re-learned a few things about Python, git, and bash, and I'm only a quarter of the way in! Now I just have to make time to do the rest as the semester moves forward.

In her post, Kuper said that her first compiler course was "a lot of hard work" but "the most fun I'd ever had writing code". I always tell my students that this course will be just like that for them. They are more likely to believe the first claim than the second. Diving in, I'm remembering those feelings firsthand. I think my students will be glad that I dove in. I'm reliving some of the challenges of doing everything that I ask them to do. This is already generating a new source of empathy for my students, which will probably be good for them come grading time.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

August 14, 2017 1:42 PM

Papert 1: Mathophobia, Affect, Physicality, and Love

I have finally started reading Mindstorms. I hope to write short reflections as I complete every few pages or anytime I come across something I feel compelled to write about in the moment. This is the first entry in the imagined series.

In the introduction, Papert says:

We shall see again and again that the consequences of mathophobia go far beyond obstructing the learning of mathematics and science. They interact with other endemic "cultural toxins", for example, with popular theories of aptitudes, to contaminate people's images of themselves as learners. Difficulty with school math is often the first step of an invasive intellectual process that leads us all to define ourselves as bundles of aptitudes and ineptitudes, as being "mathematical" or "not mathematical", "artistic" or "not artistic", "musical" or "not musical", "profound" or "superficial", "intelligent" or "dumb". Thus deficiency becomes identity, and learning is transformed from the early child's free exploration of the world to a chore beset by insecurities and self-imposed restrictions.

This invasive intellectual process has often deeply affected potential computer science students long before they reach the university. I would love to see Papert's dream made real early enough that young people can imagine being a computer scientist earlier. It's hard to throw off the shackles after they take hold.

~~~~

The thing that sticks out as I read the first few pages of Mindstorms is its focus on the power of affect in learning. I don't recall conscious attention to my affect having much of a role in my education; it seems I was in a continual state of "cool, I get to learn something". I didn't realize at the time just what good fortune it was to have that as a default orientation.

I'm also struck by Papert's focus on the role of physicality in learning, how we often learn best when the knowledge has a concrete manifestation in our world. I'll have to think about this more... Looking back now, abstraction always seemed natural to me.

Papert's talk of love -- falling in love with the thing we learn about, but also with the thing we use to learn it -- doesn't surprise me. I know these feelings well, even from the earliest experiences I had in kindergarten.

An outside connection that I will revisit: Frank Oppenheimer's exploratorium, an aspiration I learned about from Alan Kay. What would a computational exploratorium look like?


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 28, 2017 2:02 PM

The Need for Apprenticeship in Software Engineering Education

In his conversation with Tyler Cowen, Ben Sasse talks a bit about how students learn in our schools of public policy, business, and law:

We haven't figured out in most professional schools how to create apprenticeship models where you cycle through different aspects of what doing this kind of work will actually look like. There are ways that there are tighter feedback loops at a med school than there are going to be at a policy school. There are things that I don't think we've thought nearly enough about ways that professional school models should diverge from traditional, theoretical, academic disciplines or humanities, for example.

We see a similar continuum in what works best, and what is needed, for learning computer science and learning software engineering. Computer science education can benefit from the tighter feedback loops and such that apprenticeship provides, but it also has a substantial theoretical component that is suitable for classroom instruction. Learning to be a software engineer requires a shift to the other end of the continuum: we can learn important things in the classroom, but much of the important learning happens in the trenches, making things and getting feedback.

A few universities have made big moves in how they structure software engineering instruction, but most have taken only halting steps. They are often held back by an institutional loyalty to the traditional academic model, or out of sheer curricular habit.

The one place you see apprenticeship models in CS is, of course, graduate school. Students who enter research work in the lab under the mentorship of faculty advisors and more senior grad students. It took me a year or so in graduate school to figure out that I needed to begin to place more focus on my research ideas than on my classes. (I hadn't committed to a lab or an advisor yet.)

In lieu of a changed academic model, internships of the sort I mentioned recently can be really helpful for undergrad CS students looking to go into software development. Internships create a weird tension for faculty... Most students come back from the workplace with a new appreciation for the academic knowledge they learn in the classroom, which is good, but they also come back wondering why more of their schoolwork can't have the character of learning in the trenches. They know to want more!

Project-based courses are a way for us to bring the value of apprenticeship to the undergraduate classroom. I am looking forward to building compilers with ten hardy students this fall.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

July 11, 2017 3:17 PM

Blogging as "Loud Thinking"

This morning, I tweeted a quote from Sherry Turkle's Remembering Seymour Papert that struck a chord with a few people: "Seymour Papert saw that the computer would make it easier for thinking itself to become an object of thought." Here is another passage that struck a chord with me:

At the time of the juggling lesson, Seymour was deep in his experiments into what he called 'loud thinking'. It was what he was asking my grandfather to do. What are you trying? What are you feeling? What does it remind you of? If you want to think about thinking and the real process of learning, try to catch yourself in the act of learning. Say what comes to mind. And don't censor yourself. If this sounds like free association in psychoanalysis, it is. (When I met Seymour, he was in analysis with Greta Bibring.) And if it sounds like it could get you into personal, uncharted, maybe scary terrain, it could. But anxiety and ambivalence are part of learning as well. If not voiced, they block learning.

It occurred to me that I blog as a form of "loud thinking". I don't write many formal essays or finished pieces for my blog these days. Mostly I share thoughts as they happen and think out loud about them in writing. Usually, it's just me trying to make sense of ideas that cross my path and see where they fit in with the other things I'm learning. I find that helpful, and readers sometimes help me by sharing their own thoughts and ideas.

When I first read the phrase "loud thinking", it felt awkward, but it's already growing on me. Maybe I'll try to get my compiler students to do some loud thinking this fall.

By the way, Turkle's entire piece is touching and insightful. I really liked the way she evoked Papert's belief that we "love the objects we think with" and "think with the objects we love". (And not just because I'm an old Smalltalk programmer!) I'll let you read the rest of the piece yourself to appreciate both the notion and Turkle's storytelling.

Now, for a closing confession: I have never read Mindstorms. I've read so much about Papert and his ideas over the years, but the book has never made it to the top of my stack. I pledge to correct this egregious personal shortcoming and read it as soon as I finish the novel on my nightstand. Maybe I'll think out loud about it here soon.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 30, 2017 1:35 PM

Be Honest About Programming; The Stakes Are High

It's become commonplace of late to promote programming as fun! and something everyone will want to learn, if only they had a chance. Now, I love to program, but I've also been teaching long enough to know that not everyone takes naturally to programming. Sometimes, they warm up to it later in their careers, and sometimes, they never do.

This Quartz article takes the conventional wisdom to task as misleading:

Insisting on the glamour and fun of coding is the wrong way to acquaint kids with computer science. It insults their intelligence and plants the pernicious notion in their heads that you don't need discipline in order to progress. As anyone with even minimal exposure to making software knows, behind a minute of typing lies an hour of study.

But the author does think that people should understand code and what it means to program, but not because they necessarily will program very much themselves:

In just a few years, understanding programming will be an indispensable part of active citizenship.

This is why it's important for people to learn about programming, and why it's so important not to sell it in a way that ambushes students when they encounter it for the first time. Software development is both technically and ethically challenging. All citizens will be better equipped to participate in the world if they understand these challenges at some level. Selling the challenges short makes it harder to attract people who might be interested in the ethical challenges and harder to retain people turned off by technical challenges they weren't expecting.


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 25, 2017 10:05 AM

Work on Cool Hard Problems; It Pays Off Eventually

In The Secret Origin Story of the iPhone, we hear about the day Steve Jobs told Bas Ording, one of Apple's UI "wizards", ...

... to make a demo of scrolling through a virtual address book with multitouch. "I was super-excited," Ording says. "I thought, Yeah, it seems kind of impossible, but it would be fun to just try it." He sat down, "moused off" a phone-size section of his Mac's screen, and used it to model the iPhone surface. He and a scant few other designers had spent years experimenting with touch-based user interfaces -- and those years in the touchscreen wilderness were paying off.

I'm guessing that a lot of programmers understand what Ording felt in that moment: "It seems kind of impossible, but it would be fun to just try it..." But he and his team were ready. Sometimes, you work in the wilderness a while, and suddenly it all pays off.


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 10, 2017 10:28 AM

98% of the Web in One Sentence

Via Pinboard's creator, the always entertaining Maciej Cegłowski:

Pinboard is not much more than a thin wrapper around some carefully tuned database queries.

You are ready to make your millions. Now all you need is an idea.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

June 08, 2017 12:10 PM

We Need a Course on Mundane Data Types

Earlier this month, James Iry tweeted:

Every CS degree covers fancy data structures. But what trips up more programmers? Times. Dates. Floats. Non-English text. Currencies.

I would like to add names to Iry's list. As a person who goes by his middle name and whose full legal name includes a suffix, I've seen my name mangled over the years in ways you might not imagine -- even by a couple of computing-related organizations that shall remain nameless. (Ha!) And my name presents only a scant few of the challenges available when we consider all the different naming conventions around the world.

This topic would make a great course for undergrads. We could call it "Humble Data Types" or "Mundane Data Types". My friends who program for a living know that these are practical data types, the ones that show up in almost all software and which consume an inordinate amount of time. That's why we see pages on the web about "falsehoods programmers believe" about time, names, and addresses -- another one for our list!
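
A few of these traps are easy to demonstrate in a handful of lines of Python. This is only a made-up classroom demo, not anything from an actual syllabus:

    from decimal import Decimal

    # Floats: binary floating point cannot represent most decimal fractions exactly.
    print(0.1 + 0.2 == 0.3)          # False
    print(1.10 * 3)                  # 3.3000000000000003, not a price you want to bill

    # Currencies: a decimal type (or integer cents) avoids the rounding surprise.
    print(Decimal("1.10") * 3)       # 3.30

    # Non-English text: even "uppercase this string" has corner cases.
    print("straße".upper())          # STRASSE, and the string gets longer!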

It might be hard to sell this course to faculty. They are notoriously reluctant to add new courses to the curriculum. (What would it displace?) Such conservatism is well-founded in a discipline that moves quickly through ideas, but this is a topic that has been vexing programmers for decades.

It would also be hard to sell the course to students, because it looks a little, well, mundane. I do recall a May term class a few years ago in which a couple of programmers spent days fighting with dates and times in Ruby while building a small accounting system. That certainly created an itch, but I'm not sure most students have enough experience with such practical problems before they graduate.

Maybe we could offer the course as continuing education for programmers out in the field. They are the ones who would appreciate it the most.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 07, 2017 1:43 PM

Write a Program, Not a Slide Deck

From Compress to Impress, on Jeff Bezos's knack for encoding important strategies in concise, memorable form:

As a hyper intelligent person, Jeff didn't want lossy compression or lazy thinking, he wanted the raw feed in a structured form, and so we all shifted to writing our arguments out as essays that he'd read silently in meetings. Written language is a lossy format, too, but it has the advantage of being less forgiving of broken logic flows than slide decks.

Ask any intro CS student: Even less forgiving of broken logic than prose is the computer program.

Programs are not usually the most succinct way to express an idea, but I'm often surprised by how little work it takes to express an idea about a process in code. When a program is a viable medium for communicating an idea, it provides value in many dimensions. You can run a program, which makes the code's meaning observable in its behavior. A program lays bare logic and assumptions, making them observable, too. You can tinker with a program, looking at variations and exploring their effects.

The next time you have an idea about a process, try to express it in code. A short bit of prose may help, too.

None of this is intended to diminish the power of using rhetorical strategies to communicate at scale and across time, as described in the linked post. It's well worth a read. From the outside looking in, Bezos seems to be a remarkable leader.


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 06, 2017 2:39 PM

Using Programs and Data Analysis to Improve Writing, World Bank Edition

Last week I read a tweet that linked to an article by Paul Romer. He is an economist currently working at the World Bank, on leave from his chair at NYU. Romer writes well, so I found myself digging deeper and reading a couple of his blog articles. One of them, Writing, struck a chord with me both as a writer and as a computer scientist.

Consider:

The quality of written prose should be higher in documents that will have many readers.

This is true of code, too. If a piece of code will be read many times, whether by one person or several, then each minute spent making it shorter and clearer improves reading comprehension every single time. That's even more important in code than in text, because so often we read code in order to change it. We need to understand it at even deeper level to ensure that our changes have the intended effect. Time spent making code better repays itself many times over.

Romer caused a bit of a ruckus when he arrived at the World Bank by insisting, to some of his colleagues' displeasure, that everyone in his division write clearer, more concise reports. His goal was admirable: He wanted more people to be able to read and understand these reports, because they deal with policies that matter to the public.

He also wanted people to trust what the World Bank was saying by being able more readily to see that a claim was true or false. His article looks at two different examples that make a claim about the relationship between education spending and GDP per capita. He concludes his analysis of the examples with:

In short, no one can say that the author of the second claim wrote something that is false because no one knows what the second claim means.

In science, writing clearly builds trust. This trust is essential for communicating results to the public, of course, because members of the public do not generally possess the scientific knowledge they need to assess the truth of a claim directly. But it is also essential for communicating results to other scientists, who must understand the claims at a deeper level in order to support, falsify, and extend them.

In the second half of the article, Romer links to a study of the language used in the World Bank's yearly reports. It looks at patterns such as the frequency of the word "and" in the reports and the ratio of nouns to verbs. (See this Financial Times article for a fun little counterargument on the use of "and".)
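
The mechanics of such an analysis are simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not the study's actual code or corpus:

    import re
    from collections import Counter

    def and_share(text):
        """Return the fraction of word tokens that are the word 'and'."""
        words = re.findall(r"[a-z']+", text.lower())
        return Counter(words)["and"] / len(words) if words else 0.0

    report = "Growth and stability and inclusion and resilience are priorities."
    print(f"{and_share(report):.1%} of the words are 'and'")   # 33.3%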

Romer wants this sort of analysis to be easier to do, so that it can be used more easily to check and improve the World Bank's reports. After looking at some other patterns of possible interest, he closes with this:

To experiment with something like this, researchers in the Bank should be able to spin up a server in the cloud, download some open-source software and start experimenting, all within minutes.

Wonderful: a call for storing data in easy-to-access forms and a call for using (and writing) programs to analyze text, all in the name not of advancing economics technically but of improving its ability to communicate its results. Computing becomes a tool integrated into the process of the World Bank doing its designated job. We need more leaders in more disciplines thinking this way. Fortunately, we hear reports of such folks more often these days.

Alas, data and programs were not used in this way when Romer arrived at the World Bank:

When I arrived, this was not possible because people in ITS did not trust people from DEC and, reading between the lines, were tired of dismissive arrogance that people from DEC displayed.

One way to create more trust is to communicate better. Not being dismissively arrogant is, too, though calling that sort of behavior out may be what got Romer in so much hot water with the administrators and economists at the World Bank in the first place.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development

May 21, 2017 10:07 AM

Computer Programs Have Much to Learn, and Much to Teach Us

In his recent interview with Tyler Cowen, Garry Kasparov talks about AI, chess, politics, and the future of creativity. In one of the more intriguing passages, he explains that building databases for chess endgames has demonstrated how little we understand about the game and offers insight into how we know that chess-playing computer programs -- now so far beyond humans that even the world champion can only score occasionally against commodity programs -- still have a long way to improve.

He gives as an example a particular position with a king, two rooks, and a knight on one side versus a king and two rooks on the other. Through the retrograde analysis used to construct endgame databases, we know that, with ideal play by both sides, the stronger side can force checkmate in 490 moves. Yes, 490. Kasparov says:

Now, I can tell you that -- even being a very decent player -- for the first 400 moves, I could hardly understand why these pieces moved around like a dance. It's endless dance around the board. You don't see any pattern, trust me. No pattern, because they move from one side to another.

At certain points I saw, "Oh, but white's position has deteriorated. It was better 50 moves before." The question is -- and this is a big question -- if there are certain positions in these endgames, like seven-piece endgames, that take, by the best play of both sides, 500 moves to win the game, what does it tell us about the quality of the game that we play, which is an average 50 moves? [...]

Maybe with machines, we can actually move our knowledge much further, and we can understand how to play decent games at much greater lengths.

But there's more. Do chess-playing computer programs, so much superior to even the best human players, understand these endgames either? I don't mean "understand" in the human sense, but only in the sense of being able to play games of that quality. Kasparov moves on to his analysis of games between the best programs:

I think you can confirm my observations that there's something strange in these games. First of all, they are longer, of course. They are much longer because machines don't make the same mistakes [we do] so they could play 70, 80 moves, 100 moves. [That is] way, way below what we expect from perfect chess.

That tells us that [the] machines are not perfect. Most of those games are decided by one of the machines suddenly. Can I call it losing patience? Because you're in a position that is roughly even. [...] The pieces are all over, and then suddenly one machine makes a, you may call, human mistake. Suddenly it loses patience, and it tries to break up without a good reason behind it.

That also tells us [...] that machines also have, you may call it, psychology, the pattern and the decision-making. If you understand this pattern, we can make certain predictions.

Kasparov is heartened by this, and it's part of the reason that he is not as pessimistic about the near-term prospects of AI as some well-known scientists and engineers are. Even with so-called deep learning, our programs are only beginning to scratch the surface of complexity in the universe. There is no particular reason to think that the opaque systems evolved to drive our cars and fly our drones will be any more perfect in their domains than our game-playing programs, and we have strong evidence from the domain of games that programs are still far from perfect.

On a more optimistic note, advances in AI give us an opportunity to use programs to help us understand the world better and to improve our own judgment. Kasparov sees this in chess, in the big gaps between the best human play, the best computer play, and perfect play in even relatively simple positions; I wrote wistfully about this last year, prompted by AlphaGo's breakthrough. But the opportunity is much more valuable when we move beyond playing games, as Cowen alluded in an aside during Kasparov's explanation: Imagine how bad our politics will look in comparison to computer programs that do it well! We have much to learn.

As always, this episode of Conversations with Tyler was interesting and evocative throughout. If you are a chess player, there is a special bonus. The transcript includes a pointer to Kasparov's Immortal Game against Veselin Topalov at Wijk aan Zee in 1999, along with a discussion of some of Kasparov's thoughts on the game beginning with the pivotal move 24. Rxd4. This game, an object of uncommon beauty, will stand as an eternal reminder why, even in the face of advancing AI, it will always matter that people play and compete and create.

~~~~

If you enjoyed this entry, you might also like Old Dreams Live On. It looks more foresightful now that AlphaGo has arrived.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

April 28, 2017 11:27 AM

Data Compression and the Complexity of Consciousness

Okay, so this is cool:

Neuroscientists stimulate the brain with brief pulses of energy and then record the echoes that bounce back. Dreamless sleep and general anaesthesia return simple echoes; brains in conscious states produce more complex patterns. Then comes a little inspiration from data compression:

Excitingly, we can now quantify the complexity of these echoes by working out how compressible they are, similar to how simple algorithms compress digital photos into JPEG files. The ability to do this represents a first step towards a "consciousness-meter" that is both practically useful and theoretically motivated.

This made me think of Chris Ford's StrangeLoop 2015 talk about using compression to understand music. Using compressibility as a proxy for complexity gives us a first opportunity to understand all sorts of phenomena about which we are collecting data. Kolmogorov complexity is a fun tool for programmers to wield.
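
A toy version of the idea, using an off-the-shelf compressor as a rough stand-in for Kolmogorov complexity, takes only a few lines of Python. (The neuroscientists' measure is more sophisticated than this, of course.)

    import os
    import zlib

    def compressibility(data):
        """Compressed size as a fraction of the original: lower means more regular."""
        return len(zlib.compress(data, 9)) / len(data)

    regular = b"ab" * 5000        # a highly repetitive "echo"
    noisy   = os.urandom(10000)   # noise, which compression cannot squeeze

    print(compressibility(regular))   # a tiny fraction, well under 0.01
    print(compressibility(noisy))     # about 1.0, sometimes a touch more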

The passage above is from an Aeon article on the study of consciousness. I found it an interesting read from beginning to end.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

March 29, 2017 4:16 PM

Working Through A Problem Manually

This week, I have been enjoying Eli Bendersky's two-article series "Adventures in JIT Compilation".

Next I'll follow his suggestion and read the shorter How to JIT - An Introduction.

Bendersky is a good teacher, at least in the written form, and I am picking up a lot of ideas for my courses in programming languages and compilers. I recommend his articles and his code highly.

In Part 2, Bendersky says something that made me think of my students:

One of my guiding principles through the field of programming is that before diving into the possible solutions for a problem (for example, some library for doing X) it's worth working through the problem manually first (doing X by hand, without libraries). Grinding your teeth over issues for a while is the best way to appreciate what the shrinkwrapped solution/library does for you.

The presence or absence of this attitude is one of the crucial separators among CS students. Some students come into the program with this mindset already in place, and they are often the ones who advance most quickly in the early courses. Other students don't have this mindset, either by interest or by temperament. They prefer to solve problems quickly using canned libraries and simple patterns. These students are often quite productive, but they sometimes soon hit a wall in their learning. When a student rides along the surface of what they are told in class, never digging deeper, they tend to have a shallow knowledge of how things work in their own programs. Again, this can lead to a high level of productivity, but it also produces brittle knowledge. When something changes, or the material gets more difficult, they begin to struggle. A few of the students eventually develop new habits and move nicely into the group of students who likes to grind. The ones who don't make the transition continue to struggle and begin to enjoy their courses less.

There is a rather wide variation among undergrad CS students, both in their goals and in their preferred styles of working and learning. This variation is one of the challenges facing profs who hope to reach the full spectrum of students in their classes. And helping students to develop new attitudes toward learning and doing is always a challenge.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 22, 2017 4:50 PM

Part of the Fun of Programming

As I got ready for class yesterday morning, I decided to refactor a piece of code. No big deal, right? It turned out to be a bigger deal than I expected. That's part of the fun of programming.

The function in question is a lexical addresser for a little language we use as a specimen in my Programming Languages course. My students had been working on a design, and it was time for us to build a solution as a group. Looking at my code from the previous semester, I thought that changing the order of two cases would make for a better story in class. The cases are independent, so I swapped them and ran my tests.

The change broke my code. It turns out that the old "else" clause had been serving as a convenient catch-all and was only working correctly due to an error in another function. Swapping the cases exposed the error.

Ordinarily, this wouldn't be a big deal, either. I would simply fix the code and give my students a correct solution. Unfortunately, I had less than an hour before class, so I now found myself in a scramble to find the bug, fix it, and make the changes to my lecture notes that had motivated the refactor in the first place. Making changes like this under time pressure is rarely a good idea... I was tempted to revert to the previous version, teach class, and make the change after class. But I am a programmer, dogged and often foolhardy, so I pressed on. With a few minutes to spare, I closed the editor on my lecture notes and synced the files to my teaching machine. I was tired and still had a little nervous energy coursing through me, but I felt great. That's part of the fun of programming.

I will say this: Boy, was I glad to have my test suite! It was incomplete, of course, because I found an error in my program. But the tests I did have helped me to know that my bug fix had not broken something else unexpectedly. The error I found led to several new tests that make the test suite stronger.

This experience was fresh in my mind this morning when I read "Physics Was Paradise", an interview with Melissa Franklin, a distinguished experimental particle physicist at Harvard. At one point, Franklin mentioned taking her first physics course in high school. The interviewer asked if physics immediately stood out as something she would dedicate her life to. Franklin responded:

Physics is interesting, but it didn't all of a sudden grab me because introductory physics doesn't automatically grab people. At that time, I was still interested in being a writer or a philosopher.

I took my first programming class in high school and, while I liked it very much, it did not cause me to change my longstanding intention to major in architecture. After starting in the architecture program, I began to sense that, while I liked architecture and had much to learn from it, computer science was where my future lay. Maybe somewhere deep in my mind was memory of an experience like the one I had yesterday, battling a piece of code and coming out with a sense of accomplishment and a desire to do battle again. I didn't feel the same way when working on problems in my architecture courses.

Intro CS, like intro physics, doesn't always snatch people away from their goals and dreams. But if you enjoy the fun of programming, eventually it sneaks up on you.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Software Development

March 18, 2017 11:42 AM

Hidden Figures, Douglas Engelbart Edition

At one point in this SOHP interview, Douglas Engelbart describes his work at the Ames Research Center after graduating from college. He was an electrical engineer, building and maintaining wind tunnels, paging systems, and other special electronics. Looking to make a connection between this job and his future work, the interviewer asked, "Did they have big computers running the various operations?" Engelbart said:

I'll tell you what a computer was in those days. It was an underpaid woman sitting there with a hand calculator, and they'd have rooms full of them, that's how they got their computing done. So you'd say, "What's your job?" "I'm a computer."

Later in the interview, Engelbart talks about how his experience working with radar in the Navy contributed to his idea for a symbol-manipulating system that could help people deal with complexity and urgency. He viewed the numeric calculations done by the human computers at Ames as being something different. Still, I wonder how much this model of parallel computing contributed to his ideas, if only implicitly.


Posted by Eugene Wallingford | Permalink | Categories: Computing

March 17, 2017 9:27 AM

What It Must Feel Like to be Ivan Sutherland

In The Victorian Internet, Tom Standage says this about the unexpected challenge facing William Cooke and Samuel Morse, the inventors of the telegraph:

[They] had done the impossible and constructed working telegraphs. Surely the world would fall at their feet. Building the prototypes, however, turned out to be the easy part. Convincing people of their significance was far more of a challenge.

That must be what it feels like to be Ivan Sutherland. Or Alan Kay, for that matter.


Posted by Eugene Wallingford | Permalink | Categories: Computing

March 16, 2017 8:50 AM

Studying Code Is More Like Natural Science Than Reading

A key passage from Peter Seibel's 2014 essay, Code Is Not Literature:

But then it hit me. Code is not literature and we are not readers. Rather, interesting pieces of code are specimens and we are naturalists. So instead of trying to pick out a piece of code and reading it and then discussing it like a bunch of Comp Lit. grad students, I think a better model is for one of us to play the role of a 19th century naturalist returning from a trip to some exotic island to present to the local scientific society a discussion of the crazy beetles they found: "Look at the antenna on this monster! They look incredibly ungainly but the male of the species can use these to kill small frogs in whose carcass the females lay their eggs."

The point of such a presentation is to take a piece of code that the presenter has understood deeply and for them to help the audience understand the core ideas by pointing them out amidst the layers of evolutionary detritus (a.k.a. kludges) that are also part of almost all code. One reasonable approach might be to show the real code and then to show a stripped down reimplementation of just the key bits, kind of like a biologist staining a specimen to make various features easier to discern.

My scientist friends often like to joke that CS isn't science, even as they admire the work that computer scientists and programmers do. I think Seibel's essay expresses nicely one way in which studying software really is like what natural scientists do. True, programs are created by people; they don't exist in the world as we find it. (At least programs in the sense of code written by humans to run on a computer.) But they are created under conditions that look a lot more like biological evolution than, say, civil engineering.

As Hal Abelson says in the essay, most real programs end up containing a lot of stuff just to make it work in the complex environments in which they operate. The extraneous stuff enables the program to connect to external APIs and plug into existing frameworks and function properly in various settings. But the extraneous stuff is not the core idea of the program.

When we study code, we have to slash our way through the brush to find this core. When dealing with complex programs, this is not easy. The evidence of adaptation and accretion obscures everything we see. Many people do what Seibel does when they approach a new, hairy piece of code: they refactor it, decoding the meaning of the program and re-coding it in a structure that communicates their understanding in terms that express how they understand it. Who knows; the original program may well have looked like this simple core once, before it evolved strange appendages in order to adapt to the condition in which it needed to live.

The folks who helped to build the software patterns community recognized this. They accepted that every big program "in the wild" is complex and full of cruft. But they also asserted that we can study such programs and identify the recurring structures that enable complex software both to function as intended and to be open to change and growth at the hands of programmers.

One of the holy grails of software engineering is to find a way to express the core of a system in a clear way, segregating the extraneous stuff into modules that capture the key roles that each piece of cruft plays. Alas, our programs usually end up more like the human body: a mass of kludges that intertwine to do some very cool stuff just well enough to succeed in a harsh world.

And so: when we read code, we really do need to bring the mindset and skills of a scientist to our task. It's not like reading People magazine.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

February 19, 2017 10:42 AM

Bret Victor's Advice for Reading Alan Kay

In Links 2013, Bret Victor offers these bits of advice for how to read Alan Kay's writings and listen to his talks:

As you read and watch Alan Kay, try not to think about computational technology, but about a society that is fluent in thinking and debating in the dimensions opened up by the computational medium.

Don't think about "coding" (that's ink and metal type, already obsolete), and don't think about "software developers" (medieval scribes only make sense in an illiterate society).

I have always been inspired and challenged by Kay's work. One of the second-order challenges I face is to remember that his vision is not ultimately about people like me writing programs. It's about a culture in which every person can use computational media the way we all use backs of envelopes, sketch books, newspapers, and books today. Computation can change the way we think and exchange ideas.

Then again, it's hard enough to teach CS students to program. That is a sign that we still have work to do in understanding programming better, and also in thinking about the kind of tools we build and use. In terms of Douglas Engelbart's work, also prominently featured among Victor's 2013 influences -- we need to build tools to improve how we program before we can build tools to "improve the improving".

Links 2013 could be the reading list for an amazing seminar. There are no softballs there.


Posted by Eugene Wallingford | Permalink | Categories: Computing

February 09, 2017 4:25 PM

Knowing, Doing, and Ubiquitous Information

I was recently reading an old bit-player entry on computing number factoids when I ran across a paragraph that expresses an all-too-common phenomenon of the modern world:

If I had persisted in my wrestling match, would I have ultimately prevailed? I'll never know, because in this era of Google and MathOverflow and StackExchange, a spoiler lurks around every cybercorner. Before I could make any further progress, I stumbled upon pointers to the work of Ira Gessel of Brandeis, who neatly settled the matter ... more than 40 years ago, when he was an undergraduate at Harvard.

The matter in this case was recognizing whether an arbitrary n is a Fibonacci number or not, but it could have been just about anything. If you need an answer to almost any question these days, it's already out there, right at your fingertips.
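
(If you would rather not be spoiled yourself, skip this aside. Gessel's result gives a neat closed-form test: a positive integer n is a Fibonacci number exactly when 5n^2 + 4 or 5n^2 - 4 is a perfect square. A few lines of Python, written here only to illustrate the test:)

    import math

    def is_fibonacci(n):
        """Gessel's test: n is Fibonacci iff 5n^2 + 4 or 5n^2 - 4 is a perfect square."""
        def is_square(m):
            return m >= 0 and math.isqrt(m) ** 2 == m
        return is_square(5 * n * n + 4) or is_square(5 * n * n - 4)

    print([k for k in range(1, 60) if is_fibonacci(k)])   # [1, 2, 3, 5, 8, 13, 21, 34, 55]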

Google and StackExchange and MathOverflow are a boon for knowing, but not so much for doing. Unfortunately, doing often leads to a better kind of knowing. Jumping directly to the solution can rob us of some important learning. As Hayes reminds us in his articles, it can also deprive us of a lot of fun.

You can still learn by doing and have a lot of fun doing it today -- if you can resist the temptation to search. After you struggle for a while and need some help, then having answers at our fingertips becomes a truly magnificent resource and can help us get over humps we could never have gotten over so quickly in even the not-the-so-distant past.

The new world puts a premium on curiosity, the desire to find answers for ourselves. It also values self-denial, the ability to delay gratification while working hard to find answers that we might be able to look up. I fear that this creates a new gap for us to worry about in our education systems. Students who are curious and capable of self-denial are a new kind of "haves". They have always had a leg up in schools, but ubiquitous information magnifies the gap.

Being curious, asking questions, and wanting to create (not just look up) answers have never been more important to learning.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 11, 2017 2:22 PM

An Undergraduate Research Project for Spring

Coming into the semester, I didn't have any students doing their undergraduate research under my supervision. That frees up some time each week, which is nice, but leaves my semester a bit poorer. Working with students one-on-one is one of the best parts of this job, even more so in relief against administrative duties. Working on these projects makes my weeks better, even when I don't have as much time to devote to them as I'd like.

Yesterday, a student walked in with a project that makes my semester a little busier -- and much more interesting. Last summer, he implemented some ideas on extensible effects in Haskell and has some ideas for ways to make the system more efficient.

This student knows a lot more about extensible effects and Haskell than I do, so I have some work to do just to get ready to help. I'll start with Extensible Effects: An Alternative to Monad Transformers, the paper by Oleg Kiselyov and his colleagues that introduced the idea to the wider computing community. This paper builds on work by Cartwright and Felleisen, published over twenty years ago, which I'll probably look at, too. The student has a couple of other things for me to read, which will appear in his more formal proposal this week. I expect that these papers will make my brain hurt, in the good way, and am looking forward to diving in.

In the big picture, most undergrad projects in my department are pretty minor as research goes. They are typically more D than R, with students developing something that goes beyond what they learn in any course and doing a little empirical analysis. The extensible effects project is much more ambitious. It builds on serious academic research. It works on a significant problem and proposes something new. That makes the project much more exciting for me as the supervisor.

I hope to report more later, as the semester goes on.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 10, 2017 4:07 PM

Garbage Collection -- and the Tempting Illusion of Big Breakthroughs

Good advice in this paragraph, paraphrased lightly from Modern Garbage Collection:

Garbage collection is a hard problem, really hard, one that has been studied by an army of computer scientists for decades. Be very suspicious of supposed breakthroughs that everyone else missed. They are more likely to just be strange or unusual tradeoffs in disguise, avoided by others for reasons that may only become apparent later.

It's wise always to be on the lookout for "strange or unusual tradeoffs in disguise".


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

January 08, 2017 9:48 AM

Finding a Balance Between Teaching and Doing

In the Paris Review's The Art of Fiction No. 183, the interviewer asks Tobias Wolff how he balances writing with university teaching. Wolff figures that teaching is a pretty good deal:

When I think about the kinds of jobs I've had and the ways I've lived, and still managed to get work done--my God, teaching in a university looks like easy street. I like talking about books, and I like encountering other smart, passionate readers, and feeling the friction of their thoughts against mine. Teaching forces me to articulate what otherwise would remain inchoate in my thoughts about what I read. I find that valuable, to bring things to a boil.

That reflects how I feel, too, as someone who loves to do computer science and write programs. As a teacher, I get to talk about cool ideas every day with my students, to share what I learn as I write software, and to learn from them as they ask the questions I've stopped asking myself. And they pay me. It's a great deal, perhaps the optimal point in the sort of balance that Derek Sivers recommends.

Wolff immediately followed those sentences with a caution that also strikes close to home:

But if I teach too much it begins to weigh on me--I lose my work. I can't afford to do that anymore, so I keep a fairly light teaching schedule.

One has to balance creative work with the other parts of life that feed the work. Professors at research universities, such as Wolff at Stanford, have different points of equilibrium available to them than profs at teaching universities, where course loads are heavier and usually harder to reduce.

I only teach one course a semester, which really does help me to focus creative energies around a smaller set of ideas than a heavier load does. Of course, I also have the administrative duties of a department head. They suffocate time and energy in a much less productive way than teaching does. (That's the subject of another post.)

Why can't Wolff afford to teach too many courses anymore? I suspect the answer is time. When you reach a certain age, you realize that time is no longer an ally. There are only so many years left, and Wolff probably feels the need to write more urgently. This sensation has been seeping into my mind lately, too, though I fear perhaps a bit too slowly.

~~~~

(I previously quoted Wolff from the same interview in a recent entry about writers who give advice that reminds us that there is no right way to write all programs. A lot of readers seemed to like that one.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

December 26, 2016 8:38 AM

Learn By Programming

The latest edition of my compiler course has wrapped, with grades submitted and now a few days' distance between us and the work. The course was successful in many ways, even though not all of the teams were able to implement the entire compiler. That mutes the students' sense of accomplishment sometimes, but it's not unusual for at least some of the teams to have trouble implementing a complete code generator. A compiler is a big project. Fifteen weeks is not a lot of time. In that time, students learn a lot about compilers, and also about how to work as a team to build a big program using some of the tools of modern software development. In general, I was quite proud of the students' efforts and progress. I hope they were proud of themselves.

One of the meta-lessons students tend to learn in this course is one of the big lessons of any project-centered course:

... making something is a different learning experience from remembering something.

I think that a course like this one also helps most of them learn something else even more personal:

... the discipline in art-making is exercised from within rather than without. You quickly realize that it's your own laziness, ignorance, and sloppiness, not somebody else's bad advice, that are getting in your way. No one can write your [program] for you. You have to figure out a way to write it yourself. You have to make a something where there was a nothing.

"Laziness", "ignorance", and "sloppiness" seem like harsh words, but really they aren't. They are simply labels for weaknesses that almost all of us face when we first learn to create things on our own. Anyone who has written a big program has probably encountered them in some form.

I learned these lessons as a senior, too, in my university's two-term project course. It's never fun to come up short of our hopes or expectations. But most of us do it occasionally, and never more reliably than when we are first learning how to make something significant. It is good for us to realize early on our own responsibility for how we work and what we make. It empowers us to take charge of our behavior.

Black Mountain College's Lake Eden campus

The quoted passages are, with the exception of the word "program", taken from Learn by Painting, a New Yorker article about "Leap Before You Look: Black Mountain College, 1933-1957", an exhibit at the Institute of Contemporary Art in Boston. Black Mountain was a liberal arts college with a curriculum built on top of an unusual foundation: making art. Though the college lasted less than a quarter century, its effects were felt across most of the art disciplines in the twentieth century. But its mission was bigger: to educate citizens, not artists, through the making of art. Making something is a different learning experience from remembering something, and BMC wanted all of its graduates to have this experience.

The article was a good read throughout. It closes with a comment on Black Mountain's vision that touches on computer science and reflects my own thinking about programming. This final paragraph begins with a slight indignity to us in CS but turns quickly into admiration:

People who teach in the traditional liberal-arts fields today are sometimes aghast at the avidity with which undergraduates flock to courses in tech fields, like computer science. Maybe those students see dollar signs in coding. Why shouldn't they? Right now, tech is where value is being created, as they say. But maybe students are also excited to take courses in which knowing and making are part of the same learning process. Those tech courses are hands-on, collaborative, materials-based (well, virtual materials), and experimental -- a digital Black Mountain curriculum.

When I meet with prospective students and their parents, I stress that, while computer science is technical, it is not vocational. It's more. Many high school students sense this already. What attracts them to the major is a desire to make things: games and apps and websites and .... Earning potential appeals to some of them, of course, but students and parents alike seem more interested in something else that CS offers them: the ability to make things that matter in the modern world. They want to create.

The good news suggested in "Learn by Painting", drawing on the Black Mountain College experiment, is that learning by making things is more than just that. It is a different and, in most ways, more meaningful way to learn about the world. It also teaches you a lot about yourself.

I hope that at least a few of my students got that out of their project course with me, in addition to whatever they learned about compilers.

~~~~

IMAGE. The main building of the former Black Mountain College, on the grounds of Camp Rockmont, a summer camp for boys. Courtesy of Wikipedia. Public domain.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 21, 2016 2:31 PM

Retaining a Sense of Wonder

A friend of mine recently shared a link to Radio Garden on a mailing list (remember those?), and in the ensuing conversation, another friend wrote:

I remember when I was a kid playing with my Dad's shortwave radio and just being flabbergasted when late one night I tuned in a station from Peru. Today you can get on your computer and communicate instantly with any spot on the globe, and that engenders no sense of wonder at all.

Such is the nature of advancing technology. Everyone becomes acclimated to amazing new things, and pretty soon they aren't even things any more.

Teachers face a particularly troublesome version of this phenomenon. Teach a subject for a few years, and pretty soon it loses its magic for you. It's all new to your students, though, and if you can let them help you see it through their eyes, you can stay fresh. The danger, though, is that it starts to look pretty ordinary to you, even boring, and you have a hard time helping them feel the magic.

If you read this blog much, you know that I'm pretty easy to amuse and pretty easy to make happy. Even so, I have to guard against taking life and computer science for granted.

Earlier this week, I was reading one of the newer tutorials in Matthew Butterick's Beautiful Racket, Imagine a language: wires. In it, he builds a DSL to solve one of the problems in the 2015 edition of Advent of Code, Some Assembly Required. The problem is fun, specifying a circuit in terms of a small set of operations for wires and gates. Butterick's approach to solving it is fun, too: creating a DSL that treats the specification of a circuit as a program to interpret.

This is no big deal to a jaded old computer scientist, but remember -- or imagine -- what this solution must seem like to a non-computer scientist or to a CS student encountering the study of programming languages for the first time. With a suitable interpreter, every dataset is a program. If that isn't amazing enough, some wires datasets introduce sequencing problems, because the inputs to a gate are defined in the program after the gate. Butterick uses a simple little trick: define wires and gates as functions, not data. This simple little trick is really a big idea in disguise: Functions defer computation. Now circuit programs can be written in any order and executed on demand.
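
To make the trick concrete, here is a toy version of the idea in Python rather than Racket -- my own sketch, not Butterick's code, with made-up wire names:

    wires = {}

    # Each wire is a zero-argument function. These definitions refer to wires
    # that appear "later" in the source, which is fine: no lookup happens
    # until a signal is finally demanded.
    wires["a"] = lambda: wires["b"]() & wires["c"]()   # a = b AND c
    wires["b"] = lambda: wires["d"]() | 8              # b = d OR 8
    wires["c"] = lambda: 123                           # c = 123
    wires["d"] = lambda: 456 >> 2                      # d = 456 RSHIFT 2

    print(wires["a"]())    # forces the whole circuit on demand: prints 122

For the full puzzle one would also want to cache each wire's value, but the essential idea is just this: functions defer computation, so the definitions can arrive in any order.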

Even after all these years, computing's most mundane ideas can still astonish me sometimes. I am trying to keep my sense of wonder high and to renew it whenever it starts to flag. This is good for me, and good for my students.

~~~~

P.S. As always, I recommend Beautiful Racket, and Matthew Butterick's work more generally, quite highly. He has a nice way of teaching useful ideas in a way that appreciates their beauty.

P.P.S. The working title of this entry was "Paging Louis C.K., Paging Louis C.K." That reference may be a bit dated by now, but still it made me smile.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 16, 2016 2:14 PM

Language and Thinking

Earlier this week, Rands tweeted:

Tinkering is a deceptively high value activity.

... to which I followed up:

Which is why a language that enables tinkering is a deceptively high value tool.

I thought about these ideas a couple of days later when I read The Running Conversation in Your Head and came across this paragraph:

The idea is not that you need language for thinking but that when language comes along, it sure is useful. It changes the way you think, it allows you to operate in different ways because you can use the words as tools.

This is how I think about programming in general and about new, and better, programming languages in particular. A programmer can think quite well in just about any language. Many of us cut our teeth in BASIC, and simply learning how to think computationally allowed us to think differently than we did before. But then we learn a radically different or more powerful language, and suddenly we are able to think new thoughts, thoughts we didn't even conceive of in quite the same way before.

It's not that we need the new language in order to think, but when it comes along, it allows us to operate in different ways. New concepts become new tools.

I am looking forward to introducing Racket and functional programming to a new group of students this spring semester. First-class functions and higher-order functions can change how students think about the most basic computations such as loops and about higher-level techniques such as OOP. I hope to do a better job this time around helping them see the ways in which it really is different.
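
As a tiny illustration of the shift -- in Python here, though the course itself uses Racket, and with numbers I made up -- compare an explicit loop with the higher-order version of the same computation:

    numbers = [3, 1, 4, 1, 5, 9, 2, 6]

    # the loop most students would write first: sum the squares of the evens
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n

    # the same computation, once functions become values we can pass around
    total2 = sum(map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers)))

    print(total, total2)    # 56 56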

To echo the Running Conversation article again, when we learn a new programming style or language, "Something really special is created. And the thing that is created might well be unique in the universe."


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

December 12, 2016 3:15 PM

Computer Science Is Not That Special

I'm reminded of a student I met with once who told me that he planned to go to law school, and then a few minutes later, when going over a draft of a lab report, said "Yeah... Grammar isn't really my thing." Explaining why I busted up laughing took a while.

When I ask prospective students why they decided not to pursue a CS degree, they often say things to the effect of "Computer science seemed cool, but I heard getting a degree in CS was a lot of work." or "A buddy of mine told me that programming is tedious." Sometimes, I meet these students as they return to the university to get a second degree -- in computer science. Their reasons for returning vary from the economic (a desire for better career opportunities) to personal (a desire to do something that they have always wanted to do, or to pursue a newfound creative interest).

After you've been in the working world a while, a little hard work and some occasional tedium don't seem like deal breakers any more.

Such conversations were on my mind as I read physicist Chad Orzel's recent Science Is Not THAT Special. In this article, Orzel responds to the conventional wisdom that becoming a scientist and doing science involve a lot of hard work that is unlike the exciting stuff that draws kids to science in the first place. Then, when kids encounter the drudgery and hard work, they turn away from science as a potential career.

Orzel's takedown of this idea is spot on. (The quoted passage above is one of the article's lighter moments in confronting the stereotype.) Sure, doing science involves a lot of tedium, but this problem is not unique to science. Getting good at anything requires a lot of hard work and tedious attention to detail. Every job, every area of expertise, has its moments of drudgery. Even the rare few who become professional athletes and artists, with careers generally thought of as dreams that enable people to earn a living doing the thing they love, spend endless hours engaged in the drudgery of practicing technique and automatizing physical actions that become their professional vocabulary.

Why do we act as if science is any different, or should be?

Computer science gets this rap, too. What could be worse than fighting with a compiler to get it to accept your program while you are learning to code? Or plowing through reams of poorly documented API descriptions to plug your code into someone's e-commerce system?

Personally, I can think of lots of things that are worse. I am under no illusion, however, that other professionals are somehow shielded from such negative experiences. I just prefer my pains to theirs.

Maybe some people don't like certain kinds of drudgery. That's fair. Sometimes we gravitate toward the things whose drudgery we don't mind, and sometimes we come to accept the drudgery of the things we love to do. I'm not sure which explains my fascination with programming. I certainly enjoy the drudgery of computer science more than that of most other activities -- or at least I suffer it more gladly.

I'm with Orzel. Let's be honest with ourselves and our students that getting good at anything takes a lot of hard work and, once you master something, you'll occasionally face some tedium in the trenches. Science, and computer science in particular, is not that much different from anything else.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

December 09, 2016 1:54 PM

Two Quick Hits with a Mathematical Flavor

I've been wanting to write a blog entry or two lately about my compiler course and about papers I've read recently, but I've not managed to free up much time as the semester winds down. That's one of the problems with having Big Ideas to write about: they seem daunting and, at the very least, take time to work through.

So instead here are two brief notes about articles that crossed my newsfeed recently and planted themselves in my mind. Perhaps you will enjoy them even without much commentary from me.

A Student's Unusual Proof Might Be A Better Proof

I asked a student to show that between any two rationals is a rational.

She did the following: if x < y are rational then take δ << y-x and rational and use x+δ.

I love the student's two proofs in the article! Student programmers are similarly creative. Their unusual solutions often expose biases in my thinking and give me new ways to think about a problem. If nothing else, they help me to understand better how students think about ideas that I take for granted.
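
Concrete numbers make the student's construction easy to check at the keyboard. Here is a quick look in Python, using an instance of my own choosing rather than anything from the article:

    from fractions import Fraction

    x, y = Fraction(1, 3), Fraction(1, 2)
    delta = Fraction(1, 8)        # any rational with 0 < delta < y - x will do
    assert 0 < delta < y - x
    print(x < x + delta < y)      # True: 11/24 sits between 1/3 and 1/2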

Numberless Word Problems

Some girls entered a school art competition. Fewer boys than girls entered the competition.

She projected her screen and asked, "What math do you see in this problem?"

Pregnant pause.

"There isn't any math. There aren't any numbers."

I am fascinated by the possibility of adapting this idea to teaching students to think like a programmer. In an intro course, for example, students struggle with computational ideas such as loops and functions even though they have a lot of experience with these ideas embodied in their daily lives. Perhaps the language we use gets in the way of them developing their own algorithmic skills. Maybe I could use computationless word problems to get them started?

I'm giving serious thought to ways I might use this approach to help students learn functional programming in my Programming Languages course this spring. The author describes how to write numberless word problems, and I'm wondering how I might bring the philosophy to computer science. If you have any ideas, please share!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

November 03, 2016 3:53 PM

TIL Today: The Power of Limiting Random Choices

This morning I read this blog post by Dan Luu on the use of randomized algorithms in cache eviction. He mentions a paper by Mitzenmacher, Richa, and Sitaraman called The Power of Two Random Choices that explains a cool effect we see when we limit our options when choosing among multiple options. Luu summarizes the result:

The mathematical intuition is that if we (randomly) throw n balls into n bins, the maximum number of balls in any bin is O(log n / log log n) with high probability, which is pretty much just O(log n). But if (instead of choosing randomly) we choose the least loaded of k random bins, the maximum is O(log log n / log k) with high probability, i.e., even with two random choices, it's basically O(log log n) and each additional choice only reduces the load by a constant factor.
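
A few lines of simulation make the effect tangible. This is my own quick sketch, not code from the paper or from Luu's post:

    import random

    def max_load(n, k):
        """Throw n balls into n bins, placing each in the least loaded of k random bins."""
        bins = [0] * n
        for _ in range(n):
            candidates = random.sample(range(n), k)
            best = min(candidates, key=lambda i: bins[i])
            bins[best] += 1
        return max(bins)

    n = 100000
    print("one choice: ", max_load(n, 1))
    print("two choices:", max_load(n, 2))   # consistently, noticeably smaller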

Luu used this result to create a cache eviction policy that outperforms random eviction across the board and competes closely with the traditional LRU policy. It chooses two pages at random and evicts the least-recently used of the two. This so-called 2-random algorithm slightly outperforms LRU in larger caches and slightly underperforms LRU in smaller caches. This trade-off may be worth making because, unlike LRU, random eviction policies degrade gracefully when a loop's working set grows just past the size of the cache.
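
Here is how I picture the 2-random policy itself, as a small sketch of my own -- not Luu's code, and ignoring everything a real cache has to worry about:

    import random

    class TwoRandomCache:
        """Evict the less recently used of two randomly chosen resident keys."""
        def __init__(self, capacity):
            self.capacity = capacity      # assumes capacity >= 2
            self.last_used = {}           # key -> time of most recent access
            self.clock = 0

        def access(self, key):
            self.clock += 1
            hit = key in self.last_used
            if not hit and len(self.last_used) >= self.capacity:
                a, b = random.sample(list(self.last_used), 2)
                victim = a if self.last_used[a] < self.last_used[b] else b
                del self.last_used[victim]
            self.last_used[key] = self.clock
            return hit

Feeding it a reference stream and comparing hit rates with true LRU would be a natural first experiment.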

The power of two random choices has potential application in any context that fits the balls-and-bins model, including load balancing. Luu mentions less obvious application areas, too, such as circuit routing.

Very cool indeed. I'll have to look at Mitzenmacher et al. to see how the math works, but first I may try the idea out in some programs. For me, the programming is even more fun...


Posted by Eugene Wallingford | Permalink | Categories: Computing

November 01, 2016 4:04 PM

An Adventure with C++ Compilers

I am a regular reader of John Regehr's blog, which provides a steady diet of cool compiler conversation. One of Regehr's frequent topics is undefined behavior in programming languages, and what that means for implementing and testing compilers. A lot of those blog entries involve C and C++, which I don't use all that often any more, so reading them is more spectator sport than contact sport.

This week, I got to see how capricious C++ compilers can feel up close.

My students are implementing a compiler for a simple subset of a Pascal-like language. We call the simplest program in this language print-one:

    $ cat print-one.flr
    program main();
      begin
        return 1
      end.

One of the teams is writing their compiler in C++. The team completed its most recent version, a parser that validates its input or reports an error that renders its input invalid. They were excited that it finally worked:

    $ g++ -std=c++14 compiler.cpp -o compiler
    $ compiler print-one.flr 
    Valid flair program

They had tested their compiler on two platforms:

  • a laptop running OS X v10.11.5 and gcc v4.9.3
  • a desktop running Ubuntu v14.04 and gcc v4.8.4

I sat down at my desktop computer to exercise their compiler.

    $ g++ compiler.cpp -o compiler
    In file included from compiler.cpp:7:
    In file included from ./parser.cpp:3:
    In file included from ./ast-utilities.cpp:4:
    ./ast-utilities.hpp:7:22: warning: in-class initialization of non-static data
          member is a C++11 extension [-Wc++11-extensions]
        std::string name = "Node";
                    ^
    [...]
    24 warnings generated.

Oops, I forgot the -std=c++14 flag. Still, it compiled, and all of the warnings come from a part of the code that has no effect on program validation. So I tried the executable:

    $ compiler print-one.flr 
    ERROR at line #3 -- unexpected <invalid>  1
    Invalid flair program

Hmm. The warnings are unrelated to part of the executable that I am testing, but maybe they are creating a problem. So I recompile with the flag:

    $ g++ -std=c++14 compiler.cpp -o compiler
    error: invalid value 'c++14' in '-std=c++14'

What? I check my OS and compiler specs:

    $ sw_vers -productVersion
    10.9.5
    $ g++ --version
    Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
    Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)
    [...]

Oh, right, Apple doesn't ship gcc any more; it ships clang and links gcc to the clang executable. I know my OS is a bit old, but it still seems odd that the -std=c++14 flag isn't supported. I google for an answer (thanks, StackOverflow!) and find that I need to use -std=c++1y. Okay:

    $ g++ -std=c++1y compiler.cpp -o compiler
    $ compiler print-one.flr 
    ERROR at line #3 -- unexpected <invalid>  1
    Invalid flair program

Now the student compiler compiles but gives incorrect, or at least unintended, behavior. I'm surprised that both my clang and the students' gcc compile their compiler yet produce executables that give different answers. I know that gcc and clang aren't 100% compatible, but my students are using a relatively small part of C++. How can this be?

Maybe it has something to do with how clang processes the c++1y standard flag. So I backed up to the previous standard:

    $ g++ -std=c++0x compiler.cpp -o compiler
    $ compiler print-one.flr 
    ERROR at line #3 -- unexpected <invalid>  1
    Invalid flair program

Yes, that's c++0x, not c++1y. The student compiler still compiles and still gives incorrect or unintended behavior. Maybe it is a clang problem? I upload their code to our student server, which runs Linux and gcc:

    $ cat /etc/debian_version 
    8.1
    $ g++ --version
    [...]
    gcc version 4.7.2 (Debian 4.7.2-5)

This version of gcc doesn't support either c++14 or c++1?, so I fell back to c++0x:

    $ g++ -std=c++0x compiler.cpp -o compiler
    $ compiler print-one.flr 
    Valid flair program

Hurray! I can test their code.

I'm curious. I have a Macbook Pro running a newer version of OS X. Maybe...

    $ sw_vers
    ProductName:    Mac OS X
    ProductVersion: 10.10.5
    BuildVersion:   14F2009
    $ g++ --version
    Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/usr/include/c++/4.2.1
    Apple LLVM version 7.0.2 (clang-700.1.81)
    [...]

    $ g++ -std=c++14 compiler.cpp -o compiler
    $ compiler print-one.flr
    Valid flair program

Now, the c++14 flag works, and it produces a compiler that produces the correct behavior -- or at least the intended behavior.

I am curious about this anomaly, but not curious enough to research the differences between clang and gcc, the differences between the different versions of clang, or what Apple or Debian are doing. I'm also not curious enough to figure out which nook of C++ my students have stumbled into that could expose a rift in the behavior of these various C++ compilers, all of which are standard tools and pretty good.

At least now I remember what it's like to program in a language with undefined behavior and can read Regehr's blog posts with a knowing nod of the head.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 22, 2016 2:00 PM

Competence and Creating Conditions that Minimize Mistakes

I enjoyed this interview with Atul Gawande by Ezra Klein. When talking about making mistakes, Gawande notes that humans have enough knowledge to cut way down on errors in many disciplines, but we do not always use that knowledge effectively. Mistakes come naturally from the environments in which we work:

We're all set up for failure under the conditions of complexity.

Mistakes are often more a matter of discipline and attention to detail than a matter of knowledge or understanding. Klein captures the essence of Gawande's lesson in one of his questions:

We have this idea that competence is not making mistakes and getting everything right. [But really...] Competence is knowing you will make mistakes and setting up a context that will help reduce the possibility of error but also help deal with the aftermath of error.

In my experience, this is a hard lesson for computer science students to grok. It's okay to make mistakes, but create conditions where you make as few as possible and in which you can recognize and deal with the mistakes as quickly as possible. High-discipline practices such as test-first and pair programming, version control, and automated builds make a lot more sense when you see them from this perspective.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development, Teaching and Learning

October 17, 2016 4:10 PM

Using Programming to Learn Math, Even at the University

There is an open thread on the SIGCSE mailing list called "Forget the language wars, try the math wars". Faculty are discussing how to justify math requirements on a CS major, especially for students who "just want to be programmers". Some people argue that math requirements, calculus in particular, are a barrier to recruiting students who can succeed in computer science.

Somewhere along the line, Cay Horstmann wrote a couple of things I agree with. First, he said that he didn't want to defend the calculus requirement because most calculus courses do not teach students how to think "but how to follow recipes". I have long had this complaint about calculus, especially as it's taught in most US high schools and universities. Then he wrote something more positive:

What I would like to see is teaching math alongside with programming. None of my students were able to tell me what sine and cosine were good for, but when they had to program a dial [in a user interface], they said "oh".

Couldn't that "oh" have come earlier in their lives? Why don't students do programming in middle school math? I am not talking large programs--just a few lines, so that they can build models and intuition.

I agree wholeheartedly. And even if students do not have such experiences in their K-12 math classes, the least we could do is help them have that "oh" experience earlier in their university studies.

My colleagues and I have been discussing our Discrete Structures course now for a few weeks, including expected outcomes, its role as a prerequisite to other courses, and how we teach it. I have suggested that one of the best ways to learn discrete math is to connect it with programs. At our university, students have taken at least one semester of programming (currently, in Python) before they take Discrete. We should use that to our advantage!

A program can help make an abstract idea concrete. When learning about set operations, why do only paper-and-pencil exercises when you can use simple Python expressions in the REPL? Yes, adding programming to the mix creates new issues to deal with, but if designed well, such instruction could both improve students' understanding of discrete structures -- as Horstmann says, helping them build models and intuition -- and give students more practice writing simple programs. An ancillary benefit might be to help students see that computer scientists can use computation to help them learn new things, thus preparing for habits that can extend to wider settings.
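
Something as simple as this exchange at the Python prompt would do -- a made-up classroom snippet, not taken from any particular textbook:

    >>> A = {1, 2, 3, 4}
    >>> B = {3, 4, 5}
    >>> A | B                                # union
    {1, 2, 3, 4, 5}
    >>> A & B                                # intersection
    {3, 4}
    >>> A - B                                # set difference
    {1, 2}
    >>> {x for x in A if x % 2 == 0} <= A    # subset test on a comprehension
    True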

Unfortunately, the most popular Discrete Structures textbooks don't help much. They do try to use CS-centric examples, but they don't seem otherwise to use the fact that students are CS majors. I don't really blame them. They are writing for a market in which students study many different languages in CS 1, so they can't (and shouldn't) assume any particular programming language background. Even worse, the Discrete Structures course appears at different places throughout the CS curriculum, which means that textbooks can't assume even any particular non-language CS experience.

Returning to Horstmann's suggestion to augment math instruction with programming in K-12, there is, of course, a strong movement nationally to teach computer science in high school. My state has been disappointingly slow to get on board, but we are finally seeing action. But most of the focus in this nationwide movement is on teaching CS qua CS, with less emphasis on integrating CS into math and other core courses.

For this reason, let us again take a moment to thank the people behind the Bootstrap project for leading the charge in this regard, helping teachers use programming in Racket to teach algebra and other core topics. They are even evaluating the efficacy of the work and showing that the curriculum works. This may not surprise us in CS, but empirical evidence of success is essential if we hope to get teacher prep programs and state boards of education to take the idea seriously.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 06, 2016 2:46 PM

Computers Shouldn't Need a Restart Button (Memories of Minix)

An oldie but goodie from Andrew Tanenbaum:

Actually, MINIX 3 and my research generally is **NOT** about microkernels. It is about building highly reliable, self-healing, operating systems. I will consider the job finished when no manufacturer anywhere makes a PC with a reset button. TVs don't have reset buttons. Stereos don't have reset buttons. Cars don't have reset buttons. They are full of software but don't need them. Computers need reset buttons because their software crashes a lot. I know that computer software is different from car software, but users just want them both to work and don't want lectures why they should expect cars to work and computers not to work. I want to build an operating system whose mean time to failure is much longer than the lifetime of the computer so the average user never experiences a crash.

I remember loving MINIX 1 (it was just called Minix then, of course) when I first learned it in grad school. I did not have any Unix experience coming out of my undergrad and had only begun to feel comfortable with BSD Unix in my first few graduate courses. Then I was assigned to teach the Operating Systems course, working with one of the CS faculty. He taught me a lot, but so did Tanenbaum -- through Minix. That is one of the first times I came to really understand that the systems we use (the OS, the compiler, the DBMS) were just programs that I could tinker with, modify, and even write.

Operating systems is not my area, and I have no expertise for evaluating the whole microkernel versus monolith debate. But I applaud researchers like Tanenbaum who are trying to create general computer systems that don't need to be rebooted. I'm a user, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

September 28, 2016 3:16 PM

Language, and What It's Like To Be A Bat

My recent post about the two languages resonated in my mind with an article I finished reading the day I wrote the post: Two Heads, about the philosophers Paul and Pat Churchland. The Churchlands have been on a forty-year quest to change the language we use to describe our minds, from popular terms based in intuition and introspection to terms based in the language of neuroscience. Changing language is hard under any circumstances, and it is made harder when the science they need is still in its infancy. Besides, maybe more traditional philosophers are right and we need our traditional vocabulary to make sense of what it feels like to be human?

The New Yorker article closes with these paragraphs, which sound as if they are part of a proposal for a science fiction novel:

Sometimes Paul likes to imagine a world in which language has disappeared altogether. We know that the two hemispheres of the brain can function separately but communicate silently through the corpus callosum, he reasons. Presumably, it will be possible, someday, for two separate brains to be linked artificially in a similar way and to exchange thoughts infinitely faster and more clearly than they can now through the muddled, custom-clotted, serially processed medium of speech. He already talks about himself and Pat as two hemispheres of the same brain. Who knows, he thinks, maybe in his children's lifetime this sort of talk will not be just a metaphor.

If, someday, two brains could be joined, what would be the result? A two-selved mutant like Joe-Jim, really just a drastic version of Siamese twins, or something subtler, like one brain only more so, the pathways from one set of neurons to another fusing over time into complex and unprecedented arrangements? Would it work only with similar brains, already sympathetic, or, at least, both human? Or might a human someday be joined to an animal, blending together two forms of thinking as well as two heads? If so, a philosopher might after all come to know what it is like to be a bat, although, since bats can't speak, perhaps he would be able only to sense its batness without being able to describe it.

(Joe-Jim is a character from a science fiction novel, Robert Heinlein's Orphans of the Sky.)

What a fascinating bit of speculation! Can anyone really wonder why kids are drawn to science fiction?

Let me add my own speculation to the mix: If we do ever find a way to figure out what it's like to be a bat, people will find a way to describe what it's like to be a bat. They will create the language they need. Making language is what we do.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Patterns

September 06, 2016 2:44 PM

"Inception" and the Simulation Argument

If Carroll's deconstruction of the simulation argument is right, then the more trouble we have explaining consciousness, the more that should push us to believe we're in a ground-level simulation. There's probably a higher-level version of physics in which consciousness makes sense. Our own consciousness is probably being run in a world that operates on that higher-level law. And we're stuck in a low-resolution world whose physics doesn't allow consciousness -- because if we weren't, we'd just keep recursing further until we were.

-- Scott Alexander, The View From Ground Level

two characters from the film 'Inception' walking in a dream world where space folds back on itself

In the latest installment of "You Haven't Seen That Yet?", I watched the film Inception yesterday. There was only one person watching, but still the film gets two hearty thumbs-up. All those Ellen Pages waking up, one after the other...

Over the last few years, I've heard many references to the idea from physics that we are living in a simulation, that our universe is a simulation created by beings in another universe. It seems that some physicists think and talk about this a lot, which seems odd to me. Empiricism can't help us much to unravel the problem; arguments pro and con come down to the sort of logical arguments favored by mathematicians and philosophers, abstracted away from observation of the physical world. It's a fun little puzzle, though. The computer science questions are pretty interesting, too.

Ideas like this are old hat to those of us who read a lot of science fiction growing up, in particular Philip K. Dick. Dick's stories were often predicated on suspending some fundamental attribute of reality, or our perception of it, and seeing what happened to our lives and experiences. Now that I have seen Memento (a long-time favorite of mine) and Inception, I'm pretty happy. What Philip K. Dick was with the written word to kids of my generation, Christopher Nolan is on film to a younger generation. I'm glad I've been able to experience both.

~~~~

The photo above comes from Matt Goldberg's review of Inception. It shows Arthur, the character played by Joseph Gordon-Levitt, battling with a "projection" in three-dimensional space that folds back on itself. Such folding is possible in dream worlds and is an important element in designing dreams that enable all the cool mind games that are central to the film.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

August 22, 2016 4:18 PM

A New Way to Debug Our Election Systems

In The Security of Our Election Systems, Bruce Schneier says that we no longer have time to sound alarm about security flaws in our election systems and hope that government and manufacturers will take action. Instead...

We must ignore the machine manufacturers' spurious claims of security, create tiger teams to test the machines' and systems' resistance to attack, drastically increase their cyber-defenses and take them offline if we can't guarantee their security online.

How about this:

The students in my department love to compete in cyberdefense competitions (CDCs), in which they are charged with setting up various systems and then defending them against attack from experts for some period, say, twenty-four hours. Such competitions are growing in popularity across the country.

Maybe we should run a CDC with the tables turned. Manufacturers would be required to set up their systems and to run the full set of services they promise when they sell the systems to government agencies. Students across the US would then be given a window of twenty-four hours or more to try to crack the systems, with the manufacturers or even our election agencies trying to keep their systems up and running securely. Any vulnerabilities that the students find would be made public, enabling the manufacturers to fix them and the state agencies to create and set up new controls.

Great idea or crazy fantasy?


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

August 07, 2016 10:36 AM

Some Advice for How To Think, and Some Personal Memories

I've been reading a bunch of the essays on David Chapman's Meaningness website lately, after seeing a link to one on Twitter. (Thanks, @kaledic.) This morning I read How To Think Real Good, about one of Chapman's abandoned projects: a book of advice for how to think and solve problems. He may never write this book as he once imagined it, but I'm glad he wrote this essay about the idea.

First of all, it was a fun read, at least for me. Chapman is a former AI researcher, and some of the stories he tells remind me of things I experienced when I was in AI. We were even in school at about the same time, though in different parts of the country and different kinds of program. His work was much more important than mine, but I think at some fundamental level most people in AI share common dreams and goals. It was fun to listen as Chapman reminisced about knowledge and AI.

He also introduced me to the dandy portmanteau anvilicious. I keep learning new words! There are so many good ones, and people make up the ones that don't exist already.

My enjoyment was heightened by the fact that the essay stimulated the parts of my brain that like to think about thinking. Chapman includes a few of the heuristics that he intended to include in his book, along with anecdotes that illustrate or motivate them. Here are three:

All problem formulations are "false", because they abstract away details of reality.

Solve a simplified version of the problem first. If you can't do even that, you're in trouble.

Probability theory is sometimes an excellent way of dealing with uncertainty, but it's not the only way, and sometimes it's a terrible way.

He elaborates on the last of these, pointing out that probability theory tends to collapse many different kinds of uncertainty into a single value. This does not work all that well in practice, because different kinds of uncertainty often need to be handled in very different ways.

Chapman has a lot to say about probability. This essay was prompted by what he sees as an over-reliance of the rationalist community on a pop version of Bayesianism as its foundation for reasoning. But as an old AI researcher, he knows that an idea can sound good and fail in practice for all sorts of reasons. He has also seen how a computer program can make clear exactly what does and doesn't work.

Artificial intelligence has always played a useful role as a reality check on ideas about mind, knowledge, reasoning, and thought. More generally, anyone who writes computer programs knows this, too. You can make ambiguous claims with English sentences, but to write a program you really have to have a precise idea. When you don't have a precise idea, your program itself is a precise formulation of something. Figuring out what that is can be a way of figuring out what you were really thinking about in the first place.

This is one of the most important lessons college students learn from their intro CS courses. It's an experience that can benefit all students, not just CS majors.

Chapman also includes a few heuristics for approaching the problem of thinking, basically ways to put yourself in a position to become a better thinker. Two of my favorites are:

Try to figure out how people smarter than you think.

Find a teacher who is willing to go meta and explain how a field works, instead of lecturing you on its subject matter.

This really is good advice. Subject matter is much easier to come by than a deep understanding of how a discipline works, especially in these days of the web.

The word meta appears frequently throughout this essay. (I love that the essay is posted on the metablog/ portion of his site!) Chapman's project is thinking about thinking, a step up the ladder of abstraction from "simply" thinking. An AI program must reason; an AI researcher must reason about how to reason.

This is the great siren of artificial intelligence, the source of its power and also its weaknesses: Anything you can do, I can do meta.

I think this gets at why I enjoyed this essay so much. AI is ultimately the discipline of applied epistemology, and most of us who are lured into AI's arms share an interest in what it means to speak of knowledge. If we really understand knowledge, then we ought to be able to write a computer program that implements that understanding. And if we do, how can we say that our computer program isn't doing essentially the same thing that makes us humans intelligent?

As much as I love computer science and programming, my favorite course in graduate school was an epistemology course I took with Prof. Rich Hall. It drove straight to the core curiosity that impelled me to study AI in the first place. In the first week of the course, Prof. Hall laid out the notion of justified true belief, and from there I was hooked.

A lot of AI starts with a naive feeling of this sort, whether explicitly stated or not. Doing AI research brings that feeling into contact with reality. Then things get serious. It's all enormously stimulating.

Ultimately Chapman left the field, disillusioned by what he saw as a fundamental limitation that AI's bag of tricks could never resolve. Even so, the questions that led him to AI still motivate him and his current work, which is good for all of us, I think.

This essay brought back a lot of pleasant memories for me. Even though I, too, am no longer in AI, the questions that led me to the field still motivate me and my interests in program design, programming languages, software development, and CS education. It is hard to escape the questions of what it means to think and how we can do it better. These remain central problems of what it means to be human.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

August 03, 2016 1:56 PM

Programming: Don't Knock It Till You Try It

We have a fair number of students on campus outside of CS who want to become web designers, but few of them think they should learn to program. Some give it a try when one of our communications profs tells them how exciting and liberating it can be. In general, though, it's a hard sell. Programming sounds boring to them, full of low-level details better left to techies over in computer science.

This issue pervades the web design community. In The Bomb in the Garden, Matthew Butterick does a great job of explaining why the web as a design medium is worth saving, and pointing to ways in which programming can release the creativity we need to keep it alive.

Which brings me to my next topic--what should designers know about programming?

And I know that some of you will think this is beating a dead horse. But when we talk about restoring creativity to the web, and expanding possibilities, we can't avoid the fact that just like the web is a typographic medium, it's also a programmable medium.

And I'm a designer who actually does a lot of programming in my work. So I read the other 322,000 comments about this on the web. I still think there's a simple and non-dogmatic answer, which is this:

You don't have to learn programming, but don't knock it till you try it.

It's fun for me when one of the web design students majoring in another department takes his or her first programming class and is sparked by the possibilities that writing a program opens up. And we in CS are happy to help them go deeper into the magic.

Butterick speaks truth when he says he's a designer who does a lot of programming in his work. Check out Pollen, the publishing system he created to write web-based books. Pollen's documentation says that it "helps authors make functional and beautiful digital books". That's true. It's a very nice package.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 28, 2016 2:37 PM

Functional Programming, Inlined Code, and a Programming Challenge

an example of the cover art for the Commander Keen series of games

I recently came across an old entry on Jonathan Blow's blog called John Carmack on Inlined Code. The bulk of the entry consists of an even older email message that Carmack, lead programmer on video games such as Doom and Quake, sent to a mailing list, encouraging developers to consider inlining function calls as a matter of style. This email message is the earliest explanation I've seen of Carmack's drift toward functional programming, seeking as many of its benefits as possible even in the harshly real-time environment of game programming.

The article is a great read, with much advice born in the trenches of writing and testing large programs whose run-time performance is key to their success. Some of the ideas involve the programming language:

It would be kind of nice if C had a "functional" keyword to enforce no global references.

... while others are more about design style:

The function that is least likely to cause a problem is one that doesn't exist, which is the benefit of inlining it.

... and still others remind us to rely on good tools to help avoid inevitable human error:

I now strongly encourage explicit loops for everything, and hope the compiler unrolls it properly.

(This one may come in handy as I prepare to teach my compiler course again this fall.)

This message-within-a-blog-entry itself quotes another email message, by Henry Spencer, which contains the seeds of a programming challenge. Spencer described a piece of flight software written in a particularly limiting style:

It disallowed both subroutine calls and backward branches, except for the one at the bottom of the main loop. Control flow went forward only. Sometimes one piece of code had to leave a note for a later piece telling it what to do, but this worked out well for testing: all data was allocated statically, and monitoring those variables gave a clear picture of most everything the software was doing.

Wow: one big loop, within which all control flows forward. To me, this sounds like a devilish challenge to take on when writing even a moderately complex program like a scanner or parser, which generally contain many loops within loops. In this regard, it reminds me of the Polymorphism Challenge's prohibition of if-statements and other forms of selection in code. The goal of that challenge was to help programmers really grok how the use of substitutable objects can lead to an entirely different style of program than we tend to create with traditional procedural programming.
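
Just to see what the style feels like, here is a tiny sketch in Python of "one big loop, control flow forward only", with statically allocated state standing in for the notes one stage leaves for another. It is my own invention, nothing like real flight software:

    state = {"tick": 0, "reading": 0, "smoothed": 0.0, "alarm": False}

    def read_sensor(t):
        return 100 + (t % 7) * 5          # a stand-in for real input

    while state["tick"] < 20:             # the only backward branch
        # stage 1: acquire input
        state["reading"] = read_sensor(state["tick"])

        # stage 2: smooth it and leave a note for stage 3
        state["smoothed"] = 0.9 * state["smoothed"] + 0.1 * state["reading"]
        state["alarm"] = state["reading"] > 125

        # stage 3: act on the note; within the loop body, control only flows forward
        if state["alarm"]:
            print("tick", state["tick"], "alarm at", state["reading"])

        state["tick"] += 1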

Even though Carmack knew that "a great deal of stuff that goes on in the aerospace industry should not be emulated by anyone, and is often self destructive", he thought that this idea might have practical value, so he tried it out. The experience helped him evolve his programming style in a promising direction. This is a great example of the power of the pedagogical pattern known as Three Bears: take an idea to its extreme in order to learn the boundaries of its utility. Sometimes, you will find that those boundaries lie beyond what you originally thought.

Carmack's whole article is worth a read. Thanks to Jonathan Blow for preserving it for us.

~~~~

The image above is an example of the cover art for the "Commander Keen" series of video games, courtesy of Wikipedia. John Carmack was also the lead programmer for this series. What a remarkable oeuvre he has produced.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

July 20, 2016 9:16 AM

A Few Quick Lessons from Five Small Joy Programs

I recently wrote about some experiences programming in Joy, in which I use Joy to solve five problems that make up a typical homework assignment early in my Programming Languages course. These problems introduce my students to writing simple functions in a functional style, using Racket. Here is my code, if you care to check it out. I'm just getting back to stack programming, so this code can surely be improved. Feel free to email me suggestions or tweet me at @wallingf!

What did these problems teach me about Joy?

  • The language's documentation is sparse. Like my students, I had to find out which Joy primitives were available to me. It has a lot of the basic arithmetic operators you'd expect, but finding them meant searching through a plain-text file. I should write Racket-caliber documentation for the language to support my own work.

  • The number and order of the arguments to a function matters a lot. A function that takes several arguments can become more complicated than the corresponding Racket function, especially if you need to use them multiple times. I encountered this on my first day back to the language. In Racket, this problem requires a compound expression, but it is relatively straightforward, because arguments have names. With all its arguments on the stack, a Joy function has to do more work simply to access values, let alone replicate them for multiple uses.

  • A slight difference in task can lead to a large change in the code. For Problem 4, I implemented operators for modular addition, subtraction, and multiplication. +mod and *mod were elegant and straightforward. -mod was a different story. Joy has a rem operator that operates like Racket's remainder, but it has no equivalent to modulo. The fact that rem returns negative values means that I need a boolean expression and quoted programs and a new kind of thinking. This militates for a bigger shift in programming style right away. (See the sketch after this list for the difference between the two operations.)

  • I miss the freedom of Racket naming style. This isn't a knock on Joy, because most every programming language restricts severely the set of characters you can use in identifiers. But after being able to name functions +mod, in-range?, and int->char in Racket, the restrictions feel even more onerous.

  • As in most programming styles, the right answer in Joy is often to write the correct helpers. The difference in level of detail between +mod and *mod on the one hand and -mod on the other indicates that I am missing a solution. A better approach is to implement a modulo operator and use it to write all three mod operators. This will hide lower-level details in a general-purpose operator. modulo would make a nice addition to a library of math operators.

  • Even simple problems can excite me about the next step. Several of these solutions, especially the mod operators, cry out for higher-order operators. In Racket, we can factor out the duplication in these operators and create a function that generates these functions for us. In Joy, we can do it, too, using quoted programs of the sort you see in the code for -mod. I'll be moving on to quoted programs in more detail soon, and I can't wait... I know that they will push me farther along the transition to the unique style of stack programming.
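
Here, not in Joy but in Python, is the crux of the -mod trouble and the shape of the higher-order fix I have in mind -- a sketch with invented names such as make_mod_op:

    import math
    import operator

    # remainder vs. modulo: the two agree until an operand goes negative
    print(math.fmod(-7, 5))   # -2.0  (remainder keeps the dividend's sign)
    print(-7 % 5)             #  3    (modulo always lands in [0, 5))

    def make_mod_op(op, m):
        """Generate a modular version of a two-argument operator."""
        return lambda a, b: op(a, b) % m

    add_mod5 = make_mod_op(operator.add, 5)
    sub_mod5 = make_mod_op(operator.sub, 5)
    mul_mod5 = make_mod_op(operator.mul, 5)

    print(add_mod5(3, 4), sub_mod5(1, 3), mul_mod5(3, 4))   # 2 3 2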

It's neat for me to be reminded that even the simplest little functions raise interesting design questions. In Joy, use of a stack for all data values means that identifying the most natural order for the arguments we make available to an operator can have a big effect on the ability to read and write code. In what order will arguments generally appear "in the wild"?

In the course of experimenting and trying to debug my code (or, even more frustrating, trying to understand why the code I wrote worked), I even wrote my first general utility operator:

    DEFINE clear  == [] unstack.

It clears the stack so that I can see exactly what the code I'm about to run creates and consumes. It's the first entry in my own little user library, named utilities.joy.

Fun, fun, fun. Have I ever said that I like to write programs?


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

July 19, 2016 10:32 AM

What Is The Function Of School Today?

Painter Adolph Gottlieb was dismissive of art school in the 1950s:

I'd have done what I'm doing now twenty years ago if I hadn't had to go through that crap. What is the function of the art school today? To confuse the student. To make a living for the teacher. The painter can learn from museums -- probably it is the only way he can learn. All artists have to solve their problems in the context of their own civilization, painting what their time permits them to paint, extending the boundaries a little further.

It isn't much of a stretch to apply this to computer programming in today's world. We can learn so much these days from programs freely available on GitHub and elsewhere on the web. A good teacher can help, but in general is there a better way to learn how to make things than to study existing works and to make our own?

Most importantly, today's programmers-to-be have to solve their problems in the context of their own civilization: today's computing, whether that's mobile or web or Minecraft. University courses have a hard time keeping up with constant change in the programming milieu. Instead, they often rely on general principles that apply across most environments but which seem lifeless in their abstraction.

I hope that, as a teacher, I add some value for the living I receive. Students with interests and questions and goals help keep me and my courses alive. At least I can set a lower bound of not confusing my students any more than necessary.

~~~~

(The passage above is quoted from Conversations with Artists, Selden Rodman's delightful excursion through the art world of the 1950s, chatting with many of the great artists of the era. It's not an academic treatise, but rather more an educated friend chatting with creative friends. I would thank the person who recommended this, but I have forgotten whose tweet or article did.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 18, 2016 12:47 PM

Getting a Sense of Joy Style Through the Stack

Last time, I wrote about returning to the programming language Joy, without giving too many details of the language or its style. Today, I'll talk a bit more about what it means to say that Joy is a "stack language" and show how I sometimes think about the stack as I am writing a simple program. The evolution of the stack is helping me to think about how data types work in this language and to get a sense of why languages such as Joy are called concatenative.

Racket and other Lisp-like languages use prefix notation, unlike the more common infix notation we use in C, Java, and most mainstream languages. A third alternative is postfix notation, in which operators follow their operands. For example, this postfix expression is a valid Joy program:

    2 3 +

computes 2 + 3.

Postfix expressions are programs in a stack-based language. They correspond to a postfix traversal of a program tree. Where is the stack? It is used to interpret the program:

  • 2 is pushed onto the stack.
  • 3 is pushed onto the stack.
  • + is an operator. An operator pops its arguments from the stack, computes a result, and pushes the result onto the stack. + is a two-argument function, so it pops two arguments from the stack, computes their sum, and pushes the sum, 5, onto the stack.

Longer programs work the same way. Consider 2 3 + 5 *:

  • 2 is pushed onto the stack.
  • 3 is pushed onto the stack.
  • + pops the 2 and the 3, computes a 5, and pushes it onto the stack.
  • 5 is pushed onto the stack.
  • * pops the two 5s, computes a 25, and pushes it onto the stack.

This program is equivalent to 2 + 3 - 5, er, make that (2 + 3) * 5. As long as we know the arity of each procedure, postfix notation requires no rules for the precedence of operators.
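
Joy's interpreter is far richer than this, but a toy evaluator makes the mechanics concrete. Here is a minimal Python sketch -- not Joy, just the idea -- that reads a postfix program of numbers and four arithmetic operators and runs it against a stack:

    # A toy postfix evaluator: numbers push themselves; an operator pops
    # its two arguments, computes, and pushes the result.
    def evaluate(program):
        ops = {'+': lambda a, b: a + b,
               '-': lambda a, b: a - b,
               '*': lambda a, b: a * b,
               '/': lambda a, b: a / b}
        stack = []
        for token in program.split():
            if token in ops:
                b = stack.pop()          # the top of the stack is the second operand
                a = stack.pop()
                stack.append(ops[token](a, b))
            else:
                stack.append(float(token))
        return stack

    print(evaluate("2 3 +"))        # [5.0]
    print(evaluate("2 3 + 5 *"))    # [25.0]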

The result of the program can be the entire stack or the top value on the stack. At the end of a program, the stack could finish with more than one value on it:

    2 3 + 5 * 6 3 /

leaves a stack consisting of 25 2.

Adding an operator to the end of that program can return us to a single-value stack:

    2 3 + 5 * 6 3 / -

leaves a stack of one value, 23.

This points out an interesting feature of this style of programming: we can compose programs simply by concatenating their source code. 2 * is a program: "double my argument", where all arguments are read from the stack. If we place it at the end of 2 3 + 5 * 6 3 / -, we get a new program, one that computes the value 46.

This gives us our first hint as to why this style is called concatenative programming.

As I am re-learning Joy's style, I think a lot about the stack, which holds the partial results of the computation I am trying to program. I often find myself simulating the state of the stack to help me keep track of what to do next, whether on paper or in a comment on my code.

As an example, think back to the wind chill problem I discussed last time: given the air temperature T and the wind speed V, compute the wind chill index T'.

    T' = 35.74 + 0.6215·T - 35.75·V^0.16 + 0.4275·T·V^0.16

This program expects to find T and V on top of the stack:

    T V

The program 0.16 pow computes x^0.16 for the 'x' on top of the stack, so it leaves the stack in this state:

    T V'

In Joy documentation, the type of a program is often described with a before/after picture of the stack. The dup2 operator we saw last time is described as:

    dup2 :: a b -> a b a b

because it assumes two arbitrary values on top of the stack and leaves the stack with those values duplicated. In this fashion, we could describe the program 0.16 pow using

    0.16 pow :: N -> N

It expects a number N to be on top of the stack and leaves the stack with another number on top.

When I'm using these expressions to help me follow the state of my program, I sometimes use problem-specific names or simple modifiers to indicate changes in value or type. For the wind-chill program, I thought of the type of 0.16 pow as

    T V -> T V'
, because the values on the stack were a temperature and a velocity-based number.

Applied to this stack, dup2 converts the stack as

    T V' -> T V' T V'
, because dup2 simply duplicates the top two items, whatever they are.

If we concatenate these two programs and evaluate them against the original stack, we get:

    [ initial stack]         T V
    0.16 pow              -> T V'
    dup2                  -> T V' T V'

I've been preserving these stack traces in my documentation, for me and any students who might end up reading my code. Here is the definition of wind-chill, taken directly from the source file:

    DEFINE wind-chill == 0.16 pow        (* -> T V'        *)
                         dup2            (* -> T V' T V'   *)
                         * 0.4275 *      (* -> T V' t3     *)
                         rollup          (* -> t3 T V'     *)
                         35.75 *         (* -> t3 T t2     *)
                         swap            (* -> t3 t2 T     *)
                         0.6215 *        (* -> t3 t2 t1    *)
                         35.74           (* -> t3 t2 t1 t0 *)
                         + swap - +.     (* -> T'          *)

After the dup2 step, the code alternates between computing a term of the formula and rearranging the stack for the next computation.

Notice the role concatenation plays. I can solve each substep in almost any order, paste the resulting little programs together, and boom! I have the final solution.
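
A direct transcription of the formula into Python makes a handy cross-check on the stack version. This throwaway sketch is mine, not part of the Joy code:

    def wind_chill(t, v):
        """Wind chill index for air temperature t (F) and wind speed v (mph)."""
        vp = v ** 0.16
        return 35.74 + 0.6215 * t - 35.75 * vp + 0.4275 * t * vp

    print(wind_chill(20, 10))   # about 8.85 degrees F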

I don't know if this kind of documentation helps anyone but me, or if I will think it is as useful after I feel more comfortable in the style. But for now, I find it quite helpful. I do wonder whether thinking in terms of stack transformation may be helpful as I prepare to look into what type theory means for a language like Joy. We'll see.

I am under no pretense that this is a great Joy program, even in the context of a relative novice figuring things out. I'm simply learning, and having fun doing it.


Posted by Eugene Wallingford | Permalink | Categories: Computing

July 16, 2016 2:52 PM

A First Day Back Programming in Joy

I learned about the programming language Joy in the early 2000s, when I was on sabbatical familiarizing myself with functional programming. It drew me quickly into its web. Joy is a different sort of functional language, one in which function composition replaces function application as the primary focus. At that time, I wrote a bunch of small Joy programs and implemented a simple interpreter in PLT Scheme. After my sabbatical, though, I got pulled in a lot of different directions and lost touch with Joy. I saw it only every two or three semesters when I included it in my Programming Languages course. (The future of programming might not look like you expect it to....)

This spring, I felt Joy's call again and decided to make time to dive back into the language. Looking back over my notes from fifteen years ago, I'm surprised at some of the neat thoughts I had back then and at some of the code I wrote. Unfortunately, I have forgotten most of what I learned, especially about higher-order programming in Joy. I never reached the level of a Joy master anyway, but I feel like I'm starting from scratch. That's okay.

On Thursday, I sat down to write solutions in Joy for a few early homework problems from my Programming Languages course. These problems are intended to help my students learn the basics of functional programming and Racket. I figured they could help me do the same in Joy before I dove deeper, while also reminding me of the ways that programming in Joy diverges stylistically from more traditional functional style. As a bonus, I'd have a few more examples to show my students in class next time around.

It didn't take me long to start having fun. I'll talk more in upcoming posts about Joy, this style of programming, and -- I hope -- some of my research. For now, though, I'd like to tell you about one experience I had on my first day without getting into too many details.

In one homework problem, we approximate the wind chill index using this formula:

    T' = 35.74 + 0.6215·T - 35.75·V^0.16 + 0.4275·T·V^0.16
where T' is the wind chill index in degrees Fahrenheit, T is the air temperature in degrees Fahrenheit, and V is the wind speed in miles/hour. In Racket, this computation gives students a chance to write a compound expression and, if adventurous, to create a local variable to hold V^0.16.

In Joy, we don't pass arguments to functions as in most other languages. Its operators pop their arguments from a common data stack and push their results back on to the stack. Many of Joy's operators manipulate the data stack: creating, rearranging, and deleting various items. For example, the dup operator makes a copy of the item on top of the stack, the swap operator swaps the top two items on the stack, and the rolldown operator moves the top two items on the stack below the third.

A solution to the wind-chill problem will expect to find T and V on top of the stack:

    T V
After computing V' = V^0.16, the stack looks like this:
    T V'

The formula uses these values twice, so I really need two copies of each:

    T V' T V'

With a little work, I found that this sequence of operations does the trick:

    swap dup rolldown dup rolldown swap

From there, it didn't take long to find a sequence of operators that consumed these four values and left T' on the stack.

As I looked back over my solution, I noticed the duplication of dup rolldown in the longer expression shown above and thought about factoring it out. Giving that sub-phrase a name is hard, though, because it isn't all that meaningful on its own. However, the whole phrase is meaningful, and probably useful in a lot of other contexts: it duplicates the top two items on the stack. So I factored the whole phrase out and named it dup2:

    DEFINE dup2 == swap dup rolldown dup rolldown swap.
My first refactoring in Joy!
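
A few throwaway Python stand-ins for the stack operators make it easy to check the sequence. This is only a sketch of the stack effects, not how Joy works; the top of the stack is at the right end of the list:

    # Tiny stand-ins for Joy's stack shufflers, with the stack top at the end.
    def dup(s):      return s + [s[-1]]
    def swap(s):     return s[:-2] + [s[-1], s[-2]]
    def rolldown(s): return s[:-3] + [s[-2], s[-1], s[-3]]

    stack = ['T', "V'"]
    for op in (swap, dup, rolldown, dup, rolldown, swap):
        stack = op(stack)
        print(op.__name__, stack)
    # ends with ['T', "V'", 'T', "V'"] -- the dup2 effect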

As soon as my fingers typed "dup2", though, my mind recognized it. Surely I had seen it before... So I went looking for "dup2" in Joy's documentation. It is not a primitive operator, but it is defined in the language's initial library, inilib.joy, a prelude that defines syntactic sugar in Joy itself:

    dup2 ==  dupd dup swapd

This definition uses two other stack operators, dupd and swapd, which are themselves defined using the higher-order operator dip. I'll be getting to higher-order operators soon enough, I thought; for now I was happy with my longer but conceptually simpler solution.

I was not surprised to find dup2 already defined in Joy. The standard library defines a lot of cool stack operators, the sorts of operations every programmer needs to build more interesting programs in the language. But I was a little disappointed that my creation wasn't new, in the way that only a beginner can be disappointed when he learns that his discovery isn't new. My disappointment was more than offset by the thought that I had recognized an operator that the language's designer thought would be useful. I was already starting to feel like a Joy programmer again.

It was a fun day, and a much-needed respite from administrative work. I intend for it to be only the first day of many more days programming with Joy.


Posted by Eugene Wallingford | Permalink | Categories: Computing

July 07, 2016 2:01 PM

Oberon: GoogleMaps as Desktop UI

Oberon is a software stack created by Niklaus Wirth and his lab at ETH Zürich. Lukas Mathis describes some of what makes Oberon unusual, including the fact that its desktop is "an infinitely large two-dimensional space on which windows ... can be arranged":

It's incredibly easy to move around on this plane and zoom in and out of it. Instead of stacking windows, hiding them behind each other (which is possible in modern versions of Oberon), you simply arrange them next to each other and zoom out and in again to switch between them. When people held presentations using Oberon, they would arrange all slides next to each other, zoom in on the first one, and then simply slide the view one screen size to the right to go to the next slide.

This sounds like interacting with Google Maps, or any other modern map app. I wonder if anyone else is using this as a model for user interaction on the desktop?

Check out Mathis's article for more. The section "Everything is a Command Line" reminds me of workspaces in Smalltalk. I used to have several workspaces open, with useful snippets of code just waiting for me to do it. Each workspace window was like a custom text-based menu.

I've always liked the idea of Oberon and even considered using the programming language in my compilers course. (I ended up using a variant.) A version of Compiler Construction is now available free online, so everyone can see how Wirth's clear thinking led to a sparse, complete, elegant whole. I may have to build the latest installment of Oberon and see what all they have working these days.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

July 06, 2016 12:43 PM

Computational Research Lets Us Ask New Questions, Political Science Edition

The latest example comes from FiveThirtyEight.com, an organization that is built on the idea of data-driven approaches to old problems:

Almost all data about politics that you encounter comes from polls and surveys of individuals or else from analysis of geographic units such as precincts, counties and states. Individual data and geographic data do not capture the essential networks in which we all live -- households and friendships and communities. But other and newer kinds of data -- such as voter files that connect individuals to their households or network data that capture online connections -- revolutionize how we understand politics. By the end of this election cycle, expect to see many more discoveries about the social groupings that define our lives.

Computational research enables us to ask entirely new questions -- both ones that were askable before but not feasible to answer and ones that would not have been conceived before. Even if the question begins as whimsically as "How often do Democrats and Republicans marry one another?"

Back in December 2007, I commented on this in the context of communications studies and public relations. One of our CS master's students, Sergey Golitsynskiy, had just defended an M.A. thesis in communications studies that investigated the blogosphere's impact on a particular dust-up in the public relations world. Such work has the potential to help refine the idea of "the general public" within public relations, and even the nature of its "publics". (Sergey is now a tenure-track professor here in communications studies.)

Data that encodes online connections enables us to explore network effects that we can't even see with simpler data. As the 538 piece says, this will revolutionize how we understand politics, along with so many other disciplines.


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 27, 2016 2:12 PM

Looking Back to Chris Ford's StrangeLoop 2015 Talk

StrangeLoop 2015 is long in the books for most people, but I occasionally still think about some of the things I learned there. Chris Ford's recent blog post reminded me that I had a draft about his talk waiting to be completed and posted. Had I published this post closer to the conference, I would have called it Kolmogorov Music: Compression and Understanding.

(This is the second follow-up post about a StrangeLoop 2015 talk that is still on my mind. The previous follow-up was about Peter Alvaro's talk on languages and distributed systems.)



the opening screen for the Kolmogorov Music talk


Chris Ford stepped to the podium in front of an emacs buffer. "Imagine a string of g's," he said, "infinitely long, going in both directions." This is an infinite string, he pointed out, with an 11-word description. That's the basic idea of Kolmogorov complexity, and the starting point for his talk.

I first read about Kolmogorov complexity in a couple of papers by Gregory Chaitin that I found on the early web back in the 1990s. It fascinated me then, and I went into "Kolmogorov Music", Ford's talk, with high hopes. It more than delivered. The talk was informative, technically clear, and entertaining.

Ford uses Clojure for this work in order to write macros. They allow him to talk about code at two levels: source and expansion. The source macro is his description of some piece of music, and the expansion is the music itself, "executable" by an interpreter.

He opened by demo'ing some cool music, including a couple of his own creations. Then he began his discussion of how complex a piece of music is. His measure of complexity is the ratio of the length of the evaluated data (the music) to the length of the macro (the program that generates it). This means that complexity is relative, in part, to the language of expression. If we used a language other than Clojure, the ratios would be different.

Once we settle on a programming language, we can compare the relative complexity of two pieces of music. This also gives rise to cool ideas such as conditional complexity, based on the distance between the programs that encode two pieces of music.

Compression algorithms do something quite similar: exploit our understanding of data to express it in fewer bytes. Ford said that he based his exploration on the paper Analysis by Compression by David Meredith, a "musicologist with a computational bent". Meredith thinks of listening to music as a model-building process that can be described using algorithms.
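
You can get a crude feel for the connection by letting an off-the-shelf compressor stand in for "shortest description". This little Python sketch is only an illustration, not Ford's code:

    import random
    import zlib

    random.seed(0)
    regular   = "g" * 10000        # a long string of g's, as in the talk's opening image
    irregular = "".join(random.choice("abcdefg") for _ in range(10000))

    for name, s in [("regular", regular), ("irregular", irregular)]:
        print(name, len(s), "->", len(zlib.compress(s.encode())))
    # the regular string shrinks to a few dozen bytes; the random one compresses far less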

Programs have more expressive power than traditional music notation. Ford gave as an example clapping music that falls farther and farther behind itself as the accompaniment continues. It's much easier to write this pattern using a programming language with repetition and offsets than using musical notation.

Everything has been cool so far. Ford pushed on to more coolness.

A minimalist idea can be described briefly. As Borges reminds us in The Library of Babel, a simple thing can contain things that are more complex than itself. Ford applied this idea to music. He recalled Carl Sagan's novel Contact, in which the constant pi was found to contain a hidden message. Inspired by Sagan, Ford looked to the Champernowne constant, a number created by concatenating all integers in succession -- 0.12345678910111213141516..., and turned it into music. Then he searched it for patterns.
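
Generating the raw digits is the easy part; here is a one-function Python sketch. The interesting part, turning those digits into music, is Ford's and is not sketched here:

    def champernowne_prefix(n):
        """Concatenate the integers 1..n after "0.", a prefix of the Champernowne constant."""
        return "0." + "".join(str(i) for i in range(1, n + 1))

    print(champernowne_prefix(15))   # 0.123456789101112131415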

Ford found something that sounded an awful lot like "Blurred Lines", a pop hit by Robin Thicke in 2013, and played it for us. He cheekily noted that his Champernowne song infringes the copyright on Thicke's song, which is quite humorous given the controversial resemblance of Thicke's song to "Got to Give It Up", a Marvin Gaye tune from 1977. Of course, Ford's song is infinitely long, so it likely infringes the copyright of every song ever written! The good news for him is that it also subsumes every song to be written in the future, offering him the prospect of a steady income as an IP troll.

Even more than usual, my summary of Ford's talk cannot possibly do it justice, because he shows code and plays music! Let me echo what was a common refrain on Twitter immediately after his talk at StrangeLoop: Go watch this video. Seriously. You'll get to see him give a talk using only emacs and a pair of speakers, and hear all of the music, too. Then check out Ford's raw material. All of his references, music, and code are available on his Github site.

After that, check out his latest blog entry. More coolness.


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 13, 2016 2:47 PM

Patterns Are Obvious When You Know To Look For Them...

normalized counts of consecutive 8-digit primes (mod 31)

In Prime After Prime (great title!), Brian Hayes boils down into two sentences the fundamental challenge that faces people doing research:

What I find most surprising about the discovery is that no one noticed these patterns long ago. They are certainly conspicuous enough once you know how to look for them.

It would be so much easier to form hypotheses and run tests if interesting hypotheses were easier to find.

Once found, though, we can all see patterns. When they can be computed, we can all write programs to generate them! After reading a paper about the strong correlations among pairs of consecutive prime numbers, Hayes wrote a bunch of programs to visualize the patterns and to see what other patterns he might find. A lot of mathematicians did the same.

Evidently that was a common reaction. Evelyn Lamb, writing in Nature, quotes Soundararajan: "Every single person we've told this ends up writing their own computer program to check it for themselves."

Being able to program means being able to experiment with all kinds of phenomena, even those that seemingly took genius to discover in the first place.
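
The experiment really is small. Here is a quick Python sketch in that spirit -- not Hayes's code -- that tallies how often a prime ending in one digit is followed by a prime ending in another:

    from collections import Counter

    def primes_up_to(n):
        """A simple sieve of Eratosthenes."""
        sieve = [True] * (n + 1)
        sieve[0:2] = [False, False]
        for i in range(2, int(n ** 0.5) + 1):
            if sieve[i]:
                sieve[i*i::i] = [False] * len(sieve[i*i::i])
        return [i for i, is_prime in enumerate(sieve) if is_prime]

    primes = primes_up_to(1000000)
    pairs = Counter((p % 10, q % 10) for p, q in zip(primes, primes[1:]))

    for (a, b), count in sorted(pairs.items()):
        print(a, "->", b, ":", count)
    # consecutive primes "avoid" repeating their last digit: the (1,1), (3,3),
    # (7,7), and (9,9) pairs show up noticeably less often than chance suggests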

Actually, though, Hayes's article gives a tour of the kind of thinking we all can do, thinking that can yield new insights. Once he had implemented some basic ideas from the research paper, he let his imagination roam. He tried different moduli. He visualized the data using heat maps. When he noticed some symmetries in his tables, he applied a cyclic shift to the data (which he termed a "twist") to see if some patterns were easier to identify in the new form.

Being curious and asking questions like these is one of the ways that researchers manage to stumble upon new patterns that no one has noticed before. Genius may be one way to make great discoveries, but it's not a reliable one for those of us who aren't geniuses. Exploring variations on a theme is a tactic we mortals can use.

Some of the heat maps that Hayes generates are quite beautiful. The image above is a heat map of the normalized counts of consecutive eight-digit primes, taken modulo 31. He has more fun making images of his twists and of other kinds of primes. I recommend reading the entire article, for its math, for its art, and as an implicit narration of how a computational scientist approaches a cool result.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

June 12, 2016 10:33 AM

The Tension Between Free and Expensive

Yesterday, William Stein's talk about the origins of SageMath spread rapidly through certain neighborhoods of Twitter. It is a thorough and somewhat depressing discussion of how hard it is to develop open source software within an academic setting. Writing code is not part of the tenure reward system or the system for awarding grants. Stein has tenure at the University of Washington but has decided that he has to start a company, SageMath, and work for it full-time in order to create a viable open source alternative to the "four 'Ma's": Mathematica, Matlab, Maple, and Magma.

Stein's talk reminded me of something I read earlier this year, from a talk by Matthew Butterick:

"Information wants to be expensive, because it's so valuable ... On the other hand, information wants to be free, because the cost of getting it out is getting lower ... So you have these two fighting against each other."

This was said by a guy named Stewart Brand, way back in 1984.

So what's the message here? Information wants to be free? No, that's not the message. The message is that there are two forces in tension. And the challenge is how to balance the forces.

Proponents of open source software -- and I count myself one -- are often so glib with the mantra "information wants to be free" that we forget about the opposing force. Wolfram et al. have capitalized quite effectively on information's desire to be expensive. This force has an economic power that can overwhelm purely communitarian efforts in many contexts, to the detriment of open work. The challenge is figuring out how to balance the forces.

In my mind, Mozilla stands out as the modern paradigm of seeking a way to balance the forces between free and expensive, creating a non-profit shell on top of a commercial foundation. It also seeks ways to involve academics in the process. It will be interesting to see whether this model is sustainable.

Oh, and Stewart Brand. He pointed out this tension thirty years ago. I recently recommended How Buildings Learn to my wife and thought I should look back at the copious notes I took when I read it twenty years ago. But I should read the book again myself; I hope I've changed enough since then that reading it anew brings new ideas to mind.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 08, 2016 2:48 PM

Happy Birthday Programming

Yesterday, I wrote me some Java. It was fun.

A few days ago, I started wondering if there was something unique I could send my younger daughter for her birthday today. My daughters and I were all born in presidential election years, which is a neat little coincidence. This year's election is special for the birthday girl: it is her first opportunity to vote for the president. She has participated in the process throughout, which has seen both America's most vibrant campaign for a progressive candidate in at least forty years and the first nomination of a woman by a major party. Both of these are important to her.

In the spirit of programming and presidential politics, I decided to write a computer program to convert images into the style of Shepard Fairey's iconic Obama "Hope" poster and then use it to create a few images for her.

I dusted off Dr. Java and fired up some code I wrote when I taught media computation in our intro course many years ago. It had been a long time since I had written any Java at all, but it came back just like riding a bike. More than a decade of writing code in a language burns some pretty deep grooves in the mind.

I found RGB values to simulate the four colors in Fairey's poster in an old message to the mediacomp mailing list:

    Color darkBlue  = new Color(0, 51, 76);
    Color lightBlue = new Color(112, 150, 158);
    Color red       = new Color(217, 26, 33);
    Color yellow    = new Color(252, 227, 166);

Then came some experimentation...

  • First, I tried turning each pixel into the Fairey color to which it was closest. That gave an image that was grainy and full of lines, almost like a negative.

  • Then I calculated the saturation of each pixel (the average of its RGB values) and translated the pixel into one of the four colors depending on which quartile it was in. If the saturation was less than 256/4, it went dark blue; if it was less than 256/2, it went red; and so on. This gave a better result, but some images ended up having way too much of one or two of the colors.

  • Attempt #2 skews the outputs because in many images (most?) the saturation values are not distributed evenly over the four quartiles. So I wrote a function to "normalize" the quartiles. I recorded all of the saturation values in the image and divided them evenly across the four colors. The result was an image with equal numbers of pixels assigned to each of the four colors. (A sketch of this step appears just below.)
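
Here is the gist of that third attempt, sketched in Python rather than the Java I actually used; the details differ from my real code, and the image-reading and image-writing plumbing is left out:

    # "Normalized quartiles": sort the saturation values, find the three cut
    # points that split them into four equal groups, then map each pixel to
    # one of the four poster colors by its group.
    DARK_BLUE, RED, LIGHT_BLUE, YELLOW = range(4)   # stand-ins for the Color objects

    def posterize(saturations):
        ranked = sorted(saturations)
        n = len(ranked)
        cuts = [ranked[n // 4], ranked[n // 2], ranked[3 * n // 4]]

        def color_for(s):
            if s < cuts[0]: return DARK_BLUE
            if s < cuts[1]: return RED
            if s < cuts[2]: return LIGHT_BLUE
            return YELLOW

        return [color_for(s) for s in saturations]

    print(posterize([10, 200, 30, 90, 250, 60, 170, 140]))
    # -> [0, 3, 0, 1, 3, 1, 2, 2]: two pixels land in each color bucket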

I liked the outputs of this third effort quite a bit, at least for the photos I gave it as input. Two of them worked out especially well. With a little doctoring in Photoshop, they would have an even more coherent feel to them, like an artist might produce with a keener eye. Pretty good results for a few fun minutes of programming.

Now, let's hope my daughter likes them. I don't think she's ever received a computer-generated present before, at least not generated by a program her dad wrote!

The images I created were gifts to her, so I'll not share them here. But if you've read this far, you deserve a little something, so I give you these:

Eugene obamified by his own program, version 1
 
Eugene obamified by his own program, version 2

Now that is change we can all believe in.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

June 03, 2016 3:07 PM

The Lisp 1.5 Universal Interpreter, in Racket

John McCarthy presents Lisp from on High
 
"John McCarthy presents Recursive
Functions of Symbolic Expressions
and Their Computation by Machine,
Part I"
courtesy of Classic Programmer Paintings

Earlier this week, Alan Kay was answering questions on Hacker News and mentioned Lisp 1.5:

This got deeper if one was aware of how Lisp 1.5 had been implemented with the possibility of late bound parameter evaluation ...

Kay mentions Lisp, and especially Lisp 1.5, often when he is talking about the great ideas of computing. He sometimes likens McCarthy's universal Lisp interpreter to Maxwell's equations in physics -- a small, simple set of equations that capture a huge amount of understanding and enable a new way of thinking. Late-bound evaluation of parameters is one of the neat ideas you can find embedded in that code.

The idea of a universal Lisp interpreter is pretty simple: McCarthy defined the features of Lisp in terms of the language features themselves. The interpreter consists of two main procedures:

  • a procedure that evaluates an expression, and
  • a procedure that applies a procedure to its arguments.

These procedures recurse mutually to evaluate a program.

This is one of the most beautiful ideas in computing, one that we take for granted today.

The syntax and semantics of Lisp programs are so sparse and so uniform that McCarthy's universal Lisp interpreter consisted of about one page of Lisp code. Here that page is: Page 13 of the Lisp 1.5 Programmer's Manual, published in 1962.



Page 13 of the Lisp 1.5 Programmer's Manual


You may see this image passed around Twitter and the web these days whenever Lisp 1.5 is mentioned. But the universal Lisp interpreter is a program. Why settle for a JPG image?

While preparing for the final week of my programming languages course this spring, I sat down and implemented the Lisp interpreter on Page 13 of the Lisp 1.5 manual in universal-lisp-interpreter.rkt, using Racket.

I tried to reproduce the main procedures from the manual as faithfully as I could. You see the main two functions underlying McCarthy's idea: "evaluate an expression" and "apply a function to its arguments". The program assumes the existence of only a few primitive forms from Racket:

  • the functions cons, car, cdr, atom, and eq?
  • the form lambda, for creating functions
  • the special forms quote and cond
  • the values 't and nil

't means true, and nil means both false and the empty list. My Racket implementation uses #t and #f internally, but they do not appear in the code for the interpreter.

Notice that this interpreter implements all of the language features that it uses: the same five primitive functions, the same two special forms, and lambda. It also defines label, a way to create recursive functions. (label offers a nice contrast to the ways we talk about implementing recursive functions in my course.)

The interpreter uses a few helper functions, which I also define as in the manual. evcon evaluates a cond expression, and evlis evaluates a list of arguments. assoc looks up the value for a key in an "association list", and pairlis extends an existing association list with new key/value pairs. (In my course, assoc and pairlis correspond to basic operations on a finite function, which we use to implement environments.)

I enjoyed walking through this code briefly with my students. After reading this code, I think they appreciated anew the importance of meaningful identifiers...

The code works. Open it up in Racket and play with a Lisp from the dawn of time!
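
If you want the shape of the idea without the 1962 typography or a Racket installation, here is a compressed Python sketch -- emphatically not McCarthy's code and not my Racket file -- with Python lists standing in for s-expressions and an association list of (name, value) pairs as the environment:

    def l_eval(e, env):
        if isinstance(e, str):                 # a symbol: look up its value
            return assoc(e, env)
        op = e[0]
        if op == 'quote':
            return e[1]
        if op == 'cond':
            return evcon(e[1:], env)
        return l_apply(op, evlis(e[1:], env), env)

    def l_apply(f, args, env):
        if isinstance(f, str):                 # a primitive or a named function
            if f == 'car':  return args[0][0]
            if f == 'cdr':  return args[0][1:]
            if f == 'cons': return [args[0]] + args[1]
            if f == 'atom': return not isinstance(args[0], list) or args[0] == []
            if f == 'eq':   return args[0] == args[1]
            return l_apply(assoc(f, env), args, env)
        if f[0] == 'lambda':                   # (lambda (params) body)
            return l_eval(f[2], pairlis(f[1], args, env))
        if f[0] == 'label':                    # (label name (lambda ...))
            return l_apply(f[2], args, [(f[1], f[2])] + env)

    def evcon(clauses, env):                   # evaluate a cond expression
        for test, result in clauses:
            if l_eval(test, env):
                return l_eval(result, env)

    def evlis(exprs, env):                     # evaluate a list of arguments
        return [l_eval(x, env) for x in exprs]

    def assoc(key, env):                       # look up a key in an association list
        for k, v in env:
            if k == key:
                return v

    def pairlis(keys, values, env):            # extend an association list
        return list(zip(keys, values)) + env

    env = [('t', True), ('nil', [])]
    print(l_eval(['cons', ['quote', 'a'], ['quote', ['b', 'c']]], env))   # ['a', 'b', 'c']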

It really is remarkable how much can be built out of so little. I sometimes think of the components of this program as the basic particles out of which all computation is built, akin to an atomic theory of matter. Out of these few primitives, all programs can be built.


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 02, 2016 3:10 PM

Restoring Software's Good Name with a Humble Script

In a Startup School interview earlier this year, Paul Graham reminds software developers of an uncomfortable truth:

Still, to this day, one of the big things programmers do not get is how traumatized users have been by bad and hard-to-use software. The default assumption of users is, "This is going to be really painful, and in the end, it's not going to work."

I have encountered this trauma even more since beginning to work with administrators on campus a decade ago. "Campus solutions" track everything from enrollments to space usage. So-called "business-to-business software" integrates purchasing with bookkeeping. Every now and then the university buys and deploys a new system, to manage hiring, say, or faculty travel. In almost every case, interacting with the software is painful for the user, and around the edges it never quite seems to fit what most users really want.

When administrators or faculty relate their latest software-driven pain, I try to empathize while also bringing a little perspective to their concerns. These systems address large issues, and trying to integrate them into a coherent whole is a very real challenge, especially for an understaffed group of programmers. Sometimes, the systems are working exactly as they should to implement an inconvenient policy. Unfortunately, users don't see the policy on a daily basis; they see the inconvenient and occasionally incomplete software that implements it.

Yet there are days when even I have to complain out loud. Using software can be painful.

Today, though, I offer a story of nascent redemption.

After reviewing some enrollment data earlier this spring, my dean apologized in advance for any errors he had made in the reports he sent to the department heads. Before he can analyze the data, he or one of the assistant deans has to spend many minutes scavenging through spreadsheets to eliminate rows that are not part of the review. They do this several times a semester, which adds up to hours of wasted time in the dean's office. The process is, of course, tedious and error-prone.

I'm a programmer. My first thought was, "A program can do this filtering almost instantaneously and never make an error."

In fact, a few years ago, I wrote a simple Ruby program to do just this sort of filtering for me, for a different purpose. I told the dean that I would be happy to adapt it for use in his office to process data for all the departments in the college. My primary goal was to help the dean; my ulterior motive was self-improvement. On top of that, this was a chance to put my money where my mouth is. I keep telling people that a few simple programs can make our lives better, and now I could provide a concrete example.

Last week, I whipped up a new Python script. This week, I demoed it to the dean and an assistant dean. The dean's first response was, "Wow, this will help us a lot." The rest of the conversation focused on ways that the program could help them even more. Like all users, once they saw what was possible, they knew even better what they really wanted.
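
The heart of such a script is nothing special. Here is a sketch of the filtering step, assuming the spreadsheet has been exported to CSV; the column name and the keep-list are invented for illustration:

    import csv
    import sys

    KEEP = {"CS", "CS ED", "NETWORKING"}       # invented values, for illustration only

    def filter_rows(infile, outfile):
        with open(infile, newline='') as f, open(outfile, 'w', newline='') as out:
            reader = csv.DictReader(f)
            writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
            writer.writeheader()
            for row in reader:
                if row["PROGRAM"] in KEEP:     # drop every row not under review
                    writer.writerow(row)

    if __name__ == "__main__":
        filter_rows(sys.argv[1], sys.argv[2])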

I'll make a few changes and deliver a more complete program soon. I'll also help the users as they put it to work and run into any bugs that remain. It's been fun. I hope that this humble script is an antidote, however small, to the common pain of software that is hard to use and not as helpful as it should be. Many simple problems can be solved by simple programs.


Posted by Eugene Wallingford | Permalink | Categories: Computing

May 27, 2016 1:38 PM

Brilliance Is Better Than Magic, Because You Get To Learn It

Brent Simmons has recently suggested that Swift would be better if it were more dynamic. Some readers have interpreted his comments as an unwillingness to learn new things. In Oldie Complains About the Old Old Ways, Simmons explains that new things don't bother him; he simply hopes that we don't lose access to what we learned in the previous generation of improvements. The entry is short and worth reading in its entirety, but the last sentence of this particular paragraph deserves to be etched in stone:

It seemed like magic, then. I later came to understand how it worked, and then it just seemed like brilliance. (Brilliance is better than magic, because you get to learn it.)

This gets close to the heart of why I love being a computer scientist.

So many of the computer programs I use every day seem like magic. This might seem odd coming from a computer scientist, who has learned how to program and who knows many of the principles that make complex software possible. Yet that complexity takes many forms, and even a familiar program can seem like magic when I'm not thinking about the details under its hood.

As a computer scientist, I get to study the techniques that make these programs work. Sometimes, I even get to look inside the new program I am using, to see the algorithms and data structures that bring to life the experience that feels like magic.

Looking under the hood reminds me that it's not really magic. It isn't always brilliance either, though. Sometimes, it's a very cool idea I've never seen or thought about before. Other times, it's merely a bunch of regular ideas, competently executed, woven together in a way that gives an illusion of magic. Regular ideas, competently executed, have their own kind of beauty.

After I study a program, I know the ideas and techniques that make it work. I can use them to make my own programs.

This fall, I will again teach a course in compiler construction. I will tell a group of juniors and seniors, in complete honesty, that every time I compile and execute a program, the compiler feels like magic to me. But I know it's not. By the end of the semester, they will know what I mean; it won't feel like magic to them any more, either. They will have learned how their compilers work. And that is even better than the magic, which will never go away completely.

After the course, they will be able to use the ideas and techniques they learn to write their own programs. Those programs will probably feel like magic to the people who use them, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Software Development

May 23, 2016 1:55 PM

Your Occasional Reminder to Use Plain Text Whenever Possible

The authors of Our Lives, Encoded found that they had lost access to much of their own digital history to the proprietary data formats of long-dead applications:

Simple text files have proven to be the only format interoperable through the decades. As a result, they are the only artifacts that remain from my own digital history.

I myself have lost access to many WordPerfect files from the '80s in their original form, though I have been migrating their content to other formats over the years. I was fortunate, though, to do most of my early work in VMS and Unix, so a surprising number of my programs and papers from that era are still readable as they were then. (Occasionally, this requires me to dust off troff to see what I intended for them to look like then.)

However, the world continues to conspire against us. Even when we are doing something that is fundamentally plain text, the creators of networks and apps build artificial barriers between their services.

One cannot simply transfer a Twitter profile over to Facebook, or message a Snapchat user with Apple's iMessage. In the sense that they are all built to transmit text and images, these platforms aren't particularly novel, they're just designed to be incompatible.

This is one more reason that you will continue to find me consorting in the ancient technology of email. Open protocol, plain text. Plenty of goodness there, even with the spam.


Posted by Eugene Wallingford | Permalink | Categories: Computing

May 15, 2016 9:36 AM

An Interview about Encryption

A local high school student emailed me last week to say that he was writing a research paper about encryption and the current conversation going on regarding its role in privacy and law enforcement. He asked if I would be willing to answer a few interview questions, so that he could have a few expert quotes for his paper. I'm always glad when our local schools look to the university for expertise, and I love to help young people, so I said yes.

I have never written anything here about my take on encryption, Edward Snowden, or the FBI case against Apple, so I figured I'd post my answers. Keep in mind that my expertise is in computer science. I am not a lawyer, a political scientist, or a philosopher. But I am an informed citizen who knows a little about how computers work. What follows is a lightly edited version of the answers I sent the student.

  1. Do you use encryption? If so, what do you use?

    Yes. I encrypt several disk images that hold sensitive financial data. I use encrypted files to hold passwords and links to sensitive data. My work laptop is encrypted to protect university-related data. And, like everyone else, I happily use https: when it encrypts data that travels between me and my bank and other financial institutions on the web.

  2. In light of the recent news on groups like ISIS using encryption, and the Apple v. Department of Justice, do you support legislation that eliminates or weakens powerful encryption?

    I oppose any legislation that weakens strong encryption for ordinary citizens. Any effort to weaken encryption so that the government can access data in times of need weakens encryption for all people at all times and against all intruders.

  3. Do you think the general good of encryption (protection of data and security of users) outweighs or justifies its usage when compared to the harmful aspects of it (being used by terrorists groups or criminals)?

    I do. Encryption is one of the great gifts that computer science has given humanity: the ability to be secure in one's own thoughts, possessions, and communication. Any tool as powerful as this one can be misused, or used for evil ends.

    Encryption doesn't protect us only from the U.S. government acting in good faith. It protects people from criminals who want to steal our identities and our possessions. It protects people from the U.S. government acting in bad faith. And it protects people from other governments, including governments that terrorize their own people. If I were a citizen of a repressive regime in the Middle East, Africa, Southeast Asia, or anywhere else, I would want the ability to communicate without intrusion from my government.

    Those of us who are lucky to live in safer, more secure circumstances owe this gift to the people who are not so lucky. And weakening it for anyone weakens it for everyone.

  4. What is your response to someone who justifies government suppression of encryption with phrases like "What are you hiding?" or "I have nothing to hide."?

    I think that most people believe in privacy even when they have nothing to hide. As a nation, we do not allow police to enter our homes at any time for any reason. Most people lock their doors at night. Most people pull their window shades down when they are bathing or changing clothes. Most people do not have intimate relations in public view. We value privacy for many reasons, not just when we have something illegal to hide.

    We do allow the police to enter our homes when executing a search warrant, after the authorities have demonstrated a well-founded reason to believe it contains material evidence in an investigation. Why not allow the authorities to enter our digital devices under similar circumstances? There are two reasons.

    First, as I mentioned above, weakening encryption so that the government can access data in times of legitimate need weakens encryption for everyone all the time and makes everyone vulnerable to all intruders, including bad actors. It is simply not possible to create entry points only for legitimate government uses. If the government suppresses encryption in order to assist law enforcement, there will be disastrous unintended side effects for the essential privacy of our data.

    Second, our digital devices are different than our homes and other personal property. We live in our homes and drive our cars, but our phones, laptops, and other digital devices contain fundamental elements of our identity. For many, they contain the entirety of our financial and personal information. They also contain programs that enact common behaviors and would enable law enforcement to recreate past activity not stored on the device. These devices play a much bigger role in our lives than a house.

  5. In 2013 Edward Snowden leaked documents detailing surveillance programs that overstepped boundaries spying on citizens. Do you think Snowden became "a necessary evil" to protect citizens that were unaware of surveillance programs?

    Initially, I was unsympathetic to Snowden's attempt to evade detainment by the authorities. The more I learned about the programs that Snowden had uncovered, the more I came to see that his leak was an essential act of whistleblowing. The American people deserve to know what their government is doing. Indeed, citizens cannot direct their government if they do not know what their elected officials and government agencies are doing.

  6. In 2013 to now, the number of users that are encrypting their data has significantly risen. Do you think that Snowden's whistleblowing was the action responsible for a massive rise in Americans using encryption?

    I don't know. I would need to see some data. Encryption is a default in more software and on more devices now. I also don't know what the trend line for user encryption looked like before his release of documents.

  7. Despite recent revelations on surveillance, millions of users still don't voluntarily use encryption. Do you believe it is fear of being labeled a criminal or the idea that encryption is unpatriotic or makes them an evil person?

    I don't know. I expect that there are a number of bigger reasons, including apathy and ignorance.


  8. Encryption defaults on devices like iPhones, where the device is encrypted while locked with a passcode, are becoming the norm. Do you support the usage of default encryption and believe it protects users who aren't computer savvy?

    I like encryption by default on my devices. It comes with risks: if I lose my password, I lose access to my own data. I think that users should be informed that encryption is turned on by default, so that they can make informed choices.

  9. Should default encryption become required by law or distributed by the government to protect citizens from foreign governments or hackers?

    I think that we should encourage people to encrypt their data. At this point, I am skeptical of laws that would require it. I am not a legal scholar and do not know whether the government has the authority to require it. I also don't know if that is really what most Americans want. We need to have a public conversation about this.

  10. Do you think other foreign countries are catching up or have caught up to the United States in terms of technical prowess? Should we be concerned?

    People in many countries have astonishing technical prowess. Certainly individual criminals and other governments are putting that prowess to use. I am concerned, which is one reason I encrypt my own data and encourage others to do so. I hope that the U.S. government and other American government agencies are using encryption in an effort to protect us. This is one reason I oppose the government mandating weakness in encryption mechanisms for its own purposes.

  11. The United States government disclosed that it was hacked and millions of employees' information was compromised. Target suffered a breach that resulted in credit card information being stolen. Should organizations and companies be legally responsible for breaches like these? What reparations should they make?

    I am not a lawyer, but... Corporations and government agencies should take all reasonable precautions to protect the sensitive data they store about their customers and citizens. I suspect that corporations are already subject to civil suit for damages caused by data breaches, but that places burdens on people to recover damages for losses due to breached data. This is another area where we as a people need to have a deeper conversation so that we can decide to what extent we want to institute safeguards into the law.

  12. Should the US begin hacking into other countries' infrastructures and businesses to potentially damage those countries in the future or steal trade secrets, similar to what China has done to us?

    I am not a lawyer or military expert, but... In general, I do not like the idea of our government conducting warfare on other peoples and other governments when we are not in a state of war. The U.S. should set a positive moral example of how a nation and a people should behave.

  13. Should the US be allowed to force companies and corporations to create backdoors for the government? What do you believe would be the fallout from such an event?

    No. See the third paragraph of my answer to #4.

As I re-read my answers, I realize that, even though I have thought a lot about some of these issues over the years, I have a lot more thinking to do. One of my takeaways from the interview is that the American people need to think about these issues and have public conversations in order to create good public policy and to elect officials who can effectively steward the government in a digital world. In order for this to happen, we need to teach everyone enough math and computer science that they can participate effectively in these discussions and in their own governance. This has big implications for our schools and science journalism.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

May 11, 2016 2:45 PM

Why Do Academic Research in Programming Languages?

When people ask Jean Yang this question, she reminds them that most of the features in mainstream languages follow decades of research:

Yes, Guido Van Rossum was a programmer and not a researcher before he became the Benevolent Dictator of Python. But Python's contribution is not in innovating in terms of any particular paradigm, but in combining well features like object orientation (Smalltalk, 1972, and Clu, 1975), anonymous lambda functions (the lambda calculus, 1937), and garbage collection (1959) with an interactive feel (1960s).

You find a similar story with Matz and Ruby. Many other languages were designed by people working in industry but drawing explicitly on things learned by academic research.

I don't know what percentage of mainstream languages were designed by people in industry rather than academia, but the particular number is beside the point. The same is true in other areas of research, such as databases and networks. We need some research to look years and decades into the future in order to figure what is and isn't possible. That research maps the terrain that makes more applied work, whether in industry or academia, possible.

Without academics betting years of their career on crazy ideas, we are doomed to incremental improvements of old ideas.


Posted by Eugene Wallingford | Permalink | Categories: Computing

May 05, 2016 1:45 PM

Philosopher-Programmer

In her 1942 book Philosophy in a New Key, philosopher Susanne Langer wrote:

A question is really an ambiguous proposition; the answer is its determination.

This sounds like something a Prolog programmer might say in a philosophical moment. Langer even understood how tough it can be to write effective Prolog queries:

The way a question is asked limits and disposes the ways in which any answer to it -- right or wrong -- may be given.

Try sticking a cut somewhere and see what happens...

It wouldn't be too surprising if a logical philosopher reminded me of Prolog, but Langer's specialties were consciousness and aesthetics. Now that I think about it, though, this connection makes sense, too.

Prolog can be a lot of fun, though logic programming always felt more limiting to me than most other styles. I've been fiddling again with Joy, a language created by a philosopher, but every so often I think I should earmark some time to revisit Prolog.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development

May 02, 2016 4:30 PM

A Pawn and a Move

So, a commodity chess program is now giving odds of a pawn and a move to a world top-ten player -- and winning?

The state of computer chess certainly has changed since the fall of 1979, when I borrowed Mike Jeffers's Chess Challenger 7 and played it over and over and over. I was a rank novice, really just getting my start as a player, yet after a week or so I was able to celebrate my first win over the machine, at level 3. You know what they say about practice...

My mom stopped by our study room several times during that week, trying to get me to stop playing. It turns out that she and my dad had bought me a Chess Challenger 7 for Christmas, and she didn't want me to tire of my present before I had even unwrapped it. She didn't know just how not tired I would get of that computer. I wore it out.

When I graduated with my Ph.D., my parents bought me Chess Champion 2150L, branded in the name of world champion Garry Kasparov. The 2150 in the computer's name was a rough indication that it played expert-level chess, much better than my CC7 and much better than me. I could beat it occasionally in a slow game, but in speed chess it pounded me mercilessly. I no longer had the time or inclination to play all night, every night, in an effort to get better, so it forever remained my master.

Now US champ Hikaru Nakamura and world champ Magnus Carlsen know how I feel. The days of any human defeating even the programs you can buy at Radio Shack have long passed.

Odds of two pawns and a move against grandmasters, and a pawn and a move against the best players in the world? Times have changed.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

April 29, 2016 3:30 PM

A Personal Pantheon of Programming Books

Michael Fogus, in the latest issue of Read-Eval-Print-λove, writes:

The book in question was Thinking Forth by Leo Brodie (Brodie 1987) and upon reading it I immediately put it into my own "personal pantheon" of influential programming books (along with SICP, AMOP, Object-Oriented Software Construction, Smalltalk Best Practice Patterns, and Programmers Guide to the 1802).

Mr. Fogus has good taste. Programmers Guide to the 1802 is new to me. I guess I need to read it.

The other five books, though, are in my own pantheon of influential programming books. Some readers may be unfamiliar with these books or the acronyms, or unaware that so many of them are available free online. Here are a few links and details:

  • Thinking Forth teaches us how to program in Forth, a concatenative language in which programs run against a global stack. As Fogus writes, though, Brodie teaches us so much more. He teaches a way to think about programs.

  • SICP is Structure and Interpretation of Computer Programs, hailed by many as the greatest book on computer programming ever written. I am sympathetic to this claim.

  • AMOP is The Art of the Metaobject Protocol, a gem of a book that far too few programmers know about. It presents a very different and more general kind of OOP than most people learn, the kind possible in a language like Common Lisp. I don't know of an authorized online version of this book, but there is an HTML copy available.

  • Object-Oriented Software Construction is Bertrand Meyer's opus on OOP. It did not affect me as deeply as the other books on this list, but it presents the most complete, internally consistent software engineering philosophy of OOP that I know of. Again, there seems to be an unauthorized version online.

  • I love Smalltalk Best Practice Patterns and have mentioned it a couple of times over the years [ 1 | 2 ]. Ounce for ounce, it contains more practical wisdom for programming in the trenches than any book I've read. Don't let "Smalltalk" in the title fool you; this book will help you become a better programmer in almost any language and any style. I have a PDF of a pre-production draft of SBPP, and Stephane Ducasse has posted a free online copy, with Kent's blessing.

Paradigms of Artificial Intelligence Programming

There is one book on my own list that Fogus did not mention: Paradigms of Artificial Intelligence Programming, by Peter Norvig. It holds perhaps the top position in my personal pantheon. Subtitled "Case Studies in Common Lisp", this book teaches Common Lisp, AI programming, software engineering, and a host of other topics in a classical case studies fashion. When you finish working through this book, you are not only a better programmer; you also have working versions of a dozen classic AI programs and a couple of language interpreters.

Reading Fogus's paragraph of λove for Thinking Forth brought to mind how I felt when I discovered PAIP as a young assistant professor. I once wrote a short blog entry praising it. May these paragraphs stand as a greater testimony of my affection.

I've learned a lot from other books over the years, both books that would fit well on this list (in particular, A Programming Language by Kenneth Iverson) and others that belong on a different list (say, Gödel, Escher, Bach -- an almost incomparable book). But I treasure certain programming books in a very personal way.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Software Development, Teaching and Learning

April 11, 2016 2:53 PM

A Tax Form is Really a Program

I finally got around to preparing my federal tax return this weekend. As I wrote a decade ago, I'm one of those dinosaurs who still does taxes by hand, using pencil and paper. Most of this work involves gathering data from various sources and entering numbers on a two-page Form 1040. My family's finances are relatively simple, I'm reasonably well organized, and I still enjoy the annual ritual of filling out the forms.

For supporting forms such as Schedules A and B, which enumerate itemized deductions and interest and dividend income, I reach into my books. My current accounting system consists of a small set of Python programs that I've been developing over the last few years. I keep all data in plain text files. These files are amenable to grep and simple Python programs, which I use to create lists and tally numbers to enter into forms. I actually enjoy the process and, unlike some people, enjoy reflecting once each year about how I support "we, the people" in carrying out our business. I also reflect on the Rube Goldberg device that is US federal tax code.

However, every year there is one task that annoys me: computing the actual tax I owe. I don't mind paying the tax, or the amount I owe. But I always forget how annoying the Qualified Dividends and Capital Gain Tax Worksheet is. In case you've never seen it, or your mind has erased its pain from your memory in an act of self-defense, here it is:

Qualified Dividends and Capital Gain Tax Worksheet--Line 44

It may not seem so bad at this moment, but look at that logic. It's a long sequence of "Enter the smaller of line X or line Y" and "Add lines Z and W" instructions, interrupted by an occasional reference to an entry on another form or a case statement to select a constant based on your filing status. By the time I get to this logic puzzle each year, I am starting to tire and just want to be done. So I plow through this mess by hand, and I start making mistakes.

This year I made a mistake in the middle of the form, comparing the wrong numbers when instructed to choose the smaller. I realized my mistake when I got to a line where the error resulted in a number that made no sense. (Fortunately, I was still alert enough to notice that much!) I started to go back and refigure from the line with the error, when suddenly sanity kicked in.

This worksheet is a program written in English, being executed by a tired, error-prone computer: me. I don't have to put up with this; I'm a programmer. So I turned the worksheet into a Python program.

This is what the Qualified Dividends and Capital Gain Tax Worksheet for Line 44 of Form 1040 (Page 44 of the 2015 instruction book) could be, if we weren't still distributing everything as dead PDF:

    line   = [None] * 28

    line[ 0] = 0.00 # unused
    line[ 1] = XXXX # 1040 line 43
    line[ 2] = XXXX # 1040 line 9b
    line[ 3] = XXXX # 1040 line 13
    line[ 4] = line[ 2] + line[ 3]
    line[ 5] = XXXX # 4952 line 4g
    line[ 6] = line[ 4] - line[ 5]
    line[ 7] = line[ 1] - line[ 6]
    line[ 8] = XXXX # from worksheet
    line[ 9] = min(line[ 1],line[ 8])
    line[10] = min(line[ 7],line[ 9])
    line[11] = line[ 9] - line[10]
    line[12] = min(line[ 1],line[ 6])
    line[13] = line[11]
    line[14] = line[12] - line[13]
    line[15] = XXXX # from worksheet
    line[16] = min(line[ 1],line[15])
    line[17] = line[ 7] + line[11]
    line[18] = line[16] - line[17]
    line[19] = min(line[14],line[18])
    line[20] = 0.15 * line[19]
    line[21] = line[11] + line[19]
    line[22] = line[12] - line[21]
    line[23] = 0.20 * line[22]
    line[24] = XXXX # from tax table
    line[25] = line[20] + line[23] + line[24]
    line[26] = XXXX # from tax table
    line[27] = min(line[25],line[26])

    i = 0
    for l in line:
        print('{:>2} {:10.2f}'.format(i, l))
        i += 1

This is a quick-and-dirty first cut, just good enough for what I needed this weekend. It requires some user input, as I have to manually enter values from other forms, from the case statements, and from the tax table. Several of these steps could be automated, with only a bit more effort or a couple of input statements. It's also not technically correct, because my smaller-of tests don't guard for a minimum of 0. Maybe I'll add those checks soon, or next year if I need them.
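
The missing guard would be a one-liner on each such comparison. A sketch, with hypothetical indices i, j, and k; I have not added it to my program yet:

    # guard a smaller-of comparison so it never drops below zero
    line[k] = max(0.0, min(line[i], line[j]))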

Wouldn't it be nice, though, if our tax code were written as computer code, or if we could at least download worksheets and various forms as simple programs? I know I can buy commercial software to do this, but I shouldn't have to. There is a bigger idea at play here, and a principle. Computers enable so much more than sharing PDF documents and images. They can change how we write many ideas down, and how we think. Most days, we barely scratch the surface of what is possible.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

April 08, 2016 4:27 PM

"Algorithm" is not a Postmodern Concept

Ken Perlin's The Romantic View of Programming opens:

I was attending a panel recently on the topic of new media art that seemed to be culturally split. There were panelists who were talking about abstract concepts like "algorithm" as though these were arcane postmodern terms, full of mysterious power and potential menace.

I encounter this sort of thinking around my own campus. Not everyone needs to learn a lot about computer science, but it would be nice if we could at least alleviate this misunderstanding.

An algorithm is not a magic incantation, even when its implementation seems to perform magic. For most people, as for Perlin, an algorithm is ultimately a means toward a goal. The article closes with the most important implication of the author's romantic view of programming: "And if some software tool you need doesn't exist, you build it." That can be as true for a new media artist as it is for a run-of-the-mill programmer like me.


Posted by Eugene Wallingford | Permalink | Categories: Computing

March 31, 2016 2:00 PM

TFW Your Students Get Abstraction

A colleague sent me the following exchange from his class, with the tag line "Best comments of the day." His students were working in groups to design a Java program for Conway's Game of Life.

Student 1: I can't comprehend what you are saying.

Student 2: The board doesn't have to be rectangular, does it?

Instructor: In Conway's design, it was. But abstractly, no.

Student 3: So you could have a board of different shapes, or you could even have a three-dimensional "board". Each cell knows its neighbors even if we can't easily display it to the user.

Instructor: Sure, "neighbor" is an abstract concept that you can implement differently depending on your need.

Student 2: I knew there was a reason I took linear algebra.

Student 1: Ok. So let's only allow rectangular boards.

Maybe Student 1 still can't comprehend what everyone is saying... or perhaps he or she understands perfectly well and is a pragmatist. YAGNI for the win!

It always makes me happy when a student encounters a situation in which linear algebra is useful and recognizes its applicability unprompted.

I salute all three of these students, and the instructor who is teaching the class. A good day.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 14, 2016 5:27 PM

Can AlphaGo Teach Us to Play Go Better?

All Systems Go -- the cover of Nature magazine
Nature heralds AlphaGo's arrival
courtesy of the American Go Association

In Why AlphaGo Matters, Ben Kamphaus writes:

AlphaGo recognises strong board positions by first recognizing visual features in the board. It's connecting movements to shapes it detects. Now, we can't see inside AlphaGo unless DeepMind decides they want to share some of the visualizations of its intermediate representations. I hope they do, as I bet they'd offer a lot of insight into both the game of Go and how AlphaGo specifically is reasoning about it.

I'm not sure seeing visualizations of AlphaGo's intermediate representations would offer much insight into either the game of Go or how AlphaGo reasons about it, but I would love to find out.

One of the things that drew me to AI when I was in high school and college was the idea that computer programs might be able to help us understand the world better. At the most prosaic level, I thought this might happen in what we had to learn in order to write an intelligent program, and in how we structured the code that we wrote. At a more interesting level, I thought that we might have a new kind of intelligence with which to interact, and this interaction would help us to learn more about the domain of the program's expertise.

Alas, computer chess advanced mostly by making computers that were even faster at applying the sort of knowledge we already have. In other domains, neural networks and then statistical approaches led to machines capable of competent or expert performance, but their advances were opaque. The programs might shed light on how to engineer systems, but the systems themselves didn't have much to say to us about their domains of expertise or competence.

Intelligent programs, but no conversation. Even when we play thousands of games against a chess computer, the opponent seems otherworldly, with no new principles emerging. Perhaps new principles are there, but we cannot see them. Unfortunately, chess computers cannot explain their reasoning to us; they cannot teach us. The result is much less interesting to me than my original dreams for AI.

Perhaps we are reaching a point now where programs such as AlphaGo can display the sort of holistic, integrated intelligence that enables them to teach us something about the game -- even if only by playing games with us. If it turns out that neural nets, which are essentially black boxes to us, are the only way to achieve AI that can work with us at a cognitive level, I will be chagrined. And most pleasantly surprised.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

March 07, 2016 5:12 PM

Solving a Fun Little Puzzle with Analysis and Simulation

I'm on a mailing list of sports fans who happen also to be geeks of various kinds, including programmers and puzzle nuts. Last Friday, one of my friends posted this link and puzzle to the list:

http://fivethirtyeight.com/features/can-you-win-this-hot-new-game-show/

Two players go on a hot new game show called "Higher Number Wins". The two go into separate booths, and each presses a button, and a random number between zero and one appears on a screen. (At this point, neither knows the other's number, but they do know the numbers are chosen from a standard uniform distribution.) They can choose to keep that first number, or to press the button again to discard the first number and get a second random number, which they must keep. Then, they come out of their booths and see the final number for each player on the wall. The lavish grand prize -- a case full of gold bullion -- is awarded to the player who kept the higher number. Which number is the optimal cutoff for players to discard their first number and choose another? Put another way, within which range should they choose to keep the first number, and within which range should they reject it and try their luck with a second number?

From there, the conversation took off quickly with a lot of intuition and analysis. There was immediate support for the intuitive threshold of 0.5, which a simple case analysis shows to give the maximum expected value for a player, 0.625. Some head-to-head analysis of various combinations, however, showed other values winning more often than 0.5, with values around 0.6 doing the best.
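
The 0.625 comes from a short case analysis: with threshold t, a player keeps the first draw with probability 1-t, for an expected value of (1+t)/2, and otherwise redraws for an expected value of 0.5. Here is a quick sketch of my own in Python, not code from the list discussion:

    def expected_value(t):
        # Expected final number for one player who redraws below threshold t.
        keep   = (1 - t) * (1 + t) / 2   # first draw >= t, expected value (1+t)/2
        redraw = t * 0.5                 # first draw < t, expected value of the second draw
        return keep + redraw

    expected_value(0.5)   # 0.625, the maximum over all single-player thresholds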

What was up? One of these analyses was wrong, but we weren't sure which. One list member, who had built a quick model in Excel, said,

I think the optimum may be somewhere around .61, but I'm damned if I know why.

Another said,

I can't help thinking we're ignoring something game theoretical. I'm expecting we've all arrived at the most common wrong answer.

To confirm his suspicions, this person went off and wrote a C program -- a "terrible, awful, ugly, horrible C program" -- to compute all the expected values for all possible head-to-head cases. He announced,

We have a winner, according to my program. ... 61 wins.

He posted a snippet of the output from his program, which showed a steady rise in the win percentages for cutoffs up to a threshold of 0.61, which beat the 99 other cutoffs, with a steady decline for cutoffs thereafter.

Before reading the conversation on the mailing list, I discussed the puzzle with my wife. We were partial to 0.5, too, but that seemed too simple... So I sat down and wrote a program of my own.

My statistics skills are not as strong as those of many of my friends, and for this reason I like to write programs that simulate the situation at hand. My Racket program creates players who use all possible thresholds, plays matches of 1,000,000 games between each pair, and tallies up the results. Like the C program written by my buddy, my program is quick and dirty; it replays all hundred combinations on each pass, without taking advantage of the fact that the tournament matrix is symmetric. It's slower than it needs to be, but it gets the job done.

Player 57 defeats 95 opponents.
Player 58 defeats 96 opponents.
Player 59 defeats 99 opponents.
Player 60 defeats 100 opponents.
Player 61 defeats 98 opponents.
Player 62 defeats 96 opponents.
Player 63 defeats 93 opponents.

The results of my simulation mirrored the results of the brute-force case analysis. In simulation, 0.6 won, with 0.59 and 0.61 close behind. The two approaches gave similar enough results that it's highly likely there are bugs in neither program -- or both!

Once my friends were confident that 0.5 was not the winner, they were able to diagnose the error in the reasoning that made us think it was the best we could do: Although a player's choice of strategies is independent of the other player's choice, we cannot treat the other player's value as a uniform distribution over [0..1]. That is true only when they choose a threshold of 0 or 1.
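
A quick head-to-head simulation makes the flaw concrete. This is a minimal Python sketch in the spirit of my Racket program, not the program itself:

    import random

    def final_number(threshold):
        # One player's kept number under a given redraw threshold.
        first = random.random()
        return first if first >= threshold else random.random()

    def win_rate(t1, t2, games=1000000):
        # Fraction of games in which threshold t1 beats threshold t2.
        wins = sum(final_number(t1) > final_number(t2) for _ in range(games))
        return wins / games

    print(win_rate(0.6, 0.5))   # a bit over 0.5: the "obvious" cutoff loses head-to-head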

In retrospect, this seems obvious, and maybe it was obvious to my mathematician friends right off the bat. But none of us on the mailing list is a professional statistician. I'm proud that we all stuck with the problem until we understood what was going on.

I love how we can cross-check our intuitions about puzzles like this with analysis and simulation. There is a nice interplay between theory and empirical investigation here. A simple theory, even if incomplete or incorrect, gives us a first approximation. Then we run a test and use the results to go back and re-think our theory. The data helped us see the holes in our thinking. What works for puzzles works for hairier problems out in the world, too.

And we created the data we needed by writing a computer program. You know how much I like to do that. This is the sort of situation we see when writing chess-playing programs and machine learning programs: We can write programs that are smarter than we are by starting from much simpler principles that we know and understand.

This experience is also yet another reminder of why, if I ever go freelance as a consultant or developer, I plan to team up with someone who is a better mathematician than I am. Or at least find such a person to whom I can sub-contract a sanity check.


Posted by Eugene Wallingford | Permalink | Categories: Computing

February 24, 2016 2:36 PM

Computer Science is the Discipline of Reinvention

The quote of the day comes courtesy of the inimitable Matthias Felleisen, on the Racket mailing list:

Computer science is the discipline of reinvention. Until everyone who knows how to write 10 lines of code has invented a programming language and solved the Halting Problem, nothing will be settled :-)

One of the great things about CS is that we can all invent whatever we want. One of the downsides is that we all do.

Sometimes, making something simply for the sake of making it is a wonderful thing, edifying and enjoyable. Other times, we should heed the advice carved above the entrance to the building that housed my first office as a young faculty member: Do not do what has already been done. Knowing when to follow each path is a sign of wisdom.


Posted by Eugene Wallingford | Permalink | Categories: Computing

February 14, 2016 11:28 AM

Be Patient, But Expect Better. Then Make Better.

In Reversing the Tide of Declining Expectations Matthew Butterick exhorts designers to expect more from themselves, as well as from the tools they use. When people expect more, other people sometimes tell them to be patient. There is a problem with being patient:

[P]atience is just another word for "let's make it someone else's problem." ... Expectations count too. If you have patience, and no expectations, you get nothing.

But what if you find the available tools lacking and want something better?

Scientists often face this situation. My physicist friends seem always to be rigging up some new apparatus in order to run the experiments they want to run. For scientists and so many other people these days, though, if they want a new kind of tool, they have to write a computer program.

Butterick tells a story that shows designers can do the same:

Let's talk about type-design tools. If you've been at the conference [TYPO Berlin, 2012], maybe you saw Petr van Blokland and Frederick Berlaen talking about RoboFont yesterday. But that is the endpoint of a process that started about 15 years ago when Erik and Petr van Blokland, and Just van Rossum (later joined by many others) were dissatisfied with the commercial type-design tools. So they started building their own. And now, that's a whole ecosystem of software that includes code libraries, a new font-data format called UFO, and applications. And these are not hobbyist applications. These are serious pieces of software being used by professional type designers.

What makes all of this work so remarkable is that there are no professional software engineers here. There's no corporation behind it all. It's a group of type designers who saw what they needed, so they built it. They didn't rely on patience. They didn't wait for someone else to fix their problems. They relied on their expectations. The available tools weren't good enough. So they made better.

That is fifteen years of patience. But it is also patience and expectation in action.

To my mind, this is the real goal of teaching more people how to program: programmers don't have to settle. Authors and web designers create beautiful, functional works. They shouldn't have to settle for boring or cliché type on the web, in their ebooks, or anywhere else. They can make better. Butterick illustrates this approach to design himself with Pollen, his software for writing and publishing books. Pollen is a testimonial to the power of programming for authors (as well as a public tribute to the expressiveness of a programming language).

Empowering professionals to make better tools is a first step, but it isn't enough. Until programming as a skill becomes part of the culture of a discipline, better tools will not always be used to their full potential. Butterick gives an example:

... I was speaking to a recent design-school graduate. He said, "Hey, I design fonts." And I said, "Cool. What are you doing with RoboFab and UFO and Python?" And he said, "Well, I'm not really into programming." That strikes me as a really bad attitude for a recent graduate. Because if type designers won't use the tools that are out there and available, type design can't make any progress. It's as if we've built this great spaceship, but none of the astronauts want to go to Mars. "Well, Mars is cool, but I don't want to drive a spaceship. I like the helmet, though." Don't be that guy. Go the hell to Mars.

Don't be that person. Go to Mars. While you are at it, help the people you know to see how much fun programming can be and, more importantly, how it can help them make things better. They can expect more.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

February 12, 2016 3:34 PM

Computing Everywhere: Detecting Gravitational Waves

a linearly-polarized gravitational wave
a linearly-polarized gravitational wave
Wikimedia Commons (CC BY-SA 3.0 US)

This week the world is excitedly digesting news that the interferometer at LIGO has detected gravitational waves being emitted by the merger of two black holes. Gravitational waves were predicted by Einstein one hundred years ago in his theory of General Relativity. Over the course of the last century, physicists have amassed plenty of indirect evidence that such waves exist, but this is the first time they have detected them directly.

The physics world is understandably quite excited by this discovery. We all should be! This is another amazing moment in science: Build a model. Make a falsifiable prediction. Wait for 100 years to have the prediction confirmed. Wow.

We in computer science can be excited, too, for the role that computation played in the discovery. As physicist Sabine Hossenfelder writes in her explanation of the gravitational wave story:

Interestingly, even though it was long known that black hole mergers would emit gravitational waves, it wasn't until computing power had increased sufficiently that precise predictions became possible. ... General Relativity, though often praised for its beauty, does leave you with one nasty set of equations that in most cases cannot be solved analytically and computer simulations become necessary.

As with so many cool advances in the world these days, whether in the sciences or the social sciences, computational modeling and simulation were instrumental in helping to confirm the existence of Einstein's gravitational waves.

So, fellow computer scientists, celebrate a little. Then, help a young person you know to see why they might want to study CS, alone or in combination with some other discipline. Computing is one of the fundamental tools we need these days in order to contribute to the great tableau of human knowledge. Even Einstein can use a little computational help now and then.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

January 29, 2016 3:43 PM

Marvin Minsky and the Irony of AlphaGo

Semantic Information Processing on my bookshelf
a portion of my bookshelf
(CC BY 3.0 US)

Marvin Minsky, one of the founders of AI, died this week. His book Semantic Information Processing made a big impression on me when I read it in grad school, and his paper Why Programming is a Good Medium for Expressing Poorly Understood and Sloppily-Formulated Ideas remains one of my favorite classic AI essays. The list of his students contains many of the great names from decades of computer science; several of them -- Daniel Bobrow, Bertram Raphael, Eugene Charniak, Patrick Henry Winston, Gerald Jay Sussman, Benjamin Kuipers, and Luc Steels -- influenced my work. Winston wrote one of my favorite AI textbooks ever, one that captured the spirit of Minsky's interest in cognitive AI.

It seems fitting that Minsky left us the same week that Google published the paper Mastering the Game of Go with Deep Neural Networks and Tree Search, which describes the work that led to AlphaGo, a program strong enough to beat an expert human Go player. (This brief article describes the accomplishment and program at a higher level.) One of the key techniques at the heart of AlphaGo is neural networks, an area Minsky pioneered in his mid-1950s doctoral dissertation and continued to work in throughout his career.

In 1969, he and Seymour Papert published a book, Perceptrons, which showed the limitations of a very simple kind of neural network. Stories about the book's claims were quickly exaggerated as they spread to people who had never read the book, and the resulting pessimism stifled neural network research for more than a decade. It is a great irony that, in the week he died, one of the most startling applications of neural networks to AI was announced.

Researchers like Minsky amazed me when I was young, and I am more amazed by them and their lifelong accomplishments as I grow older. If you'd like to learn more, check out Stephen Wolfram's personal farewell to Minsky. It gives you a peek into the wide-ranging mind that made Minsky such a force in AI for so long.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

January 28, 2016 2:56 PM

Remarkable Paragraphs: "Everything Is Computation"

Edge.org's 2016 question for sophisticated minds is, What do you consider the most interesting recent [scientific] news? What makes it important? Joscha Bach's answer is: Everything is computation. Read his essay, which contains some remarkable passages.

Computation changes our idea of knowledge: instead of treating it as justified true belief, knowledge describes a local minimum in capturing regularities between observables.

Epistemology was one of my two favorite courses in grad school (cognitive psych was the other), and "justified true belief" was the starting point for many interesting ideas of what constitutes knowledge. I don't see Bach's formulation as a replacement for justified true belief as a starting point, but rather as a specification of what beliefs are most justified in a given context. Still, Bach's way of using computation in such a concrete way to define "knowledge" is marvelous.

Knowledge is almost never static, but progressing on a gradient through a state space of possible world views. We will no longer aspire to teach our children the truth, because like us, they will never stop changing their minds. We will teach them how to productively change their minds, how to explore the never ending land of insight.

Knowledge is a never-ending process of refactoring. The phrase "how to productively change their minds" reminds me of Jon Udell's recent blog post on liminal thinking at scale. From the perspective that knowledge is a function, "changing one's mind intelligently" is the dynamic computational process that keeps the mind at a local minimum.

A growing number of physicists understand that the universe is not mathematical, but computational, and physics is in the business of finding an algorithm that can reproduce our observations. The switch from uncomputable, mathematical notions (such as continuous space) makes progress possible. Climate science, molecular genetics, and AI are computational sciences. Sociology, psychology, and neuroscience are not: they still seem to be confused by the apparent dichotomy between mechanism (rigid, moving parts) and the objects of their study. They are looking for social, behavioral, chemical, neural regularities, where they should be looking for computational ones.

This is a strong claim, and one I'm sympathetic with. However, I think that the apparent distinction between the computational sciences and the non-computational ones is a matter of time, not a difference in kind. It wasn't that long ago that most physicists thought of the universe in mathematical terms, not computational ones. I suspect that with a little more time, the orientation in other disciplines will begin to shift. Neuroscience and psychology are positioned well for such a phase shift.

In any case, Bach's response points our attention in a direction that has the potential to re-define every problem we try to solve. This may seem unthinkable to many, though perhaps not to computer scientists, especially those of us with an AI bent.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

January 15, 2016 4:02 PM

This Week's Edition of "Amazed by Computers"

As computer scientists get older, we all find ourselves reminiscing about the computers we knew in the past. I sometimes tell my students about using 5.25" floppies with capacities listed in kilobytes, a unit for which they have no frame of reference. It always gets a laugh.

In a recent blog entry, Daniel Lemire reminisces about the Cray 2, "the most powerful computer that money could buy" when he was in high school. It took up more space than an office desk (see some photos here), had 1 GB of memory, and provided a peak performance of 1.9 gigaflops. In contrast, a modern iPhone fits in a pocket, has 1 GB of memory, too, and contains a graphics processing unit that provides more gigaflops than the Cray 2.

I saw Lemire's post a day after someone tweeted this image of a 64 GB memory card from 2016 next to a 2 GB Western Digital hard drive from 1996:

a 64 GB memory card (2016), a 2 GB hard drive (1996)

The youngest students in my class this semester were born right around 1996. Showing them a 1996 hard drive is like my college professors showing me magnetic cores: ancient history.

This sort of story is old news, of course. Even so, I occasionally remember to be amazed by how quickly our hardware gets smaller and faster. I only wish I could improve my ability to make software just as fast. Alas, we programmers must deal with the constraints of human minds and human organizations. Hardware engineers do battle only with the laws of the physical universe.

Lemire goes a step beyond reminiscing to close his entry:

And what if, today, I were to tell you that in 40 years, we will be able to fit all the computational power of your phone into a nanobot that can live in your blood stream?

Imagine the problems we can solve and the beauty we can make with such hardware. The citizens of 2056 are counting on us.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

January 12, 2016 3:58 PM

Peter Naur and "Datalogy"

Peter Naur died early this year at the age of 87. Many of you may know Naur as the "N" in BNF notation. His contributions to CS were much broader and deeper than BNF, though. He received the 2005 Turing Award in recognition of his contributions to programming language and compiler design, including his involvement in the definition of Algol 60. I have always been a huge fan of his essay Programming as Theory Building, which I share with anyone I think might enjoy it.

When Michael Caspersen sent a note to the SIGCSE mailing list, I learned something new about Naur: he coined the term datalogy for "the science of the nature and use of data" and suggested that it might be a suitable replacement for the term "computer science". I had to learn more...

It turns out that Naur coined this term in a letter to the Communications of the ACM, which ran in July 1966 under the headline "The Science of Datalogy". This letter is available online through the ACM digital library. Unfortunately, this is behind a paywall for many of you who might be interested. For posterity, here is an excerpt from that page:

This is to advocate that the following new words, denoting various aspects of our subject, be considered for general adoption (the stress is shown by an accent):
  • datálogy, the science of the nature and use of data,
  • datamátics, that part of datalogy which deals with the processing of data by automatic means,
  • datámaton, an automatic device for processing data.

In this terminology much of what is now referred to as "data processing" would be datamatics. In many cases this will be a gain in clarity because the new word includes the important aspect of data representations, while the old one does not. Datalogy might be a suitable replacement for "computer science."

The objection that possibly one of these words has already been used as a proper name of some activity may be answered partly by saying that of course the subject of datamatics is written with a lower case d, partly by remembering that the word "electronics" is used doubly in this way without inconvenience.

What also speaks for these words is that they will transfer gracefully into many other languages. We have been using them extensively in my local environment for the last few months and have found them a great help.

Finally I wish to mention that datamatics and datamaton (Danish: datamatik and datamat) are due to Paul Lindgreen and Per Brinch Hansen, while datalogy (Danish: datalogi) is my own invention.

I also learned from Caspersen's email that Naur was named the first Professor in Datalogy in Denmark, and held that title at the University of Copenhagen until he retired in 1998.

Naur was a pioneer of computing. We all benefit from his work every day.


Posted by Eugene Wallingford | Permalink | Categories: Computing

January 07, 2016 1:52 PM

Parsimony and Obesity on the Web

Maciej Cegłowski is in fine form in his talk The Website Obesity Crisis. In it, he mentions recent projects from Facebook and Google to help people create web pages that load quickly, especially for users of mobile devices. Then he notes that their announcements do not practice what the projects preach:

These comically huge homepages for projects designed to make the web faster are the equivalent of watching a fitness video where the presenter is just standing there, eating pizza and cookies.

There is even more irony in creating special subsets of HTML "designed to be fast on mobile devices".

Why not just serve regular HTML without stuffing it full of useless crap?

William Howard Taft, a president of girth
Wikipedia photo
(photographer not credited)

Indeed. Cegłowski offers a simple way to determine whether the non-text elements of your page are useless, which he dubs the Taft Test:

Does your page design improve when you replace every image with William Howard Taft?

(Taft was an American president and chief justice widely known for his girth.)

My blog is mostly text. I should probably use more images, to spice up the visual appearance and to augment what the text says, but doing so takes more time and skill than I often have at the ready. When I do use images, they tend to be small. I am almost certainly more parsimonious than I need to be for most Internet connections in the 2010s, even wifi.

You will notice that I never embed video, though. I dug into the documentation for HTML and found a handy alternative to use in its place: the web link. It is small and loads fast.


Posted by Eugene Wallingford | Permalink | Categories: Computing

December 11, 2015 2:59 PM

Looking Backward and Forward

Jon Udell looks forward to a time when looking backward digitally requires faithful reanimation of born-digital artifacts:

Much of our cultural heritage -- our words, our still and moving pictures, our sounds, our data -- is born digital. Soon almost everything will be. It won't be enough to archive our digital artifacts. We'll also need to archive the software that accesses and renders them. And we'll need systems that retrieve and animate that software so it, in turn, can retrieve and animate the data.

We already face this challenge. My hard drive is littered with files that I have a hard time opening, if I can open them at all.

Tim Bray reminds us that many of those "born-digital" artifacts will probably live on someone else's computer, including ones owned by his employer, as computing moves to a utility model:

Yeah, computing is moving to a utility model. Yeah, you can do all sorts of things in a public cloud that are too hard or too expensive in your own computer room. Yeah, the public-cloud operators are going to provide way better up-time, security, and distribution than you can build yourself. And yeah, there was a Tuesday in last week.

I still prefer to have original versions of my documents live on my hardware, even when using a cloud service. Maybe one day I'll be less skeptical, when it really is as unremarkable as Tuesday next week. But then, plain text still seems to me to be the safest way to store most data, so what do I know?


Posted by Eugene Wallingford | Permalink | Categories: Computing

December 09, 2015 2:54 PM

What Is The Best Way to Promote a Programming Language?

A newcomer to the Racket users mailing list asked which was the better way to promote the language: start a discussion on the mailing list, or ask questions on Stack Overflow. After explaining that neither was likely to promote Racket, Matthew Butterick gave some excellent advice:

Here's one good way to promote the language:
  1. Make something impressive with Racket.
  2. When someone asks "how did you make that?", give Racket all the credit.

Don't cut corners in Step 1.

This technique applies to all programming languages.

Butterick has made something impressive with Racket: Practical Typography, an online book. He wrote the book using a publishing system named Pollen, which he created in Racket. It's a great book and a joy to read, even if typography is only a passing interest. Check it out. And he gives Racket and the Racket team a lot of credit.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

December 08, 2015 3:55 PM

A Programming Digression: Generating Excellent Numbers

Background

Whenever I teach my compiler course, it seems as if I run across a fun problem or two to implement in our source language. I'm not sure if that's because I'm looking or because I'm just lucky to read interesting blogs and Twitter feeds.

Farey sequences as Ford circles

For example, during a previous offering, I read on John Cook's blog about Farey's algorithm for approximating real numbers with rational numbers. This was a perfect fit for the sort of small language that my students were writing a compiler for, so I took a stab at implementing it. Because our source language, Klein, was akin to an integer assembly language, I had to unravel the algorithm's loops and assignment statements into function calls and if statements. The result was a program that computed an interesting result and that tested my students' compilers in a meaningful way. The fact that I had great fun writing it was a bonus.

This Semester's Problem

Early this semester, I came across the concept of excellent numbers. A number m is "excellent" if, when you split the sequence of its digits into two halves, a and b, b² - a² equals m. 48 is the only two-digit excellent number (8² - 4² = 48), and 3468 is the only four-digit excellent number (68² - 34² = 3468). Working with excellent numbers requires only integers and arithmetic operations, which makes them a perfect domain for our programming language.

My first encounter with excellent numbers was Brian Foy's Computing Excellent Numbers, which discusses ways to generate numbers of this form efficiently in Perl. Foy uses some analysis by Mark Jason Dominus, written up in An Ounce of Theory Is Worth a Pound of Search, that drastically reduces the search space for candidate a's and b's. A commenter on the Programming Praxis article uses the same trick to write a short Python program to solve that challenge. Here is an adaptation of that program which prints all of the 10-digit excellent numbers:

    for a in range(10000, 100000):
        b = ((4*a**2+400000*a+1)**0.5+1) / 2.0
        if b == int(b):
            print(int(str(a) + str(int(b))))

I can't rely on strings or real numbers to implement this in Klein, but I could see some alternatives... Challenge accepted!

My Standard Technique

We do not have a working Klein compiler in class yet, so I prefer not to write complex programs directly in the language. It's too hard to get subtle semantic issues correct without being able to execute the code. What I usually do is this:

  • Write a solution in Python.
  • Debug it until it is correct.
  • Slowly refactor the program until it uses only a Klein-like subset of Python.

This produces what I hope is a semantically correct program, using only primitives available in Klein.

Finally, I translate the Python program into Klein and run it through my students' Klein front-ends. This parses the code to ensure that it is syntactically correct and type-checks the code to ensure that it satisfies Klein's type system. (Manifest typing is the one feature Klein has that Python does not.)

As mentioned above, Klein is something like integer assembly language, so converting to a Klein-like subset of Python means giving up a lot of features. For example, I have to linearize each loop into a sequence of one or more function calls, recursing at some point back to the function that kicks off the loop. You can see this at play in my Farey's algorithm code from before.

I also have to eliminate all data types other than booleans and integers. For the program to generate excellent numbers, the most glaring hole is a lack of real numbers. The algorithm shown above depends on taking a square root, getting a real-valued result, and then coercing a real to an integer. What can I do instead?

the iterative step in Newton's method

Not to worry. sqrt is not a primitive operator in Klein, but we have a library function. My students and I implement useful utility functions whenever we encounter the need and add them to a file of definitions that we share. We then copy these utilities into our programs as needed.

sqrt was one of the first complex utilities we implemented, years ago. It uses Newton's method to find the roots of an integer. For perfect squares, it returns the argument's true square root. For all other integers, it returns the largest integer less than or equal to the true root.
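
In Python, the same idea looks something like the sketch below. It illustrates the technique; it is not the Klein library code itself.

    def int_sqrt(n):
        # Integer square root by Newton's method: the exact root for perfect
        # squares, otherwise the largest integer <= the true square root.
        if n < 2:
            return n
        x = n
        while True:
            y = (x + n // x) // 2   # Newton step for f(x) = x*x - n
            if y >= x:
                return x
            x = y

    int_sqrt(49)   # 7
    int_sqrt(50)   # 7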

With this answer in hand, we can change the Python code that checks whether a purported square root b is an integer using type coercion:

    b == int(b)
into Klein code that checks whether the square of a square root equals the original number:
    isSquareRoot(r : integer, n : integer) : boolean
      n = r*r

(Klein is a pure functional language, so the return statement is implicit in the body of every function. Also, without assignment statements, Klein can use = as a boolean operator.)

Generating Excellent Numbers in Klein

I now have all the Klein tools I need to generate excellent numbers of any given length. Next, I needed to generalize the formula at the heart of the Python program to work for lengths other than 10.

For any given desired length, let n = length/2. We can write any excellent number m in two ways:

  • a·10^n + b (which defines it as the concatenation of its front and back halves)
  • b² - a² (which defines it as excellent)

If we set the two m's equal to one another and solve for b, we get:

    b = (1 + sqrt(4a² + 4·10^n·a + 1)) / 2

Now, as in the algorithm above, we loop through all values for a with n digits and find the corresponding value for b. If b is an integer, we check to see whether m = a·10^n + b is excellent.
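
A generalized version of the earlier loop might look like the sketch below. This is my illustration, not the Klein-subset program; it leans on math.isqrt (Python 3.8+) for the integer square root, where the Klein version uses our Newton's-method sqrt instead.

    import math

    def excellent(length):
        # All excellent numbers with the given (even) number of digits.
        n = length // 2
        shift = 10 ** n
        results = []
        for a in range(10 ** (n - 1), shift):   # every n-digit front half
            disc = 4*a*a + 4*shift*a + 1        # discriminant from the formula above
            root = math.isqrt(disc)
            if root * root != disc:             # b must be an integer
                continue
            b = (root + 1) // 2
            m = a * shift + b
            if b < shift and b*b - a*a == m:    # confirm that m really is excellent
                results.append(m)
        return results

    print(excellent(6))   # [140400, 190476, 216513, ...]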

The Python loop shown above works plenty fast, but Klein doesn't have loops. So I refactored the program into one that uses recursion. This program is slower, but it works fine for numbers up to length 6:

    > python3.4 generateExcellent.py 6
    140400
    190476
    216513
    300625
    334668
    416768
    484848
    530901

Unfortunately, this version blows out the Python call stack for length 8. I set the recursion limit to 50,000, which helps for a while...

    > python3.4 generateExcellent.py 8
    16604400
    33346668
    59809776
    Segmentation fault: 11

Cool.

Next Step: See Spot Run

The port to an equivalent Klein program was straightforward. My first version had a few small bugs, which my students' parsers and type checkers helped me iron out. Now I await their full compilers, due at the end of the week, to see it run. I wonder how far we will be able to go in the Klein run-time system, which sits on top of a simple virtual machine.

If nothing else, this program will repay any effort my students make to implement the proper handling of tail calls! That will be worth a little extra-credit...

This programming digression has taken me several hours spread out over the last few weeks. It's been great fun! The purpose of Klein is to help my students learn to write a compiler. But the programmer in me has fun working at this level, trying to find ways to implement challenging algorithms and then refactoring them to run deeper or faster. I'll let you know the results soon.

I'm either a programmer or crazy. Probably both.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 01, 2015 4:38 PM

A Non-Linear Truth about Wealth and Happiness

This tweet has been making the rounds again the last few days. It pokes good fun at the modern propensity to overuse the phrase 'exponential growth', especially in situations that aren't exponential at all. This usage has even invaded the everyday speech of many of my scientist friends, and I'm probably guilty more often than I'd like to admit.

In The Day I Became a Millionaire, David Heinemeier Hansson avoids this error when commenting on something he's learned about wealth and happiness:

The best things in life are free. The second best things are very, very expensive. -- Coco Chanel
While the quote above rings true, I'd add that the difference between the best things and the second best things is far, far greater than the difference between the second best things and the twentieth best things. It's not a linear scale.

I started to title this post "A Power Law of Wealth and Happiness" before realizing that I was falling into a similar trap, one common among computer scientists and software developers these days: calling every function with a steep end and a long tail "a power law". DHH does not claim that the relationship between cost and value is exponential, let alone that it follows a power law. I reined in my hyperbole just in time. "A Non-Linear Truth ..." may not carry quite the same weight as "power law" academic-speak, but it sounds just fine.

By the way, I agree with DHH's sentiment. I'm not a millionaire, but most of the things that contribute to my happiness would scarcely be improved by another zero or two in my bank account. A little luck at birth afforded me almost all of what I need in life, as it has many other people. The rest is an expectations game that is hard to win by accumulating more.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

November 26, 2015 11:04 AM

I Am Thankful for Programming

I smiled a big smile when I read this passage in an interview with Victoria Gould, a British actor and mathematician:

And just as it did when she was at school, maths still brings Victoria relief and reassurance. "When teaching or acting becomes stressful, I retreat to maths a lot for its calmness and its patterns. I'll quite often, in a stressful time, go off and do a bit of linear algebra or some trigonometric identities. They're hugely calming for me." Maths as stress relief? "Absolutely, it works every time!"

It reminded me of a former colleague, a mathematician who now works at Ohio University. He used to say that he had pads and pencils scattered on tables and counters throughout his house, because "I never know when I'll get the urge to do some math."

Last night, I came home after a couple of days of catching up on department work and grading. Finally, it was time to relax for the holiday. What did I do first? I wrote a fun little program in Python to reverse an integer, using only arithmetic operators. Then I watched a movie with my wife. Both relaxed me.
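
The idea is as simple as it is soothing. Here is a sketch along the lines of what I wrote, though not the program itself:

    def reverse_integer(n):
        # Reverse the digits of a non-negative integer using only arithmetic.
        result = 0
        while n > 0:
            result = result * 10 + n % 10   # peel the last digit off n
            n = n // 10
        return result

    reverse_integer(2015)   # 5102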

I was fortunate as a child to find solace in fiddling with numbers and patterns. Setting up a system of equations and solving it algebraically was fun. I could while away many minutes playing with the square root key on my calculator, trying to see how long it would take me to drive a number to 1.

Then in high school I discovered programming, my ultimate retreat.

On this day, I am thankful for many people and many things, of course. But Gould's comments remind me that I am also thankful for the privilege of knowing how to program, and for the way it allows me to escape into a world away from stress and distraction. This is a gift.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

November 20, 2015 6:02 PM

The Scientific Value of Reading Old Texts

In Numbers, Toys and Music, the editors of Plus Magazine interview Manjul Bhargava, who won a 2014 Fields Medal for his work on a problem involving a certain class of square numbers.

Bhargava talked about getting his start on problems of this sort not by studying Gauss's work from the nineteenth century, but by reading the work of the seventh century mathematician Brahmagupta in the original Sanskrit. He said it was exciting to read original texts and old translations of original texts from at least two perspectives. Historically, you see an idea as it is encountered and discovered. It's an adventure story. Mathematically, you see the idea as it was when it was discovered, before it has been reinterpreted over many years by more modern mathematicians, using newer, fancier, and often more complicated jargon than was available to the original solver of the problem.

He thinks this is an important step in making a problem your own:

So by going back to the original you can bypass the way of thinking that history has somehow decided to take, and by forgetting about that you can then take your own path. Sometimes you get too influenced by the way people have thought about something for 200 years, that if you learn it that way that's the only way you know how to think. If you go back to the beginning, forget all that new stuff that happened, go back to the beginning. Think about it in a totally new way and develop your own path.

Bhargava isn't saying that we can ignore the history of math since ancient times. In his Fields-winning work, he drew heavily on ideas about hyperelliptic curves that were developed over the last century, as well as computational techniques unavailable to his forebears. He was prepared with experience and deep knowledge. But by going back to Brahmagupta's work, he learned to think about the problem in simpler terms, unconstrained by the accumulated expectations of modern mathematics. Starting from a simpler set of ideas, he was able to make the problem his own and find his own way toward a solution.

This is good advice in computing as well. When CS researchers tell us to read the work of McCarthy, Newell and Simon, Sutherland, and Engelbart, they are channeling the same wisdom that helped Manjul Bhargava discover new truths about the structure of square numbers.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 19, 2015 2:45 PM

Hope for the Mature Researcher

In A Primer on Graph Isomorphism, Lance Fortnow puts László Babai's new algorithm for the graph isomorphism problem into context. To close, he writes:

Also we think of theory as a young person's game, most of the big breakthroughs coming from researchers early in their careers. Babai is 65, having just won the Knuth Prize for his lifetime work on interactive proofs, group algorithms and communication complexity. Babai uses his extensive knowledge of combinatorics and group theory to get his algorithm. No young researcher could have had the knowledge base or maturity to be able to put the pieces together the way that Babai did.

We often hear that research, especially research aimed at solving our deepest problems, is a young person's game. Great work takes a lot of stamina. It often requires a single-minded focus that comes naturally to a young person but which is a luxury unavailable to someone with a wider set of obligations beyond work. Babai's recent breakthrough reminds us that other forces are at play, that age and broad experience can be advantages, too.

This passage serves as a nice counterweight to Garrison Keillor's "the slow rate of learning..." line, quoted in my previous post. Sometimes, slow and steady are what it takes to get a big job done.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

October 22, 2015 4:22 PM

Aramaic, the Intermediate Language of the Ancient World

My compiler course is making the transition from the front end to the back end. Our attention is on static analysis of abstract syntax trees and will soon turn to other intermediate representations.

In the compiler world, an "intermediate representation" or intermediate language is a notation used as a stepping stone between the abstract syntax tree and the machine language that is ultimately produced. Such a stepping stone allows the compiler to take smaller steps in the translation process and makes it easier to improve the code before getting down into the details of machine language.

We sometimes see intermediate languages in the "real world", too. They tend to arise as a result of cultural and geopolitical forces and, while they usually serve different purposes in human affairs than in compiler affairs, they still tend to be practical stepping stones to another language.

Consider the case of Darius I, whose Persian armies conquered most of the Middle East around 500 BC. As John McWhorter writes in The Atlantic, at the time of Darius's conquest,

... Aramaic was so well-entrenched that it seemed natural to maintain it as the new empire's official language, instead of using Persian. For King Darius, Persian was for coins and magnificent rock-face inscriptions. Day-to-day administration was in Aramaic, which he likely didn't even know himself. He would dictate a letter in Persian and a scribe would translate it into Aramaic. Then, upon delivery, another scribe would translate the letter from Aramaic into the local language. This was standard practice for correspondence in all the languages of the empire.

For sixty years, many compiler writers have dreamed of a universal intermediate language that would ease the creation of compilers for new languages and new machines, to no avail. But for several hundred years, Aramaic was the intermediate representation of choice for a big part of the Western world! Alas, Greek and Arabic later came along to supplant Aramaic, which now seems to be on a path to extinction.

This all sounds a lot like the world of programming, in which languages come and go as we develop new technologies. Sometimes a language, human or computer, takes root for a while as the result of historical or technical forces. Then a new regime or a new culture rises, or an existing culture gains in influence, and a different language comes to dominate.

McWhorter suggests that English may have risen to prominence at just the right moment in history to entrench itself as the world's intermediate language for a good long run. We'll see. Human languages and computer languages may operate on different timescales, but history treats them much the same.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

October 18, 2015 10:42 AM

What a Tiny Language Can Teach Us About Gigantic Systems

StrangeLoop is long in the books for most people, but I'm still thinking about some of the things I learned there. This is the first of what I hope to be a few more posts on talks and ideas still on my mind.

The conference opened with a keynote address by Peter Alvaro, who does research at the intersection of distributed systems and programming languages. The talk was titled "I See What You Mean", but I was drawn in more by his alternate title: "What a Tiny Language Can Teach Us About Gigantic Systems". Going in, I had no idea what to expect from this talk and so, in an attitude whose pessimism surprised me, I expected very little. Coming out, I had been surprised in the most delightful way.

Alvaro opened with the confounding trade-off of all abstractions: Hiding the distracting details of a system can illuminate the critical details (yay!), but the boundaries of an abstraction lock out the people who want to work with the system in a different way (boo!). He illustrated the frustration felt by those who are locked out with a tweet from @pxlplz:

SELECT bs FROM table WHERE sql="arrgh" ORDER BY hate

From this base, Alvaro moved on to his personal interests: query languages, semantics, and distributed systems. When modeling distributed systems, we want a language that is resilient to failure and tolerant of a loose ordering on the execution of operations. But we also need a way to model what programs written in the language mean. The common semantic models express a common split in computing:

  • operational semantics: a program means what it does
  • model-theoretic semantics: a program means the set of facts that makes it true

With query languages, we usually think of programs in terms of the databases of facts that make them true. In many ways, the streaming data of a distributed system is a dual to the database query model. In the latter, program control flows down to fixed data. In distributed systems, data flows down to fixed control units. If I understood Alvaro correctly, his work seeks to find a sweet spot amid the tension between these two models.

Alvaro walked through three approaches to applicative programming. In the simplest form, we have three operators: select (σ), project (Π), and join (⋈). The database language SQL adds to this set negation (¬). The Prolog subset Datalog makes computation of the least fixed point a basic operation. Datalog is awesome, says Alvaro, but not if you add ¬! That creates a language with too much power to allow the kind of reasoning we want to do about a program.
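
For readers who don't live in database land, here is a small sketch in Python of those three operators, treating a relation as a list of dictionaries. The employee/department data is made up for illustration; real systems add indexes and query planners, but the shape of σ, Π, and ⋈ is all here.

    # A tiny model of the three relational operators, with relations as lists of dicts.

    def select(relation, predicate):
        # sigma: keep only the rows that satisfy the predicate
        return [row for row in relation if predicate(row)]

    def project(relation, fields):
        # pi: keep only the named columns
        return [{f: row[f] for f in fields} for row in relation]

    def join(r1, r2, key):
        # natural join on a single shared attribute
        return [{**a, **b} for a in r1 for b in r2 if a[key] == b[key]]

    # Made-up data, for illustration only.
    employees = [{"name": "Ada", "dept": 1}, {"name": "Alan", "dept": 2}]
    depts = [{"dept": 1, "title": "Research"}, {"dept": 2, "title": "Ops"}]

    print(project(select(join(employees, depts, "dept"),
                         lambda row: row["title"] == "Research"),
                  ["name"]))    # [{'name': 'Ada'}]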

Declarative programs don't have assignment statements, because they introduce time into a model. An assignment statement effectively partitions the past (in which an old value holds) from the present (characterized by the current value). In a program with state, there is a hidden clock inside the program.

We all know the difficulty of managing state in a standard system. Distributed systems create a new challenge. They need to deal with time, but a relativistic time in which different programs seem to be working on their own timelines. Alvaro gave a couple of common examples:

  • a sender crashes, then restarts and begins to replay a set of transactions
  • a receiver enters garbage collection, then comes back to life and begins to respond to queued messages

A language that helps us write better distributed systems must give us a way to model relativistic time without a hidden universal clock. The rest of the talk looked at some of Alvaro's experiments aimed at finding such languages for distributed systems, building on the ideas he had introduced earlier.

The first was Dedalus, billed as "Datalog in time and space". In Dedalus, knowledge is local and ephemeral. It adds two temporal operators to the set found in SQL: @next, for making assertions about the future, and @async, for making assertions of independence between operations. Computation in Dedalus is rendezvous between data and control. Program state is a deduction.
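
To give a feel for "knowledge is local and ephemeral", here is a toy model in Python -- emphatically not Dedalus syntax or semantics, just my own illustration -- in which every fact carries a timestamp, an @next-style rule re-derives a fact at the next tick, and an @async-style rule delivers a message at some unpredictable later tick.

    import random

    # A toy model of the flavor (not the syntax or semantics) of Dedalus: every
    # fact carries a timestamp, and nothing persists unless a rule re-derives it.

    def at_next(facts, now):
        # an "@next"-style rule: what I know now, I will still know at now + 1
        return {("knows", arg, now + 1)
                for (pred, arg, t) in facts if t == now and pred == "knows"}

    def at_async(facts, now, max_delay=4):
        # an "@async"-style rule: a sent message arrives at some unpredictable later tick
        return {("delivered", arg, now + random.randint(1, max_delay))
                for (pred, arg, t) in facts if t == now and pred == "sent"}

    facts = {("knows", "password", 0), ("sent", "hello", 0)}
    for now in range(3):
        facts |= at_next(facts, now) | at_async(facts, now)

    print(sorted(f for f in facts if f[0] == "delivered"))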

But what of semantics? Alas, a Dedalus program has an infinite number of models, each model itself infinite. The best we can do is to pull at all of the various potential truths and hope for quiescence. That's not comforting news if you want to know what your program will mean while operating out in the world.

Dedalus as the set of operations {σ, Π, ⋈, ¬, @next, @async} takes us back to the beginning of the story: too much power for effective reasoning about programs.

However, Dedalus minus ¬ seems to be a sweet spot. As an abstraction, it hides state representation and control flow and illuminates data, change, and uncertainty. This is the direction Alvaro and his team are moving in now. One result is Bloom, a small new language founded on the Dedalus experiment. Another is Blazes, a program analysis framework that identifies potential inconsistencies in a distributed program and generates the code needed to ensure coordination among the components in question. Very interesting stuff.

Alvaro closed by returning to the idea of abstraction and the role of programming language. He is often asked why he creates new programming languages rather than working in existing languages. In either approach, he points out, he would be creating abstractions, whether with an API or a new syntax. And he would have to address the same challenges:

  • Respect users. We are they.
  • Abstractions leak. Accept that and deal with it.
  • It is better to mean well than to feel good. Programs have to do what we need them to do.

Creating a language is an act of abstraction. But then, so is all of programming. Creating a language specific to distributed systems is a way to make very clear what matters in the domain and to provide both helpful syntax and clear, reliable semantics.

Alvaro admits that this answer hides the real reason that he creates new languages:

Inventing languages is dope.

At the end of this talk, I understood its title, "I See What You Mean", better than I did before it started. The unintended double entendre made me smile. This talk showed how language interacts with problems in all areas of computing, the power language gives us as well as the limits it imposes. Alvaro delivered a most excellent keynote address and opened StrangeLoop on a high note.

Check out the full talk to learn about all of this in much greater detail, with the many flourishes of Alvaro's story-telling.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 15, 2015 8:18 AM

Perfection Is Not A Pre-Requisite To Accomplishing Something Impressive

In Not Your Typical Role Model, mathematician Hannah Fry tells us some of what she learned about Ada Lovelace, "the 19th century programmer", while making a film about her. Not all of it was complimentary. She concludes:

Ada was very, very far from perfect, but perfection is not a pre-requisite to accomplishing something impressive. Our science role models shouldn't always be there to celebrate the unachievable.

A lot of accomplished men of science were far from perfect role models, too. In the past, we've often been guilty of covering up bad behavior to protect our heroes. These days, we sometimes rush to judge them. Neither inclination is healthy.

By historical standards, it sounds like Lovelace's imperfections were all too ordinary. She was human, like us all. Lovelace thought some amazing things and wrote them down for us. Let's celebrate that.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

September 29, 2015 4:01 PM

StrangeLoop: Pixie, big-bang, and a Mix Tape

The second day of StrangeLoop was as eventful as the first. Let me share notes on a few more of the talks that I enjoyed.

the opening screen for the Pixie talk

In the morning, I learned more about Pixie, a small, lightweight version of Lisp at the talk by Timothy Baldridge, its creator. Why another Lisp, especially when its creator already knows and enjoys Clojure? Because he wanted to explore ideas he had come across over the years. Sometimes, there is no better way to do that than to create your own language. This is what programmers do.

Pixie uses Clojure-like syntax and semantics, but it diverges enough from the Clojure spec that he didn't want to be limited by calling it a variant of Clojure. "Lisp" is a generic term these days and doesn't carry as much specific baggage.

Baldridge used RPython and the PyPy tool chain to create Pixie, which has its own virtual machine and its own bytecode format. It also has garbage collection. (Implementors of RPython-based languages don't have to write their own GC; they write hints to a GC generator, which layers the GC code into the generated C of the interpreter.) Pixie also offers strong interoperation with C, which makes it possible to speed up hot spots in a program even further.

For me, the most interesting part of Pixie is its tracing just-in-time compiler. A just-in-time compiler, or "JIT", generates target code at run time, when a specific program provides more information to the translator than just the language grammar. A tracing JIT records frequently executed operations, especially in and around loops, in order to get the information the code generator needs to produce its output. Tracing JITs are an attractive idea for implementing a Lisp, in which programs tend to consist of many small functions. A tracing JIT can dive deep through all those calls and generate small, tight code.
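
To make that idea concrete, here is a toy sketch in Python of the "find the hot loop, record what it does" half of a tracing JIT. It bears no resemblance to PyPy's actual machinery, and the two-instruction bytecode is invented for the example; a real JIT would go on to compile the recorded trace to machine code.

    # Toy flavor of a tracing JIT: count hits on a loop header; once it is "hot",
    # record the operations executed until control returns to the header.

    HOT_THRESHOLD = 3

    def interpret(program):
        hits, traces, recording, trace = {}, {}, None, []
        pc, acc = 0, 0
        while pc < len(program):
            op, arg = program[pc]
            if recording is not None:
                trace.append((op, arg))
            if op == "add":
                acc += arg
                pc += 1
            elif op == "jump_if_small":    # loop: jump back to arg while acc < 100
                target = arg
                if acc < 100:
                    hits[target] = hits.get(target, 0) + 1
                    if hits[target] == HOT_THRESHOLD and recording is None:
                        recording, trace = target, []       # header is hot: start recording
                    elif recording == target:               # closed the loop: trace is complete
                        traces[target] = list(trace)
                        recording, trace = None, []
                    pc = target
                else:
                    pc += 1
        return acc, traces

    program = [("add", 7), ("jump_if_small", 0)]
    print(interpret(program))    # -> (105, {0: [('add', 7), ('jump_if_small', 0)]})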

the opening screen for the 'mix tape' talk

Rather than give a detailed talk about a specific language, David Nolen and Michael Bernstein gave a talk they dubbed "a mix tape to lovers of programming languages". They presented an incomplete and very personal look at the history of programming languages, and at each point on the timeline they offered artists whose songs reflected a similar development in the music world. Most of the music was out of the mainstream, where connoisseurs such as Nolen and Bernstein find and appreciate hidden gems. Some of it sounded odd to a modern ear, and at one point Bernstein gently assured the audience, "It's okay to laugh."

The talk even taught me one bit of programming history that I didn't know before. Apparently, John Backus got his start by trying to make a hi-fi stereo system for listening to music! This got him into radios and electronics, and then into programming computers at the hardware level. Bernstein quoted Backus as saying, "I figured there had to be a better way." This adds a new dimension to something I wrote on the occasion of Backus's passing. Backus eventually found himself writing programs to compute missile trajectories on the IBM 701 mainframe. "Much of my work has come from being lazy," Backus said, so "I started work on a programming system to make it easier to write programs." The result was Fortran.

Nolen and Bernstein introduced me to some great old songs, as well as several new artists. Songs by jazz pianist Cecil Taylor and jazz guitarist Sonny Sharrock made particular impressions on me, and I plan to track down more of their work.

it all started with a really big-bang

Matthias Felleisen combined history and the details of a specific system in his talk about big-bang, the most recent development in a 20-year journey to find ways to integrate programming with mathematical subjects in a way that helps students learn both topics better. Over that time, he and the PLT team have created a sequence of Racket libraries and languages that enable students to begin with middle-school pre-algebra and progress in smooth steps all the way up to college-level software engineering. He argued near the end of his talk that the progression does not stop there, as extension of big-bang has led to bona fide CS research into network calculus.

All of this programming is done in the functional style. This is an essential constraint if we wish to help students learn and do real math. Felleisen declared boldly "I don't care about programming per se" when it comes to programming in the schools. Even students who never write another program again should have learned something valuable in the content area.

The meat of the talk demonstrated how big-bang makes it possible for students to create interactive, graphical programs using nothing but algebraic expressions. I can't really do justice to Matthias's story or storytelling here, so you should probably watch the video. I can say, though, that the story he tells here meshes neatly with The Racket Way as part of a holistic vision of computing unlike most anything you will find in the computing world. It's impressive both in the scope of its goals and in the scope of the tools it has produced.

More notes on StrangeLoop soon.

~~~~

PHOTO. I took both photos above from my seats in the audience at StrangeLoop. Please pardon my unsteady hand. CC BY-SA.


Posted by Eugene Wallingford | Permalink | Categories: Computing

September 27, 2015 6:56 PM

StrangeLoop is in the Books

a plaque outside the St. Louis Public Library

The conference came and went far too quickly, with ideas enough for many more days. As always, Alex Miller and his team put on a first-class program with the special touches and the vibe that make me want to come back every year.

Most of the talks are already online. I will be writing up my thoughts on some of the talks that touched me deepest in separate entries over the next few days. For now, let me share notes on a few other talks that I really enjoyed.

Carin Meier talked about her tinkering with the ideas of chemical computing, in which we view molecules and reactions as a form of computation. In her experiments, Meier encoded numbers and operations as molecules, put them in a context in which they could react with one another, and then visualized the results. This sort of computation may seem rather inefficient next to a more direct algorithm, but it may give us a way to let programs discover ideas by letting simple concepts wander around and bump into one another. This talk reminded me of AM and Eurisko, AI programs from the late 1970s which always fascinated me. I plan to try Meier's ideas out in code.

Jan Paul Posma gave us a cool look at some Javascript tools for visualizing program execution. His goal is to make it possible to shift from ordinary debugging, which follows something like the scientific method to uncover hidden errors and causes, to "omniscient debugging", in which everything we need to understand how our code runs is present in the system. Posma's code and demos reminded me of Bret Victor's work, such as learnable programming.

Caitie McCaffrey's talk on building scalable, stateful services and Camille Fournier's talk on hope and fear in distributed system design taught me a little about a part of the computing world I don't know much about. Both emphasized the importance of making trade-offs among competing goals and forces. McCaffrey's talk had a more academic feel, with references to techniques such as distributed hash tables with nondeterministic placement, whereas Fournier took a higher-level look at how context drives the balance between scale and fault tolerance. From each talk I took at least one take-home lesson for me and my students:

  • McCaffrey asked, "Should you read research papers?" and immediately answered "Yes." A lot of the ideas we need today appear in the database literature of the '60s, '70s, and '80s. Study!
  • Fournier says that people understand asynchrony and changing data better than software designers seem to think. If we take care of the things that matter most to them, such as not charging their credit cards more than once, they will understand the other effects of asynchrony as simply one of the costs of living in a world that gives them amazing possibilities.

Fournier did a wonderful job stepping in to give the Saturday keynote address on short notice. She was lively, energetic, and humorous -- just what the large audience needed after a long day of talks and a long night of talking and carousing. Her command of the room was impressive.

More notes soon.

~~~~

PHOTO. One of the plaques on the outside wall of the St. Louis Public Library, just a couple of blocks from the Peabody Opera House and StrangeLoop. Eugene Wallingford, 2015. Available under a CC BY-SA license.


Posted by Eugene Wallingford | Permalink | Categories: Computing

September 24, 2015 9:04 PM

Off to StrangeLoop

StrangeLoop 2010 logo

StrangeLoop 2015 starts tomorrow, and after a year's hiatus, I'm back. The pre-conference workshops were today, and I wish I could have been here in time for the Future of Programming workshop. Alas, I have a day job and had to teach class before hitting the road. My students knew I was eager to get away and bid me a quick goodbye as soon as we wrapped up our discussion of table-driven parsing. (They may also have been eager to finish up the scanners for their compiler project...)

As always, the conference line-up consists of strong speakers and intriguing talks throughout. Tomorrow, I'm looking forward to talks by Philip Wadler and Gary Bernhardt. Wadler is Wadler, and if anyone can shed new light in 2015 on the 'types versus unit tests' conflagration and make it fun, it's probably Bernhardt.

On Saturday, my attention is honed in on David Nolen's and Michael Bernstein's A History of Programming Languages for 2 Voices. I've been a big fan of their respective work for years, swooning on Twitter and reading their blogs and papers, and now I can see them in person. I doubt I'll be able to get close, though; they'll probably be swamped by groupies. Immediately after that talk, Matthias Felleisen is giving a talk on Racket's big-bang, showing how we can use pure functional programming to teach algebra to middle school students and fold the network into the programming language.

Saturday was to begin with a keynote by Kathy Sierra, whom I last saw many years ago at OOPSLA. I'm sad that she won't be able to attend after all, but I know that Camille Fournier's talk about hopelessness and confidence in distributed systems design will be an excellent lead-off talk for the day.

I do plan one change for this StrangeLoop: my laptop will stay in its shoulder bag during all of the talks. I'm going old school, with pen and a notebook in hand. My mind listens differently when I write notes by hand, and I have to be more frugal in the notes I take. I'm also hoping to feel a little less stress. No need to blog in real time. No need to google every paper the speakers mention. No temptation to check email and do a little work. StrangeLoop will have my full attention.

The last time I came to StrangeLoop, I read Raymond Queneau's charming and occasionally disorienting "Exercises in Style", in preparation for Crista Lopes's talk about her exercises in programming style. Neither the book nor talk disappointed. This year, I am reading The Little Prince -- for the first time, if you can believe it. I wonder if any of this year's talks draw their inspiration from Saint-Exupéry? At StrangeLoop, you can never rule that kind of connection out.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

September 22, 2015 2:57 PM

"Good Character" as an Instance of Postel's Law

Mike Feathers draws an analogy I'd never thought of before in The Universality of Postel's Law: what we think of as "good character" can be thought of as an application of Postel's Law to ordinary human relations.

Societies often have the notion of 'good character'. We can attempt all sorts of definitions but at its core, isn't good character just having tolerance for the foibles of others and being a person people can count on? Accepting wider variation at input and producing less variation at output? In systems terms that puts more work on the people who have that quality -- they have to have enough control to avoid 'going off' on people when others 'go off on them', but they get the benefit of being someone people want to connect with. I argue that those same dynamics occur in physical systems and software systems that have the Postel property.

These days, most people talk about Postel's Law as a social law, and criticisms of it even in software design refer to it as creating moral hazards for designers. But Postel coined this "principle of robustness" as a way to talk about implementing TCP, and most references I see to it now relate to HTML and web browsers. I think it's pretty cool when a software design principle applies more broadly in the design world, or can even be useful for understanding human behavior far removed from computing. That's the sign of a valuable pattern -- or anti-pattern.
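
For the record, here is a small sketch in Python of the principle in its original software setting: be liberal about the date formats you accept on input, and strict about the single format you emit. The particular formats are just an example.

    from datetime import datetime

    # Be liberal in what you accept...
    ACCEPTED_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d %b %Y", "%B %d, %Y"]

    def parse_date(text):
        text = text.strip()
        for fmt in ACCEPTED_FORMATS:
            try:
                return datetime.strptime(text, fmt)
            except ValueError:
                continue
        raise ValueError(f"unrecognized date: {text!r}")

    # ... and be conservative in what you send.
    def emit_date(d):
        return d.strftime("%Y-%m-%d")    # always ISO 8601 on the way out

    for raw in ["2015-09-22", "9/22/2015", " 22 Sep 2015 ", "September 22, 2015"]:
        print(emit_date(parse_date(raw)))    # prints 2015-09-22 four times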


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Patterns, Software Development

September 19, 2015 11:56 AM

Software Gets Easier to Consume Faster Than It Gets Easier to Make

In What Is the Business of Literature?, Richard Nash tells a story about how the ideas underlying writing, books, and publishing have evolved over the centuries, shaped by the desires of both creators and merchants. One of the key points is that technological innovation has generally had a far greater effect on the ability to consume literature than on the ability to create it.

But books are just one example of this phenomenon. It is, in fact, a pattern:

For the most part, however, the technical and business-model innovations in literature were one-sided, far better at supplying the means to read a book than to write one. ...

... This was by no means unique to books. The world has also become better at allowing people to buy a desk than to make a desk. In fact, from medieval to modern times, it has become easier to buy food than to make it; to buy clothes than to make them; to obtain legal advice than to know the law; to receive medical care than to actually stitch a wound.

One of the neat things about the last twenty years has been the relatively rapid increase in the ability for ordinary people to write and disseminate creative works. But an imbalance remains.

Over a shorter time scale, this one-sidedness has been true of software as well. The fifty or sixty years of the Software Era have given us seismic changes in the availability, ubiquity, and backgrounding of software. People often overuse the word 'revolution', but these changes really have had an immense effect in how and when almost everyone uses software in their lives.

Yet creating software remains relatively difficult. The evolution of our tools for writing programs hasn't kept pace with the evolution in platforms for using them. Neither has the growth in our knowledge of how to make great software.

There is, of course, a movement these days to teach more people how to program and to support other people who want to learn on their own. I think it's wonderful to open doors so that more people have the opportunity to make things. I'm curious to see if the current momentum bears fruit or is merely a fad in a world that goes through fashions faster than we can comprehend them. It's easier still to toss out a fashion that turns out to require a fair bit of work.

Writing software is still a challenge. Our technologies have not changed that fact. But this is also true, as Nash reminds us, of writing books, making furniture, and a host of other creative activities. He also reminds us that there is hope:

What we see again and again in our society is that people do not need to be encouraged to create, only that businesses want methods by which they can minimize the risk of investing in the creation.

The urge to make things is there. Give people the resources they need -- tools, knowledge, and, most of all, time -- and they will create. Maybe one of the new programmers can help us make better tools for making software, or lead us to new knowledge.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Patterns, Software Development

September 11, 2015 3:55 PM

Search, Abstractions, and Big Epistemological Questions

Andy Soltis is an American grandmaster who writes a monthly column for Chess Life called "Chess to Enjoy". He has also written several good books, both recreational and educational. In his August 2015 column, Soltis talks about a couple of odd ways in which computers interact with humans in the chess world, ways that raise bigger questions about teaching and the nature of knowledge.

As most people know, computer programs -- even commodity programs one can buy at the store -- now play chess better than the best human players. Less than twenty years ago, Deep Blue first defeated world champion Garry Kasparov in a single game. A year later, Deep Blue defeated Kasparov in a closely contested six-game match. By 2005, computers were crushing Top Ten players with regularity. These days, world champion Magnus Carlsen is no match for his chess computer.

a position in which humans see the win, but computers don't

Yet there are still moments where humans shine through. Soltis opens with a story in which two GMs were playing a game the computers thought Black was winning, when suddenly Black resigned. Surprised journalists asked the winner, GM Vassily Ivanchuk, what had happened. It was easy, he said: it only looked like Black was winning. Well beyond the computers' search limits, it was White that had a textbook win.

How could the human players see this? Were they searching deeper than the computers? No. They understood the position at a higher level, using abstractions such as "being in the square" and passed pawns splitting a King like "pants". (We chessplayers are an odd lot.)

When you can define 'flexibility' in 12 bits,
it will go into the program.

Attempts to program computers to play chess using such abstract ideas did not work all that well. Concepts like king safety and piece activity proved difficult to implement in code, but eventually found their way into the programs. More abstract concepts like "flexibility", "initiative", and "harmony" have proven all but impossible to implement. Chess programs got better -- quickly -- when two things happened: (1) programmers began to focus on search, implementing metrics that could be applied rapidly to millions of positions, and (2) computer chips got much, much faster.
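
That recipe is easy to sketch, even though real engines are vastly more sophisticated. Here is a toy negamax in Python over a silly capture-only "game", with a bare material count as its cheap metric. It is nothing like a real chess engine -- no alpha-beta pruning, no real move generation -- but it shows the shape of search plus a fast, simple evaluation.

    # Toy "capture-only chess": a position is (my pieces, their pieces), and a
    # move simply captures one enemy piece. Silly as a game, but it shows the
    # recipe: search a few plies deep, score leaves with a cheap material count.

    PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}

    def material(mine, theirs):
        return sum(PIECE_VALUES[p] for p in mine) - sum(PIECE_VALUES[p] for p in theirs)

    def negamax(mine, theirs, depth):
        if depth == 0 or not theirs:
            return material(mine, theirs), None
        best_score, best_move = float("-inf"), None
        for i, victim in enumerate(theirs):
            remaining = theirs[:i] + theirs[i+1:]
            # after my capture it is their turn: swap sides and negate the score
            score, _ = negamax(remaining, mine, depth - 1)
            score = -score
            if score > best_score:
                best_score, best_move = score, f"capture {victim}"
        return best_score, best_move

    print(negamax(("R", "P"), ("Q", "N", "P"), depth=3))    # -> (0, 'capture Q')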

Pawn Structure Chess, by Andy Soltis

The result is that chess programs can beat us by seeing farther down the tree of possibilities than we do. They make moves that surprise us, puzzle us, and even offend our sense of beauty: "Fischer or Tal would have played this move; it is much more elegant." But they win, easily -- except when they don't. Then we explain why, using ideas that express an understanding of the game that even the best chessplaying computers don't seem to have.

This points out one of the odd ways computers relate to us in the world of chess. Chess computers crush us all, including grandmasters, using moves we wouldn't make and many of us do not understand. But good chessplayers do understand why moves are good or bad, once they figure it out. As Soltis says:

And we can put the explanation in words. This is why chess teaching is changing in the computer age. A good coach has to be a good translator. His students can get their machine to tell them the best move in any position, but they need words to make sense of it.

Teaching computer science at the university is affected by a similar phenomenon. My students can find on the web code samples to solve any problem they have, but they don't always understand them. This problem existed in the age of the book, too, but the web makes available so much material, often undifferentiated and unexplained, so, so quickly.

The inverse of computers making good moves we don't understand brings with it another oddity, one that plays to a different side of our egos. When a chess computer loses -- gasp! -- or fails to understand why a human-selected move is better than the moves it recommends, we explain it using words that make sense of the human's move. These are, of course, the same words and concepts that fail us most of the time when we are looking for a move to beat the infernal machine. Confirmation bias lives on.

Soltis doesn't stop here, though. He realizes that this strange split raises a deeper question:

Maybe it's one that only philosophers care about, but I'll ask it anyway:

Are concepts like "flexibility" real? Or are they just artificial constructs, created by and suitable only for feeble, carbon-based minds?

(Philosophers are not the only ones who care. I do. But then, the epistemology course I took in grad school remains one of my two favorite courses ever. The second was cognitive psychology.)

Aristotle

We can implement some of our ideas about chess in programs, and those ideas have helped us create machines we can no longer defeat over the board. But maybe some of our concepts are simply fictions, "just so" stories we tell ourselves when we feel the need to understand something we really don't. I don't think so, but the pragmatist in me keeps pushing for better evidence.

Back when I did research in artificial intelligence, I always chafed at the idea of neural networks. They seemed to be a fine model of how our brains worked at the lowest level, but the results they gave did not satisfy me. I couldn't ask them "why?" and receive an answer at the conceptual level at which we humans seem to live. I could not have a conversation with them in words that helped me understand their solutions, or their failures.

Now we live in a world of "deep learning", in which Google Translate can do a dandy job of translating a foreign phrase for me but never tell me why it is right, or explain the subtleties of choosing one word instead of another. Add more data, and it translates even better. But I still want the sort of explanation that Ivanchuk gave about his win or the sort of story Soltis can tell about why a computer program only drew a game because it saddled itself with inflexible pawn structure.

Perhaps we have reached the limits of my rationality. More likely, though, is that we will keep pushing forward, bringing more human concepts and abstractions within the bounds of what programs can represent, do, and say. Researchers like Douglas Hofstadter continue the search, and I'm glad. There are still plenty of important questions to ask about the nature of knowledge, and computer science is right in the middle of asking and answering them.

~~~~

IMAGE 1. The critical position in Ivanchuk-Jobava, Wijk aan Zee 2015, the game to which Soltis refers in his story. Source: Chess Life, August 2015, Page 17.

IMAGE 2. The cover of Andy Soltis's classic Pawn Structure Chess. Source: the book's page at Amazon.com.

IMAGE 3. A bust of Aristotle, who confronted Plato's ideas about the nature of ideals. Source: Classical Wisdom Weekly.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

September 03, 2015 3:26 PM

Compilers and the Universal Machine

I saw Eric Normand's The Most Important Idea in Computer Science a few days ago and enjoyed it. I almost always enjoy watching a programmer have fun writing a little interpreter and then share that fun with others.

In class this week, my students and I spent a few minutes playing with T-diagrams to illustrate techniques for porting, bootstrapping, and optimizing compilers, and Normand's post came to mind. So I threw a little purple prose into my classroom comments.

All these examples of building compilers by feeding programs for new compilers into old compilers ultimately depend on a single big idea from the theory of computer science: that a certain kind of machine can simulate anything -- including itself. As a result, this certain kind of machine, the Turing machine, is the very definition of computability. But this big idea also means that, whatever problem we want to solve with information, we can solve it with a program. No additional hardware needed. We can emulate any hardware we might need, new or old, in software.

This is truly one of the most important ideas in computer science. But it's also an idea that changes how we approach problems in nearly every other discipline. Philosophically, it was a monumental achievement in humanity's ongoing quest to understand the universe and our place in it.

In this course, you will learn some of the intricacies of writing programs that simulate and translate other programs. At times, that will be a challenge. When you are deep in the trenches some night, trying to find an elusive error in your code, keep the big idea in mind. Perhaps it will comfort you.
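
Stepping back out of the purple prose: a minimal demonstration of the point, in Python, is an emulator for a made-up three-instruction register machine. The instruction set is hypothetical, but the moral is the one above -- whatever hardware we need, we can simulate in software.

    # A made-up three-instruction register machine, emulated in software.
    # Instructions: ("set", reg, value), ("add", dst, src), ("jnz", reg, target).

    def run(program, registers=None):
        regs = dict(registers or {})
        pc = 0
        while 0 <= pc < len(program):
            op, a, b = program[pc]
            if op == "set":
                regs[a] = b
                pc += 1
            elif op == "add":
                regs[a] = regs.get(a, 0) + regs.get(b, 0)
                pc += 1
            elif op == "jnz":    # jump to instruction b if register a is nonzero
                pc = b if regs.get(a, 0) != 0 else pc + 1
            else:
                raise ValueError(f"unknown instruction: {op}")
        return regs

    # Multiply 6 * 7 by repeated addition on the emulated machine.
    program = [
        ("set", "acc", 0), ("set", "i", 7), ("set", "neg1", -1),
        ("add", "acc", "x"),    # acc += x
        ("add", "i", "neg1"),   # i -= 1
        ("jnz", "i", 3),        # loop back while i != 0
    ]
    print(run(program, {"x": 6})["acc"])    # 42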

Oh, and I am teaching my compilers course again after a two-year break. Yay!


Posted by Eugene Wallingford | Permalink | Categories: Computing

August 23, 2015 10:12 AM

Science Students Should Learn How to Program, and Do Research

Physicist, science blogger, and pop science author Chad Orzel offered some advice for prospective science students in a post on his Forbes blog last week. Among other things, he suggests that science students learn to program. Orzel is among many physics profs who integrate computer simulations into their introductory courses, using the Matter and Interactions curriculum (which you may recall reading about here in a post from 2007).

I like the way Orzel explains the approach to his students:

When we start doing programming, I tell students that this matters because there are only about a dozen problems in physics that you can readily solve exactly with pencil and paper, and many of them are not that interesting. And that goes double, maybe triple for engineering, where you can't get away with the simplifying spherical-cow approximations we're so fond of in physics. Any really interesting problem in any technical field is going to require some numerical simulation, and the sooner you learn to do that, the better.

This advice complements Astrachan's Law and its variants, which assert that we should not ask students to write a program if they can do the task by hand. Conversely, if they can't solve their problems by hand, then they should get comfortable writing programs that can. (Actually, that's the contrapositive of Astrachan, but "contrapositively" doesn't sound as good.) Programming is a medium for scientists, just as math is, and it becomes more important as they try to solve more challenging problems.
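
Here is the sort of problem Orzel has in mind, as a small sketch in Python: projectile motion with air drag has no tidy closed-form solution, but a few lines of Euler integration produce a numerical answer. The numbers are invented for the example.

    import math

    # Projectile with quadratic air drag: no neat pencil-and-paper solution,
    # but a simple Euler step marches the state forward through time.

    g, k, dt = 9.81, 0.02, 0.001    # gravity, drag coefficient, time step
    x, y = 0.0, 0.0
    vx = 40.0 * math.cos(math.radians(45))
    vy = 40.0 * math.sin(math.radians(45))

    while y >= 0.0:
        speed = math.hypot(vx, vy)
        ax, ay = -k * speed * vx, -g - k * speed * vy
        x, y = x + vx * dt, y + vy * dt
        vx, vy = vx + ax * dt, vy + ay * dt

    print(f"range with drag: {x:.1f} m")    # well short of the drag-free 163 m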

Orzel and Astrachan both know that the best way to learn to program is to have a problem you need a computer to solve. Curricula such as Matter and Interactions draw on this motivation and integrate computing directly into science courses. This is good news for us in computer science. Some of the students who learn how to program in their science courses find that they like it and want to learn more. We have just the courses they need to go deeper.

I concur with all five of Orzel's suggestions for prospective science students. They apply as well to computer science students as to those interested in the physical sciences. When I meet with prospective CS students and their families, I emphasize especially that students should get involved in research. Here is Orzel's take:

While you might think you love science based on your experience in classes, classwork is a pale imitation of actual science. One of my colleagues at Williams used a phrase that I love, and quote all the time, saying that "the hardest thing to teach new research students is that this is not a three-hour lab."

CS students can get involved in empirical research, but they also have the ability to write their own programs to explore their own ideas and interests. The world of open source software enables them to engage the discipline in ways that preceding generations could only have dreamed of. By doing empirical CS research with a professor or working on substantial programs that have users other than the creators, students can find out what computer science is really about -- and find out what they want to devote their lives to.

As Orzel points out, this is one of the ways in which small colleges are great for science students: undergrads can more readily become involved in research with their professors. This advantage extends to smaller public universities, too. In the past year, we have had undergrads do some challenging work on bioinformatics algorithms, motion virtual manipulatives, and system security. These students are having a qualitatively different learning experience than students who are only taking courses, and it is an experience that is open to all undergrad students in CS and the other sciences here.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

August 12, 2015 10:09 AM

Graphic Art: Links in Jewish Literature

"Genesis 1:1 is the Kevin Bacon of Sefaria."

This morning I finally read Sefaria in Gephi: Seeing Links in Jewish Literature, which had been in my reading list for a few months. In it, Liz Shayne introduces a collaborative project to visualize the relationships among 100,000+ sections of Jewish literature encoded in Sefaria, an online library of Jewish texts. It's a cool project, and the blog entries about it remind us how beautiful visualizations of graphs can be. I love this basic image, in which nodes represent sections of text, color indicates the type of text, and size corresponds to the degree of the node:

a graph of relationships in the Sefaria

This is suitable for framing and would make a fine piece of art on my office wall.

Images like this can help us to understand a large dataset at a high level more easily than simply looking at the data themselves. Of course, creating the image requires some initial understanding, too. There is a give-and-take between analyzing the data and visualizing it that mutually reinforces our understanding.
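
For anyone who wants to play with images like this, here is a small sketch in Python using networkx and matplotlib, sizing each node by its degree the way the Sefaria image does. The graph is a built-in stand-in rather than the Sefaria data, and I color by degree as well, since my stand-in has no text types.

    import matplotlib.pyplot as plt
    import networkx as nx

    # Stand-in graph; the Sefaria project builds its graph from links between texts.
    G = nx.karate_club_graph()

    degrees = dict(G.degree())
    pos = nx.spring_layout(G, seed=42)    # force-directed layout, as in Gephi

    nx.draw_networkx_edges(G, pos, alpha=0.2)
    nx.draw_networkx_nodes(G, pos,
                           node_size=[20 * degrees[n] for n in G.nodes()],
                           node_color=[degrees[n] for n in G.nodes()],
                           cmap=plt.cm.viridis)
    plt.axis("off")
    plt.savefig("graph.png", dpi=300)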

As I mentioned in a December 2004 post, sometimes a computer scientist can produce a beautiful picture without intending to. One of my grad students, Nate Labelle, studied package dependencies in Linux as part of a project on power laws and open-source software. He created this image that shows the dependencies among one hundred randomly selected packages:

Linux package dependencies as art

Unlike the neat concentric Sefaria image above, Nate's image has a messy asymmetry that reflects the more decentralized nature of the Linux ecosystem. It evokes for me a line drawing of a book whose pages are being riffled. After all these years, I still think it's an attractive image.

I have not read the rest of the Sefaria blog series, but peeking ahead I saw a neat example in Sefaria III: Comparative Graphing that shows the evolution of the crowd-sourced Sefaria dataset over the course of four months:

evolution of the Sefaria dataset over time

These images look almost like a time-lapse photograph of a supernova exploding (video). They are pretty as art, and perhaps instructive about how the Sefaria community operates.

The Ludic Analytics site has links to two additional entries for the project [ II | IV ], but the latest is dated the end of 2014. I hope that Shayne or others involved with the project write more about their use of visualizations to understand the growing dataset. If nothing else, they may create more art for my walls.


Posted by Eugene Wallingford | Permalink | Categories: Computing

July 27, 2015 2:23 PM

The Flip Side to "Programming for All"

a thin volume of William Blake

We all hear the common refrain these days that more people should learn to program, not just CS majors. I agree. If you know how to program, you can make things. Even if you don't write many programs yourself, you are better prepared to talk to the programmers who make things for you. And even if you don't need to talk to programmers, you have expanded your mind a bit to a way of thinking that is changing the world we live in.

But there are two sides to this equation, as Chris Crawford laments in his essay, Fundamentals of Interactivity:

Why is it that our entertainment software has such primitive algorithms in it? The answer lies in the people creating them. The majority are programmers. Programmers aren't really idea people; they're technical people. Yes, they use their brains a great deal in their jobs. But they don't live in the world of ideas. Scan a programmer's bookshelf and you'll find mostly technical manuals plus a handful of science fiction novels. That's about the extent of their reading habits. Ask a programmer about Rabelais, Vivaldi, Boethius, Mendel, Voltaire, Churchill, or Van Gogh, and you'll draw a blank. Gene pools? Grimm's Law? Gresham's Law? Negentropy? Fluxions? The mind-body problem? Most programmers cannot be troubled with such trivia. So how can we expect them to have interesting ideas to put into their algorithms? The result is unsurprising: the algorithms in most entertainment products are boring, predictable, uninformed, and pedestrian. They're about as interesting in conversation as the programmers themselves.

We do have some idea people working on interactive entertainment; more of them show up in multimedia than in games. Unfortunately, most of the idea people can't program. They refuse to learn the technology well enough to express themselves in the language of the medium. I don't understand this cruel joke that Fate has played upon the industry: programmers have no ideas and idea people can't program. Arg!

My office bookshelf occasionally elicits a comment or two from first-time visitors, because even here at work I have a complete works of Shakespeare, a thin volume of William Blake (I love me some Blake!), several philosophy books, and "The Britannica Book of Usage". I really should have some Voltaire here, too. I do cover one of Crawford's bases: a recent blog entry made a software analogy to Gresham's Law.

In general, I think you're more likely to find a computer scientist who knows some literature than you are to find a literary professional who knows much CS. That's partly an artifact of our school system and partly a result of the wider range historically of literature and the humanities. It's fun to run into a colleague from across campus who has read deeply in some area of science or math, but rare.

However, we are all prone to fall into the chasm of our own specialties and miss out on the well-roundedness that makes us better at whatever specialty we practice. That's one reason that, when high school students and their parents ask me what students should take to prepare for a CS major, I tell them: four years of all the major subjects, including English, math, science, social science, and the arts; plus whatever else interests them, because that's often where they will learn the most. All of these topics help students to become better computer scientists, and better people.

And, not surprisingly, better game developers. I agree with Crawford that more programmers should learn enough other stuff to be idea people, too. Even if they don't make games.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

July 26, 2015 10:03 AM

A Couple of Passages on Disintermediation

"Disintermediation" is just a fancy word for getting other people out of the space between the people who create things and the people who read or listen to those things.

1. In What If Authors Were Paid Every Time Someone Turned a Page?, Peter Wayner writes:

One latter-day Medici posted a review of my (short) book on Amazon complaining that even 99 cents was too expensive for what was just a "blog post". I've often wondered if he was writing that comment in a Starbucks, sipping a $6 cup of coffee that took two minutes to prepare.

Even in the flatter world of ebooks, Amazon has the power to shape the interactions of creators and consumers and to influence strongly who makes money and what kind of books we read.

2. Late last year, Steve Albini spoke on the surprisingly sturdy state of the music industry:

So there's no reason to insist that other obsolete bureaux and offices of the lapsed era be brought along into the new one. The music industry has shrunk. In shrinking it has rung out the middle, leaving the bands and the audiences to work out their relationship from the ends. I see this as both healthy and exciting. If we've learned anything over the past 30 years it's that left to its own devices bands and their audiences can get along fine: the bands can figure out how to get their music out in front of an audience and the audience will figure out how to reward them.

Most of the authors and bands who aren't making a lot of money these days weren't making a lot of money -- or any money at all -- in the old days, either. They had few effective ways to distribute their writings or their music.

Yes, there are still people in between bands and their fans, and writers and their readers, but Albini reminds us how much things have improved for creators and audiences alike. I especially like his takedown of the common lament, "We need to figure out how to make this work for everyone." That sentence has always struck me as the reactionary sentiment of middlemen who no longer control the space between creators and audiences and thus no longer get their cut of the transaction.

I still think often about what this means for universities. We need to figure out how to make this internet thing work for everyone...


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 24, 2015 2:07 PM

Sentences of the Day

Three sentences stood out from the pages of my morning reading. The first two form an interesting dual around power and responsibility.

The Power to Name Things

Among the many privileges of the center, for example, is the power to name things, one of the greatest powers of all.

Costica Bradatan writes this in Change Comes From the Margins, a piece on social change. We programmers know quite well the power of good names, and thus the privilege we have in being able to create them and the responsibility we have to do that well.

The Avoidance of Power as Irresponsibility

Everyone's sure that speech acts and cultural work have power but no one wants to use power in a sustained way to create and make, because to have power persistently, in even a small measure, is to surrender the ability to shine a virtuous light on one's own perfected exclusion from power.

This sentence comes from the heart of Timothy Burke's All Grasshoppers, No Ants, his piece on one of the conditions he thinks ails our society as a whole. Burke's essay is almost an elaboration of Teddy Roosevelt's well-known dismissal of critics, but with an insightful expression of how and why rootless critics damage society as a whole.

Our Impotence in the Face of Depression

Our theories about mental health are often little better than Phlogiston and Ether for the mind.

Quinn Norton gives us this sentence in Descent, a personally-revealing piece about her ongoing struggle with depression. Like many of you, I have watched friends and loved ones fight this battle, which demonstrates all too readily the huge personal costs of civilization's being in such an early stage of understanding this disease, its causes, and its effective treatment.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 21, 2015 3:02 PM

'Send' Is The Universal Verb

In the mid-1980s, Ray Ozzie left IBM with the idea of creating an all-in-one software platform for business collaboration, based on his experience using the group messaging system in the seminal computer-assisted instruction system Plato. Ozzie's idea eventually became Lotus Notes. This platform lives on today in an IBM product, but it never had the effect that Ozzie envisioned for it.

In Office, Messaging, and Verbs, Benedict Evans tells us that Ozzie's idea is alive and well and finally taking over the world -- in the form of Facebook:

But today, Facebook's platform on the desktop is pretty much Ray Ozzie's vision built all over again but for consumers instead of enterprise and for cat pictures instead of sales forecasts -- a combination of messaging with embedded applications and many different data types and views for different tasks.

"Office, Messaging, and Verbs" is an engaging essay about how collaborative work and the tools we use to do it co-evolve, changing each other in turn. You need a keyboard to do the task at hand... But is the task at hand your job, or is it merely the way you do your job today? The answer depends on where you are on the arc of evolution.

Alas, most days I need to create or consume a spreadsheet or two. Spreadsheets are not my job, but they are the way people in universities and most other corporate entities do too many of their jobs these days. So, like Jack Lemmon in The Apartment, I compute my cell's function and pass it along to the next person in line.

I'm ready for us to evolve further down the curve.

~~~~

Note: I added the Oxford comma to Evans's original title. I never apologize for inserting an Oxford comma.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 20, 2015 2:59 PM

Rethinking Accounting Software and Interfaces in the 1980s

In Magic Ink: Information Software and the Graphical Interface, Bret Victor reminds us that the dominant style of user interface today was created long before today's computers:

First, our current UI paradigm was invented in a different technological era. The initial Macintosh, for example, had no network, no mass storage, and little inter-program communication. Thus, it knew little of its environment beyond the date and time, and memory was too precious to record significant history. Interaction was all it had, so that's what its designers used. And because the computer didn't have much to inform anyone of, most of the software at the time was manipulation software -- magic versions of the typewriter, easel, and ledger-book. Twenty years and an internet explosion later, software has much more to say, but an inadequate language with which to say it.

William McCarthy, creator of the REA model of accounting

Victor's mention of the accounting ledger brings to mind the work being done since the early 1980s by Bill McCarthy, an accounting professor at Michigan State. McCarthy is motivated by a similar set of circumstances. The techniques by which we do financial accounting were created long before computers came along, and the constraints that made them necessary no longer exist. But he is looking deeper than simply the interaction style of accounting software; he is interested in upending the underlying model of accounting data.

McCarthy proposed the resources, events, agents (REA) model -- essentially an application of database theory from CS -- as an alternative to traditional accounting systems. REA takes advantage of databases and other computing ideas to create a more accurate model of a business and its activity. It eliminates many of the artifacts of double-entry bookkeeping, including debits, credits, and placeholder accounts such as accounts receivable and payable, because they can be generated in real time from more fine-grained source data. An REA model of a business enables a much wider range of decision support than the traditional accounting model while still allowing the firm to produce all the artifacts of traditional accounting as a side effect.
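
Here is a tiny sketch in Python of the flavor of the idea -- my own toy illustration, not McCarthy's formal model. We store the economic events themselves and compute a balance such as accounts receivable on demand, instead of maintaining it as a ledger account.

    from dataclasses import dataclass

    # REA flavor: store fine-grained economic events; derive "accounts" on demand.

    @dataclass
    class Sale:             # an economic event: a resource flows to a customer
        customer: str
        amount: float

    @dataclass
    class CashReceipt:      # the dual event: cash flows back from the customer
        customer: str
        amount: float

    events = [Sale("acme", 500.0), Sale("zenith", 250.0), CashReceipt("acme", 200.0)]

    def receivable(events, customer):
        # accounts receivable is not stored anywhere; it is computed from the events
        billed = sum(e.amount for e in events
                     if isinstance(e, Sale) and e.customer == customer)
        received = sum(e.amount for e in events
                       if isinstance(e, CashReceipt) and e.customer == customer)
        return billed - received

    print(receivable(events, "acme"))      # 300.0
    print(receivable(events, "zenith"))    # 250.0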

(I had the good fortune to work with McCarthy during my graduate studies and even helped author a conference paper on the development of expert systems from REA models. He also served on my dissertation committee.)

In the early years, many academic accountants reacted with skepticism to the idea of REA. They feared losing the integrity of the traditional accounting model, which carried a concomitant risk to the trust placed by the public in audited financial statements. Most of these concerns were operational, not theoretical. However, a few people viewed REA as somehow dissing the system that had served the profession so well for so long.

Victor includes a footnote in Magic Ink that anticipates a similar concern from interaction designers to his proposals:

Make no mistake, I revere GUI pioneers such as Alan Kay and Bill Atkinson, but they were inventing rules for a different game. Today, their windows and menus are like buggy whips on a car. (Although Alan Kay clearly foresaw today's technological environment, even in the mid-'70s. See "A Simple Vision of the Future" in his fascinating Early History of Smalltalk (1993).)

"They were inventing rules for a different game." This sentence echoes how I have always felt about Luca Pacioli, the inventor of double-entry bookkeeping. It was a remarkable technology that helped to enable the growth of modern commerce by creating a transparent system of accounting that could be trusted by insiders and outsiders alike. But he was inventing rules for a different game -- 500 years ago. Half a century dwarfs the forty or fifty year life of windows, icons, menus, and pointing and clicking.

I sometimes wonder what might have happened if I had pursued McCarthy's line of work more deeply. It dovetails quite nicely with software patterns and would have been well-positioned for the more recent re-thinking of financial support software in the era of ubiquitous mobile computing. So many interesting paths...


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 13, 2015 2:51 PM

Thinking in Code

A conversation this morning with a student reminded me of a story one of our alumni, a local entrepreneur, told me about his usual practice whenever he has an idea for a new system or a new feature for an existing system.

The alum starts by jotting the idea down in Java, Scala, or some other programming language. He puts this sketch into a git repository and uses the readme.md file to document his thought process. He also records there links to related systems, links to papers on implementation techniques, and any other resources he thinks might be handy. The code itself can be at varying levels of completeness. He allows himself to work out some of the intermediate steps in enough detail to make code work, while leaving other parts as skeletons.

This approach helps him talk to technical customers about the idea. The sketch shows what the idea might look like at a high level, perhaps with some of the intermediate steps running in some useful way. The initial draft helps him identify key development issues and maybe even a reasonable first estimate for how long it would take to flesh out a complete implementation. By writing code and making some of it work, the entrepreneur in him begins to see where the opportunities for business value lie.

If he decides that the idea is worth a deeper look, he passes the idea on to members of his team in the form of his git repo. The readme.md file includes links to relevant reading and his initial thoughts about the system and its design. The code conveys ideas more clearly and compactly than a natural language description would. Even if his team decides to use none of the code -- and he expects they won't -- they start from something more expressive than a plain text document.

This isn't quite a prototype or a spike, but it has the same spirit. The code sketch is another variation on how programming is a medium for expressing ideas in a way that other media can't fully capture.
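
A hypothetical fragment of such a sketch, in Python rather than Java or Scala: one path worked out far enough to run, the rest left as labeled skeletons for the team to fill in. The problem and all the names are invented for illustration.

    # sketch.py -- idea: flag unusually large orders for review (invented example)
    # readme.md would hold links to related systems and papers.

    def score_order(order):
        # worked out far enough to run: a crude z-score against recent order sizes
        history = order.get("recent_amounts", [])
        if len(history) < 2:
            return 0.0
        mean = sum(history) / len(history)
        var = sum((a - mean) ** 2 for a in history) / (len(history) - 1)
        return (order["amount"] - mean) / (var ** 0.5 or 1.0)

    def fetch_recent_orders(customer_id):
        # skeleton: the real version would query the order service
        raise NotImplementedError("talk to the data team about the orders API")

    def notify_reviewer(order, score):
        # skeleton: email? dashboard? decide after customer conversations
        raise NotImplementedError

    if __name__ == "__main__":
        demo = {"amount": 950.0, "recent_amounts": [100.0, 120.0, 90.0, 110.0]}
        print(score_order(demo))    # a big positive score -> worth a reviewer's glance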


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 29, 2015 1:58 PM

Bridging the Gap Between Learning and Doing

a sketch of bridging the gap

I recently learned about the work of Amelia McNamara via this paper published as Research Memo M-2014-002 by the Viewpoints Research Institute. McNamara is attacking an important problem: the gap between programming tools for beginners and programming tools for practitioners. In Future of Statistical Programming, she writes:

The basic idea is that there's a gap between the tools we use for teaching/learning statistics, and the tools we use for doing statistics. Worse than that, there's no trajectory to make the connection between the tools for learning statistics and the tools for doing statistics. I think that learners of statistics should also be doers of statistics. So, a tool for statistical programming should be able to step learners from learning statistics and statistical programming to truly doing data analysis.

"Learners of statistics should also be doers of statistics." -- yes, indeed. We see the same gap in computer science. People who are learning to program are programmers. They are just working at a different level of abstraction and complexity. It's always a bit awkward, and often misleading, when we give novice programmers a different set of tools than we give professionals. Then we face a new learning barrier when we ask them to move up to professional tools.

That doesn't mean that we should turn students loose unprotected in the wilds of C++, but it does require that we have a pedagogically sound trajectory for making the connection between novice languages and tools and those used by more advanced programmers.

It also doesn't mean that we can simply choose a professional language that is in some ways suitable for beginners, such as Python, and not think any more about the gap. My recent experience reminds me that there is still a lot of complexity to help our students deal with.

McNamara's Ph.D. dissertation explored some of the ways to bridge this gap in the realm of statistics. It starts from the position that the gap should not exist and suggests ways to bridge it, via both better curricula and better tools.

Whenever I experience this gap in my teaching or see researchers trying to make it go away, I think back to Alan Kay's early vision for Smalltalk. One of the central tenets of the Smalltalk agenda was to create a language flexible and rich enough that it could accompany the beginner as he or she grew in knowledge and skill, opening up to a new level each time the learner was ready for something more powerful. Just as a kindergartener learns the same English language used by Shakespeare and Joyce, a beginning programmer might learn the same language as Knuth and Steele, one that opens up to a new level each time the learner is ready.

We in CS haven't done an especially good job at this over the years. Matthias Felleisen and the How to Design Programs crew have made perhaps the most successful effort thus far. (See *SL, Not Racket for a short note on the idea.) But this project has not made a lot of headway yet in CS education. Perhaps projects such as McNamara's can help make inroads for domain-specific programmers. Alan Kay may harbor a similar hope; he served as a member of McNamara's Ph.D. committee.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 12, 2015 2:39 PM

A Cool Example of Turning Data into Program: TempleOS

Hyperlinks that point to and execute code, rather than transferring us to a data file:

In a file from the TempleOS source code, one line contains the passage "Several other routines include a ...", where the "other routines" part is a hyperlink. Unlike in HTML, where that ... may lead to a page listing those other routines, here a DolDoc macro is used so that a grep is actually performed when you click on it. While the HTML version could become stale if no-one updated it, this is always up-to-date.

This comes from Richard Milton's A Constructive Look At TempleOS, which highlights some of the unusual features of an OS I had never heard of until I ran across his article. As I read it, I thought of Alan Kay's assertion that a real programming language should eliminate the need to have an operating system at all. The language should give programmers access to whatever they need to access and marshal the resources of the computer. Smalltalk is a language that aspired to this goal. Today, the best example of this idea is probably Racket, which continues to put more of the underlying system into the hands of programmers via the language itself. That is an essential element of the Racket Way.

TempleOS comes at this idea from the other side, as an operating system that puts as much computing as it can in the hands of the user. This includes programming, in the form of HolyC, a homegrown variant of C. TempleOS is written in HolyC, but HolyC is also the scripting language of the system's REPL. It's odd to talk about programming TempleOS at all, though. As Milton points out, like Xerox Alto, Oberon, and Plan 9, TempleOS "blurs the lines between programs and documents". Writing a program is like creating a document of any other sort, and creating a document of any sort is a form of programming.
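
The flavor of the idea is easy to sketch. Here is a rough illustration in Python -- my sketch, not HolyC and not how DolDoc actually implements its macros -- of a document whose link is code that performs a live search whenever it is followed, rather than a pointer to a page someone wrote by hand:

    import subprocess

    # A "live link": following it runs grep over the sources right now,
    # so the answer can never go stale the way a hand-written page can.
    # (Assumes a Unix-like system with grep on the path.)
    def other_routines_link(pattern='Routine', path='.'):
        result = subprocess.run(['grep', '-rn', pattern, path],
                                capture_output=True, text=True)
        return result.stdout

    document = {
        'text': 'Several other routines include a ...',
        'link': other_routines_link      # code, not a reference to static data
    }

    print(document['text'])
    print(document['link']())            # always reflects the current sources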

Trading data for code creates a different kind of barrier for new users of TempleOS. It also pays dividends by injecting a tempting sort of dynamism into the system.

In any case, programmers of a certain age will feel a kinship with the kind of experience that TempleOS seeks to provide. We grew up in an age when every computer was an open laboratory, just waiting for us to explore it at every level. TempleOS has the feel -- and, perhaps unfortunately, the look -- of the 1970s and 1980s.

Hurray for crazy little operating systems like TempleOS. Maybe we can learn something useful from them. That's how the world of programming languages works, too. If not, the creator can have a lot of fun making a new world, and the rest of us can share in the fun vicariously.


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 04, 2015 2:33 PM

If the Web is the Medium, What is the Message?

How's this for a first draft:

History may only be a list of surprises, but you sure as heck don't want to lose the list.

That's part of the message in Bret Victor's second 'Web of Alexandria' post. He puts it in starker terms:

To forget the past is to destroy the future. This is where Dark Ages come from.

Those two posts followed a sobering observation:

60% of my fav links from 10 yrs ago are 404. I wonder if Library of Congress expects 60% of their collection to go up in smoke every decade.

But it's worse than that, Victor tells us in his follow-up. As his tweet notes, the web has turned out to be unreliable as a publication medium. We publish items because we want them to persist in the public record, but they rarely persist for very long. However, the web has turned out to be a pernicious conversational medium as well. We want certain items shared on the web to be ephemeral, yet often those items are the ones that last forever. At one time, this may have seemed like only an annoyance, but now we know it to be dangerous.

The problem isn't that the web is a bad medium. In one sense, the web isn't really a medium at all; it's an infrastructure that enables us to create new kinds of media with historically uncharacteristic ease. The problem is that we are using web-based media for many different purposes, without understanding how each medium determines "the social and temporal scope of its messages".

The same day I read Victor's blog post, I saw this old Vonnegut quote fly by on Twitter:

History is merely a list of surprises. ... It can only prepare us to be surprised yet again.

Alas, on the web, history appears to be a list of cat pictures and Tumblr memes, with all the important surprises deleted when the author changed internet service providers.

In a grand cosmic coincidence, on the same day I read Victor's blog post and saw the Vonnegut quote fly by, I also read a passage from Marshall McLuhan in a Farnam Street post. It ends:

The modern world abridges all historical times as readily as it reduces space. Everywhere and every age have become here and now. History has been abolished by our new media.

The internet certainly amplifies the scale of McLuhan's worry, but the web has created a unique form of erasure. I'm sure McLuhan would join Victor in etching an item on history's list of surprises:

Protect the past.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

June 02, 2015 1:46 PM

"I Just Need a Programmer", Screenplay Edition

Noted TV writer, director, producer, and blogger Ken Levine takes on a frequently-asked question in the latest edition of his "Friday Questions" feature:

I have a great idea for a movie, but I'm not a writer, I'm not in show biz, and I don't live in New York or LA. What do I do with this great idea? (And I'm sure you've never heard this question before, right?)

Levine is gentle in response:

This question does come up frequently. I wish I had a more optimistic answer. But the truth is execution is more valued than ideas. ...

Is there any domain where this isn't true? Yet professionals in every domain seem to receive this question all the time. I certainly receive the "I just need a programmer..." phone call or e-mail every month. If I went to cocktail parties, maybe I'd hear it at them, too.

The bigger the gap between idea and product, the more valuable execution becomes relative to having the idea. For many app ideas, executing the idea is not all that far beyond the reach of many people. Learn a little Objective-C, and away you go. In three or four years, you'll be set! By comparison, writing a screenplay that anyone in Hollywood will look at (let alone turn into a blockbuster film) seems like Mount Everest.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

May 29, 2015 11:20 AM

Fulfill God's Plan. Write a Computer Program.

In his entry for The Harvard Guide to Influential Books, psychologist Jerome Kagan recommends the novel The Eternal Smile by Pär Lagerkvist. He focuses his recommendation on a single sentence:

After an interminably long search, a large group of dead people find God and the leader steps forward and asks him what purpose he had in creating human beings. God replies, "I only intended that you need never be content with nothing."

Kagan sees this sentence as capturing a thematic idea about the historical conditions that shape humanity's conception of morality. He is probably right; he's a deeply read and highly respected scholar.

When I read it, though, I thought about how lucky I am that I know how to program. When you can write a computer program, you never need to be content with the status quo in any situation that involves information and a problem to solve. You can write a program and reshape a little part of the world.

So, in a way, computer programming is a part of how humanity achieves its destiny in the universe. I hope that isn't too much hubris for a Friday morning.


Posted by Eugene Wallingford | Permalink | Categories: Computing

May 09, 2015 9:28 AM

A Few Thoughts on Graduation Day

Today is graduation day for the Class of 2015 at my university. CS students head out into the world, most with a job in hand or nearly so, ready to apply their hard-earned knowledge and skills to all variety of problems. It's an exciting time for them.

This week also brought two other events that have me thinking about the world in which my students will live and the ways in which we have prepared them. First, on Thursday, the Technology Association of Iowa organized a #TechTownHall on campus, where the discussion centered on creating and retaining a pool of educated people to participate in, and help grow, the local tech sector. I'm a little concerned that the TAI blog says that "A major topic was curriculum and preparing students to provide immediate value to technology employers upon graduation." That's not what universities do best. But then, that is often what employers want and need.

Second, over the last two mornings, I read James Fallows's classic The Case Against Credentialism, from the archives of The Atlantic. Fallows gives a detailed account of the "professionalization" of many lines of work in the US and the role that credentials, most prominently university degrees, have played in the movement. He concludes that our current approach is biased heavily toward evaluating the "inputs" to the system, such as early success in school and other demonstrations of talent while young, rather than assessing the outputs, namely, how well people actually perform after earning their credentials.

Two passages toward the end stood out for me. In one, Fallows wonders if our professionalized society creates the wrong kind of incentives for young people:

An entrepreneurial society is like a game of draw poker; you take a lot of chances, because you're rarely dealt a pat hand and you never know exactly what you have to beat. A professionalized society is more like blackjack, and getting a degree is like being dealt nineteen. You could try for more, but why?

Keep in mind that this article appeared in 1985. Entrepreneurship has taken a much bigger share of the public conversation since then, especially in the tech world. Still, most students graduating from college these days are likely thinking of ways to convert their nineteens into steady careers, not ways to risk it all on the next Amazon or Uber.

Then this quote from "Steven Ballmer, a twenty-nine-year-old vice-president of Microsoft", on how the company looked for new employees:

We go to colleges not so much because we give a damn about the credential but because it's hard to find other places where you have large concentrations of smart people and somebody will arrange the interviews for you. But we also have a lot of walk-on talent. We're looking for programming talent, and the degree is in no way, shape, or form very important. We ask them to send us a program they've written that they're proud of. One of our superstars here is a guy who literally walked in off the street. We talked him out of going to college and he's been here ever since.

Who would have guessed in 1985 the visibility and impact that Ballmer would have over the next twenty years? Microsoft has since evolved from the entrepreneurial upstart to the staid behemoth, and now is trying to reposition itself as an important player in the new world of start-ups and mobile technology.

Attentive readers of this blog may recall that I fantasize occasionally about throwing off the shackles of the modern university, which grow more restrictive every year as the university takes on more of the attributes of corporate and government bureaucracy. In one of my fantasies, I organize a new kind of preparatory school for prospective software developers, one with a more modern view of learning to program but also an attention to developing the whole person. That might not satisfy corporate America's need for credentials, but it may well prepare students better for a world that needs poker players as much as it needs blackjack players. But where would the students come from?

So, on a cloudy graduation day, I think about Fallows's suggestion that more focused vocational training is what many grads need, about the real value of a liberal university education to both students and society, and about how we can best prepare CS students to participate in the world. It is a world that needs not only their technical skills but also their understanding of what tech can and cannot do. As a society, we need them to take a prominent role in civic and political discourse.

One final note on the Fallows piece. It is long, dragging a bit in the middle like a college research paper, but it opens and closes strongly. With a little skimming through parts of less interest, it is worth a read. Thanks to Brian Marick for the recommendation.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

April 30, 2015 6:00 PM

Software is a Means of Communication, Just Like a Research Paper

I can't let my previous post be my only comment on Software in Scientific Research. Hinsen's bigger point is worth a post of its own.

Software is a means of communication, just like papers or textbooks.

... much like the math that appears in a paper or a textbook -- except that, done properly, a computer program runs and provides a dynamic demonstration of an idea.

The main questions asked about scientific software [qua software] are "What does it do?" and "How efficient is it?" When considering software as a means of communication, we would ask questions such as "Is it well-written, clear, elegant?", "How general is the formulation?", or "Can I use it as the basis for developing new science?".

This shift requires a different level of understanding of programs and programming than many scientists (and other people who do not program for a living) have. But it is a shift that needs to take place, so we should do all we can to help scientists and others become more fluent. (Hey to Software Carpentry and like-minded efforts.)

We take for granted that all researchers are responsible for being able to produce and, more importantly, understand the other essential parts of scientific communication:

We actually accept as normal that the scientific contents of software, i.e., the models implemented by it, are understandable only to software specialists, meaning that for the majority of users, the software is just a black box. Could you imagine this for a paper? "This paper is very obscure, but the people who wrote it are very smart, so let's trust them and base our research on their conclusions." Did you ever hear such a claim? Not me.

This is a big part of the challenge we face in getting faculty across the university to see the vital role that computing should play in modern education -- as well as the roles it should not play. The same is true in the broader culture. We'll see if efforts such as code.org can make a dent in this challenge.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

April 29, 2015 1:52 PM

Beautiful Sentences: Scientific Data as Program

On the way to making a larger point about the role of software in scientific research, Konrad Hinsen writes these beautiful sentences:

Software is just data that can be interpreted as instructions for a computer. One could conceivably write some interpreter that turns previously generated data into software by executing it.

They express one side of one of the great ideas of computer science, the duality of program and data:

  • Every program is data to some other program, and
  • every set of data is a program to some machine.

This is one of the reasons why it is so important for CS students to study the principles of programming languages, create languages, and build interpreters. These activities help bring this great idea to life and prepare those who understand it to solve problems in ways that are otherwise hard to imagine.
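
A minimal sketch makes the duality concrete. The few lines of Python below -- my toy example, not anything from Hinsen's post -- treat an ordinary list of tuples as instructions for a tiny stack machine. The list is data until we hand it to an interpreter; then it is a program.

    # A tiny stack machine: a list of tuples is "just data" until an
    # interpreter treats it as instructions.
    def run(program):
        stack = []
        for op, *args in program:
            if op == 'push':
                stack.append(args[0])
            elif op == 'add':
                b, a = stack.pop(), stack.pop()
                stack.append(a + b)
            elif op == 'mul':
                b, a = stack.pop(), stack.pop()
                stack.append(a * b)
        return stack.pop()

    data = [('push', 2), ('push', 3), ('add',), ('push', 4), ('mul',)]
    print(run(data))     # the data, run as a program: (2 + 3) * 4 = 20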

Besides, the duality is a thing of beauty. We don't have to use it as a tool in order to appreciate this sublime truth.

As Hinsen writes, few people outside of computer science (and, sadly, too many within CS) appreciate "the particular status of software as both tool and information carrier". The same might be said for our appreciation of data, and the role that language plays in bridging the gap between the two.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

April 20, 2015 4:02 PM

"Disjunctive Inference" and Learning to Program

Over the weekend, I read Hypothetical Reasoning and Failures of Disjunctive Inference, a well-sourced article on the problems people have making disjunctive inferences. It made me think about some of the challenges students have learning to program.

Disjunctive inference is reasoning that requires us to consider hypotheticals. A simple example from the article is "the married problem":

Jack is looking at Ann, but Ann is looking at George. Jack is married, but George is not. Is a married person looking at an unmarried person?
  1. Yes.
  2. No.
  3. Cannot be determined.

The answer is yes, of course, which is obvious if we consider the two possible cases for Ann. Most people, though, stop thinking as soon as they realize that the answer hinges on Ann's status. They don't know her status, so they can't know the answer to the question. Even so, most everyone understands the answer as soon as the reasoning is explained to them.
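
The case analysis is mechanical enough that we can let a program do it for us. Here is a quick sketch in Python that simply enumerates both possibilities for Ann:

    # Enumerate the two possible cases for Ann's marital status.
    looking_at = [('Jack', 'Ann'), ('Ann', 'George')]
    married = {'Jack': True, 'George': False}

    for ann_is_married in (True, False):
        married['Ann'] = ann_is_married
        answer = any(married[a] and not married[b] for a, b in looking_at)
        print('Ann married:', ann_is_married, '->', answer)   # True either way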

The reasons behind our difficulties handling disjunctive inferences are complex, including both general difficulties we have with hypotheticals and a cognitive bias sometimes called cognitive miserliness: we seek to apply the minimum amount of effort to solving problems and making decisions. This is a reasonable evolutionary bias in many circumstances, but here it is maladaptive.

The article is fascinating and well worth a full read. It points to a number of studies in cognitive psychology that seek to understand how humans behave in the face of disjunctive inferences, and why. It closes with some thoughts on improving disjunctive reasoning ability, though there are no quick fixes.

As I read the article, it occurred to me that learning to program places our students in a near-constant state of hypothetical reasoning and disjunctive inference. Tracing code that contains an if statement asks them to think about alternative paths and alternative outcomes. To understand what is true after the if statement executes is disjunctive inference.
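
A concrete example, of my own invention rather than from any particular textbook: what do we know about x after the if statement below? Answering requires reasoning over both branches, which is exactly the disjunctive step that trips people up.

    def clamp(x, limit=100):
        if x > limit:
            x = limit
        # What is true here?  Either the branch ran and x == limit,
        # or it did not and x was already <= limit.  In both cases: x <= limit.
        return x

    print(clamp(250))   # 100
    print(clamp(42))    # 42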

Something similar may be true for a for loop, which executes once each for multiple values of a counter, and a while loop, which runs an indeterminate number of times. These aren't disjunctive inferences, but they do require students to think hypothetically. I wonder if the trouble many of my intro CS students had last semester learning function calls involved failures of hypothetical reasoning as much as difficulties with generalization.

And think about learning to debug a program.... How much of that process involves hypotheticals and even full-on disjunctive inference? If most people have trouble with this sort of reasoning even on simple tasks, imagine how much harder it must be for young people who are learning a programming language for the first time and trying to reason about programs that are much more complex than "the married problem".

Thinking explicitly about this flaw in human thinking may help us teachers do a better job helping students to learn. In the short term, we can help them by giving more direct prompts for how to reason. Perhaps we can also help them learn to prompt themselves when faced with certain kinds of problems. In the longer term, we can perhaps help them to develop a process for solving problems that mitigates the bias. This is all about forming useful habits of thought.

If nothing else, reading this article will help me be slower to judge my students' work ethic. What looks like laziness is more likely a manifestation of a natural bias to exert the minimum amount of effort when solving problems. We are all cognitive misers to a certain extent, and that serves us well. But not always when we are writing and debugging programs.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 30, 2015 3:33 PM

Reminiscing on the Effects of Photoshop

Thomas Knoll, one of the creators of Adobe Photoshop, reminisces on the insight that gave rise to the program. His brother, John, worked on analog image composition at Industrial Light and Magic, where they had just begun to experiment with digital processing.

[ILM] had a scanner that could scan in frames from a movie, digitally process them, and then write the images out to film again.

My brother saw that and had a revelation. He said, "If we convert the movie footage into numbers, and we can convert the numbers back into movie footage, then once it's in the numerical form we could do anything to it. We'd have complete power."

I bought my first copy of Photoshop in the summer of 1992, as part of my start-up package for new faculty. In addition to the hardware and software I needed to do my knowledge-based systems research, we also outfitted the lab with a number of other tools, including Aldus Persuasion, a LaCie digital scanner, OmniPage Pro software for OCR, Adobe Premiere, and Adobe Photoshop. I felt like I could do anything I wanted with text, images, and video. It was a great power.

In truth, I barely scratched the surface of what was possible. Others took Photoshop and went places that even Adobe didn't expect them to go. The Knoll brothers sensed what was possible, but it must have been quite something to watch professionals and amateurs alike use the program to reinvent our relationship with images. Here is Thomas Knoll again:

Photoshop has so many features that make it extremely versatile, and there are artists in the world who do things with it that are incredible. I suppose that's the nature of writing a versatile tool with some low-level features that you can combine with anything and everything else.

Digital representation opens new doors for manipulation. When you give users control at both the highest levels and the lowest, who knows what they will do. Stand back and wait.


Posted by Eugene Wallingford | Permalink | Categories: Computing

March 13, 2015 3:07 PM

Two Forms of Irrelevance

When companies become irrelevant to consumers.
From The Power of Marginal, by Paul Graham:

The big media companies shouldn't worry that people will post their copyrighted material on YouTube. They should worry that people will post their own stuff on YouTube, and audiences will watch that instead.

You mean Grey's Anatomy is still on the air? (Or, as today's teenagers say, "Grey's what?")

When people become irrelevant to intelligent machines.
From Outing A.I.: Beyond the Turing Test, by Benjamin Bratton:

I argue that we should abandon the conceit that a "true" Artificial Intelligence must care deeply about humanity -- us specifically -- as its focus and motivation. Perhaps what we really fear, even more than a Big Machine that wants to kill us, is one that sees us as irrelevant. Worse than being seen as an enemy is not being seen at all.

Our new computer overlords indeed. This calls for a different sort of preparation than studying lists of presidents and state capitals.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

March 04, 2015 3:28 PM

Code as a Form of Expression, Even Spreadsheets

Even formulas in spreadsheets, even back in the early 1980s:

Spreadsheet models have become a form of expression, and the very act of creating them seems to yield a pleasure unrelated to their utility. Unusual models are duplicated and passed around; these templates are sometimes used by other modelers and sometimes only admired for their elegance.

People love to make and share things. Computation has given us another medium in which to work, and the things people make with it are often very cool.

The above passage comes from Steven Levy's A Spreadsheet Way of Knowledge, which appeared originally in Harper's magazine in November 1984. He re-published it on Medium this week in belated honor of Spreadsheet Day last October 17, which was the 35th anniversary of VisiCalc, "the Apple II program that started it all". It's a great read, both as history and as a look at how new technologies create unexpected benefits and dangers.


Posted by Eugene Wallingford | Permalink | Categories: Computing

February 27, 2015 3:37 PM

Bad Habits and Haphazard Design

With an expressive type system for its teaching
languages, HtDP could avoid this problem to some
extent, but adding such rich types would also take
the fun out of programming.

As we approach the midpoint of the semester, Matthias Felleisen's Turing Is Useless strikes a chord in me. My students have spent the last two months learning a little Racket, a little functional programming, and a little about how to write data-driven recursive programs. Yet bad habits learned in their previous courses, or at least unchecked by what they learned there, have made the task harder for many of them than it needed to be.

The essay's title plays off the Church-Turing thesis, which implies that all general-purpose programming languages have the same computational power. This powerful claim is not good news for students who are learning to program, though:

Pragmatically speaking, the thesis is completely useless at best -- because it provides no guideline whatsoever as to how to construct programs -- and misleading at worst -- because it suggests any program is a good program.

With a Turing-universal language, a clever student can find a way to solve any problem with some program. Even uninspired but persistent students can tinker their way to a program that produces the right answers. Unfortunately, they don't understand that the right answers aren't the point; the right program is. Trolling StackOverflow will get them a program, but too often the students don't understand whether it is a good or bad program in their current situation. It just works.

I have not been as faithful to the HtDP approach this semester as I probably should have been, but I share its desire to help students to design programs systematically. We have looked at design patterns that implement specific strategies, not language features. Each strategy focuses on the definition of the data being processed and the definition of the value being produced. This has great value for me as the instructor, because I can usually see right away why a function isn't working for the student the way he or she intended: they have strayed from the data as defined by the problem.
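
To make the strategy concrete for readers who don't know Racket, here is a small sketch in Python -- my example, not one from the course -- of a function whose shape follows the shape of its data. A list of numbers is either empty or a first number followed by a smaller list of numbers, so the function has one case for each, and there is little room to stray from the problem.

    # Data definition: a list of numbers is either
    #   - empty, or
    #   - a first number followed by a (smaller) list of numbers.
    # The function has one case per case in the data definition.
    def total(numbers):
        if not numbers:                          # empty case
            return 0
        return numbers[0] + total(numbers[1:])   # first + total of the rest

    print(total([4, 8, 15, 16, 23, 42]))   # 108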

This is also of great value to some of my students. They want to learn how to program in a reliable way, and having tools that guide their thinking is more important than finding yet another primitive Racket procedure to try. For others, though, "garage programming" is good enough; they just want to get the job done right now, regardless of which muscles they use. Design is not part of their attitude, and that's a hard habit to break. How use doth breed a habit in a student!

Last semester, I taught intro CS from what Felleisen calls a traditional text. Coupling that experience with my experience so far this semester, I'm thinking a lot these days about how we can help students develop a design-centered attitude at the outset of their undergrad courses. I have several blog entries in draft form about last semester, but one thing that stands out is the extent to which every step in the instruction is driven by the next cool programming construct. Put them all on the table, fiddle around for a while, and you'll make something that works. One conclusion we can draw from the Church-Turing thesis is that this isn't surprising. Unfortunately, odds are any program created this way is not a very good program.

~~~~~

(The sentence near the end that sounds like Shakespeare is. It's from The Two Gentlemen of Verona, with a suitable change in noun.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

February 06, 2015 3:11 PM

What It Feels Like To Do Research

In one sentence:

Unless you tackle a problem that's already solved, which is boring, or one whose solution is clear from the beginning, mostly you are stuck.

This is from Alec Wilkinson's The Pursuit of Beauty, about mathematician Yitang Zhang, who worked a decade on the problem of bounded gaps between prime numbers. As another researcher says in the article,

When you try to prove a theorem, you can almost be totally lost to knowing exactly where you want to go. Often, when you find your way, it happens in a moment, then you live to do it again.

Programmers get used to being stuck, but tackling the twin prime problem is on a different level altogether. The same is true for any deep open question in math or computing.

I strongly recommend Wilkinson's article. It describes what life for untenured mathematicians is like, and how a single researcher can manage to solve an important problem.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

January 28, 2015 3:38 PM

The Relationship Between Coding and Literacy

Many people have been discussing Chris Granger's recent essay Coding is not the New Literacy, and most seem to approve of his argument. Reading it brought to my mind this sentence from Alan Kay in VPRI Memo M-2007-007a, The Real Computer Revolution Hasn't Happened Yet:

Literacy is not just being able to read and write, but being able to deal fluently with the kind of ideas that are important enough to write about and discuss.

Literacy requires both the low-level skills of reading and writing and the higher-order capacity for using them on important ideas.

That is one thing that makes me uneasy about Granger's argument. It is true that teaching people only low-level coding skills won't empower them if they don't know how to use them fluently to build models that matter. But neither will teaching them how to build models without giving them access to the programming skills they need to express their ideas beyond what some tool gives them.

Like Granger, though, I am also uneasy about many of the learn-to-code efforts. Teaching people enough Javascript or Ruby to implement a web site out of the box skips past the critical thinking skills that people need to use computation effectively in their world. They may be "productive" in the short term, but they are also likely to hit a ceiling pretty soon. What then? My guess: they become frustrated and stop coding altogether.

[image: the Scratch logo]

We sometimes do a better job introducing programming to kids, because we use tools that allow students to build models they care about and can understand. In the VPRI memo, Kay describes experiences teaching elementary school students to use eToys to model physical phenomena. In the end, they learn physics and the key ideas underlying calculus. But they also learn the fundamentals of programming, in an environment that opens up into Squeak, a flavor of Smalltalk.

I've seen teachers introduce students to Scratch in a similar way. Scratch is a drag-and-drop programming environment, but it really is an open-ended and lightweight modeling tool. Students can learn low-level coding skills and higher-level thinking skills in tandem.

That is the key to making Granger's idea work in the best way possible. We need to teach people how to think about and build models in a way that naturally evolves into programming. I am reminded of another quote from Alan Kay that I heard back in the 1990s. He reminded us that kindergarteners learn and use the same language that Shakespeare used. It is possible for their fluency in the language to grow to the point where they can comprehend some of the greatest literature ever created -- and, if they possess some of Shakespeare's genius, to write their own great literature. English starts small for children, and as they grow, it grows with them. We should aspire to do the same thing for programming.

[image: the logo for Eve]

Granger reminds us that literacy is really about composition and comprehension. But it doesn't do much good to teach people how to solidify their thoughts so that they can be written if they don't know how to write. You can't teach composition until your students know basic reading and writing.

Maybe we can find a way to teach people how to think in terms of models and how to implement models in programs at the same time, in a language system that grows along with their understanding. Granger's latest project, Eve, may be a step in that direction. There are plenty of steps left for us to take in the direction of languages like Scratch, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 18, 2015 10:26 AM

The Infinite Horizon

In Mathematics, Live: A Conversation with Laura DeMarco and Amie Wilkinson, Amie Wilkinson recounts the pivotal moment when she knew she wanted to be a mathematician. Insecure about her abilities in mathematics, unsure about what she wanted to do for a career, and with no encouragement, she hadn't applied to grad school. So:

I came back home to Chicago, and I got a job as an actuary. I enjoyed my work, but I started to feel like there was a hole in my existence. There was something missing. I realized that suddenly my universe had become finite. Anything I had to learn for this job, I could learn eventually. I could easily see the limits of this job, and I realized that with math there were so many things I could imagine that I would never know. That's why I wanted to go back and do math. I love that feeling of this infinite horizon.

After having written software for an insurance company during the summers before and after my senior year in college, I knew all too well the "hole in my existence" that Wilkinson talks about, the shrinking universe of many industry jobs. I was deeply interested in the ideas I had found in Gödel, Escher, Bach, and in the idea of creating an intelligent machine. There seemed no room for those ideas in the corporate world I saw.

I'm not sure when the thought of graduate school first occurred to me, though. My family was blue collar, and I didn't have much exposure to academia until I got to Ball State University. Most of my friends went out to get jobs, just like Wilkinson. I recall applying for a few jobs myself, but I never took the job search all that seriously.

At least some of the credit belongs to one of my CS professors, Dr. William Brown. Dr. Brown was an old IBM guy who seemed to know so much about how to make computers do things, from the lowest-level details of IBM System/360 assembly language and JCL up to the software engineering principles needed to write systems software. When I asked him about graduate school, he talked to me about how to select a school and a Ph.D. advisor. He also talked about the strengths and weaknesses of my preparation, and let me know that even though I had some work to do, I would be able to succeed.

These days, I am lucky even to have such conversations with my students.

For Wilkinson, DeMarco, and me, academia was a natural next step in our pursuit of the infinite horizon. But I now know that we are fortunate to work in disciplines where a lot of the interesting questions are being asked and answered by people working in "the industry". I watch with admiration as many of my colleagues do amazing things while working for companies large and small. Computer science offers so many opportunities to explore the unknown.

Reading Wilkinson's recollection brought a flood of memories to mind. I'm sure I wasn't alone in smiling at her nod to finite worlds and infinite horizons. We have a lot to be thankful for.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

January 16, 2015 2:59 PM

Programming Language As Artistic Medium

Says Ramsey Nasser:

I have always been fascinated by esolangs. They are such an amazing intersection of technical and formal rigor on one hand and nerdy inside humor on the other. The fact that they are not just ideas, but *actual working languages* is incredible. It's something that could only exist in a field as malleable and accessible as code. NASA engineers cannot build a space station as a joke.

Because we can create programming languages as a joke, or for any other reason, a programming language can be both message and medium.

[image: a "Hello, World" program in Piet]

Esolang is enthusiast shorthand for esoteric programming language. I'm not an enthusiast on par with many, but I've written a few Ook! interpreters and played around with others. Piet is the most visually appealing of the esoteric languages I've encountered. The image to the right is a "Hello, World" program written in Piet, courtesy of the Wikimedia Commons.

Recently I have been reading more about the work of Nasser, a computer scientist and artist formerly at the Eyebeam Art + Technology Center. In 2010, he created the Zajal programming language as his MFA thesis project at the Parsons School of Design. Zajal was inspired by Processing and runs on top of Ruby. A couple of years ago, he received widespread coverage for Qalb, a language with Arabic script characters and a Scheme-like syntax. Zajal enables programmers to write programs with beautiful output; Qalb enables programmers to write programs that are themselves quite beautiful.

I wouldn't call Zajal or Qalb esoteric programming languages. They are, in an important way, quite serious, exploring the boundary between "creative vision" and software. As he says at the close of the interview quoted above, we now live in a world in which "code runs constantly in our pockets":

Code is a driving element of culture and politics, which means that code that is difficult to reason about or inaccessible makes for a culture and politics that are difficult to reason about and inaccessible. The conversation about programming languages has never been more human than it is now, and I believe this kind of work will only become more so as software spreads.

As someone who teaches computer science students to think more deeply about programming languages, I would love to see more and different kinds of people entering the conversation.


Posted by Eugene Wallingford | Permalink | Categories: Computing

January 12, 2015 10:26 AM

WTF Problems and Answers for Questions Unasked

Dan Meyer quotes Scott Farrand in WTF Math Problems:

Anything that makes students ask the question that you plan to answer in the lesson is good, because answering questions that haven't been asked is inherently uninteresting.

My challenge this semester: getting students to ask questions about the programming languages they use and how they work. I myself have many questions about languages! My experience teaching our intro course last semester reminded me that what interests me (and textbook authors) doesn't always interest my students.

If you have any WTF? problems for a programming languages course, please share.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 09, 2015 3:40 PM

Computer Science Everywhere, Military Edition

Military Operations Orders are programs that are executed by units. Code re-use and other software engineering principles applied regularly to these.

An alumnus of my department, a CS major-turned-military officer, wrote those lines in an e-mail responding to my recent post, A Little CS Would Help a Lot of College Grads. Contrary to what many people might imagine, he has found what he learned in computer science to be quite useful to him as an Army captain. And he wasn't even a programmer:

One of the biggest skills I had over my peers was organizing information. I wasn't writing code, but I was handling lots of data and designing systems for that data. Organizing information in a way that was easy to present to my superiors was a breeze and having all the supporting data easily accessible came naturally to me.

Skills and principles from software engineering and project development apply to systems other than software. They also provide a vocabulary for talking about ideas that non-programmers encounter every day:

I did introduce my units to the terms border cases, special cases, and layers of abstraction. I cracked a smile every time I heard those terms used in a meeting.

Excel may not be a "real programming language", but knowing the ways in which it is a language can make managers of people and resources more effective at what they do.

For more about how a CS background has been useful to this officer, check out CS Degree to Army Officer, a blog entry that expands on his experiences.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

December 31, 2014 10:15 AM

Reinventing Education by Reinventing Explanation

One of the more important essays I read in 2014 was Michael Nielsen's Reinventing Explanation. In it, Nielsen explores how we might design media that help us explain scientific ideas better than we are able with our existing tools.

... it's worth taking non-traditional media seriously not just as a vehicle for popularization or education, which is how they are often viewed, but as an opportunity for explanations which can be, in important ways, deeper.

This essay struck me deeply. Nielsen wants us to take what we have learned using non-traditional media to popularize and to educate, and to use it to think about how to explain more deeply. I think that learning how to use non-traditional media to explain more deeply will help us change the way we teach and learn.

In too many cases, new technologies are used merely as substitutes for old technology. The web has led to an explosion of instructional video aimed at all levels of learners. No matter how valuable these videos are, most merely replace reading a textbook or a paper. But computational technology enables us to change the task at hand and even redefine what we do. Alan Kay has been telling this story for decades, pointing us to the work of Ivan Sutherland and many others from the early days of computing.

Nielsen points to Bret Victor as an example of someone trying to develop tools that redefine how we think. As Victor himself says, he is following in the grand tradition of Kay, Sutherland, et al. Victor's An Ill-Advised Personal Note about "Media for Thinking the Unthinkable" is an especially direct telling of his story.

Vi Hart is another. Consider her recent Parable of the Polygons, created with Nicky Case, which explains dynamically how local choices can create systemic bias. This simulation uses computation to help people think differently about an idea they might not understand as viscerally from a traditional explanation. Hart has a long body of work using visualization to explain differently, and the introduction of computing extends the depth of her approach.
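
Parable of the Polygons builds on Thomas Schelling's classic model of segregation, and even a crude sketch of that model -- the one below is mine, far simpler than Hart and Case's explorable -- hints at the dynamic they make visceral: agents with only a mild preference for similar neighbors still tend to sort themselves into clusters.

    import random

    # A tiny Schelling-style model on a line.  An agent is unhappy only if
    # fewer than 1/3 of its nearby neighbors are like itself.
    random.seed(1)
    line = [random.choice('ST') for _ in range(60)]

    def unhappy(i):
        neighbors = line[max(0, i - 2):i] + line[i + 1:i + 3]
        return sum(n == line[i] for n in neighbors) / len(neighbors) < 1/3

    def clustering():
        pairs = list(zip(line, line[1:]))
        return sum(a == b for a, b in pairs) / len(pairs)

    print('before:', ''.join(line), round(clustering(), 2))
    for _ in range(2000):
        movers = [i for i in range(len(line)) if unhappy(i)]
        if not movers:
            break
        i, j = random.choice(movers), random.randrange(len(line))
        line[i], line[j] = line[j], line[i]      # try a new spot...
        if unhappy(j):                           # ...keep it only if content there
            line[i], line[j] = line[j], line[i]
    print('after: ', ''.join(line), round(clustering(), 2))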

Over the last few weeks, I have felt myself being pulled by Nielsen's essay and the example of people such as Victor and Hart to think more about how we might design media that help us to teach and explain scientific ideas more deeply. Reinventing explanation might help us reinvent education in a way that actually matters. I don't have a research agenda yet, but looking again at Victor's work is a start.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 28, 2014 11:12 AM

A Little CS Would Help a Lot of College Grads

I would love to see more CS majors, but not everyone should major in CS. I do think that most university students could benefit from learning a little programming. There are plenty of jobs not only for CS and math grads, but also for other majors who have CS and math skills:

"If you're an anthropology major and you want to get a marketing job, well, guess what? The toughest marketing jobs to fill require SQL skills," Sigelman says. "If you can ... along the peripheries of your academic program accrue some strong quantitative skills, you'll still have the advantage [in the job market]." Likewise, some legal occupations (such as intellectual property law) and maintenance and repair jobs stay open for long periods of time, according to the Brookings report, if they require particular STEM skills.

There is much noise these days about the importance of STEM, both for educated citizens and for jobs, jobs, jobs. STEM isn't an especially cohesive category, though, as the quoted Vox article reminds us, and even when we look just at economic opportunity, it misleads. We don't need more college science graduates from every STEM discipline. We do need more people with the math and CS skills that now pervade the workplace, regardless of discipline. As Kurtzleben says in the article, "... characterizing these skill shortages as a broad STEM crisis is misleading to students, and has distorted the policy debate."


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 27, 2014 8:47 AM

Let's Not Forget: CS 1 Is Hard For Most Students

... software is hard. It's harder than
anything else I've ever had to do.
-- Donald Knuth

As students were leaving my final CS 1 lab session of the semester, I overheard two talking about their future plans. One student mentioned that he was changing his major to actuarial science. I thought, wow, that's a tough major. How is a student who is struggling with basic programming going to succeed there?

When I checked on his grades, though, I found that he was doing fine in my course, about average. I also remembered that he had enjoyed best the programming exercises that computed terms of infinite arithmetic series and other crazy mathematical values that his classmates often found impenetrable. Maybe actuarial science, even with some hard math, will be a good fit for him.

It really shouldn't surprise us that some students try computer science and decide to major in something else, even something that looks hard to most people. Teaching CS 1 again this semester after a long break reminded me just how much we expect from the students in our introductory course:

  • Details. Lots and lots of details. Syntax. Grammar. Vocabulary, both in a programming language and about programming more generally. Tools for writing, editing, compiling, and running programs.

  • Experimentation. Students have to design and execute experiments in order to figure out how language constructs work and to debug the programs they write. Much of what they learn is by trial and error, and most students have not yet developed skills for doing that in a controlled fashion.

  • Design. Students have to decompose problems and combine parts into wholes. They have to name things. They have to connect the names they see with ideas from class, the text, and their own experience.

  • Abstraction. Part of the challenge in design comes from abstraction, but abstract ideas are everywhere in learning about CS and how to program. Variables, choices, loops and recursion, functions and arguments and scope, ... all come not just as concrete forms but also as theoretical notions. These notions can sometimes be connected to the students' experience of the physical world, but the computing ideas are often just different enough to disorient the student. Other CS abstractions are so different as to appear unique.

In a single course, we expect students to perform tasks in all three of these modes, while mastering a heavy load of details. We expect them to learn by deduction, induction, and abduction, covering many abstract ideas and many concrete details. Many disciplines have challenging first courses, but CS 1 requires an unusual breadth of intellectual tools.

Yes, we can improve our students' experience with careful pedagogy. Over the last few decades we've seen many strong efforts. And yes, we can help students through the process with structural support, emotional support, and empathy. In the end, though, we must keep this in mind: CS 1 is going to be a challenge for most students. For many, the rewards will be worth the struggle, but that doesn't mean it won't take work, patience, and persistence along the way -- by both the students and the teachers.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 26, 2014 8:32 AM

Editing and the Illusion of Thought

Martin Amis, in The Paris Review, The Art of Fiction No. 151:

By the way, it's all nonsense about how wonderful computers are because you can shift things around. Nothing compares with the fluidity of longhand. You shift things around without shifting them around--in that you merely indicate a possibility while your original thought is still there. The trouble with a computer is that what you come out with has no memory, no provenance, no history--the little cursor, or whatever it's called, that wobbles around the middle of the screen falsely gives you the impression that you're thinking. Even when you're not.

My immediate reaction was that Mr. Amis needs version control, but there is something more here.

When writing with pencil and paper, we work on an artifact that embodies the changes it has gone through. We see the marks and erasures; we see the sentence where it once was at the same time we see the arrow telling us where it now belongs. When writing in a word processor, our work appears complete, even timeless, though we know it isn't. Mark-up mode lets us see some of the document's evolution, but the changes feel more distant from our minds. They live out there.

I empathize with writers like Amis, whose experience predates the computer. Longhand feels different. Teasing out what was valuable, even essential, in previous experience and what was merely the limitation of our tools is one of the great challenges of any time. How do we make new tools that are worth the change, that enable us to do more and better?


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

December 24, 2014 2:05 PM

Computer Science Everywhere, Christmas Eve Edition

Urmson says Google is better positioned than a traditional automaker to crack the riddle of self-driving, because it's more about software than hardware: "When you look at what we're doing, on the surface, you see a vehicle. But the heart of it is computer science."

That is Chris Urmson, the head of Google's self-driving car program, quoted in this article. (Apparently, senior citizens are a natural market for driverless cars.)

Everywhere we look these days, we see gadgets. Increasingly, though, at the heart of them is computer science.


Posted by Eugene Wallingford | Permalink | Categories: Computing

November 25, 2014 1:43 PM

Concrete Play Trumps All

[image: chess position from Areschenko-Johannessen, Bundesliga 2006-2007]

One of the lessons taught by the computer is that concrete play trumps all.

This comment appeared in the review of a book of chess analysis [ paywalled ]. The reviewer is taking the author to task for talking about the positional factors that give one player "a stable advantage" in a particular position, when a commercially-available chess program shows the other player can equalize easily, and perhaps even gain an advantage.

It is also a fitting comment on our relationship with computers these days more generally. In areas such as search and language translation, Google helped us see that conventional wisdom can often be upended by a lot of data and many processors. In AI, statistical techniques and neural networks solve problems in ways that models of human cognition cannot. Everywhere we turn, it seems, big data and powerful computers are helping us to redefine our understanding of the world.

We humans need not lose all hope, though. There is still room for building models of the world and using them to reason, just as there is room for human analysis of chess games. In chess, computer analysis is pushing grandmasters to think differently about the game. The result is a different kind of understanding for the more ordinary of us, too. We just have to be careful to check our abstract understanding against computer analysis. Concrete play trumps all, and it tests our hypotheses. That's good science, and good thinking.

~~~~

(The chess position is from Areschenko-Johannessen 2006-2007, used as an example in Chess Training for Post-Beginners by Yaroslav Srokovski and cited in John Hartmann's review of the book in the November 2014 issue of Chess Life.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

November 23, 2014 8:50 AM

Supply, Demand, and K-12 CS

When I meet with prospective students and their parents, we often end up discussing why most high schools don't teach computer science. I tell them that, when I started as a new prof here, about a quarter of incoming freshmen had taken a year of programming in high school, and many other students had had the opportunity to do so. My colleagues and I figured that this percentage would go way up, so we began to think about how we might structure our first-year courses when most or all students already knew how to program.

However, the percentage of incoming students with programming experience didn't go up. It went way down. These days, about 10% of our freshmen know how to program when they start our intro course. Many of those learned what they know on their own. What happened, today's parents ask?

A lot of things happened, including the dot-com bubble, a drop in the supply of available teachers, a narrowing of the high school curriculum in many districts, and the introduction of high-stakes testing. I'm not sure how much each contributed to the change, or whether other factors may have played a bigger role. Whatever the causes, the result is that our intro course still expects no previous programming experience.

Yesterday, I saw a post by a K-12 teacher on the Racket users mailing list that illustrates the powerful pull of economics. He is leaving teaching for the software development industry, though reluctantly. "The thing I will miss the most," he says, "is the enjoyment I get out of seeing youngsters' brains come to life." He also loves seeing them succeed in the careers that knowing how to program makes possible. But in that success lies the seed of his own career change:

Speaking of my students working in the field, I simply grew too tired of hearing about their salaries which, with a couple of years experience, was typically twice what I was earning with 25+ years of experience. Ultimately that just became too much to take.

He notes that college professors probably know the feeling, too. The pull must be much stronger on him and his colleagues, though; college CS professors are generally paid much better than K-12 teachers. A love of teaching can go only so far. At one level, we should probably be surprised that anyone who knows how to program well enough to teach thirteen- or seventeen-year-olds to do it stays in the schools. If not surprised, we should at least be deeply appreciative of the people who do.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

November 20, 2014 3:23 PM

When I Procrastinate, I Write Code

I procrastinated one day with my intro students in mind. This is the bedtime story I told them as a result. Yes, I know that I can write shorter Python code to do this. They are intro students, after all.

~~~~~

Once upon a time, a buddy of mine, Chad, sent out a tweet. Chad is a physics prof, and he was procrastinating. How many people would I need to have in class, he wondered, to have a 50-50 chance that my class roster will contain people whose last names start with every letter of the alphabet?

    Adams
    Brown
    Connor
    ...
    Young
    Zielinski

This is a lot like the old trivia about how we only need to have 23 people in the room to have a 50-50 chance that two people share a birthday. The math for calculating that is straightforward enough, once you know it. But last names are much more unevenly distributed across the alphabet than birthdays are across the days of the year. To do this right, we need to know rough percentages for each letter of the alphabet.
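
For the record, the birthday version really is a short calculation. Here is a quick sketch of it before we move on to the harder problem:

    # Probability that n people all have different birthdays:
    #   (365/365) * (364/365) * ... * ((365 - n + 1)/365)
    # The chance of at least one shared birthday first tops 50% at n = 23.
    def prob_shared_birthday(n):
        all_different = 1.0
        for k in range(n):
            all_different *= (365 - k) / 365
        return 1 - all_different

    print(prob_shared_birthday(22))   # about 0.476
    print(prob_shared_birthday(23))   # about 0.507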

I can procrastinate, too. So I surfed over to the US Census Bureau, rummaged around for a while, and finally found a page on Frequently Occurring Surnames from the Census 2000. It provides a little summary information and then links to a couple of data files, including a spreadsheet of data on all surnames that occurred at least 100 times in the 2000 census. This should, I figure, cover enough of the US population to give us a reasonable picture of how peoples' last names are distributed across the alphabet. So I grabbed it.

(We live in a wonderful time. Between open government, open research, and open source projects, we have access to so much cool data!)

The spreadsheet has columns with these headers:

    name,rank,count,prop100k,cum_prop100k,      \
                    pctwhite,pctblack,pctapi,   \
                    pctaian,pct2prace,pcthispanic

The first and third columns are what we want. After thirteen weeks, we know how to compute the percentages we need: use the running total pattern to count the number of people whose name starts with 'a', 'b', ..., 'z', as well as how many people there are altogether. Then loop through our collection of letter counts and compute the percentages.
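
In miniature, the running total pattern is nothing more than an accumulator variable updated inside a loop:

    total = 0
    for value in [4, 8, 15, 16, 23, 42]:
        total += value          # update the running total
    print(total)                # 108

Our problem simply needs several of these totals at once.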

Now, how should we represent the data in our program? We need twenty-six counters for the letter counts, and one more for the overall total. We could make twenty-seven unique variables, but then our program would be so-o-o-o-o-o long, and tedious to write. We can do better.

For the letter counts, we might use a list, where slot 0 holds a's count, slot 1 holds b's count, and so on, through slot 25, which holds z's count. But then we would have to translate letters into slots, and back, which would make our code harder to write. It would also make our data harder to inspect directly.

    ----  ----  ----  ...  ----  ----  ----    slots in the list
     0     1     2    ...   23    24    25      indices into the list

The downside of this approach is that lists are indexed by integer values, while we are working with letters. Python has another kind of data structure that solves just this problem, the dictionary. A dictionary maps keys onto values. The keys and values can be of just about any data type. What we want to do is map letters (characters) onto numbers of people (integers):

    ----  ----  ----  ...  ----  ----  ----    slots in the dictionary
    'a'   'b'   'c'   ...  'x'   'y'   'z'      indices into the dictionary

With this new tool in hand, we are ready to solve our problem. First, we build a dictionary of counters, initialized to 0.

    count_all_names = 0
    total_names = {}
    for letter in 'abcdefghijklmnopqrstuvwxyz':
        total_names[letter] = 0

(Note two bits of syntax here. We use {} for dictionary literals, and we use the familiar [] for accessing entries in the dictionary.)

Next, we loop through the file and update the running total for the corresponding letter, as well as the counter of all names.

    source = open('app_c.csv', 'r')
    source.readline()                    # skip the header row
    for entry in source:
        field  = entry.split(',')        # split the line
        name   = field[0].lower()        # pull out lowercase name
        letter = name[0]                 # grab its first character
        count  = int( field[2] )         # pull out number of people
        total_names[letter] += count     # update letter counter
        count_all_names     += count     # update global counter
    source.close()

Finally, we print the letter → percentage pairs.

    for (letter, count_for_letter) in total_names.items():
        print(letter, '->', count_for_letter/count_all_names)

(Note the items method for dictionaries. It returns a collection of key/value tuples. Recall that tuples are simply immutable lists.)

We have converted the data file into the percentages we need.

    q -> 0.002206197888442366
    c -> 0.07694634659082318
    h -> 0.0726864447688946
    ...
    f -> 0.03450702533438715
    x -> 0.0002412718532764804
    k -> 0.03294646311104032

(The entries are not printed in alphabetical order. Can you find out why?)

I dumped the output to a text file and used Unix's built-in sort to create my final result. I tweeted Chad: "Here are your percentages. You do the math."
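
(As for that shorter Python I mentioned up top: something along these lines would do the whole job, sorted output and all, using a Counter in place of the hand-built dictionary. It hides exactly the patterns intro students need to practice.)

    from collections import Counter

    totals = Counter()
    with open('app_c.csv', 'r') as source:
        next(source)                             # skip the header row
        for entry in source:
            field = entry.split(',')
            totals[field[0][0].lower()] += int(field[2])

    count_all_names = sum(totals.values())
    for letter in sorted(totals):
        print(letter, '->', totals[letter] / count_all_names)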

Hey, I'm a programmer. When I procrastinate, I write code.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

November 11, 2014 7:53 AM

The Internet Era in One Sentence

I just love this:

When a 14-year-old kid can blow up your business in his spare time, not because he hates you but because he loves you, then you have a problem.

Clay Shirky attributes it to Gordy Thompson, who managed internet services at the New York Times in the early 1990s. Back then, it was insightful prognostication; today, it serves as an epitaph for many an old business model.

Are 14-year-old kids making YouTube videos to replace me yet?


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

October 31, 2014 2:52 PM

Ada Lovelace, AI Visionary

We hear a lot about Ada Lovelace being the first computer programmer, but that may not be her most impressive computing first. When I read Steven Johnson's The Tech Innovators of the Victorian Age I learned that she may have been the first modern person to envision the digital computer as a vehicle for an intelligent machine.

Though I have heard about Ada's work with Charles Babbage before, I didn't know any of the details. An engineer had written an essay about the Analytical Engine in Italian, and Lovelace set out to translate it into English. But she also added her own comments to the text as footnotes. It was in a footnote that she recorded "a series of elemental instruction sets that could be used to direct the calculations of the Analytical Engine". When people say Lovelace was the first computer programmer, they are referring to this footnote.

Some people contend that Lovelace did not write this program; rather, Babbage had outlined some procedures and that she refined them. If that is true, then Lovelace and Babbage still conspired on a noteworthy act: they were the first people to collaborate on a program. How fitting that the first computer program was a team effort.

That is only the beginning. Writes Johnson,

But her greatest contribution lay not in writing instruction sets but, rather, in envisioning a range of utility for the machine that Babbage himself had not considered. "Many persons," she wrote, "imagine that because the business of the engine is to give its results in numerical notation, the nature of its processes must consequently be arithmetical and numerical, rather than algebraical and analytical. This is an error. The engine can arrange and combine its numerical quantities exactly as if they were letters or any other general symbols."

Lovelace foresaw the use of computation for symbol manipulation, analytical reasoning, and even the arts:

"Supposing, for instance, that the fundamental relations of pitched sounds in the science of harmony and musical composition were susceptible of such expressions and adaptations, the Engine might compose elaborate and scientific pieces of music of any degree of complexity or extent."

The Analytical Engine could be used to simulate intelligent behavior. Lovelace imagined artificial intelligence.

Johnson calls this perhaps the most visionary footnote in the history of print. That may be a bit over the top, but can you blame him? Most people of the 19th century could hardly conceive of the idea of a programmable computer. By the middle of the 20th century, many people understood that computers could implement arithmetic processes that would change many areas of life. But for most people, the idea of an "intelligent machine" was fantastic, not realistic.

In 1956, a group of visionary scientists organized the Dartmouth conferences to brainstorm from the belief that "every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it". The Darmouth summer project may have been a seminal event in the history of AI. However, over a century earlier, Ada Lovelace saw the potential that a computing machine could partake in language and art. That may have been the first seminal moment in AI history.


Posted by Eugene Wallingford | Permalink | Categories: Computing

October 29, 2014 3:56 PM

Computing Future and Computing Past: tilde.club

Administrative and teaching duties have been keeping me busy of late, but I've enjoyed following along with tilde.club, a throwback, shell-based, Unix community started by Paul Ford and blogged about by him on his ~ford page there.

tilde.club feels like 1986 to me, or maybe 2036. In one sense, it is much less than today's social networks. In many other ways, it is so much more. The spirit of learning and adventure and connecting is more important there than glitzy interfaces, data analytics, and posturing for a public that consists of hundreds of Facebook 'friends' and Twitter 'followers'.

Ford mentions the trade-off in his long Medium article:

It's not like you can build the next Facebook or Twitter or Google on top of a huge number of Internet-connected Linux servers. Sure, Facebook, Twitter, and Google are built on top of a huge number of loosely connected Linux servers. But you know what I mean.

This project brings to mind a recent interview with writer William Gibson, in which he talks about the future and the past. In particular, this passage expresses a refreshingly different idea of what knowledge from the future would be most interesting -- and useful -- today:

If there were somehow a way for me to get one body of knowledge from the future -- one volume of the great shelf of knowledge of a couple of hundred years from now -- I would want to get a history. I would want to get a history book. I would want to know what they think of us.

I often wonder what the future will think of this era of computing, in which we dream too small and set the bar of achievement too low. We can still see the 1960s and 1970s in our rearview mirror, yet the dreams and accomplishments of that era are forgotten by so many people today -- even computer scientists, who rarely ever think about that time at all.

tilde.club is the sort of project that looks backward and yet enables us to look forward. Eliminate as much noise as possible and see what evolves next. I'm curious to see where it goes.


Posted by Eugene Wallingford | Permalink | Categories: Computing

October 17, 2014 3:05 PM

Assorted Quotes

... on how the world evolves.

On the evolution of education in the Age of the Web. Tyler Cowen, in Average Is Over, via The Atlantic:

It will become increasingly apparent how much of current education is driven by human weakness, namely the inability of most students to simply sit down and try to learn something on their own.

I'm curious whether we'll ever see a significant change in the number of students who can and do take the reins for themselves.

On the evolution of the Web. Jon Udell, in A Web of Agreements and Disagreements:

The web works as well as it does because we mostly agree on a set of tools and practices. But it evolves when we disagree, try different approaches, and test them against one another in a marketplace of ideas. Citizens of a web-literate planet should appreciate both the agreements and the disagreements.

Some disagreements are easier to appreciate after they fade into history.

On the evolution of software. Nat Pryce on the Twitter, via The Problematic Culture of "Worse is Better":

Eventually a software project becomes a small amount of useful logic hidden among code that copies data between incompatible JSON libraries

Not all citizens of a web-literate planet appreciate disagreements between JSON libraries. Or Ruby gems.

On the evolution of start-ups. Rands, in The Old Guard:

... when [the Old Guard] say, "It feels off..." what they are poorly articulating is, "This process that you're building does not support one (or more) of the key values of the company."

I suspect the presence of incompatible JSON libraries means that our software no longer supports the key values of our company.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Managing and Leading, Software Development, Teaching and Learning

October 16, 2014 3:54 PM

For Programmers, There Is No "Normal Person" Feeling

I see this in the lab every week. One minute, my students sit peering at their monitors, their heads buried in their hands. They can't do anything right. The next minute, I hear shouts of exultation and turn to see them, arms thrust in the air, celebrating their latest victory over the Gods of Programming. Moments later I look up and see their heads again in their hands. They are despondent. "When will this madness end?"

Last week, I ran across a tweet from Christina Cacioppo that expresses nicely a feeling that has been vexing so many of my intro CS students this semester:

I still find programming odd, in part, because I'm either amazed by how brilliant or how idiotic I am. There's no normal-person feeling.

Christina is no beginner, and neither am I. Yet we know this feeling well. Most programmers do, because it's a natural part of tackling problems that challenge us. If we didn't bounce between puzzlement and exultation, we wouldn't be tackling hard-enough problems.

What seems strange to my students, and even to programmers with years of experience, is that there doesn't seem to be a middle ground. It's up or down. The only time we feel like normal people is when we aren't programming at all. (Even then, I don't have many normal-person feelings, but that's probably just me.)

I've always been comfortable with this bipolarity, which is part of why I have always felt comfortable as a programmer. I don't know how much of this comfort is natural inclination -- a personality trait -- and how much of it is learned attitude. I am sure it's a mixture of both. I've always liked solving puzzles, which inspired me to struggle with them, which helped me get better at struggling with them.

Part of the job in teaching beginners to program is to convince them that this is a habit they can learn. Whatever their natural inclination, persistence and practice will help them develop the stamina they need to stick with hard problems and the emotional balance they need to handle the oscillations between exultation and despondency.

I try to help my students see that persistence and practice are the answer to most questions involving missing skills or bad habits. A big part of helping them with this is coaching and cheerleading, not teaching programming language syntax and computational concepts. Coaching and cheerleading are not always tasks that come naturally to computer science PhDs, who are often most comfortable with syntax and abstractions. As a result, many CS profs are uncomfortable performing them, even when that's what our students need most. How do we get better at performing them? Persistence and practice.

The "no normal-person feeling" feature of programming is an instance of a more general feature of doing science. Martin Schwartz, a microbiologist at the University of Virginia, wrote a marvelous one-page article called The importance of stupidity in scientific research that discusses this element of being a scientist. Here's a representative sentence:

One of the beautiful things about science is that it allows us to bumble along, getting it wrong time after time, and feel perfectly fine as long as we learn something each time.

Scientists get used to this feeling. My students can, too. I already see the resilience growing in many of them. After the moment of exultation passes following their latest conquest, they dive into the next task. I see a gleam in their eyes as they realize they have no idea what to do. It's time to bury their heads in their hands and think.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

October 15, 2014 3:54 PM

Maybe We Just Need to Teach Better

A couple of weeks ago, I wrote Skills We Can Learn in response to a thread on the SIGCSE mailing list. Mark Guzdial has now written a series of posts in response to that thread, most recently Teaching Computer Science Better To Get Better Results. Here is one of the key paragraphs in his latest piece:

I watch my children taking CS classes, along with English, Chemistry, Physics, and Biology classes. In the CS classes, they code. In the other classes, they do on-line interactive exercises, they write papers, they use simulations, they solve problems by-hand. Back in CS, the only activity is coding with feedback. If we only have one technique for teaching, we shouldn't be surprised if it doesn't always work.

Mark then offers a reasonable hypothesis: We get poor results because we use ineffective teaching methods.

That's worthy of a new maxim of the sort found in my previous post: If things aren't going well in my course, it's probably my fault. Mark's hypothesis sounds more professional.

A skeptic might say that learning to program is like learning to speak a new human language, and when we learn new human languages we spend most of our time reading, writing, and speaking, and getting feedback from these activities. In an introductory programming course, the programming exercises are where students read, write, and get feedback. Isn't that enough?

For some students, yes, but not for all. This is also true in introductory foreign language courses, which is why teachers in those courses usually include games and other activities to engage the students and provide different kinds of feedback. Many of us do more than just programming exercises in computer science courses, too. In courses with theory and analysis, we give homework that asks students to solve problems, compute results, or give proofs for assertions about computation.

In my algorithms course, I open most days with a game. Students play the game for a while, and then we discuss strategies for playing the game well. I choose games whose playing strategies illustrate some algorithm design technique we are studying. This is a lot more fun than yet another Design an algorithm to... exercise. Some students seem to understand the ideas better, or at least differently, when they experience the ideas in a wider context.

I'm teaching our intro course right now, and over the last few weeks I have come to appreciate the paucity of different teaching techniques and methods used by a typical textbook. This is my first time to teach the course in ten years, and I'm creating a lot of my own materials from scratch. The quality and diversity of the materials are limited by my time and recent experience, with the result being... a lot of reading and writing of code.

What of the other kinds of activities that Mark mentions? Some code reading can be turned into problems that the students solve by hand. I have tried a couple of debugging exercises that students seemed to find useful. I'm only now beginning to see the ways in which those exercises succeeded and failed, as the students take on bigger tasks.

I can imagine all sorts of on-line interactive exercises and simulations that would help in this course. In particular, a visual simulator for various types of loops could help students see a program's repetitive behavior more immediately than watching the output of a simple program. Many of my students would likely benefit from a Bret Victor-like interactive document that exposes the internal working of, say, a for loop. Still others could use assistance with even simpler concepts, such as sequences of statements, assignment to variables, and choices.
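
Even a crude, text-based version of that idea might help in the lab. Here is a sketch of the kind of thing I have in mind, with no interactivity at all, just a loop that narrates its own state:

    # A crude trace of an accumulator loop: print the state at each step.
    def trace_sum(values):
        total = 0
        for i, value in enumerate(values):
            total += value
            print('step', i, ': value =', value, ', total =', total)
        return total

    trace_sum([3, 1, 4, 1, 5])

It is a far cry from a Bret Victor-style interactive document, but even this much makes the repetition visible in a way that a single final answer does not.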

In any case, I second Mark's calls to action. We need to find more and better methods for teaching CS topics. We need to find better ways to make proven methods available to CS instructors. Most importantly, we need to expect more of ourselves and demand more from our profession.

When things go poorly in my classroom, it's usually my fault.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 06, 2014 4:02 PM

A New Programming Language Can Inspire Us

In A Fresh Look at Rust, Armin Ronacher tells us some of what inspires him about Rust:

For me programming in Rust is pure joy. Yes I still don't agree with everything the language currently forces me to do but I can't say I have enjoyed programming that much in a long time. It gives me new ideas how to solve problems and I can't wait for the language to get stable.

Rust is inspiring for many reasons. The biggest reason I like it is because it's practical. I tried Haskell, I tried Erlang and neither of those languages spoke "I am a practical language" to me. I know there are many programmers that adore them, but they are not for me. Even if I could love those languages, other programmers would never do and that takes a lot of enjoyment away.

I enjoy reading personal blog entries from people excited by a new language, or newly excited by a language they are visiting again after a while away. I've only read Rust code, not written it, but I know just how Ronacher feels. These two paragraphs touch on several truths about how languages excite us:

  • Programmers are often most inspired when a language shows them new ideas how to solve problems.
  • Even if we love a language, we won't necessarily love every feature of the language.
  • What inspires us is personal. Other people can be inspired by languages that do not excite us.
  • Community matters.

Many programmers make a point of learning a new language periodically. When we do, we are often most struck by a language that teaches us new ways to think about problems and how to solve them. These are usually the languages that have the most to teach us at the moment.

As Kevin Kelly says, progress sometimes demands that we let go of problems. We occasionally have to seek new problems, in order to be excited by new ways to answer them.

This is all very context-specific, of course. How wonderful it is to live in a time with so many languages available to learn from. Let them all flourish, I say.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 02, 2014 3:46 PM

Skills We Can Learn

In a thread on motivating students on the SIGCSE mailing list, a longtime CS prof and textbook author wrote:

Over the years, I have come to believe that those of us who can become successful programmers have different internal wiring than most in the population. We know you need problem solving, mathematical, and intellectual skills but beyond that you need to be persistent, diligent, patient, and willing to deal with failure and learn from it.

These are necessary skills, indeed. Many of our students come to us without these skills and struggle to learn how to think like a computer scientist. And without persistence, diligence, patience, and a willingness to deal with failure and learn from it, anyone will likely have a difficult time learning to program.

Over time, it's natural to begin to think that these attributes are prerequisites -- things a person must have before he or she can learn to write programs. But I think that's wrong.

As someone else pointed out in the thread, too many people believe that to succeed in certain disciplines, one must be gifted, must possess an inherent talent for doing that kind of thing. Science, math, and computer science fit firmly in that set of disciplines for most people. Carol Dweck has shown that a "fixed" mindset of this sort prevents many people from sticking with these disciplines when they hit challenges, or even trying to learn them in the first place.

The attitude expressed in the quote above is counterproductive for teachers, whose job it is to help students learn things even when the students don't think they can.

When I talk to my students, I acknowledge that, to succeed in CS, you need to be persistent, diligent, patient, and willing to deal with failure and learn from it. But I approach these attributes from a growth mindset:

Persistence, diligence, patience, and willingness to learn from failure are habits anyone can develop with practice. Students can develop these habits regardless of their natural gifts or their previous education.

Aristotle said that excellence is not an act, but a habit. So are most of the attributes we need to succeed in CS. They are habits, not traits we are born with or actions we take.

Donald Knuth once said that only about 2 per cent of the population "resonates" with programming the way he does. That may be true. But even if most of us will never be part of Knuth's 2%, we can all develop the habits we need to program at a basic level. And a lot more than 2% are capable of building successful careers in the discipline.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

September 23, 2014 4:37 PM

The Obstacles in the Way of Teaching More Students to Program

All students should learn to program? Not so fast, says Larry Cuban in this Washington Post blog entry. History, including the Logo movement, illustrates several ways in which such a requirement can fail. I've discussed Cuban's article with a couple of colleagues, and all are skeptical. They acknowledge that he raises important issues, but in the end they offer a "yeah, but...". It is easy to imagine that things are different now, and the result will be similarly different.

I am willing to believe that things may be different this time. They always are. I've written favorably here in the past of the value of more students learning to program, but I've also been skeptical of requiring it. Student motivations change when they "have to take that class". And where will all the teachers come from?

In any case, it is wise to be alert to how efforts to increase the reach of programming instruction have fared. Cuban reminds us of some of the risks. One line in his article expresses what is, to my mind, the biggest challenge facing this effort:

Traditional schools adapt reforms to meet institutional needs.

Our K-12 school system is a big, complex organism (actually, fifty-one of them). It tends to keep moving in the direction of its own inertia. If a proposed reform fits its needs, the system may well adopt it. If it doesn't, but external forces push the new idea onto the system, the idea is adapted -- assimilated into what the institution already wants to be, not what the reform actually promises.

We see this in the university all the time, too. Consider accountability measures such as student outcomes assessment. Many schools have adopted the language of SOA, but rarely do faculty and programs change how they behave all that much. They just find ways to generate reports that keep the external pressures at bay. The university and its faculty may well care about accountability, but they tend to keep on doing it the way they want to do it.

So, how can we maximize the possibility of substantive change in the effort to teach more students how to program, and not simply create a new "initiative" with frequent mentions in brochures and annual reports? Mark Guzdial has been pointing us in the right direction. Perhaps the most effective way to change K-12 schools is to change the teachers we send into the schools. We teach more people to be computing teachers, or prepare more teachers in the traditional subjects to teach computing. We prepare them to recognize opportunities to introduce computing into their courses and curricula.

In this sense, universities have an irreplaceable role to play in the revolution. We teach the teachers.

Big companies can fund programs such as code.org and help us reach younger students directly. But that isn't enough. Google's CS4HS program has been invaluable in helping universities reach current K-12 teachers, but they are a small percentage of the installed base of teachers. In our schools of education, we can reach every future teacher -- if we all work together within and across university boundaries.

Of course, this creates a challenge at the meta-level. Universities are big, complex organisms, too. They tend to keep moving in the direction of their own inertia. Simply pushing the idea of programming instruction onto the system from the outside is more likely to result in harmless assimilation than in substantive change. We are back to Cuban's square one.

Still, against all these forces, many people are working to make a change. Perhaps this time will be different after all.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

September 22, 2014 10:33 AM

Strange Loop 2014 Videos Are Up

generic Strange Loop logo

Wow. Strange Loop just ended Friday evening, and already videos of nearly all the talks are available on a YouTube channel. (A few have been delayed at the speaker's request.)

I regret missing the conference this year. I've been a regular attendee over the years and much enjoyed last year's edition. But it's probably just as well that the tickets sold out before I bought mine. My intro course has kept me pedaling full speed since school started, and I would have regretted missing a lab day and a class session just as we are getting to the meat of the course. I followed along with the conference on Twitter as time permitted.

The video titles foreshadow the usual treasure trove of Strange Loop content. It would be easier to list the talks I don't want to watch than the ones I do. A few I'll watch early on include Stephen Kell's "Liberating the Smalltalk Lurking in C and Unix", Stefanie Schirmer's "Dynamic Programming At Ease", Mark Allen's "All Of This Has Happened Before, and It Will All Happen Again", Julia Evans's "You Can Be a Kernel Hacker!", and Michael Nygard's "Simulation Testing".

An underrated advantage of actually attending a conference is not being able to be in two places at one time. Having to make a choice is sometimes a good thing; it helps us to preserve limited resources. The downside to the wonderfulness of having all the videos available on-line, for viewing at my leisure, is that I want to watch them all -- and I don't have enough leisure!


Posted by Eugene Wallingford | Permalink | Categories: Computing

September 12, 2014 1:49 PM

The Suffocating Gerbils Problem

I had never heard of the "suffocating gerbils" problem until I ran across this comment in a Lambda the Ultimate thread on mixing declarative and imperative approaches to GUI design. Peter Van Roy explained the problem this way:

A space rocket, like the Saturn V, is a complex piece of engineering with many layered subsystems, each of which is often pushed to the limits. Each subsystem depends on some others. Suppose that subsystem A depends on subsystem B. If A uses B in a way that was not intended by B's designers, even though formally B's specification is being followed by A, then we have a suffocating gerbils problem. The mental image is that B is implemented by a bunch of gerbils running to exhaustion in their hoops. A is pushing them to do too much.

I first came to appreciate the interrelated and overlapping functionality of engineered subsystems in graduate school, when I helped a fellow student build a software model of the fuel and motive systems of an F-18 fighter plane. It was quite a challenge for our modeling language, because the functions and behaviors of the systems were intertwined and did not follow obviously from the specification of components and connections. This challenge motivated the project. McDonnell Douglas was trying to understand the systems in a new way, in order to better monitor performance and diagnose failures. (I'm not sure how the project turned out...)

We suffocate gerbils at the university sometimes, too. Some functions depend on tenure-track faculty teaching occasional overloads, or the hiring of temporary faculty as adjuncts. When money is good, all is well. As budgets tighten, we find ourselves putting demands on these subsystems to meet other essential functions, such as advising, recruiting, and external engagement. It's hard to anticipate looming problems before they arrive in full failure; everything is being done according to specification.

Now there's a mental image: faculty gerbils running to exhaustion.

If you are looking for something new to read, check out some of Van Roy's work. His Concepts, Techniques, and Models of Computer Programming offers all kinds of cool ideas about programming language design and use. I happily second the sentiment of this tweet:

Note to self: read all Peter Van Roy's LtU comments in chronological order and build the things that don't exist yet: http://lambda-the-ultimate.org/user/288/track?from=120&sort=asc&order=last%20post

There are probably a few PhD dissertations lurking in those comments.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

September 04, 2014 3:32 PM

Language Isn't Just for Experts

Stephen Ramsay wrote The Mythical Man-Finger, in defense of an earlier piece on the virtues of the command line. The gist of his argument is this:

... the idea that language is for power users and pictures and index fingers are for those poor besotted fools who just want toast in the morning is an extremely retrograde idea from which we should strive to emancipate ourselves.

Ramsay is an English professor who works in digital humanities. From the writings posted on his web site, it seems that he spends nearly as much time teaching and doing computing these days as he spends on the humanities. This opens him to objections from his colleagues, some of whom minimize the relevance of his perspective for other humanists by reminding him that he is a geek. He is one of those experts who can't see past his own expertise. We see this sort of rhetorical move in tech world all the time.

I think the case is quite the opposite. Ramsay is an expert on language. He knows that language is powerful, that language is more powerful than the alternatives in many contexts. When we hide language from our users, we limit them. Other tools can optimize for a small set of particular use cases, but they generally make it harder to step outside of those lines drawn by the creator of the tools: to combine tasks in novel ways, to extend them, to integrate them with other tools.

Many of my intro students are just beginning to see what knowing a programming language can mean. Giving someone language is one of the best ways to empower them, and also a great way to help them see what is possible in the first place.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

August 30, 2014 7:43 AM

A Monad Sighting in Pop Literature

Lab experiments are invaluable in the hard sciences, in part because neutrinos and monads don't change their behavior when they are being watched; but humans do.

Several things ran through my mind when I read this sentence.

  • "Monads don't change their behavior when watched." Wow. The authors of this book must know a little functional programming.

  • Monads mentioned in the same sentence as neutrinos, which are fundamental particles of the universe? Oh, no. This will only make the smug functional programming weenies more smug.

  • Monads are part of the "hard sciences"? These authors really do get functional programming!

  • This sentence appears in a chapter called "The Three Hardest Words in the English Language". That joke writes itself.

  • Maybe I shouldn't be surprised to see this sentence. The book is called Think Like a Freak.

I kid my monad-loving friends; I kid. The rest of the book is pretty good, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing

July 21, 2014 10:52 AM

Wesley's Quoted Quote

My recent post Burn All Your Sermons was triggered by a quote taken out of context. Theologian John Wesley did not say:

Once in seven years I burn all my sermons...

He said:

"Once in seven years I burn all my sermons..."

Those "" make all the difference. Wesley wasn't saying that he himself burns all his sermons every seven years; he was talking about the practice doing so. Imagine the assistant of Wesley who, upon seeing this passage in the theologian's diary, burned all of Wesley's old sermons in an effort to ingratiate himself with the boss, only later to find out that Wesley very much intended to use them again. Fiery furnace, indeed.

This sort of indirection isn't important only for human communication. It is a key idea in computing. I wrote a blog post last year about such quotations and how this distinction is an important element in Jon Udell's notion of "thinking like the web". Thinking like the web isn't always foreign to the way most of us already think and work; sometimes it simply emphasizes a particular human practice that until now has been less common.

Studying a little computer science can help, though. Programmers have multiple ways of speaking indirectly about an action such as "burn all the sermons". In Scheme, I might express the program to burn all the sermons in a collection as:

    (burn sermons)

We can quote this program, in much the same way that the "" above do, as:

    '(burn sermons)

This is actually shorthand for (quote (burn sermons)). The result is a piece of data, much like Wesley's quotation of another person's utterance, that we can manipulate a variety of ways.

This sort of quotation trades on the distinction between data and process. In a post a few years back, I talked a bit about how this distinction is only a matter of perspective, that at a higher level data and program are two sides of the same coin.

However, we can also "quote" our sermon-burning program in a way that stays on the side of process. Consider this program:

    (lambda () (burn sermons))

The result is a program that, when executed, will execute the sermon-burning program. Like the data version of the quote, it turns the original statement into something that we can talk about, pass around as a value, and manipulate in a variety of ways. But it does so by creating another program.

This technique, quite simple at its heart, plays a helpful role in the way many computer language processors work.
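
My intro students don't know Scheme, but a rough analogue of both moves exists in Python: a string can stand in for the quoted data, and a lambda for the quoted process. (This is only a sketch; burn and sermons are made-up stand-ins.)

    def burn(sermons):
        print('burning', len(sermons), 'sermons')

    sermons = ['the Good Steward', 'the Great Assize', 'the Use of Money']

    advice_as_data    = "burn(sermons)"        # quoted as data: text we can pass around
    advice_as_process = lambda: burn(sermons)  # quoted as process: a program that runs the program

    advice_as_process()                        # only now do the sermons burn
    # eval(advice_as_data) would burn them, too, after turning the text back into a program.

The analogy is loose, since a Python string is unstructured text while the Scheme quote gives us structured data, but the indirection is the same.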

Both techniques insert a level of indirection between a piece of advice -- burn all your sermons -- and its execution. That is a crucial distinction when we want to talk about an idea without asserting the idea's truth at that moment. John Wesley knew that, and so should we.


Posted by Eugene Wallingford | Permalink | Categories: Computing

July 16, 2014 2:11 PM

Burn All Your Sermons

Marketers and bridge players have their Rules of Seven. Teachers and preachers might, too, if they believe this old saw:

Once in seven years I burn all my sermons; for it is a shame if I cannot write better sermons now than I did seven years ago.

I don't have many courses in which I lecture uninterrupted for long periods of time. Most of my courses are a mixture of short lectures, student exercises, and other activities that explore or build upon whatever we are studying. Even when I have a set of materials I really like, which have been successful for me and my students in the past, I am forever reinventing them, tweaking and improving as we move through the course. This is in the same spirit as the rule of seven: surely I can make something better since the last time I taught the course.

Having a complete set of materials for a course to start from can be a great comfort. It can also be a straitjacket. The high-level structure of a course design limits how we think about the essential goals and topics of the course. The low-level structure generally optimizes for specific transitions and connections, which limits how easily we can swap in new examples and exercises.

Even as an inveterate tinkerer, I occasionally desire to break out of the straitjacket of old material and make a fresh start. Burn it all and start over. Freedom! What I need to remember will come back to me.

The adage quoted above tells us to do this regularly even if we don't feel the urge. The world changes around us. Our understanding grows. Our skills as a writer and storyteller grow. We can do better.

Of course, starting over requires time. It's a lot quicker to prep a course by pulling a prepped course out of an old directory of courses and cleaning it up around the edges. When I decide to redesign a course from bottom up, I usually have to set aside part of a summer to allow for long hours writing from scratch. This is a cost you have to take into account any time you create a new course.

Being in computer science makes it easier to force ourselves to start from scratch. While many of the principles of CS remain the same across decades, the practices and details of the discipline change all the time. And whatever we want to say about timeless principles, the undergrads in my courses care deeply about having some currency when they graduate.

In Fall 2006, I taught our intro course. The course used Java, which was the first language in our curriculum at that time. Before that, the last time I had taught the course, our first language was Pascal. I had to teach an entirely new course, even though many of the principles of programming I wanted to teach were the same.

I'm teaching our intro course again this fall for the first time since 2006. Python is the language of choice now. I suppose I could dress my old Java course in a Python suit, but that would not serve my students well. It also wouldn't do justice to the important ideas of the course, or Python. Add to this that I am a different -- and I hope better -- teacher and programmer now than I was eight years ago, and I have all the reasons I need to design a new course.

So, I am getting busy. Burn all the sermons.

Of course, we should approach the seven-year advice with some caution. The above passage is often attributed to theologian John Wesley. And indeed he did write it. However, as is so often the case, it has been taken out of context. This is what Wesley actually wrote in his journal:

Tuesday, September 1.--I went to Tiverton. I was musing here on what I heard a good man say long since--"Once in seven years I burn all my sermons; for it is a shame if I cannot write better sermons now than I could seven years ago." Whatever others can do, I really cannot. I cannot write a better sermon on the Good Steward than I did seven years ago; I cannot write a better on the Great Assize than I did twenty years ago; I cannot write a better on the Use of Money, than I did nearly thirty years ago; nay, I know not that I can write a better on the Circumcision of the Heart than I did five-and-forty years ago. Perhaps, indeed, I may have read five or six hundred books more than I had then, and may know a little more history, or natural philosophy, than I did; but I am not sensible that this has made any essential addition to my knowledge in divinity. Forty years ago I knew and preached every Christian doctrine which I preach now.

Note that Wesley attributes the passage to someone else -- and then proceeds to deny its validity in his own preaching! We may choose to adopt the Rule of Seven in our teaching, but we cannot do so with Wesley as our prophet.

I'll stick with my longstanding practice of building on proven material when that seems best, and starting from scratch whenever the freedom to tell a new story outweighs the value of what has worked for me and my students in the past.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 10, 2014 3:08 PM

The Passing of the Postage Stamp

In this New York Times article on James Baldwin's ninetieth birthday, scholar Henry Louis Gates laments:

On one hand, he's on a U.S. postage stamp; on the other hand, he's not in the Common Core.

I'm not qualified to comment on Baldwin and his place in the Common Core. In the last few months, I read several articles about and including Baldwin, and from those I have come to appreciate better his role in twentieth-century literature. But I also empathize with anyone trying to create a list of things that every American should learn in school.

What struck me in Gates's comment was the reference to the postage stamp. I'm old enough to have grown up in a world where the postage stamp held a position of singular importance in our culture. It enabled communication at a distance, whether geographical or personal. Stamps were a staple of daily life.

In such a world, appearing on a stamp was an honor. It indicated a widespread acknowledgment of a person's (or organization's, or event's) cultural impact. In this sense, the Postal Service's decision to include James Baldwin on a stamp was a sign of his importance to our culture, and a way to honor his contributions to our literature.

Alas, this would have been a much more significant and visible honor in the 1980s or even the 1990s. In the span of the last decade or so, the postage stamp has gone from relevant and essential to archaic.

When I was a boy, I collected stamps. It was a fun hobby. I still have my collection, even if it's many years out of date now. Back then, stamp collecting was a popular activity with a vibrant community of hobbyists. For all I know, that's still true. There's certainly still a vibrant market for some stamps!

But these days, whenever I use a new stamp, I feel as if I'm holding an anachronism in my hands. Computing technology played a central role in the obsolescence of the stamp, at least for personal and social communication.

Sometimes people say that we in CS need to do a better job helping potential majors see the ways in which our discipline can be used to effect change in the world. We never have to look far to find examples. If a young person wants to be able to participate in how our culture changes in the future, they can hardly do better than to know a little computer science.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

July 03, 2014 2:13 PM

Agile Moments: Conspicuous Progress and Partial Value

Dorian Taylor, in Toward a Theory of Design as Computation:

You can scarcely compress the time it takes to do good design. The best you can do is arrange the process so that progress is conspicuous and the partially-completed result has its own intrinsic value.

Taylor's piece is about an idea much bigger than simply software methodology, but this passage leapt off the page at me. It seems to embody two of the highest goals of the various agile approaches to making software: progress that is conspicuous and partial results that have intrinsic value to the user.

If you like ambitious attempts to create a philosophy of design, check out the whole essay. Taylor connects several disparate sources:

  • Edwin Hutchins and Cognition in the Wild,
  • Donald Norman and Things That Make Us Smart, and
  • Douglas Hofstadter and Gödel, Escher, Bach
with the philosophy of Christopher Alexander, in particular Notes on the Synthesis of Form and The Nature of Order. Ambitious it is.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

July 02, 2014 4:31 PM

My Jacket Blurb for "Exercises in Programming Style"

On Monday, my copy of Crista Lopes's new book, Exercises in Programming Style, arrived. After I blogged about the book last year, Crista asked me to review some early chapters. After I did that, the publisher graciously offered me a courtesy copy. I'm glad it did! The book goes well beyond Crista's talk at StrangeLoop last fall, with thirty-three styles grouped loosely into nine categories. Each chapter includes historical notes and a reading list for going deeper. Readers of this blog know that I often like to go deeper.

I haven't had a chance to study any of the chapters deeply yet, so I don't have a detailed review. For now, let me share the blurb I wrote for the back cover. It gives a sense of why I was so excited by the chapters I reviewed last summer and by Crista's talk last fall:

It is difficult to appreciate a programming style until you see it in action. Cristina's book does something amazing: it shows us dozens of styles in action on the same program. The program itself is simple. The result, though, is a deeper understanding of how thinking differently about a problem gives rise to very different programs. This book not only introduced me to several new styles of thinking; it also taught me something new about the styles I already know well and use every day.

The best way to appreciate a style is to use it yourself. I think Crista's book opens the door for many programmers to do just that with many styles most of us don't use very often.

As for the blurb itself: it sounds a little stilted as I read it now, but I stand by the sentiment. It is very cool to see my blurb and name alongside blurbs from James Noble and Grady Booch, two people whose work I respect so much. Very cool. Leave it to James to sum up his thoughts in a sentence!

While you are waiting for your copy of Crista's book to arrive, check out her recent blog entry on the evolution of CS papers in publication over the last 50+ years. It presents a lot of great information, with some nice images of pages from a few classics. It's worth a read.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 27, 2014 3:55 PM

Beautiful Words, File Format Edition

In The Great Works of Software, Paul Ford tells us that the Photoshop file format is

a fascinating hellish palimpsest.

"Palimpsest" is one of those words I seem always have to look up whenever I run across it. What a lyrical word.

After working with a student a few summers ago on a translator from Photoshop PSD format to HTML/CSS (mentioned in the first paragraph of this essay), I can second the assertion that PSD is fascinating and hellish. Likewise, however often it has changed over time, it looks in several places as if it is held together with baling wire.

Ford said it better than I could have, though.


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 25, 2014 2:03 PM

You Shouldn't Need a License to Program

In Generation Liminal, Dorian Taylor recalls how the World Wide Web arrived at the perfect time in his life:

It's difficult to appreciate this tiny window of opportunity unless you were present for it. It was the World-Wild West, and it taught me one essential idea: that I can do things. I don't need a license, and I don't need credentials. I certainly don't need anybody telling me what to do. I just need the operating manual and some time to read it. And with that, I can bring some amazing -- and valuable -- creations to life.

I predate the birth of the web. But when we turned on the computers at my high school, BASIC was there. We could program, and it seemed the natural thing to do. These days, the dominant devices are smart phones and iPads and tablets. Users begin their experience far away from the magic of creating. It is a user experience for consumers.

One day many years ago, my older daughter needed to know how many words she had written for a school assignment. I showed her Terminal.app and wc. She was amazed by its simplicity; it looked like nothing else she'd ever seen. She still uses it occasionally.

I spent several days last week watching middle schoolers -- play. They consumed other people's creations, including some tools my colleagues set up for them. They have creative minds, but for the most part it doesn't occur to them that they can create things, too.

We need to let them know they don't need our permission to start, or credentials defined by anyone else. We need to give them the tools they need, and the time to play with them. And, sometimes, we need to give them a little push to get started.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 23, 2014 3:13 PM

The Coder's High Beats The Rest

At least David Auerbach thinks so. One of the reasons is that programming has a self-perpetuating cycle of creation, implementation, repair, and new birth:

"Coding" isn't just sitting down and churning out code. There's a fair amount of that, but it's complemented by large chunks of testing and debugging, where you put your code through its paces and see where it breaks, then chase down the clues to figure out what went wrong. Sometimes you spend a long time in one phase or another of this cycle, but especially as you near completion, the cycle tightens -- and becomes more addictive. You're boosted by the tight feedback cycle of coding, compiling, testing, and debugging, and each stage pretty much demands the next without delay. You write a feature, you want to see if it works. You test it, it breaks. It breaks, you want to fix it. You fix it, you want to build the next piece. And so on, with the tantalizing possibility of -- just maybe! -- a perfect piece of code gesturing at you in the distance.

My experience is similar. I can get lost for hours in code, and come out tired but mentally energized. Writing has never given me that kind of high, but then I've not written a really long piece of prose in a long time. Perhaps writing fiction could give me the sort of high I experience when deep in a program.

What about playing games? Back in my younger days, I experienced incredible flow while playing chess for long stretches. I never approached master level play, but a good game could still take my mind to a different level of consciousness. That high differed from a coder's high, though, in that it left me tired. After a three-round day at a chess tournament, all I wanted to do was sleep.

Getting lost in a computer game gives me a misleading feeling of flow, but it differs from the chess high. When I come out of a session lost in most computer games, I feel destroyed. The experience doesn't give life the way coding does, or the way I imagine meditation does. I just end up feeling tired and used. Maybe that's what drug addiction feels like.

I was thinking about computer games even before reading Auerbach's article. Last week, I was sitting next to one of the more mature kids in our summer camp after he had just spent some time gaming, er, collecting data for our study of internet traffic. We had an exchange that went something like this:

Student: I love this feeling. I'd like to create a game like this some day.

Eugene: You can!

Student: Really? Where?

Eugene: Here. A group of students in my class last month wrote a computer game next door. And it's way cooler than playing a game.

I was a little surprised to find that this young high schooler had no idea that he could learn computer programming at our university. Or maybe he didn't make the connection between computer games and computer programs.

In any case, this is one of the best reasons for us CS profs to get out of our university labs and classrooms and interact with younger students. Many of them have no way of knowing what computer science is, what they can do with computer science, or what computer science can do for them -- unless we show them!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 20, 2014 1:27 PM

Programming Everywhere, Business Edition

Q: What do you call a company that has staff members with "programmer" or "software developer" in their titles?

A: A company.

Back in 2012, Alex Payne wrote What Is and Is Not A Technology Company to address a variety of issues related to the confounding of companies that sell technology with companies that merely use technology to sell something else. Even then, developing technology in house was a potential source of competitive advantage for many businesses, whether that involved modifying existing software or writing new.

The competitive value in being able to adapt and create software has only grown larger and more significant in the last two years. Not having someone on staff with "programmer" in the title is almost a red flag even for non-tech companies these days.

Those programmers aren't likely to have been CS majors in college, though. We don't produce enough. So we need to find a way to convince more non-majors to learn a little programming.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 19, 2014 2:11 PM

Yet Another Version of Alan Kay's Definition of "Object-Oriented"

In 2003, Stefan Ram asked Alan Kay to explain some of the ideas and history behind the term "object-oriented". Ram posted Kay's responses for all to see. Here is how Kay responded to the specific question, "What does 'object-oriented [programming]' mean to you?":

OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things.

Messaging and extreme late-binding have been consistent parts of Kay's answer to this question over the years. He has also always emphasized the encapsulated autonomy of objects, with analogy to cells from biology and nodes on the internet. As Kay has said many times, in his conception the basic unit of computation is a whole computer.

For some reason, I really like the way Kay phrased the encapsulated autonomy clause in this definition: local retention and protection and hiding of state-process. It's not poetry or anything, but it has a rhythm.

Kay's e-mail also mentions another of his common themes: that most computer scientists didn't take full advantage of the idea of objects. Instead, we stayed too close to the dominant data-centric perspective. I often encounter this with colleagues who confound object-oriented programming with abstract data types. A system designed around ADTs will not offer the same benefits that Kay envisions for objects defined by their interactions.
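
To make the contrast concrete, here is a small sketch of my own in Python -- not Kay's example, and certainly not Smalltalk -- of the difference between code that dispatches on data and objects that answer a message themselves:

    # Data-centric style: the caller inspects the data and decides what to do.
    def area_of(shape):
        if shape["kind"] == "circle":
            return 3.14159 * shape["radius"] ** 2
        elif shape["kind"] == "rectangle":
            return shape["width"] * shape["height"]

    print(area_of({"kind": "circle", "radius": 2}))

    # Message-passing style: each object hides its own state and answers the
    # same message. New kinds of shape don't require touching any caller.
    class Circle:
        def __init__(self, radius):
            self._radius = radius
        def area(self):
            return 3.14159 * self._radius ** 2

    class Rectangle:
        def __init__(self, width, height):
            self._width, self._height = width, height
        def area(self):
            return self._width * self._height

    for shape in [Circle(2), Rectangle(3, 4)]:
        print(shape.area())    # late binding: the receiver decides how to respond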

In some cases, the words we adopted for OO concepts may have contributed to the remaining bias toward data, even if unintentionally. For example, Kay thinks that the term "polymorphism" hews too closely to the standard concept of a function to convey the somewhat different notion of an object as embodying multiple algebras.

Kay's message also mentions two projects I need to learn more about. I've heard of Robert Balzer's Dataless Programming paper but never read it. I've heard of GEDANKEN, a programming language project by John Reynolds, but never seen any write-up. This time I downloaded GEDANKEN: A Simple Typeless Language Which Permits Functional Data Structures and Coroutines, Reynolds's tech report from Argonne National Lab. Now I am ready to become a little better informed than I was this morning.

The messages posted by Ram are worth a look. They serve as a short precursor to (re-)reading Kay's history of Smalltalk paper. Enjoy!


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 17, 2014 2:38 PM

Cookies, Games, and Websites: A Summer Camp for Kids

Cut the Rope 2 logo

Today is the first day of Cookies, Games, and Websites, a four-day summer camp for middle-school students being offered by our department. A colleague of mine developed the idea for a workshop that would help kids of that age group understand better what goes on when they play games on their phones and tablets. I have been helping, as a sounding board for ideas during the prep phase and now as a chaperone and helper during the camp. A local high school student has been providing much more substantial help, setting up hardware and software and serving as a jack-of-all-trades.

The camp's hook is playing games. To judge from this diverse group of fifteen students from the area, kids this age already know very well how to download, install, and play games. Lots of games. Lots and lots of games. If they had spent as much time learning to program as they seem to have spent playing games, they would be true masters of the internet.

The first-order lesson of the camp is privacy. Kids this age play a lot of games, but they don't have a very good idea how much network traffic a game like Cut the Rope 2 generates, or how much traffic accessing Instagram generates. Many of their apps and social websites allow them to exercise some control over who sees what in their space, but they don't always know what that means. More importantly, they don't realize how important all this is, because they don't know how much traffic goes on under the hood when they use their mobile devices -- and even when they don't!


The second-order lesson of the camp, introduced as a means to an end, is computing: the technology that makes communication on the web possible, and some of the tools they can use to look at and make sense of the network traffic. We can use some tools they already know and love, such as Google maps, to visualize the relevant data.

This is a great idea: helping young people understand better the technology they use and why concepts like privacy matter to them when they are using that technology. If the camp is successful, they will be better-informed users of on-line technology, and better prepared to protect their identities and privacy. The camp should be a lot of fun, too, so perhaps one or two of them will be interested in diving deeper into computer science after the camp is over.

This morning, the campers learned a little about IP addresses and domain names, mostly through interactive exercises. This afternoon, they are learning a little about watching traffic on the net and then generating traffic by playing some of their favorite games. Tomorrow, we'll look at all the traffic they generated playing, as well as all the traffic generated while their tablets were idle overnight.

We are only three-fourths of the way through Day 1, and I have already learned my first lesson: I really don't want to teach middle school. The Grinch explains why quite succinctly: noise, noise, NOISE! One thing seems to be true of any room full of fifteen middle-school students: several of them are talking at any given time. They are fun people to be around, but they are wearing me out...


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 05, 2014 2:45 PM

Choosing the Right Languages for Early CS Instruction is Important

In today's ACM interview, Donald Knuth identifies one of the problems he has with computer science instruction:

Similarly, the most common fault in computer classes is to emphasize the rules of specific programming languages, instead of to emphasize the algorithms that are being expressed in those languages. It's bad to dwell on form over substance.

I agree. The challenges are at least two in number:

  • ... finding the right level of support for the student learning his or her first language. It is harder for students to learn their first language than many people realize, at least until they have tried to teach them.

  • ... helping students develop the habit and necessary skills to learn new languages on their own with some facility. For many, this involves overcoming the fear they feel until they have done it on their own a time or two.

Choosing the right languages can greatly help in conquering Challenges 1 and 2. Choosing the wrong languages can make overcoming them almost impossible, if only because we lose students before they cross the divide.

I guess that makes choosing the right languages Challenge 3.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 29, 2014 2:14 PM

Invention, One Level Down

Brent Simmons wrote a blog entry on his time at UserLand. After describing a few of the ideas that founder Dave Winer created and extended, such as RSS and blogging, Simmons said this about Winer:

The tech was his invention too: he built the thing he needed to be able to build other things.

This is among the highest praise one can bestow on an inventor. It's also one of the things I like about computer science. The hallmark of so many interesting advances in computing is the creation of a technology or language that makes the advance possible. Sometimes the enabling technology turns out to be pretty important in its own right. Sometimes, it's a game changer. But even when it is only a scaffold to something bigger, it needed to be created.


Posted by Eugene Wallingford | Permalink | Categories: Computing

May 28, 2014 4:20 PM

Programming for Everyone, Intro Physics Edition

Rhett Allain asked his intro physics students to write a short bit of Python code to demonstrate some idea from the course, such as the motion of an object with a constant force, or projectile motion with air resistance. Apparently, at least a few complained: "Wait! I'm not a computer scientist." That caused Allain to wonder...

I can just imagine the first time a physics faculty told a class that they needed to draw a free body diagram of the forces on an object for the physics solutions. I wonder if a student complained that this was supposed to be a physics class and not an art class.

As Allain points out, the barriers that used to prevent students from doing numerical calculations in computer programs have begun to disappear. We have more accessible languages now, such as Python, and powerful computers are everywhere, capable of running VPython and displaying beautiful visualizations.

About all that remains is teaching all physics students, even the non-majors, a little programming. The programs they write are simply another medium through which they can explore physical phenomena and perhaps come to understand them better.
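
To give a sense of how little code such an exercise requires, here is a rough sketch in plain Python -- not VPython, and not Allain's actual assignment -- that steps a projectile forward under gravity and a drag force proportional to the square of its speed. All of the constants are made up for illustration:

    import math

    # made-up values for illustration
    m, g, C = 0.145, 9.8, 0.01          # mass (kg), gravity (m/s^2), drag coefficient
    x, y = 0.0, 0.0                      # position (m)
    vx, vy = 30.0, 30.0                  # initial velocity (m/s)
    dt = 0.01                            # time step (s)

    while y >= 0.0:
        speed = math.sqrt(vx**2 + vy**2)
        ax = -(C / m) * speed * vx       # drag opposes the motion
        ay = -g - (C / m) * speed * vy
        vx, vy = vx + ax * dt, vy + ay * dt
        x, y = x + vx * dt, y + vy * dt

    print("range with air resistance: about %.1f meters" % x)

A student who runs this, fiddles with the drag coefficient, and watches the range change has explored the physics in a way a closed-form formula alone doesn't offer.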

Allain is exactly right. You don't have to be an artist to draw simple diagrams or a mathematician to evaluate an integral. All students accept, if grudgingly, that people might reasonably expect them to present an experiment orally in class.

Students don't have to be "writers", either, in order for teachers or employers to reasonably expect them to write an essay about physics or computer science. Even so, you might be surprised how many physics and computer science students complain if you ask them to write an essay. And if you dare expect them to spell words correctly, or to write prose somewhat more organized than Faulkner's stream of consciousness -- stand back.

(Rant aside, I have been quite lucky this May term. I've had my students write something for me every night, whether a review of something they've read or a reflection on the practices they are struggling to learn. There's been nary a complaint, and most of their writings have been organized, clear, and enjoyable to read.)

You don't have to be a physicist to like physics. I hope that most educated adults in the 21st century understand how the physical world works and appreciate the basic mechanisms of the universe. I dare to hope that many of them are curious enough to want to learn more.

You also don't have to be a computer programmer, let alone a computer scientist, to write a little code. Programs are simply another medium through which we can create and express ideas from across the spectrum of human thought. Hurray to Allain for being in the vanguard.

~~~~

Note. Long-time readers of this blog may recognize the ideas underlying Allain's approach to teaching introductory physics. He uses Matter and Interactions, a textbook and set of supporting materials created by Ruth Chabay and Bruce Sherwood. Six years ago, I wrote about some of Chabay's and Sherwood's ideas in an entry on creating a dialogue between science and CS and mentioned the textbook project in an entry on scientists who program. These entries were part of a report on my experiences attending SECANT, a 2007 NSF workshop on the intersection of science, computation, and education.

I'm glad to see that the Matter and Interactions project continued to fruition and has begun to seep into university physics instruction. It sounds like a neat way to learn physics. It's also a nice way to pick up a little "stealth programming" along the way. I can imagine a few students creating VPython simulations and thinking, "Hey, I'd like to learn more about this programming thing..."


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 25, 2014 12:03 PM

CS Prof From Iowa Was a 'Heroine of Computing' -- and a Nun

While cleaning up the house recently for a family visit, I came across a stack of newspaper articles I'd saved from last fall. Among them was an article about a September 7, 2013, exhibition at The National Museum of Computing in Bletchley Park, Milton Keynes, England. The exhibition was titled "Celebrating the Heroines of Computing". That alone would have made the article worth clipping, but it had a closer connection to me: it featured a CS professor from the state of Iowa, who was also a Catholic nun.

Sister Mary Kenneth Keller, with Paul Laube, MD, undated

Sister Mary Kenneth Keller was a professed member of the Sisters of Charity of the Blessed Virgin Mary, an order of nuns based in Dubuque, Iowa. If you have had the privilege of working or studying with nuns, you know that they are often amazing people. Sister Mary Kenneth certainly was. She was also a trailblazer who studied computer science before it was a thing and helped to create a CS department:

As the first person to receive a Ph.D. in computer science from the University of Wisconsin-Madison, she was a strong advocate for women entering the field of computer science. For nearly 20 years she served as chair of the newly-created computer science department at Clarke University and was among the first to recognize the future importance of computers in the sciences, libraries and business. Under her leadership at Clarke, a master's degree program in computer applications in education was included.

Claims that some individual was the "first person to receive a Ph.D. in computer science" have been relatively common over the years. The Department of Computer Science at Wisconsin has a page listing Ph.D.'s conferred, 1965-1970, which lists Sister Mary Kenneth first, for a dissertation titled "Inductive Inference on Computer Generated Patterns". But that wasn't her only first; this ACM blog piece by Ralph London asserts that Keller is the first woman to receive a Ph.D. in CS anywhere in the US, and one of the first two US CS Ph.D.s overall.

This bit of history is only a small part of Keller's life in academia and computing. She earned a master's degree in math at DePaul University in the early 1950s. In 1958, she worked at the Dartmouth College computer center as part of an NSF workshop, during which time she participated in the development of the BASIC programming language. She wrote four books on computing and served as a consultant for a group of business and government organizations that included the city of Dubuque and the state of Illinois.

Sister Mary Kenneth spent her career on the faculty of Clarke University, apparently chairing the Department of Computer Science until her retirement. The university's computer center is named the Keller Computer Center and Information Service in her honor, as is a scholarship for students of computing.

I'd been in Iowa twenty years before I first heard this story of an Iowan's role in the history of computing. Her story also adds to the history of women in computing and, for me, creates a whole new area in the history of computing: women religious. A pretty good find for cleaning up the house.

~~~~

The passage quoted above comes from an article by Jody Iler, "BVM to be Featured as One of the 'Heroines of Computing'", which ran some time last fall in The Witness, the newspaper of the Archdiocese of Dubuque. I found substantially the same text on a news archive page on the web site of the Sisters of Charity, BVM. There is, of course, a Wikipedia page for Sister Mary Kenneth that reports many of the same details of her life.

The photo above, which appears both in Iler's article and on the web site, shows Sister Mary Kenneth with Dr. Paul Laube, a flight surgeon from Dubuque who consulted with her on some computing matter. (Laube's obituary indicates he lived an interesting life as well.) In the article, the photo is credited to Clarke University.


Posted by Eugene Wallingford | Permalink | Categories: Computing

May 07, 2014 3:39 PM

Thinking in Types, and Good Design

Several people have recommended Pat Brisbin's Thinking in Types for programmers with experience in dynamically-typed languages who are looking to grok Haskell-style typing. He wrote it after helping a colleague get unstuck with a program that "seemed conceptually simple but resulted in a type error" when implemented in Haskell in a way similar to a solution in a language such as Python or Ruby.

This topic is of current interest to me at a somewhat higher level. Few of our undergrads have a chance to program in Haskell as a part of their coursework, though a good number of them learn Scala while working at a local financial tech company. However, about two-thirds of undergrads now start with one or two semesters of Python, and types are something of a mystery to them. This affects their learning of Java and colors how they think about types if they take my course on programming languages.

So I read this paper. I have two comments.

First, let me say that I agree with my friends and colleagues who are recommending this paper. It is a clear, concise, and well-written description of how to use Haskell's types to think about a problem. It uses examples that are concrete enough that even our undergrads could implement them with a little help. I may use this as a reading in my languages course next spring.

Second, I think this paper does more than simply teach people about types in a Haskell-like language. It also gives a great example of how thinking about types can help programmers create better designs for their programs, even if they are working in an object-oriented language! Further, it hits right at the heart of the problem we face these days, with students who are used to working in scripting languages that provide high-level but very generic data structures.

The problem that Brisbin addresses happens after he helps his buddy create type classes and two instance classes, and they reach this code:

    renderAll [ball, playerOne, playerTwo]

renderAll takes a list of values that are Render-able. Unfortunately, in this case, the arguments come from two different classes... and Haskell does not allow heterogeneous lists. We could try to work around this feature of Haskell and "make it fit", but as Brisbin points out, doing so would cause us to lose the advantages of using Haskell in the first place. The compiler wouldn't be able to find errors in the code.

The Haskell way to solve the problem is to replace the generic list of stuff we pass to renderAll with a new type. With a new Game type that composes a ball with two players, we are able to achieve several advantages at once:

  • create a polymorphic render method for Game that passes muster with the type checker
  • allow the type checker to ensure that this element of our program is correct
  • make the program easier to extend in a type-safe way
  • and, perhaps most importantly, express the intent of the program more clearly

It's this last win that jumped off the page for me. Creating a Game class would give us a better object-oriented design in his colleague's native language, too!

Students who become accustomed to programming in languages like Python and Ruby often come to rely on untyped lists, arrays, hashes, and tuples as their go-to collections. They are oh so handy, often the quickest route to a program that works on the small examples at hand. But those very handy data structures promote sloppy design, or at least enable it; they make it easy not to see very basic objects living in the code.

Who needs a Game class when a Python list or Ruby array works out of the box? I'll tell you: you do, as soon as you try to do almost anything else in your program. Otherwise, you begin working around the generality of the list or array, writing code to handle special cases that really aren't special cases at all. They are simply unbundled objects running wild in the program.
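
Here is a small sketch of the same idea in Python, the kind of language Brisbin's colleague was coming from. The names are my own invention, not code from the paper:

    class Ball:
        def render(self):
            print("drawing the ball")

    class Player:
        def __init__(self, name):
            self.name = name
        def render(self):
            print("drawing player", self.name)

    # Before: a bare, heterogeneous list; the "game" exists only in the caller's head.
    def render_all(items):
        for item in items:
            item.render()

    render_all([Ball(), Player("one"), Player("two")])

    # After: the concept gets a name and a home, and behavior has a place to grow.
    class Game:
        def __init__(self, ball, player_one, player_two):
            self.ball = ball
            self.players = [player_one, player_two]

        def render(self):
            self.ball.render()
            for player in self.players:
                player.render()

    Game(Ball(), Player("one"), Player("two")).render()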

Good design is good design. Most of the features of a good design transcend any particular programming style or language.

So: This paper is a great read! You can use it to learn better how to think like a Haskell programmer. And you can use it to learn even if thinking like a Haskell programmer is not your goal. I'm going to use it, or something like it, to help my students become better OO programmers.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

April 09, 2014 3:26 PM

Programming Everywhere, Vox Edition

In a report on the launch of Vox Media, we learn that the line between software developers and journalists at Vox is blurred, as writers and reporters work together "to build the tools they require".

"It is thrilling as a journalist being able to envision a tool and having it become a real thing," Mr. Topolsky said. "And it is rare."

It will be less rare in the future. Programming will become a natural part of more and more people's toolboxes.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

April 04, 2014 12:43 PM

Creative Recombination of Existing Ideas

In a post on his move to California, Jon Udell notes that he may be out of step with the dominant view of the tech industry there:

And I think differently about innovation than Silicon Valley does. I don't think we lack new ideas. I think we lack creative recombination of proven tech, and the execution and follow-through required to surface its latent value.

As he found with the Elm City project, sometimes a good idea doesn't get traction quickly, even with sustained effort. Calendar aggregation seems like such a win even for a university the size of mine, yet a lot of folks don't get it. It's hard to know whether the slowness results from the idea, the technology, or the general resistance of communities to change how they operate.

In any case, Udell is right: there is a lot of latent value in the "creative recombination" of existing ideas. Ours is a remix culture, too. That's why it's so important to study widely in and out of computing, to build the base of tools needed to have a great idea and execute on it.


Posted by Eugene Wallingford | Permalink | Categories: Computing

March 31, 2014 3:21 PM

Programming, Defined and Re-imagined

By Chris Granger of Light Table fame:

Programming is our way of encoding thought such that the computer can help us with it.

Read the whole piece, which recounts Granger's reflection after the Light Table project left him unsatisfied and he sought answers. He concludes that we need to re-center our idea of what programming is and how we can make it accessible to more people. Our current idea of programming doesn't scale because, well,

It turns out masochism is a hard sell.

Every teacher knows this. You can sell masochism to a few proud souls, but not to anyone else.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 12, 2014 3:55 PM

Not Content With Content

Last week, the Chronicle of Higher Ed ran an article on a new joint major at Stanford combining computer science and the humanities.

[Students] might compose music or write a short story and translate those works, through code, into something they can share on the web.

"For students it seems perfectly natural to have an interest in coding," [the program's director] said. "In one sense these fields might feel like they're far apart, but they're getting closer and closer."

The program works in both directions, by also engaging CS students in the societal issues created by ubiquitous networks and computing power.

We are doing something similar at my university. A few years ago, several departments began to collaborate on a multidisciplinary program called Interactive Digital Studies, which went live in 2012. In the IDS program, students complete a common core of courses from the Communication Studies department and then take "bundles" of coursework involving digital technology from at least two different disciplines. These areas of emphasis enable students to explore the interaction of computing with various topics in media, the humanities, and culture.

Like Stanford's new major, most of the coursework is designed to work at the intersection of disciplines, rather than pursuing disciplines independently, "in parallel".

The initial version of the computation bundle consists of an odd mix of application tools and opportunities to write programs. Now that the program is in place, we are finding that students and faculty alike desire more depth of understanding about programming and development. We are in the process of re-designing the bundle to prepare students to work in a world where so many ideas become web sites or apps, and in which data analytics plays an important role in understanding what people do.

Both our IDS program and Stanford's new major focus on something that we are seeing increasingly at universities these days: the intersections of digital technology and other disciplines, in particular the humanities. Computational tools make it possible for everyone to create more kinds of things, but only if people learn how to use new tools and think about their work in new ways.

Consider this passage by Jim O'Loughlin, a UNI English professor, from a recent position statement on the "digital turn" of the humanities:

We are increasingly unlikely to find writers who only provide content when the tools for photography, videography and digital design can all be found on our laptops or even on our phones. It is not simply that writers will need to do more. Writers will want to do more, because with a modest amount of effort they can be their own designers, photographers, publishers or even programmers.

Writers don't have to settle for producing "content" and then relying heavily on others to help bring the content to an audience. New tools enable writers to take greater control of putting their ideas before an audience. But...

... only if we [writers] are willing to think seriously not only about our ideas but about what tools we can use to bring our ideas to an audience.

More tools are within the reach of more people now than ever before. Computing makes that possible, not only for writers, but also for musicians and teachers and social scientists.

Going further, computer programming makes it possible to modify existing tools and to create new tools when the old ones are not sufficient. Writers, musicians, teachers, and social scientists may not want to program at that level, but they can participate in the process.

The critical link is preparation. This digital turn empowers only those who are prepared to think in new ways and to wield a new set of tools. Programs like our IDS major and Stanford's new joint major are among the many efforts hoping to spread the opportunities available now to a larger and more diverse set of people.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

March 08, 2014 10:18 AM

Sometimes a Fantasy

This week I saw a link to The Turing School of Software & Design, "a seven-month, full-time program for people who want to become professional developers". It reminded me of Neumont University, a ten-year-old school that offers a B.S. degree program in computer science that students can complete in two and a half years.

While riding the bike, I occasionally fantasize about doing something like this. With the economics of universities changing so quickly [ 1 | 2 ], there is an opportunity for a new kind of higher education. And there's something appealing about being able to work closely with a cadre of motivated students on the full spectrum of computer science and software development.

This could be an accelerated form of traditional CS instruction, without the distractions of other things, or it could be something different. Traditional university courses are pretty confining. "This course is about algorithms. That one is about programming languages." It would be fun to run a studio in which students serve as apprentices making real stuff, all of us learning as we go along.

A few years ago, one of our ChiliPLoP hot topic groups conducted a greenfield thought experiment to design an undergrad CS program outside of the constraints of any existing university structure. Student advancement was based on demonstrating professional competencies, not completing packaged courses. It was such an appealing idea! Of course, there was a lot of hard work to be done working out the details.

My view of university is still romantic, though. I like the idea of students engaging the great ideas of humanity that lie outside their major. These days, I think it's conceivable to include the humanities and other disciplines in a new kind of CS education. In a recent blog entry, Hollis Robbins floats the idea of Home College for the first year of a liberal arts education. The premise is that there are "thousands of qualified, trained, energetic, and underemployed Ph.D.s [...] struggling to find stable teaching jobs". Hiring a well-rounded tutor could be a lot less expensive than a year at a private college, and more lucrative for the tutor than adjuncting.

Maybe a new educational venture could offer more than targeted professional development in computing or software. Include a couple of humanities profs, maybe a social scientist, and it could offer a more complete undergraduate education -- one that is economical both in time and money.

But the core of my dream is going broad and deep in CS without the baggage of a university. Sometimes a fantasy is all you need. Other times...


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

March 01, 2014 11:35 AM

A Few Old Passages

I was looking over a couple of files of old notes and found several quotes that I still like, usually from articles I enjoyed as well. They haven't found their way into a blog entry yet, but they deserve to see the light of day.

Evidence, Please!

From a short note on the tendency even among scientists to believe unsubstantiated claims, both in and out of the professional context:

It's hard work, but I suspect the real challenge will lie in persuading working programmers to say "evidence, please" more often.

More programmers and computer scientists are trying to collect and understand data these days, but I'm not sure we've made much headway in getting programmers to ask for evidence.

Sometimes, Code Before Math

From a discussion of The Expectation Maximization Algorithm:

The code is a lot simpler to understand than the math, I think.

I often understand the language of code more quickly than the language of math. Reading, or even writing, a program sometimes helps me understand a new idea better than reading the math. Theory is, however, great for helping me to pin down what I have learned more formally.

Grin, Wave, Nod

From Iteration Inside and Out, a review of the many ways we loop over stuff in programs:

Right now, the Rubyists are grinning, the Smalltalkers are furiously waving their hands in the air to get the teacher's attention, and the Lispers are just nodding smugly in the back row (all as usual).

As a Big Fan of all three languages, I am occasionally conflicted. Grin? Wave? Nod? Look like the court jester by doing all three simultaneously?


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

February 21, 2014 3:35 PM

Sticking with a Good Idea

My algorithms students and I recently considered the classic problem of finding the closest pair of points in a set. Many of them were able to produce a typical brute-force approach, such as:

    minimum ← ∞
    for i ← 1 to n do
        for j ← i+1 to n do
            distance ← sqrt((x[i] - x[j])² + (y[i] - y[j])²)
            if distance < minimum then
               minimum ← distance
               first   ← i
               second  ← j
    return (first, second)

Alas, this is an O(n²) process, so we considered whether we might do better with a divide-and-conquer approach. It did not look promising at first, though: this problem doesn't seem to let us solve the sub-problems independently. What if the closest pair straddles the two partitions?

This is a common theme in computing, and problem solving more generally. We try a technique only to find that it doesn't quite work. Something doesn't fit, or a feature of the domain violates a requirement of the technique. It's tempting in such cases to give up and settle for something less.

Experienced problem solvers know not to give up too quickly. Many of the great advances in computing came under conditions just like this. Consider Leonard Kleinrock and the theory of packet switching.

In a Computing Conversations podcast published last year, Kleinrock talks about his Ph.D. research. He was working on the problem of how to support a lot of bursty network traffic on a shared connection. (You can read a summary of the podcast in an IEEE Computer column also published last year.)

His wonderful idea: apply the technique of time sharing from multi-user operating systems. The system could break all messages into "packets" of a fixed size, let messages take turns on the shared line, then reassemble each message on the receiving end. This would give every message a chance to move without making short messages wait too long behind large ones.

Thus was born the idea of packet switching. But there was a problem. Kleinrock says:

I set up this mathematical model and found it was analytically intractable. I had two choices: give up and find another problem to work on, or make an assumption that would allow me to move forward. So I introduced a mathematical assumption that cracked the problem wide open.

His "independence assumption" made it possible for him to complete his analysis and optimize the design of a packet-switching network. But an important question remained: Was his simplifying assumption too big a cheat? Did it skew the theoretical results in such a way that his model was no longer a reasonable approximation of how networks would behave in the real world?

Again, Kleinrock didn't give up. He wrote a program instead.

I had to write a program to simulate these networks with and without the assumption. ... I simulated many networks on the TX-2 computer at Lincoln Laboratories. I spent four months writing the simulation program. It was a 2,500-line assembly language program, and I wrote it all before debugging a single line of it. I knew if I didn't get that simulation right, I wouldn't get my dissertation.

High-stakes programming! In the end, Kleinrock was able to demonstrate that his analytical model was sufficiently close to real-world behavior that his design would work. Every one of us reaps the benefit of his persistence every day.

Sometimes, a good idea poses obstacles of its own. We should not let those obstacles beat us without a fight. Often, we just have to find a way to make it work.

This lesson applies quite nicely to using divide-and-conquer on the closest pairs problem. In this case, we don't make a simplifying assumption; we solve the sub-problem created by our approach:

After finding a candidate for the closest pair, we check to see if there is a closer pair straddling our partitions. The distance between the candidate points constrains the area we have to consider quite a bit, which makes the post-processing step feasible. The result is an O(n log n) algorithm that improves significantly on brute force.
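
For the curious, here is a compact Python sketch of the divide-and-conquer algorithm. It is my own rendering, not the version we developed in class, and it re-sorts the strip on each call; a textbook implementation pre-sorts the points by y-coordinate to guarantee the O(n log n) bound.

    import math

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def brute_force(points):
        best = (float("inf"), None)
        for i in range(len(points)):
            for j in range(i + 1, len(points)):
                best = min(best, (dist(points[i], points[j]), (points[i], points[j])))
        return best

    def closest_pair(points):
        return _closest(sorted(points))          # sort once by x-coordinate

    def _closest(pts):
        if len(pts) <= 3:
            return brute_force(pts)
        mid = len(pts) // 2
        midx = pts[mid][0]
        best = min(_closest(pts[:mid]), _closest(pts[mid:]))
        d = best[0]
        # Only points within d of the dividing line can straddle the partitions.
        strip = sorted((p for p in pts if abs(p[0] - midx) < d), key=lambda p: p[1])
        for i, p in enumerate(strip):
            for q in strip[i+1:i+8]:             # only a handful of neighbors need checking
                best = min(best, (dist(p, q), (p, q)))
        return best

    print(closest_pair([(0, 0), (5, 4), (1, 1), (9, 6), (2, 7)]))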

This algorithm, like packet switching, comes from sticking with a good idea and finding a way to make it work. This is a lesson every computer science student and novice programmer needs to learn.

There is a complementary lesson to be learned, of course: knowing when to give up on an idea and move on to something else. Experience helps us tell the two situations apart, though never with perfect accuracy. Sometimes, we just have to follow an idea long enough to know when it's time to move on.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

February 19, 2014 4:12 PM

Teaching for the Perplexed and the Traumatized

On the need for empathy when writing about math for the perplexed and the traumatized, Steven Strogatz says:

You have to help them love the questions.

Teachers learn this eventually. If students love the questions, they will do an amazing amount of work searching for answers.

Strogatz is writing about writing, but everything he says applies to teaching as well, especially teaching undergraduates and non-majors. If you teach only grad courses in a specialty area, you may be able to rely on the students to provide their own curiosity and energy. Otherwise having empathy, making connections, and providing Aha! moments are a big part of being successful in the classroom. Stories trump formal notation.

This semester, I've been trying a particular angle on achieving this trifecta of teaching goodness: I try to open every class session with a game or puzzle that the students might care about. From there, we delve into the details of algorithms and theoretical analysis. I plan to write more on this soon.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

February 02, 2014 5:19 PM

Things That Make Me Sigh

In a recent article unrelated to modern technology or the so-called STEM crisis, a journalist writes:

Apart from mathematics, which demands a high IQ, and science, which requires a distinct aptitude, the only thing that normal undergraduate schooling prepares a person for is... more schooling.

Sigh.

On the one hand, this seems to presume that one need neither a high IQ nor any particular aptitude to excel in any number of non-math and science disciplines.

On the other, it seems to say that if one does not have the requisite characteristics, which are limited to a lucky few, one need not bother with computer science, math or science. Best become a writer or go into public service, I guess.

I actually think that the author is being self-deprecating, at least in part, and that I'm probably reading too much into one sentence. It's really intended as a dismissive comment on our education system, the most effective outcome of which often seems to be students who are really good at school.

Unfortunately, the attitude expressed about math and science is far too prevalent, even in our universities. It demeans our non-scientists as well as our scientists and mathematicians. It also makes it even harder to convince younger students that, with a little work and practice, they can achieve satisfying lives and careers in technical disciplines.

Like I said, sigh.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 27, 2014 11:39 AM

The Polymath as Intellectual Polygamist

Carl Djerassi, quoted in The Last Days of the Polymath:

Nowadays people [who] are called polymaths are dabblers -- are dabblers in many different areas. I aspire to be an intellectual polygamist. And I deliberately use that metaphor to provoke with its sexual allusion and to point out the real difference to me between polygamy and promiscuity.

On this view, a dilettante is merely promiscuous, making no real commitment to any love interest. A polymath has many great loves, and loves them all deeply, if not equally.

We tend to look down on dilettantes, but they can perform a useful service. Sometimes, making a connection between two ideas at the right time and in the right place can help spur someone else to "go deep" with the idea. Even when that doesn't happen, dabbling can bring great personal joy and provide more substantial entertainment than a lot of pop culture.

Academics are among the people these days with a well-defined social opportunity to explore at least two areas deeply and seriously: their chosen discipline and teaching. This is perhaps the most compelling reason to desire a life in academia. It even offers a freedom to branch out into new areas later in one's career that is not so easily available to people who work in industry.

These days, it's hard to be a polymath even inside one's own discipline. To know all sub-areas of computer science, say, as well as the experts in those sub-areas is a daunting challenge. I think back to the effort my fellow students and I put in over the years that enabled us to take the Ph.D. qualifying exams in CS. I did quite well across the board, but even then I didn't understand operating systems or programming languages as well as experts in those areas. Many years later, despite continued reading and programming, the gap has only grown.

I share the vague sense of loss, expressed by the author of the article linked to above, of a time when one human could master multiple areas of discourse and make fundamental advances in several. We are certainly better off for collectively understanding the world so much better, but the result is a blow to a certain sort of individual mind and spirit.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

January 26, 2014 3:05 PM

One Reason We Need Computer Programs

Code bridges the gap between theory and data. From A few thoughts on code review of scientific code:

... there is a gulf of unknown size between the theory and the data. Code is what bridges that gap, and specifies how edge cases, weird features of the data, and unknown unknowns are handled or ignored.

I learned this lesson the hard way as a novice programmer. Other activities, such as writing and doing math, exhibit the same characteristic, but it wasn't until I started learning to program that the gap between theory and data really challenged me.

Since learning to program myself, I have observed hundreds of CS students encounter this gap. To their credit, they usually buckle down, work hard, and close the gap. Of course, we have to close the gap for every new problem we try to solve. The challenge doesn't go away; it simply becomes more manageable as we become better programmers.

In the passage above, Titus Brown is talking to his fellow scientists in biology and chemistry. I imagine that they encounter the gap between theory and data in a new and visceral way when they move into computational science. Programming has that power to change how we think.

There is an element of this, too, in how techies and non-techies alike sometimes lose track of how hard it is to create a successful start-up. You need an idea, you need a programmer, and you need a lot of hard work to bridge the gap between idea and executed idea.

Whether doing science or starting a company, the code teaches us a lot about our theory. The code makes our theory better.

As Ward Cunningham is fond of saying, it's all talk until the tests run.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

January 23, 2014 4:14 PM

The CS Mindset

Chad Orzel often blogs about the physics mindset, the default orientation that physicists tend to have toward the world, and the way they think about and solve problems. It is fun to read a scientist talking about doing science.

Earlier this week I finally read this article about the popularity of CS50, an intro CS course at Harvard. It's all about how Harvard "is righting an imbalance in academia" by finally teaching practical skills to its students. When I read:

"CS50 made me look at Harvard with new eyes," Guimaraes said.

That is a sea change from what Harvard represented to the outside world for decades: the guardian of a classic education, where the value of learning is for its own sake.

I sighed audibly, loud enough for the students on the neighboring exercise equipment to hear. A Harvard education used to be about learning only for its own sake, but now students can learn practical stuff, too. Even computer programming!

As I re-read the article now, I see that it's not as blunt as that throughout. Many Harvard students are learning computing because of the important role it plays in their futures, whatever their major, and they understand the value of understanding it better. But there are plenty of references to "practical ends" and Harvard's newfound willingness to teach practical skills it once considered beneath it.

Computer programming is surely one of those topics old Charles William Eliot would deem unworthy of inclusion in Harvard's curriculum.

I'm sensitive to such attitudes because I think computer science is and should be more. If you look deeper, you will see that the creators of CS50 think so, too. On its Should I take CS50? FAQ page, we find:

More than just teach you how to program, this course teaches you how to think more methodically and how to solve problems more effectively. As such, its lessons are applicable well beyond the boundaries of computer science itself.

The next two sentences muddy the water a bit, though:

That the course does teach you how to program, though, is perhaps its most empowering return. With this skill comes the ability to solve real-world problems in ways and at speeds beyond the abilities of most humans.

With this skill comes something else, something even more important: a discipline of thinking and a clarity of thought that are hard to attain when you learn "how to think more methodically and how to solve problems more effectively" in the abstract or while doing almost any other activity.

Later the same day, I was catching up on a discussion taking place on the PLT-EDU mailing list, which is populated by the creators, users, and fans of the Racket programming language and the CS curriculum designed in tandem with it. One poster offered an analogy for talking to HS students about how and why they are learning to program. A common theme in the discussion that ensued was to take the conversation off of the "vocational track". Why encourage students to settle for such a limiting view of what they are doing?

One snippet from Matthias Felleisen (this link works only if you are a member of the list) captured my dissatisfaction with the tone of the Globe article about CS50:

If we require K-12 students to take programming, it cannot be justified (on a moral basis) with 'all of you will become professional programmers.' I want MDs who know the design recipe, I want journalists who write their articles using the design recipe, and so on.

The "design recipe" is a thinking tool students learn in Felleisen "How to Design Programs" curriculum. It is a structured way to think about problems and to create solutions. Two essential ideas stand out for me:

  • Students learn the design recipe in the process of writing programs. This isn't an abstract exercise. Creating a working computer program is tangible evidence that a student has understood the problem and created a clear solution.
  • This way of thinking is valuable for everyone. We will all be better off if our doctors, lawyers, and journalists are able to think this way.

This is one of my favorite instantiations of the vague term computational thinking, which so many people use without much thought. It is a way of thinking about problems both abstractly and concretely, one that leads to solutions we have verified with tests.

You might call this the CS mindset. It is present in CS50 independent of any practical ends associated with tech start-ups and massaging spreadsheet data. It is practical on a higher plane. It is also present in the HtDP curriculum and especially in the Racket Way.

It is present in all good CS education, even the CS intro courses that more students should be taking -- even if they are only going to be doctors, lawyers, or journalists.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 16, 2014 4:22 PM

Another Take on "Know the Past"

Ted Nelson offers a particularly stark assessment of how well we fulfill our obligation to know the past in his eulogy for Douglas Engelbart:

To quote Joan of Arc, from Shaw's play about her: "When will the world be ready to receive its saints?"

I think we know the answer -- when they are dead, pasteurized and homogenized and simplified into stereotypes, and the true depth and integrity of their ideas and initiatives are forgotten.

Nelson's position is stronger yet, because he laments the way in which Engelbart and his visions of the power of computing were treated throughout his career. How, he wails, could we have let this vision slip through our hands while Engelbart lived among us?

Instead, we worked on another Java IDE or a glue language for object-relational mapping. All the while, as Nelson says, "the urgent and complex problems of mankind have only grown more urgent and more complex."

This teaching is difficult; who can accept it?


Posted by Eugene Wallingford | Permalink | Categories: Computing

January 08, 2014 3:06 PM

"I'm Not a Programmer"

In The Exceptional Beauty of Doom 3's Source Code, Shawn McGrath first says this:

I've never really cared about source code before. I don't really consider myself a 'programmer'.

Then he says this:

Dyad has 193k lines of code, all C++.

193,000 lines of C++? Um, dude, you're a programmer.

Even so, the point is worth thinking about. For most people, programming is a means to an end: a way to create something. Many CS students start with a dream program in mind and only later, like McGrath, come to appreciate code for its own sake. Some of our graduates never really get there, and appreciate programming mostly for what they can do with it.

If the message we send from academic CS is "come to us only if you already care about code for its own sake", then we may want to fix our message.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

January 07, 2014 4:09 PM

Know The Past

In this profile of computational geneticist Jason Moore, the scientist explains how his work draws on work from the 1910s, which may offer computational genetics a better path forward than the work that catapulted genetics forward in the 1990s and 2000s.

Yet despite his use of new media and advanced technology, Moore spends a lot of time thinking about the past. "We have a lot to learn from early geneticists," he says. "They were smart people who were really thinking deeply about the problem."

Today, he argues, genetics students spend too much time learning to use the newest equipment and too little time reading the old genetics literature. Not surprisingly, given his ambivalent attitude toward technology, Moore believes in the importance of history. "Historical context is so important for what we do," he says. "It provides a grounding, a foundation. You have to understand the history in order ... to understand your place in the science."

Anyone familiar with the history of computing knows there is another good reason to know your history: Sometimes, we dream too small these days, and settle for too little. We have a lot to learn from early computer scientists.

I intend to make this a point of emphasis in my algorithms course this spring. I'd like to expose students to important new ideas outside the traditional canon (more on that soon), while at the same time exposing them to some of the classic work that hasn't been topped.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 24, 2013 11:35 AM

Inverting the Relationship Between Programs and Literals

This chapter on quasi-literals in the E language tells the story of Scott Kim teaching the author that Apple's HyperCard was "powerful in a way most of us had not seen before". This unexpected power led many people to misunderstand its true importance.

Most programs are written as text, sequences of characters. In this model, a literal is a special form of embedded text. When the cursor is inside the quotes of a literal string, we are "effectively no longer in a program editor, but in a nested text editor". Kim calls this a 'pun': Instead of writing text to be evaluated by another program, we are creating output directly.

What if we turn things inside out and embed our program in literal text?

Hypercard is normally conceived of as primarily a visual application builder with an embedded silly programming language (Hypertalk). Think instead of a whole Hypercard stack as a mostly literal program. In addition to numbers and strings, the literals you can directly edit include bitmaps, forms, buttons, and menus. You can literally assemble most things, but where you need dynamic behavior, there's a hole in the literal filled in by a Hypertalk script. The program editor for this script is experienced as being nested inside the direct manipulation user interface editor.

There is a hole in the literal text, where a program goes, instead of a hole in the program, where literal text goes.

HyperTalk must surely have seemed strange to most programmers in 1987. Lisp programmers had long used macros and so knew the power of nesting code to be eval'ed inside of literal text. Of course, the resulting text was then passed on to eval to be treated again as a program!

The inside-out idea of HyperCard is alive today in the form of languages such as PHP, which embed code in HTML text:

    <body>
    <?php
      echo $_SERVER['HTTP_USER_AGENT'];
    ?>
    </body>

This is a different way to think about programming, one perhaps suitable for bringing experts in some domains toward the idea of writing code gradually from documents in their area of expertise.

I sometimes have the students in my compiler course implement a processor for a simple Mustache-like template language as an early warm-up homework assignment. I do not usually require them to go as far as Turing-complete embedded code, but they create a framework that makes it possible. I think I'll look for ways to bring more of this idea into the next offering of our more general course on programming languages.
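
As a flavor of that warm-up, here is a tiny Python sketch of such a processor. It handles only simple {{name}} substitution -- nothing close to the full assignment, let alone real Mustache:

    import re

    def render(template, context):
        """Replace each {{name}} in the template with its value from the context."""
        def substitute(match):
            key = match.group(1).strip()
            return str(context.get(key, ""))
        return re.sub(r"\{\{(.*?)\}\}", substitute, template)

    page = "<p>Hello, {{user}}! You have {{count}} new messages.</p>"
    print(render(page, {"user": "Eugene", "count": 3}))
    # <p>Hello, Eugene! You have 3 new messages.</p>

The literal document is primary; the program lives in the holes.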

(HyperCard really was more than many people realized at the time. The people who got it became Big Fans, and the program still has an ardent following. Check out this brief eulogy, which rhapsodizes on "the mystically-enchanting mantra" at the end of the application's About box: "A day of acquaintance, / And then the longer span of custom. / But first -- / The hour of astonishment.")


Posted by Eugene Wallingford | Permalink | Categories: Computing

December 16, 2013 2:20 PM

More Fun with Integer "Assembly Language": Brute-Forcing a Function Minimum

Or: Irrational Exuberance When Programming

My wife and daughter laughed at me yesterday.

A few years ago, I blogged about implementing Farey sequences in Klein, a language for which my students at the time were writing a compiler. Klein was a minimal functional language with few control structures, few data types, and few built-in operations. Computing rational approximations using Farey's algorithm was a challenge in Klein that I likened to "integer assembly programming".

I clearly had a lot of fun with that challenge, especially when I had the chance to watch my program run using my students' compilers.

This semester, I am again teaching the compiler course, and my students are writing a compiler for a new version of Klein.

Last week, while helping my daughter with a little calculus, I ran across a fun new problem to solve in Klein:

the task of optimizing cost across the river

There are two stations on opposite sides of a river. The river is 3 miles wide, and the stations are 5 miles apart along the river. We need to lay pipe between the stations. Pipe laid on land costs $2.00/foot, and pipe laid across the river costs $4.00/foot. What is the minimum cost of the project?

This is the sort of optimization problem one often encounters in calculus textbooks. The student gets to construct a couple of functions, differentiate one, and find a maximum or minimum by setting f' to 0 and solving.

Solving this problem in Klein creates some challenges. Among them are that ideally it involves real numbers, which Klein doesn't support, and that it requires a square root function, which Klein doesn't have. But these obstacles are surmountable. We already have tools for computing roots using Newton's method in our collection of test programs. Over a 3mi-by-5mi grid, an epsilon of a few feet approximates square roots reasonably well.

My daughter's task was to use the derivative of the cost function but, after talking about the problem with her, I was interested more in "visualizing" the curve to see how the cost drops as one moves in from either end and eventually bottoms out for a particular length of pipe on land.

So I wrote a Klein program that "brute-forces" the minimum. It loops over all possible values in feet for land pipe and compares the cost at each value to the previous value. It's easy to fake such a loop with a recursive function call.

The programmer's challenge in writing this program is that Klein has no local variables other than function parameters. So I had to use helper functions to simulate caching temporary variables. This allowed me to give a name to a value, which makes the code more readable, but most importantly it allowed me to avoid having to recompute expensive values in what was already a computationally-expensive program.
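
For readers who just want to see the shape of the computation, here is a rough Python translation of the idea -- not my Klein program -- with an integer Newton's-method square root standing in for the missing sqrt:

    def isqrt(n):
        """Integer square root via Newton's method; plenty accurate at the scale of feet."""
        x = n
        while x * x > n:
            x = (x + n // x) // 2
        return x

    RIVER = 3 * 5280     # width of the river, in feet
    SHORE = 5 * 5280     # distance between the stations along the river, in feet

    def cost(land):
        """Cost of laying `land` feet of pipe on shore, then crossing the river diagonally."""
        across = isqrt((SHORE - land) ** 2 + RIVER ** 2)
        return 2 * land + 4 * across

    best = min(range(SHORE + 1), key=cost)
    print("lay roughly", best, "feet of pipe on land, for a cost of about $", cost(best))

The Klein version has to express that loop as a chain of tail-recursive calls, which is where the real fun begins.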

This approach creates another, even bigger challenge for my students, the compiler writers. My Klein program is naturally tail recursive, but tail call elimination was left as an optional optimization in our class project. With activation records for all the tail calls stored on the stack, a compiler has to use a lot of space for its run-time memory -- far more than is available on our default target machine.

How many frames do we need? Well, we need to compute the cost at every foot along the 5-mile stretch between the stations (5 miles x 5280 feet/mile), for a total of 26,400 data points. There will, of course, be other activation records while computing the last value in the loop.

Will I be able to see the answer generated by my program using my students' compilers? Only if one or more of the teams optimized tail calls away. We'll see soon enough.

So, I spent an hour or so writing Klein code and tinkering with it yesterday afternoon. I was so excited by the time I finished that I ran upstairs to tell my wife and daughter all about it: my excitement at having written the code, and the challenge it sets for my students' compilers, and how we could compute reasonable approximations of square roots of large integers even without real numbers, and how I implemented Newton's method in lieu of a sqrt, and...

That's when my wife and daughter laughed at me.

That's okay. I am a programmer. I am still excited, and I'd do it again.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

December 11, 2013 12:01 PM

"Costs $20" is Functionally Indistinguishable from Gone

In his write-up on the origin of zero-based indexing in computing, Mike Hoye comments on the difficulties he had tracking down original sources:

Part of the problem is access to the historical record, of course. I was in favor of Open Access publication before, but writing this up has cemented it: if you're on the outside edge of academia, $20/paper for any research that doesn't have a business case and a deep-pocketed backer is completely untenable, and speculative or historic research that might require reading dozens of papers to shed some light on longstanding questions is basically impossible. There might have been a time when this was OK and everyone who had access to or cared about computers was already an IEEE/ACM member, but right now the IEEE -- both as a knowledge repository and a social network -- is a single point of a lot of silent failure. "$20 for a forty-year-old research paper" is functionally indistinguishable from "gone", and I'm reduced to emailing retirees to ask them what they remember from a lifetime ago because I can't afford to read the source material.

I'm an academic. When I am on campus, I have access to the ACM Digital Library. When I go home, I do not. I could pay for a personal subscription, but that seems an unnecessary expense when I am on campus so much.

I never have access to IEEE Xplore, Hoye's "single point of silent failure". Our university library chose to drop its institutional subscription a few years ago, and for good reason: it is ridiculously expensive, especially relative to the value we receive from it university-wide. (We don't have an engineering college.) We inquired about sharing a subscription with our sister schools, as we are legally under a single umbrella, but at least at that time, IEEE didn't allow such sharing.

What about non-academics, such as Hoye? We are blessed in computing with innumerable practitioners who study our history, write about it, and create new ideas. Some are in industry and may have access to these resources, or an expense account. Many others, though, work on their own as independent contractors and researchers. They need access to materials, and $20 a pop is not an acceptable expense.

Their loss is our loss. If Hoye had not written his article on the history of zero-based indexing, most of us wouldn't know the full story.

As time goes by, I hope that open access to research publications continues to grow. We really shouldn't have to badger retired computer scientists with email asking what they remember now about a topic they wrote an authoritative paper on forty years ago.


Posted by Eugene Wallingford | Permalink | Categories: Computing

December 10, 2013 3:33 PM

Your Programming Language is Your Raw Material, Too

Recently someone I know retweeted this familiar sentiment:

If carpenters were hired like programmers:
"Must have at least 5 years experience with the Dewalt 18V 165mm Circular Saw"

This meme travels around the world in various forms all the time, and every so often it shows up in one of my inboxes. And every time I think, "There is more to the story."

In one sense, the meme reflects a real problem in the software world. Job ads often use lists of programming languages and technologies as requirements, when what the company presumably really wants is a competent developer. I may not know the particular technologies on your list, or be expert in them, but if I am an experienced developer I will be able to learn them and become an expert.

Understanding and skill run deeper than a surface list of tools.

But. A programming language is not just a tool. It is a building material, too.

Suppose that a carpenter uses a Dewalt 18V 165mm circular saw to add a room to your house. When he finishes the project and leaves your employ, you won't have any trace of the Dewalt in his work product. You will have a new room.

He might have used another brand of circular saw. He may not have used a power tool at all, preferring the fine craftsmanship of a handsaw. Maybe he used no saw of any kind. (What a magician!) You will still have the same new room regardless, and your life will proceed in the very same way.

Now suppose that a programmer uses the Java programming language to add a software module to your accounting system. When she finishes the project and leaves your employ, you will have the results of running her code, for sure. But you will have a trace of Java in her work product. You will have a new Java program.

If you intend to use the program again, to generate a new report from new input data, you will need an instance of the JVM to run it. If you want to modify the program to work differently, then you will also need a Java compiler to create the byte codes that run in the JVM. If you want to extend the program to do more, then you again will need a Java compiler and interpreter.

Programs are themselves tools, and we use programming languages to build them. So, while the language itself is surely a tool at one level, at another level it is the raw material out of which we create other things.

To use a particular language is to introduce a slew of other dependencies to the overall process: compilers, interpreters, libraries, and sometimes even machine architectures. In the general case, to use a particular language is to commit at least some part of the company's future attention to both the language and its attendant tooling.

So, while I am sympathetic to the sentiment behind our recurring meme, I think it's important to remember that a programming language is more than just a particular brand of power tool. It is the stuff programs are made of.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

December 04, 2013 3:14 PM

Agile Moments, "Why We Test" Edition

Case 1: Big Programs.

This blog entry tells the sad story of a computational biologist who had to retract six published articles. Why? Their conclusions depended on the output of a computer program, and that program contained a critical error. The writer of the entry, who is not the researcher in question, concludes:

What this should flag is the necessity to aggressively test all the software that you write.

Actually, you should have tests for any program you use to draw important conclusions, whether you wrote it or not. The same blog entry mentions that a grad student in the author's previous lab had found several bugs in a molecular dynamics program used by many computational biologists. How many published results were affected before they were found?

Case 2: Small Scripts.

Titus Brown reports finding bugs every time he reused one of his Python scripts. Yet:

Did I start doing any kind of automated testing of my scripts? Hell no! Anyone who wants to write automated tests for all their little scriptlets is, frankly, insane. But this was one of the two catalysts that made me personally own up to the idea that most of my code was probably somewhat wrong.

Most of my code has bugs but, hey, why write tests?

Didn't a famous scientist define insanity as doing the same thing over and over but expecting different results?

I consider myself insane, too, but mostly because I don't write tests often enough for my small scripts. We say to ourselves that we'll never reuse them, so we don't need tests. But we don't throw them away, and then we do reuse them, perhaps with a tweak here or there.

We all face time constraints. When we run a script the first time, we may well pay enough attention to the output that we are confident it is correct. But perhaps we can all agree that the second time we use a script, we should write tests for it if we don't already have them.

There are only three numbers in computing, 0, 1, and many. The second time we use a program is a sign from the universe that we need the added confidence provided by tests.
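As a concrete example of how little "the second time" has to cost, here is a toy script of my own invention -- the names and the task are made up -- that carries a couple of assertions along with it:

    # wordcount.py -- a hypothetical little script, with just enough tests
    # that I can trust it the second (and tenth) time I reach for it.
    import sys
    from collections import Counter

    def count_words(text):
        return Counter(text.split())

    def test_count_words():
        assert count_words("the cat and the hat")["the"] == 2
        assert count_words("") == Counter()

    if __name__ == "__main__":
        test_count_words()                    # cheap insurance on every run
        for word, n in count_words(sys.stdin.read()).most_common(10):
            print(word, n)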

To be fair, Brown goes on to offer some good advice, such as writing tests for code after you find a bug in it. His article is an interesting read, as is almost everything he writes about computation and science.

Case 3: The Disappointing Trade-Off.

Then there's this classic from Jamie Zawinski, as quoted in Coders at Work:

I hope I don't sound like I'm saying, "Testing is for chumps." It's not. It's a matter of priorities. Are you trying to write good software or are you trying to be done by next week? You can't do both.

Sigh. If you don't have good software by next week, maybe you aren't done yet.

I understand that the real world imposes constraints on us, and that sometimes worse is better. Good enough is good enough, and we rarely need a perfect program. I also understand that Zawinski was trying to be fair to the idea of testing, and that he was surely producing good enough code before releasing.

Even still, the pervasive attitude that we can either write good programs or get done on time, but not both, makes me sad. I hope that we can do better.

And I'm betting that the computational biologist referred to in Case 1 wishes he had had some tests to catch the simple error that undermined five years' worth of research.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 30, 2013 9:45 AM

The Magic at the Heart of AI

This paragraph from The Man Who Would Teach Machines to Think expresses a bit of my uneasiness with the world of AI these days:

As our machines get faster and ingest more data, we allow ourselves to be dumber. Instead of wrestling with our hardest problems in earnest, we can just plug in billions of examples of them. Which is a bit like using a graphing calculator to do your high-school calculus homework -- it works great until you need to actually understand calculus.

I understand the desire to solve real problems and the resulting desire to apply opaque mathematics to large data sets. Like most everyone, I revel in what Google can do for me and watch in awe when Watson defeats the best human Jeopardy! players ever. But for me, artificial intelligence was about more than just getting the job done.

Over the years I taught AI, my students often wanted to study neural networks in much greater depth than my class tended to go. But I was more interested in approaches to AI and learning that worked at a more conceptual level. Often we could find a happy middle ground while studying genetic algorithms, which afforded them the magic of something-for-nothing and afforded me the potential for studying ideas as they evolved over time.

(Maybe my students were simply exhibiting Astrachan's Law.)

When I said goodbye to AAAI a few years ago, I mentioned Hofstadter's work as one of my early inspirations -- Gödel, Escher, Bach and the idea of self-reference, with its "intertwining worlds of music, art, mathematics, and computers". That entry said I was leaving AAAI because my own work had moved in a different direction. But it left unstated a second truth, which The Man Who Would Teach Machines to Think asserts as Hofstadter's own reason for working off the common path: the world of AI had moved in a different direction, too.

For me, as for Hofstadter, AI has always meant more than engineering a solution. It was about understanding scientifically something that seemed magical, something that is both deeply personal and undeniably universal to human experience, about how human consciousness seems to work. My interest in AI will always lie there.

~~~~~

If you enjoy the article about Hofstadter and his work linked above, perhaps you will enjoy a couple of entries I wrote after he visited my university last year:


Posted by Eugene Wallingford | Permalink | Categories: Computing

November 24, 2013 10:54 AM

Teaching Algorithms in 2014

This spring, I will be teaching the undergraduate algorithms course for the first time in nine years, since the semester before I became department head. I enjoy this course. It gives both the students and me opportunities to do a little theory, a little design, and a little programming. I also like to have some fun, using what we learn to play games and solve puzzles.

Nine years is a long time in computing, even in an area grounded in well-developed theory. I will need to teach a different sort of course. At the end of this entry, I ask for your help in meeting this challenge.

Algorithms textbooks don't look much different now than they did in the spring of 2005. Long-time readers of this blog know that I face the existential crisis of selecting a textbook nearly every semester. Picking a textbook requires balancing several forces, including the value it gives to the instructor, the value it gives to the student during and after the course, and its increasing expense to students.

My primary focus in these decisions is always on net value to the students. I like to write my own material anyway. When time permits, I'd rather write as much as I can for students to read than outsource that responsibility (and privilege) to a textbook author. Writing my lecture notes in depth lets me weave a lot of different threads together, including pointers into primary and secondary sources. Students benefit from learning to read non-textbook material, the sort they will encounter throughout their careers.

My spring class brings a new wrinkle to the decision, though. Nearly fifty students are enrolled, with the prospect of a few more to come. This is a much larger group than I usually work with, and large classes carry a different set of risks than smaller courses. In particular, when something goes wrong in a small section, it is easier to recover through one-on-one remediation. That option is not so readily available for a fifty-person course.

There is more risk in writing new lecture material than in using a textbook that has been tested over time. A solid textbook can be a security blanket as much for the instructor as for the student. I'm not too keen on selecting a security blanket for myself, but the predictability of a text is tempting. There is one possible consolation in such a choice: perhaps subordinating my creative impulses to the design of someone else's textbook will make me more creative as a result.

But textbook selection is a fairly ordinary challenge for me. The real question is: Which algorithms should we teach in this course, circa 2014? Surely the rise of big data, multi-core processors, mobile computing, and social networking require a fresh look at the topics we teach undergrads.

Perhaps we need only adjust the balance of topics that we currently teach. Or maybe we need to add a new algorithm or data structure to the undergraduate canon. If we teach a new algorithm, or a new class of algorithms, which standard material should be de-emphasized, or displaced altogether? (Alas, the semester is still but fifteen weeks long.)

Please send me your suggestions! I will write up a summary of the ideas you share, and I will certainly use your suggestions to design a better algorithms course for my students.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 19, 2013 4:49 PM

First Model, Then Improve

Not long ago, I read Unhappy Truckers and Other Algorithmic Problems, an article by Tom Vanderbilt that looks at efforts to optimize delivery schedules at UPS and similar companies. At the heart of the challenge lies the traveling salesman problem. However, in practice, the challenge brings companies face-to-face with a bevy of human issues, from personal to social, psychological to economic. As a result, solving this TSP is more complex than what we see in the algorithms courses we take in our CS programs.

Yet, in the face of challenges both computational and human, the human planners working at these companies do a pretty good job. How? Over the course of time, researchers figured out that finding optimal routes shouldn't be their main goal:

"Our objective wasn't to get the best solution," says Ted Gifford, a longtime operations research specialist at Schneider. "Our objective was to try to simulate what the real world planners were really doing."

This is a lesson I learned the hard way, too, back in graduate school, when my advisor's lab was trying to build knowledge-based systems for real clients, in chemical engineering, aeronautics, business, and other domains. We were working with real people who were solving hard problems under serious constraints.

At the beginning I was a typically naive programmer, armed with fancy AI techniques and unbounded enthusiasm. I soon learned that, if you walk into a workplace and propose to solve all the people's problems with a program, things don't go as smoothly as the programmer might hope.

First of all, this impolitic approach generally creates immediate pushback. These are people, with personal investment in the way things work now. They tend to bristle when a 20-something grad student walks in the door promoting the wonder drug for all their ills. Some might even fear that you are right, and success for your program will mean negative consequences for them personally. We see this dynamic in Vanderbilt's article.

There's a deeper reason that things don't go so smoothly, though, and it's the real lesson of Vanderbilt's piece. Until you implement the existing solution to the problem, you don't really understand the problem yet.

These problems are complex, often with many more constraints than typical theoretical solutions have dealt with. The humans solving the problem often have many years of experience contributing to their approach. They have deep knowledge of the domain, but also repeated exposure to the exceptions and edge cases that sometimes confound theoretical solutions. They use heuristics that are hard to tease apart or articulate.

I learned that it's easy to solve a problem if you are solving the wrong one.

A better way to approach these challenges is: First, model the existing system, including the extant solution. Then, look for ways to improve on the solution.

This approach often gives everyone involved greater confidence that the programmers understand -- and so are solving -- the right problem. It also enables the team to make small, incremental changes to the system, with a correspondingly higher probability of success. Together, these two outcomes greatly increase the chance of human buy-in from the current workers. This makes it easier for the whole team to recognize the need for larger-scale changes to the process, and to support and contribute to an improved solution.

Vanderbilt tells a similarly pragmatic story. He writes:

When I suggest to Gifford that he's trying to understand the real world, mathematically, he concurs, but adds: "The word 'understand' is too strong--we are happy to get positive outcomes."

Positive outcomes are what the company wants. Fortunately for the academics who work on such problems in industry, achieving good outcomes is often an effective way to test theories, encounter their shortcomings, and work on improvements. That, too, is something I learned in grad school. It was a valuable lesson.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 14, 2013 2:55 PM

Toward A New Data Science Culture in Academia

Fernando Perez has a nice write-up, An Ambitious Experiment in Data Science, describing a well-funded new project in which teams at UC Berkeley, the University of Washington, and NYU will collaborate to "change the culture of universities to create a data science culture". A lot of people have been quoting Perez's entry for its colorful assessment of academic incentives and reward structures. I like this piece for the way Perez defines and outlines the problem, in terms of both data science across disciplines and academic culture in general.

For example:

Most scientists are taught to treat computation as an afterthought. Similarly, most methodologists are taught to treat applications as an afterthought.

Methodologists here includes computer scientists, who are often more interested in new data structures, algorithms, and protocols.

This "mirror" disconnect is a problem for a reason many people already understand well:

Computation and data skills are all of a sudden everybody's problem.

(Here are a few past entries of mine that talk about how programming and the nebulous "computational thinking" have spread far and wide: 1 | 2 | 3 | 4.)

Perez rightly points out that open-source software, while imperfect, often embodies the principles of science and scientific collaboration better than the academy does. It will be interesting to see how well this data science project can inject OSS attitudes into big research universities.

He is concerned because, as I have noted before, universities are, as a whole, a conservative lot. Perez says this in a much more entertaining way:

There are few organizations more proud of their traditions and more resistant to change than universities (churches and armies might be worse, but that's about it).

I think he gives churches and armies more credit than they deserve.

The good news is that experiments of the sort being conducted in the Berkeley/UW/NYU project are springing up on a smaller scale around the world. There is some hope for big change in academic culture if a lot of different people at a lot of different institutions experiment, learn, and create small changes that can grow together as they bump into one another.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

October 29, 2013 3:49 PM

My PLoP 2013 Retrospective

wrapper of the Plopp candy bar I received from Rebecca Rikner

PLoP 2013 was as much fun and as invigorating as I had hoped it would be. I hadn't attended in eight years, but it didn't take long to fall back into the rhythm of writers' workshops interspersed among invited talks, focus group sessions, BoFs, mingling, and (yes) games.

I was lead moderator for Workshop 010010, which consisted of pedagogical patterns papers. The focus of all of them was interactivity, whether among students building LEGO Mindstorms robots or among students and instructor on creative projects. The idea of the instructor as an active participant, even "generative" in the sense meant by Christopher Alexander, dominated our discussion. I look forward to seeing published versions of the papers we discussed.

The other featured events included invited talks by Jenny Quillien and Ward Cunningham and a 20-year retrospective panel featuring people who were present at the beginning of PLoP, the Hillside Group, and software patterns.

Quillien spent six years working with Alexander during the years he created The Nature of Order. Her talk shared some of the ways that Alexander was disappointed in the effect his seminal "A Pattern Language" had on the world, both as a result of people misunderstanding it and as a result of the book's inherent faults. Along the way, she tried to give pragmatic advice to people trying to document patterns of software. I may try to write up some of her thoughts, and some of my own in response, in the coming weeks.

Cunningham presented his latest work on federated wiki, the notion of multiple, individual wikis "federated" in relationships that share and present information for a common good. Unlike the original wiki, in which collaboration happened in a common document, federated wiki has a fork button on every page. Anyone can copy, modify, and share pages, which are then visible to everyone and available for merging back into the home wikis.

the favicon for my federated wiki on Ward's server

Ward set me up with a wiki in the federation on his server before I left on Saturday. I want to play with it a bit before I say much more than this: Federated wiki could change how communities share and collaborate in much the same way that wiki did.

I also had the pleasure of participating in one other structured activity while at PLoP. Takashi Iba and his students at Keio University in Japan are making a documentary about the history of the patterns community. Takashi invited me to sit for an interview about pedagogical patterns and their history within the development of software patterns. I was happy to help. It was a fun challenge to explain my understanding of what a pattern language is, and to think about what my colleagues and I struggled with in trying to create small pattern languages to guide instruction. Of course, I strayed off to the topic of elementary patterns as well, and that led to more interesting discussion with Takashi. I look forward to seeing their film in the coming years.

More so than even other conferences, unstructured activity plays a huge role in any PLoP conference. I skipped a few meals so that I could walk the extensive gardens and grounds of Allerton Park (and also so that I would not gain maximum pounds from the plentiful and tasty meals that were served). I caught up with old friends such as Ward, Kyle Brown, Bob Hanmer, Ralph Johnson, and made too many new friends to mention here. All the conversation had my mind swirling with new projects and old... Forefront in my mind is exploring again the idea of design and implementation patterns of functional programming. The time is still right, and I want to help.

Now, to write my last entry or two from StrangeLoop...

~~~~

Image 1. A photo of the wrapper of a Plopp candy bar, which I received as a gift from Rebecca Rikner. PLoP has a gifting tradition, and I received a box full of cool tools, toys, mementoes, and candy. Plopp is a Swedish candy bar, which made it a natural gift for Rebecca to share from her native land. (It was tasty, too!)

Image 2. The favicon for my federated wiki on Ward's server, eugene.fed.wiki.org. I like the color scheme that fed.wiki.org gave me -- and I'm glad to be early enough an adopter that I could claim my first name as the name of my wiki. The rest of the Eugenes in the world will have to settle for suffix numbers and all the other contortions that come with arriving late to the dance.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

October 19, 2013 7:38 AM

The Proto Interpreter for J

(Update: Josh Grams took my comment about needing a week of work to grok this code as a challenge. He figured it out much more quickly than that and wrote up an annotated version of the program as he went along.)

I like finding and reading about early interpreters for programming languages, such as the first Lisp interpreter or Smalltalk-71, which grew out of a one-page proof of concept written by Dan Ingalls on a bet with Alan Kay.

So I was quite happy recently to run across Appendix A from An Implementation of J, from which comes the following code. Arthur Whitney whipped up this one-page interpreter fragment for the AT&T 3B1 one weekend in 1989 to demonstrate his idea for a new APL-like language. Roger Hui studied this interpreter for a week before writing the first implementation of J.

typedef char C;typedef long I;
typedef struct a{I t,r,d[3],p[2];}*A;
#define P printf
#define R return
#define V1(f) A f(w)A w;
#define V2(f) A f(a,w)A a,w;
#define DO(n,x) {I i=0,_n=(n);for(;i<_n;++i){x;}}
I *ma(n){R(I*)malloc(n*4);}mv(d,s,n)I *d,*s;{DO(n,d[i]=s[i]);}
tr(r,d)I *d;{I z=1;DO(r,z=z*d[i]);R z;}
A ga(t,r,d)I *d;{A z=(A)ma(5+tr(r,d));z->t=t,z->r=r,mv(z->d,d,r);
 R z;}
V1(iota){I n=*w->p;A z=ga(0,1,&n);DO(n,z->p[i]=i);R z;}
V2(plus){I r=w->r,*d=w->d,n=tr(r,d);A z=ga(0,r,d);
 DO(n,z->p[i]=a->p[i]+w->p[i]);R z;}
V2(from){I r=w->r-1,*d=w->d+1,n=tr(r,d);
 A z=ga(w->t,r,d);mv(z->p,w->p+(n**a->p),n);R z;}
V1(box){A z=ga(1,0,0);*z->p=(I)w;R z;}
V2(cat){I an=tr(a->r,a->d),wn=tr(w->r,w->d),n=an+wn;
 A z=ga(w->t,1,&n);mv(z->p,a->p,an);mv(z->p+an,w->p,wn);R z;}
V2(find){}
V2(rsh){I r=a->r?*a->d:1,n=tr(r,a->p),wn=tr(w->r,w->d);
 A z=ga(w->t,r,a->p);mv(z->p,w->p,wn=n>wn?wn:n);
 if(n-=wn)mv(z->p+wn,z->p,n);R z;}
V1(sha){A z=ga(0,1,&w->r);mv(z->p,w->d,w->r);R z;}
V1(id){R w;}V1(size){A z=ga(0,0,0);*z->p=w->r?*w->d:1;R z;}
pi(i){P("%d ",i);}nl(){P("\n");}
pr(w)A w;{I r=w->r,*d=w->d,n=tr(r,d);DO(r,pi(d[i]));nl();
 if(w->t)DO(n,P("< ");pr(w->p[i]))else DO(n,pi(w->p[i]));nl();}

C vt[]="+{~<#,";
A(*vd[])()={0,plus,from,find,0,rsh,cat},
 (*vm[])()={0,id,size,iota,box,sha,0};
I st[26]; qp(a){R a>='a'&&a<='z';}qv(a){R a<'a';}
A ex(e)I *e;{I a=*e;
 if(qp(a)){if(e[1]=='=')R st[a-'a']=ex(e+2);a= st[ a-'a'];}
 R qv(a)?(*vm[a])(ex(e+1)):e[1]?(*vd[e[1]])(a,ex(e+2)):(A)a;}
noun(c){A z;if(c<'0'||c>'9')R 0;z=ga(0,0,0);*z->p=c-'0';R z;}
verb(c){I i=0;for(;vt[i];)if(vt[i++]==c)R i;R 0;}
I *wd(s)C *s;{I a,n=strlen(s),*e=ma(n+1);C c;
 DO(n,e[i]=(a=noun(c=s[i]))?a:(a=verb(c))?a:c);e[n]=0;R e;}

main(){C s[99];while(gets(s))pr(ex(wd(s)));}

I think it will take me a week of hard work to grok this code, too. Whitney's unusually spare APL-like C programming style is an object worthy of study in its own right.

By the way, Hui's Appendix A bears the subtitle Incunabulum, a word that means a work of art or of industry of an early period. So, I not only discovered a new bit of code this week; I also learned a cool new word. That's a good week.


Posted by Eugene Wallingford | Permalink | Categories: Computing

October 16, 2013 11:38 AM

Poetry as a Metaphor for Software

I was reading Roger Hui's Remembering Ken Iverson this morning on the elliptical, and it reminded me of this passage from A Conversation with Arthur Whitney. Whitney is a long-time APL guru and the creator of the A, K, and Q programming languages. The interviewer is Bryan Cantrill.

BC: Software has often been compared with civil engineering, but I'm really sick of people describing software as being like a bridge. What do you think the analog for software is?

AW: Poetry.

BC: Poetry captures the aesthetics, but not the precision.

AW: I don't know, maybe it does.

A poet's use of language is quite precise. It must balance forces in many dimensions, including sound, shape, denotation, and connotation. Whitney seems to understand this. Richard Gabriel must be proud.

Brevity is a value in the APL world. Whitney must have a similar preference for short language names. I don't know the source of his names A, K, and Q, but I like Hui's explanation of where J's name came from:

... on Sunday, August 27, 1989, at about four o'clock in the afternoon, [I] wrote the first line of code that became the implementation described in this document.

The name "J" was chosen a few minutes later, when it became necessary to save the interpreter source file for the first time.

Beautiful. No messing around with branding. Gotta save my file.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 11, 2013 1:42 PM

The Tang of Adventure, and a Lively Appreciation

"After you've learned the twelve times table," John Yarnelle asks, "what else is there to do?"

The concepts of modern mathematics give the student something else to do in great abundance and variety at all levels of his development. Not only may he discover unusual types of mathematical structures where, believe it or not, two and two does not equal four, but he may even be privileged to invent a new system virtually on his own. Far from a sense of stagnation, there is the tang of adventure, the challenge of exploration; perhaps also a livelier appreciation of the true nature of mathematical activity and mathematical thought.

Not only the tang of adventure; students might also come to appreciate what math really is. That's an admirable goal for any book or teacher.

This passage comes from Yarnelle's Finite Mathematical Structures, a 1964 paperback that teaches fields, groups, and algebras with the prose of a delighted teacher. I picked this slender, 66-page gem up off a pile of books being discarded by a retired math professor a decade ago. How glad I am that none of the math profs who walked past that pile bothered to claim it before I happened by.

We could use a few CS books like this, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 07, 2013 12:07 PM

StrangeLoop: Exercises in Programming Style

[My notes on StrangeLoop 2013: Table of Contents]

Crista Lopes

I had been looking forward to Crista Lopes's StrangeLoop talk since May, so I made sure I was in the room well before the scheduled time. I even had a copy of the trigger book in my bag.

Crista opened with something that CS instructors have learned the hard way: Teaching programming style is difficult and takes a lot of time. As a result, it's often not done at all in our courses. But so many of our graduates go into software development for their careers, where they come into contact with many different styles. How can they understand them -- well, quickly, or at all?

To many people, style is merely the appearance of code on the screen or the printed page. But it's not. It's more, and something entirely different. Style is a constraint. Lopes used images of a few stylistic paintings to illustrate the idea. If an artist limits herself to pointillism or cubism, how can she express important ideas? How does the style limit the message, or enhance it?

But we know this is true of programming as well. The idea has been a theme in my teaching for many years. I occasionally write about the role of constraints in programming here, including Patterns as a Source of Freedom, a few programming challenges, and a polymorphism challenge that I've run as a workshop.

Lopes pointed to a more universal example, though: the canonical The Elements of Programming Style. Drawing on this book and other work in software, she said that programming style ...

  • is a way to express tasks
  • exists at all scales
  • recurs at multiple scales
  • is codified in programming language

For me, the last bullet ties back most directly to the idea of style as constraint. A language makes some things easier to express than others. It can also make some things harder to express. There is a spectrum, of course. For example, some OO languages make it easy to create and use objects; others make it hard to do anything else! But the language is an enabler and enforcer of style. It is a proxy for style as a constraint on the programmer.

Back to the talk. Lopes asked, Why is it so important that we understand programming style? First, a style provides the reader with a frame of reference and a vocabulary. Knowing different styles makes us more effective consumers of code. Second, one style can be more appropriate for a given problem or context than another style. So, knowing different styles makes us more effective producers of code. (Lopes did not use the producer-consumer distinction in the talk, but it seems to me a nice way to crystallize her idea.)

the cover of Raymond Queneau's Exercises in Style

Then, Lopes said, she came across Raymond Queneau's playful little book, "Exercises in Style". Queneau constrains himself in many interesting ways while telling essentially the same story. Hmm... We could apply the same idea to programming! Let's do it.

Lopes picked a well-known problem, the common word problem famously solved in a Programming Pearls column more than twenty-five years ago. This is a fitting choice, because Jon Bentley included in that column a critique of Knuth's program by Doug McIlroy, who considered both engineering concerns and program style in his critique.

The problem is straightforward: identify and print the k most common terms that occur in a given text document, in decreasing order. For the rest of the talk, Lopes presented several programs that solve the problem, each written in a different style, showing code and highlighting its shape and boundaries.

Python was her language of choice for the examples. She was looking for a language that many readers would be able to follow and understand, and Python has the feel of pseudo-code about it. (I tell my students that it is the Pascal of their time, though I may as well be speaking of hieroglyphics.) Of course, Python has strengths and weaknesses that affect its fit for some styles. This is an unavoidable complication for all communication...
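Her solutions are in the talk and her book, so I won't reproduce any here. For comparison, though, here is the baseline I would write in Python with no stylistic constraint at all, leaning on the standard library (the stop-word list is a stand-in). The exercises get interesting precisely when you forbid yourself conveniences like Counter:

    # The unconstrained baseline: the k most common terms in a document.
    import re, sys
    from collections import Counter

    STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}  # stand-in list

    def top_k_words(text, k=25):
        words = re.findall(r"[a-z]{2,}", text.lower())
        counts = Counter(w for w in words if w not in STOP_WORDS)
        return counts.most_common(k)

    if __name__ == "__main__":
        with open(sys.argv[1]) as f:
            for word, n in top_k_words(f.read()):
                print(word, n)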

Also, Lopes did not give formal names to the styles she demonstrated. Apparently, at previous versions of this talk, audience members had wanted to argue over the names more than the styles themselves! Vowing not to make that mistake again, she numbered her examples for this talk.

That's what programmers do when they don't have good names.

In lieu of names, she asked the crowd to live-tweet to her what they thought each style is or should be called. She eventually did give each style a fun, informal name. (CS textbooks might be more evocative if we used her names instead of the formal ones.)

I noted eight examples shown by Lopes in the talk, though there may have been more:

  • monolithic procedural code -- "brain dump"
  • a Unix-style pipeline -- "code golf"
  • procedural decomposition with a sequential main -- "cook book"
  • the same, only with functions and composition -- "Willy Wonka"
  • functional decomposition, with a continuation parameter -- "crochet"
  • modules containing multiple functions -- "the kingdom of nouns"
  • relational style -- (didn't catch this one)
  • functional with decomposition and reduction -- "multiplexer"

Lopes said that she hopes to produce solutions using a total of thirty or so styles. She asked the audience for help with one in particular: logic programming. She said that she is not a native speaker of that style, and Python does not come with a logic engine built-in to make writing a solution straightforward.

Someone from the audience suggested she consider yet another style: using a domain-specific language. That would be fun, though perhaps tough to roll from scratch in Python. By that time, my own brain was spinning away, thinking about writing a solution to the problem in Joy, using a concatenative style.

Sometimes, it's surprising just how many programming styles and meaningful variations people have created. The human mind is an amazing thing.

The talk was, I think, a fun one for the audience. Lopes is writing a book based on the idea. I had a chance to review an early draft, and now I'm looking forward to the finished product. I'm sure I'll learn something new from it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

October 04, 2013 3:12 PM

StrangeLoop: Rich Hickey on Channels and Program Design

[My notes on StrangeLoop 2013: Table of Contents]

Rich Hickey setting up for his talk

Rich Hickey spoke at one of the previous StrangeLoops I attended, but this was my first time to attend one of his talks in person. I took the shaky photo seen at the right as proof. I must say, he gives a good talk.

The title slide read "Clojure core.async Channels", but Hickey made a disclaimer upfront: this talk would be about what channels are and why Clojure has them, not the details of how they are implemented. Given that there were plenty of good compiler talks elsewhere at the conference, this was a welcome change of pace. It was also a valuable one, because many more people will benefit from what Hickey taught about program design than would have benefited from staring at screens full of Clojure macros. The issues here are important ones, and ones that few programmers understand very well.

The fundamental problem is this: Reactive programs need to be machines, but functions make bad machines. Even sequences of functions.

The typical solution to this problem these days is to decompose the system logic into a set of response handlers. Alas, this leads to callback hell, a modern form of spaghetti code. Why? Even though the logic has been decomposed into pieces, it is still "of a piece", essentially a single logical entity. When this whole is implemented across multiple handlers, we can't see it as a unit, or talk about it easily. We need to, though, because we need to design the state machine that it comprises.

Clojure's solution to the problem, in the form of core.async, is the channel. This is an implementation of Tony Hoare's communicating sequential processes. One of the reasons that Hickey likes this approach is that it lets a program work equally well in fully threaded apps and in apps with macro-generated inversion of control.

Hickey then gave some examples of code using channels and talked a bit about the implications of the implementation for system design. For instance, the language provides handy put! and take! operators for integrating channels with code at the edge of non-core.async systems. I don't have much experience with Clojure, so I'll have to study a few examples in detail to really appreciate this.
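I can't write idiomatic core.async yet, so here is a rough Python analogue of the shape of the idea, with a bounded thread-safe queue standing in for a channel. It mimics only the indirection -- producer and consumer know the channel, not each other -- and none of Clojure's go blocks or macro-generated inversion of control:

    import threading, queue

    channel = queue.Queue(maxsize=10)     # bounded, never unbounded
    DONE = object()                       # sentinel marking the end of the stream

    def producer():
        for n in range(5):
            channel.put(n)                # roughly the spirit of put!
        channel.put(DONE)

    def consumer():
        while True:
            item = channel.get()          # roughly the spirit of take!
            if item is DONE:
                break
            print("got", item)

    threading.Thread(target=producer).start()
    consumer()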

For me, the most powerful part of the talk was an extended discussion of communication styles in programs. Hickey focused on the trade-offs between direct communication via shared state and indirect communication via channels. He highlighted six or seven key distinctions between the two and how these affect the way a system works. I can't do this part of the talk justice, so I suggest you watch the video of the talk. I plan to watch it again myself.

I had always heard that Hickey was eminently quotable, and he did not disappoint. Here are three lines that made me smile:

  • "Friends don't let friends put logic in handlers."
  • "Promises and futures are the one-night stands" of asynchronous architecture.
  • "Unbounded buffers are a recipe for a bad program. 'I don't want to think about this bug yet, so I'll leave the buffer unbounded.'"

That last one captures the indefatigable optimism -- and self-delusion -- that characterizes so many programmers. We can fix that problem later. Or not.

In the end, this talk demonstrates how a good engineer approaches a problem. Clojure and its culture reside firmly in the functional programming camp. However, Hickey recognizes that, for the problem at hand, a sequence of functional calls is not the best solution. So he designs a solution that allows programmers to do FP where it fits best and to do something else where FP doesn't. That's a pragmatic way to approach problems.

Still, this solution is consistent with Clojure's overall design philosophy. The channel is a first-class object in the language. It converts a sequence of functional calls into data, whereas callbacks implement the sequence in code. As code, we see the sequence only at run-time. As data, we see it in our program and can use it in all the ways we can use any data. This consistent focus on making things into data is an attractive part of the Clojure language and the ecosystem that has been cultivated around it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 28, 2013 12:17 PM

StrangeLoop: This and That, Volume 2

[My notes on StrangeLoop 2013: Table of Contents]

I am at a really good talk and look around the room. So many people are staring at their phones, scrolling away. So many others are staring at their laptops, typing away. The guy next to me: doing both at the same time. Kudos, sir. But you may have missed the point.

~~~~

Conference talks are a great source of homework problems. Sometimes, the talk presents a good problem directly. Other times, watching the talk sets my subconscious mind in motion, and it creates something useful. My students thank you. I thank you.

~~~~

Jenny Finkel talked about the difference between two kinds of recommenders: explorers, who forage for new content, and exploiters, who want to see what's already popular. The former discovers cool new things occasionally but fails occasionally, too. The latter is satisfied most of the time but rarely surprised. As the conference went on, I felt this distinction at play in my own head this year. When selecting the next talk to attend, I have to take a few risks if I ever hope to find something unexpected. But when I fail, a small regret tugs at me.

~~~~

We heard a lot of confident female voices on the StrangeLoop stages this year. Some of these speakers have advanced academic degrees, or at least experience in grad school.

~~~~

The best advice I received on Day 1 perhaps came not from a talk but from the building:

The 'Do not Climb on Bears' sign on a Peabody statue

"Please do not climb on bears." That sounds like a good idea most everywhere, most all the time.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

September 27, 2013 4:26 PM

StrangeLoop: Add All These Things

[My notes on StrangeLoop 2013: Table of Contents]

I took a refreshing walk in the rain over the lunch hour on Friday. I managed to return late and, as a result, missed the start of Avi Bryant's talk on algebra and analytics. Only a few minutes, though, which is good. I enjoyed this presentation.

Bryant didn't talk about the algebra we study in eighth or ninth grade, but the mathematical structure math students encounter in a course called "abstract" or "modern" algebra. A big chunk of the talk focused on an even narrower topic: why +, and operators like it, are cool.

One reason is that grouping doesn't matter. You can add 1 to 2, and then add 4 to the result, and have the same answer as if you added 4 to 1, and then added 2 to the result. This is, of course, the associative property.

Another is that order doesn't matter. 1 + 2 is the same as 2 + 1. That's the commutative property.

Yet another is that, if you have nothing to add, you can add nothing and have the same value you started with. 4 + 0 = 4. 0 is the identity element for addition.

Finally, when you add two numbers, you get a number back. This is not quite as true in computers as in math, because an operation can cause an overflow or underflow and create an error. But looked at through fuzzy lenses, this is true in our computers, too. This is the closure property for addition of integers and real numbers.

Addition isn't the only operation on numbers that has these properties. Finding the maximum value in a set of numbers does, too. The maximum of two numbers is a number. max(x,y) = max(y,x), and if we have three or more numbers, it doesn't matter how we group them; max will find the maximum among them. The identity value is tricky -- there is no smallest number... -- but in practice we can finesse this by using the smallest number of a given data type, or even allowing max to take "nothing" as a value and return its other argument.

When we see a pattern like this, Bryant said, we should generalize:

  • We have a function f that takes two values from a set and produces another member of the same set.
  • The order of f's arguments doesn't matter.
  • The grouping of f's arguments doesn't matter.
  • There is some identity value, a conceptual "zero", that doesn't matter, in the sense that f(i,zero) for any i is simply i.

There is a name for this pattern. When we have such a set and operation, we have a commutative monoid.

     ⊕ : S × S → S
     x ⊕ y = y ⊕ x
     x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z
     x ⊕ 0 = x

I learned about this and other such patterns in grad school when I took an abstract algebra course for kicks. No one told me at the time that I'd be seeing them again as soon as someone created the Internet and it unleashed a torrent of data on everyone.

Just why we are seeing the idea of a commutative monoid again was the heart of Bryant's talk. When we have data coming into our company from multiple network sources, at varying rates of usage and data flow, and we want to extract meaning from the data, it can be incredibly handy if the meaning we hope to extract -- the sum of all the values, or the largest -- can be computed using a commutative monoid. You can run multiple copies of your function at the entry point of each source, and combine the partial results later, in any order.

Bryant showed this much more attractively than that, using cute little pictures with boxes. But then, there should be an advantage to going to the actual talk... With pictures and fairly straightforward examples, he was able to demystify the abstract math and deliver on his talk's abstract:

A mathematician friend of mine tweeted that anyone who doesn't understand abelian groups shouldn't build analytics systems. I'd turn that around and say that anyone who builds analytics systems ends up understanding abelian groups, whether they know it or not.

That's an important point. Just because you haven't studied group theory or abstract algebra doesn't mean you shouldn't do analytics. You just need to be prepared to learn some new math when it's helpful. As programmers, we are all looking for opportunities to capitalize on patterns and to generalize code for use in a wider set of circumstances. When we do, we may re-invent the wheel a bit. That's okay. But also look for opportunities to capitalize on patterns recognized and codified by others already.

Unfortunately, not all data analysis is as simple as summing or maximizing. What if I need to find an average? The average operator doesn't form a commutative monoid with numbers. It falls short in almost every way. But, if you switch from the set of numbers to the set of pairs [n, c], where n is a number and c is a count of how many times you've seen n, then you are back in business. Counting is addition.

So, we save the average operation itself as a post-processing step on a set of number/count pairs. This turns out to be a useful lesson, as finding the average of a set is a lossy operation: it loses track of how many numbers you've seen. Lossy operations are often best saved for presenting data, rather than building them directly into the system's computation.
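A tiny Python sketch of that move, in my own words rather than Bryant's:

    # Average is lossy on plain numbers, but (sum, count) pairs form a
    # commutative monoid under pairwise addition.
    IDENTITY = (0, 0)

    def combine(a, b):                        # associative and commutative
        return (a[0] + b[0], a[1] + b[1])

    def average(pair):                        # the lossy step, saved for presentation
        total, count = pair
        return total / count if count else None

    chunk1 = (4 + 8 + 15, 3)                  # partial result from one source ...
    chunk2 = (16 + 23 + 42, 3)                # ... and another, combined in any order
    print(average(combine(chunk1, chunk2)))   # 18.0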

Likewise, finding the top k values in a set of numbers (a generalized form of maximum) can be handled just fine as long as we work on lists of numbers, rather than numbers themselves.

This is actually one of the Big Ideas of computer science. Sometimes, we can use a tool or technique to solve a problem if only we transform the problem into an equivalent one in a different space. CS theory courses hammer this home, with oodles of exercises in which students are asked to convert every problem under the sun into 3-SAT or the clique problem. I look for chances to introduce my students to this Big Idea when I teach AI or any programming course, but the lesson probably gets lost in the noise of regular classwork. Some students seem to figure it out by the time they graduate, though, and the ones who do are better at solving all kinds of problems (and not by converting them all to 3-SAT!).

Sorry for the digression. Bryant didn't talk about 3-SAT, but he did demonstrate several useful problem transformations. His goal was more practical: to use this idea of a commutative monoid to extract as many interesting results from the stream of data as possible.

This isn't just an academic exercise, either. When we can frame several problems in this way, we are able to use a common body of code for the processing. He called this body of code an aggregator, comprising three steps:

  • prepare the data by transforming it into the space of a commutative monoid
  • reduce the data to a single value in that space, using the appropriate operator
  • present the result by transforming it back into its original space
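Here is a minimal sketch of such an aggregator in Python -- my own framing of the three steps, not Bryant's code -- applied to the top-k problem mentioned above, where short sorted lists form the monoid:

    from functools import reduce

    def aggregate(items, prepare, combine, identity, present):
        # prepare each item, reduce with the monoid's operator, then present
        return present(reduce(combine, (prepare(x) for x in items), identity))

    K = 3
    def prep(n):       return [n]                            # into monoid space
    def comb(xs, ys):  return sorted(xs + ys, reverse=True)[:K]
    def show(xs):      return xs                             # nothing to undo here

    print(aggregate([7, 2, 9, 4, 11, 5], prep, comb, [], show))   # [11, 9, 7]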

In practice, transforming the problem into the space of a monoid presents challenges in the implementation. For example, it is straightforward to compute the number of unique values in a collection of streams by transforming each item into a set of size one and then using set union as the operator. But union requires unbounded space, and this can be inconvenient when dealing with very large data sets.

One approach is to compute an estimated number of uniques using a hash function and some fancy arithmetic. We can make the expected error in estimate smaller and smaller by using more and more hash functions. (I hope to write this up in simple code and blog about it soon.)
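Until then, here is my own back-of-the-envelope sketch of one such trick, in the spirit of the Flajolet-Martin estimator and not necessarily the scheme Bryant described. The only state kept per stream is the largest number of trailing zero bits seen in the items' hashes, and that state combines with max -- itself a commutative monoid:

    import hashlib

    def rho(item):
        # number of trailing zero bits in a hash of the item
        h = int(hashlib.sha1(str(item).encode()).hexdigest(), 16)
        zeros = 0
        while h % 2 == 0 and zeros < 64:
            zeros += 1
            h //= 2
        return zeros

    def sketch(stream):
        return max((rho(x) for x in stream), default=0)

    def estimate(max_zeros):
        # 2^R is roughly proportional to the number of distinct items;
        # a single hash function is noisy, so real systems average many
        # such sketches (or use HyperLogLog) to shrink the error
        return 2 ** max_zeros

    s1 = sketch(range(0, 20000))          # two overlapping sources ...
    s2 = sketch(range(10000, 30000))
    print(estimate(max(s1, s2)))          # ... combined with max, in any order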

Bryant looked at one more problem, computing frequencies, and then closed with a few more terms from group theory: semigroup, group, and abelian group. Knowing these terms -- actually, simply knowing that they exist -- can be useful even for the most practical of practitioners. They let us know that there is more out there, should our problems become harder or our needs become larger.

That's a valuable lesson to learn, too. You can learn all about abelian groups in the trenches, but sometimes it's good to know that there may be some help out there in the form of theory. Reinventing wheels can be cool, but solving the problems you need solved is even cooler.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

September 24, 2013 4:38 PM

StrangeLoop: Compilers, Compilers, Compilers

[My notes on StrangeLoop 2013: Table of Contents]

I went to a lot of talks about compilation. There seemed to be more this year than last, but perhaps I was suffering from a perception bias. I'm teaching compilers this semester and have been reading a bit about V8 and Crankshaft on the elliptical of late.

Many of the talks I saw revolved around a common theme: dynamic run-time systems. Given the prominence these days of Javascript, Python, Ruby, Lua, and their like, it's not surprising that finding better ways to organize dynamic run-times and optimize their performance is receiving a lot of attention.

The problem of optimizing dynamic run-time systems is complicated by the wide range of tasks being performed dynamically: type checking, field access, function selection, and the most common whipping horse of my static-language friends, garbage collection. Throw in eval, which allows the execution of arbitrary code, possibly changing even the definition of core classes, and it's amazing that our dynamic languages can run in human time at all. That's a tribute to the people who have been creating compilers for us over the years.

As I listened to these talks, my ears were tuned to ideas and topics that I need to learn more about. That's what my notes captured best. Here are a few ideas that stood out.

The JavaScript interpreter, interpreted. Martha Girdler gave a quick, jargon-free tour of how Javascript works, using Javascript code to illustrate basic ideas like contexts. This sort of talk can help relatively inexperienced developers understand the common "pain points" of the language, such as variable hoisting.

Fast and Dynamic. Maxime Chevalier-Boisvert went a level deeper, tracing some of the fundamental ideas used to implement run-time systems from their historical roots in Lisp, Smalltalk, and Self up to research prototypes such as Chevalier-Boisvert's own Higgs compiler.

Many of the ideas are familiar to anyone who has had an undergrad compiler course, such as type tagging and microcoded instructions. Others are simple extensions of such ideas, such as inline caching, which is a workhorse in any dynamic compiler. Still others have entered popular discussion only recently. Maps, which are effectively hidden classes, originated in Self and are now being applied and extended in a number of interesting ways.

Two ideas from this talk that I would like to learn more about are hybrid type inference, which Chevalier-Boisvert mentioned in the context of Chrome and Firefox, and basic block versioning, a technique being explored in the Higgs compiler.

In closing, the speaker speculated on the better compilers of the future. Some of the advances will come from smarter CPUs, which might execute potential future paths in parallel, and more principled language design. But many will come from academic research that discovers new techniques and improves existing ones.

Some of the ideas of the future are probably already available and simply haven't caught on yet. Chevalier-Boisvert offered three candidates: direct manipulation of the AST, pattern matching, and the persistent image. I certainly hear a lot of people talking about the first of these, but I have yet to see a compelling implementation.

Ruby Doesn't Have to Be Slow. In this session, Alex Gaynor explained why dynamic languages don't have to be slow. Though Ruby was his working example, everything he said applies to Javascript, Python, Lua, and other dynamic languages. He then talked about how he is putting these ideas to work in Topaz, a fast Ruby interpreter written in RPython. Topaz uses a number of advanced techniques, including a tracing JIT, type-specialized field look-up, maps, quasi-immutable fields, and escape analysis. It supports a subset of Ruby, though much of what is missing now is simply standard library classes and methods.

Two of the more interesting points of this talk for me were about meta-issues. First, he opened with an elaboration of the claim, "Ruby is slow", which he rightfully rejected as too imprecise to be meaningful. What people probably mean is something like, "Code written in Ruby executes CPU-bound tasks slower than other languages." I would add that, for many of my CS colleagues, the implicit benchmark is compiled C.

Further, Ruby users tend to respond to this claim poorly. Rather than refute it, they accept its premise and dance around its edges. Saddest, he says, is when they say, "If it turns out to matter, we can rewrite the program in some serious language." The compiler nerd in him says, "We can do this." Topaz is, in part, an attempt to support that claim.

Second, in response to an audience question, he claimed that the people responsible for Java got something right fifteen years ago: they convinced people to abandon their C extensions. If the Ruby world followed suit, and moved away from external dependencies that restrict what the compiler and run-time system can know, then many performance improvements would follow.

Throughout this talk, I kept coming back to JRuby in my mind...

The Art of Finding Shortcuts. Vyacheslav "@mraleph" Egorov's talk was ostensibly about an optimizing compiler for Dart, but like most of the compiler talks this year, it presented ideas of value for handling any dynamic language. Indeed, this talk gave a clear introduction to what an optimizing compiler does, what in-line caching is, and different ways that the compiler might capitalize on them.

According to Egorov, writing an optimizing compiler for a language like Dart is the art of finding -- and taking -- shortcuts. The three key issues to address are representation, resolution, and redundancy. You deal with representation when you design your run-time system. The other two fall to the optimizing compiler.

Resolution is fundamentally a two-part question. Given the expression obj.prop,

  • What is obj?
  • Where is prop?

In-line caches eliminate redundancy by memoizing where/what pairs. The goal is to use the same hidden class maps to resolve property access whenever possible. Dart's optimizer uses in-line caching to give type feedback for use in improving the performance of loads and stores.
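
Here is a minimal sketch of a monomorphic in-line cache in Python. The shape tags and object layout are hypothetical stand-ins for hidden classes; the point is only the where/what memoization at a single call site, not how Dart's optimizer actually implements it.

    # A toy in-line cache for a property load at one call site. It memoizes
    # the (shape, slot) pair seen last time; if the next object has the same
    # shape, the load skips the slow look-up entirely.

    class InlineCache:
        def __init__(self, prop):
            self.prop = prop
            self.cached_shape = None
            self.cached_slot = None

        def load(self, obj):
            shape = obj["__shape__"]              # hypothetical shape tag
            if shape is self.cached_shape:        # cache hit: fast path
                return obj["__fields__"][self.cached_slot]
            slot = shape.index(self.prop)         # cache miss: slow look-up
            self.cached_shape, self.cached_slot = shape, slot
            return obj["__fields__"][slot]

    shape_xy = ("x", "y")                         # shared shape: field order
    a = {"__shape__": shape_xy, "__fields__": [1, 2]}
    b = {"__shape__": shape_xy, "__fields__": [3, 4]}

    ic = InlineCache("y")
    print(ic.load(a))   # miss, then caches slot 1
    print(ic.load(b))   # hit: same shape, no look-up needed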

Egorov was one of the most quotable speakers I heard at StrangeLoop this year. In addition to "the art of finding shortcuts", I noted several other pithy sayings that I'll probably steal at some point, including:

  • "If all you have is an in-line cache, then everything looks like an in-line cache stub."
  • "In-lining is a Catch-22." You can't know if you will benefit from inlining unless you try, but trying (and undoing) is expensive.

Two ideas I plan to read more about after hearing this talk are allocation sinking and load forwarding.

~~~~

I have a lot of research to do now.


Posted by Eugene Wallingford | Permalink | Categories: Computing

September 23, 2013 4:22 PM

StrangeLoop: This and That, Volume 1

[My notes on StrangeLoop 2013: Table of Contents]

the Peabody Opera House's Broadway series poster

I'm working on a post about the compiler talks I attended, but in the meantime here are a few stray thoughts, mostly from Day 1.

The Peabody Opera House really is a nice place to hold a conference of this size. If StrangeLoop were to get much larger, it might not fit.

I really don't like the word "architected".

The talks were scheduled pretty well. Only once in two days did I find myself really wanting to go to two talks at the same time. And only once did I hear myself thinking, "I don't want to hear any of these...".

My only real regret from Day 1 was missing Scott Vokes's talk on data compression. I enjoyed the talk I went to well enough, but I think I would have enjoyed this one more.

What a glorious time to be a programming language theory weenie. Industry practitioners are going to conferences and attending talks on dependent types, continuations, macros, immutable data structures, and functional reactive programming.

Moon Hooch? Interesting name, interesting sound.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

September 22, 2013 3:51 PM

StrangeLoop: Jenny Finkel on Machine Learning at Prismatic

[My notes on StrangeLoop 2013: Table of Contents]

The conference opened with a talk by Jenny Finkel on the role machine learning plays at Prismatic, the customized newsfeed service. It was a good way to start the conference, as it introduced a few themes that would recur throughout, had a little technical detail but not too much, and reported a few lessons from the trenches.

Prismatic is trying to solve the discovery problem: finding content that users would like to read but otherwise would not see. This is more than simply a customized newsfeed from a singular journalistic source, because it draws from many sources, including other readers' links, and because it tries to surprise readers with articles that may not be explicitly indicated by their profiles.

The scale of the problem is large, but different from the scale of the raw data facing Twitter, Facebook, and the like. Finkel said that Prismatic is processing only about one million timely docs at a time, with the set of articles turning over roughly weekly. The company currently uses 5,000 categories to classify the articles, though that number will soon go up to the order of 250,000.

The complexity here comes from the cross product of readers, articles, and categories, along with all of the features used to try to tease out why readers like the things they do and don't like the others. On top of this are machine learning algorithms that are themselves exponentially expensive to run. And with articles turning over roughly weekly, they have to be amassing data, learning from it, and moving on constantly.

The main problem at the heart of a service like this is: What is relevant? Everywhere one turns in AI, one sees this question, or its more general cousin, Is this similar? In many ways, this is the problem at the heart of all intelligence, natural and artificial.

Prismatic's approach is straight from AI, too. They construct a feature vector for each user/article pair and then try to learn weights that, when applied to the values in a given vector, will rank desired articles high and undesired articles low. One of the key challenges when doing this kind of work is to choose the right features to use in the vector. Finkel mentioned a few used by Prismatic, including "Does the user follow this topic?", "How many times has the reader read an article from this publisher?", and "Does the article include a picture?"

With a complex algorithm, lots of data, and a need to constantly re-learn, Prismatic has to make adjustments and take shortcuts wherever possible in order to speed up the process. This is a common theme at a conference where many speakers are from industry. First, learn your theory and foundations; then learn the pragmatics and heuristics needed to turn basic techniques into the backbone of practical applications.

Finkel shared one pragmatic idea of this sort that Prismatic uses. They look for opportunities to fold user-specific feature weights into user-neutral features. This enables their program to compute many user-specific dot products using a static vector.
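
I don't know the details of Prismatic's feature set or weights, but the shape of the computation is a dot product over a static, user-neutral article vector. A toy Python sketch, with made-up feature names and values:

    # Minimal sketch of dot-product ranking. Each article has one static,
    # user-neutral feature vector; each user has a learned weight vector.
    # Scoring a user/article pair is then a single dot product, with no
    # per-pair feature construction.

    def dot(weights, features):
        return sum(weights[name] * value for name, value in features.items())

    articles = {
        "a1": {"has_picture": 1.0, "topic_python": 1.0, "length_penalty": 0.3},
        "a2": {"has_picture": 0.0, "topic_python": 0.0, "length_penalty": 0.8},
    }

    user_weights = {"has_picture": 0.5, "topic_python": 2.0, "length_penalty": -1.0}

    ranked = sorted(articles, key=lambda a: dot(user_weights, articles[a]),
                    reverse=True)
    print(ranked)   # ['a1', 'a2'] for this made-up user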

She closed the talk with five challenges that Prismatic has faced that other teams might be on the lookout for:

Bugs in the data. In one case, one program was updating a data set before another program could take a snapshot of the original. With the old data replaced by the new, they thought their ranker was doing better than it actually was. As Finkel said, this is pretty typical for an error in machine learning. The program doesn't crash; it just gives the wrong answer. Worse, you don't even have reason to suspect something is wrong in the offending code.

Presentation bias. Readers tend to look at more of the articles at the top of a list of suggestions, even if they would have enjoyed something further down the list. This is a feature of the human brain, not of computer programs. Any time we write programs that interact with people, we have to be aware of human psychology and its effects.

Non-representative subsets. When you are creating a program that ranks things, its whole purpose is to skew a set of user/article data points toward the subset of articles that the reader most wants to read. But this subset probably doesn't have the same distribution as the full set, which hampers your ability to use statistical analysis to draw valid conclusions.

Statistical bleeding. Sometimes, one algorithm looks better than it is because it benefits from the performance of the other. Consider two ranking algorithms, one an "explorer" that seeks out new content and one an "exploiter" that recommends articles that have already been found to be popular. In comparing their performances, the exploiter will tend to look better than it is, because it benefits from the successes of the explorer without being penalized for its failures. It is crucial to verify that one measure you track is not dependent on another. (Thanks to Christian Murphy for the prompt!)

Simpson's Paradox. The iPhone and the web have different clickthrough rates. Prismatic once found themselves in a situation where one recommendation algorithm performed worse than another on both platforms, yet better overall. This can really disorient a team that follows up an experiment by assessing its results. The issue here is usually a hidden variable that is confounding the results.

(I remember discussing this classic statistical illusion with a student in my early years of teaching, when we encountered a similar illusion in his grade. I am pretty sure that I enjoyed our discussion of the paradox more than he did...)
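
A small made-up clickthrough table shows how the paradox can arise:

    # Made-up numbers illustrating Simpson's paradox: algorithm A loses to B
    # on each platform but wins when the platforms are pooled, because the
    # two algorithms were shown on very different platform mixes.

    data = {   # (algorithm, platform): (clicks, impressions)
        ("A", "iphone"): (10, 100),    ("B", "iphone"): (120, 1000),
        ("A", "web"):    (300, 1000),  ("B", "web"):    (35, 100),
    }

    def rate(*cells):
        clicks = sum(c for c, n in cells)
        impressions = sum(n for c, n in cells)
        return clicks / impressions

    for platform in ("iphone", "web"):
        print(platform,
              f"A={rate(data[('A', platform)]):.0%}",
              f"B={rate(data[('B', platform)]):.0%}")    # B wins on both

    print("overall",
          f"A={rate(data[('A', 'iphone')], data[('A', 'web')]):.0%}",
          f"B={rate(data[('B', 'iphone')], data[('B', 'web')]):.0%}")  # A wins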

This part of a talk is of great value to me. Hearing about another team's difficulties rarely helps me avoid the same problems in my own projects, but it often does help me recognize those problems when they occur and begin thinking about ways to work around them. This was a good way for me to start the conference.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

September 22, 2013 10:27 AM

Back from StrangeLoop 2013

the front of my StrangeLoop 2013 badge

I'm back home from StrangeLoop 2013. It was, again, an outstanding conference: a great location, excellent amenities, fun side events, and -- most importantly -- a solid set of talks: diverse topics, strong technical content, and some very good speakers. Alex Miller and his team put on a good conference.

This year, I went to the talks old school: with a steno notebook and no technology but a camera. As a result, a couple of things are different about how I'm blogging the conference. First, I did not write or post any entries during the event itself. Second, my notes are a bit shorter than usual and will need to be typed up before they become blog entries. I'll write my thoughts up over the next week or so and post the entries as they emerge.

This entry will serve as a table of contents for my StrangeLoop posts, a home base for readers who might stumble onto one post and care to read more. For now, I'll list a few entries I expect to write, but I'll only know what belongs here after I have written them.

Primary entries:

Ancillary entries:

Is it too early to start looking forward to StrangeLoop 2014?


Posted by Eugene Wallingford | Permalink | Categories: Computing

September 10, 2013 3:40 PM

A Laugh at My Own Expense

This morning presented a short cautionary tale for me and my students, from a silly mistake I made in a procmail filter.

Back story: I found out recently that I am still subscribed to a Billy Joel fan discussion list from the 1990s. The list has been inactive for years, or I would have been filtering its messages to a separate mailbox. Someone has apparently hacked the list, as a few days ago it started spewing hundreds of spam messages a day.

I was on the road for a few days after the deluge began and was checking mail through a shell connection to the mail server. Because I was busy with my trip and checking mail infrequently, I just deleted the messages by hand. When I got back, Mail.app soon learned they were junk and filtered them away for me. But the spam was still hitting my inbox on the mail server, where I read my mail occasionally even on campus.

After a session on the server early this morning, I took a few minutes to procmail them away. Every message from the list has a common pattern in the Subject: line, so I copied it and pasted it into a new procmail recipe to send all list traffic to /dev/null :

    :0
    * ^Subject.*[billyjoel]
    /dev/null

Do you see the problem? Of course you do.

I didn't at the time. My blindness probably resulted from a combination of the early hour, a rush to get over to the gym, and the tunnel vision that comes from focusing on a single case. It all looked obvious.

This mistake offers programming lessons at several different levels.

The first is at the detailed level of the regular expression. Pay attention to the characters in your regex -- all of them. Those brackets really are in the Subject: line, but by themselves mean something else in the regex. I need to escape them:

    * ^Subject.*\[billyjoel\]

This relates to a more general piece of problem-solving advice. Step back from the individual case you are solving and think about the code you are writing more generally. When you are focused on the annoying messages from the list, the brackets are just characters in a stream. Looked at from the perspective of the file of procmail recipes, they are regex metacharacters.
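
If you want to see the danger for yourself, here is a quick check using Python's re module. Procmail has its own regex flavor, but the bracket behavior is the same.

    # Inside a regex, [billyjoel] is a character class matching any single one
    # of those letters, so the unescaped pattern matches almost any Subject:
    # line -- and sends the message to /dev/null.

    import re

    unescaped = re.compile(r"^Subject.*[billyjoel]")
    escaped   = re.compile(r"^Subject.*\[billyjoel\]")

    innocent = "Subject: Lunch on Friday?"
    listmail = "Subject: [billyjoel] Great concert last night"

    print(bool(unescaped.search(innocent)))   # True -- oops, matches the 'i' in Friday
    print(bool(escaped.search(innocent)))     # False
    print(bool(escaped.search(listmail)))     # True -- only the list traffic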

The second is at the level of programming practice. Don't /dev/null something until you know it's junk. Much better to send the offending messages to a junk mbox first:

    :0
    * ^Subject.*\[billyjoel\]
    in.tmp.junk

Once I see that all and only the messages from the list are being matched by the pattern, I can change that line to send list traffic where it belongs. That's a specific example of the sort of defensive programming that we all should practice. Don't commit to solutions too soon.

This, too, relates to more general programming advice about software validation and verification. I should have exercised a few test cases to validate my recipe before turning it loose unsupervised on my live mail stream.

I teach my students this mindset and program that way myself, at least most of the time. Of course, the time you most need test cases will be the time you don't write them.

The day provided a bit of irony to make the story even better. The topic of today's session in my compilers course? Writing regular expressions to describe the tokens in a language. So, after my mail admin colleague and I had a good laugh at my expense, I got to tell the story to my students, and they did, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

August 31, 2013 11:32 AM

A Good Language Conserves Programmer Energy

Game programmer Jeff Wofford wrote a nice piece on some of the lessons he learned by programming a game in forty-eight hours. One of the recurring themes of his article is the value of a high-powered scripting language for moving fast. That's not too surprising, but I found his ruminations on this phenomenon to be interesting. In particular:

A programmer's chief resource is the energy of his or her mind. Everything that expends or depletes that energy makes him or her less effective, more tired, and less happy.

A powerful scripting language sitting atop the game engine is one of the best ways to conserve programmer energy. Sometimes, though, a game programmer must work hard to achieve the performance required by users. For this reason, Wofford goes out of his way not to diss C++, the tool of choice for many game programmers. But C++ is an energy drain on the programmer's mind, because the programmer has to be in a constant state of awareness of machine cycles and memory consumption. This is where the trade-off with a scripting language comes in:

When performance is of the essence, this state of alertness is an appropriate price to pay. But when you don't have to pay that price -- and in every game there are systems that have no serious likelihood of bottlenecking -- you will gain mental energy back by essentially ignoring performance. You cannot do this in C++: it requires an awareness of execution and memory costs at every step. This is another argument in favor of never building a game without a good scripting language for the highest-level code.

I think this is true of almost every large system. I sure wish that the massive database systems at the foundation of my university's operations had scripting languages sitting on top. I even want to script against the small databases that are the lingua franca of most businesses these days -- spreadsheets. The languages available inside the tools I use are too clunky or not powerful enough, so I turn to Ruby.

Unfortunately, most systems don't come with a good scripting language. Maybe the developers aren't allowed to provide one. Too many CS grads don't even think of "create a mini-language" as a possible solution to their own pain.

Fortunately for Wofford, he has both the skills and the inclination. One of his to-dos after the forty-eight hour experience is all about language:

Building a SWF importer for my engine could work. Adding script support to my engine and greatly refining my tools would go some of the distance. Gotta do something.

Gotta do something.

I'm teaching our compiler course again this term. I hope that the dozen or so students in the course leave the university knowing that creating a language is often the right next action and having the skills to do it when they feel compelled to do something.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

August 29, 2013 4:31 PM

Asimov Sees 2014, Through Clear Eyes and Foggy

Isaac Asimov, circa 1991

A couple of years ago, I wrote Psychohistory, Economics, and AI, in which I mentioned Isaac Asimov and one way that he had influenced me. I never read Asimov or any other science fiction expecting to find accurate predictions of the future. What drew me in was the romance of the stories, dreaming "what if?" for a particular set of conditions. Ultimately, I was more interested in the relationships among people under different technological conditions than I was in the technology itself. Asimov was especially good at creating conditions that generated compelling human questions.

Some of the scenarios I read in Asimov's SF turned out to be wildly wrong. The world today is already more different from the 1950s than the world of the Foundation, set thousands of years in the future. Others seem eerily on the mark. Fortunately, accuracy is not the standard by which most of us judge good science fiction.

But what of speculation about the near future? A colleague recently sent me a link to Visit to the World's Fair of 2014, an article Asimov wrote in 1964 speculating about the world fifty years hence. As I read it, I was struck by just how far off he was in some ways, and by how close he was in others. I'll let you read the story for yourself. Here are a few selected passages that jumped out at me.

General Electric at the 2014 World's Fair will be showing 3-D movies of its "Robot of the Future," neat and streamlined, its cleaning appliances built in and performing all tasks briskly. (There will be a three-hour wait in line to see the film, for some things never change.)

3-D movies are now common. Housecleaning robots are not. And while some crazed fans will stand in line for many hours to see the latest comic-book blockbuster, going to a theater to see a movie has become a much less important part of the culture. People stream movies into their homes and into their hands. My daughter teases me for caring about the time any TV show or movie starts. "It's on Hulu, Dad." If it's not on Hulu or Netflix or the open web, does it even exist?

Any number of simultaneous conversations between earth and moon can be handled by modulated laser beams, which are easy to manipulate in space. On earth, however, laser beams will have to be led through plastic pipes, to avoid material and atmospheric interference. Engineers will still be playing with that problem in 2014.

There is no one on the moon with whom to converse. Sigh. The rest of this passage sounds like fiber optics. Our world is rapidly becoming wireless. If your device can't connect to the world wireless web, does it even exist?

In many ways, the details of technology are actually harder to predict correctly than the social, political, economic implications of technological change. Consider:

Not all the world's population will enjoy the gadgety world of the future to the full. A larger portion than today will be deprived and although they may be better off, materially, than today, they will be further behind when compared with the advanced portions of the world. They will have moved backward, relatively.

Spot on.

When my colleague sent me the link, he said, "The last couple of paragraphs are especially relevant." They mention computer programming and a couple of its effects on the world. In this regard, Asimov's predictions meet with only partial success.

The world of A.D. 2014 will have few routine jobs that cannot be done better by some machine than by any human being. Mankind will therefore have become largely a race of machine tenders. Schools will have to be oriented in this direction. ... All the high-school students will be taught the fundamentals of computer technology, will become proficient in binary arithmetic and will be trained to perfection in the use of the computer languages that will have developed out of those like the contemporary "Fortran" (from "formula translation").

The first part of this paragraph is becoming truer every day. Many people husband computers and other machines as the machines do tasks we used to do ourselves. The second part is, um, not true. Relatively few people learn to program at all, let alone master a programming language. And how many people understand this t-shirt without first receiving an impromptu lecture on the street?

Again, though, Asimov is perhaps closer on what technological change means for people than on which particular technological changes occur. In the next paragraph he says:

Even so, mankind will suffer badly from the disease of boredom, a disease spreading more widely each year and growing in intensity. This will have serious mental, emotional and sociological consequences, and I dare say that psychiatry will be far and away the most important medical specialty in 2014. The lucky few who can be involved in creative work of any sort will be the true elite of mankind, for they alone will do more than serve a machine.

This is still speculation, but it is already more true than most of us would prefer. How much truer will it be in a few years?

My daughters will live most of their lives post-2014. That worries the old fogey in me a bit. But it excites me more. I suspect that the next generation will figure the future out better than mine, or the ones before mine, can predict it.

~~~~

PHOTO. Isaac Asimov, circa 1991. Britannica Online for Kids. Web. 2013 August 29. http://kids.britannica.com/comptons/art-136777.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

August 22, 2013 2:45 PM

A Book of Margin Notes on a Classic Program?

I recently stumbled across an old How We Will Read interview with Clive Thompson and was intrigued by his idea for a new kind of annotated book:

I've had this idea to write a provocative piece, or hire someone to write it, and print it on-demand with huge margins, and then send it around to four people with four different pens -- red, blue, green and black. It comes back with four sets of comments all on top of the text. Then I rip it all apart and make it into an e-book.

This is an interesting mash-up of ideas from different eras. People have been writing in the margins of books for hundreds of years. These days, we comment on blog entries and other on-line writing in plain view of everyone. We even comment on other people's comments. Sites such as Findings.com, home of the Thompson interview, aim to bring this cultural practice to everything digital.

Even so, it would be pretty cool to see the margin notes of three or four insightful, educated people, written independently of one another, overlaid in a single document. Presentation as an e-book offers another dimension of possibilities.

Ever the computer scientist, I immediately began to think of programs. A book such as Beautiful Code gives us essays from master programmers talking about their programs. Reading it, I came to appreciate design decisions that are usually hidden from readers of finished code. I also came to appreciate the code itself as a product of careful thought and many iterations.

My thought is: Why not bring Thompson's mash-up of ideas to code, too? Choose a cool program, perhaps one that changed how we work or think, or one that unified several ideas into a standard solution. Print it out with huge margins, and send it to three or four insightful, thoughtful programmers who read it, again or for the first time, and mark it up with their own thoughts and ideas. It comes back with four sets of comments all on top of the text. Rip it apart and create an e-book that overlays them all in a single document.

Maybe we can skip the paper step. Programming tools and Web 2.0 make it so easy to annotate documents, including code, in ways that replace handwritten comments. That's how most people operate these days. I'm probably showing my age in harboring a fondness for the written page.

In any case, the idea stands apart from the implementation. Wouldn't it be cool to read a book that interleaves and overlays the annotations made by programmers such as Ward Cunningham and Grady Booch as they read John McCarthy's first Lisp interpreter, the first Fortran compiler from John Backus's team, QuickDraw, or Qmail? I'd stand in line for a copy.

Writing this blog entry only makes the idea sound more worth doing. If you agree, I'd love to hear from you -- especially if you'd like to help. (And especially if you are Ward Cunningham and Grady Booch!)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

July 18, 2013 2:22 PM

AP Computer Science in Iowa High Schools

Mark Guzdial posted a blog entry this morning pointing to a Boston Globe piece, Interest in computer science lags in Massachusetts schools. Among the data supporting this assertion was participation in Advanced Placement:

Of the 85,753 AP exams taken by Massachusetts students last year, only 913 were in computing.

Those numbers are a bit out of context, but they got me to wondering about the data for Iowa. So I tracked down this page on AP Program Participation and Performance Data 2012 and clicked through to the state summary report for Iowa. The numbers are even more dismal than Massachusetts's.

Of the 16,413 AP exams taken by Iowa students in 2012, only sixty-nine were in computer science. The counts for groups generally underrepresented in computing were unsurprisingly small, given that Iowa is less diverse than many US states. Of the sixty-nine, fifty-four self-reported as "white", ten as "Asian", and one as "Mexican-American", with four not indicating a category.

The most depressing number of all: only nine female students took the AP Computer Science exam last year in Iowa.

Now, Massachusetts has roughly 2.2 times as many people as Iowa, but even so Iowa compares unfavorably. Iowans took about one-fifth as many AP exams as Massachusetts students, and for CS the ratio drops to about 7.5%. If AP exams indicate much about the general readiness of a state's students for advanced study in college, then Iowa is at a disadvantage.

I've never been a huge proponent of the AP culture that seems to dominate many high schools these days (see, for instance, this piece), but the low number of AP CS exams taken in Iowa is consistent with what I hear when I talk to HS students from around the state and their parents: Iowa schools are not teaching much computer science at all. The university is the first place most students have an opportunity to take a CS course, and by then the battle for most students' attention has already been lost. For a state with a declared goal of growing its promising IT sector, this is a monumental handicap.

Those of us interested in broadening participation in CS face an even tougher challenge. Iowa's demographics create some natural challenges for attracting minority students to computing. And if the AP data are any indication, we are doing a horrible job of reaching women in our high schools.

There is much work to do.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 17, 2013 1:41 PM

And This Gray Spirit Yearning

On my first day as a faculty member at the university, twenty years ago, the department secretary sent me to Public Safety to pick up my office and building keys. "Hi, I'm Eugene Wallingford," I told the person behind the window, "I'm here to pick up my keys." She smiled, welcomed me, and handed them to me -- no questions asked.

Back at the department, I commented to one of my new colleagues that this seemed odd. No one asked to see an ID or any form of authorization. They just handed me keys giving me access to a lot of cool stuff. My colleague shrugged. There has never been a problem here with unauthorized people masquerading as new faculty members and picking up keys. Until there is a problem, isn't it nice living in a place where trust works?

Things have changed. These days, we don't order keys for faculty; we "request building access". This phrase is more accurate than a reference to keys, because it includes activating the faculty ID to open electronically-controlled doors. And we don't simply plug a new computer into an ethernet jack and let faculty start working; to get on the wireless network, we have to wait for the Active Directory server to sync with the HR system, which updates only after electronic approval of a Personnel Authorization Form that sets up the employee's payroll record. I leave that as a run-on phrase, because that's what living it feels like.

The paperwork needed to get a new faculty member up and running these days reminds me just how simple life was in 1992. Of course, it's not really "paperwork" any more.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

July 15, 2013 2:41 PM

Version Control for Writers and Publishers

Mandy Brown again, this time on writing tools without memory:

I've written of the web's short-term memory before; what Manguel trips on here is that such forgetting is by design. We designed tools to forget, sometimes intentionally so, but often simply out of carelessness. And we are just as capable of designing systems that remember: the word processor of today may admit no archive, but what of the one we build next?

This is one of those places where the software world has a tool waiting to reach a wider audience: the version control system. Programmers using version control can retrieve previous states of their code all the way back to its creation. The granularity of the versions is limited only by the frequency with which they "commit" the code to the repository.

The widespread adoption of version control and the existence of public histories at places such as GitHub have even given rise to a whole new kind of empirical software engineering, in which we mine a large number of repositories in order to understand better the behavior of developers in actual practice. Before, we had to contrive experiments, with no assurance that devs behaved the same way under artificial conditions.

Word processors these days usually have an auto-backup feature to save work as the writer types text. Version control could be built into such a feature, giving the writer access to many previous versions without the need to commit changes explicitly. But the better solution would be to help writers learn the value of version control and develop the habits of committing changes at meaningful intervals.

Digital version control offers several advantages over the writer's (and programmer's) old-style history of print-outs of previous versions, marked-up copy, and notebooks. An obvious one is space. A more important one is the ability to search and compare old versions more easily. We programmers benefit greatly from a tool as simple as diff, which can tell us the textual differences between two files. I use diff on non-code text all the time and imagine that professional writers could use it to better effect than I.
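
Writers who want to try this without leaving Python can get a programmer-style unified diff of two drafts from the standard difflib module. The drafts and file names here are hypothetical:

    # Compare two versions of a piece of prose, line by line, the way
    # programmers compare two versions of a source file.

    import difflib

    draft_v1 = ["The quick brown fox jumps over the lazy dog.\n",
                "It was the best of times.\n"]

    draft_v2 = ["The quick brown fox leaps over the lazy dog.\n",
                "It was the best of times.\n",
                "It was the worst of times.\n"]

    for line in difflib.unified_diff(draft_v1, draft_v2,
                                     fromfile="essay-v1.txt",
                                     tofile="essay-v2.txt"):
        print(line, end="")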

The use of version control by programmers leads to profound changes in the practice of programming. I suspect that the same would be true for writers and publishers, too.

Most version control systems these days work much better with plain text than with the binary data stored by most word processing programs. As discussed in my previous post, there are already good reasons for writers to move to plain text and explicit mark-up schemes. Version control and text analysis tools such as diff add another layer of benefit. Simple mark-up systems like Markdown don't even impose much burden on the writer, resembling as they do how so many of us used to prepare text in the days of the typewriter.

Some non-programmers are already using version control for their digital research. Check out William Turkel's How To for doing research with digital sources. Others, such as The Programming Historian and A Companion to Digital Humanities, don't seem to mention it. But these documents refer mostly to programs for working with text. The next step is to encourage adoption of version control for writers doing their own thing: writing.

Then again, it has taken a long time for version control to gain such widespread acceptance even among programmers, and it's not yet universal. So maybe adoption among writers will take a long time, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 12, 2013 3:08 PM

"Either Take Control, or Cede it to the Software"

Mandy Brown tells editors and "content creators" to take control of their work:

It's time content people of all stripes recognized the WYSIWYG editor for what it really is: not a convenient shortcut, but a dangerous obstacle placed between you and the actual content. Because content on the web is going to be marked up one way or another: you either take control of it or you cede it to the software, but you can't avoid it. WYSIWYG editors are fine for amateurs, but if you are an editor, or copywriter, or journalist, or any number of the kinds of people who work with content on the web, you cannot afford to be an amateur.

Pros can sling a little code, too.

Brown's essay reminded me of a blog entry I was discussing with a colleague recently, Andrew Hayes's Why Learn Syntax? Hayes tells statisticians that they, too, should take control of their data, by learning the scripting language of the statistical packages they use. Code is a record of an analysis, which allows it to be re-run and shared with others. Learning to write code also hones one's analytical skills and opens the door to features not available through the GUI.

These articles speak to two very different audiences, but the message is the same. Don't just be a user of someone else's tools and be limited to their vision. Learn to write a little code and take back the power to create.


Posted by Eugene Wallingford | Permalink | Categories: Computing

July 11, 2013 2:57 PM

Talking to the New University President about Computer Science

Our university recently hired a new president. Yesterday, he and the provost came to a meeting of the department heads in humanities, arts, and sciences, so that he could learn a little about the college. The dean asked each head to introduce his or her department in one minute or less.

I came in under a minute, as instructed. Rather than read a litany of numbers that he can read in university reports, I focused on two high-level points:

  • Major enrollment has recovered nicely since the deep trough after the dot.com bust and is now steady. We have near-100% placement, but local and state industry could hire far more graduates.
  • For the last few years we have also been working to reach more non-majors, which is a group we under-serve relative to most other schools. This should be an important part of the university's focus on STEM and STEM teacher education.

I closed with a connection to current events:

We think that all university graduates should understand what 'metadata' is and what computer programs can do with it -- enough so that they can understand the current stories about the NSA and be able to make informed decisions as a citizen.

I hoped that this would be provocative and memorable. The statement elicited laughs and head nods all around. The president commented on the Snowden case, asked me where I thought he would land, and made an analogy to The Man Without a Country. I pointed out that everyone wants to talk about Snowden, including the media, but that's not even the most important part of the story. Stories about people are usually of more interest than stories about computer programs and fundamental questions about constitutional rights.

I am not sure how many people believe that computer science -- or at least the foundations of computing in the modern world -- is a necessary part of a university education these days. Some schools have computing or technology requirements, and there is plenty of press for the "learn to code" meme, even beyond the CS world. But I wonder how many US university graduates in 2013 understand enough computing (or math) to understand this clever article and apply that understanding to the world they live in right now.

Our new president seemed to understand. That could bode well for our department and university in the coming years.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 08, 2013 1:05 PM

A Random Thought about Metadata and Government Surveillance

In a recent mischievous mood, I decided it might be fun to see the following.

The next whistleblower with access to all the metadata that the US government is storing on its citizens assembles a broad list of names: Republican and Democrat; legislative, executive, and judicial branches; public officials and private citizens. The only qualification for getting on the list is that the person has uttered any variation of the remarkably clueless statement, "If you aren't doing anything wrong, then you have nothing to hide."

The whistleblower then mines the metadata and, for each person on this list, publishes a brief that demonstrates just how much someone with that data can conclude -- or insinuate -- about a person.

If they haven't done anything wrong, then they don't have anything to worry about. Right?


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 07, 2013 9:32 AM

Interesting Sentences, Personal Weakness Edition

The quest for comeuppance is a misallocation of personal resources. -- Tyler Cowen

Far too often, my reaction to events in the world around me is to focus on other people not following rules, and the unfairness that results. It's usually not my business, and even when it is, it's a foolish waste of mental energy. Cowen expresses this truth nicely in neutral, non-judgmental language. That may help me develop a more productive mental habit.

What we have today is a wonderful bike with training wheels on. Nobody knows they are on, so nobody is trying to take them off. -- Alan Kay, paraphrased from The MIT/Brown Vannevar Bush Symposium

Kay is riffing off Douglas Engelbart's tricycle analogy, mentioned last time. As a computer scientist, and particularly one fortunate enough to have been exposed to the work of Ivan Sutherland, Engelbart, Kay and the Xerox PARC team, and so many others, I should be more keenly conscious that we are coasting along with training wheels on. I settle for limited languages and limited tools.

Even sadder, when computer scientists and software developers settle for training wheels, we tend to limit everyone else's experience, too. So my apathy has consequences.

I'll try to allocate my personal resources more wisely.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

July 06, 2013 9:57 AM

Douglas Engelbart Wasn't Just Another Computer Guy

Bret Victor nails it:

Albert Einstein, Discoverer of Photoelectric Effect, Dies at 76

In the last few days, I've been telling family and friends about Engelbart's vision and effect on the world of computing, and thus on their world. He didn't just "invent the mouse".

It's hard to imagine these days just how big Engelbart's vision was for the time. Watching The Mother of All Demos now, it's easy to think "What's the big deal? We have all that stuff now." or even "Man, that video looks prehistoric." First of all, we don't have all that stuff today. Watch again. Second, in a sense, that demo was prehistory. Not only did we not have such technology at the time, almost no one was thinking about it. It's not that people thought such things were impossible; they couldn't think about them at all, because no one had conceived them yet. Engelbart did.

Engelbart didn't just invent a mouse that allows us to point at files and web links. His ideas helped point an entire industry toward the future.

Like so many of our computing pioneers, though, he dreamed of more than what we have now, and expected -- or at least hoped -- that we would build on the industry's advances to make his vision real. Engelbart understood that skills which make people productive are probably difficult to learn. But they are so valuable that the effort is worth it. I'm reminded of Alan Kay's frequent use of a violin as an example, compared to a simpler music-making device, or even to a radio. Sure, a violin is difficult to play well. But when you can play -- wow.

Engelbart was apparently fond of another example, the tricycle:

Riding a bicycle -- unlike a tricycle -- is a skill that requires a modest degree of practice (and a few spills), but the rider of a bicycle quickly outpaces the rider of a tricycle.

Most of the computing systems we use these days are tricycles. Doug Engelbart saw a better world for us.


Posted by Eugene Wallingford | Permalink | Categories: Computing

July 03, 2013 10:22 AM

Programming for Everyone, Venture Capital Edition

Christina Cacioppo left Union Square Ventures to learn how to program:

Why did I want to do something different? In part, because I wanted something that felt more tangible. But mostly because the story of the internet continues to be the story of our time. I'm pretty sure that if you truly want to follow -- or, better still, bend -- that story's arc, you should know how to write code.

So, rather than settle for her lot as a non-programmer, beyond the accepted school age for learning these things -- technology is a young person's game, you know -- Cacioppo decided to learn how to build web apps. And build one.

When did we decide our time's most important form of creation is off-limits? How many people haven't learned to write software because they didn't attend schools that offered those classes, or the classes were too intimidating, and then they were "too late"? How much better would the world be if those people had been able to build their ideas?

Yes, indeed.

These days, she is enjoying the experience of making stuff: trying ideas out in code, discarding the ones that don't work, and learning new things every day. Sounds like a programmer to me.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

July 01, 2013 11:10 AM

Happy in my App

In a typically salty post, Jamie Zawinski expresses succinctly one of the tenets of my personal code:

I have no interest in reading my feeds through a web site (no more than I would tolerate reading my email that way, like an animal).

Living by this code means that, while many of my friends and readers are gnashing their teeth on this first of July, my life goes on uninterrupted. I remain a happy long-time user of NetNewsWire (currently, v3.1.6).

Keeping my feeds in sync with NetNewsWire has always been a minor issue, as I run the app on at least two different computers. Long ago, I wrote a couple of extremely simple scripts -- long scp commands, really -- that do a pretty good job. They don't give me thought-free syncing, but that's okay.

A lot of people tell me that apps are dead, that the HTML5-powered web is the future. I do know that we're very quickly running out of stuff we can't do in the browser and applaud the people who are making that happen. If I were a habitual smartphone-and-tablet user, I suspect that I would be miffed if web sites made me download an app just to read their web content. All that said, though, I still like what a clean, simple app gives me.


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 26, 2013 2:30 PM

An Opportunity to Learn, Born of Deprivation

Earlier this summer, my daughter was talking about something one of her friends had done with Instagram. As a smug computer weenie, I casually mentioned that she could do that, too.

She replied, "Don't taunt me, Dad."

You see, no one in our family has a cell phone, smart or otherwise, so none of us use Instagram. That's not a big deal for dear old dad, even though (or perhaps because) he's a computer scientist. But she is a teenager growing up in an entirely different world, filled with technology and social interaction, and not having a smart phone must surely seem like a form of child abuse. Occasionally, she reminds us so.

This gave me a chance to explain that Instagram filters are, at their core, relatively simple little programs, and that she could learn to write them. And if she did, she could run them on almost any computer, and make them do things that even Instagram doesn't do.

I had her attention.

So, this summer I am going to help her learn a little Python, using some of the ideas from media computation. At the end of our first pass, I hope that she will be able to manipulate images in a few basic ways: changing colors, replacing colors, copying pixels, and so on. Along the way, we can convert color images to grayscale or sepia tones, posterize images, embed images, and make simple collages.
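
As a taste of what we will write, here is a rough grayscale converter. It uses the Pillow imaging library, which is an assumption on my part -- the media computation materials supply their own image functions -- but the pixel-by-pixel idea is the same.

    # Convert a photo to grayscale, one pixel at a time, using Pillow.

    from PIL import Image

    def to_grayscale(path_in, path_out):
        img = Image.open(path_in).convert("RGB")
        pixels = img.load()
        width, height = img.size
        for x in range(width):
            for y in range(height):
                r, g, b = pixels[x, y]
                gray = (r + g + b) // 3      # simple average; good enough here
                pixels[x, y] = (gray, gray, gray)
        img.save(path_out)

    # to_grayscale("beach.jpg", "beach-gray.jpg")   # hypothetical file names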

That will make her happy. Even if she never feels the urge to write code again, she will know that it's possible. And that can be empowering.

I have let my daughter know that we probably will not write code that does as good a job as what she can see in Instagram or Photoshop. Those programs are written by pros, and they have evolved over time. I hope, though, that she will appreciate how simple the core ideas are. As James Hague said in a recent post, the key ideas in most apps require relatively few lines of code, with lots and lots of lines wrapped around them to handle edge cases and plumbing. We probably won't write much code for plumbing... unless she wants to.

Desire and boredom often lead to creation. They also lead to the best kind of learning.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

June 14, 2013 2:48 PM

The History of Achievement in AI

... often looks something like this:

  1. X := some task that people do well, but not computers.
  2. "It would really be impressive if a computer could X."
  3. Computer does X.
  4. "That's not intelligent. The computer is only doing search (or number crunching, or ...)."
  5. X := something else.
  6. Go to 2.

A common variation of this pattern is to replace Step 4 with a different dodge:

"That's no big deal. X doesn't really require intelligence."

In either case, the target moves.

Occasionally, the critic must admit, if grudgingly, that the task requires intelligence, whatever that means, and that the computer performs it well. But there is still one last move available to deflect the achievement from the computer:

"This is a human achievement. People had to program the computer."

I suspect that until a computer learns everything it knows from scratch -- whatever that means -- this pattern will repeat. We humans have an image to protect.

~~~~

Postscript. I wrote this after reading a short interview with playwright Matt Charman, who has dramatized Deep Blue's epic 1997 match win over world chess champion Garry Kasparov. Note that Charman does not employ the dodges I list. He simply chose to focus on the human personalities involved in the drama. And those personalities are worthy of exploration, especially the fascinating Kasparov!


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 13, 2013 3:01 PM

It's Okay to Talk About Currying!

James Hague offers some sound advice for writing functional programming tutorials. I agree with most of it, having learned the hard way by trying to teach functional style to university students for many years. But I disagree with one of his suggestions: I think it's okay to talk about currying.

Hague's concern with currying is practical:

Don't get so swept up in that theory that you forget the obvious: in any programming language ever invented, there's already a way to easily define functions of multiple arguments. That you can build this up from more primitive features is not useful or impressive to non-theoreticians.

Of course, my context is a little different. We teach functional programming in a course on programming languages, so a little theory is important. We want students not only to be able to write code in a functional style but also to understand some of the ideas at the foundation of the languages they use. We also want them to understand a bit about how different programming styles relate to one another.

But even in the context of teaching people to think functionally and to write code in that style, I think it's okay to talk about currying. Indeed, it is essential. Currying is not simply a theoretical topic. It is a valuable programming technique.

Here is an example. When we write a language interpreter, we often write a procedure named eval-exp. It takes two arguments: an expression to evaluate, and a list of variable/value bindings.

   (define eval-exp
     (lambda (exp env)
       ...))

The binding list, sometimes called an environment, is a map of names declared in the local block to their values, along with the bindings from the blocks that contain the local block. Each time the interpreter enters a new block, it pushes a new set of name/value pairs onto the binding list and recurses.

To evaluate a function call for which arguments are passed by value, the interpreter must first evaluate all of the function's arguments. As the arguments are all in the same block, they are evaluated using the same binding list. We could write a new procedure to evaluate the arguments recursively, but this seems like a great time to map a procedure over a list: (map eval-exp args), get a list of the results, and pass them to the code that applies the function to them.

We can't do that, though, because eval-exp is a two-argument procedure, and map works only with a one-argument procedure. But the same binding list is used to evaluate each of the expressions, so that argument to eval-exp is effectively a constant for the purposes of the mapping operation.

So we curry eval-exp:

   (define eval-exp-with
     (lambda (bindings)
       (lambda (exp)
         (eval-exp exp bindings))))

... to create the one-argument evaluator that we need, and we use it to evaluate the arguments with map:

   ; in eval-exp
   (map (eval-exp-with env) arguments)

In most functional languages, we can use a nameless lambda to curry eval-exp "in place" and avoid writing an explicit helper function:

   ; an alternative approach in eval-exp
   (map (lambda (exp)
          (eval-exp exp env))
        arguments)

This doesn't look much like currying because we never created the procedure that takes the bindings argument. But we can reach this same piece of code by writing eval-exp-with, calling it in eval-exp, and then using program derivation to substitute the value of the call for the call itself. This is actually a nice connection to be able to make in a course about programming languages!

When I deliver short tutorials on functional style, currying often does not make the cut, because there are so many cool and useful ideas to cover. But it doesn't take long writing functional code before currying becomes useful. As this example shows, currying is a practical tool for the functional programmer to have in his or her pocket. In FP, currying isn't just theory. It's part of the style.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

June 05, 2013 1:52 PM

I Fooled Around and Fell in Love

Cue the Elvin Bishop [ video ]...

I smile whenever I see this kind of statement on a website's About page:

Erika Carlson was studying clinical psychology in 2011, when she wrote her first line of Python code. She fell in love with programming, decided to change paths, and is now a software developer at Pillar Technology.

I fell in love upon writing my first line of code, too.

Not everyone will have the same reaction Erika and I had, but it's good that we give people at least an opportunity to learn how to program. Knowing that someone might react this way focuses my mind on giving novice programmers a good enough experience that they can fall in love, too, if they are so inclined.

My teaching should never get in the way of true love.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 31, 2013 1:44 PM

Quotes of the Week, in Four Dimensions

Engineering.

Michael Bernstein, in A Generation Ago, A Thoroughly Modern Sampling:

The AI Memos are an extremely fertile ground for modern research. While it's true that what this group of pioneers thought was impossible then may be possible now, it's even clearer that some things we think are impossible now have been possible all along.

When I was in grad school, we read a lot of new and recent research papers. But the most amazing, most educational, and most inspiring stuff I read was old. That's often true today as well.

Science.

Financial Agile tweets:

"If it disagrees with experiment, it's wrong". Classic.

... with a link to The Scientific Method with Feynman, which has a wonderful ten-minute video of the physicist explaining how science works. Among its important points is that guessing is a huge part of science. It's just that scientists have a way of telling which guesses are right and which are wrong.

Teaching.

James Boyk, in Six Words:

Like others of superlative gifts, he seemed to think the less gifted could do as well as he, if only they knew a few powerful specifics that could readily be conveyed. Sometimes he was right!

"He" is Leonid Hambro, who played with Victor Borge and P. D. Q. Bach but was also well-known as a teacher and composer. Among my best teachers have been some extraordinarily gifted people. I'm thankful for the time they tried to convey their insights to the likes of me.

Art.

Amanda Palmer, in a conference talk:

We can only connect the dots that we collect.

Palmer uses this sentence to explain in part why all art is about the artist, but it means something more general, too. You can build, guess, and teach only with the raw materials that you assemble in your mind and your world. So collect lots of dots. In this more prosaic sense, Palmer's sentence applies not only to art but also to engineering, science, and teaching.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

May 10, 2013 4:03 PM

Using Language to Understand a Data Set

Today was our twice-annual undergraduate research presentation day. Every B.S. student must do an undergraduate research project and present the results publicly. For the last few years, we have pooled the presentations on the morning of the Friday in finals week, after all the exams are given and everyone has a chunk of time free to present. It also means that more students and professors can attend, which makes for a more engaging audience and a nice end to everyone's semester.

I worked with one undergraduate research student this spring. As I mentioned while considering the role of parsing in a compilers course, this student was looking for patterns in several years of professional basketball play-by-play data. His ultimate goal was to explore ways of measuring the impact of individual defensive performance in the NBA -- fairly typical MoneyBall stuff applied to a skill that is not well measured or understood.

This project fell into my hands serendipitously. The student had approached a couple of other professors, who upon hearing the word "basketball" immediately pointed him to me. Of course, the project is really a data analytics project that just happens to involve a dataset from basketball, but... Fortunately, I am interested in both the approach and the domain!

As research sometimes does, this problem led the student to a new problem first. In order to analyze data in the way he wanted, he needed data of a particular sort. There is plenty of play-by-play data available publicly on the web, but it's mostly prepared for presentation in HTML. So he first had to collect the data by scraping the web, and then organize it into a data format amenable to analysis.

This student had taken my compiler course the last time around, and his ability to parse several files of similar but just-different-enough data proved to be invaluable. As presented on sites like nba.com, the data is nowhere near ready to be studied.

As the semester wore on, he and I came to realize that his project this semester wouldn't be the data analysis he originally intended to do. It was a substantial project simply to make sense of the data he had found.

As he presented his work today, I realized something further. He was using language to understand a data set.

He started by defining a grammar to model the data he found, so that he could parse it into a database. This involved recognizing categories of expression that were on the surface of the data, such as made and missed field goals, timeouts, and turnovers. When he ran this first version of his parser, he found unhandled entries and extended his grammar.

Then he looked at the semantics of the data and noticed discrepancies deeper in the data. The number of possessions his program observed in a game differed from the expected values, sometimes wildly and with no apparent pattern.

As we looked deeper, we realized that the surface syntax of the data often obscured some events that would extend or terminate a possession. A simple example is a missed free throw, which sometimes ends a possession and sometimes does not. It depends in part on the next event in the timeline.

To handle these cases, the student created new syntactic categories that enabled his parser to resolve such issues by recognizing composite events in the data. As he did this, his grammar grew, and his parser became better at building a more accurate semantic model of the game.
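His actual grammar belongs to him, but a toy sketch in the same spirit shows the idea. The category names here are mine, not his; the point is that composite categories are built from the surface events:

    <event>           ::= <simple-event> | <composite-event>
    <simple-event>    ::= <made-fg> | <missed-fg> | <made-ft> | <missed-ft>
                        | <rebound> | <turnover> | <timeout>
    <composite-event> ::= <missed-ft> <defensive-rebound>    ; possession ends
                        | <missed-ft> <offensive-rebound>    ; possession continues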

This turned out to be a semester-long project in its own right. He's still not done and intends to continue with this research after graduation. We were both a bit surprised at how much effort it took to corral the data, but in retrospect we should not have been too surprised. Data are collected and presented with many different purposes in mind. Having an accurate deep model of the underlying phenomenon in question isn't always one of them.

I hope the student was pleased with his work and progress this semester. I was. In addition to its practical value toward solving a problem of mutual interest, it reminded me yet again of the value of language in understanding the world around us, and the remarkable value that the computational ideas we study in computer science have to offer. For some reason, it also reminded me, pleasantly, of the Racket Way. As I noted in that blog entry, this is really the essence of computer science.

Of course, if some NBA team were to give my student the data he needs in suitable form, he could dive into the open question of how better to measure individual defensive performance in basketball. He has some good ideas, and the CS and math skills needed to try them out.

Some NBA team should snatch this guy up.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 08, 2013 12:11 PM

Not So Random Sentences

I start with a seemingly random set of sentences to blog about and, in the process of writing about them, find that perhaps they aren't so random after all.

An Era of Sharing Our Stuff

Property isn't theft; property is an inefficient distribution of resources.

This assertion comes from an interesting article on "economies of scale as a service", in reaction to a Paul Graham tweet:

Will ownership turn out to be largely a hack people resorted to before they had the infrastructure to manage sharing properly?

Open-source software, the Creative Commons, crowdsourcing. The times they are a-changin'.

An Era of Observing Ourselves

If the last century was marked by the ability to observe the interactions of physical matter -- think of technologies like x-ray and radar -- this century is going to be defined by the ability to observe people through the data they share.

... from The Data Made Me Do It.

I'm not too keen on being "observed" via data by every company in the world, even as I understand the value it can bring the company and even me. But I like very much the idea that I can observe myself more easily and more productively. For years, I collected and studied data about my running and used what I learned to train and race better. Programmers are able to do this better now than ever before. You can learn a lot just by watching.

An Era of Thinking Like a Scientist

... which leads to this line attributed to John C. Reynolds, an influential computer scientist who passed away recently:

Well, we know less than we did before, but more of what we know is actually true.

It's surprising how easy it is to know stuff when we don't have any evidence at all. Observing the world methodically, building models, and comparing them to what we observe in the future helps us know less of the wrong stuff and more of the right stuff.

Not everyone need be a scientist, but we'd all be better off if more of us thought like a scientist more often.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Personal

April 21, 2013 10:25 AM

Catnip for Programmers

This morning, Maciej Ceglowski of Pinboard introduced me to the Matasano crypto challenges, a set of exercises created by Thomas Ptacek and his team as a tool for teaching programmers a little about cryptography, some of its challenges, and the need for more awareness of how easy it is to do it wrong. With the coming break from the grind of the academic year, I plan on giving them a try.

After having completed the exercises himself, Ceglowski observes:

Crypto is like catnip for programmers. It is hard to keep us away from it, because it's challenging and fun to play with. And programmers respond very badly to the insinuation that they're not clever enough to do something. We see the F-16 just sitting there, keys in the ignition, no one watching, lights blinking, ladder extended. And some infosec nerd is telling us we can't climb in there, even though we just want to taxi around a little and we've totally read the manual.

I've noticed this with a number of topics in computing. In addition to cryptography, data compression and sorting/searching are sirens to the best programmers among our students. "What do you mean we can't do better?"

For many undergrads, the idea of writing a compiler seems a mystery. Heck, I admit to my students that even after years of teaching the course I remain in awe of my language tools and the people who build them. This challenge keeps a steady if relatively small stream of programmers flowing into our "Translation of Programming Languages" project course.

One of the great things about all these challenges is that after we tackle them, we have not only the finished product in hand but also what we learn about the topic -- and ourselves -- along the way. Then we are ready for a bigger challenge, and another program to write.

For CS faculty, catnip topics are invaluable ways to draw more students into the spell of computing, and more deeply. We are always on the lookout.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

April 20, 2013 10:25 AM

Reminiscing about Making My First Computer

Steve Wozniak

A friend put a copy of GameInformer magazine in my box yesterday with a pointer to an interview with the Great and Powerful Woz, Steve Wozniak. It's a short interview, only two pages, but it reminded me just how many cool things Wozniak (and so many others) did in the mid-1970s. It also reminded me of my younger days, coming into contact with the idea of games and machine learning for the first time.

Woz described how, after seeing Pong in a video arcade, he went home and built his own Pong game out of twenty-eight $1 chips. Steve Jobs took the game to Atari, where he encountered Nolan Bushnell, who had an idea for a single-player version of Pong. Thus did Woz design Breakout, a game with an especially apt name. It helped define Apple Computer.

The thought of building a computer game out of chips still amazes me. I was never a hardware guy growing up. I never had access to computer chips or that culture, and I had little inclination to fiddle with electronics, save for a few attempts to take apart radios and put them back together. When I designed things as a kid, they were houses or office buildings. I was going to be an architect. But Woz's story reminded me of one experience that foreshadowed my career as a computer scientist.

One year in school, I won a math contest. First prize was a copy of The Unexpected Hanging and Other Mathematical Diversions, a collection of Martin Gardner's columns from Scientific American. Chapter 8 was called "A Matchbox Game-Learning Machine". It described Hexapawn, a game played on a 3x3 board with chess pawns. The game was no more complex than Tic Tac Toe, but it was new. And I loved board games.

Gardner's article had more in store for me, though, than simply another game to study. He described how to create a "computer" -- a system of matchboxes -- that learns how to play the game! Here's how:

You make one box for each possible board position. In the box, you put different colored marbles corresponding to the moves that can be played in the position. Then you play a bunch of games against the matchbox computer. When it is the computer's turn to move, you pick up the box for that board position, shake it, and see which marble lands in the lower-right corner of the box. That's the computer's move.

When the game is over, the computer gets feedback. If it won the game, then put all the marbles back in their boxes. If it lost, punish it by keeping the marble responsible for its last move; put all the rest back in their boxes. Gardner claimed that by following this strategy, the matchbox computer would learn to play a perfect game in something under fifty moves.
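In code, the whole scheme is only a few lines. Here is a minimal sketch in Racket-flavored Scheme, under the assumption that the surrounding game code supplies positions and their legal moves in some representation of its own; none of these names come from Gardner's article:

    ; one "box" per position, holding the marbles (candidate moves) left in it
    (define boxes (make-hash))

    (define (choose-move position legal-moves)
      (let ((marbles (hash-ref boxes position (lambda () legal-moves))))
        (hash-set! boxes position marbles)
        ; shake the box and see which marble comes up;
        ; an empty box means the machine resigns, as Gardner's did
        (list-ref marbles (random (length marbles)))))

    ; after a loss, confiscate the marble responsible for the last move;
    ; after a win, do nothing -- all the marbles go back in their boxes
    (define (punish position move)
      (hash-set! boxes position (remove move (hash-ref boxes position))))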

This can't possibly work, can it? So I built it. And it did learn. I was happy, and amazed.

I remember experimenting a bit. Maybe a move wasn't always a loser? So I seeded the computer with more than one marble for each candidate move, so that the computer could overcome bad luck. Hexapawn is so simple that this wasn't necessary -- losing moves are losing moves -- but the computer still learned to play a perfect game, just a bit slower than before.

This is one of the earliest experiences I remember that started me down the road of studying artificial intelligence. Reading copious amounts of science fiction pushed me in that direction, too, but this was different. I had made something, and it learned. I was hooked.

So, I wasn't a hardware kid, but I had a hardware experience. It just wasn't digital hardware. But my inclination was always more toward ideas than gadgets. My interests quickly turned to writing programs, which made it so much easier to tinker with variations and to try brand-new ideas.

(Not so quickly, though, that I turned away from my dream of being an architect. The time I spent in college studying architecture turned out to be valuable in many ways.)

Wozniak was a hardware guy, but he quickly saw the potential of software. "Games were not yet software, and [the rise of the microprocessor] triggered in my mind: microprocessors can actually program games." He called the BASIC interpreter he wrote "Game BASIC". Ever the engineer, he designed the Apple II with integrated hardware and software so that programmers could write cool games.

I don't have a lot in common with Steve Wozniak, but one thing we share is the fun we have playing games. And, in very different ways, we once made computers that changed our lives.

~~~~

The GameInformer interview is on-line for subscribers only, but there is a cool video of Wozniak playing Tetris -- and talking about George H.W. Bush and Mikhail Gorbachev!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

April 05, 2013 3:21 PM

The Role of Parsing in a Compilers Course

I teach compilers again this fall. I'm looking forward to summer, when I'll have a chance to write some code and play with some ideas for the course.

This morning I thought a bit about a topic that pops up every time I prep the course. The thoughts were prompted by a tweet from James Coglan, which said "Really wish this Compilers course weren't half about parsing. ... get on with semantics." The ellipsis is mine; James's tweet said something about using lex/yacc to get something up and running fast. Then, presumably, we could get on to the fun of semantics.

This is a challenge for my compilers course, too. I know I don't want to rush through scanning and parsing, yet I also wish I had more time for static analysis, optimization, and code generation. Even though I know the value of parsing, I wish I had equal time for a lot of other cool topics.

Geoff Wozniak's response expressed one of the reasons parsing still has such a large role in my compilers course, and so many others:

Parsing is like assembly language: it seems superfluous at the time, but provides deep understanding later. It's worth it.

That's part of what keeps me from de-emphasizing it in my course. Former students often report back to me that they have used their skill at writing parsers frequently in their careers, whether for parsing DSLs they whip up or for making sense of a mess of data they want to process.

A current student is doing an undergrad research project that involves finding patterns in several years of professional basketball play-by-play data, and his ability to parse several files of similar but just-different-enough data proved invaluable. Of course, he was a bit surprised that corralling the data took as much effort as it did. Kind of like how scanning and parsing are such a big part of a compiler project.

I see now that James has tweeted a retraction:

... ppl are RTing something I said about wishing the Compilers course would get done with parsing ASAP. Don't believe this any more.

I understand the change of opinion. After writing a compiler for a big language and learning the intricacies that are possible, it's easy to reach Geoff's position: a deep understanding comes from the experience.

That doesn't mean I don't wish my semester were twenty weeks instead of fifteen, so that I could go deeper on some other topics, too. I figure there will always be some tension in the design of the course for just that reason.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

April 01, 2013 3:16 PM

Good Sentences, Programming State Edition

I've read a couple of interesting papers recently that included memorable sentences related to program state.

First, Stuart Sierra in On the Perils of Dynamic Scope:

Global state is the zombie in the closet of every Clojure program.

This essay explains the difference between scope and extent, a distinction that affects how easy it is to follow some of what happens in a program with closures and first-class functions with free variables. Sierra also shows the tension between variables of different kinds, using examples from Clojure. An informative read.
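Sierra's examples are in Clojure, but the distinction is easy to see in Racket, too. A rough sketch of my own, using a parameter for dynamic extent:

    ; lexical scope: a closure sees the x that was visible where it was created
    (define x 1)
    (define (read-x) x)
    (let ((x 2)) (read-x))              ; => 1

    ; dynamic extent: a parameter is seen by everything that runs while
    ; the parameterize is active, no matter where that code was defined
    (define p (make-parameter 1))
    (define (read-p) (p))
    (parameterize ((p 2)) (read-p))     ; => 2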

Next, Rob Pike in Go at Google: Language Design in the Service of Software Engineering, a write-up of his SPLASH 2012 keynote address:

The motto [of the Go language] is, "Don't communicate by sharing memory, share memory by communicating."

Imperative programmers who internalize this simple idea are on their way to understanding and using functional programming style effectively. The inversion of sharing and communication turns a lot of design and programming patterns inside out.
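Pike's motto is about Go, but the idea fits any language with threads and channels. A tiny sketch of my own in Racket, where the channel is the only point of contact between producer and consumer:

    (define ch (make-channel))

    ; the producer never touches a shared variable;
    ; it hands each value to whoever is listening on the channel
    (thread (lambda ()
              (for ((i (in-range 5)))
                (channel-put ch (* i i)))))

    (for ((i (in-range 5)))
      (displayln (channel-get ch)))     ; prints 0 1 4 9 16, one per line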

Pike's notes provide a comprehensive example of how a new language can grow out of the needs of a particular set of applications, rather than out of programming language theory. The result can look a little hodgepodge, but using such a language often feels just fine. (This reminds me of a different classification of languages with similar practical implications.)

~~~~

(These papers weren't published April Fool's Day, so I don't think I've been punked.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 27, 2013 12:46 PM

Programming Language as Operating System

We are deep in the semester now, using Racket in our programming languages course. I was thinking recently about how little of Racket's goodness we use in this course. We use it primarily as a souped-up R5RS Scheme and handy IDE. Tomorrow we'll see some of Racket's tools for creating new syntax, which will explore one of the rich niches of the system my students haven't seen yet.

I'm thinking about ways to introduce a deeper understanding of The Racket Way, in which domain concepts are programming language constructs and programming languages are extensible and composable. But it goes deeper. Racket isn't just a language, or a set of languages. It is an integrated family of tools to support language creation and use. To provide all these services, Racket acts like an operating system -- and gives you full programmatic access to the system.

(You can watch the video of Matthew Flatt's StrangeLoop talk "The Racket Way" at InfoQ -- and you should.)

The idea is bigger than Racket, of course. Dan Ingalls expressed this idea in his 1981 Byte article, Design Principles Behind Smalltalk:

Operating System: An operating system is a collection of things that don't fit into a language. There shouldn't be one.

Alan Kay talks often about this philosophy. The divide between programming language and operating system makes some things more difficult for programmers, and complicates the languages and tools we use. It also creates a divide in the minds of programmers and imposes unnecessary limitations on what programmers think is possible. One of the things that appealed to me in Flatt's StrangeLoop talk is that it presented a vision of programming without those limits.

There are implications of this philosophy, and costs. Smalltalk isn't just a language, with compilers and tools that you use at your Unix prompt. It's an image, and a virtual machine, and an environment. You don't use Smalltalk; you live inside it.

After you live in Smalltalk for a while, it feels strange to step outside and use other languages. More important, when you live outside Smalltalk and use traditional languages and tools, Smalltalk feels uncomfortable at best and foreboding at worst. You don't learn Smalltalk; you assimilate. At least, that's what it feels like to many programmers.

But the upside of the "programming language as operating system" mindset you find in Smalltalk and Racket can be huge.

This philosophy generalizes beyond programming languages. emacs is a text editor that subsumes most everything else you do, if you let it. (Before I discovered Smalltalk in grad school, I lived inside emacs for a couple of years.)

You can even take this down to the level of the programs we write. In a blog entry on delimited continuations, Andy Wingo talks about the control this construct gives the programmer over how their programs work, saying:

It's as if you were implementing a shell in your program, as if your program were an operating system for other programs.

When I keep seeing the same idea pop up in different places, with a form that fits the niche, I'm inclined to think I am seeing one of the Big Ideas of computer science.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 20, 2013 4:39 PM

Team Rings, Not Turing Awards

Alan Kay

Alan Kay recently wrote this on the Fundamentals of New Computing mailing list:

My own personal thoughts about what was accomplished [with Smalltalk] are completely intertwined with what our entire group was able to do in a few years at PARC. I would give us credit for a very high level combination of "computer science" and "software engineering" and "human centered design" and "commingled software and hardware", etc. The accomplishment was the group's accomplishment. And this whole (to me at least) was a lot more interesting than just a language idea.

I hasten to redirect personal praise to the group accomplishment whenever it happens.

I think this is also true for the larger ARPA-PARC community, and why it was able to accomplish so much at so many levels.

The "awards to individuals" structure beloved of other fields and of journalists completely misses the nature of this process. Any recognition should be like "World Series" rings -- everybody gets one, and that's it.

When Kay spoke at the 2004 OOPSLA Educators' Symposium as part of his Turing Award festivities, he frequently acknowledged the contributions of his team, in particular Dan Ingalls, and the influence that so many other people had on his team's work. Kay must have particularly appreciated receiving the Charles Stark Draper Prize together with Butler Lampson, Robert Taylor, and Charles Thacker, who helped create the conditions in which his team thrived.

In academia, we talk a lot about teamwork, but we tend to isolate individual performance for recognition. I like Kay's analogy to the rings received by teams that win sports championships. In those venues, the winners are unmistakably teams, even when a Michael Jordan or a Tom Brady stands out. That's how academic research tends to work, too. Perhaps we should make that clear more often in the awards we give.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Managing and Leading

March 12, 2013 4:30 PM

Two Views on Why Lisp's Syntax Matters

In the last year or so, I have seen a few people write to debunk the idea that Lisp is special because its code is written in the primary data structure of the language, the list. The one I remember best is Dave Herman's Homoiconicity isn't the point, which points out that the critical technical feature that makes Lisp syntax powerful is that it can be read without being parsed.

This morning I read an old Slashdot post in which Lisp guru Kent Pitman gives a more philosophical answer to the question about what makes Lisp's syntax so special:

I like Lisp's willingness to represent itself. People often explain this as its ability to represent itself, but I think that's wrong. Most languages are capable of representing themselves, but they simply don't have the will to.

That's a nice turn of phrase: Lisp is willing to represent itself in data, whereas most languages don't have "the will" to do so. It's not about possibility, but facility.

It's easier to manipulate and generate programs from inside a Lisp or Scheme program than in any other language that most of us might see on a daily basis. Rubyists manipulate nested arrays of symbols that encode abstract syntax trees, but this style feels somewhat artificial, and besides, Ruby's syntax is so large that it's hard for a Ruby program to process other Ruby programs in this way.
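To make the contrast concrete, here is the sort of thing a Scheme programmer does without ceremony. The fragment is an ordinary list, so taking programs apart and putting new ones together is just list processing:

    (define exp '(+ x 3))             ; a program fragment, held as plain list data
    (car exp)                         ; => the symbol +
    (list 'lambda '(x) exp)           ; => (lambda (x) (+ x 3)), a brand-new program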

As Pitman says, the fact that Lisp programs are represented by lists is almost beside the point. It might well have been arrays or some other data structure. The key is that it is the program's structure being represented, and not the character-level syntax of the programs. This is the same reason that code can be read without being parsed, and that the macro system can be so powerful.

It's also what makes it so easy to provide powerful support for programmers in their text editors and other tools. These tools don't require a lot of machinery or runtime to navigate and manipulate program code. The structure of the code lies close to its surface.

In the end, I like having two ways to think about Lisp's and Scheme's syntactic advantages: the technical reasons that live in the read procedure and the visceral reasons that embody how programmers feel while they work with a syntax that is willing to help the programmer.


Posted by Eugene Wallingford | Permalink | Categories: Computing

March 11, 2013 4:25 PM

Does Readability Give a False Sense of Understandability?

In Good for Whom?, Daniel Lyons writes about the readability of code. He starts with Dan Ingalls's classic Design Principles Behind Smalltalk, which places a high value on a system being comprehensible by a single person, and then riffs on readability in J and Smalltalk.

Early on, Lyons made me smile when he noted that, while J is object-oriented, it's not likely to be used that way by many people:

... [because] to use advanced features of J one must first use J, and there isn't a lot of that going on either.

As a former Smalltalker, I know how he feels.

Ultimately, Lyons is skeptical about claims that readability increases the chances that a language will attract a large audience. For one thing, there are too many counterexamples in both directions. Languages like C, which "combines the power of assembly language with the readability of assembly language" [ link ], are often widely used. Languages such as Smalltalk, Self, and Lisp, which put a premium on features such as purity and factorability, which in turn enhance readability, never seem to grow beyond a niche audience.

Lyons's insight is that readability can mislead. He uses as an example the source code of the J compiler, which is written in C but in a style mimicking J itself:

So looking at the J source code, it's easy for me to hold my nose and say, that's totally unreadable garbage; how can that be maintained? But at the same time, it's not my place to maintain it. Imagine if it were written in the most clean, beautiful C code possible. I might be able to dupe myself into thinking I could maintain it, but it would be a lie! Is it so bad that complex projects like J have complex code? If it were a complex Java program instead, I'd still need substantial time to learn it before I would stand a chance at modifying it. Making it J-like means I am required to understand J to change the source code. Wouldn't I have to understand J to change it anyway?

There is no point in misleading readers who have trouble understanding J-like code into thinking they understand the compiler, because they don't. A veneer of readability cannot change that.

I know how Lyons feels. I sometimes felt the same way as I learned Smalltalk by studying the Smalltalk system itself. I understood how things worked locally, within a method and then within a class, but I didn't understand the full network of classes that made up the system. And I had the scars -- and trashed images -- to prove it. Fortunately, Smalltalk was able to teach me many things, including object-oriented programming, along the way. Eventually I came to understand better, if not perfectly, how Smalltalk worked down in its guts, but that took a lot of time and work. Smalltalk's readability made the code accessible to me early, but understanding still took time.

Lyons's article brought to mind another insight about code's understandability that I blogged about many years ago in an entry on comments in code. This insight came from Brian Marick, himself no stranger to Lisp or Smalltalk:

[C]ode can only ever be self-explanatory with respect to an expected reader.

Sometimes, perhaps it's just as well that a language or a program not pretend to be more understandable than it really is. Maybe a barrier to entry is good, by keeping readers out until they are ready to wield the power it affords.

If nothing else, Lyons's stance can be useful as a counterweight to an almost unthinking admiration of readable syntax and programming style.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

February 21, 2013 3:13 PM

Ray Bradbury Channels Alan Kay

... in a Comic-Con 2010 interview:

Don't think about things, just do them.
Don't predict them, just make them.

This goes a bit farther than Kay's "The best way to predict the future is to invent it". In particular, I think Kay is okay with thinking about things.

Text and audio excerpts of the Bradbury interview are available on-line at Brain Pickings.


Posted by Eugene Wallingford | Permalink | Categories: Computing

February 18, 2013 12:59 PM

Code Duplication as a Hint to Think Differently

Last week, one of my Programming Languages students sent me a note saying that his homework solution worked correctly but that he was bothered by some duplicated code.

I was so happy.

Any student who has me for class for very long hears a lot about the dangers of duplication for maintaining code, and also that duplication is often a sign of poor design. Whenever I teach OOP or functional programming, we learn ways to design code that satisfy the DRY principle and ways to eliminate duplication via refactoring when it does sneak in.

I sent the student an answer, along with hearty congratulations for recognizing the duplication and wanting to eliminate it. My advice was, in essence, the refactoring I walk through below.

When I sat down to blog the solution, I had a sense of deja vu... Hadn't I written this up before? Indeed I had, a couple of years ago: Increasing Duplication to Eliminate Duplication. Even in the small world of my own teaching, it seems there is nothing new under the sun.

Still, there was a slightly different feel to the way I talked about this in class later that day. The question had come earlier in the semester this time, so the code involved was even simpler. Instead of processing a vector or a nested list of symbols, we were processing a flat list of symbols. And, instead of applying an arbitrary test to the list items, we were simply counting occurrences of a particular symbol, s.

The duplication occurred in the recursive case, where the procedure handles a pair:

    (if (eq? s (car los))
        (+ 1 (count s (cdr los)))      ; <---
        (count s (cdr los)))           ; <---

Then we make the two sub-cases more parallel:

    (if (eq? s (car los))
        (+ 1 (count s (cdr los)))      ; <---
        (+ 0 (count s (cdr los))))     ; <---

And then use distributivity to push the choice down a level:

    (+ (if (eq? s (car los)) 1 0)
       (count s (cdr los)))            ; <--- just once!

This time, I made a point of showing the students that not only does this solution eliminate the duplication, it more closely follows the command to follow the shape of the data:

When defining a program to process an inductively-defined data type, the structure of the program should follow the structure of the data.

This guideline helps many programmers begin to write recursive programs in a functional style, rather than an imperative style.

Note that in the first code snippet above, the if expression is choosing between two different solutions, depending on whether we see the symbol s in the first part of the pair or not. That's imperative thinking.

But look at the list-of-symbols data type:

    <list-of-symbols> ::= ()
                        | (<symbol> . <list-of-symbols>)

How many occurrences of s are in a pair? Obviously, the number of s's found in the car of the list plus the number of s's found in the cdr of the list. If we design our solution to match the code to the data type, then the addition operation should be at the top to begin:

    (+ ; number of s's found in the car
       ; number of s's found in the cdr
       )

If we define the answer for the problem in terms of the data type, we never create the duplication-by-if in the first place. We think about solving the subproblems for the car and the cdr, fill in the blanks, and arrive immediately at the refactored code snippet above.
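Filling in those blanks, and adding the base case for the empty list, gives the whole procedure. This is just the refactored snippet above wrapped in the definition the snippets assume, with the signature (count s los):

    (define count
      (lambda (s los)
        (if (null? los)
            0
            (+ (if (eq? s (car los)) 1 0)
               (count s (cdr los))))))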

I have been trying to help my students begin to "think functionally" sooner this semester. There is a lot of room for improvement yet in my approach. I'm glad this student asked his question so early in the semester, as it gave me another chance to model "follow the data" thinking. In any case, his thinking was on the right track.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

February 12, 2013 2:53 PM

Student Wisdom on Monad Tutorials

After class today, a few of us were discussing the market for functional programmers. Talk turned to Clojure and Scala. A student who claims to understand monads said:

To understand monad tutorials, you really have to understand monads first.

Priceless. The topic of today's class was mutual recursion. I think we are missing a base case here.

I don't know whether this is a problem with monads, a problem with the writers of monad tutorials, or a problem with the rest of us. If it is true, then it seems a lot of people are unclear on the purpose of a tutorial.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

February 07, 2013 5:01 PM

Quotes of the Day

Computational Thinking Division. From Jon Udell, another lesson that programming and computing teach us which can be useful out in the world:

Focus on understanding why the program is doing what it's doing, rather than why it's not doing what you wanted it to.

This isn't the default approach of everyone. Most of my students have to learn this lesson as a part of learning how to program. But it can be helpful outside of programming, in particular by influencing how we interact with people. As Udell says, it can be helpful to focus on understanding why one's spouse or child or friend is doing what she is doing, rather than on why she isn't doing what you want.

Motivational Division. From the Portland Ballet, of all places, several truths about being a professional dancer that generalize beyond the studio, including:

There's a lot you don't know.
There may not be a tomorrow.
There's a lot you can't control.
You will never feel 100% ready.

So get to work, even if it means reading the book and writing the code for the fourth time. That is where the fun and happiness are. All you can affect, you affect by the work you do.

Mac Chauvinism Division. From Matt Gemmell, this advice on a particular piece of software:

There's even a Windows version, so you can also use it before you've had sufficient success to afford a decent computer.

But with enough work and a little luck, you can afford better next time.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Managing and Leading

January 26, 2013 5:52 PM

Computing Everywhere: Indirection

Alice: The hardest word you'll ever be asked to spell is "ichdericious".

Bob: Yikes. Which word?

A few of us have had fun with the quotations in English and Scheme over the last few days, but this idea is bigger than symbols as data values in programs or even words and strings in natural language. They are examples of a key element of computational thinking, indirection, which occurs in real life all the time.

A few years ago, my city built a new water park. To account for the influx of young children in the area, the city dropped the speed limit in the vicinity of the pool from 35 MPH to 25 MPH. The speed limit in that area had been 35 MPH for a long time, and many drivers had a hard time adjusting to the change. So the city put up a new traffic sign a hundred yards up the road, to warn drivers of the coming change. It looks like this one:

traffic sign: 40 MPH speed limit ahead

The white image in the middle of this sign is a quoted version of what drivers see down the road, the usual:

traffic sign: 40 MPH speed limit

Now, many people slow down to the new speed limit well in advance, often before reaching even the warning sign. Maybe they are being safe. Then again, maybe they are confusing a sign about a speed limit sign with the speed limit sign itself.

If so, they have missed a level of indirection.

I won't claim that computer scientists are great drivers, but I will say that we get used to dealing with indirection as a matter of course. A variable holds a value. A pointer holds the address of a location, which holds a value. A URL refers to a web page. The list goes on.

Indirection is a fundamental element in the fabric of computation. As computation becomes an integral part of nearly everyone's daily life, there is a lot to be gained by more people understanding the idea of indirection and recognizing opportunities to put it to work to mutual benefit.

Over the last few years, Jon Udell has been making a valiant attempt to bring this issue to the attention of computer scientists and non-computer scientists alike. He often starts with the idea of a hyperlink in a web page, or the URL to which it is tied, as a form of computing indirection that everyone already groks. But his goal is to capitalize on this understanding to sneak the communication strategy of pass by reference into people's mental models.

As Udell says, most people use hyperlinks every day but don't use them as well as they might, because the distinction between "pass by value" and "pass by reference" is not a part of their usual mental machinery:

The real problem, I think, is that if you're a newspaper editor, or a city official, or a citizen, pass-by-reference just isn't part of your mental toolkit. We teach the principle of indirection to programmers. But until recently there was no obvious need to teach it to everybody else, so we don't.

He has made the community calendar his working example of pass by reference, and his crusade:

In the case of calendar events, you're passing by value when you send copies of your data to event sites in email, or when you log into an events site and recopy data that you've already written down for yourself and published on your own site.

You're passing by reference when you publish the URL of your calendar feed and invite people and services to subscribe to your feed at that URL.

"Pass by reference rather than by value" is one of Udell's seven ways to think like the web, his take on how to describe computational thinking in a world of distributed, network media. That essay is a good start on an essential module in any course that wants to prepare people to live in a digital world. Without these skills, how can we hope to make the best use of technology when it involves two levels of indirection, as shared citations and marginalia do?

Quotation in Scheme and pass-by-reference are different issues, but they are related in a fundamental way to the concept of indirection. We need to arm more people with this concept than just CS students learning how programming languages work.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

January 25, 2013 4:47 PM

More on Real-World Examples of Quotation

My rumination on real-world examples of quotation to use with my students learning Scheme sparked the imaginations of several readers. Not too surprisingly, they came up with better examples than my own... For example, musician and software developer Chuck Hoffman suggested:

A song, he sang.
"A song", he sang.

The meaning of these is clearly different depending on whether we treat a song as a variable or as a literal.

My favorite example came from long-time friend Joe Bergin:

"Lincoln" has seven letters.
Lincoln has seven letters.

Very nice. Joe beat me with my own example!

As Chuck wrote, song titles create an interesting challenge, whether someone is singing a certain song or singing in a way defined by the words that happen to also be the song's title. I have certainly found it hard to find words that are both part of a title or a reference and able to flow seamlessly in a sentence.

This turns out to be a fun form of word play, independent of its use as a teaching example. Feel free to send me your favorites.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

January 24, 2013 4:32 PM

Real-Life Examples of Quotation in Scheme

The new semester is fully underway, and I'm already enjoying Programming Languages. My Tuesday session this week felt like a hodgepodge of topics, including Scheme definitions and conditionals, and didn't inspire my students much. Today's session on pairs and lists seemed to go much more smoothly, at least from my side of the classroom.

One thing that has been different the first two weeks this time around has been several questions about the quote character in Scheme, which is shorthand for the special form quote.

The purpose of the quote is to tell the interpreter to take its argument literally. When the argument is a list, say, '(* 2 3), quotation prevents the interpreter from evaluating the list as a Scheme procedure call. When the argument is a symbol, say, 'a, the quote lets the interpreter know not to treat the a as an identifier, looking up the value bound to that name in the current environment. Instead, it is treated as the literal symbol a. Most of our students have not yet worked in languages where symbols are first-class data values, so this idea takes some getting used to.
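A short interaction at the REPL makes the distinction concrete:

    (define a 6)

    a            ; => 6, the value bound to the identifier a
    'a           ; => a, the symbol itself
    (* 2 3)      ; => 6, the value of a procedure call
    '(* 2 3)     ; => (* 2 3), a list of three elements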

In the course of talking about quotation with them, I decided to relate this idea to an example of quotation from real life. The first thing that came to mind at that instant was the distinction between these two sentences:

Lincoln was disappointing.
"Lincoln" was disappointing.

In the former, Lincoln is a name to be evaluated. Depending on the context, it could refer to the 16th president of the United States, the capital of Nebraska, or some other object in the world. (The sentence doesn't have to be true, of course!)

In the latter, quoting Lincoln makes it a title. I intended for this "literal" reference to the word Lincoln to evoke the current feature film of that name.

Almost immediately I began to second-guess my example. The quoted Lincoln is still a name for something -- a film, or a book, or some such -- and so still needs to be "dereferenced" to retrieve the object signified. It's just that we treat titles differently than other names.

So it's close to what I wanted to convey, but it could mislead students in a dangerous way.

The canonical real-world example of quotation is to quote a word so that we treat the utterance as the word itself. Consider:

Creativity is overused.
"Creativity" is overused.

In the former, creativity is a name to be evaluated. It signifies an abstract concept, a bundle of ideas revolving around creation, originality, art, and ingenuity. We might say creativity is overused in a context where people should be following the rules but are instead blazing their own trails.

In the latter, the quoted creativity signifies the word itself, taken literally. We might say "creativity" is overused to suggest an author improve a piece of writing by choosing a near-synonym such as "cleverness" or "originality", or by rephrasing a sentence so that the abstract concept is recast as the verb in an active statement.

This example stays more faithful to the use of quote in Scheme, where an expression is taken literally, with no evaluation of any kind needed.

I like giving examples of how programming concepts exist in other parts of our lives and world. Even when they are not perfect matches, they can sometimes help a student's mind click on the idea as it works in a programming language or style.

I like it better when I use better examples!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 18, 2013 2:42 PM

Alive with Infinite Possibilities

the PARC 5-key Chord Keyboard, courtesy the Buxton collection

Engelbart's Violin tells the interesting story of Douglas Engelbart's chorded keyboard, or "chorder", an input device intended as a supplement to the traditional keyboard. Engelbart was part of a generation that saw computing as a universe of unlimited possibilities, and more than many others he showed us glimpses of what it could be.

I grew up in an age when an unadorned BASIC interpreter was standard equipment on any computer, and with so little software available to us, we all wrote programs to make the machine do our bidding. In a narrower way, we felt the sense of unlimited possibilities that drove Engelbart, Sutherland, and the generations that came before us. If only we all had vision as deep.

Unfortunately, not many teenagers get to have that kind of experience anymore. BASIC became VB.Net, a corporate language for a corporate world. The good news is that languages like Python and even JavaScript make programming accessible to more people again, but the ethos of anyone learning to program on his or her own at home seems to have died off.

Engelbart's Violin uses strong language to judge the current state of computing, with some of its strongest lamenting the "cruel discrepancy" between the experience of a creative child learning to program and the world of professional programming:

When you are a teenager, alone with a (programmable) computer, the universe is alive with infinite possibilities. You are a god. Master of all you survey. Then you go to school, major in "Computer Science", graduate -- and off to the salt mines with you, where you will stitch silk purses out of sow's ears in some braindead language, building on the braindead systems created by your predecessors, for the rest of your working life. There will be little room for serious, deep creativity. You will be constrained by the will of your master (whether the proverbial "pointy-haired boss", or lemming-hordes of fickle startup customers) and by the limitations of the many poorly-designed systems you will use once you no longer have an unconstrained choice of task and medium.

Ouch. We who teach CS at the university find ourselves trapped between the needs of a world that employs most of our graduates and the beauty that computing offers. Alas, what Alan Kay said about Engelbart applies more broadly: "Engelbart, for better or for worse, was trying to make a violin.... [M]ost people don't want to learn the violin." I'm heartened to see so many people, including my own colleagues, working so hard to bring the ethos and joy of programming back to children, using Scratch, media computation, and web programming.

This week, I began a journey with thirty or so undergraduate CS students, who over the next four months will learn Scheme and -- I hope -- get a glimpse of the infinite possibilities that extend beyond their first jobs, or even their last. At the very least, I hope I don't shut any more doors on them.

~~~~

PHOTO. The PARC 5-key Chord Keyboard, from the Buxton collection at Microsoft Research.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 10, 2013 3:59 PM

The Pleasure of Elegant Composition in a Programming Language

At the 1978 APL Conference, Alan Perlis gave a talk called Almost Perfect Artifacts Improve only in Small Ways, in which he said:

What attracted me, then, to APL was a feeling that perhaps through APL one might begin to acquire some of the dimensions in programming that we revere in natural language -- some of the pleasures of composition; of saying things elegantly; of being brief, poetic, artistic, that makes our natural languages so precious to us. That aspect of programming was one that I've long been interested in but have never found any lever for coming close to in my experience with languages of the FORTRAN, ALGOL, PL/I school.

I learned APL as an undergrad and knew immediately that thinking in it was unlike thinking in any other language I had learned, even Lisp. These languages, though, shared a Wow! factor. They enabled programs that did amazing things, or ordinary things in amazing ways. By contrast, BASIC and FORTRAN and PL/I seemed so prosaic.

As an undergrad, I never developed the sort of fluency that would allow me to, say, write an assembler in 20 lines of APL or Lisp. I did develop a fondness for functional programming that stayed with me into graduate school, where I came into deeper contact with Lisp. I also learned Smalltalk, which I came to admire in a way similar to Perlis's feeling for APL.

I must admit, the beauty and expressiveness of array languages and functional languages have always felt less natural to me than natural language. Their more mathematical orientation felt foreign to me, less like writing in a natural language than solving a puzzle. This wasn't a matter of me not liking math; I took advanced math throughout school and always enjoyed it. But it felt different to me. This is, I see now, a personal preference and likely an indicator of why I was drawn more intimately into computer science than into more study of math.

The language I use these days that makes me feel the way Perlis feels about APL is Ruby. It occupies a similar space as Python, which we teach our students and use in several courses. I like Python a lot, more than I like most languages, but it feels plain to me in the sense once explained by John Cook. It is simple, and I get things done when I program in it, but when I use it, I feel like I am programming.

Ruby has this goofy, complex syntax that makes it possible to write some hideous stuff. But Ruby also makes it possible to write code that is brief and elegant, even artistic.

I first saw this at PLoP many years ago, when looking over Gerard Meszaros's shoulder at code he had written to support the writing and publishing of his XUnit Test Patterns book. His code read like the book he was writing. Then I began to see DSLs embedded in Ruby, tools like Rake and Treetop, that made me forget about the language they were implemented in. When you used those tools and others like them, you were writing in a new language, one that fit the thoughts in your head. Yet you were still unmistakably writing Ruby.

Perhaps if I were more an engineer at heart, I would feel differently about simple, sturdy languages that let me get things done. I like them, but they don't make me feel like I am "under the influence", as Perlis writes. They are just attractive tools. Perhaps if I were more a mathematician at heart, I would feel even more at home with the elegance that Haskell and APL give me.

Whatever the reasons, Smalltalk and Ruby grabbed me in ways that no other languages have. I think that is due at least in part to the way they connected to my personal understanding and love for natural language. It's interesting how we can all feel this way about different programming languages. I think it says something important about the nature of computing and programming.


Posted by Eugene Wallingford | Permalink | Categories: Computing

December 31, 2012 8:22 AM

Building Things and Breaking Things Down

As I look toward 2013, I've been thinking about Alan Kay's view of CS as science [ link ]:

I believe that the only kind of science computing can be is like the science of bridge building. Somebody has to build the bridges and other people have to tear them down and make better theories, and you have to keep on building bridges.

In 2013, what will I build? What will I break down, understand, and help others to understand better?

One building project I have in mind is an interactive text. One analysis project in mind involves functional design patterns.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Patterns

December 28, 2012 10:03 AM

Translating Code Gibberish to Human-Speak

Following an old link to Ridiculous Fish's Unix shell fish, I recently stumbled upon the delightful cdecl, a service that translates C declarations, however inscrutable, into plain English (and vice versa). As this introductory post says,

Every C declaration will be as an open book to you! Your coworkers' scruffy beards and suspenders will be nigh useless!

The site even provides permalinks so that you can share translations of your thorniest C casts with friends and family.

These pages are more than three years old, so I'm surely telling you something you already know. How did I just find this?

I don't program in C much these days, so cdecl itself is of use to me only as a humorous diversion. But it occurs to me that simple tools like this could be useful in a pedagogical setting. Next semester, my students will be learning Scheme and functional programming style. The language doesn't have much syntax, but it does have all those parentheses. Whatever I say or do, they disorient many of my students for a while. Some of them will look at even simple code such as

     (let ((x (square 4))
           (y 7))
       (+ x y))

... and feel lost. We spend time in class learning how to read code, and talk about the semantics of such expressions, which helps. But in a pinch, wouldn't it be nice for a student to hit a button and have that code translated into something more immediately comprehensible? Perhaps:

Let x be the square of 4 and y be 7 in the sum of x and y.

This might be a nice learning tool for students as they struggle with a language that seems to them -- at least early on -- to be gibberish on a par with char (*(*(* const x[3])())[5])(int).

Some Scheme masters might well say, "But the syntax and semantics of a let are straightforward. You don't really need this tool." At one level, this is true. Unfortunately, it ignores the cognitive and psychological challenges that most people face when they learn something that is sufficiently unfamiliar to them.

Actually, I think we can use the straightforwardness of the translation as a vehicle to help students learn more than just how a let expression works. I have a deeper motive.

Learning Scheme and functional programming are only a part of the course. Its main purpose is to help students understand programming languages more generally, and how they are processed by interpreters and compilers.

When we look at the let expression above, we can see that translating it into the English expression is not only straightforward, it is 100% mechanical. If it's a mechanical process, then we can write a program to do it for us! Following a BNF description of the expression's syntax, we can write an interpreter that exposes the semantics of the expression.
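To show how mechanical it is, here is a minimal sketch of my own in Racket. It handles only a let with the shape shown above and renders subexpressions verbatim; a real tool would recurse into them to produce phrases like "the square of 4":

     (define (let->english exp)
       (let ((bindings (cadr exp))
             (body     (caddr exp)))
         (format "Let ~a in ~a."
                 (string-join (map (lambda (b)
                                     (format "~a be ~a" (car b) (cadr b)))
                                   bindings)
                              " and ")
                 body)))

     ; (let->english '(let ((x (square 4)) (y 7)) (+ x y)))
     ; => "Let x be (square 4) and y be 7 in (+ x y)."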

In many ways, that is the essence of this course.

At this point, this is only a brainstorm, perhaps fueled by holiday cooking and several days away from the office. I don't know yet how much I will do with this in class next term, but there is some promise here.

Of course, we can imagine using a cdecl-like tool to help beginners learn other languages, too. Perhaps there are elements of writing OO code in Java that confuse students enough to make a simple translator useful. Surely public static void main( String[] args) deserves some special treatment! Ruby is complex enough that it might require dozens of little translators to do it justice. Unfortunately, it might take Matz's inside knowledge to write them.

(The idea of translating inscrutable code into language understandable by humans is not limited to computer code, of course. There is a popular movement to write laws and other legal code in Plain English. This movement is occasionally championed by legislators -- especially in election years. The U.S. Securities and Exchange Commission has its own Plain English Initiative and Plain English Handbook. At seventy-seven pages, the SEC handbook is roughly the same size as the R6RS description of Scheme.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 17, 2012 3:39 PM

The Web is More Than The Latest App or Walled Garden

Anil Dash, on the web we lost:

... today's social networks, they've brought in hundreds of millions of new participants to these networks, and they've certainly made a small number of people rich.

But they haven't shown the web itself the respect and care it deserves, as a medium which has enabled them to succeed. And they've now narrowed the possibilities of the web for an entire generation of users who don't realize how much more innovative and meaningful their experience could be.

I've never warmed to Facebook, for much this reason. I enjoy Twitter, but I treat it as a source of ephemera. Anything that I want to last gets cached in a file of links, shared with colleagues or friends by e-mail, or -- best of all -- blogged about.

I sometimes wonder if blog readers will weary of finding links to things they've already seen via Twitter, or if Twitter has trained too many of us not to want to read someone's comments on such articles in blog entries. But this seems one of the great and lasting values of a blog, one that will remain even after Facebook and Twitter have gone the way of Usenet and GeoCities. The social web is more, and I want to remain a part of it.


Posted by Eugene Wallingford | Permalink | Categories: Computing

December 12, 2012 4:18 PM

Be a Driver, Not a Passenger

Some people say that programming isn't for everyone, just as knowing how to tinker under the hood of one's car isn't for everyone. Some people design and build cars; other people fix them; and the rest of us use them as high-level tools.

Douglas Rushkoff explains why this analogy is wrong:

Programming a computer is not like being the mechanic of an automobile. We're not looking at the difference between a mechanic and a driver, but between a driver and a passenger. If you don't know how to drive the car, you are forever dependent on your driver to take you where you want to go. You're even dependent on that driver to tell you when a place exists.

This is CS Education week, "a highly distributed celebration of the impact of computing and the need for computer science education". As a part of the festivities, Rushkoff was scheduled to address members of Congress and their staffers today about "the value of digital literacy". The passage quoted above is one of ten points he planned to make in his address.

As good as the other nine points are -- and several are very good -- I think the distinction between driver and passenger is the key, the essential idea for folks to understand about computing. If you can't program, you are not a driver; you are a passenger on someone else's trip. They get to decide where you go. You may want to invent a new place entirely, but you don't have the tools of invention. Worse yet, you may not even have the tools you need to imagine the new place. The world is as it is presented to you.

Don't just go along for the ride. Drive.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

December 09, 2012 5:12 PM

Just Build Things

The advantage of knowing how to program is that you can. The danger of knowing how to program is that you will want to.

From Paul Graham's How to Get Startup Ideas:

Knowing how to hack also means that when you have ideas, you'll be able to implement them. That's not absolutely necessary..., but it's an advantage. It's a big advantage, when you're considering an idea ..., if instead of merely thinking, "That's an interesting idea," you can think instead, "That's an interesting idea. I'll try building an initial version tonight."

Writing programs, like any sort of fleshing out of big ideas, is hard work. But what's the alternative? Not being able to program -- in which case you'll need to find a programmer.

If you can program, what should you do?

[D]on't take any extra classes, and just build things. ... But don't feel like you have to build things that will become startups. That's premature optimization. Just build things.

Even the professor in me has to admit this is true. You will learn a lot of valuable theory, tools, and practices in class. But when a big idea comes to mind, you need to build it.

As Graham says, perhaps the best way that universities can help students start startups is to find ways to "leave them alone in the right way".

Of course, programming skills are not all you need. You'll probably need to be able to understand and learn from users:

When you find an unmet need that isn't your own, it may be somewhat blurry at first. The person who needs something may not know exactly what they need. In that case I often recommend that founders act like consultants -- that they do what they'd do if they'd been retained to solve the problems of this one user.

That's when those social science courses can come in handy.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

December 07, 2012 11:17 AM

Agglutination and Crystallization

Alan Kay talks about programming languages quite a bit in this wide-ranging interview. (Aren't all interviews with Kay wide-ranging?) I liked this fuzzy bifurcation of the language world:

... a lot of them are either the agglutination of features or ... a crystallization of style.

My initial reaction was that I'm a crystallization-of-style guy. I have always had a deep fondness for style languages, with Smalltalk at the head of the list and Joy and Scheme not far behind.

But I'm not a purist when it comes to neat and scruffy. As an undergrad, I really liked programming in PL/I. Java never bothered me as much as it bothered some of my purist friends, and I admit unashamedly that I enjoy programming in it.

These days, I like Ruby as much as I like any language. It is a language that lies in the fuzz between Kay's categories. It has an "everything is an object" ethos but, man alive, is it an amalgamation of syntactic and semantic desiderata.

I attribute my linguistic split personality to this: I prefer languages with a "real center", but I don't mind imposing a stylistic filter on an agglutinated language. PL/I always felt comfortable because I programmed with a pure structured programming vibe. When I program in Java or Ruby now, somewhere in the center of my mind is a Smalltalk programmer seeing the language through a Smalltalk lens. I have to make a few pragmatic concessions to the realities of my tool, and everything seems to work out fine.

This semester, I have been teaching with Java. Next semester, I will be teaching with Scheme. I guess I can turn off the filter.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 30, 2012 3:49 PM

Passing Out the Final Exam on Day One

I recently ran across an old blog posting called Students learn what they need, not what is assigned, in which Ted Dunning described a different sort of "flipped course" than is usually meant: He gave his students the final exam on Day 1, passed out the raw materials for the course, and told them to "get to work". They decided what they needed to learn, and when, and asked for instruction and guidance on their own schedule.

Dunning was happy with the results and concluded that...

... these students could learn vastly more than was expected of them if they just wanted to.

When students like what they are doing, they can surprise most everyone with what they will do to learn. Doing something cool like building a robot (as Dunning's students did) can be all the motivation some students need.

I'm sometimes surprised by just what catches my students' fancy. A few weeks ago, I asked my sophomore- and junior-level OOP class to build the infrastructure for a Twitter-like app. It engaged them like only graphical apps usually do. They've really dug into the specs to figure out what they mean. Many of them don't use Twitter, which has been good, because it frees them of too many preconceived limitations on where they can take their program.

They are asking good questions, too, about design: Should this object talk to that one? The way I divided up the task led to code that feels fragile; is there a better way? It's so nice not to still be answering Java questions. I suspect that some are still encountering problems at the language level, but they are solving them on their own and spending more time thinking about the program at a higher level.

I made this a multi-part project. They submitted Iteration 1 last weekend, will submit Iteration 2 tomorrow, and will work on Iteration 3 next week. That's a crucial element, I think, in getting students to begin taking their designs more seriously. It matters how hard or easy it is to change the code, because they have to change it now -- and tomorrow!

The point of Dunning's post is that students have to discover the need to know something before they are really interested in learning it. This is especially true if the learning process is difficult or tedious. You can apply this idea to a lot of software development, and even more broadly to CS.

I'm not sure when I'll try the give-the-final-exam-first strategy. My compiler course already sort of works that way, since we assign the term project upfront and then go about learning what we need to build the compiler. But I don't make my students request lectures; I still lay the course out in advance and take only occasional detours.

I think I will go at least that far next semester in my programming languages course, too: show them a language on day one and explain that our goal is to build an interpreter for it by the end of the semester, along with a few variations that explore the range of possibilities that programming languages offer. That may create a different focus in my mind as I go through the semester. I'm curious to see.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 28, 2012 6:34 PM

Converting Lecture Notes into an Active Website

... in which the author seeks pointers to interactive Scheme materials on-line.

Last summer, I fiddled around a bit with Scribble, a program for writing documentation in (and for) Racket. I considered using it to write the lecture notes and website for my fall OOP course, but for a variety of reasons set it aside.

the icon for Slideshow

In the spring I'll be teaching Programming Languages again, and using Racket with my students. This seems like the perfect time to dive in and use Scribble and Slideshow to create all my course materials. This will create a synergy between what I do in class and how I prep, which will be good for me. Using Racket tools will also set a good example for my students.

After seeing The Racket Way, Matthew Flatt's talk at StrangeLoop, I am inspired to do more than simply use Racket tools to create text and slides and web pages. I'd like to re-immerse myself in a world where everything is a program, or nearly so. This would set an even more important example for my students, and perhaps help them to see more clearly that they don't ever have to settle for the programs, the tools, or the languages that people give them. That is the Computer Science way as well as the Racket way.

I've also been inspired recently by the idea of an interactive textbook à la Miller and Ranum. I have a pretty good set of lecture notes for Programming Languages, but the class website should be more than a 21st-century rendition of a 19th-century presentation. I think that using Scribble and Slideshow is a step in the right direction.

So, a request: I am looking for examples of people using the Racket presentation tools to create web pages that have embedded Scheme REPLs, perhaps even a code stepper of the sort Miller and Ranum use for Python. Any pointers you might have are welcome.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 18, 2012 9:13 AM

Programming Languages Quote of the Day

... comes from Gilad Bracha:

I firmly believe that a time traveling debugger is worth more than a boatload of language features[.]

This passage comes as part of a discussion of what it would take to make Bret Victor's vision of programming a reality. Victor demonstrates powerful ideas using "hand crafted illustrations of how such a tool might behave". Bracha -- whose work on Smalltalk and Newspeak has long inspired me -- reflects on what it would take to offer Victor's powerful ideas in a general purpose programming environment.

Smalltalk as a language and environment works at a level where we can conceive of providing the support Victor and Bracha envision, but most of the language tools people use today are too far removed from the dynamic behavior of the programs being written. The debugger is the most notable example.

Bracha suggests that we free the debugger from the constraints of time and make it a tool for guiding the evolution of the program. He acknowledges that he is not the first person to propose such an idea, pointing specifically to Bill Lewis's proposal for an omniscient debugger. What remains is the hard work needed to take the idea farther and provide programmers more transparent support for understanding dynamic behavior while still writing the code.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 15, 2012 4:04 PM

Teaching Students to Read and Study in a New Way

Mark Guzdial's How students use an electronic book reports on the research paper "Performance and Use Evaluation of an Electronic Book for Introductory Python Programming" [ pdf ]. In this paper, Alvarado et al. evaluate how students used the interactive textbook How to Think Like a Computer Scientist by Ranum and Miller in an intro CS course. The textbook integrates traditional text with embedded video, "active" examples using an embedded Python interpreter, and empirical examples using a code stepper à la a debugger.

The researchers were surprised to find how little some students used the book's interactive features:

One possible explanation for the less-than-anticipated use of the unique features may be student study skills. The survey results tend to suggest that students "study" by "reading". Few students mention coding or tracing programs as a way of "studying" computer science.

I am not using an interactive textbook in my course this semester, but I have encountered the implicit connection in many students' minds between studying and reading. It caught me off-guard, too.

After lengthy searching and some thought, I decided to teach my sophomore-level OOP course without a required text. I gave students links to two on-line books they could use as Python references, but neither covers the programming principles and techniques that are at the heart of the course. In lieu of a traditional text, I have been giving my students notes for each session, written up carefully in a style that resembles a textbook, and source code -- lots and lots of source code.

Realizing that this would be an unusual way for students to study for a CS class, at least compared to their first-year courses, I have been pretty consistent in encouraging them to work this way. Daily I suggest that they unpack the code, read it, compile it, and tinker with it. The session notes often include little exercises they can do to test or extend their understanding of a topic we have covered in class. In later sessions, I often refer back to an example or use it as the basis for something new.

I figured that, without a textbook to bog them down, they would use my session notes as a map and spend most of their time in the code spelunking, learning to read and write code, and seeing the ideas we encounter in class alive in the code.

a snapshot of Pousse cells in two dimensions

Like the results reported in the Alvarado paper, my experiences have been mixed, and in many ways not what I expected. Some students read very little, and many of those who do read the lecture notes spend relatively little time playing with the code. They will spend plenty of time on our homework assignments, but little or no time on code for the purposes of studying. My data is anecdotal, based on conversations with the subset of students who visit office hours and e-mail exchanges with students who ask questions late at night. But performance on the midterm exam and some of the programming assignments are consistent with my inference.

OO programs are the literature of this course. Textbooks are like commentaries and (really long) Cliff Notes. If indeed the goal is to get students to read and write code, how should we proceed? I have been imagining an even more extreme approach:

  • no textbook, only a language reference
  • no detailed lecture notes, only cursory summaries of what we did in class
  • code as a reading assignment before each session
  • every day in class, students do tasks related to the assigned reading -- engaging, fun tasks, but tasks they can't or wouldn't want to do without having studied the assigned code

A decade or so ago, I taught a course that mixed topics in user interfaces and professional ethics using a similar approach. It didn't provide magic results, but I did notice that once students got used to the unusual rhythm of the course they generally bought in to the approach. The new element here is the emphasis on code as the primary literature to read and study.

Teaching a course in a way that subverts student expectations and experience creates a new pedagogical need: teaching new study skills and helping students develop new work habits. Alvarado et al. recognize that this applies to using a radically different sort of textbook, too:

Might students have learned more if we encouraged them to use codelens more? We may need to teach students new study skills to take advantage of new learning resources and opportunities.

...

Another interesting step would be to add some meta-instruction. Can we teach students new study skills, to take advantage of the unique resources of the book? New media may demand a change in how students use the media.

I think those of us who teach at the university level underestimate how important meta-level instruction of this sort is to most students. We tend to assume that students will figure it out on their own. That's a dangerous assumption to make, at least for a discipline that tends to lose too many good students on the way to graduation.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 03, 2012 11:17 AM

When "What" Questions Presuppose "How"

John Cook wrote about times in mathematics when maybe you don't need to do what you were asked to do. As one example, he used remainder from division. In many cases, you don't need to do division, because you can find the answer using a different, often simpler, method.

We see a variation of John's theme in programming, too. Sometimes, a client will ask for a result in a way that presupposes the method that will be used to produce it. For example, "Use a stack to evaluate these nested expressions." We professors do this to students a lot, because we want the students to learn the particular technique specified. But you see subtle versions of this kind of request more often than you might expect outside the classroom.
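
To make the distinction concrete, here is a tiny Scheme sketch of my own, not from John's post: it satisfies the "what" -- evaluate these nested expressions -- without the presupposed "how", because structural recursion lets the implicit call stack do the stack's work:

     ;; Evaluate nested prefix expressions such as '(+ 1 (* 2 3)).
     ;; No explicit stack appears in the code.
     (define (evaluate exp)
       (cond ((number? exp) exp)
             ((eq? (car exp) '+)
              (+ (evaluate (cadr exp)) (evaluate (caddr exp))))
             ((eq? (car exp) '*)
              (* (evaluate (cadr exp)) (evaluate (caddr exp))))
             (else (error "unknown operator:" (car exp)))))

     (evaluate '(+ 1 (* 2 3)))   ; => 7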

An important part of learning to design software is learning to tease apart the subtle conflation of interface and implementation in the code we write. Students who learn OO programming after a traditional data structures course usually "get" the idea of data abstraction, yet still approach large problems in ways that let implementations leak out of their abstractions in the form of method names and return values. Kent Beck talked about how this problem afflicts even experienced programmers in his blog entry Naming From the Outside In.

Primitive Obsession is another symptom of conflating what we need with how we produce it. For beginners, it's natural to use base types to implement almost any behavior. Hey, the extreme programming principle You Ain't Gonna Need It encourages even us more experienced developers not to create abstractions too soon, until we know we need them and in what form. The convenience offered by hashes, featured so prominently in the scripting languages that many of us use these days, makes it easy to program for a long time without having to code a collection of any sort.

But learning to model domain objects as objects -- interfaces that do not presuppose implementation -- is one of the powerful stepping stones on the way to writing supple code, extendible and adaptable in the face of reasonable changes in the spec.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

October 24, 2012 11:38 AM

"Don't Break the Phone; Fix the Computer"

Rob Pike in The Set-Up:

Twenty years ago, you expected a phone to be provided everywhere you went, and that phone worked the same everywhere. At a friend's house, or a restaurant, or a hotel, or a pay phone, you could pick up the receiver and make a call. You didn't carry a phone around with you; phones were part of the infrastructure. Computers, well, that was a different story. As laptops came in, people started carrying computers around with them everywhere. The reason was to have the state stored on the computer, not the computer itself. You carry around a computer so you can access its disk.

In summary, it used to be that phones worked without you having to carry them around, but computers only worked if you did carry one around with you. The solution to this inconsistency was to break the way phones worked rather than fix the way computers work.

Ah, the memories of grad school, WYSE terminals, and VT-100 emulation.

The advent of ubiquitous networking is making it possible for us to return to the days of dumb terminals. Is that where we want to live?

Pike's vision notwithstanding: I still carry a computer, both for state and processor. I access networked computers frequently. I do not yet carry a phone. I remain happy.


Posted by Eugene Wallingford | Permalink | Categories: Computing

October 19, 2012 3:08 PM

Computer Programming, Education Reform, and Changing Our Schools

Seymour Papert

You almost can't go wrong by revisiting Seymour Papert's work every so often. This morning I read Why School Reform Is Impossible, which reminds us that reform and change are different things. When people try to "reform" education by injecting a new idea from outside, schools seem to assimilate the reform into their own structure, which from the perspective of the reformer blunts or rejects the intended reform. Yet schools and our education system do change over time, evolving as the students, culture, and other environmental factors change.

As people such as Papert and Alan Kay have long argued, a big part of the problem in school reform involving computers is that we misunderstand what a computer is:

If you ask, "Which is not like the other two?" in the list "educational movie, textbook, computer", it is pretty obvious from my perspective that the answer must be "computer."

... not "textbook", which is how most people answer, including many people who want to introduce more computers into the classroom. Textbooks and movies are devices for receiving content that someone else made. Computers are for creating content. It just so happens that we can use them to communicate ideas in new ways, too.

This misunderstanding leads people to push computers for the wrong reasons, or at least for reasons that miss their game-changing power. We sometimes hear that "programming is the new Latin". Papert reminds us that the reasons we used to teach Latin in schools changed over time:

In recent times, Latin was taught in schools because it was supposed to be good for the development of general cognitive skills. Further back, it was taught because it was the language in which all scholarly knowledge was expressed, and I have suggested that computational language could come to play a similar role in relation to quite extensive areas of knowledge.

If programming is the new Latin, it's not Latin class, circa 1960, in which Latin taught us to be rigorous students. It's Latin class, circa 1860 or 1760 or 1560, in which Latin was the language of scholarly activity. As we watch computing become a central part of the language of science, communication, and even the arts and humanities, we will realize that students need to learn to read and write code because -- without that skill -- they are left out of the future.

No child left behind, indeed.

In this essay, Papert gives a short version of his discussion in Mindstorms of why we teach the quadratic equation of the parabola to every school child. He argues that its inclusion in the curriculum has more to do with its suitability to the medium of the day -- pencil and paper -- than to its intrinsic importance. I'm not too sure that's true; knowing how parabolas and ellipses work is pretty important for understanding the physical world. But it is certainly true that how and when we introduce parabolas to students can change when we have a computer and a programming language at hand.

Even at the university we encounter this collision of old and new. Every student here must take a course in "quantitative reasoning" before graduating. For years, that was considered to be "a math course" by students and advisors alike. A few years ago, the CS department introduced a new course into the area, in which students can explore a lot of the same quantitative issues using computation rather than pencil and paper. With software tools for modeling and simulation, many students can approach and even begin to solve complex problems much more quickly than they could working by hand. And it's a lot more fun, too.

To make this work, of course, students have to learn a new programming language and practice using it in meaningful ways. Papert likens it to learning a natural language like French. You need to speak it and read it. He says we would need the programming analog of "a diverse collection of books written in French and access to French-speaking people".

the Scratch logo cat

The Scratch community is taking a shot at this. The Scratch website offers more than a way to download the Scratch environment and view tutorials on creating with Scratch. It also offers -- front and center, the entire page, really -- links to shared projects and galleries. This gives students a chance first to be inspired by other kids and then to download and read the actual Scratch programs that enticed them. It's a great model.

The key is to help everyone see that computers are not like textbooks and televisions and movie projectors. As Mitch Resnick has said:

Computers for most people are black boxes. I believe kids should understand objects are "smart" not because they're just smart, but because someone programmed them to be smart.

What's most important ... is that young children start to develop a relationship with the computer where they feel they're in control. We don't want kids to see the computer as something where they just browse and click. We want them to see digital technologies as something they can use to express themselves.

Don't just play with other people's products. Make your own.

Changes in the world's use of computing may do more to cause schools to evolve in a new direction than anyone's educational reforms ever could. Teaching children that they can be creators and not simply consumers is a subversive first step.

~~~~

IMAGE 1: Seymour Papert at the OLPC offices in Cambridge, Massachusetts, in 2006. Source: Wikimedia Commons License: Creative Commons Attribution-Share Alike 2.0.

IMAGE 2: The Scratch logo. Source: Wikimedia Commons License: Creative Commons Attribution-Share Alike 2.0.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 12, 2012 10:37 AM

Make Let, Not Var

I don't blog images for their own sake often, but this mash-up makes me happy:

a parody of Lennon and Ono's 'Make Love, Not War' image

Even as I enjoy teaching OO programming this semester, this reminds me that I'll enjoy teaching functional programming in the spring.

This came to me via a tweet. If you know the source, I'd love to hear from you.


Posted by Eugene Wallingford | Permalink | Categories: Computing

October 07, 2012 2:50 PM

Equality Check Patterns for Recursive Structures

I first encountered this trio of programming patterns when writing Smalltalk programs to manipulate graphs back in the late 1980s. These days I see them most often when comparing recursive data types in language processors. Andrew Black wrote about these patterns in the context of Smalltalk on the Squeak mailing list in the mid-2000s.

~~~~

Recursive Equality Check

Problem

You are working with a recursive data structure. In the simplest form, you might have a list or a record that can contain lists or records as parts. In a language application, you might have a structured type that can have the same type as one of its parts. For instance, a function type might allow a function as an argument or a function as a return value.

You have two instances of the structure and need to compare them, say, for equality or for dominance. In the language app, for example, you will need to verify that the type of an argument matches the type of the formal parameter on a procedure.

Solution

Standard structural recursion works. Walk the two structures in parallel, checking to see that they have the same values in the same positions. When corresponding positions hold structured values, make a recursive call to compare them.
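
A minimal sketch of this solution in Scheme, for structures built from pairs and atoms, and assuming -- for now -- no self-reference:

     ;; Structural recursion over two trees built from pairs and atoms.
     (define (tree-equal? a b)
       (cond ((and (pair? a) (pair? b))
              (and (tree-equal? (car a) (car b))
                   (tree-equal? (cdr a) (cdr b))))
             ((or (pair? a) (pair? b)) #f)   ; one is structured, the other is not
             (else (eqv? a b))))             ; both are atoms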

But what if ...

~~~~

Recursive Equality Check with Identity Check

Problem

An instance of the recursive structure can contain a reference to itself as a value, either directly or through mutual recursion. In the simplest form, this might be a dictionary that contains itself as a value in a key/value pair, as in Smalltalk, where the global variable Smalltalk is the dictionary of all global variables, including Smalltalk.

Comparing two instances now raises concerns. The two instances may be identical -- the very same object -- or contain identical components. In such cases, the standard recursive comparison will never terminate.

Solution

Check for identity first. Recurse only if the two values are distinct.
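
In the sketch above, this amounts to adding an identity test as the first case:

     ;; Same walk as before, but short-circuit when the two values are
     ;; the very same object.
     (define (tree-equal? a b)
       (cond ((eq? a b) #t)
             ((and (pair? a) (pair? b))
              (and (tree-equal? (car a) (car b))
                   (tree-equal? (cdr a) (cdr b))))
             ((or (pair? a) (pair? b)) #f)
             (else (eqv? a b))))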

But what if...

~~~~

Recursive Equality Check with Cache

Problem

You have two structures that do not share any elements, but they are structurally isomorphic. For example, this can occur in the simplest of structures, two one-element maps:

    a = { :self => a }
    b = { :self => b }

Now, even with an identity test up front, the recursive comparison will never terminate.

Solution

Maintain a cache of compared pairs. Before you begin to compare two objects, check to see if the pair is in the cache. If yes, return true. Otherwise, add the pair to the cache and proceed.

This approach works even though the function has not finished comparing the two objects yet. If there turns out to be a difference between the two, the check currently in progress will find it elsewhere and answer false. There is no need to enter a recursive check.
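
Here is one way to sketch the full pattern in Racket, using an eq-based hash table as the cache. (Racket's built-in equal? already handles cyclic data; the code below, with names of my own choosing, is only to illustrate the pattern.)

     ;; Cache pairs of nodes currently being compared. If we meet the same
     ;; pair again, assume they are equal; any real difference will be
     ;; found elsewhere in the walk.
     (define (cyclic-equal? a b)
       (define seen (make-hasheq))   ; node of a  ->  list of nodes of b
       (define (cached? x y)
         (and (memq y (hash-ref seen x '())) #t))
       (define (cache! x y)
         (hash-set! seen x (cons y (hash-ref seen x '()))))
       (define (walk x y)
         (cond ((eq? x y) #t)                      ; identity check first
               ((and (pair? x) (pair? y))
                (or (cached? x y)
                    (begin (cache! x y)
                           (and (walk (car x) (car y))
                                (walk (cdr x) (cdr y))))))
               ((or (pair? x) (pair? y)) #f)
               (else (eqv? x y))))
       (walk a b))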

~~~~

A variation of this caching technique can also be used in other situations, such as computing a hash value for a recursive structure. If in the course of computing the hash you encounter the same structure again, assume that the value is a suitable constant, such as 0 or 1. Hashes are only approximations anyway, so making only one pass over the structure is usually enough. If you really need a fixpoint, then you can't take this shortcut.
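
A sketch of that variation, under the same assumptions as the code above: the hash contribution of a node we are already hashing is taken to be the constant 1.

     ;; Hash a possibly-cyclic structure in one pass. When we meet a node
     ;; we have already started hashing, contribute a constant instead of
     ;; recursing forever.
     (define (cyclic-hash x)
       (define in-progress (make-hasheq))
       (define (walk x)
         (cond ((hash-ref in-progress x #f) 1)     ; already being hashed
               ((pair? x)
                (hash-set! in-progress x #t)
                (+ (* 31 (walk (car x))) (walk (cdr x))))
               (else (equal-hash-code x))))
       (walk x))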

Ruby hashes handle all three of these problems correctly:

    a = { :first  => :int,
          :second => { :first  => :int, :second => :int } }
    b = { :second => { :second => :int, :first  => :int },
          :first  => :int, }

    a == b # --> true

    c = { :first => c }
    a = { :first => :int, :second => { :first => c, :second => :int } }
    b = { :first => :int, :second => { :first => c, :second => :int } }

    a == b # --> true

    a = { :self => a }
    b = { :self => b }

    a == b # --> true

I don't know if either MRI Ruby or JRuby uses these patterns to implement their solutions, or if they use some other technique.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

September 30, 2012 12:45 PM

StrangeLoop 8: Reactions to Bret Victor's Visible Programming

The last talk I attended at StrangeLoop 2012 was Bret Victor's Visible Programming. He has since posted an extended version of his presentation, as a multimedia essay titled Learnable Programming. You really should read his essay and play the video in which he demonstrates the implementation of his ideas. It is quite impressive, and worthy of the discussion his ideas have engendered over the last few months.

In this entry, I give only a high-level summary of the idea, react to only one of his claims, and discuss only one of his design principles in any detail. This entry grew much longer than I originally intended. If you would like to skip most of my reaction, jump to the mini-essay that is the heart of this entry, Programming By Reacting, in the REPL.

~~~~

Programmers often discuss their productivity as at least a partial result of the programming environments they use. Victor thinks this is dangerously wrong. It implies, he says, that the difficulty with programming is that we aren't doing it fast enough.

But speed is not the problem. The problem is that our programming environments don't help us to think. We do all of our programming in our minds, then we dump our ideas into code via the editor.

Our environments should do more. They should be our external imagination. They should help us see how our programs work as we are writing them.

This is an attractive guiding principle for designing tools to help programmers. Victor elaborates this principle into a set of five design principles for an environment:

  • read the vocabulary -- what do these words mean?
  • follow the flow -- what happens when?
  • see the state -- what is the computer thinking?
  • create by reacting -- start somewhere, then sculpt
  • create by abstracting -- start concrete, then generalize

Victor's talk then discussed each design principle in detail and showed how one might implement the idea using JavaScript and Processing.js in a web browser. The demo was cool enough that the StrangeLoop crowd broke into applause at least twice during the talk. Read the essay.

~~~~

As I watched the talk, I found myself reacting in a way I had not expected. So many people have spoken so highly of this work. The crowd was applauding! Why was I not as enamored? I was impressed, for sure, and I was thinking about ways to use these ideas to improve my teaching. But I wasn't falling head over heels in love.

A Strong Claim

First, I was taken aback by a particular claim that Victor made at the beginning of his talk as one of the justifications for this work:

If a programmer cannot see what a program is doing, she can't understand it.

Unless he means this metaphorically, seeing "in the mind's eye", then it is simply wrong. We do understand things we don't see in physical form. We learn many things without seeing them in physical form. During my doctoral study, I took several courses in philosophy, and only rarely did we have recourse to images of the ideas we were studying. We held ideas in our head, expressed in words, and manipulated them there.

We did externalize ideas, both as a way to learn them and think about them. But we tended to use stories, not pictures. By speaking an idea, or writing it down, and sharing it with others, we could work with the ideas.

So, my discomfort with one of Victor's axioms accounted for some of my unexpected reaction. Professional programmers can and do manipulate ideas abstractly. Visualization can help, but when is it necessary, or even most helpful?

Learning Versus Doing

This leads to a second element of my concern. I think I had a misconception about Victor's work. His talk and its title, "Visible Programming", led me to think his ideas are aimed primarily at working programmers, that we need to make programs visible for all programmers.

The title of his essay, "Learnable Programming", puts his claims into a different context. We need to make programs visible for people who are learning to program. This seems a much more reasonable position on its face. It also lets me see the axiom that bothered me so much in a more sympathetic light: If a novice programmer cannot see what a program is doing, then she may not be able to understand it.

Seeing how a program works is a big part of learning to program. A few years ago, I wrote about "biction" and the power of drawing a picture of what code does. I often find that if I require a student to draw a picture of what his code is doing before he can ask me for debugging help, he will answer his own question before getting to me.

The first time a student experiences this can be a powerful experience. Many students begin to think of programming in a different way when they realize the power of thinking about their programs using tools other than code. Visible programming environments can play a role in helping students think about their programs, outside their code and outside their heads.

I am left puzzling over two thoughts:

  • How much of the value my students see in pictures comes not from seeing the program work but from drawing the picture themselves -- the act of reflecting about the program? If our tools visualize the code for them, will we see the same learning effect that we see when they draw their own pictures?

  • Certainly Victor's visible programming tools can help learners. How much will they help programmers once they become experts? Ben Shneiderman's Designing the User Interface taught me that novices and experts have different needs, and that it's often difficult to know what works well for experts until we run experiments.

Mark Guzdial has written a more detailed analysis of Victor's essay from the perspective of a computer science educator. As always, Mark's ideas are worth reading.

Programming By Reacting, in the REPL

My favorite parts of this talk were the sections on creating by reacting and abstracting. Programmers, Victor says, don't work like other creators. Painters don't stare at a blank canvas, think hard, create a painting in their minds, and then start painting the picture they know they want to create. Sculptors don't stare at a block of stone, envision in their mind's eye the statue they intend to make, and then reproduce that vision in stone. They start creating, and react, both to the work of art they are creating and to the materials they are using.

Programmers, Victor says, should be able to do the same thing -- if only our programming environments helped us.

As a teacher, I think this is an area ripe for improvement in how we help students learn to program. Students open up their text editor or IDE, stare at that blank screen, and are terrified. What do I do now? A lot of my work over the last fifteen to twenty years has been in trying to find ways to help students get started, to help them to overcome the fear of the blank screen.

My approaches haven't been through visualization, but through other ways to think about programs and how we grow them. Elementary patterns can give students tools for thinking about problems and growing their code at a scale larger than characters or language keywords. An agile approach can help them start small, add one feature at a time, proceed in confidence with working tests, and refactor to make their code better as they go along. Adding Victor-style environment support for the code students write in CS1 and CS2 would surely help as well.

However, as I listened to Victor describe support for creating by reacting, and then abstracting variables and functions out of concrete examples, I realized something. Programmers don't typically write code in an environment with data visualizations of the sort Victor proposes, but we do program in the style that such visualizations enable.

We do it in the REPL!

A simple, interactive computer programming environment enables programmers to create by reacting.

  • They write short snippets of code that describe how a new feature will work.
  • They test the code immediately, seeing concrete results from concrete examples.
  • They react to the results, shaping their code in response to what the code and its output tell them.
  • They then abstract working behaviors into functions that can be used to implement another level of functionality.

Programmers from the Lisp and Smalltalk communities, and from the rest of the dynamic programming world, will recognize this style of programming. It's what we do, a form of creating by reacting, from concrete examples in the interaction pane to code in the definitions pane.
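
As a small, invented illustration of the rhythm -- mine, not Victor's, with a made-up interest rate and function name -- consider a Scheme REPL session that grows a definition out of concrete experiments:

    > (* 0.0625 12)                      ; experiment with a concrete case
    0.75
    > (* 0.0625 20)                      ; another concrete case
    1.25
    > (define (interest-owed balance)    ; abstract the working expression
        (* 0.0625 balance))
    > (interest-owed 1000)               ; react to the result, keep going
    62.5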

In the agile software development world, test-first development encourages a similar style of programming, from concrete examples in the test case to minimal code in the application class. Test-driven design stimulates an even more consciously reactive style of programming, in which the programmer reacts both to the evolving program and to the programmer's evolving understanding of it.

The result is something similar to Victor's goal for programmers as they create abstractions:

The learner always gets the experience of interactively controlling the lower-level details, understanding them, developing trust in them, before handing off that control to an abstraction and moving to a higher level of control.

It seems that Victor would like to provide even more support for novices than these tools can, down to visualizing what the program does as they type each line of code. IDEs with autocomplete are perhaps the closest analog in our current arsenal. Perhaps we can do more, not only for novices but also for professionals.

~~~~

I love the idea that our environments could do more for us, to be our external imaginations.

Like many programmers, though, as I watched this talk, I occasionally wondered, "Sure, this works great if you're creating art in Processing. What about when I'm writing a compiler? What should my editor do then?"

Victor anticipated this question and pre-emptively answered it. Rather than asking, How does this scale to what I do?, we should turn the question inside out and ask, These are the design requirements for a good environment. How do we change programming to fit?

I doubt such a dogmatic turn will convince skeptics with serious doubts about this approach.

I do think, though, that we can reformulate the original question in a way that focuses on helping "real" programmers. What does a non-graphical programmer need in an external imagination? What kind of feedback -- frequent, even in-the-moment -- would be most helpful to, say, a compiler writer? How could our REPLs provide even more support for creating, reacting, and abstracting?

These questions are worth asking, whatever one thinks of Victor's particular proposal. Programmers should be grateful for his causing us to ask them.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

September 29, 2012 4:04 PM

StrangeLoop 7: The Racket Way

I have been using Racket since before it was Racket, back when it was "just another implementation of Scheme". Even then, though, it wasn't just another implementation of Scheme, because it had such great libraries, a devoted educational community around it, and an increasingly powerful facility for creating and packaging languages. I've never been a deep user of Racket, though, so I was eager to see this talk by one of its creators and learn from him.

Depending on your perspective, Racket is either a programming language (that looks a lot like Scheme), a language plus a set of libraries, or a platform for creating programs. This talk set out to show us that Racket is more.

Flatt opened with a cute animated fairy tale, about three princesses who come upon a wishing well. The first asks for stuff. The second asks for more wishes. The third asks for a kingdom full of wishing wells. Smart girl, that third one. Why settle for stuff when you can have the source of all stuff?

This is, Flatt said, something like computer science. There is a similar progression of power from:

  • a document, to
  • a language for documents, to
  • a language for languages.

Computer scientists wish for a way to write programs that do... whatever.

This is the Racket way:

  1. Everything is a program.
  2. Concepts are programming language constructs.
  3. Programming languages are extensible and composable.

The rest of the talk was a series of impressive mini-demos that illustrated each part of the Racket way.

To show what it means to say that everything is a program, Flatt demoed Scribble, a language for producing documents -- even the one he was using to give his talk. Scribble allows writers to abstract over every action.

To show what it means to say that concepts are programming language constructs, Flatt talked about the implementation of Dr. Racket, the flexible IDE that comes with the system. Dr. Racket needs to be able to create, control, and terminate processes. Relying on the OS to do this for it means deferring to what that OS offers. In the end, that means no control.

Dr. Racket needs to control everything, so the language provides constructs for these concepts. Flatt showed as examples threads and custodians. He then showed this idea at work in an incisive way: he wrote a mini-Dr. Racket, called Racket, Esq. -- live using Racket. To illustrate its completeness, he then ran his talk inside racket-esq. Talk about a strange loop. Very nice.
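
A minimal sketch of the custodian idea in Racket -- my own example, not Flatt's demo:

     ;; Run a worker thread under a fresh custodian, then shut down
     ;; everything the custodian manages: threads, ports, and so on.
     (define worker-custodian (make-custodian))

     (parameterize ((current-custodian worker-custodian))
       (thread (lambda ()
                 (let loop ((n 0))
                   (printf "working... ~a\n" n)
                   (sleep 1)
                   (loop (+ n 1))))))

     (sleep 3)
     (custodian-shutdown-all worker-custodian)   ; the worker disappears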

To show what it means to say that programming languages are extensible and composable, Flatt showed a graph of the full panoply of Racket's built-in languages and demoed several languages. He then used some of the basic language-building tools in Racket -- #lang, require, define-syntax, syntax-rules, and define-syntax-rule -- to build the old text-based game Adventure, which needs a natural language-like scripting language for defining worlds. Again, very nice -- so much power in so many tools.
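
For a small taste of those tools, here is a define-syntax-rule sketch of my own -- far humbler than Flatt's Adventure language -- adding a room form of the sort a world-building script might use:

     ;; A tiny language extension: a `room` form that defines a named
     ;; place with a description.
     (define-syntax-rule (room name description)
       (define name (list 'room 'name description)))

     (room foyer "A drafty entrance hall.")
     (room library "Books line every wall.")

     foyer   ; => a list: (room foyer "A drafty entrance hall.")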

This kind of power comes from taking seriously a particular way of thinking about the world. It starts with "Everything is a program." That is the Racket way.

Flatt is a relaxed and confident presenter. As a result, this was a deceptively impressive talk. It reinforced its own message by the medium in which it was delivered: using documents -- programs -- written and processed in Racket. I am not sure how anyone could see a slideshow with "hot" code, a console for output, and a REPL within reach, all written in the environment being demoed, and not be moved to rethink how they write programs. And everything else they create.

As Flatt intimated briefly early on, The Racket Way of thinking is not -- or should not be -- limited to Racket. It is, at its core, the essence of computer science. The duality of code and data makes what we do so much more powerful than most people realize, and makes what we can do so much more powerful than most of us actually do with the tools we accept. I hope that Flatt's talk inspires a few more of us not to settle for less than we have to.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 29, 2012 3:40 PM

StrangeLoop 6: Y Y

I don't know if it was coincidence or by design of the conference organizers, but Wednesday morning was a topical repeat of Tuesday morning for me: two highly engaging talks on functional programming. I had originally intended to write them up in a single entry, but that write-up grew so long that I decided to give them their own entries.

Y Not?

Watching talks and reading papers about the Y combinator are something of a spectator code kata for me. I love to see new treatments, and enjoy seeing even standard treatments every now and then. Jim Weirich presented it at StrangeLoop with a twist I hadn't seen before.

Weirich opened, as speakers often do, by telling us what kind of talk to expect. This is a motivational talk, so it should be...

  • non-technical. But it's not. It is highly technical.
  • relevant. But it's not. It is extremely pointless.
  • good code. But it's not. It shows the worst Clojure code ever.

But it will be, he promises, fun!

Before diving in, he had one more joke, or at least the first half of one. He asked for audience participation, then asked his volunteer to calculate cos(n) for some value of n I missed. Then he asked the person to keep hitting the cosine button repeatedly until he told him to stop.

At the dawn of computing, two different approaches were taken in an effort to answer the question, What is effectively computable?

Alan Turing devised what we now call a universal Turing machine to embody the idea. Weirich showed a video demonstration of a physical Turing machine to give his audience a sense of what a TM is like.

(If you'd like to read more about Turing and the implication of his universal machine, check out this reflection I wrote earlier this year after a visit by Doug Hofstadter to my campus. Let's just say that the universal TM means more than just an answer to what functions are effectively computable.)

A bit ahead of Turing, Alonzo Church devised an answer to the same question in the form of the lambda calculus, a formal logical system. As with the universal TM, the lambda calculus can be used to compute everything, for a particular value of everything. These days, nearly every programming language has lambdas of some form.

... now came the second half of the joke running in the background. Weirich asked his audience collaborator what was in his calculator's display. The assistant called out some number, 0.7... Then Weirich showed his next slide -- the same number, taken out to many more digits. How was he able to do this? There is a number n such that cos(n) = n. By repeatedly pressing his cosine button, Weirich's assistant eventually reached it. That number n is called the fixed point of the cosine function. Other functions have fixed points too, and they can be a source of great fun.

Then Weirich opened up his editor and wrote some code from the ground up to teach some important concepts of functional programming, using the innocuous function 3(n+1). With this short demo, Weirich demonstrated the idea of a higher-order function, including function factories, and a set of useful functional refactorings that included

  • Introduce Binding
    -- where the new binding is unused in the body
  • Inline Definition
    -- where a call to a function is replaced by the function body, suitably parameterized
  • Wrap Function
    -- where an expression is replaced by a function call that computes the expression
  • Tennent Correspondence Principle
    -- where an expression is turned into a think

At the end of his exercise, Weirich had created a big function call that contained no named function definitions yet computed the same answer.
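
To give a flavor of those steps, here is a sketch in Scheme -- mine, not Weirich's Clojure -- applying three of the refactorings to the same innocuous function. Each version computes the same result as the one before it:

     ;; The starting point: an innocuous function.
     (define f1 (lambda (n) (* 3 (+ n 1))))

     ;; Tennent Correspondence Principle: turn the body into a thunk and
     ;; call it immediately.
     (define f2 (lambda (n) ((lambda () (* 3 (+ n 1))))))

     ;; Introduce Binding: add a binding the body never uses.
     (define f3 (lambda (n) ((lambda (unused) (* 3 (+ n 1))) 'anything)))

     ;; Wrap Function: replace an expression with a call to a function
     ;; that computes the same expression.
     (define f4 (lambda (n) ((lambda (g) (g n)) (lambda (m) (* 3 (+ m 1))))))

     (map (lambda (f) (f 4)) (list f1 f2 f3 f4))   ; => (15 15 15 15)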

He asks the crowd for applause, then demurs. This is 80-year-old technology. Now you know, he says, what a "chief scientist" at New Context does. (Looks a lot like what an academic might do...)

Weirich began a second coding exercise, the point of all his exposition so far: He wrote the factorial function, and began to factor and refactor it just as he had the simpler 3(n+1). But now inlining the function breaks the code! There is a recursive call, and the name is now out of scope. What to do?

He refactors, and refactors some more, until the body of factorial is an argument to a big melange of lambdas and applications of lambdas. The result is a function that computes the fixed point of any function passed it.

That is Y. The Y combinator.
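
For reference, here is roughly where such a derivation lands, sketched in Scheme rather than Weirich's Clojure: the standard applicative-order Y, applied to a factorial "body" that never calls itself by name.

     ;; The applicative-order Y combinator: given the body of a recursive
     ;; function abstracted over its own name, return its fixed point.
     (define Y
       (lambda (f)
         ((lambda (x) (f (lambda (v) ((x x) v))))
          (lambda (x) (f (lambda (v) ((x x) v)))))))

     ;; factorial, written without a recursive definition anywhere
     (define fact
       (Y (lambda (self)
            (lambda (n)
              (if (zero? n)
                  1
                  (* n (self (- n 1))))))))

     (fact 5)   ; => 120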

Weirich talked a bit about Y and related ideas, and why it matters. He closed with a quote from Wittgenstein, from Philosophical Investigations:

The aspects of things that are most important for us are hidden because of their simplicity and familiarity. (One is unable to notice something -- because it is always before one's eyes.) The real foundations of his enquiry do not strike a man at all. Unless that fact has at some time struck him. -- And this means: we fail to be struck by what, once seen, is most striking and most powerful.

The thing that sets Weirich's presentation of Y apart from the many others I've seen is its explicit use of refactoring to derive Y. He created Y from a sequence of working pieces of code, each the result of a refactoring we can all understand. I love to do this sort of thing when teaching programming ideas, and I was pleased to see it used to such good effect on such a challenging idea.

The title of this talk -- Y Not? -- plays on Y's interrogative homonym. Another classic in this genre echoes the homonym in its title, then goes on to explain Y in four pages of English and Scheme. I suggest that you study @rpg's essay while waiting for Weirich's talk to hit InfoQ. Then watch Weirich's talk. You'll like it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

September 28, 2012 3:59 PM

StrangeLoop 5: Miscellany -- At All Levels

Most of the Tuesday afternoon talks engaged me less deeply than the ones that came before. Part of that was the content, part was the style of delivery, and part was surely that my brain was swimming in so many percolating ideas that there wasn't room for much more.

Lazy Guesses

Oleg Kiselyov, a co-author of the work behind yesterday's talk on miniKanren, gave a talk on how to implement guessing in computer code. That may sound silly, for a couple of reasons. But it's not.

First, why would we want to guess at all? Don't we want to follow principles that guarantee we find the right answer? Certainly, but those principles aren't always available, and even when they are the algorithms that implement them may be computationally intractable. So we choose to implement solutions that restrict the search space, for which we pay a price along some other dimension, often expressiveness.

Kiselyov mentioned scheduling tasks early in his talk, and any student of AI can list many other problems for which "generate and test" is a surprisingly viable strategy. Later in the talk, he mentioned parsing, which is also a useful example. Most interesting grammars have nondeterministic choices in them. Rather than allow our parsers to make choices and fail, we usually adopt rules that make the process predictable. The result is an efficient parser, but a loss in what we can reasonably say in the language.
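
In its most naive form, generate and test is only a few lines of code. A toy sketch in Scheme -- my example and my names, not Kiselyov's:

     ;; Generate and test, naively: enumerate candidates, keep the ones
     ;; that pass the test.
     (define (solutions candidates pass?)
       (filter pass? candidates))

     ;; e.g., pairs of numbers from 1 to 20 whose sum is 12 and product is 35
     (solutions
       (for*/list ((x (in-range 1 21)) (y (in-range 1 21))) (list x y))
       (lambda (p) (and (= (+ (car p) (cadr p)) 12)
                        (= (* (car p) (cadr p)) 35))))
     ;; => ((5 7) (7 5))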

So, perhaps the ability to make good guesses is valuable. What is so hard about implementing them? The real problem is that there are so many bad guesses. We'd like to use knowledge to guide the process of guessing again, to favor some guesses over others.

The abstract for the talk promises a general principle on which to build guessing systems. I must admit that I did not see it. Kiselyov moved fast at times through his code, and I lost sight of the big picture. I did see discussions of forking a process at the OS level, a fair amount of OCaml code, parser combinators, and lazy evaluation. Perhaps my attention drifted elsewhere at a key moment.

The speaker closed his talk by showing a dense slide and saying, "Here is a list of buzzwords, some of which I said in my talk and some of which I didn't say in my talk." That made me laugh: a summary of a talk he may or may not have given. That seemed like a great way to end a talk about guessing.

Akka

I don't know much about the details of Akka. Many of my Scala-hacking former students talk about it every so often, so I figured I'd listen to this quick tour and pick up a little more. The underlying idea, of course, is Hewitt's Actor model. This is something I'm familiar with from my days in AI and my interest in Smalltalk.

The presenter, Akka creator Jonas Boner, reminded the audience that Actors were a strong influence on the original Smalltalk. In many ways, it is truer to Kay's vision of OOP than the languages we use today.
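
For readers who have never met the Actor model, here is a bare-bones sketch of the idea in Python -- my own toy, not Akka's API. Each actor owns its state and a mailbox, and the only way to interact with it is to send it a message:

    import queue, threading, time

    class Actor:
        """A minimal actor: private state, a mailbox, and a behavior."""
        def __init__(self):
            self.mailbox = queue.Queue()
            threading.Thread(target=self._run, daemon=True).start()

        def send(self, message):              # the only way in
            self.mailbox.put(message)

        def _run(self):
            while True:
                self.receive(self.mailbox.get())

        def receive(self, message):           # subclasses define behavior
            raise NotImplementedError

    class Counter(Actor):
        def __init__(self):
            self.count = 0
            super().__init__()

        def receive(self, message):
            if message == "inc":
                self.count += 1
            elif message == "report":
                print("count =", self.count)

    c = Counter()
    for _ in range(3):
        c.send("inc")
    c.send("report")
    time.sleep(0.1)                           # let the actor drain its mailbox

Akka adds supervision, location transparency, and much more, but the core discipline -- no shared state, only messages -- is right there.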

This talk was a decent introduction to Hewitt's idea and its implementation in Akka. My two favorite things from the talk weren't technical details, but word play:

  • The name "Akka" has many inspirations, including a mountain in northern Sweden, a goddess of the indigenous people of northern Scandinavia, and a palindrome of Actor Kernel / Kernel Actor.

  • Out of context, this quote made the talk for me:
    We have made some optimizations to random.
    Ah, aren't we all looking for those?

Expressiveness and Abstraction

This talk by Ola Bini was a personal meditation on the expressiveness of language. Bini, whose first slide listed him as a "computational metalinguist", started from the idea that, informally, the expressiveness of a language is inversely proportional to the distance between our thoughts and the code we have to write in that language.

In the middle part of the talk, he considered a number of aspects of expressiveness and abstraction. In the latter part, he listed ideas from natural language and wondered aloud what their equivalents would be in programming languages, among them similes, metaphors, repetition, elaboration, and multiple equivalent expressions with different connotations.

During this part of the talk, my mind wandered, too, to a blog entry I wrote about parts of speech in programming languages back in 2003, and a talk by Crista Lopes at OOPSLA that year. Nouns, verbs, pronouns, adjectives, and adverbs -- these are terms I use metaphorically when teaching students about new languages. Then I thought about different kinds of sentence -- declarative, interrogative, imperative, and exclamatory -- and began to think about their metaphorical appearances in our programming languages.

Another fitting way for a talk to end: my mind wandering at the end of a wondering talk.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 27, 2012 5:33 PM

StrangeLoop 4: Computing Like The Brain

Tuesday morning kicked off with a keynote address by Jeff Hawkins entitled "Computing Like The Brain". Hawkins is currently with Numenta, a company he co-founded in 2005, after having founded the Redwood Neuroscience Institute and two companies most technophiles will recognize: Palm and Handspring.

Hawkins said he has devoted his professional life to understanding machine intelligence. He recalls reading an article by Francis Crick in Scientific American as a youth and being inspired to study neuroscience. It was a data-rich, theory-poor discipline, one crying out for abstractions to unify our understanding of how the brain works from the mass of data we were collecting. He says he then dedicated his life to discovering principles of how the brain works, especially the neocortex, and to building computer systems that implement these principles.

The talk began with a primer on the neocortex, which can be thought of as a predictive modeling system that controls human intelligence. If we take into account all the components of what we think of as our five senses, the brain has millions of sensors that constantly stream data to the neocortex. Its job is to build an on-line model from this streaming data. It constantly predicts what it expects to receive next, detects anomalies, updates itself, and produces actions. When the neocortex updates, we learn.

On this view, the brain doesn't "compute". It is a memory system. (I immediately thought of Roger Schank, his views on AI, and case-based reasoning...) The brain is really one memory algorithm operating over all of our sensory inputs. The key elements of this memory system are:

  • a hierarchy of regions,
  • sequence memory, and
  • sparse distributed representation.

Hawkins spoke briefly about hierarchy and sequence memory, but he quickly moved into the idea of sparse distributed representation (SDR). This can be contrasted to the dense, localized memory of traditional computer systems. For example, ASCII code consists of seven bits, all combinations of which we use to represent a single character. Capital 'A' is 65, or 1000001; the digit '5' is 53, or 0110101. The individual bits of an ASCII code don't mean anything. Change one bit, and you get a different character, sometimes a very different one.

An SDR uses a large number of bits, with only a few set to 1. Hawkins said that typically only ~ 2% of the bits are "on". Each bit in an SDR has specific meaning, one that has been learned through memory updating, not assigned. He then demonstrated several properties of an SDR, such as how it can be used to detect similarities, how it can do "store-and-compare" using only indices, and how it can perform remarkably well using only a sampling of the indices. Associative look-up in the brain's SDR produces surprisingly few errors, and those tend to be related to the probe, corresponding to similar situations encountered previously.
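
To see why sparseness helps, here is a toy sketch in Python -- my own numbers and representation, not Hawkins's. An SDR is just the set of indices of its "on" bits, and similarity is the size of the overlap:

    import random

    N, ACTIVE = 2048, 40                # roughly 2% of the bits are on

    def random_sdr():
        """Represent an SDR by the indices of its on bits."""
        return frozenset(random.sample(range(N), ACTIVE))

    def overlap(a, b):
        """Similarity = number of shared on bits."""
        return len(a & b)

    a = random_sdr()
    b = random_sdr()
    noisy_a = frozenset(list(a)[:ACTIVE - 5] + random.sample(range(N), 5))

    print(overlap(a, noisy_a))          # large: most of a's bits survive the noise
    print(overlap(a, b))                # almost surely tiny for an unrelated SDR

Storing and comparing only the indices is what makes associative lookup cheap, and a few corrupted bits barely move the overlap -- which is the robustness Hawkins demonstrated.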

The first takeaway point of the talk was this: Intelligent systems of the future will be built using sparse distributed representation.

At this point, my note-taking slowed. I am not a biologist, so most of what Hawkins was describing lies far outside my area of expertise. So I made a master note -- gotta get this guy's book! -- and settled into more focused listening.

(It turns out that a former student recommended Hawkins's book, On Intelligence, to me a year or two ago. I should have listened to Allyn then and read it!)

One phrase that made me smile later in the talk was the semantic meaning of the wrongness. Knowing why something is wrong, or how, is a huge step up from "just" being wrong. Hawkins referred to this in particular as part of the subtlety of making predictions.

To close, Hawkins offered some conjectures. He thinks that the future of machine intelligence will depend on us developing more and better theory to explain how the brain works, especially in the areas of hierarchy and attention. The most compelling implementation will be an embodied intelligence, with embedded agents distributed across billions of sensors. We need better hardware in order to create faster systems. Recall that the brain is more a memory system than a computation device, so better memory is at least as important as better processors. Finally, we need to find a way to increase the level of connectivity among components. Neurons have tens or hundreds of connections to other neurons, and these can be grown or strengthened dynamically. Currently, our computer chips are not good at this.

Where will breakthrough applications come from? He's not sure. In the past, breakthrough applications of technologies have not always been where we expected them.

I gotta read more. As a student of AI, I was never all that interested in neurobiology or even its implications for my discipline. The cognitive level has always excited me more. But Hawkins makes an interesting case that the underlying technologies we need to reach the cognitive level will look more like our brains than today's computers.


Posted by Eugene Wallingford | Permalink | Categories: Computing

September 25, 2012 9:35 PM

StrangeLoop 2: The Future of Databases is In Memory

The conference opened with a keynote address by Michael Stonebraker, who built Ingres, Postgres, and several other influential database systems. Given all the hubbub about NoSQL the last few years, including at StrangeLoop 2010, this talk brought some historical perspective to a conversation that has been dominated in recent years by youngsters. Stonebraker told the audience that the future is indeed here, but from the outside it will look a lot like the past.

The problem, of course, is "big data". It's big because of volume, velocity, and variety. Stonebraker framed his opening comments in terms of volume. In the traditional database setting back in the 1980s, we all bought airplane tickets through a travel agent who acted, for all meaningful purposes, in the role of professional terminal operator. We were doing business "at the speed of the intermediary". The ACID properties were inviolable: "Don't you dare lose my data."

Then came change. The internet disintermediated access to databases, cutting intermediaries out of the equation. Volume shot through the roof. PDAs further disintermediated access, removing limitations on the locations from which we accessed our data. Volume shot up even further. Suddenly, databases came to be part of the solution to a much broader class of problems: massively online games, ad placement, new forms of commerce. We all know what that meant for volume.

Stonebraker then offered two reality checks to frame the solution to our big data problems. The first involved the cost of computer memory. One terabyte is a really big database for transaction processing, yet 1TB of memory now costs $25-50K. Furthermore, the price is dropping faster than transaction volume is rising. So: the big data problem is really now a problem for main memory.

The second reality check involved database performance. Well under 10% of the time spent by a typical database is spent doing useful work. Over 90% is overhead: managing a buffer pool, latching, locking, and recovery. We can't make faster databases by creating better DB data structures or algorithms; a better B-tree can affect only 4% of application runtime. If we could eliminate the buffer pool, we could gain up to 25% in performance. We must focus on overhead.

Where to start? We can forget about all the traditional database vendors. They have code lines that are thirty years old and older. They have to manage backward compatibility for a huge installed customer base. They are, from the perspective of the future, bloatware. They can't improve.

How about the trend toward NoSQL? We can just use raw data storage and write our own low-level code, optimized to the task. Well, the first thing to realize is that the compiler already translates SQL into lower-level operations. In the world of databases as almost everywhere else, it is really hard to beat the compiler at its game. High-level languages are good, and our compilers do an awesome job generating near-optimal code. Moving down an abstraction layer is, Stonebraker says, a fundamental mistake: "Always move code to the data, never data to the code."

Second, we must realize that the ACID properties really are a good thing. More important, they are nearly impossible to retrofit into a system that doesn't already provide them. "Eventually consistent" doesn't really mean eventually consistent if it's possible to sell your last piece of inventory twice. In any situation where there exists a pair of non-commutative transactions, "eventually consistent" is a recipe for corruption.
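
The inventory example in miniature, as a hypothetical Python sketch of my own devising: two replicas, temporarily partitioned, each accept an order for the last item, and no later reconciliation can restore the invariant.

    # Two replicas of the same inventory record, temporarily out of touch.
    replica_a = {"widgets": 1}
    replica_b = {"widgets": 1}

    def sell(replica, item):
        """Each replica validates the sale against its own (stale) state."""
        if replica[item] > 0:
            replica[item] -= 1
            return "sold"
        return "rejected"

    print(sell(replica_a, "widgets"))   # sold
    print(sell(replica_b, "widgets"))   # sold -- the same last widget

    # Eventually the replicas converge on some merged value, but however we
    # merge, two customers have been promised one widget. Consistency arrived
    # too late to protect the invariant.
    print(min(replica_a["widgets"], replica_b["widgets"]))   # 0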

So, SQL and ACID are good. Let's keep them. Stonebraker says that instead of NoSQL databases, we should build "NewSQL" databases that improve performance through innovative architectures. Putting the database in main memory is one way to start. He addressed several common objections to this idea ("But what if the power fails??") by focusing on speed and replication. Recovery may be slow, but performance is wildly better. We should optimize for the most common case and treat exceptional cases for what they are: rare.

He mentioned briefly several other components of a new database architecture, such as horizontal scaling across a cluster of nodes, automatic sharding, and optimization via stored procedures targeted at the most common activities. The result is not a general-purpose solution, but then why does it need to be?

I have a lot of gray hair, Stonebraker said, but that means he has seen these wars before. It's better to stick with what we know to be valuable and seek better performance where our technology has taken us.


Posted by Eugene Wallingford | Permalink | Categories: Computing

September 25, 2012 8:35 PM

StrangeLoop 1: A Miscellany of Ideas

For my lunch break, I walked a bit outside, to see the sun and bend my knee a bit. I came back for a set of talks without an obvious common thread. After seeing the talks, I saw a theme: ideas for writing programs more conveniently or more concisely.

ClojureScript

David Nolen talked about ClojureScript, a Clojure-like language that compiles to Javascript. As he noted, there is a lot of work in this space, both older and newer. The goal of all that work is to write Javascript more conveniently, or generate it from something else. The goal of ClojureScript is to bring the expressibility and flexible programming style of the Lisp world to the JS world. Nolen's talk gave us some insights into the work being done to make the compiler produce efficient Javascript, as well as into why you might use ClojureScript in the first place.

Data Structures and Hidden Code

The message of this talk by Scott Vokes is that your choice in data structures plays a big role in determining how much code you have to write. You can make a lot of code disappear by using more powerful data structures. We can, of course, generalize this claim from data structures to data. This is the theme of functional and object-oriented programming, too. This talk highlights how often we forget the lowly data structure when we think of writing less code.

As Vokes said, your choice in data structures sets the "path of least resistance" for what your program will do and also for the code you will write. When you start writing code, you often don't know what the best data structure for your application is. As long as you don't paint yourself into a corner, you should be able to swap a new structure in for the old. The key to this is something novice programmers learn early: writing code not in terms of a data structure but in terms of higher-level behaviors. Primitive obsession can become implementation obsession if you aren't careful.
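
A small Python sketch of my own to illustrate the point: the callers below depend only on a couple of behaviors, so replacing the underlying list with a heap later is a one-class change.

    import heapq

    class WorkList:
        """Callers see only add() and next_item(), not the structure behind them."""
        def __init__(self):
            self._items = []

        def add(self, priority, task):
            self._items.append((priority, task))

        def next_item(self):
            self._items.sort()
            return self._items.pop(0)[1]

    class HeapWorkList(WorkList):
        """Same behaviors, better structure -- callers never notice the swap."""
        def add(self, priority, task):
            heapq.heappush(self._items, (priority, task))

        def next_item(self):
            return heapq.heappop(self._items)[1]

    for make in (WorkList, HeapWorkList):
        todo = make()
        todo.add(2, "grade exams")
        todo.add(1, "prep lecture")
        print(todo.next_item())          # "prep lecture" either way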

The meat of this talk was a quick review of four data structures that most programmers don't learn in school: skip lists, difference lists, rolling hashes, and jumpropes, a structure Vokes claims to have invented.
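
Of the four, the rolling hash is the easiest to show in a few lines. Here is a rough Python sketch of the idea, mine rather than Vokes's: the hash of each window of text comes from the previous window's hash in constant time, which is what makes Rabin-Karp-style substring search practical.

    B, M = 256, 1_000_000_007            # base and modulus for the hash

    def rolling_hashes(text, k):
        """Yield the hash of every length-k window, each in O(1) from the last."""
        h = 0
        for ch in text[:k]:               # hash the first window directly
            h = (h * B + ord(ch)) % M
        yield h
        high = pow(B, k - 1, M)           # weight of the outgoing character
        for i in range(k, len(text)):
            h = (h - ord(text[i - k]) * high) % M   # drop the old character
            h = (h * B + ord(text[i])) % M          # bring in the new one
            yield h

    hashes = list(rolling_hashes("abracadabra", 4))
    print(hashes[0] == hashes[-1])        # True: "abra" appears at both ends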

This talk was a source of several good quotes, including

  • "A data structure is just a stupid programming language." -- Bill Gosper
  • "A data structure is just a virtual machine." -- Vokes himself, responding to Gosper
  • "The cheapest, fastest, and most reliable components are those that aren't there." -- Gordon Bell

The first two quotes there would make nice mottos for a debate between functional and OO programming. They also are two sides of the same coin, which destroys the premise of the debate.

miniKanren

As a Scheme programmer and a teacher of programming languages, I have developed great respect and fondness for the work of Dan Friedman over the last fifteen years. As a computer programmer who began his studies deeply interested in AI, I have long had a fondness for Prolog. How could I not go to the talk on miniKanren? This is a small implementation (~600 lines written in a subset of Scheme) of Kanren, a declarative logic programming system described in The Reasoned Schemer.

This talk was like a tag-team vaudeville act featuring Friedman and co-author William Byrd. I can't do this talk justice in a blog entry. Friedman and Byrd interleaved code demos with exposition as they

  • showed miniKanren at its simplest, built from three operators (fresh, conde, and run)
  • extended the language with a few convenient operators for specifying constraints, types, and exclusions, and
  • illustrated how to program in miniKanren by building a language interpreter, EOPL style.

The cool endpoint of using logic programming to build the interpreter is that, by using variables in a specification, the interpreter produces legal programs that meet a given specification. It generates code via constraint resolution.

If that weren't enough, they also demo'ed how their system can, given a language grammar, produce quines -- programs p such that

    (equal p (eval p))
-- and twines, pairs of programs p and q such that
    (and (equal p (eval q))
         (equal q (eval p)))

Then they live-coded an implementation of typed lambda calculus.

Yes, all in fifty minutes. Like I said, you really need to watch the talk at InfoQ as soon as it's posted.

In the course of giving the talk, Friedman stated a rule that my students can use:

Will's law: If your function has a recursion, do the recursion last.

Will followed up with cautionary advice:

Will's second law: If your function has two recursions, call Will.

We'll see how serious he was when I put a link to his e-mail address in my Programming Languages class notes next spring.
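
For what it's worth, here is roughly what I take Will's first law to mean, sketched in Python rather than Scheme: rearrange the function so the recursive call is the last thing it does, carrying the partial result along as an extra argument.

    def length(items):
        """The recursion is not last: an addition still waits for its result."""
        if not items:
            return 0
        return 1 + length(items[1:])

    def length_acc(items, so_far=0):
        """Will's law: do the recursion last, passing the partial result along."""
        if not items:
            return so_far
        return length_acc(items[1:], so_far + 1)

    print(length([3, 6, 5]), length_acc([3, 6, 5]))   # 3 3

A Scheme system turns that final call into a loop; Python does not, but the shape of the solution is the lesson.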


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 25, 2012 7:31 PM

Blogging from StrangeLoop

StrangeLoop logo

This week I have the pleasure of spending a couple of days expanding my mind at StrangeLoop 2012. I like StrangeLoop because it's a conference for programmers. The program is filled with hot programming topics and languages, plus a few keynotes to punctuate our mental equilibria. The 2010 conference gave me plenty to think about, but I had to skip 2011 while teaching and recovering. This year was a must-see.

I'll be posting the following entries from the conference as time permits me to write them.

You can find links to other write-ups of the conference, as well as slides from some talks and other material, at the StrangeLoop 2012 github site.

Now that the conference has ended, I can say with confidence that StrangeLoop 2012 was even better than StrangeLoop 2010.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 20, 2012 8:09 PM

Computer Science is a Liberal Art

Over the summer, I gave a talk as part of a one-day conference on the STEM disciplines for area K-12, community college, and university advisors. They were interested in, among other things, the kind of classes that CS students take at the university and the kind of jobs they get when they graduate.

In the course of talking about how some of the courses our students take (say, algorithms and the theory of computing) seem rather disconnected from many of the jobs they get (say, web programmer and business analyst), I claimed that the more abstract courses prepare students to understand the parts of the computing world that never change, and the ones that do. The specific programming languages or development stack they use after they graduate to build financial reporting software may change occasionally, but the foundation they get as a CS major prepares them to understand what comes next and to adapt quickly.

In this respect, I said, a university CS education is not job training. Computer Science is a liberal art.

This is certainly true when you compare university CS education with what students get at a community college. Students who come out of a community college networking program often possess specific marketable skills at a level we are hard-pressed to meet in a university program. We bank our program's value on how well it prepares students for a career, in which networking infrastructure changes multiple times and our grads are asked to work at the intersection of networks and other areas of computing, some of which may not exist yet.

It is also true relative to the industries they enter after graduation. A CS education provides a set of basic skills and, more important, several ways to think about problems and formulate solutions. Again, students who come out of a targeted industry or 2-year college training program in, say, web dev, often have "shovel ready" skills that are valuable in industry and thus highly marketable. We bank our program's value on how well it prepares students for a career in which ASP turns to JSP turns to PHP turns to JavaScript. Our students should be prepared to ramp up quickly and have a shovel in their hands producing value soon.

And, yes, students in a CS program must learn to write code. That's a basic skill. I often hear people comment that computer science programs do not prepare students well for careers in software development. I'm not sure that's true, at least at schools like mine. We can't get away with teaching all theory and abstraction; our students have to get jobs. We don't try to teach them everything they need to know to be good software developers, or even many particular somethings. That should and will come on the job. I want my students to be prepared for whatever they encounter. If their company decides to go deep with Scala, I'd like my former students to be ready to go with them.

In a comment on John Cook's timely blog entry How long will there be computer science departments?, Daniel Lemire suggests that we emulate the model of medical education, in which doctors serve several years in residency, working closely with experienced doctors and learning the profession deeply. I agree. Remember, though, that aspiring doctors go to school for many years before they start residency. In school, they study biology, chemistry, anatomy, and physiology -- the basic science at the foundation of their profession. That study prepares them to understand medicine at a much deeper level than they otherwise might. That's the role CS should play for software developers.

(Lemire also smartly points out that programmers have the ability to do residency almost any time they like, by joining an open source project. I love to read about how Dave Humphrey and people like him bring open-source apprenticeship directly into the undergrad CS experience and wonder how we might do something similar here.)

So, my claim that Computer Science is a liberal arts program for software developers may be crazy, but it's not entirely crazy. I am willing to go even further. I think it's reasonable to consider Computer Science as part of the liberal arts for everyone.

I'm certainly not the first person to say this. In 2010, Doug Baldwin and Alyce Brady wrote a guest editors' introduction to a special issue of the ACM Transactions on Computing Education called Computer Science in the Liberal Arts. In it, they say:

In late Roman and early medieval times, seven fields of study, rooted in classical Greek learning, became canonized as the "artes liberales" [Wagner 1983], a phrase denoting the knowledge and intellectual skills appropriate for citizens free from the need to labor at the behest of others. Such citizens had ample leisure time in which to pursue their own interests, but were also (ideally) civic, economic, or moral leaders of society.

...

[Today] people ... are increasingly thinking in terms of the processes by which things happen and the information that describes those processes and their results -- as a computer scientist would put it, in terms of algorithms and data. This transformation is evident in the explosion of activity in computational branches of the natural and social sciences, in recent attention to "business processes," in emerging interest in "digital humanities," etc. As the transformation proceeds, an adequate education for any aspect of life demands some acquaintance with such fundamental computer science concepts as algorithms, information, and the capabilities and limitations of both.

The real value in a traditional Liberal Arts education is in helping us find better ways to live, exposing us to the best thoughts of men and women in hopes that we choose a way to live, rather than have history or accident choose a way to live for us. Computer science, like mathematics, can play a valuable role in helping students connect with their best aspirations. In this sense, I am comfortable at least entertaining the idea that CS is one of the modern liberal arts.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

September 05, 2012 5:24 PM

Living with the Masters

I sometimes feel guilty that most of what I write here describes connections between teaching or software development and what I see in other parts of the world. These connections are valuable to me, though, and writing them down is valuable in another way.

I'm certainly not alone. In Why Read, Mark Edmondson argues for the value of reading great literature and trying on the authors' view of the world. Doing so enables us to better understand our own view of the world. It also gives us the raw material out of which to change our worldview, or build a new one, when we encounter better ideas. In the chapter "Discipline", Edmondson writes:

The kind of reading that I have been describing here -- the individual quest for what truth a work reveals -- is fit for virtually all significant forms of creation. We can seek vital options in any number of places. They may be found for this or that individual in painting, in music, in sculpture, in the arts of furniture making or gardening. Thoreau felt he could derive a substantial wisdom by tending his bean field. He aspired to "know beans". He hoed for sustenance, as he tells us, but he also hoed in search of tropes, comparisons between what happened in the garden and what happened elsewhere in the world. In his bean field, Thoreau sought ways to turn language -- and life -- away from old stabilities.

I hope that some of my tropes are valuable to you.

The way Edmondson writes of literature and the liberal arts applies to the world of software in much more direct ways, too. First, there is the research literature of computing and software development. One can seek truth in the work of Alan Kay, David Ungar, Ward Cunningham, or Kent Beck. One can find vital options in the life's work of Robert Floyd, Peter Landin, or Alan Turing; Herbert Simon, Marvin Minsky, or John McCarthy. I spent much of my time in grad school immersed in the writings and work of B. Chandrasekaran, which affected my view of intelligence in both humans and machines.

Each of these people offers a particular view into a particular part of the computing world. Trying out their worldviews can help us articulate our own worldviews better, and in the process of living their truths we sometimes find important new truths for ourselves.

We in computing need not limit ourselves to the study of research papers and books. As Edmondson says, the individual quest for the truth revealed in a work "is fit for virtually all significant forms of creation". Software is a significant form of creation, one not available to our ancestors even sixty years ago. Live inside any non-trivial piece of software for a while, especially one that has withstood the buffets of human desire over a period of time, and you will encounter truth -- truths you find there, and truths you create for yourself. A few months trying on Smalltalk and its peculiar view of the world taught me OOP and a whole lot more.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

August 08, 2012 1:50 PM

Examples First, Names Last

Earlier this week, I reviewed a draft chapter from a book a friend is writing, which included a short section on aspect-oriented programming. The section used common AOP jargon: "cross cutting", "advice", and "point cut". I know enough about AOP to follow his text, but I figured that many of his readers -- young software developers from a variety of backgrounds -- would not. On his use of "cross cutting", I commented:

Your ... example helps to make this section concrete, but I bet you could come up with a way of explaining the idea behind AOP in a few sentences that would be (1) clear to readers and (2) not use "cross cutting". Then you could introduce the term as the name of something they already understand.

This may remind you of the famous passage from Richard Feynman about learning names and understanding things. (It is also available on a popular video clip.) Given that I was reviewing a chapter for a book of software patterns, it also brought back memories of advice that Ralph Johnson gave many years ago on the patterns discussion list. Most people, he said, learn best from concrete examples. As a result, we should write software patterns in such a way that we lead with a good example or two and only then talk about the general case. In pattern style, he called this idea "Concrete Before Abstract".

I try to follow this advice in my teaching, though I am not dogmatic about it. There is a lot of value in mixing up how we organize class sessions and lectures. First, different students connect better with some approaches than others, so variety increases the chances of connecting with everyone a few times each semester. Second, variety helps to keep students interested, and being interested is a key ingredient in learning.

Still, I have a preference for approaches that get students thinking about real code as early as possible. Starting off by talking about polymorphism and its theoretical forms is a lot less effective at getting the idea across to undergrads than showing students a well-chosen example or two of how plugging a new object into an application makes it easier to extend and modify programs.

So, right now, I have "Concrete Before Abstract" firmly in mind as I prepare to teach object-oriented programming to our sophomores this fall.

Classes start in twelve days. I figured I'd be blogging more by now about my preparations, but I have been rethinking nearly everything about the way I teach the course. That has left my mind more muddled than settled for long stretches. Still, my blog is my outboard brain, so I should be rethinking more in writing.

I did have one crazy idea last night. My wife learned Scratch at a workshop this summer and was talking about her plans to use it as a teaching tool in class this fall. It occurred to me that implementing Scratch would be a fun exercise for my class. We'll be learning Java and a little graphics programming as a part of the course, and conceptually Scratch is not too many steps from the pinball game construction kit in Budd's Understanding Object-Oriented Programming with Java, the textbook I have used many times in the course. I'm guessing that Budd's example was inspired by Bill Budge's game for Electronic Arts, Pinball Construction Set. (Unfortunately, Budd's text is now almost as out of date as that 1983 game.)

Here is an image of a game constructed using the pinball kit and Java's AWT graphics framework:

a pinball game constructed using a simple game kit

The graphical ideas needed to implement Scratch are a bit more complex, including at least:

  • The items on the canvas must be clickable and respond to messages.
  • Items must be able to "snap" together to create units of program. This could happen when a container item such as a choice or loop comes into contact with an item it is to contain.

The latter is an extension of collision-detecting behavior that students would be familiar with from earlier "ball world" examples. The former is something we occasionally do in class anyway; it's awfully handy to be able to reconfigure the playing field after seeing how the game behaves with the ball in play. The biggest change would be that the game items are little pieces of program that know how to "interpret" themselves.
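
As a first approximation -- sketched here in Python for compactness, though in class we would write it in Java on top of AWT -- a block is just a rectangle that can detect overlap, absorb another block, and interpret itself:

    class Block:
        """A draggable program fragment: a rectangle that knows how to run itself."""
        def __init__(self, x, y, w, h, action):
            self.x, self.y, self.w, self.h = x, y, w, h
            self.action = action
            self.children = []

        def overlaps(self, other):
            """The same kind of collision test as in the ball-world examples."""
            return (self.x < other.x + other.w and other.x < self.x + self.w and
                    self.y < other.y + other.h and other.y < self.y + self.h)

        def snap(self, other):
            """A container absorbs any block dropped onto it."""
            if self.overlaps(other):
                self.children.append(other)

        def interpret(self):
            """Each item is a little piece of program that runs itself."""
            self.action()
            for child in self.children:
                child.interpret()

    loop = Block(0, 0, 100, 60, lambda: print("repeat:"))
    move = Block(90, 10, 80, 30, lambda: print("  move 10 steps"))
    loop.snap(move)        # the blocks overlap, so move nests inside the loop
    loop.interpret()       # repeat: / move 10 steps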

As always, the utility of a possible teaching idea lies in the details of implementing it. I'll give it a quick go over the next week to see if it's something I think students would be able to handle, either as a programming assignment or as an example we build and discuss in class.

I'm pretty excited by the prospect, though. If this works out, it will give me a nice way to sneak basic language processing into the course in a fun way. CS students should see and think about languages and how programs are processed throughout their undergrad years, not only in theory courses and programming languages courses.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

August 03, 2012 3:23 PM

How Should We Teach Algebra in 2012?

Earlier this week, Dan Meyer took to task a New York Times opinion piece from the weekend, Is Algebra Necessary?:

The interesting question isn't, "Should every high school graduate in the US have to take Algebra?" Our world is increasingly automated and programmed and if you want any kind of active participation in that world, you're going to need to understand variable representation and manipulation. That's Algebra. Without it, you'll still be able to clothe and feed yourself, but that's a pretty low bar for an education. The more interesting question is, "How should we define Algebra in 2012 and how should we teach it?" Those questions don't even seem to be on Hacker's radar.

"Variable representation and manipulation" is a big part of programming, too. The connection between algebra and programming isn't accidental. Matthias Felleisen won the ACM's Outstanding Educator Award in 2010 for his long-term TeachScheme! project, which has now evolved into Program by Design. In his SIGCSE 2011 keynote address, Felleisen talked about the importance of a smooth progression of teaching languages. Another thing he said in that talk stuck with me. While talking about the programming that students learned, he argued that this material could be taught in high school right now, without displacing as much material as most people think. Why? Because "This is algebra."

Algebra in 2012 still rests fundamentally on variable representation and manipulation. How should we teach it? I agree with Felleisen. Programming.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 20, 2012 3:39 PM

A Philosopher of Imitation

Ian Bogost, in The Great Pretender: Turing as a Philosopher of Imitation, writes:

Intelligence -- whatever it is, the thing that goes on inside a human or a machine -- is less interesting and productive a topic of conversation than the effects of such a process, the experience it creates in observers and interlocutors.

This is a very nice one-sentence summary of Turing's thesis in Computing Machinery and Intelligence. I wrote a bit about Turing's ideas on machine intelligence a few months back, but the key idea in Bogost's essay relates more closely to my earlier discussion of Turing's ideas on representation and universal machines.

In this centennial year of his birth, we can hardly go wrong in considering again and again the depth of Turing's contributions. Bogost uses a lovely turn of phrase in his title: a philosopher of imitation. What may sound like a slight or a trifle is, in fact, the highest of compliments. Turing made that thinkable.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 18, 2012 2:31 PM

Names, Values, and The Battle of Bull Run

the cover of 'Encyclopedia Brown Finds the Clues'

Author Donald Sobol died Monday. I know him best from his long-running series, Encyclopedia Brown. Like many kids of my day, I loved these stories. I couldn't get enough. Each book consisted of ten or so short mysteries solved by Encyclopedia or Sally Kimball, his de facto partner in the Brown Detective Agency. I wanted to be Encyclopedia.

The stories were brain teasers. Solving them required knowledge and, more important, careful observation and logical deduction. I learned to pay close attention while reading Encyclopedia Brown; otherwise, I had no hope of solving the crime before Encyclopedia revealed the solution. In many ways, these stories prepared me for a career in math and science. They certainly were a lot of fun.

One of the stories I remember best after all these years is "The Case of the Civil War Sword", from the very first Encyclopedia Brown book. I'm not the only person who found it memorable; Rob Bricken ranks it #9 among the ten most difficult Encyclopedia Brown mysteries. The solution to this case turned on the fact that one battle had two different names. Northerners often named battles for nearby bodies of water or prominent natural features, while Southerners named them for the nearest town or prominent man-made features. So, the First Battle of Bull Run and the First Battle of Manassas were the same event.

This case taught me a bit of historical trivia and opened my mind to the idea that naming things from the Civil War was not trivial at all.

This story taught me more than history, though. As a young boy, it stood out as an example of something I surely already knew: names aren't unique. The same value can have different names. In a way, Encyclopedia Brown taught me one of my first lessons about computer science.

~~~~

IMAGE: the cover of Encyclopedia Brown Finds the Clues, 1966. Source: Topless Robot.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 14, 2012 11:01 AM

"Most Happiness Comes From Friction"

Last time, I mentioned again the value in having students learn broadly across the sciences and humanities, including computer science. This is a challenge going in both directions. Most students like to concentrate on one area, for a lot of different reasons. Computer science looks intimidating to students in other majors, perhaps especially to the humanities-inclined.

There is hope. Earlier this year, the Harvard Magazine ran The Frisson of Friction, an essay by Sarah Zhang, a non-CS student who decided to take CS 50, Harvard's intro to computer science. Zhang tells the story of finding a thorny, semicolon-induced bug in a program (an extension for Google's Chrome browser) on the eve of her 21st birthday. Eventually, she succeeded. In retrospect, she writes:

Plenty of people could have coded the same extension more elegantly and in less time. I will never be as good a programmer as -- to set the standard absurdly high -- Mark Zuckerberg. But accomplishments can be measured in terms relative to ourselves, rather than to others. Rather than sticking to what we're already good at as the surest path to résumé-worthy achievements, we should see the value in novel challenges. How else will we discover possibilities that lie just beyond the visible horizon?

... Even the best birthday cake is no substitute for the deep satisfaction of accomplishing what we had previously deemed impossible -- whether it's writing a program or writing a play.

The essay addresses some of the issues that keep students from seeking out novel challenges, such as fear of low grades and fear of looking foolish. At places like Harvard, students who are used to succeeding find themselves boxed in by their friends' expectations, and their own, but those feelings are familiar to students at any school. Then you have advisors who subtly discourage venturing too far from the comfortable, out of their own unfamiliarity and fear. This is a social issue as big as any pedagogical challenge we face in trying to make introductory computer science more accessible to more people.

With work, we can help students feel the deep satisfaction that Zhang experienced. Overcoming challenges often leads to that feeling. She quotes a passage about programmers in Silicon Valley, who thrive on such challenges: "Most happiness probably comes from friction." Much satisfaction and happiness come out of the friction inherent in making things. Writing prose and writing programs share this characteristic.

Sharing the deep satisfaction of computer science is a problem with many facets. Those of us who know the satisfaction know it's a problem worth solving.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

July 13, 2012 12:02 PM

How Science -- and Computing -- Are Changing History

While reading a recent Harvard Magazine article about Eric Mazur's peer instruction technique in physics teaching, I ran across a link to an older paper that fascinated me even more! Who Killed the Men of England? tells several stories of research at the intersection of history, archaeology, genomics, evolution, demography, and simulation, such as the conquest of Roman England by the Anglo Saxons.

Not only in this instance, but across entire fields of inquiry, the traditional boundaries between history and prehistory have been melting away as the study of the human past based on the written record increasingly incorporates the material record of the natural and physical sciences. Recognizing this shift, and seeking to establish fruitful collaborations, a group of Harvard and MIT scholars have begun working together as part of a new initiative for the study of the human past. Organized by [professor of medieval history Michael] McCormick, who studies the fall of the Roman empire, the aim is to bring together researchers from the physical, life, and computer sciences and the humanities to explore the kinds of new data that will advance our understanding of human history.

... The study of the human past, in other words, has entered a new phase in which science has begun to tell stories that were once the sole domain of humanists.

I love history as much as computing and was mesmerized by these stories of how scientists reading the "material record" of the world are adding to our knowledge of the human past.

However, this is more than simply a one-way path of information flowing from scientists to humanists. The scientific data and models themselves are underconstrained. The historians, cultural anthropologists, and demographers are able to provide context to the data and models and so extract even more meaning from them. This is a true collaboration. Very cool.

The rise of science is erasing boundaries between the disciplines that we all studied in school. Scholars are able to define new disciplines, such as "the study of the human past", mentioned in the passage above. These disciplines are organized with a greater focus on what is being studied than on how we are studying it.

We are also blurring the line between history and pre-history. It used to be that history required a written record, but that is no longer a hard limit. Science can read nature's record. Computer scientists can build models using genomic data and migration data that suggest possible paths of change when the written and scientific record are incomplete. These ideas become part of the raw material that humanists use to construct a coherent story of the past.

This change in how we are able to study the world highlights the importance of a broad education, something I've written about a few times recently [ 1 | 2 | 3 ] and not so recently. This sort of scholarship is best done by people who are good at several things, or at least curious and interested enough in several things to get to know them intimately. As I wrote in Failure and the Liberal Arts, it's important both not to be too narrowly trained and not to be too narrowly "liberally educated".

Even at a place like Harvard, this can leave scholars in a quandary:

McCormick is fired with enthusiasm for the future of his discipline. "It is exciting. I jump up every morning. But it is also challenging. Division and department boundaries are real. Even with a generally supportive attitude, it is difficult [to raise funds, to admit students who are excellent in more than one discipline, and so on]. ..."

So I will continue to tell computer science students to take courses from all over the university, not just from CS and math. This is one point of influence I have as a professor, advisor, and department head. And I will continue to look for ways to encourage non-CS students to take CS courses and students outside the sciences to study science, including CS. As that paragraph ends:

"... This is a whole new way of studying the past. It is a unique intellectual opportunity and practically all the pieces are in place. This should happen here--it will happen, whether we are part of it or not."

"Here" doesn't have to be Harvard. There is a lot of work to be done.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 11, 2012 2:45 PM

A Few Comments on the Alan Kay Interview, and Especially Patterns

Alan Kay

Many of my friends and colleagues on Twitter today are discussing the Interview with Alan Kay posted by Dr. Dobb's yesterday. I read the piece this morning while riding the exercise bike and could not contain my desire to underline passages, star paragraphs, and mark it up with my own comments. That's hard to do while riding hard, hurting a little, and perspiring a lot. My desire propelled me forward in the face of all these obstacles.

Kay is always provocative, and in this interview he leaves no oxen ungored. Like most people whenever they read outrageous and provocative claims, I cheered when Kay said something I agreed with and hissed -- or blushed -- when he said something that gored me or one of my pet oxen. Twitter is a natural place to share one's cheers and boos for an article with or by Alan Kay, given the amazing density of soundbites one finds in his comments about the world of computing.

(One might say the same thing about Brian Foote, the source of both soundbites in that paragraph.)

I won't air all my cheers and hisses here. Read the article, if you haven't already, and enjoy your own. I will comment on one paragraph that didn't quite make me blush:

The most disastrous thing about programming -- to pick one of the 10 most disastrous things about programming -- there's a very popular movement based on pattern languages. When Christopher Alexander first did that in architecture, he was looking at 2,000 years of ways that humans have made themselves comfortable. So there was actually something to it, because he was dealing with a genome that hasn't changed that much. I think he got a few hundred valuable patterns out of it. But the bug in trying to do that in computing is the assumption that we know anything at all about programming. So extracting patterns from today's programming practices ennobles them in a way they don't deserve. It actually gives them more cachet.

Long-time Knowing and Doing readers know that patterns are one of my pet oxen, so it would have been natural for me to react somewhat as Keith Ray did and chide Kay for what appears to be a typical "Hey, kids, get off my lawn!" attitude. But that's not my style, and I'm such a big fan of Kay's larger vision for computing that my first reaction was to feel a little sheepish. Have I been wasting my time on a bad idea, distracting myself from something more important? I puzzled over this all morning, and especially as I read other people's reactions to the interview.

Ultimately, I think that Kay is too pessimistic when he says we hardly know anything at all about programming. We may well be closer to the level of the Egyptians who built the pyramids than we are to the engineers who built the Empire State Building. But I simply don't believe that people such as Ward Cunningham, Ralph Johnson, and Martin Fowler don't have a lot to teach most of us about how to make better software.

Wherever we are, I think it's useful to identify, describe, and catalog the patterns we see in our software. Doing so enables us to talk about our code at a level higher than parentheses and semicolons. It helps us bring other programmers up to speed more quickly, so that we don't all have to struggle through all the same detours and tar pits our forebears struggled through. It also makes it possible for us to talk about the strengths and weaknesses of our current patterns and to seek out better ideas and to adopt -- or design -- more powerful languages. These are themes Kay himself expresses in this very same interview: the importance of knowing our history, of making more powerful languages, and of education.

Kay says something about education in this interview that is relevant to the conversation on patterns:

Education is a double-edged sword. You have to start where people are, but if you stay there, you're not educating.

The real bug in what he says about patterns lies at one edge of the sword. We may not know very much about how to make software yet, but if we want to remedy that, we need to start where people are. Most software patterns are an effort to reach programmers who work in the trenches, to teach them a little of what we do know about how to make software. I can yammer on all I want about functional programming. If a Java practitioner doesn't appreciate the idea of a Value Object yet, then my words are likely wasted.

Ward Cunningham

Ironically, many argue that the biggest disappointment of the software patterns effort lies at the other edge of education's sword: an inability to move the programming world quickly enough from where it was in the mid-1990s to a better place. In his own Dr. Dobb's interview, Ward Cunningham observed with a hint of sadness that an unexpected effect of the Gang of Four Design Patterns book was to extend the life of C++ by a decade, rather than reinvigorating Smalltalk (or turning people on to Lisp). Changing the mindset of a large community takes time. Many in the software patterns community tried to move people past a static view of OO design embodied in the GoF book, but the vocabulary calcified more quickly than they could respond.

Perhaps that is all Kay meant by his criticism that patterns "ennoble" practices in a way they don't deserve. But if so, it hardly qualifies in my mind as "one of the 10 most disastrous things about programming". I can think of a lot worse.

Kurt Vonnegut

To all this, I can only echo the Bokononists in Kurt Vonnegut's novel Cat's Cradle: "Busy, busy, busy." The machinery of life is usually more complicated and unpredictable than we expect or prefer. As a result, reasonable efforts don't always turn out as we intend them to. So it goes. I don't think that means we should stop trying.

Don't let my hissing about one paragraph in the interview dissuade you from reading the Dr. Dobb's interview. As usual, Kay stimulates our minds and encourages us to do better.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

June 28, 2012 4:13 PM

"Doing research is therefore writing software."

The lede from RA Manual: Notes on Writing Code, by Gentzkow and Shapiro:

Every step of every research project we do is written in code, from raw data to final paper. Doing research is therefore writing software.

The authors are economists at the University of Chicago. I have only skimmed the beginning of the paper, but I like what little I've seen. They take seriously the writing of computer programs.

  • "This document lays out some broad principles we should all follow."
  • "We encourage you to invest in reading more broadly about software craftsmanship, looking critically at your own code and that of your colleagues, and suggesting improvements or additions to the principles below."
  • "Apply these principles to every piece of code you check in without exception."
  • "You should also take the time to improve code you are modifying or extending even if you did not write the code yourself."

...every piece of code you check in... Source code management and version control? They are a couple of steps up on many CS professors and students.

Thanks to Tyler Cowen for the link.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 19, 2012 3:04 PM

Basic Arithmetic, APL-Style, and Confident Problem Solvers

After writing last week about a cool array manipulation idiom, motivated by APL, I ran across another reference to "APL style" computation yesterday while catching up with weekend traffic on the Fundamentals of New Computing mailing list. And it was cool, too.

Consider the sort of multi-digit addition problem that we all spend a lot of time practicing as children:

        365
     +  366
     ------

The technique requires converting two-digit sums, such as 6 + 5 = 11 in the rightmost column, into a units digit and carrying the tens digit into the next column to the left. The process is straightforward but creates problems for many students. That's not too surprising, because there is a lot going on in a small space.

David Leibs described a technique, which he says he learned from something Kenneth Iverson wrote, that approaches the task of carrying somewhat differently. It takes advantage of the fact that a multi-digit number is a vector of digits times another vector of powers.

First, we "spread the digits out" and add them, with no concern for overflow:

        3   6   5
     +  3   6   6
     ------------
        6  12  11

Then we normalize the result by shifting carries from right to left, "in fine APL style".

        6  12  11
        6  13   1
        7   3   1

According to Leibs, Iverson believed that this two-step approach was easier for people to get right. I don't know if he had any empirical evidence for the claim, but I can imagine why it might be true. The two-step approach separates into independent operations the tasks of addition and carrying, which are conflated in the conventional approach. Programmers call this separation of concerns, and it makes software easier to get right, too.

Multiplication can be handled in a conceptually similar way. First, we compute an outer product by building a digit-by-digit times table for the digits:

     +---+---------+
     |   |  3  6  6|
     +---+---------+
     | 3 |  9 18 18|
     | 6 | 18 36 36|
     | 5 | 15 30 30|
     +---+---------+

This is straightforward, simply an application of the basic facts that students memorize when they first learn multiplication.

Then we sum the diagonals running southwest to northeast, again with no concern for carrying:

     (9) (18+18) (18+36+15) (36+30) (30)
      9      36         69      66   30

In the traditional column-based approach, we do this implicitly when we add staggered columns of digits, only we have to worry about the carries at the same time -- and now the carry digit may be something other than one!

Finally, we normalize the resulting vector right to left, just as we did for addition:

         9  36  69  66  30
         9  36  69  69   0
         9  36  75   9   0
         9  43   5   9   0
        13   3   5   9   0
     1   3   3   5   9   0

Again, the three components of the solution are separated into independent tasks, enabling the student to focus on one task at a time, using for each a single, relatively straightforward operator.

(Does this approach remind some of you of Cannon's algorithm for matrix multiplication in a two-dimensional mesh architecture?)

Of course, Iverson's APL was designed around vector operations such as these, so it includes operators that make implementing such algorithms as straightforward as the calculate-by-hand technique. Three or four Greek symbols and, voilà, you have a working program. If you are Dave Ungar, you are well on your way to a compiler!
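
Lacking APL's operators, here is a rough Python sketch of the two-step technique -- my own translation, not Iverson's code. The digit-wise work and the carrying live in separate functions, which is the whole point:

    def normalize(digits):
        """Shift carries from right to left until every entry is a single digit."""
        digits = digits[:]
        carry = 0
        for i in range(len(digits) - 1, -1, -1):
            total = digits[i] + carry
            digits[i] = total % 10
            carry = total // 10
        while carry:                          # grow to the left if needed
            digits.insert(0, carry % 10)
            carry //= 10
        return digits

    def add(a, b):
        """Step 1: add digit-wise with no carrying. Step 2: normalize."""
        return normalize([x + y for x, y in zip(a, b)])

    def multiply(a, b):
        """Outer product, sums along the diagonals, then normalize."""
        sums = [0] * (len(a) + len(b) - 1)
        for i, x in enumerate(a):
            for j, y in enumerate(b):
                sums[i + j] += x * y
        return normalize(sums)

    print(add([3, 6, 5], [3, 6, 6]))          # [7, 3, 1]
    print(multiply([3, 6, 5], [3, 6, 6]))     # [1, 3, 3, 5, 9, 0]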

the cover of High-Speed Math Self-Taught, by Lester Meyers

I have a great fondness for alternative ways to do arithmetic. One of the favorite things I ever got from my dad was a worn copy of Lester Meyers's High-Speed Math Self-Taught. I don't know how many hours I spent studying that book, practicing its techniques, and developing my own shortcuts. Many of these techniques have the same feel as the vector-based approaches to addition and multiplication: they seem to involve more steps, but the steps are simpler and easier to get right.

A good example of this I remember learning from High-Speed Math Self-Taught is a shortcut for multiplying a number by 12.5: first multiply by 100, then divide by 8. (For example, 12.5 × 48: append two zeros to get 4800, then divide by 8 to get 600.) How can a multiplication and a division be faster than a single multiplication? Well, multiplying by 100 is trivial: just add two zeros to the number, or shift the decimal point two places to the right. The division that remains involves a single-digit divisor, which is much easier than multiplying by a three-digit number in the conventional way. The three-digit number even has its own decimal point, which complicates matters further!

To this day, I use shortcuts that Meyers taught me whenever I'm updating the balance in my checkbook register, calculating a tip in a restaurant, or doing any arithmetic that comes my way. Many people avoid such problems, but I seek them out, because I have fun playing with the numbers.

I am able to have fun in part because I don't have to worry too much about getting a wrong answer. The alternative technique allows me to work not only faster but also more accurately. Being able to work quickly and accurately is a great source of confidence. That's one reason I like the idea of teaching students alternative techniques that separate concerns and thus offer hope for making fewer mistakes. Confident students tend to learn and work faster, and they tend to enjoy learning more than students who are handcuffed by fear.

I don't know if anyone has tried teaching Iverson's APL-style basic arithmetic to children to see if it helps them learn faster or solve problems more accurately. Even if not, it is both a great demonstration of separation of concerns and a solid example of how thinking about a problem differently opens the door to a new kind of solution. That's a useful thing for programmers to learn.

~~~~

Postscript. If anyone has a pointer to a paper or book in which Iverson talks about this approach to arithmetic, I would love to hear from you.

IMAGE: the cover of Meyers's High-Speed Math Self-Taught, 1960. Source: OpenLibrary.org.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

May 29, 2012 2:44 PM

Some Final Thoughts and Links from JRubyConf

You are probably tired of hearing me go on about JRubyConf, so I'll try to wrap up with one more post. After the first few talks, the main result of the rest of the conference was to introduce me to several cool projects and a few interesting quotes.

Sarah Allen speaking on agile business development

Sarah Allen gave a talk on agile business development. Wow, she has been part of creating several influential pieces of software, including AfterEffects, ShockWave, and FlashPlayer. She talked a bit about her recent work to increase diversity among programmers and reminded us that diversity is about more than the categories we usually define:

I may be female and a minority here, but I'm way more like everybody in here than everybody out there.

Increasing diversity means making programming accessible to people who wouldn't otherwise program.

Regarding agile development, Sarah reminded us that agile's preference for working code over documentation is about more than just code:

Working code means not only "passes the tests" but also "works for the customer".

... which is more about being the software they need than simply getting right answers to some tests written in JUnit.

Nate Schutta opened day two with a talk on leading technical change. Like Venkat Subramaniam on day one, Schutta suggested that tech leaders consider management's point of view when trying to introduce new technology, in particular the risks that managers face. If you can tie new technology to the organization's strategic goals and plans, then managers can integrate it better into other actions. Schutta attributed his best line to David Hussman:

Change must happen with people, not to them.

The award for the conference's most entertaining session goes to Randall Thomas and Tammer Saleh for "RUBY Y U NO GFX?", their tag-team exegesis of the history of computer graphics and where Ruby fits into that picture today. They echoed several other speakers in saying that JRuby is the bridge to the rest of the programming world that Ruby programmers need, because the Java community offers so many tools. For example, it had never occurred to me to use JRuby to connect my Ruby code to Processing, the wonderful open-source language for programming images and animations. (I first mentioned Processing here over four years ago in its original Java form, and most recently was thinking of its JavaScript implementation.)

Finally, a few quickies:

  • Jim Remsik suggested Simon Sinek's TED talk, How great leaders inspire action, with the teaser line, It's not what you do; it's why you do it.

  • Yoko Harada introduced me to Nokogiri, a parser for HTML, XML, and the like.

  • Andreas Ronge gave a talk on graph databases as a kind of NoSQL database and specifically about Neo4j.rb, his Ruby wrapper on the Java library Neo4J.

  • I learned about Square, which operates in the #fintech space being explored by the Cedar Valley's own T8 Webware and by Iowa start-up darling Dwolla.

  • rapper Jay Z
    I mentioned David Wood in yesterday's entry. He also told a great story involving rapper Jay-Z, illegal music downloads, multi-million-listener audiences, Coca Cola, logos, and video releases that encapsulated the new media world in which we live. It also gives a very nice example of why Jay-Z will soon be a billionaire, if he isn't already. He gets it.

  • The last talk I attended before hitting the road was by Tony Arcieri, on concurrent programming in Ruby, and in particular his concurrency framework Celluloid. It is based on the Actor model of concurrency, much like Erlang and Scala's Akka framework. Regarding these two, Arcieri said that Celluloid stays truer to the original model's roots than Akka does, by having objects at its core, and that he currently views any differences in behavior between Celluloid and Erlang as bugs in Celluloid.

One overarching theme for me of my time at JRubyConf: There is a lot of stuff I don't know. I won't run out of things to read and learn and do for a long, long time.

~~~~

IMAGE 1: my photo of Sarah Allen during her talk on agile business development. License: Creative Commons Attribution-ShareAlike 3.0 Unported.

IMAGE 2: Jay-Z, 2011. Source: Wikimedia Commons.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

May 28, 2012 10:58 AM

The Spirit of Ruby... and of JRuby

JRubyConf was my first Ruby-specific conference, and one of the things I most enjoyed was seeing how the spirit of the language permeates the projects created by its community of users. It's one thing to read books, papers, and blog posts. It's another to see the eyes and mannerisms of the people using the language to make things they care about. Being a variant, JRuby has its own spirit. Usually it is in sync with Ruby's, but occasionally it diverges.

the letter thnad, created

The first talk after lunch was by Ian Dees, talking about his toy programming language project Thnad. (He took the name from one of the new letters of the alphabet in Dr. Seuss's On Beyond Zebra.) Thnad looks a lot like Klein, a language I created for my compiler course a few years ago, a sort of functional integer assembly language.

The Thnad project is a great example of how easy it is to roll little DSLs using Ruby and other DSLs created in it. To implement Thnad, Dees uses Parslet, a small library for generating scanners and parsers PEG-style, and BiteScript, a Ruby DSL for generating Java bytecode and classes. This talk demonstrated the process of porting Thnad from JRuby to Rubinius, a Ruby implementation written in Ruby. (One of the cool things I learned about the Rubinius compiler is that it can produce s-expressions as output, using the switch -S.)
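
I won't reproduce Thnad's grammar here, but the Parslet style is easy to show with a toy grammar of my own. The parser is a Ruby class, and each rule is a little combinator expression:

    require 'parslet'

    # A made-up mini-grammar for calls like "double(21)", not Thnad itself.
    class TinyParser < Parslet::Parser
      rule(:space)  { match('\s').repeat(1) }
      rule(:space?) { space.maybe }
      rule(:number) { match('[0-9]').repeat(1).as(:number) >> space? }
      rule(:name)   { match('[a-z]').repeat(1).as(:name) >> space? }
      rule(:call)   { name >> str('(') >> space? >> number.as(:argument) >> str(')') }
      root(:call)
    end

    p TinyParser.new.parse("double(21)")
    # => a hash-shaped tree along the lines of
    #    {:name=>"double"@0, :argument=>{:number=>"21"@7}}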

Two other talks exposed basic tenets of the Ruby philosophy and the ways in which implementations such as JRuby and Rubinius create new value in the ecosystem.

On Wednesday afternoon, David Wood described how his company, the Jun Group, used JRuby to achieve the level of performance its on-line application requires. He told some neat stories about the evolution of on-line media over the last 15-20 years and how our technical understanding for implementing such systems has evolved in tandem. Perhaps his most effective line was this lesson learned along the way, which recalled an idea from the keynote address the previous morning:

Languages don't scale. Architectures do. But language and platform affect architecture.

In particular, after years of chafing, he had finally reached peace with one of the overarching themes of Ruby: optimize for developer ease and enjoyment, rather than performance or scalability. This is true of the language and of most of the tools built around it, such as Rails. As a result, Ruby makes it easy to write many apps quickly. Wood stopped fighting the lack of emphasis on performance and scalability when he realized that most apps don't succeed anyway. If one does, you have to rewrite it anyway, so suck it up and do it. You will have benefited from Ruby's speed of delivery.

This is the story of Twitter, apparently, and it is what Wood's team did. They spent three person-months to port their app from MRI to JRuby, and are now quite happy.

Where does some of that performance bump come from? Concurrency. Joe Kutner gave a talk after the Thnad session on Tuesday afternoon about using JRuby to deploy efficient Ruby web apps on the JVM, in which he also exposed a strand of Ruby philosophy and a place where JRuby diverges.

The canonical implementations of Ruby and Python use a Global Interpreter Lock to ensure that non-thread-safe code does not interfere with the code in other threads. In effect, the interpreter maps all threads onto a single thread in the kernel. This may seem like an unnecessary limitation, but it is consistent with Matz's philosophy for Ruby: Programming should be fun and easy. Concurrency is hard, so don't allow it to interfere with the programmer's experience.

Again, this works just fine for many applications, so it's a reasonable default position for the language. But it does not work so well for web apps, which can't scale if they can't spawn new, independent threads. This is a place where JRuby offers a big win by running atop the JVM, with its support for multithreading. It's also a reason why the Kilim fibers GSoC project mentioned by Charles Nutter in the State of JRuby session is so valuable.
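
A tiny Ruby sketch makes the difference concrete. The thread code below is the same everywhere, but MRI's lock runs the two CPU-bound blocks one at a time, while JRuby can run them on two cores at once:

    # Two threads doing pure computation. On MRI they take turns under the
    # GIL; on JRuby they map to JVM threads and can truly run in parallel.
    def busy_work
      x = 0
      10_000_000.times { x += 1 }
      x
    end

    threads = 2.times.map { Thread.new { busy_work } }
    threads.each(&:join)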

In this talk, I learned about three different approaches to delivering Ruby apps on the JVM:

  • Warbler, a light and simple tool for packaging .war files,
  • Trinidad, which is a JRuby wrapper for a Tomcat server, and
  • TorqueBox, an all-in-one app server that appears to be the hot new thing.

Links, links, and more links!

Talks such as these reminded me of the feeling of ease and power that Ruby gives developers, and the power that language implementors have to shape the landscape in which programmers work. They also gave me a much better appreciation for why projects like Rubinius and JRuby are essential to the Ruby world because of -- not despite -- deviating from a core principle of the language.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

May 25, 2012 4:07 PM

JRubyConf, Day 1: The State of JRuby

Immediately after the keynote address, the conference really began for me. As a newcomer to JRuby, this was my first chance to hear lead developers Charles Nutter and Tom Enebo talk about the language and community. The program listed this session as "JRuby: Full of Surprises", and Nutter opened with a slide titled "Something About JRuby", but I just thought of the session as a "state of the language" report.

Nutter opened with some news. First, JRuby 1.7.0.preview1 is available. The most important part of this for me is that Ruby 1.9.3 is now the default language mode for the interpreter. I still run Ruby 1.8.7 on my Macs, because I have never really needed more and that kept my installations simple. It will be nice to have a 1.9.3 interpreter running, too, for cases where I want to try out some of the new goodness that 1.9 offers.

Second, JRuby has been awarded eight Google Summer of Code placements for 2012. This was noteworthy because there were no Ruby projects at all in 2010 or 2011, for different reasons. Several of the 2012 projects are worth paying attention to:

  • creating a code generator for Dalvik byte code, which will give native support for JRuby on Android
  • more work on Ruboto, the current way to run Ruby on Android, via Java
  • implementing JRuby fibers using Kilim fibers, for lighter-weight and faster concurrency than Java threads can provide
  • work on krypt, "SSL done right" for Ruby, which will eliminate the existing dependence on OpenSSL
  • filling in some of the gaps in the graphics framework Shoes, both Swing and SWT versions

Charles Nutter discussing details of the JRuby compiler

Enebo then described several projects going on with JRuby. Some are smaller, including closing gaps in the API for embedding Ruby code in Java, and Noridoc, a tool for generating integrated Ruby documentation for Ruby and Java APIs that work together. Clever -- "No ri doc".

One project is major: work on the JRuby compiler itself. This includes improvements to the intermediate representation used by JRuby, separating more cleanly the stages of the compiler, and writing a new, better run-time system. I didn't realize until this talk just how much overlap there is in the existing scanner, parser, analyzer, and code generator of JRuby. I plan to study the JRuby compiler in some detail this summer, so this talk whetted my appetite. One of the big challenges facing the JRuby team is to identify execution hot spots that will allow the compiler to do a better job of caching, inlining, and optimizing byte codes.

This led naturally into Nutter's discussion of the other major project going on: JRuby support for the JVM's new invokedynamic instruction. He hailed invokedynamic as "the most important change to the JVM -- ever". Without it, JRuby's method invocation logic is opaque to the JVM optimizer, which prevents caching and inlining. With it, the JRuby compiler will be able not only to generate optimizable function calls but also to treat instance variables and constants more efficiently.

Charles Nutter donning his new RedHat

Nutter showed some performance data comparing JRuby to MRI Ruby 1.9.3 on some standard benchmarks. Running on Java 6, JRuby is between 1.3 and 1.9 times faster than the C-based compiler on the benchmark suites. When they run it on Java 7, performance jumps to speed-ups of between 2.6 and 4.3. That kind of speed is enough to make JRuby attractive for many compute-intensive applications.

Just as Nutter opened with news, he closed with news. He and Enebo are moving to RedHat. They will work with various RedHat and JBoss teams, including TorqueBox, which I'll mention in an upcoming JRubyConf post. Nutter and Enebo have been at EngineYard for three years, following a three-year stint at Sun. It is good to know that, as the corporate climate around Java and Ruby evolves, there is usually a company willing to support open-source JRuby development.

~~~~

IMAGE 1: my photo of Charles Nutter talking about some details of the JRuby compiler. License: Creative Commons Attribution-ShareAlike 3.0 Unported.

IMAGE 2: my photo of Charles Nutter and Tom Enebo announcing their move to RedHat. License: Creative Commons Attribution-ShareAlike 3.0 Unported.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

May 24, 2012 3:05 PM

JRubyConf 2012: Keynote Address on Polyglot Programming

JRubyConf opened with a keynote address on polyglot programming by Venkat Subramaniam. JRuby is itself a polyglot platform, serving as a nexus between a highly dynamic scripting language and a popular enterprise language. Java is not simply a language but an ecosphere consisting of language, virtual machine, libraries, and tools. For many programmers, the Java language is the weakest link in its own ecosphere, which is one reason we see so many people porting their favorite languages to run on the JVM, or creating new languages with the JVM as a native backend.

Subramaniam began his talk by extolling the overarching benefits of being able to program in many languages. Knowing multiple programming languages changes how we design software in any language. It changes how we think about solutions. Most important, it changes how we perceive the world. This is something that monolingual programmers often do not appreciate. When we know several languages well, we see problems -- and solutions -- differently.

Why learn a new language now, even if you don't need to? So that you can learn a new language more quickly later, when you do need to. Subramaniam claimed that the amount of time required to learn a new language is inversely proportional to the number of languages a person has learned in the last ten years. I'm not sure whether there is any empirical evidence to support this claim, but I agree with the sentiment. I'd offer one small refinement: The greatest benefits come from learning different kinds of language. A new language that is too much like the ones you already know won't stretch your mind.

Not everything is heaven for the polyglot programmer. Subramaniam also offered some advice for dealing with the inevitable downsides. Most notable among these was the need to "contend with the enterprise".

Many companies like to standardize on a familiar and well-established technology stack. Introducing a new language into the mix raises questions and creates resistance. Subramaniam suggested that we back up one step before trying to convince our managers to support a polyglot environment and make sure that we have convinced ourselves. If you were really convinced of a language's value, you would find a way to do it. Then, when it comes time to convince your manager, be sure to think about the issue from her perspective. Make sure that your argument speaks to management's concerns. Identify the problem. Explain the proposed solution. Outline the costs of the solution. Outline its benefits. Show how the move can be a net win for the organization.

The nouveau JVM languages begin with a head start over other new technologies because of their interoperability with the rest of the Java ecosphere. They enable you to write programs in a new language or style without having to move anyone else in the organization. You can experiment with new technology while continuing to use the rest of the organization's toolset. If the experiments succeed, managers can have hard evidence about what works well and what doesn't before making larger changes to the development environment.

I can see why Subramaniam is a popular conference speaker. He uses fun language and coins fun terms. When talking about people who are skeptical of unit testing, he referred to some processes as Jesus-driven development. He admonished programmers who are trying to do concurrency in JRuby without knowing the Java memory model, because "If you don't understand the Java memory model, you can't get concurrency right." But he followed that immediately with, Of course, even if you do know the Java memory model, you can't get concurrency right. Finally, my favorite: At one point, he talked about how some Java developers are convinced that they can do anything they need to do in Java, with no other languages. He smiled and said, I admire Java programmers. They share an unrelenting hope.

There were times, though, when I found myself wanting to defend Java. That happens to me a lot when I hear talks like this one, because so many complaints about it are really about OOP practiced poorly; Java is almost an innocent bystander. For example, the speaker chided Java programmers for suffering from primitive obsession. This made me laugh, because most Ruby programmers seem to have an unhealthy predilection for strings, hashes, and integers.

In other cases, Subramaniam demonstrated the virtues of Ruby by showing a program that required a gem and then solved a thorny problem with three lines of code. Um, I could do that in Java, too, if I used the right library. And Ruby programmers probably don't want to draw too much attention to gems and the problems many Ruby devs have with dependency management.

But those are small issues. Over the next two days, I repeatedly heard echoes of Subramaniam's address in the conference's other talks. This is the sign of a successful keynote.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

May 22, 2012 7:53 PM

A Few Days at JRubyConf

It's been fourteen months since I last attended a conference. I decided to celebrate the end of the year, the end of my compiler course, and the prospect of writing a little code this summer by attending JRubyConf 2012. I've programmed a fair amount in Ruby but have only recently begun to play with JRuby, an implementation of Ruby in Java which runs atop the JVM. There are some nice advantages to this, including the ability to use Java graphics with Ruby models and the ability to do real concurrency. It also offers me a nice combination for the summer. I will be teaching our sophomore-level intermediate computing course this fall, which focuses in large part on OO design and Java implementation, so JRuby will let me program in Ruby while doing a little class prep at the same time.

the Stone Arch Bridge in Minneapolis

Conference organizer Nick Sieger opened the event with the obligatory welcome remarks. He said that he thinks the overriding theme of JRubyConf is being a bridge. This is perhaps a natural effect of Minneapolis, a city of many bridges, being the hometown of JRuby, its lead devs, and the conference. The image above is of the Stone Arch Bridge, as seen from the ninth level of the famed Guthrie Center, the conference venue. (The yellow tint is from the window itself.)

The goal for the conference is to be a bridge connecting people to technologies. But it also aims to be a bridge among people, promoting what Sieger called "a more sensitive way of doing business". Emblematic of this goal were its Sunday workshop, a Kids CodeCamp, and its Monday workshop, Railsbridge. This is my first open-source conference, and when I look around I see the issue that so many people talk about. Of 150 or so attendees, there must be fewer than one dozen women and fewer than five African-Americans. The computing world certainly has room to make more and better connections into the world.

My next few entries will cover some of the things I learn at the conference. I start with a smile on my face, because the conference organizers gave me a cookie when I checked in this morning:

the sugar cookie JRubyConf gave me at check-in

That seems like a nice way to say 'hello' to a newcomer.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

May 14, 2012 3:26 PM

Lessons Learned from Another Iteration of the Compiler Course

I am putting the wrap on spring semester, so that I can get down to summer duties and prep for fall teaching. Here are a few lessons I learned this spring.

•  A while back, I wrote briefly about re-learning the value of a small source language for the course. If I want to add a construct or concept, then I also need to subtract a corresponding load from the language. In order to include imperative features, I need to eliminate recursive functions, or perhaps eliminate functions altogether. Eliminating recursion may be sufficient, as branching to a function is not much more complex than branching in loops. It is the stack of activation records that seems to slow down most students.

•  Using a free on-line textbook worked out okay. The main problem was that this particular book contained less implementation detail than books we have used in the past, such as Louden, and that hurt the students. We used Louden's TM assembly language and simulator, and the support I gave them for that stage of the compiler in particular was insufficient. The VM and assembly language themselves are simple enough, but students wanted more detailed examples of a code generator than I gave them.

•  I need to teach code generation better. I felt that way as the end of the course approached, and then several students suggested as much in our final session review. This is the most salient lesson I take from this iteration of the course.

I'm not sure at this moment if I need only to do a better job of explaining the process or if I need a different approach to the task more generally. That's something I'll need to think about between now and next time. I do think that I need to show them how to implement function calls in a bit more detail. Perhaps we could spend more time in class with statically-allocated activation records, and then let the students extend those ideas for a run-time stack and recursion.

•  For the first time ever, a few students suggested that I require something simpler than a table-driven parser. Of course, I can address several issues with parsing and code generation by using scaffolding: parser generators, code-generation frameworks and the like. But I still prefer that students write a compiler from scratch, even if only a modest one. There is something powerful in making something from scratch. A table-driven parser is a nice blend of simplicity (in algorithm) and complexity (in data) for learning how compilers really work.

I realize that I have to draw the abstraction line somewhere, and even after several offerings of the course I am still willing to draw it there. To make that choice work as well as possible, I may have to improve other parts of the course.

•  Another student suggestion that seems spot-on is that, as we learn each stage of the compiler, we take some time to focus on specific design decisions that the teams will have to make. This will allow them, as they said in their write-ups, "to make informed decisions". I do try to introduce key implementation decisions that they face and offer advice on how to proceed. Clearly I can do better. One way, I think, is to connect more directly with the programming styles they are working in.

~~~~

As usual, the students recognized some of the same shortcomings of the course that I noticed and suggested a couple more that had not occurred to me. I'm always glad I ask for their feedback, both open and anonymous. They are an indispensable source of information about the course.

Writing your first compiler is a big challenge. I can't help but recall something writer Samuel Delany said when asked whether it was fun to write a set of novellas on the order of Eliot's The Waste Land, Pound's The Cantos, and Joyce's Ulysses:

No, not at all. ... But at least now, when somebody asks, "I wonder if Joyce could have done all the things he's supposed to have done in Ulysses," I can answer, "Yes, he could have. I know, because I tried it myself. It's possible."

Whatever other virtues there are in learning to write a compiler, it is valuable for computer science students to take on big challenges and know that it is possible to meet the challenge, because they have tried it themselves.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 01, 2012 8:56 AM

The Pleasure in a Good Name

Guile is a Scheme. It began life as GEL, which stood for GNU Extension Language. This brief history of Guile explains the change:

Due to a naming conflict with another programming language, Jim Blandy suggested a new name for GEL: "Guile". Besides being a recursive acronym, "Guile" craftily follows the naming of its ancestors, "Planner", "Conniver", and "Schemer". (The latter was truncated to "Scheme" due to a 6-character file name limit on an old operating system.) Finally, "Guile" suggests "guy-ell", or "Guy L. Steele", who, together with Gerald Sussman, originally discovered Scheme.

That is how you name a language, making it significant on at least three levels. Recursive acronyms are a staple of the GNU world, beginning with GNU itself. Guile recurses as the GNU Ubiquitous Intelligent Language for Extensions. Synonymy with Planner and Conniver keeps alive the historical connection to artificial intelligence, and is reinforced by the use of intelligent in the acronym. Finally, the homophonic connection to Steele is pure genius.

I bow to you, Mr. Blandy.

(While we are talking about words, I must say that I love the author's use of discovered in the passage quoted above. Most people say that Steele and Sussman "created" Scheme, or "designed" it, or "invented" it. But if you read Steele's and Gabriel's The Evolution of Lisp, you will see that discovery is a better label for what happened. Scheme was lurking in the implementation of Lisp's apply primitive and Carl Hewitt's theory of actors. Hewitt, of course, created Planner, which is another connection back to Guile.)


Posted by Eugene Wallingford | Permalink | Categories: Computing

April 24, 2012 4:55 PM

Recursive Discussions about Recursion

The SIGCSE listserv has erupted today with its seemingly annual discussion of teaching recursion. I wrote about one of the previous discussions a couple of years ago. This year's conversation has actually included a couple of nice ideas, so it was worth following along.

Along the way, one prof commented on an approach he has seen used to introduce students to recursion, often in a data structures course. First you cover factorial, gcd, and the Fibonacci sequence. Then you cover the Towers of Hanoi and binary search. Unfortunately, such an approach is all too common. The poster's wistful analysis:

Of the five problems, only one (binary search) is a problem students might actually want to solve. Only two (Fibonacci and Hanoi) are substantially clearer in recursive than iterative form, and both of them take exponential time. In other words, recursion is a confusing way to solve problems you don't care about, extremely slowly.

Which, frankly, I think is the lesson [some CS profs] want to convey.

And this on a day when I talked with my compiler students about how a compiler can transform many recursive programs into iterative ones, and even eliminate the cost of a non-recursive function call when it is in a tail position.

The quoted passage contains my favorite line of the week thus far: In other words, recursion is a confusing way to solve problems you don't care about, extremely slowly. If that's not the message you want to convey to your students, then please don't introduce them to recursion in this way. If that is the message you want to send to your students, then I am sad for you, but mostly for your students.

I sometimes wonder about the experiences that some computer scientists bring to the classroom. It only takes a little language processing to grok the value of recursion. And a data structures course is a perfectly good place for students to see and do a little language processing. Syntax, abstract or not, is a great computer science example of trees. And students get to learn a little more CS to boot!
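
Here is the sort of thing I mean, in a few lines of Ruby: a tiny evaluator for arithmetic expressions stored as nested arrays. The recursion follows the shape of the tree, which is exactly the lesson we want students to absorb.

    # Evaluate an expression tree such as [:+, 3, [:*, 4, 5]].
    def evaluate(expr)
      return expr if expr.is_a?(Numeric)       # a leaf is just a number
      op, left, right = expr
      l = evaluate(left)                       # recurse on the subtrees...
      r = evaluate(right)
      case op                                  # ...then combine the results
      when :+ then l + r
      when :* then l * r
      end
    end

    puts evaluate([:+, 3, [:*, 4, 5]])         # => 23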


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

April 15, 2012 8:10 PM

Learning From the Wheel of Reincarnation

Last week I was pointed in the direction of a forty-five-year-old CACM paper. It had an innocuous title, On the Design of Display Processors, and was outside my areas of primary interest, so I might well have passed it by. But it was co-written by Ivan Sutherland, whose work Alan Kay has praised often, and it was recommended by someone on Kay's Fundamentals of New Computing mailing list, where a lot of neat ideas are discussed. So I printed it out for my daily exercise bike ride. I'm glad I did.

Myer and Sutherland tell the story of needing a display system for a research computer. That was a bigger deal in 1967 than it is today, so they did some research of their own....

Finally we decided to design the processor ourselves, because only in this way, we thought, could we obtain a truly complete display processor. We approached the task by starting with a simple scheme and adding commands and features that we felt would enhance the power of the machine. Gradually the processor became more complex. We were not disturbed by this because computer graphics, after all, are complex. Finally the display processor came to resemble a full-fledged computer with some special graphics features. And then a strange thing happened. We felt compelled to add to the processor a second, subsidiary processor, which, itself, began to grow in complexity.

It was then that we discovered a disturbing truth. Designing a display processor can become a never-ending cyclical process. In fact, we found the process so frustrating that we have come to call it the "wheel of reincarnation." We spent a long time trapped on that wheel before we finally broke free. In the remainder of this paper we describe our experiences. We have written it in the hope that it may speed others on toward "Nirvana."

A mantra from the paper characterizes the authors' time on the wheel: "For just a little more money...". I'll bet that sounds familiar to a lot of researchers, not to mention all of us who buy computing equipment for labs and end users.

I was really happy to read this paper. It's an experience report in which the authors share honestly the mistakes they made. But they paid attention, recognized a pattern, and learned from it. Even better, they wrote what they learned, in hopes of teaching the rest of us.

The wheel of reincarnation is not limited to display design or hardware design. It occurs any place where we encounter complexity. We try to tame it, first by specializing and then by generalizing. The design of programming languages is just as prone to this dizzying cycle.

(In software design, we have a related phenomenon, captured in Greenspun's Tenth Rule.)

In language design, we almost have to look for a fixed point at which we stabilize the pendulum between general and specialized. What we most often need as users is the ability to grow systems gracefully over time. This speaks to the value of a good macro system and good compiler design.

Reading this paper reminded me of a couple of lessons I've learned over the years:

  1. I should read and watch everything I can get my hands on from Sutherland, Robert Floyd, and other computer scientists of their generation. They were solving hard problems long ago, in the face of resource limitations few of us can imagine today.
  2. As someone (@fogus, I think) tweeted recently, reading these old papers makes me think that I will never have an original idea in my life. But then they also teach me a lot and prepare me to have more and sometimes better ideas.
  3. I need to do my best to hang around with smart, curious people. It's old advice, I know, but it requires action and, in some environments, constant vigilance. Simply eavesdropping on the FONC mailing list raises the level of my brain's activity by a few degrees.

These papers also remind us of a valuable role that academics can play in the software and computing worlds, which are also heavily influenced by industry practitioners. We need to keep papers like these alive, so that the smart and curious people in our classes and in industry will read them. We never know when two ideas will crash into each other and lead to something new and better.


Posted by Eugene Wallingford | Permalink | Categories: Computing

April 11, 2012 4:06 PM

What Penn and Teller Have in Common With a Compilers Course

Early this morning (and I mean early!), Alfred Thompson posted What Do Magic and Computer Science Have in Common?, relaying that Alex Suter of Industrial Light & Magic will give the closing keynote at this summer's Computer Science and Information Technology conference. That sounds pretty cool. The title of his entry conjured up other thoughts for me, though, especially in light of something I said in class yesterday.

Recently, I used a superhero reference in a blog entry. That is how many people feel when they use a program to accomplish something meaningful -- like a superhero. I feel that way, too, sometimes. However, like many other people, I am more prone to magical imagery. To someone who has not learned to code, a program is like an incantation, capable of making the computer do something mystical.

There is a long tradition of magical imagery in computer science. The Hacker's Dictionary tells us that a wizard is someone who knows how a complex piece of software or hardware works (that is, who groks it). A wizard can do things that hackers and mere mortals cannot. The entry for "wizard" has links to other magical jargon, such as heavy wizardry, incantation, and magic itself.

Structure and Interpretation of Computer Programs

I tell my Programming Languages students that this is a course in which the "high priests" of computer science reveal their secrets, that after the course students will understand the magic embodied in the interpreters and compilers that process their programs. I should probably refer to wizards, rather than high priests, given that so many of the course's ideas are covered masterfully in SICP.

Lots of CS courses reveal magic, or propose it. A course in compilers finishes the job of Programming Languages, driving program translation all the way down to hardware. Artificial Intelligence describes our best ideas for how to make a computer do human-like magic: reasoning, recognizing patterns, learning, and playing Jeopardy!.

In my compilers class yesterday, I was showing my students a technique for generating three-address code from an abstract syntax tree, based in large part on ideas found in the Dragon book (surely not a coincidence). I wrote on the board this template for a grammar rule denoting addition:

    E → E1 + E2

        E.place := makeNewTemporaryIdentifier()
        E.code  := [ E1.code ]
                   [ E2.code ]
                   emitCode( E.place " := " E1.place " + " E2.place )

When I finished, I stepped back, looked it over, and realized again just how un-magical that seems. Indeed, when written in black on white, it looks pretty pedestrian.
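
Pedestrian or not, it runs. Here is the same template rendered as a small Ruby sketch of my own; the node format and class name are invented for the example, not taken from the Dragon book:

    # A node is either a variable name (a String) or [:+, left, right].
    class Generator
      attr_reader :code

      def initialize
        @code  = []
        @temps = 0
      end

      # Returns the node's "place" and appends its instructions to @code.
      def generate(node)
        return node if node.is_a?(String)              # E.place for a leaf
        _, left, right = node
        lplace = generate(left)                        # [ E1.code ]
        rplace = generate(right)                       # [ E2.code ]
        place  = "t#{@temps += 1}"                     # makeNewTemporaryIdentifier()
        @code << "#{place} := #{lplace} + #{rplace}"   # emitCode(...)
        place
      end
    end

    g = Generator.new
    g.generate([:+, "a", [:+, "b", "c"]])
    puts g.code
    # t1 := b + c
    # t2 := a + t1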

the magician duo of Penn and Teller

That made me think of another connection between magic and computer science, one that applies to practitioners and outsiders alike. Taking an AI course or a compilers course is like having Penn and Teller explain to you how they made a person disappear or a ball levitate in thin air. For some people, that kills any joy they might have in watching the act. They don't want to know. They want to be amazed. And, knowing that something is implemented -- often in a way that doesn't seem especially artful, performed with an obvious misdirection -- prevents them from being amazed.

That can happen in CS, too. My friends and I came in to our AI course wide-eyed and wanting to be amazed -- and to build amazing things. We studied search and logical resolution, Bayes' Theorem and decision tree induction. And it all looked so... pedestrian. Many of my friends lost their sense of wonder then and there. Without the magic of AI, they were just as interested in operating systems or databases. More interested, really, because AI had let them down. It all looked like parlor tricks.

But there is a second kind of person in the world. Lots of people love to watch Penn and Teller explain a trick. They want to know how it works. They want to watch again, knowing how it works, to see if they can notice the deception. If they don't, they are amazed again. If they do, though, they still feel wonder -- at the skill of the magicians, at the design of the illusion, and even at the way their mind wants to trick them at the moment of execution.

I am of this second kind, for magic and especially for computer science. Yes, I know that compilers and AI programs are, at their cores, implemented using techniques that don't always look all that impressive in the light of day. Sometimes, those techniques look pretty boring indeed.

Yet I am still amazed when a C compiler takes in my humble instructions and creates machine code that compresses a musical file or fills out my tax return. I am amazed when Watson crunches through gazillions of bytes of data in a second or two and beats two most worthy human opponents to the buzzer. I like to hear my friends and colleagues de-mystify their current projects with nitty-gritty details. I still feel wonder -- at their skill, at the cleverness of their designs, and even at the moment the program runs and makes something out of what seems like nothing.

That's one thing magic and computer science have in common.

~~~~

IMAGE 1: a photo of the cover of Structure and Interpretation of Computer Programs. Source: the book's website.

IMAGE 2: a publicity still of magicians Penn and Teller. Source: All-About-Magicians.com.


Posted by Eugene Wallingford | Permalink | Categories: Computing

April 06, 2012 4:29 PM

A Reflection on Alan Turing, Representation, and Universal Machines

Douglas Hofstadter speaking at UNI

The day after Douglas Hofstadter spoke here on assertions, proofs, and Gödel's theorem, he gave a second public lecture hosted by the philosophy department. Ahead of time, we knew only that Hofstadter would reflect on Turing during his centennial. I went in expecting more on the Turing test, or perhaps a popular talk on Turing's proof of the Halting Problem. Instead, he riffed on Chapter 17 from I Am a Strange Loop.

In the end, we are self-perceiving, self-inventing, locked-in mirages that are little miracles of self-reference.

Turing, he said, is another peak in the landscape occupied by Tarski and Gödel, whose work he had discussed the night before. (As a computer scientist, I wanted to add to this set contemporaries such as Alonzo Church and Claude Shannon.) Hofstadter mentioned Turing's seminal paper about the Entscheidungsproblem but wanted to focus instead on the model of computation for which he is known, usually referred to by the name "Turing machine". In particular, he asked us to consider a key distinction that Turing made when talking about his model: that between dedicated and universal machines.

A dedicated machine performs one task. Human history is replete with dedicated machines, whether simple, like the wheel, or complex, such as a typewriter. We can use these tools with different ends in mind, but the basic work is fixed in their substance and structure.

The 21st-century cell phone is, in contrast, a universal machine. It can take pictures, record audio, and -- yes -- even be used as a phone. But it can also do other things for us, if we but go to the app store and download another program.

Hofstadter shared a few of his early personal experiences with programs enabling line printers to perform tasks for which they had not been specifically designed. He recalled seeing a two-dimensional graph plotted by "printing" mostly blank lines that contained a single *. Text had been turned into graphics. Taking the idea further, someone used the computer to print a large number of cards which, when given to members of the crowd at a football game, could be used to create a massive two-dimensional message visible from afar. Even further, someone used a very specific layout of the characters available on the line printer to produce a print-out that appeared from the other side of the room to be a black-and-white photograph of Raquel Welch. Text had been turned into image.

People saw each of these displays as images by virtue of our eyes and mind interpreting a specific configuration of characters in a certain way. We can take that idea down a level into the computer itself. Consider this transformation of bits:

0000 0000 0110 1011 → 0110 1011 0000 0000

A computer engineer might see this as a "left shift" of 8 bits. A computer programmer might see it as multiplying the number on the left by 256. A graphic designer might see us moving color from one pixel to another. A typesetter may see one letter being changed into another. What one sees depends on how one interprets what the data represent and what the process means.

Alan Turing was the first to express clearly the idea that a machine can do them all.

"Aren't those really binary numbers?", someone asked. "Isn't that real, and everything else interpretation?" Hofstadter said that this is a tempting perspective, but we need to keep in mind that they aren't numbers at all. They are, in most computers, pulses of electricity, or the states of electronic components, that we interpret as 0s and 1s.

After we have settled on interpreting those pulses or states as 0s and 1s, we then interpret configurations of 0s and 1s to mean something else, such as decimal numbers, colors, or characters. This second level of interpretation exposes the flaw in popular claims that computers can "only" process 0s and 1s. Computers can deal with numbers, colors, or characters -- anything that can be represented in any way -- when we interpret not only what the data mean but also what the process means.

(In the course of talking about representations, he threw in a cool numeric example: given an integer N, factor it as 2^a * 3^b * 5^c * 7^d ... and use [a.b.c.d. ...] to stand for N. I see a programming assignment or two lying in wait.)
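
For anyone who wants the assignment, here is one version in Ruby, limited to a handful of primes just to show the idea:

    PRIMES = [2, 3, 5, 7, 11, 13]   # enough primes for a small demo

    # Encode n as the exponents [a, b, c, ...] in 2^a * 3^b * 5^c * ...
    def encode(n)
      PRIMES.map do |p|
        exponent = 0
        while (n % p).zero?
          n /= p
          exponent += 1
        end
        exponent
      end
    end

    # Decode by multiplying the primes back together.
    def decode(exponents)
      PRIMES.zip(exponents).reduce(1) { |product, (p, e)| product * p**e }
    end

    p encode(360)                 # => [3, 2, 1, 0, 0, 0]
    p decode([3, 2, 1, 0, 0, 0])  # => 360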

The dual ideas of representation and interpretation take us into a new dimension. The Principia Mathematica describes a set of axioms and formal rules for reasoning about numeric structures. Gödel saw that it could be viewed at a higher level, as a system in its own right -- as a structure of integers. Thus the Principia can talk about itself. It is, in a sense, universal.

This is the launching point for Turing's greatest insight. In I Am a Strange Loop, Hofstadter writes:

Inspired by Gödel's mapping of PM into itself, Alan Turing realized that the critical threshold for this kind of computational universality comes exactly at the point where a machine is flexible enough to read and correctly interpret a set of data that describes its own structure. At this crucial juncture, a machine can, in principle, explicitly watch how it does any particular task, step by step. Turing realized that a machine that has this critical level of flexibility can imitate any other machine, no matter how complex the latter is. In other words, there is nothing more flexible than a universal machine. Universality is as far as you can go!

Alan Turing

Thus Turing was the first person to recognize the idea of a universal machine, circa 1935-1936: that a Turing machine can be given, as input, data that encodes its own instructions. This is the beginning of perhaps the biggest of the Big Ideas of computer science: the duality of data and program.

We should all be glad he didn't patent this idea.

Turing didn't stop there, of course, as I wrote in my recent entry on the Turing test. He recognized that humans are remarkably capable and efficient representational machines.

Hofstadter illustrates this with the idea of "hub", a three-letter word that embodies an enormous amount of experience and knowledge, chunked in numerous ways and accreted slowly over time. The concept is assembled in our minds out of our experiences. It is a representation. Bound up in that representation is an understanding of ourselves as actors in certain kinds of interactions, such as booking a flight on an airplane.

It is this facility with representations that distinguishes us humans from dogs and other animals. They don't seem capable of seeing themselves or others as representations. Human beings, though, naturally take other people's representations into their own. This results in a range of familiarities and verisimilitude. We "absorb" some people so well that we feel we know them intimately. This is what we mean when we say that someone is "in our soul". We use the word 'soul' not in a religious sense; we are referring to our essence.

Viewed this way, we are all distributed beings. We are "out there", in other people, as well as "in here", in ourselves. We've all had dreams of the sort Hofstadter used as example, a dream in which his deceased father appeared, seemingly as real as he ever had been while alive. I myself recently dreamt that I was running, and the experience of myself was as real as anything I feel when I'm awake. Because we are universal machines, we are able to process the representations we hold of ourselves and of others and create sensations that feel just like the ones we have when we interact in the world.

It is this sense that we are self-representation machines that gives rise to the title of his book, "I am a strange loop". In Hofstadter's view, our identity is a representation of self that we construct, like any other representation.

This idea underlies the importance of the Turing test. It takes more than "just syntax" to pass the test. Indeed, syntax is itself more than "just" syntax! We quickly recurse into the dimension of representation, of models, and a need for self-reference that makes our syntactic rules more than "just" rules.

Indeed, as self-representation machines, we are able to have a sense of our own smallness within the larger system. This can be scary, but also good. It makes life seem precious, so we feel a need to contribute to the world, to matter somehow.

Whenever I teach our AI course, I encounter students who are, for religious or philosophical reasons, deeply averse to the idea of an intelligent machine, or even of scientific explanations of who we are. When I think about identity in terms of self-representation, I can't help but feel that, at an important level, it does not matter. God or not, I am in awe of who we are and how we got to here.

So, we owe Alan Turing a great debt. Building on the work of philosophers, mathematicians, and logicians, Turing gave us the essential insight of the universal machine, on which modern computing is built. He also gave us a new vocabulary with which to think about our identity and how we understand the world. I hope you can appreciate why celebrating his centennial is worthwhile.

~~~~

IMAGE 1: a photo of Douglas Hofstadter speaking at UNI, March 7, 2012. Source: Kevin C. O'Kane.

IMAGE 2: the Alan Turing centenary celebration. Source: 2012 The Alan Turing Year.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

April 04, 2012 4:39 PM

Computational Search Answers an Important Question

Update: Well, this is embarrassing. Apparently, Mat and I were the victims of a prank by the folks at ChessBase. You'd think that, after more than twenty-five years on the internet, I would be more circumspect at this time of year. Rather than delete the post, I will leave it here for the sake of posterity. If nothing else, my students can get a chuckle from their professor getting caught red-faced.

I stand behind my discussion of solving games, my recommendation of Rybka, and my praise for My 60 Memorable Games (my favorite chess book of all time). I also still marvel at the chess mind of Bobby Fischer.

~~~~

Thanks to reader Mat Roberts for pointing me to this interview with programmer Vasik Rajlich, which describes a recent computational result of his: one of the most famous openings in chess, the King's Gambit, is a forced draw.

Games are, of course, a fertile testbed for computing research, including AI and parallel computation. Many researchers make one of their goals to "solve" a game, that is, to show that, with best play by both players, a game has a particular outcome. Games with long histories and large communities of players naturally attract a lot of interest, and solving one of them is usually considered a valuable achievement.

For us in CS, interest grows with the complexity of the game. Solving Connect Four was cool, but solving Othello on a full-sized board would be cooler. Almost five years ago, I blogged about what I still consider the most impressive result in this domain: the solving of checkers by Jonathan Schaeffer and his team at the University of Alberta.

the King's Gambit

The chess result is more limited. Rajlich, an International Master of chess and the programmer of the chess engine Rybka, has shown results only for games that begin 1.e4 e5 2.f4 exf4. If White plays 3.Nf3 -- the most common next move -- then Black can win with 3... d6. 3.Bc4 also loses. Only one move for White can force a draw, the uncommon 3.Be2. Keep in mind that these results all assume best play by both players from there on out. White can win, lose, or draw in all variations if either player plays a sub-optimal move.

I say "only" when describing this result because it leaves a lot of chess unsolved, all games starting with some other sequence of moves. Yet the accomplishment is still quite impressive! The King's Gambit is one of the oldest and most storied opening sequences in all of chess, and it remains popular to this day among players at every level of skill.

Besides, consider the computational resources that Rajlich had to use to solve even the King's Gambit:

... a cluster of computers, currently around 300 cores [created by Lukas Cimiotti, hooked up to] a massively parallel cluster of IBM POWER 7 Servers provided by David Slate, senior manager of IBM's Semantic Analysis and Integration department -- 2,880 cores at 4.25 GHz, 16 terabytes of RAM, very similar to the hardware used by IBM's Watson in winning the TV show "Jeopardy". The IBM servers ran a port of the latest version of Rybka, and computation was split across the two clusters, with the Cimiotti cluster distributing the search to the IBM hardware.

Oh, and this setup had to run for over four months to solve the opening. I call that impressive. If you want something less computationally intensive yet still able to beat you, me, and everybody we know at chess, you can buy Rybka, a chess engine available commercially. (An older version is available for free!)

What effect will this result have on human play? Not much, practically speaking. Our brains aren't big enough or fast enough to compute all the possible paths, so human players will continue to play the opening, create new ideas, and explore the action in real time over the board. Maybe players with the Black pieces will be more likely to play one of the known winning moves now, but results will remain uneven between White and Black. The opening leads to complicated positions.

the cover of Bobby Fischer's 'My 60 Memorable Games'

If, like some people, you worry that results such as this one somehow diminish us as human beings, take a look again at the computational resources that were required to solve this sliver of one game, the merest sliver of human life, and then consider: This is not the first time that someone claimed the King's Gambit busted. In 1961, an eighteen-year-old U.S. chess champion named Bobby Fischer published an article claiming that 1.e4 e5 2.f4 exf4 3.Nf3 was a forced loss. His prescription? 3... d6. Now we know for sure. Like so many advances in AI, this one leaves me marveling at the power of the human mind.

Well, at least Bobby Fischer's mind.

~~~~

IMAGE 1: The King's Gambit. Source: Wikimedia Commons.

IMAGE 2: a photograph of the cover of my copy of My 60 Memorable Games by Bobby Fischer. Bobby analyzes a King's Gambit or two in this classic collection of games.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

April 03, 2012 4:00 PM

Intermediate Representations and Life Beyond the Compiler

In the simplest cases, a compiler can generate target code directly from the abstract syntax tree:

generating target code directly from the abstract syntax tree

In many cases, though, there are good reasons why we don't want to generate code for the target machine immediately. One is modularity. A big part of code generation for a particular target machine is machine-dependent. If we write a monolithic code generator, then we will have to reimplement the machine-independent parts every time we want to target a new machine.

Even if we stick with one back-end architecture, modularity helps us. Not all of the machine-dependent elements of code generation depend in the same way on the machine. If we write a monolithic code generator, then any small change in the target machine -- perhaps even a small upgrade in the processor -- can cause changes throughout our program. If instead we write a modular code generator, with modules that reflect particular shear layers in the generation process, à la How Buildings Learn, then we may be able to contain changes in target machine specification to an easily identified subset of modules.

So, more generally we think of code generation in two parts:

  • one or more machine-independent transformations from an abstract syntax tree to intermediate representations of the program, followed by

  • one or more machine-dependent transformations from the final intermediate representation to machine code.

generating target code from the abstract syntax tree via an intermediate representation

Intermediate representations between the abstract syntax tree and assembly code have other advantages, too. In particular, they enable us to optimize code in machine-independent ways, without having to manipulate a complex target language.
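
As a small, concrete example of that machine-independent work, here is a constant-folding pass in Ruby over a made-up three-address instruction format. Nothing in it knows or cares what the target machine looks like:

    Instruction = Struct.new(:dest, :op, :left, :right)

    # Fold "t := c1 + c2" into a copy of the constant c1 + c2.
    def fold_constants(ir)
      ir.map do |i|
        if i.op == :+ && i.left.is_a?(Integer) && i.right.is_a?(Integer)
          Instruction.new(i.dest, :copy, i.left + i.right, nil)
        else
          i
        end
      end
    end

    ir = [Instruction.new("t1", :+, 2, 3),
          Instruction.new("t2", :+, "x", "t1")]
    fold_constants(ir).each { |i| p i.to_a }
    # ["t1", :copy, 5, nil]
    # ["t2", :+, "x", "t1"]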

In practice, an intermediate representation sometimes outlives the compiler for which it was created. Chris Clark described an example of this phenomenon in Build a Tree--Save a Parse:

Sometimes the intermediate language (IL) takes on a life of its own. Several systems that have multiple parsers, sophisticated optimizers, and multiple code generators have been developed and marketed commercially. Each of these systems has its own common virtual assembly language used by the various parsers and code generators. These intermediate languages all began connecting just one parser to one code generator.

P-code is an example IL that took on a life of its own. It was invented by Niklaus Wirth as the IL for the ETH Pascal compiler. Many variants of that compiler arose [Nel79], including the UCSD Pascal compiler that was used at Stanford to define an optimizer [Cho83]. Chow's compiler evolved into the MIPS compiler suite, which was the basis for one of the DEC C compilers -- acc. That compiler did not parse the same language nor use any code from the ETH compiler, but the IL survived.

Good language design usually pays off, sometimes in unexpected ways.

(If you like creating languages and writing language processors, Clark's paper is worth a read!)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

March 30, 2012 5:22 PM

A Reflection on Alan Turing, the Turing Test, and Machine Intelligence

Alan Turing

In 1950, Alan Turing published a paper that launched the discipline of artificial intelligence, Computing Machinery and Intelligence. If you have not read this paper, go and do so. Now. 2012 is the centennial of Turing's birth, and you owe yourself a read of this seminal paper as part of the celebration. It is a wonderful work from a wonderful mind.

This paper gave us the Imitation Game, an attempt to replace the question of whether a computer could be intelligent with something more concrete: a probing dialogue. The Imitation Game became the Turing Test, now a staple of modern culture and the inspiration for contests and analogies and speculation. After reading the paper, you will understand something that many people do not: Turing is not describing a way for us to tell the difference between human intelligence and machine intelligence. He is telling us that the distinction is not as important as we seem to think. Indeed, I think he is telling us that there is no distinction at all.

I mentioned in an entry a few years ago that I always have my undergrad AI students read Turing's paper and discuss the implications of what we now call the Turing Test. Students would often get hung up on religious objections or, as noted in that entry, a deep and a-rational belief in "gut instinct". A few ended up putting their heads in the sand, as Turing knew they might, because they simply didn't want to confront the implication of intelligences other than our own. And yet they were in an AI course, learning techniques that enable us to write "intelligent" programs. Even students with the most diehard objections wanted to write programs that could learn from experience.

Douglas Hofstadter, who visited campus this month, has encountered another response to the Turing Test that surprised him. On his second day here, in honor of the Turing centenary, Hofstadter offered a seminar on some ideas related to the Turing Test. He quoted two snippets of hypothetical man-machine dialogue from Turing's seminal paper in his classic Gödel, Escher, Bach. Over the years, he has occasionally run into philosophers who think the Turing Test is shallow, trivial to pass with trickery and "mere syntax". Some are concerned that it explores "only behavior". Is behavior all there is? they ask.

As a computer programmer, the idea that the Turing Test explores only behavior never bothered me. Certainly, a computer program is a static construct and, however complex it is, we can read and understand it. (Students who take my programming languages course learn that even another program can read and process programs in a helpful way.) This was not a problem for Hofstadter either, growing up as he did in a physicist's household. Indeed, he found Turing's formulation of the Imitation Game to be deep and brilliant. Many of us who are drawn to AI feel the same. "If I could write a program capable of playing the Imitation Game," we think, "I would have done something remarkable."

One of Hofstadter's primary goals in writing GEB was to make a compelling case for Turing's vision.

Douglas Hofstadter

Those of us who attended the Turing seminar read a section from Chapter 13 of Le Ton beau de Marot, a more recent book by Hofstadter in which he explores many of the same ideas about words, concepts, meaning, and machine intelligence as GEB, in the context of translating text from one language to another. Hofstadter said the focus in this book is on the subtlety of words and the ideas they embody, and what that means for translation. Of course, these are some of the issues that underlie Turing's use of dialogue as sufficient for us to understand what it means to be intelligent.

In the seminar, he shared with us some of his efforts to translate a modern French poem into faithful English. His source poem had itself been translated from older French into modern French by a French poet friend of his. I enjoyed hearing him talk about "the forces" that pushed him toward and away from particular words and phrases. Le Ton beau de Marot uses creative dialogues of the sort seen in GEB, this time between the Ace Mechanical Translator (his fictional computer program) and a Dull Rigid Human. Notice the initials of his raconteurs! They are an homage to Turing: the Dull Rigid Human shares his initials with Douglas R. Hofstadter, the real human translator, while the Ace Mechanical Translator shares its initials with Alan M. Turing, the man who started this conversation over sixty years ago.

Like Hofstadter, I have often encountered people who object to the Turing test. Many of my AI colleagues are comfortable with a behavioral test for intelligence but dislike that Turing considers only linguistic behavior. I am comfortable with linguistic behavior because it captures what is for me the most important feature of intelligence: the ability to express and discuss ideas.

Others object that it sets too low a bar for AI, because it is agnostic on method. What if a program "passes the test", and when we look inside the box we don't understand what we see? Or worse, we do understand what we see and are unimpressed? I think that this is beside the point. Not to say that we shouldn't want to understand. If we found such a program, I think that we would make it an overriding goal to figure out how it works. But how an entity manages to be "intelligent" is a different question from whether it is intelligent. That is precisely Turing's point!

I agree with Brian Christian, who won the prize for being "The Most Human Human" in a competition based on Turing's now-famous test. In an interview with The Paris Review, he said,

Some see the history of AI as a dehumanizing narrative; I see it as much the reverse.

Turing does not diminish what it is to be human when he suggests that a computer might be able to carry on a rich conversation about something meaningful. Neither do AI researchers or teenagers like me, who dreamed of figuring out just what it is that makes it possible for humans to do what we do. We ask the question precisely because we are amazed. Christian again:

We build these things in our own image, leveraging all the understanding of ourselves we have, and then we get to see where they fall short. That gap always has something new to teach us about who we are.

As in science itself, every time we push back the curtain, we find another layer of amazement -- and more questions.

I agree with Hofstadter. If a computer could do what it does in Turing's dialogues, then no one could rightly say that it wasn't "intelligent", whatever that might mean. Turing was right.

~~~~

PHOTOGRAPH 1: the Alan Turing centenary celebration. Source: 2012 The Alan Turing Year.

PHOTOGRAPH 2: Douglas Hofstadter in Bologna, Italy, 2002. Source: Wikimedia Commons.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

March 09, 2012 3:33 PM

This and That from Douglas Hofstadter's Visit

Update: In the original, I conflated two quotes in
"Food and Hygiene". I have un-conflated them.

In addition to his lecture on Gödel's incompleteness theorem, Douglas Hofstadter spent a second day on campus, leading a seminar and giving another public talk. I'll blog on those soon. In the meantime, here are a few random stories I heard and impressions I formed over the two days.

The Value of Good Names.   Hofstadter told a story about his "favorite chapter on Galois theory" (don't we all have one?), from a classic book that all the mathematicians in the room recognized. The only thing Hofstadter didn't like about this chapter was that it referred to theorems by number, and he could never remember which theorem was which. That made an otherwise good text harder to follow than it needed to be.

In contrast, he said, was a book by Galyan that gave each theorem a name, a short phrase evocative of what the theorem meant. So much better for the reader! So one semester he gave his students an exercise to make his favorite chapter better: they were to give each of the numbered theorems in the chapter an evocative name.

This story made me think of my favorite AI textbook, Patrick Henry Winston's Artificial Intelligence. Winston's book stands out from the other AI books as quirky. He uses his own vocabulary and teaches topics very much in the MIT AI fashion. But he also gives evocative names to many of the big ideas he wants us to learn, among them the representation principle, the principle of least commitment, the diversity principle, and the eponymous "Winston's principle of parallel evolution". My favorite of all is the convergent intelligence principle:

The world manifests constraints and regularities. If a computer is to exhibit intelligence, it must exploit those constraints and regularities, no matter of what the computer happens to be made.

To me, that is AI.

Food and Hygiene.   The propensity of mathematicians to make their work harder for other people to understand, even other mathematicians, reminded Doug Shaw of two passages, from famed mathematicians Gian-Carlo Rota and André Weil. Rota said that we must "guard ... against confusing the presentation of mathematics with the content of mathematics". More colorfully, Weil cautioned that "[if] logic is the hygiene of the mathematician, it is not his source of food". Theorems, proofs, and Greek symbols are mathematical hygiene. Pictures, problems, and understanding are food.

A Good Gig, If You Can Get It.   Hofstadter holds a university-level appointment at Indiana, and his research on human thought and the fluidity of concepts is wide enough to include everything under the sun. Last semester, he taught a course on The Catcher in the Rye. He and his students read the book aloud and discussed what makes it great. Very cool.

If You Need a Research Project...   At some time in the past, Hofstadter read, in a book or article about translating natural language into formal logic, that 'but' is simply a trivial alternative to 'and' and so can be represented as such. "Nonsense!" he said. The word 'but' embodies all the complexity of human thought. "If we could write a program that could use 'but' correctly, we would have accomplished something impressive."

Dissatisfied.   Hofstadter uses that word a lot in conversation, or words like it, such as 'unsatisfying'. He does not express the sentiment in a whiny way. He says it in a curious way. His tone always indicates a desire to understand something better, to go deeper to the core of the question. That's a sign of a good researcher and a deep thinker.

~~~~

Let's just say that this was a great treat. Thanks to Dr. Hofstadter for sharing so much time with us here.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

March 07, 2012 5:35 PM

Douglas Hofstadter on Questions, Proofs, and Passion

In the spring of my sophomore year in college, I was chatting with the head of the Honors College at my alma mater. His son, a fellow CS major, had recently read what he considered a must-read book for every thinking computer scientist. I went over to the library and checked it out in hardcopy. I thumbed through it, and it was love at first sight. So I bought the paperback and spent my summer studying it, line by line.

the cover of Godel, Escher, Bach

Gödel, Escher, Bach seemed to embody everything that excited me about computer science and artificial intelligence. It made, used, and deconstructed analogies. It talked about programming languages, and computer programs as models. Though I allowed myself to be seduced in grad school by other kinds of AI, I never felt completely satisfied. My mind and heart have never really let go of the feeling I had that summer.

Last night, I had the pleasure of seeing Douglas Hofstadter give the annual Hari Shankar Memorial Lecture here. This lecture series celebrates the beauty of mathematics and its accessibility to everyone. Hofstadter said that he was happy and honored to be asked to give such a public lecture, speaking primarily to non-mathematicians. Math is real; it is in the world. It's important, he said, to talk about it in ways that are accessible to all. His lecture would share the beauty of Gödel's Incompleteness Theorem. Rather than give a dry lecture, he told the story as his story, putting the important issues and questions into the context of his own life in math.

As a 14-year-old, he discovered a paperback copy of Gödel's Proof in a used bookstore. His father mentioned that one of the authors, Ernest Nagel, was one of his teachers and friends. Douglas was immediately fascinated. Gödel used a tool (mathematics) to study itself. It was "a strange loop".

As a child, he figured out that two twos is four. The natural next question is, "What is three threes?" But this left him dissatisfied, because two was still lurking in the question. What is "three three threes"? It wasn't even clear to him what that might mean.

But he was asking questions about patterns and seeking answers. He was on his way to being a mathematician. How did he find answers? He understood science to be about experiments, so he looked for answers by examining a whole bunch of cases, until he had seen enough to convince himself that a claim was true.

He did not know yet what a proof was. There are, of course, many different senses of proof, including informal arguments and geometric demonstrations. Mathematicians use these, but they are not what they mean by 'proof'.

Douglas Hofstadter

So he explored problems and tried to find answers, and eventually he tried to prove his answers right. He became passionate about math. He was excited by every new discovery. (Pi!) In retrospect, his excitement does not surprise him. It took mathematicians hundreds of years to create and discover these new ideas. When he learned about them after the fact, they looked like magic.

Hofstadter played with numbers. Squares. Triangular numbers. Primes. He noticed that 2^3 and 3^2 are adjacent to one another (8 and 9) and wondered if any other powers were adjacent.

Mathematicians have faith that there is an answer to questions like that. It may be 'yes', it may be 'no', but there's an answer. He said this belief is so integral to the mindset that he calls it the Mathematician's Credo:

If something is true, it has a proof, and if something has a proof, then it is true.

As an example, he wrote the beginning of the Fibonacci series on the chalk board: 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233. The list contains some powers of integers: 1, 8, and 144. Are there more squares? Are there more powers? Are there infinitely many? How often do they appear? It turns out that someone recently discovered that there are no more integer powers in the list. A mathematician may be surprised that this is true, but she would not be surprised that, if it is true, there is a proof of it.

Then he gave an open question as an example. Consider this variation of a familiar problem:

  • Rule 1: n → 2n
  • Rule 2: 3n+1 → n
  • Start with 1.

Rule 1 takes us from 1 to 2. Rule 1 takes us to 4. Rule 2 takes us to 1. We've already been there, so that's not very interesting. But Rule 1 takes us to 8.... And so on.

Hofstadter called the numbers we visited C-numbers. He then asked a question: Where can we go using these rules? Can we visit every integer? Which numbers are C-numbers? Are all integers C-numbers?

The answer is, we don't know. People have used computers to test all the integers up to a very large number (20 x 2^58) and found that we can reach every one of them from 1. So many people conjecture strongly that all integers are C-numbers. But we don't have a proof, so the purist mathematician will say only, "We don't know".

At this point in the talk, my mind wanders.... (Wonders?) It would be fun to write a program to answer, "Is n a C-number?" in Flair, the language for which my students this semester are writing a compiler. That would make a nice test program. Flair is a subset of Pascal without any data structures, so there is an added challenge... A danger of teaching a compilers course -- any course, really -- is that I would rather write programs than do almost anything else in the world.
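
Since I can't resist: here is a quick sketch, in Scheme rather than Flair, of one brute-force way to ask the question. It searches forward from 1 using the two rules, refusing to visit any value above a caller-supplied limit so that the search always terminates. The name c-number?, the limit parameter, and the list-based breadth-first search are choices I made just for this sketch; a #t answer is definitive, while #f only means we could not reach n without exceeding the limit.

;; step lists the values reachable from k in one move: 2k (Rule 1), and
;; (k-1)/3 when k has the form 3m+1 (Rule 2, read from right to left).
(define c-number?
  (lambda (n limit)
    (letrec ((step
              (lambda (k)
                (append (if (<= (* 2 k) limit)
                            (list (* 2 k))
                            '())
                        (if (and (= (modulo k 3) 1)
                                 (> (quotient (- k 1) 3) 0))
                            (list (quotient (- k 1) 3))
                            '()))))
             (search
              (lambda (frontier visited)
                (cond ((null? frontier) #f)
                      ((= (car frontier) n) #t)
                      ((member (car frontier) visited)
                       (search (cdr frontier) visited))
                      (else
                       (search (append (cdr frontier) (step (car frontier)))
                               (cons (car frontier) visited)))))))
      (search (list 1) '()))))

     > (c-number? 3 100)
     #t

One route to 3 is 1, 2, 4, 8, 16, 5, 10, 3, using Rule 2 at 16 and again at 10.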

One could ask the same question of the Fibonacci series: Is every number a Fibonacci number? It is relatively easy to answer this question with 'no'. The sequence grows larger with each new entry, so once you skip any number, you know it's not in the list. C-numbers are tougher. They grow and shrink. For any given number n, we can search the tree of values until we find it. But there is no proof for all n.
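
The easy 'no' is also easy to compute. Here is a quick Scheme predicate (my own, not from the talk) that settles membership by generating Fibonacci numbers until it meets or passes n:

(define fibonacci-number?
  (lambda (n)
    (letrec ((check (lambda (a b)
                      (cond ((= a n) #t)
                            ((> a n) #f)
                            (else (check b (+ a b)))))))
      (check 1 1))))

     > (fibonacci-number? 89)
     #t
     > (fibonacci-number? 90)
     #f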

As one last bit of preparation, Hofstadter gave an informal proof of the statement, "There are an infinite number of prime numbers." The key is that the argument assumes there are only finitely many primes, p1, p2, ..., pk, and then derives a contradiction: the number p1*p2*...*pk + 1 is divisible by none of them, so the list of primes could not have been complete.

From there, Hofstadter told a compact, relatively simple version of Tarski's undefinability theorem and, at the end, made the bridge to Gödel's theorem. I won't tell that story here, for a couple of reasons. First, this entry is already quite long. Second, Hofstadter himself has already told this story better than I ever could, in Gödel, Escher, Bach. You really should read it there.

This story gave him a way to tell us about the importance of the proof: it drives a wedge between truth and provability. This undermines the Mathematician's Credo. It also allowed him to demonstrate his fascination with Gödel's Proof so many years ago: it uses mathematical logic to say something interesting, powerful, and surprising about mathematical logic itself.

Hofstadter opened the floor to questions. An emeritus CS professor asked his opinion of computer proofs, such as the famous 1976 proof of the four color theorem. That proof depends on a large number of special cases and requires hundreds of pages of analysis. At first, Hofstadter said he doesn't have much of an opinion. Of course, such proofs require new elements of trust, such as trust that the program is correct and trust that the computer is functioning correctly. He is okay with that. But then he said that he finds such proofs to be unsatisfying. Invariably, they are a form of brute force, and that violates the spirit of mathematics that excites him. In the end, these proofs do not help him to understand why something is true, and that is the whole point of exploring: to understand why.

This answer struck a chord in me. There are whole swaths of artificial intelligence that make me feel the same way. For example, many of my students are fascinated by neural networks. Sure, it's exciting any time you can build a system that solves a problem you care about. (Look ma, no hands!) But these programs are unsatisfying because they don't give me any insight into the nature of the problem, or into how humans solve the problem. If I ask a neural network, "Why did you produce this output for this input?", I can't expect an answer at a conceptual level. A vector of weights leaves me cold.

To close the evening, Hofstadter responded to a final question about the incompleteness theorem. He summarized Gödel's result in this way: Every interesting formal system says true things, but it does not say all true things. He also said that Tarski's result is surprising, but in a way comforting. If an oracle for T-numbers existed, then mathematics would be over. And that would be depressing.

As expected, I enjoyed the evening greatly. Having read GEB and taken plenty of CS theory courses, I already knew the proofs themselves, so the technical details weren't a big deal. What really highlighted the talk for me was hearing Hofstadter talk about his passions: where they came from, how he has pursued them, and how these questions and answers continue to excite him as they do. Listening to an accomplished person tell stories that make connections to their lives always makes me happy.

We in computer science need to do more of what people like Hofstadter do: talk about the beautiful ideas of our discipline to as many people as we can, in a way that is accessible to all. We need a Sagan or a Hofstadter to share the beauty.

~~~~

PHOTOGRAPH 1: a photograph of the cover of my copy of Gödel, Escher, Bach.

PHOTOGRAPH 2: Douglas Hofstadter in Bologna, Italy, 2002. Source: Wikimedia Commons.


Posted by Eugene Wallingford | Permalink | Categories: Computing

February 14, 2012 3:51 PM

Beautiful Sentences, Programming Languages Division

From The Heroes of Java: Ward Cunningham (emphasis added):

Making a new language is the ultimate power move. There are lots of ways to do it. Get to know them all. A switch statement inside a for loop makes an interpreter. When you write that, ask yourself: what language am I interpreting?

Those last two sentences distill a few weeks of a course on programming languages into an idea that spans all of computer science. Beautiful.
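
Ward's observation fits in about a dozen lines of Scheme. The toy stack-machine language, its three instructions, and the names exec and run below are all invented for this illustration; the case expression plays the role of the switch statement, and the recursion over the program plays the role of the for loop.

;; exec performs one instruction against a stack; run is the loop that
;; drives it.  Together they interpret a tiny postfix language.
(define exec
  (lambda (instr stack)
    (case (car instr)
      ((push) (cons (cadr instr) stack))
      ((add)  (cons (+ (car stack) (cadr stack)) (cddr stack)))
      ((mul)  (cons (* (car stack) (cadr stack)) (cddr stack))))))

(define run
  (lambda (program stack)
    (if (null? program)
        (car stack)
        (run (cdr program) (exec (car program) stack)))))

     > (run '((push 2) (push 3) (push 4) (mul) (add)) '())
     14

When you write something like this, Ward's question is the one to ask: what language am I interpreting?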

As in most conversations with Ward, this one is full of good sentences, including:

The shortest path to exceeding expectations rarely goes through meeting expectations.

And:

When [pair programming] you can assume both people bring something to the collaboration. I'd rather find my colleague's strength or passion and learn from them as we go forward together.

In my experience, Ward brings the same mindset to all of his professional interactions. I am still a fan, though I have never asked for his autograph.


Posted by Eugene Wallingford | Permalink | Categories: Computing

January 10, 2012 4:05 PM

Looking Forward: Preparing for Compilers

Spring semester is underway. My compilers course met for the first time today. After all these years, I still get excited at the prospect of writing a compiler. On top of that, we get to talk about programming languages and programming all semester.

I've been preparing for the course since last semester, during the programming languages course I debriefed recently. I've written blog entries as I planned previous offerings of the compiler course, on topics such as short iterations and teaching by example, fifteen compilers in fifteen weeks and teaching the course backwards. I haven't written anything yet this time for one of the same reasons I haven't been writing about my knee rehab: I haven't had much to say. Actually, I have two small things.

First, on textbooks. I found that the textbook I've used for the last few offerings of the course now costs students over $140, even at Amazon. That's no $274.70, but sheesh. I looked at several other popular undergrad compiler texts and found them all to be well over $100. The books my students might want to keep for their professional careers are not suitable for an undergrad course, and the ones that are suitable are expensive. I understand the reasons why, yet I can't stomach hitting my students with such a large bill. The Dragon book is the standard, of course, but I'm not convinced it's a good book for my audience -- too few examples, and so much material. (At least it's relatively inexpensive, at closer to $105.)

I found a few compiler textbooks available free on-line, including that $275 book I like. Ultimately I settled on Torben Mogensen's Basics of Compiler Design. It covers the basic material without too much fluff, though it lacks a running example with full implementation. I'll augment it with my own material and web readings. The price is certainly attractive. I'll let you know how it works out.

Second, as I was filing a pile of papers over break, I ran across the student assessments from the last offering of the course. Perfect timing! I re-read them and was able to take student feedback into account. The last group was pretty pleased with the course and offered two broad suggestions for improvement: more low-level details and more code examples. I concur with both. It's easy when covering so many new ideas to stay at an abstract level, and the compiler course is no exception. Code examples help students connect the ideas we discuss with the reality of their own projects.

These are time-consuming improvements to make, and time will be at a premium with a new textbook for the course. This new text makes them even more important, though, because it has few code examples. My goal is to add one new code example to each week of the course. I'll be happy if I manage one really good example every other week.

And we are off.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 01, 2011 3:06 PM

Programming for Everyone: Journalists

Jacob Harris begins his recent In Praise of Impractical Programming with a short discussion of how programming is becoming an integral part of the newsroom:

For the past few years, I've been working as a software developer in the newsroom, where perceptions of my kind have changed from novelty to a necessity. Recognizing this, some journalism schools now even require programming courses to teach students practical skills with databases or web frameworks. It's thrilling to contemplate a generation of web-hacking journalists -- but I wish we could somehow squeeze a little magic into their course load.

This seems like a natural evolutionary path that many industries will follow in the coming years or decades. At first it will be enough to use other people's tools. Then, practitioners will want to be able to write code in a constrained environment, such as a web framework or a database application. Eventually, I suspect that at least a few of the programming practitioners will tire of the constraints, step outside of the box, and write the code -- and maybe even the tools -- they want and need. If historians can do it, so can journalists.


Posted by Eugene Wallingford | Permalink | Categories: Computing

November 30, 2011 7:07 PM

A Definition of Design from Charles Eames

Paul Rand's logo for NeXT

Our council of department heads meets in the dean's conference room, in the same building that houses the Departments of Theater and Art, among others. Before this morning's meeting, I noticed an Edward Tufte poster on the wall and went out to take a look. It turns out that the graphic design students were exhibiting posters they had made in one of their classes, while studying accomplished designers such as Tufte and Paul Rand, the creator of the NeXT logo for Steve Jobs.

As I browsed the gallery, I came across a couple of posters on the work of Charles and Ray Eames. One of them prominently featured this quote from Charles:

Design is a plan for arranging elements in such a way as best to accomplish a particular purpose.

This definition works just as well for software design as it does for graphic design. It is good to be reminded occasionally how universal the idea of design is to the human condition.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 12, 2011 10:40 AM

Tools, Software Development, and Teaching

Last week, Bret Victor published a provocative essay on the future of interaction design that reminds us we should be more ambitious in our vision of human-computer interaction. I think it also reminds us that we can and should be more ambitious in our vision of most of our pursuits.

I couldn't help but think of how Victor's particular argument applies to software development. First he defines "tool":

Before we think about how we should interact with our Tools Of The Future, let's consider what a tool is in the first place.

I like this definition: A tool addresses human needs by amplifying human capabilities.

a tool addresses human needs by amplifying human capabilities

That is, a tool converts what we can do into what we want to do. A great tool is designed to fit both sides.

The key point of the essay is that our hands have much more consequential capabilities than our current interfaces use. They feel. They participate with our brains in numerous tactile assessments of the objects we hold and manipulate: "texture, pliability, temperature; their distribution of weight; their edges, curves, and ridges; how they respond in your hand as you use them". Indeed, this tactile sense is more powerful than the touch-and-slide interfaces we have now and, in many ways, is more powerful than even sight. These tactile senses are real, not metaphorical.

As I read the essay, I thought of the software tools we use, from language to text editors to development processes. When I am working on a program, especially a big one, I feel much more than I see. At various times, I experience discomfort, dread, relief, and joy.

Some of my colleagues tell me that these "feelings" are metaphorical, but I don't think so. A big part of my affinity for so-called agile approaches is how these sensations come into play. When I am afraid to change the code, it often means that I need to write more or better unit tests. When I am reluctant to add a new feature, it often means that I need to refactor the code to be more hospitable. When I come across a "code smell", I need to clean up, even if I only have time for a small fix. YAGNI and doing the simplest thing that can possibly work are ways that I feel my way along the path to a more complete program, staying in tune with the code as I go. Pair programming is a social practice that engages more of my mind than programming alone.

Victor closes with some inspiration for inspiration:

In 1968 -- three years before the invention of the microprocessor -- Alan Kay stumbled across Don Bitzer's early flat-panel display. Its resolution was 16 pixels by 16 pixels -- an impressive improvement over their earlier 4 pixel by 4 pixel display.

Alan saw those 256 glowing orange squares, and he went home, and he picked up a pen, and he drew a picture of a goddamn iPad.

We can think bigger about so much of what we do. The challenge I take from Victor's essay is to think about the tools I use to teach: what needs do they fulfill, and how well do they amplify my own capabilities? Just as important are the tools we give our students as they learn: what needs do they fulfill, and how well do they amplify our students' capabilities?


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

November 09, 2011 4:29 PM

Sentences of the Day: Sheer Fun

From Averia, The Average Font:

Having found a simple process to use, I was ready to start. And after about a month of part-time slaving away (sheer fun! Better than any computer game) -- in the process of which I learned lots about bezier curves and font metrics -- I had a result.

Programmers love to slave away in their free time on projects that put fires in their bellies, with no slave driver other than their own passion.

The story of Averia is a worthy one to read, even if you are not particularly a font person. It's really about how the seed of an idea can grow as our initial efforts pull us deeper into the beautiful intricacies of a problem. It also reminds us how programs make nice testbeds for our experiments.


Posted by Eugene Wallingford | Permalink | Categories: Computing

November 05, 2011 10:09 AM

Is Computing Too Hard, Too Foreign, or Too Disconnected?

A lot of people are discussing a piece published in the New York Times yesterday, Why Science Majors Change Their Minds (It's Just So Darn Hard). It considers many factors that may be contributing to the phenomenon, such as low grades and insufficient work habits.

Grades are typically much lower in STEM departments, and students aren't used to getting marks like that. Ben Deaton argues that this sort of rigor is essential, quoting his undergrad steel design prof: "If you earn an A in this course, I am giving you a license to kill." Still, many students think that a low grade -- even a B! -- is a sign that they are not suited for the major, or for the specialty area. (I've had students drop their specialty in AI after getting a B in the foundations course.)

Most of the students who drop under such circumstances are more than capable of succeeding. Unfortunately, they have not usually developed the disciplined work habits they need to succeed in such challenging majors. It's a lot easier to switch to a different major where their current skills suffice.

I think there are two more important factors at play. On the first, the Times article paraphrases Peter Kilpatrick, Notre Dame's Dean of Engineering:

... it's inevitable that students will be lost. Some new students do not have a good feel for how deeply technical engineering is.

In computer science, our challenge is even bigger: most HS students don't have any clue at all what computer science is. My university is nearing the end of its fall visit days for prospective students, who are in the process of choosing a college and a major. The most common question I am asked is, "What is computer science?", or its cousin, "What do computer scientists do?". This question comes from even the brightest students, ones already considering math or physics. Even more students walk by the CS table with their parents with blank looks on their faces. I'm sure some are thinking, "Why consider a major I have no clue about?"

This issue also plagues students who decide to major in CS and then change their minds, which is the topic of the Times article. Students begin the major not really knowing what CS is, they find out that they don't like it as much as they thought they might, and they change. Given what they know coming into the university, it really is inevitable that a lot of students will start and leave CS before finishing.

On the second factor I think most important, here is the money paragraph from the Times piece:

But as Mr. Moniz (a student exceedingly well prepared to study engineering) sat in his mechanics class in 2009, he realized he had already had enough. "I was trying to memorize equations, and engineering's all about the application, which they really didn't teach too well," he says. "It was just like, 'Do these practice problems, then you're on your own.'" And as he looked ahead at the curriculum, he did not see much relief on the horizon.

I have written many times here about the importance of building instruction around problems, beginning with Problems Are The Thing. Students like to work on problems, especially problems that matter to someone in the world. Taken to the next level, as many engineering schools are trying to do, courses should -- whenever possible -- be built around projects. Projects ground theory and motivate students, who will put in a surprising effort on a project they care about or think matters in the world. Projects are also often the best way to help students understand why they are working so hard to memorize and practice tough material.

In closing, I can take heart that schools like mine are doing a better job retaining majors:

But if you take two students who have the same high school grade-point average and SAT scores, and you put one in a highly selective school like Berkeley and the other in a school with lower average scores like Cal State, that Berkeley student is at least 13 percent less likely than the one at Cal State to finish a STEM degree.

Schools like mine tend to teach less abstractly than our research-focused sister schools. We tend to provide more opportunities early in the curriculum to work on projects and to do real research with professors. I think the other public universities in my state do a good job, but if a student is considering an undergrad STEM major, they will be much better served at my university.

There is one more reason for the better retention rate at the "less selective" schools: pressure. The students at the more selective schools are likely to be more competitive about grades and success than the students at the less selective schools, and the lower-pressure environment at the less selective schools is more conducive to learning for most students. In my department, we try not to "treat the freshman year as a 'sink or swim' experience and accept attrition as inevitable" in the name of Darwinian competition. As the Times article says, this is both unfair to students and wasteful of resources.

By changing our curricula and focusing more on student learning than on how we want to teach, universities can address the problem of motivation and relevance. But that will leave us with the problem of students not really knowing what CS or engineering are, or just how technical and rigorous they need to be. This is an education problem of another sort, one situated in the broader population and in our HS students. We need to find ways to both share the thrill and help more people see just what the STEM disciplines are and what they entail.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 02, 2011 7:39 AM

Programming for All: will.i.am

While reading a bit about the recent flap over racism in the tech start-up world, I found this passage in a piece by Michael Arrington:

will.i.am was proposing an ambitious new idea to help get inner city youth (mostly minorities) to begin to see superstar entrepreneurs as the new role models, instead of NBA stars. He believes that we can effect real societal change by getting young people to learn how to program, and realize that they can start businesses that will change the world.

Cool. will.i.am is a pop star who has had the ear of a lot of kids over the last few years. I hope they listen to this message from him as much as they do to his music.


Posted by Eugene Wallingford | Permalink | Categories: Computing

October 31, 2011 3:47 PM

"A Pretty Good Lisp"

I occasionally read or hear someone say, "X is a pretty good Lisp", where X is a programming language. Usually, it's a newish language that is more powerful than the languages many of us learned in school. For a good example, see Why Ruby is an acceptable LISP. A more recent article, Ruby is beautiful (but I'm moving to Python) doesn't go quite that far. It says only "almost":

Ruby does not revel in structures or minutiae. It is flexible. And powerful. It really almost is a Lisp.

First, let me say that I like both of these posts. They tell us about how we can do functional programming in Ruby, especially through its support for higher-order functions. As a result, I have found both posts to be useful reading for students. And, of course, I love Ruby, and like Python well enough.

But that's not all there is to Lisp. It's probably not even the most important thing.

Kenny Tilton tells a story about John McCarthy's one-question rebuttal to such claims at the very end of his testimonial on adopting Lisp, Ooh! Ooh! My turn! Why Lisp?:

... [McCarthy] simply asked if Python could gracefully manipulate Python code as data.

"No, John, it can't," said Peter [Norvig] and nothing more...

That's the key: data == program. It really is the Big Idea that sets Lisp apart from the other programming languages we use. I've never been a 100% full-time Lisper, and as a result I don't think I fully appreciate the power that Lisp programmers wring from this language feature. But I've programmed enough with and without macros to be able to glimpse what they see in the ability to gracefully manipulate their code -- all code -- as data.
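
For a small taste of what "gracefully manipulate code as data" means at ground level, consider this little Scheme procedure. The rewrite rule is invented for the example, and real macro systems do the job with far more care, but the raw material is the same: an expression is just a list we can take apart and rebuild. (The last line of the transcript assumes a Scheme that provides R5RS's eval and interaction-environment.)

;; Walk an expression, swapping + and * wherever they appear, and leave
;; everything else alone.  The "program" being transformed is ordinary
;; list-structured data.
(define swap-operators
  (lambda (expr)
    (cond ((eq? expr '+) '*)
          ((eq? expr '*) '+)
          ((pair? expr) (map swap-operators expr))
          (else expr))))

     > (define code '(+ 2 (* 3 4)))
     > (swap-operators code)
     (* 2 (+ 3 4))
     > (eval (swap-operators code) (interaction-environment))
     14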

In the "acceptable Lisp" article linked above, Kidd does address this shortcoming and says that "Ruby gives you about 80% of what you want from macros". Ruby's rather diverse syntax lets us create readable DSLs such as Treetop and Rake, which is one of the big wins that Lisp and Scheme macros give us. In this sense, Ruby code can feel generative, much as macros do.

Unfortunately, Ruby, Python, and other "pretty good Lisps" miss out on the other side of the code-as-data equation, the side McCarthy drew out in his question: manipulation. Ruby syntax is too irregular to generate "by hand" or to read and manipulate gracefully. We can fake it, of course, but to a Lisp programmer it always feels fake.

I think what most people mean when they say a language is a pretty good Lisp is that it can be used as a pretty good functional programming language. But Lisp is not only an FP language. Many would claim it is not even primarily a functional programming language.

I love Ruby. But it's not a pretty good Lisp. It is a fine programming language, perhaps my favorite these days, with strengths that take it beyond the system programming languages that most of us cut our teeth on. Among those strengths is excellent support for a functional programming style. It also has its weaknesses, like every other programming language.

Neither is Python a pretty good Lisp. Nor is most anything else, for that matter. That's okay.

All I ask is this: When you are reading articles like the ones linked above, don't dismiss every comment you see that says, "No, it's not, and here's why" as the ranting of a smug Lisp weenie. It may be a rant, and it may be written by a smug Lisp weenie. But it may instead be written by a perfectly sane programmer who is trying to teach you that there is more to Lisp than higher-order functions, and that the more you've missed is a whole lot more. We can learn from some of those comments, and think about how to make our programming languages even better.


Posted by Eugene Wallingford | Permalink | Categories: Computing

October 25, 2011 3:53 PM

On the Passing of John McCarthy

John McCarthy tribute -- 'you are doing it wrong'

It's been a tough couple of weeks for the computer science community. First we lost Steve Jobs, then Dennis Ritchie. Now word comes that John McCarthy, the creator of Lisp, died late Sunday night at the age of 84. I'm teaching Programming Languages this semester based on the idea of implementing small language interpreters, and we are using Scheme. McCarthy's ideas and language are at the heart of what my students and I are doing every day.

Scheme is a Lisp, so McCarthy is its grandfather. Lisp is different from just about every other programming language. It's not just the parentheses, which are only syntax. In Lisp and Scheme, programs and data are the same. To be more specific, the representation of a Lisp program is the same representation used for Lisp data. The equivalence of data and program is one of the truly Big Ideas of computer science, one which I wrote about in Basic Concepts: The Unity of Data and Program. This idea is crucial to many areas of computer science, even ones in which programmers do not take direct advantage of it through their programming language.

We also owe McCarthy for the idea that we can write a language interpreter in the language being interpreted. Actually, McCarthy did more: he stated the features of Lisp in terms of the language features themselves. Such a program defines the language in which the program is written. This is the idea of the meta-circular interpreter, in which two procedures:

  • a procedure that evaluates an expression, and
  • a procedure that applies a procedure to its arguments
recurse mutually to evaluate a program. This is one of the most beautiful ideas in computing, and it serves as the mechanism and inspiration for modern-day interpreters and compilers.
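
Here is the shape of that mutual recursion, pared down to almost nothing: numbers, variable references, one-argument lambdas, and application. The names xeval and xapply and the association-list environments are my own choices for this sketch; McCarthy's original, and the interpreters my students write, handle much more than this.

;; xeval and xapply call each other.  An environment is a list of
;; (name . value) pairs; a closure packages a lambda expression with the
;; environment in which it was evaluated.
(define xeval
  (lambda (exp env)
    (cond ((number? exp) exp)
          ((symbol? exp) (cdr (assq exp env)))
          ((eq? (car exp) 'lambda) (list 'closure exp env))
          (else (xapply (xeval (car exp) env)
                        (xeval (cadr exp) env))))))

(define xapply
  (lambda (proc arg)
    (let ((param (car (cadr (cadr proc))))   ; the single formal parameter
          (body  (caddr (cadr proc)))
          (env   (caddr proc)))
      (xeval body (cons (cons param arg) env)))))

     > (xeval '((lambda (x) x) 42) '())
     42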

Last week, the CS world lost Dennis Ritchie, the creator of the C programming language. By all accounts I've read and heard, McCarthy and Ritchie were very different kinds of people. Ritchie was an engineer through and through, while McCarthy was an academic's academic. So, too, are the languages they created very different. Yet they are without question the two most influential programming languages ever created. One taught us about simplicity and made programming across multiple platforms practical and efficient; the other taught us about simplicity and made programming a matter of expressiveness and concision.

Though McCarthy created Lisp, he did not implement the first Lisp interpreter. As Paul Graham relates in Revenge of the Nerds, McCarthy first developed Lisp as a theoretical exercise, an attempt to create an alternative to the Turing Machine. Steve Russell, one of McCarthy's grad students, suggested that he could implement the theory in an IBM 704 machine language program. McCarthy laughed and told him, "You're confusing theory with practice..." Russell did it anyway. (Thanks to Russell and the IBM 704, we also have car and cdr!) McCarthy and Russell soon discovered that Lisp was more powerful than the language they had planned to build after their theoretical exercise, and the history of computing was forever changed.

If you'd like, take a look at my Scheme implementation of John McCarthy's Lisp written in Lisp. It is remarkable how much can be built out of so little. Alan Kay has often compared this interpreter to Maxwell's equations in physics. To me, its parts usually feel like the basic particles out of which all matter is built. Out of these few primitives, all programs are built.

I first learned of McCarthy not from Lisp but from my first love, AI. McCarthy coined the term "Artificial Intelligence" when organizing (along with Minsky, Rochester, and Shannon) the 1956 Dartmouth conference that gave birth to the field. I studied McCarthy's work in AI using the language he had created. To me, he was a giant of AI long before I recognized that he was a giant of programming languages, too. Like many pioneers of our field, he laid the groundwork in many subdisciplines. They had no choice; they had to build their work out of ideas using only the rawest materials. McCarthy is even credited with the first public descriptions of time-sharing systems and what we now call cloud computing. (For McCarthy's 1970-era predictions about home computers and the cloud, see his The Home Information Terminal, reprinted in 2000.)

Our discipline has lost a giant.


Posted by Eugene Wallingford | Permalink | Categories: Computing

October 24, 2011 7:38 PM

Simple/Complex Versus Easy/Hard

A few years ago, I heard a deacon give a rather compelling talk to a group of college students on campus. When confronted with a recommended way to live or act, students will often say that living or acting that way is hard. These same students are frustrated with the people who recommend that way of living or acting, because the recommenders -- often their parents or teachers -- act as if it is easy to live or act that way. The deacon told the students that their parents and teachers don't think it is easy, but they might well think it is simple.

How can this be? The students were confounding "simple" and "easy". A lot of times, life is simple, because we know what we should do. But that does not make life easy, because doing a simple thing may be quite difficult.

This made an impression on me, because I recognized that conflict in my own life. Often, I know just what to do. That part is simple. Yet I don't really want to do it. To do it requires sacrifice or pain, at least in the short term. To do it means not doing something else, and I am not ready or willing to forego that something. That part is difficult.

Switch the verb from "do" to "be", and the conflict becomes even harder to reconcile. I may know what I want to be. However, the gap between who I am and who I want to be may be quite large. Do I really want to do what it takes to get there? There may be a lot of steps to take which individually are difficult. The knowing is simple, but the doing is hard.

This gap surely faces college students, too, whether it means wanting to get better grades, wanting to live a healthier life, or wanting to reach a specific ambitious goal.

When I heard the deacon's story, I immediately thought of some of my friends, who like very much the idea of being a "writer" or a "programmer", but they don't really want to do the hard work that is writing or programming. Too much work, too much disappointment. I thought of myself, too. We all face this conflict in all aspects of life, not just as it relates to personal choices and values. I see it in my teaching and learning. I see it in building software.

I thought of this old story today when I watched Rich Hickey's talk from StrangeLoop 2011, Simple Made Easy. I had put off watching this for a few days, after tiring of a big fuss that blew up a few weeks ago over Hickey's purported views about agile software development techniques. I knew, though, that the dust-up was about more than just Hickey's talk, and several of my friends recommended it strongly. So today I watched. I'm glad I did; it is a good talk. I recommend it to you!

Based only on what I heard in this talk, I would guess that Hickey misunderstands the key ideas behind XP's practices of test-driven development and refactoring. But this could well be a product of how some agilistas talk about them. Proponents of agile and XP need to be careful not to imply that tests and refactoring make change or any other part of software development easy. They don't. The programmer still has to understand the domain and be able to think deeply about the code.

Fortunately, I don't base what I think about XP practices on what other people think, even if they are people I admire for other reasons. And if you can skip or ignore any references Hickey makes to "tests as guard rails" or to statements that imply refactoring is debugging, I think you will find this really is a very good talk.

Hickey's important point is that simple/complex and easy/hard are different dimensions. Simplicity should be our goal when writing code, not complexity. Doing something that is hard should be our goal when it makes us better, especially when it makes us better able to create simplicity.

Simplicity and complexity are about the interconnectedness of a system. In this dimension, we can imagine objective measures. Ease and difficulty are about what is most readily at hand, what is most familiar. Defined as they are in terms of a person's experience or environment, this dimension is almost entirely subjective.

And that is good because, as Hickey says a couple of times in the talk, "You can solve the familiarity problem for yourself." We are not limited to our previous experience or our current environment; we can take on a difficult challenge and grow.

a Marin mountain bike

Alan Kay often talks about how it is worth learning to play a musical instrument, even though playing is difficult, at least at the start. Without that skill, we are limited in our ability to "make music" to turning on the radio or firing up YouTube. With it, you are able make music. Likewise riding a bicycle versus walking, or learning to fly an airplane versus learning to drive a car. None of these skills is necessarily difficult once we learn them, and they enable new kinds of behaviors that can be simple or complex in their own right.

One of the things I try to help my students see is the value in learning a new, seemingly more difficult language: it empowers us to think new and different thoughts. Likewise making the move from imperative procedural style to OOP or to functional programming. Doing so stretches us. We think and program differently afterward. A bonus is that something that seemed difficult before is now less daunting. We are able to work more effectively in a bigger world.

In retrospect, what Hickey says about simplicity and complexity is actually quite compatible with the key principles of XP and other agile methods. Writing tests is a part of how we create systems that are as simple as we can in the local neighborhood of a new feature. Tests can also help us to recognize complexity as it seeps into our program, though they are not enough by themselves to help us see complexity. Refactoring is an essential part of how we eliminate complexity by improving design globally. Refactoring in the presence of unit tests does not make programming easy. It doesn't replace thinking about design; indeed, it is thinking about design. Unit tests and refactoring do help us to grapple with complexity in our code.

Also in retrospect, I gotta make sure I get down to St. Louis for StrangeLoop 2012. I missed the energy this year.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

October 19, 2011 1:37 PM

A Programming Digression: Benford's Law and Factorials

leading digits of factorials up to 500

This morning, John Cook posted a blog entry on the leading digits of factorials and how, despite what might be our intuition, they follow Benford's Law. He whipped up some Python code and showed the results of his run for factorials up to 500. I have linked to his graphic at the right.

As I am wont to do, I decided to whip up a quick Scheme version of Cook's experiment. He mentioned some implementation issues involving the sizes of integers and floating-point numbers in Python, and I wondered how well Scheme would fare.

For my first attempt, I did the simplest thing that would possibly work. I already had a tail-recursive factorial function and so wrote a procedure that would call it n times and record the first digit of each:

(define benford-factorials
  (lambda (n)
    (let ((counts (make-vector 10 0)))
      (letrec ((foreach
                 (lambda (n)
                   (if (zero? n)
                       counts
                       (let ((lead-digit (first-digit (factorial n))))
                         (vector-set! counts lead-digit
                                      (+ 1 (vector-ref counts lead-digit)))
                         (foreach (- n 1)))))))
        (foreach n)))))

This gets the answers for us:

     > (benford-factorials 500)
     #(0 148 93 67 38 34 43 24 28 25)
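
(The procedure leans on two helpers I don't show here: factorial and first-digit. My full code, linked at the end of this entry, has its own versions; this is a minimal sketch of each, in case you want to type along.)

;; a tail-recursive factorial, accumulating the product as it counts up
(define factorial
  (lambda (n)
    (letrec ((fact (lambda (i acc)
                     (if (> i n)
                         acc
                         (fact (+ i 1) (* i acc))))))
      (fact 1 1))))

;; strip digits off the right end until only the leading digit remains
(define first-digit
  (lambda (n)
    (if (< n 10)
        n
        (first-digit (quotient n 10)))))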

Of course, it is wildly inefficient. My naive implementation computes and acts on each factorial independently, which means that it recomputes (n-1)!, (n-2)!, ... for each value less than n. As a result, benford-factorials becomes unnecessarily sluggish for even relatively small values of n. How can I do better?

I decided to create a new factorial function, one that caches the smaller factorials it creates on the way to n!. I call it all-factorials-up-to:

(define all-factorials-up-to
  (lambda (n)
    (letrec ((aps (lambda (i acc)
                    (if (> i n)
                        acc
                        (aps (+ i 1)
                             (cons (* i (car acc)) acc))))))
      (aps 2 '(1)))))

Now, benford-factorials can use a more functional approach: map first-digit over the list of factorials, and then map a count incrementer over the list of first digits.

(define benford-factorials
  (lambda (n)
    (let ((counts (make-vector 10 0))
          (first-digits (map first-digit
                             (all-factorials-up-to n))))
      (map (lambda (digit)
             (vector-set! counts digit
                          (+ 1 (vector-ref counts digit))))
           first-digits)
      counts)))

(We can, of course, do without the temporary variable first-digits by dropping the first map right into the second. I often create an explaining temporary variable such as this one to make my code easier for me to write and read.)

How does this one perform? It gets the right answers and runs more comfortably on larger n:

     > (benford-factorials 500)
     #(0 148 93 67 38 34 43 24 28 25)
     > (benford-factorials 1000)
     #(0 293 176 124 102 69 87 51 51 47)
     > (benford-factorials 2000)
     #(0 591 335 250 204 161 156 107 102 94)
     > (benford-factorials 3000)
     #(0 901 515 361 301 244 233 163 147 135)
     > (benford-factorials 4000)
     #(0 1192 707 482 389 311 316 227 201 175)
     > (benford-factorials 5000)
     #(0 1491 892 605 477 396 387 282 255 215)

This procedure begins to be sluggish for n ≥ 3000 on my iMac.

Cook's graph shows how closely the predictions of Benford's Law fit for factorials up to 500. How well do the actual counts match the predicted values for the larger sets of factorials? Here is a comparison for n = 3000, 4000, and 5000:

     n = 3000
       digit        1   2   3   4   5   6   7   8   9
       actual     901 515 361 301 244 233 163 147 135
       predicted  903 528 375 291 238 201 174 153 137

     n = 4000
       digit         1   2   3   4   5   6   7   8   9
       actual     1192 707 482 389 311 316 227 201 175
       predicted  1204 704 500 388 317 268 232 205 183

     n = 5000
       digit         1   2   3   4   5   6   7   8   9
       actual     1491 892 605 477 396 387 282 255 215
       predicted  1505 880 625 485 396 335 290 256 229

That looks pretty close to the naked eye. I've always found Benford's Law to be almost magic, even though mathematicians can give a reasonable account of why it holds. Seeing it work so well with something seemingly as arbitrary as factorials only reinforces my sense of wonder.

If you would like to play with these ideas, feel free to start with my Scheme code. It has everything you need to replicate my results above. If you improve on my code or take it farther, please let me know!


Posted by Eugene Wallingford | Permalink | Categories: Computing

October 17, 2011 4:46 PM

Computational Thinking Everywhere: Experiments in Education

I recently ran across Why Education Startups Do Not Succeed, based on the author's experience working as an entrepreneur in the education sector. He admits upfront that he isn't offering objective data to support his conclusions, so we should take them with a grain of salt. Still, I found his ideas interesting. Here is the take-home point in two sentences:

Most entrepreneurs in education build the wrong type of business, because entrepreneurs think of education as a quality problem. The average person thinks of it as a cost problem.

That mismatch creates a disconnect between the expectations of sellers and buyers, which ends up hurting, even killing, most education start-ups.

The old AI guy in me latched on to this paragraph:

Interestingly, in the US, the people who are most willing to try new things are the poor and uneducated because they have a similar incentive structure to a person in rural India. Their default state is "screwed." If a poor person doesn't do something dramatic, they are going to stay screwed. Many parents and teachers in these communities understand this. So the communities are often willing to try new, experimental things -- online education, charter schools, longer school days, no summer vacation, co-op programs -- even if they may not work. Why? Because their students default state is "screwed", and they need something dramatically better. Doing something significantly higher quality is the only way to overcome the inertia of already being screwed. The affordable, but poor quality approaches just aren't good enough. These communities are on the hunt for dramatically better approaches and willing to try new things.

Local and global maxima in hill-climbing

I've seen other discussions of the economic behavior of people in the lowest socioeconomic categories that fit this model. Among them were the consumption of lottery tickets in lieu of saving, and more generally the trade-off between savings and consumption. If a small improvement won't help people much, then it seems they are more willing to gamble on big improvements or simply to enjoy the short-term rewards of spending.

This mindset immediately brought to mind the AI search technique known as hill climbing. When you know you are on a local maximum that is significantly lower than the global maximum, you are willing to take big steps in search of a better hill to climb, even if that weakens your position in the short-term. Baby steps won't get you there.
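
To make the analogy concrete, here is a minimal sketch of the idea in Scheme. It is purely illustrative: neighbors, score, and big-jump stand in for hypothetical problem-specific procedures, and steps is just a step budget.

    ;; return the highest-scoring candidate, or best itself if none beats it
    (define best-of
      (lambda (candidates score best)
        (cond ((null? candidates) best)
              ((> (score (car candidates)) (score best))
               (best-of (cdr candidates) score (car candidates)))
              (else (best-of (cdr candidates) score best)))))

    ;; climb while small steps help; when stuck on a local maximum, take a
    ;; big jump elsewhere and keep going until the step budget runs out
    (define hill-climb
      (lambda (start neighbors score big-jump steps)
        (if (zero? steps)
            start
            (let ((best (best-of (neighbors start) score start)))
              (if (> (score best) (score start))
                  (hill-climb best neighbors score big-jump (- steps 1))
                  (hill-climb (big-jump start) neighbors score big-jump (- steps 1)))))))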

This is a small example of unexpected computational thinking in the real world. Psychologically, it seems, we are often hill climbers.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

October 13, 2011 3:10 PM

Learning and New Kinds of Problems

I recently passed this classic by Reg Braithwaite to a grad student who is reading in the areas of functional programming and Ruby. I love how Braithwaite prefaces the technical content of the entry with an exhortation to learners:

... to obtain the deepest benefit from learning a new language, you must learn to think in the new language, not just learn to translate your favourite programming language syntax and idioms into it.

The more different the thing you are learning is from what you already know, the more important this advice becomes. You are already good at solving the problems your current languages solve well!

And worse, when a new tool is applied to a problem you think you know well, you will probably dismiss the things the new tool does well. Look at how many people dismiss brevity of code. Note that all of the people who ignore the statistics about the constant ratio between bugs and lines of code use verbose languages. Look at how many people dismiss continuation-based servers as a design approach. Note that all of them use programming languages bereft of control flow abstractions.

Real programmers know Y.

This is great advice for people trying to learn functional programming, which is all the rage these days. Many people come to a language like Scheme, find it lacking for the problems they have been solving in Python, C, and Java, and assume something is wrong with Scheme, or with functional programming more generally. It's easy to forget that the languages you know and the problems you solve are usually connected in a variety of ways, not the least of which for university students is that we teach them to solve problems most easily solved by the languages we teach them!

If you keep working on the problems your current language solves well, then you miss out on the strengths of something different. You need to stretch not only your skill set but also your imagination.

If you buy this argument, schedule some time to work through Braithwaite's derivation of the Y combinator in Ruby. It will, as my daughter likes to say, make your brain hurt. That's a good thing. Just like with physical exercise, sometimes we need to stretch our minds, and make them hurt a bit, on the way to making them stronger.
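
For a Scheme-flavored taste of where that derivation ends up, here is the standard applicative-order Y combinator (the so-called Z form), not Braithwaite's Ruby code:

    ;; the applicative-order Y combinator
    (define Y
      (lambda (f)
        ((lambda (x) (f (lambda (v) ((x x) v))))
         (lambda (x) (f (lambda (v) ((x x) v)))))))

    ;; factorial with no top-level recursion: the self-reference comes from Y
    (define fact
      (Y (lambda (self)
           (lambda (n)
             (if (zero? n)
                 1
                 (* n (self (- n 1))))))))

    ;; > (fact 5)
    ;; 120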


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 12, 2011 12:31 PM

Programming for Everyone -- Really?

TL;DR version: Yes.

Yesterday, I retweeted a message that is a common theme here:

Teaching students how to operate software, but not produce software, is like teaching kids to read & not write. (via @KevlinHenney)

It got a lot more action than my usual fare, both retweets and replies. Who knew? One of the common responses questioned the analogy by making another, usually of this sort:

Yeah, that would be like teaching kids how to drive a car, but not build a car. Oh, wait...

This sounds like a reasonable comparison. A car is a tool. A computer is a tool. We use tools to perform tasks we value. We do not always want to make our own tools.

But this analogy misses out on the most important feature of computation. People don't make many things with their cars. People make things with a computer.

When people speak of "using a computer", they usually mean using software that runs on a computer: a web browser, a word processor, a spreadsheet program. And people use many of these tools to make things.

As soon as we move into the realm of creation, we start to bump into limits. What if the tool we are given doesn't allow us to say or do what we want? Consider the spreadsheet, a general data management tool. Some people use it simply as a formatted data entry tool, but it is more. Every spreadsheet program gives us a formula language for going beyond what the creators of Excel or Numbers imagined.

But what about the rest of our tools? Must we limit what we say to what our tool affords us -- to what our tool builders afford us?

A computer is not just a tool. It is also a medium of expression, and an increasingly important one.

If you think of programming as C or Java, then the idea of teaching everyone to program may seem silly. Even I am not willing to make that case here. But there are different kinds of programming. Even professional programmers write code at many levels of abstraction, from assembly language to the highest high-level language. Non-programmers such as physicists and economists use scripting languages like Python. Kids of all ages are learning to program in Scratch.

Scratch is a good example of what I was thinking when I retweeted. Scratch is programming. But Scratch is really a way to tell stories. Just like writing and speaking.

Alfred Thompson summed up this viewpoint succinctly:

[S]tudents need to be creators and not just consumers.

Kids today understand this without question. They want to make video mash-ups and interactive web pages and cutting-edge presentations. They need to know that they can do more than just use the tools we deign to give them.

One respondent wrote:

As society evolves there is an increasing gap between those that use technology and those that can create technology. Whilst this is a concern, it's not the lowest common denominator for communication: speaking, reading and writing.

The first sentence is certainly true. The question for me is: on which side of this technology divide does computing live? If you think of computation as "just" technology, then the second sentence seems perfectly reasonable. People use Office to do their jobs. It's "just a tool".

It could, however, be a better tool. Many scientists and business people write small scripts or programs to support their work. Many others could, too, if they had the skills. What about teachers? Many routine tasks could be automated in order to give them more time to do what they do best, teach. We can write software packages for them, but then we limit them to being consumers of what we provide. They could create, too.

Is computing "just tech", or more? Most of the world acts like it is the former. The result is, indeed, an ever increasing gap between the haves and the have nots. Actually, the gap is between the can dos and the cannots.

I, and many others, think computation is more than simply a tool. In the wake of Steve Jobs's death last week, many people posted his famous quote that computing is a liberal art. Alan Kay, one of my inspirations, has long preached that computing is a new medium on the order of reading and writing. The list of people in the trenches working to make this happen is too numerous to include.

More practically, software and computer technology are the basis of much innovation these days. If we teach the new medium to only a few, the "5 percent of the population over in the corner" to whom Jobs refers, we exclude the other 95% from participating fully in the economy. That restricts economic growth and hurts everyone. It is also not humane, because it restricts people's personal growth. Everyone has a right to the keys to the kingdom.

I stand in solidarity with the original tweeter and retweeter. Teaching students how to operate software, but not produce software, is like teaching kids to read but not to write. We can do better.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

October 04, 2011 4:43 PM

Programming in Context: Digital History

Last April I mentioned The Programming Historian, a textbook aimed at a specific set of non-programmers who want or need to learn how to program in order to do their job in the digital age. I was browsing through the textbook today and came across a paragraph that applies to more than just historians or so-called applied programmers:

Many books about programming fall into one of two categories: (1) books about particular programming languages, and (2) books about computer science that demonstrate abstract ideas using a particular programming language. When you're first getting started, it's easy to lose patience with both of these kinds of books. On the one hand, a systematic tour of the features of a given language and the style(s) of programming that it supports can seem rather remote from the tasks that you'd like to accomplish. On the other hand, you may find it hard to see how the abstractions of computer science are related to your specific application.

I don't think this feeling is limited to people with a specific job to do, like historians or economists. Students who come to the university intending to major in Computer Science lose patience with many of our CS1 textbooks and CS1 courses for the very same reasons. Focusing too much on all the features of a language is overkill when you are just trying to make something work. The abstractions we throw at them don't have a home in their understanding of programming or CS yet and so seem, well, too abstract.

Writing for the aspiring applied programmer has an advantage over writing for CS1: your readers have something specific they want to do, and they know just what it is. Turkel and MacEachern can teach a subset of several tools, including Python and Javascript, focused on what historians want to be able to do. Greg Wilson and his colleagues can teach what scientists want and need to know, even if the book is pitched more broadly.

In CS1, your students don't have a specific task in mind and do eventually need to take a systematic tour of a language's features and to learn a programming style or three. They do, eventually, need to learn a set of abstractions and make sense of them in the context of several languages. But when they start, they are much like any other person learning to program: they would like to do something that matters. The problems we ask them to solve matter.

Guzdial, Ericson, and their colleagues have used media computation as a context in which to learn how to program, with the idea that many students, CS majors and non-majors alike, can be enticed to manipulate images, sounds, and video, the raw materials out of which students' digital lives are now constructed. It's not quite the same -- students still need to be enticed, rather than starting with their own motivation -- but it's a shorter leap to caring than the run-of-the-mill CS textbook has to make.

Some faculty argue that we need a CS0 course that all students take, in which they can learn basic programming skills in a selected context before moving on to the major's first course. The context can be general enough, say, media manipulation or simple text processing on the web, that the tools students learn will be useful after the course whether they continue on or not. Students who elect to major in CS move on to take a systematic tour of a language's features, to learn about OO or FP style, and to begin learning the abstractions of the discipline.

My university used to follow this approach, back in the early and mid 1990s. Students had to take a one-year HS programming course or a one-semester programming course at the university before taking CS1. We dropped this requirement when faculty began asking, why shouldn't we put the same care into teaching low-level programming skills in CS1 as we do into teaching CS0? The new approach hasn't always been as successful as we hoped, due to the difficulty of finding contexts that motivate students as well as we want, but I think the approach is fundamentally sound. It means that CS1 may not teach all the things that it did when the course had a prerequisite.

That said, students who take one of our non-majors programming courses, in C or Visual Basic, and then decide to major in CS perform better on average in CS1 than students who come in fresh. We have work to do.

Finally, one sentence from The Programming Historian made me smile. It embodies the "programming for all" theme that permeates this blog:

Programming is for digital historians what sketching is for artists or architects: a mode of creative expression and a means of exploration.

I once said that being able to program is like having superhuman strength. But it is both more mundane and more magical than that. For digital historians, being able to program means being able to do the mundane, everyday tasks of manipulating text. It also gives digital historians a way to express themselves creatively and to explore ideas in ways hard to imagine otherwise.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 03, 2011 7:20 AM

Softmax, Recursion, and Higher-Order Procedures

Update: This entry originally appeared on September 28. I bungled my blog directory and lost two posts, and the simplest way to get the content back on-line is to repost.

John Cook recently reported that he has bundled up some of his earlier writings about the soft maximum as a tech report. The soft maximum is "a smooth approximation to the maximum of two real variables":

    softmax(x, y) = log(exp(x) + exp(y))

When John posted his first blog entry about the softmax, I grabbed the idea and made it a homework problem for my students, who were writing their first Scheme procedures. I gave them a link to John's page, so they had access to this basic formula as well as a Python implementation of it. That was fine with me, because I was simply trying to help students become more comfortable using Scheme's unusual syntax:

    (define softmax
      (lambda (x y)
        (log (+ (exp x)
                (exp y)))))

On the next assignment, I asked students to generalize the definition of softmax to more than two variables. This gave them an opportunity to write a variable arity procedure in Scheme. At that point, they had seen only a couple simple examples of variable arity, such as this implementation of addition using a binary + operator:

    (define plus              ;; notice: no parentheses around
      (lambda args            ;; the args parameter in lambda
        (if (null? args)
            0
            (+ (car args) (apply plus (cdr args))) )))

Many students followed this pattern directly for softmax:

    (define softmax-var
      (lambda args
        (if (null? (cdr args))
            (car args)
            (softmax (car args)
                     (apply softmax-var (cdr args))))))

Some of their friends tried a different approach. They saw that they could use higher-order procedures to solve the problem -- without explicitly using recursion:

    (define softmax-var
      (lambda args
        (log (apply + (map exp args)))))

When students saw each other's solutions, they wondered -- as students often do -- which one is correct?

John's original blog post on the softmax tells us that the function generalizes as we might expect:

    softmax(x1, x2, ..., xn) = log(exp(x1) + exp(x2) + ... + exp(xn))

Not many students had looked back for that formula, I think, but we can see that it matches the higher-order softmax almost perfectly. (map exp args) constructs a list of the exp(xi) values. (apply + ...) adds them up. (log ...) produces the final answer.

What about the recursive solution? If we look at how its recursive calls unfold, we see that this procedure computes:

    softmax(x1, softmax(x2, ..., softmax(xn-1, xn)...))

This is an interesting take on the idea of a soft maximum, but it is not what John's generalized definition says, nor is it particularly faithful to the original 2-argument function.

How might we roll our own recursive solution that computes the generalized function faithfully? The key is to realize that the function needs to iterate not over the maximizing behavior but the summing behavior. So we might write:

    (define softmax-var
      (lambda args
        (log (accumulate-exps args))))

    (define accumulate-exps
      (lambda (args)
        (if (null? args)
            0
            (+ (exp (car args))
               (accumulate-exps (cdr args))))))

This solution turns softmax-var into an interface procedure and then uses structural recursion over a flat list of arguments. One advantage of using an interface procedure is that the recursive procedure accumulate-exps no longer has to deal with variable arity, as it receives a list of arguments.

It was remarkable to me and some of my students just how close the answers produced by the two student implementations of softmax were, given how different the underlying behaviors are. Often, the answers were identical. When different, they differed only in the 12th or 15th decimal digit. As several blog readers pointed out, softmax is associative, so the two solutions are identical mathematically. The differences in the values of the functions result from the vagaries of floating-point precision.
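
If that floating-point wobble ever matters, the standard log-sum-exp trick helps; it is a general numerical technique, not part of the assignment. Because softmax(x1, ..., xn) = m + log(exp(x1-m) + ... + exp(xn-m)) for m = max(x1, ..., xn), we can subtract the maximum before exponentiating, which avoids overflow for large arguments without changing the mathematical result:

    (define softmax-stable
      (lambda args
        (let ((m (apply max args)))    ;; subtract the true max, add it back
          (+ m (log (apply + (map (lambda (x)
                                    (exp (- x m)))
                                  args)))))))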

The programmer in me left the exercise impressed by the smoothness of the soft maximum. The idea is resilient across multiple implementations, which makes it seem all the more useful to me.

More important, though, this programming exercise led to several interesting discussions with students about programming techniques, higher-order procedures, and the importance of implementing solutions that are faithful to the problem domain. The teacher in me left the exercise pleased.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

October 02, 2011 5:16 PM

Computing Everywhere: Economist Turns Programmer

This About Me page, on Ana Nelson's web site, is a great example of how computing can sneak up on people:

Having fallen in love with programming while studying for my Ph.D. in economics, I now develop open source software to explore and explain data. I am the creator of Dexy, a new tool for reproducible research and document automation.

A lot of disciplines explore and explain data, from particular domains and within particular models. I'm not surprised when I encounter someone in one of those disciplines who finds she likes exploring and explaining data more than the specific domain or model. Programming is a tool that lets us rise above disciplinary silos and consider data, patterns, and ideas across the intellectual landscape.


Posted by Eugene Wallingford | Permalink | Categories: Computing

September 26, 2011 9:02 PM

Taking Computer Science Off the Beaten Track

I love this idea: a workshop at POPL 2012, the premier programming languages research conference, called Off the Beaten Track. Here is the gist.

Programming language researchers have the principles, tools, algorithms and abstractions to solve all kinds of problems, in all areas of computer science. However, identifying and evaluating new problems, particularly those that lie outside the typical core PL problems we all know and love, can be a significant challenge. Hence, the goal of this workshop is to identify and discuss problems that do not often show up in our top conferences, but where programming language researchers can make a substantial impact. The hope is that by holding such a forum and associating it directly with a top conference like POPL, we can slowly start to increase the diversity of problems that are studied by PL researchers and that by doing so we will increase the impact that our community has on the world.

I remember when I first started running across papers written by physicists on networks in social science and open-source software. Why were physicists writing these papers? They were out of their league. In fact, though, they were something else: curious, and equipped with good tools for studying the problems. Good for them -- and good for the rest of us, too, as they contributed to our understanding of how the world works.

Computer science has even better tools and methods for studying all manners of problems and systems, especially the more dynamic systems. Our ability to reify the language of any domain and then write programs to implement the semantics of the domain is one step up from the models that most mathematicians and physicists bring to the table.

Sometimes we forget the power we have in language. As Tim Ottinger tweeted today, "I think people have lost the idea that OO builds the language you implement in, as well as the product." We forget to use our own ability to create and use language even in an approach built on the premise! And of course we can go farther when we build architectures and virtual machines for domain-specific languages, rather than living inside a relatively restrictive model like Java.

The organizers of Off the Beaten Track remind us to think about the wealth of "principles, tools, algorithms, and abstractions" we possess and can bring to bear on problems far beyond the narrow technical area of programming languages research, from the natural sciences to art and music, from economics and the law to linguistics and education. They even acknowledge that we don't always appreciate the diversity in our own research field and so encourage submissions on "unusual compilers" and "underrepresented programming languages".

The last sentence in the passage above expresses an important ultimate goal: to increase the impact the programming languages community has on the world. I heartily support this goal and suggest that it is an important one not only for programming languages researchers. It is essential that many more of us in computer science look off the beaten track for ways to apply what we have learned to problems far beyond our own borders. If we start focusing on problems that matter to other people, problems that matter, we might just solve them.

My favorite line in the Off the Beaten Track home page is the last item in its bullet list of potential topic areas: Surprise us. Indeed. Surprise us.


Posted by Eugene Wallingford | Permalink | Categories: Computing

September 23, 2011 3:52 PM

Grading and Learning in the Age of Social Media

Yesterday morning, I was grading the first quiz from my programming languages course and was so surprised by the responses to the first short-answer question that I tweeted in faux despair:

Wow. We discussed a particular something every day in class for three weeks. Student quiz answers give no evidence of this.

Colleagues around the globe consoled me and commiserated. But I forgot that I am also followed by several of my students, and their reaction was more like... panic. Even though I soon followed up with a tweet saying that their Scheme code made me happy, they were alarmed about that first tweet.

It's a new world. I never used to grade with my students watching or to think out loud while I was grading. Twitter changes that, unless I change me and stop using Twitter. On balance, I think I'm still better off. When I got to class, students all had smiles on their faces, some more nervous than others. We chatted. I did my best to calm them. We made a good start on the day with them all focused on the course.

We have reached the end of Week 5, one-third of the way through the course. Despite the alarm I set off in students' minds, they have performed on par with students in recent offerings over the first three homeworks and the first quiz. At this point, I am more concerned with my performance than theirs. After class yesterday, I was keenly aware of the pace of the session being out of sync with the learning curve of the material. The places where I've been slowing down aren't always the best places to slow down, and the places where I've been speeding up (whether intentional or unintentional) aren't always the best places to speed up. A chat with one student that afternoon cemented my impression.

Even with years of experience, teaching is hard to get right. One shortcoming of teaching a course only every third semester is that the turnaround time on improvements is so long. What I need to do is use my realization to improve the rest of this offering, first of all this unit on recursive programming.

I spent some time early this week digging into Peter Norvig's Prescient but Not Perfect, a reconsideration of Christopher Strachey's 1966 Sci Am article and in particular Strachey's CPL program to play checkers. Norvig did his usual wonderful job with the code. It's hard to find a CPL compiler these days, and has been since about 1980, so he wrote a CPL-to-Python translator, encoded and debugged Strachey's original program, and published the checkers program and a literate program essay that explains his work.

This is, of course, a great topic for a programming languages course. Norvig exemplifies the attitude I encourage in my students on Day 1: if you need a language processor, write one. It's just another program. I am not sure yet when I will bring this topic into my course; perhaps when we first talk in detail about interpreters, or perhaps when we talk about parsing and parser generators. (Norvig uses YAPPS, a Python parser generator, to convert a representation of CPL's grammar into a CPL parser written in Python.)

There are some days when I wish I had designed all of my course sessions to be 60 minutes instead of 75 minutes, so that we had more luft for opportunistic topics like this one. Or that I could design a free day into the course every 2-3 weeks for the same purpose. Alas, the CS curriculum depends on this course to expose students to a number of important ideas and practices, and the learning curve for some of the material is non-trivial. I'll do my best to provide at least a cursory coverage of Norvig's article and program. I hope that a few students will be drawn to his approach to the world -- the computer scientist's mindset.

If nothing else, working through his paper and code excites me, and that will leak over into the rest of my work.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

September 16, 2011 4:13 PM

The Real Technological Revolution in the Classroom Hasn't Happened Yet

Earlier this month, the New York Times ran a long article exploring the topic of technology in K-12 classrooms, in particular the lack of evidence that the mad rush to create the "classroom of future" is having any effect on student learning. Standardized test scores are stagnant in most places, and in schools showing improvements, research has not been able to separate the effect of using technology from the effect of extra teacher training.

We should not be surprised. It is unlikely that simply using a new technology will have any effect on student learning. If we teach the same way and have students do the same things, then we should expect student learning to be about the same whether they are writing on paper, using a typewriter, or typing on a computer keyboard. There are certainly some very cool things one can do with, say, Keynote, and I think having those features available can augment a student's experience. But I doubt that those features can have a substantial effect on learning. Technology is like a small booster rocket for students who are already learning a lot and like training wheels for those who are not.

As I read that article, one fact screamed out at me. Computers are being used in classrooms everywhere for almost everything. Everything, that is, except the one thing that makes them useful at all: computer programming.

After reading the Times piece, Clive Thompson pulled the critical question out of it and asked, What can computers teach that textbooks and paper can't? Mark Guzdial has written thoughtfully on this topic in the past as well. Thompson offers two answers: teaching complexity and seeing patterns. (His third answer is a meta-answer: more effective communication between teacher and student.) We can improve both teaching complexity and seeing patterns by using the right software, but -- thankfully! -- Thompson points out that we can do even better if we teach kids even a little computer programming.

Writing a little code is a great vehicle for exploring a complex problem and trying to create and communicate understanding. Using or writing a program to visualize data and relationships among them is, too.

Of course, I don't have hard evidence for these claims, either. But it is human nature to want to tinker, to hack, to create. If I am going to go out on a limb without data, I'd rather do it with students creating tools that help them understand their world than with students mostly consuming media using canned tools. And I'm convinced that we can expand our students' minds more effectively by showing them how to program than by teaching most of what we teach in K-12. Programming can be tedious, like many learning tasks, and students need to learn how to work through tedium to deeper understanding. But programming offers rewards in a way and on a scale that, say, the odd problems at the end of a chapter in an algebra textbook can never do by themselves.

Mark Surman wrote a very nice blog entry this week, Mozilla as teacher, expressing a corporate vision for educating the web-using public that puts technology in context:

... people who make stuff on the internet are better creators and better online citizens if they know at least a little bit about the web's basic building blocks.

As I've written before, we do future teachers, journalists, artists, filmmakers, scientists, citizens, and curious kids a disservice if we do not teach them a little bit of code. Without this knowledge, they face unnecessary limits on their ability to write, create, and think. They deserve the ability to tinker, to hack, to trick out their digital worlds. The rest of us often benefit when they do, because some of things they create make all of our lives better.

(And increasingly, the digital world intersects with the physical world in surprising ways!)

I will go a step further than Surman's claim. I think that people who create and who are engaged with the world they inhabit have an opportunity to be better citizens, period. They will be more able, more willing participants in civic life when they understand more clearly their connections to and dependence on the wider society around them. By giving them the tools they need to think more deeply and to create more broadly, education can enable them to participate in the economy of the world and improve all our lots.

I don't know if K-12 standardized test scores would get better if we taught more students programming, but I do think there would be benefits.

As I often am, I am drawn back to the vision Alan Kay has for education. We can use technology -- computers -- to teach different content in different ways, but ultimately it all comes back to new and better ways to think in a new medium. Until we make the choice to cross over into that new land, we can spend all the money we want on technology in the classroom and not change much at all.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

September 11, 2011 11:27 AM

Using Plain Text to Scrub Spreadsheet Files

For a few years, I have been saving many .xls files as plain text. This is handy when I want to program against the data more often than I want to use spreadsheet tools. It also makes the data more accessible across apps and platforms, which is good now and in the future. While doing this, I have come across a technique that I find useful more generally. Maybe you will, too.

People use spreadsheets for two purposes: structuring data and presenting data. Both Excel and Apple's Numbers offer as much functionality for presenting information as they do for manipulating it, if not more. For me, the presentation stuff often gets in the way of organizing and manipulating, both by cluttering the UI with commands I don't need to know about and by adding formatting information to my data. The result is a UI more complicated than I need and data files much larger than I need them to be.

When I run into one of those bloated files, I sometimes take a round trip:

  • export the file as .csv or .tsv
  • import that text file back into Numbers or OpenOffice as a spreadsheet file

The result is a clean data file, with the data and its basic structure, but nothing more. No text formatting, no variably spaced rows or columns, and no presentation widgets. When I do work with the data in the spreadsheet app, it's unadorned, just as I like it.
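
Once the data lives in plain text, it is also easy to program against directly. Here is a minimal sketch in Scheme, assuming an implementation that provides read-line (R7RS, Racket, and Chez do, among others); split-on is a little helper written for the sketch, not a standard procedure, and it makes no attempt to handle quoted fields:

    ;; split a line into fields at a delimiter character
    (define split-on
      (lambda (delim str)
        (let loop ((chars (string->list str)) (field '()) (fields '()))
          (cond ((null? chars)
                 (reverse (cons (list->string (reverse field)) fields)))
                ((char=? (car chars) delim)
                 (loop (cdr chars) '() (cons (list->string (reverse field)) fields)))
                (else
                 (loop (cdr chars) (cons (car chars) field) fields))))))

    ;; read a .csv file into a list of rows, each row a list of strings
    (define read-csv
      (lambda (filename)
        (call-with-input-file filename
          (lambda (port)
            (let loop ((line (read-line port)) (rows '()))
              (if (eof-object? line)
                  (reverse rows)
                  (loop (read-line port)
                        (cons (split-on #\, line) rows))))))))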


Posted by Eugene Wallingford | Permalink | Categories: Computing

September 09, 2011 2:19 PM

The Future of Your Data

In Liberating and future-proofing your research data, J.B. Deaton describes a recent tedious effort "to liberate several gigabytes of data from a directory of SigmaPlot files". Deaton works in Python, so he had to go through the laborious process of installing an evaluation copy of SigmaPlot, stepping through each SigmaPlot file workbook by workbook, exporting each to Excel, and then converting the files from Excel to CSV or some other format he could process in Python.

Of course, this was time spent not doing research.

All of this would have been a moot point if the data had been stored as CSV or plain text. I can open and process data stored in CSV on any operating system with a large number of tools, for free. And I am confident in 10 years time, I will be able to do the same.

This is a problem we face when we need to work with old data, as Deaton is doing. It's a problem we face when working with current data, too. I wrote recently about how doing a basic task of my job, such as scheduling courses each semester, in a spreadsheet gets in the way of getting the job done as well as I might.

Had Deaton not taken the time to liberate his data, things could have been worse in the long run. Not only would the data have been unavailable to his current project, but it may well have fallen into disuse forever and eventually disappeared.

Kari Kraus wrote this week about the problem of data disappearing. One problem is the evolution of media:

When you saved that unpublished manuscript on [a 5-1/4" floppy disk], you figured it would be accessible forever. But when was the last time you saw a floppy drive?

Well, not a 5-1/4". I do have a 3-1/2" USB floppy drive at home and another in my office. But Kraus has a point. Most of the people creating data aren't CS professionals or techno-geeks. And while I do have a floppy drive, I never use floppies for my own data. Over the years, I've been careful to migrate my data, from floppy drives to zip drives, from CDs to large, replicated hard drives. Eventually it may live somewhere in the cloud, and I will certainly have to move it to the next new thing in hardware sometime in the future.

Deaton's problem wasn't hardware, though, and Kraus points out the bigger problem: custom encoded-data from application software:

If you don't have a copy of WordPerfect 2 around, you're out of luck.

The professional data that I have lost over the years hasn't been "lost" lost. The problem has always been with software. I, too, have occasionally wished I had a legacy copy of WordPerfect lying around. My wife and I created a lot of files in pre-Windows WordPerfect back in the late 1980s, and I continued to use Mac versions of WP through the 1990s. As I moved to newer apps, I converted most of the files over, but every once in a while I still run across an old file in .wp format. At this point, it is rarely anything important enough to devote the time Deaton spent on his conversion experience. I choose to let that data die.

Fortunately, not all of my data from that era was encoded. I wrote most of my grad school papers in nroff. That's also how I created our wedding invitations.

This is a risk we run as more of our world moves from paper to digital, even when it's just entertainment. Fortunately, for the last 5 years or so, I've been storing more and more of my data in plain text or, short of that, rich text. Like Deaton, I am pretty confident that I will be able to read and process that data 10 years hence. And, whenever possible, I have followed an open-file-formats-only policy with my colleagues.

Rather than having to liberate data in the future, it is wiser to let it live free from birth. That reduces friction now and later. Deaton offers a set of preferences that can help you keep your data as free as possible:

  • Open source beats closed source.
  • Ubiquitous beats niche software.
  • Automation/scripting beats manual processes.
  • Plain text beats binaries.
  • READMEs in every project directory.

That third bullet is good advice even if you are not a computer scientist. Deaton isn't. But you don't have to be a computer scientist to reap the benefits of a little programming!


Posted by Eugene Wallingford | Permalink | Categories: Computing

August 31, 2011 8:52 PM

Learning to Learn to Automate Work

Jon Udell recently wrote about the real problem with automating work: most of us don't know how. The problem isn't with using particular tools, which come and go, but with recognizing the power of information networks and putting data into a form from which it can be leveraged.

I want to apply his advice more often than I do. I have found that for me it is not enough simply to learn the principles.

The first challenge is fighting an uphill battle against institutional inertia. The university provides too much of its data in dead-tree form, and what data comes to us in digital form comes unstructured. Despite a desire to increase efficiency and decrease costs, a university is a big, slow organization. It takes a long time for its many bureaucracies to re-make themselves. It also takes a persistent, pervasive effort to change on the parts of many people. Too many administrators and faculty thrive in a papered society, which makes change even harder. This is the broad base of people who need to learn Udell's core principles of information networks.

The second challenge is my own habits. I may not be of my students' generation, but I've been in computer science for a long time, and I think I get the value of data. Even still, it's easy -- especially as department head -- to be sucked into institutional habits. My secretary and I are combating this by trying to convert as much data entering our office as possible into live, structured data. In the process, I am trying to teach her, a non-computer scientist, a bit about the principles of data and structured representation. We aren't worrying yet about networks and pub/sub, simply getting data into a named, structured form that supports computational processing.

Yet I need to change some of my own habits, too. When under time pressure, it's easy for me to, say, whip up assignments of graduate assistants to tasks and lab hours on a legal pad. Once the assignments are made, I can communicate a subset of the information in a couple of e-mail messages. The result is a lot of information and not a byte of structured data. Oh, and a lot of lost opportunities for using code to check consistency, make changes, publish the assignments in multiple forms, or reuse the data in adapted form next semester.

My next big opportunity to practice better what I preach is scheduling courses for spring semester. Instead of using spreadsheets as we have in the past, perhaps I should open up DrRacket and use it to record all the data we collect and create about the schedule. Scheme's reliance on the simple list as its primary data structure usually puts me in the mindset of grammars and syntax-driven programming. Sometimes, the best way to break a bad old habit is to create a good new one.
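
As a sketch of what I have in mind, and nothing more, the schedule data might live as plain s-expressions. The field names and sample records below are hypothetical, but even this much structure supports simple programmatic checks:

    ;; hypothetical schedule data
    (define spring-schedule
      '((course "CS 1510" instructor "Smith" time "MWF 10:00")
        (course "CS 3540" instructor "Jones" time "TTh 2:00")
        (course "CS 4550" instructor "Jones" time "TTh 2:00")))

    ;; pull a named field out of a record like the ones above
    (define field
      (lambda (name record)
        (cadr (memq name record))))

    ;; list the courses whose (instructor, time) pair appears again later
    (define conflicts
      (lambda (schedule)
        (if (null? schedule)
            '()
            (let ((key (list (field 'instructor (car schedule))
                             (field 'time (car schedule)))))
              (if (member key (map (lambda (r)
                                     (list (field 'instructor r) (field 'time r)))
                                   (cdr schedule)))
                  (cons (field 'course (car schedule))
                        (conflicts (cdr schedule)))
                  (conflicts (cdr schedule)))))))

    ;; > (conflicts spring-schedule)
    ;; ("CS 3540")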

So, yes, we need to teach the principles of data networks in a systematic way to information technologists and everyone else. We also need to practice applying them and look for ways to help individuals and institutions alike change their habits.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Managing and Leading

August 18, 2011 3:52 PM

Some Thoughts on "Perlis Languages"

Alan Perlis

Fogus recently wrote a blog entry, Perlis Languages, that has traveled quickly through parts of software world. He bases his title on one of Alan Perlis's epigrams: "A language that doesn't affect the way you think about programming is not worth knowing." Long-time Knowing and Doing readers may remember this quote from my entry, Keeping Up Versus Settling Down. If you are a programmer, you should read Fogus's article, which lists a few languages he thinks might change how you think about programming.

There can be no single list of Perlis languages that works for everyone. Perlis says that a language is worth knowing if it affects how you think about programming. That depends on you: your background, your current stage of development as a programmer, and the kind of problems you work on every day. As an example, in the Java world, the rise of Scala and Clojure offered great opportunities for programmers to expand their thinking about programming. To Haskell and Scheme programmers, the opportunity was much smaller, perhaps non-existent.

The key to this epigram is that each programmer should be thinking about her own knowledge and be on the lookout for languages that can expand her mind. For most of us, there is plenty of room for growth. We tend to work in one or two styles on a daily basis. Languages that go deep in a different style or make a new idea their basic building block can change us.

That said, some languages will show up on lots of people's Perlis lists, if only because they are so different from the languages most people know and use on a daily basis. Lisp is one of the languages that used to be a near universal in this regard. It has a strangely small and consistent syntax, with symbols as first-order objects, multiple ways to work with functions, and macros for extending the language in a seamless way. With the appearance of Clojure, more and more people are being exposed to the wonders of Lisp, so perhaps it won't be on everyone's Perlis list in 10 years. Fogus mentions Clojure only in passing; he has written one of the better early Clojure books, and he doesn't want to make a self-serving suggestion.

I won't offer my own Perlis list here. This blog often talks about languages that interest me, so readers have plenty of chances to hear my thoughts. I will add my thoughts about two of the languages Fogus mentions in his article.

Joy. *Love* it! It's one of my favorite little languages, and one that remains very different from what most programmers know. Scripting languages have put a lot of OOP and functional programming concepts before mainstream programmers across the board, but the idea of concatenative programming is still "out there" for most.

Fogus suggests the Forth programming language in this space. I cannot argue too strongly against this and have explained my own fascination with it in a previous entry. Forth is very cool. Still, I prefer Joy as a first step into the world of concatenative programming. It is clean, simple, and easy to learn. It is also easy to write a Joy interpreter in your favorite language, which I think is one of the best ways to grok a language in a deep way. As I mentioned in the Forth entry linked above, I spent a few months playing with Joy and writing an interpreter for it while on sabbatical a decade ago.
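
To give a flavor of the concatenative idea without leaving Scheme, here is a toy sketch, not Joy itself: a program is a list of words, each a function from stack to stack.

    (define run
      (lambda (program stack)
        (if (null? program)
            stack
            (run (cdr program) ((car program) stack)))))

    ;; a few words
    (define push
      (lambda (n)
        (lambda (stack) (cons n stack))))

    (define dup
      (lambda (stack) (cons (car stack) stack)))

    (define add
      (lambda (stack) (cons (+ (car stack) (cadr stack)) (cddr stack))))

    (define mul
      (lambda (stack) (cons (* (car stack) (cadr stack)) (cddr stack))))

    ;; the Joy-like program "3 dup * 4 +"
    ;; > (run (list (push 3) dup mul (push 4) add) '())
    ;; (13)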

If you play with Joy and like it, you may find yourself wanting more than Joy offers. Then pick up Forth. It will not disappoint you.

APL. Fogus says, "I will be honest. I have never used APL and as a result find it impenetrable." Many things are incomprehensible before we try them. (A student or two will be telling me that Scheme is incomprehensible in the next few weeks...) I was fortunate to write a few programs in APL back in my undergrad programming languages course. I'm sure if I wrote a lot of APL it would become more understandable, but every time I return to the language, it is incomprehensible again to me for a while.

David Ungar told one of my favorite APL stories at OOPSLA 2003, which I mentioned in my report on his keynote address. The punchline of that story fits very well with the theme of so-called Perlis languages: "They could have done the same thing [I] did in APL -- but they didn't think of it!"

There are modern descendants of APL, but I still think there is something special about the language's unique character set. I miss the one-liners consisting of five or twenty Greek symbols, punctuation, and numbers, which accomplished unfathomable tasks such as implementing a set of accounting books.

I do second Fogus's reason for recommending APL despite never having programmed in it: creator Kenneth Iverson's classic text, A Programming Language. It is an unusually lucid account of the design of a programming language -- a new language, not an adaptation of a language we already know. Read it. I had the wonderful opportunity to meet Iverson when he spoke at Michigan State in the 1980s, as described in my entry on Iverson's passing.

... So, I encourage you to follow the spirit of Fogus's article, if not its letter. Find the languages that can change how you think, and learn them. I begin helping a new set of students on this path next week, when we begin our study of Scheme, functional programming, and the key ideas of programming languages and styles.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

August 03, 2011 7:55 PM

Psychohistory, Economics, and AI

Or, The Best Foresight Comes from a Good Model

Hari Seldon from the novel Foundation

In my previous entry, I mentioned re-reading Asimov's Foundation Trilogy and made a passing joke about psychohistory being a great computational challenge. I've never heard a computer scientist mention psychohistory as a primary reason for getting involved with computers and programming. Most of us were lucky to see so many wonderful and more approachable problems to solve with a program that we didn't need to be motivated by fiction, however motivating it might be.

I have, though, heard and read several economists mention that they were inspired to study economics by the ideas of psychohistory. The usual reason for the connection is that econ is the closest thing to psychohistory in modern academia. Trying to model the behavior of large groups of people, and reaping the advantages of grouping for predictability, is a big part of what macroeconomics does. (Asimov himself was most likely motivated in creating psychohistory by physics, which excels at predicting the behavior of masses of atoms over predicting the behavior of individual atoms.)

As you can tell from recent history, economists are nowhere near the ability to do what Hari Seldon did in Foundation, but then Seldon did his work more than 10,000 years in the future. Maybe 10,000 years from now economists will succeed as much and as well. Like my economist friends, I too am intrigued by economics, which also shares some important features in common with computer science, in particular a concern with the trade-offs among limited resources and the limits of rational behavior.

The preface to the third book in Asimov's trilogy, Second Foundation, includes a passage that caught my eye on this reading:

He foresaw (or he solved his [system's] equations and interpreted its symbols, which amounts to the same thing)...

I could not help but be struck by how this one sentence captured so well the way science empowers us and changes the intellectual world in which we live. Before the rapid growth of science and broadening of science education, the notion of foresight was limited to personal experience and human beings' limited ability to process that experience and generalize accurately. When someone had an insight, the primary way to convince others was to tell a good story. Foresight could be feigned and sold through stories that sounded good. With science, we have a more reliable way to assess the stories we are told, and a higher standard to which we can hold the stories we are told.

(We don't always do well enough in using science to make us better listeners, or better judges of purported foresights. Almost all of us can do better, both in professional settings and personal life.)

As a young student, I was drawn to artificial intelligence as the big problem to solve. Like economics, it runs directly into problems of limited resources and limited rationality. Like Asimov's quote above, it runs directly into the relationship between foresight and accurate models of the world. During my first few years teaching AI, I was often surprised by how fiercely my students defended the idea of "intuition", a seemingly magical attribute of men and women forever unattainable by computer programs. It did me little good to try to persuade them that their belief in intuition and "gut instinct" was outside the province of scientific study. Not only didn't they care; that was an integral part of their belief. The best thing I could do was introduce them to some of the techniques used to write AI programs and to show them such programs behaving in a seemingly intelligent manner in a situation that piqued my students' interest -- and maybe opened their minds a bit.

Over the course of teaching those early AI courses, I was eventually able to see one of the fundamental attractions I had to the field. When I wrote an AI program, I was building a model of intelligent behavior, much as Seldon's psychohistory involved building a model of collective human behavior. My inspiration did not come from Asimov, but it was similar in spirit to the inspiration my economist friends drew from Asimov. I have never been discouraged or deterred by any arguments against the prospect of artificial intelligence, whether by my students' faith-based reasons or by purportedly rational arguments such as John Searle's Chinese room argument. I call Searle's argument "purportedly rational" because, as it is usually presented, ultimately it rests on the notion that human wetware -- as a physical medium -- is capable of representing symbols in a way that silicon or other digital means cannot.

I always believed that, given enough time and enough computational power, we could build a model that approximated human intelligence as closely as we desired. I still believe this and enjoy watching (and occasionally participating in) efforts that create more and more intelligent programs. Unlike many, I am undeterred by the slow progress of AI. We are only sixty years into an enterprise that may take a few thousand years. Asimov taught me that much.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

June 06, 2011 9:12 PM

Magazines, PDF Files, and a Lifetime of Memories

I remember the day I received my first issue of Chess Life & Review magazine. It was the summer of 1979, in late June or early July. I had won a membership in the U.S. Chess Federation as part of a local goodwill tournament, by virtue of beating my good buddy and only competition for the junior prize. My victory entitled me to the membership as part of $20 in loot, which included a portable set I use to this day and a notation book that recorded my games over a period of five or ten years.

Play it again, Sam.

That first issue arrived while I was at summer school. The cover heralded the upcoming U.S. Open championship, but inside the story of Montreal 1979, a super-GM tournament, captivated me with the games of familiar names (Karpov, Tal, Larsen, and Spassky) and new heroes (Portisch, Hübner, Hort, and Ljubojevic). A feature article reviewed films of the 1940s that featured chess and introduced me to Humphrey Bogart's love of and skill at the game I loved to play. Bogart: the man's man, the tough-guy leading man at whose feet women swooned. Bogart! The authors of the magazine's regular columns became my fast friends, and for years thereafter I looked forward monthly to Andy Soltis's fun little stories, which always seemed to teach me something, and Larry Evans's Q-n-A column, which always seemed to entertain.

I was smitten, as perhaps only a young bookish kid can be.

Though I haven't played tournament chess regularly in three decades, I have remained an occasional player, a passionate programmer, and a lovestruck fan. And I've maintained my membership in the USCF, which entitles me to a monthly issue of Chess Life. Though life as husband, father, and professor leave me little time for the game I once played so much, every month I anticipate the arrival of my new issue, replete with new names and new games, tournament reports and feature articles, and regular columns that include Andy Soltis's "Chess to Enjoy". Hurray!

... which is all prelude to my current dilemma, a psychological condition that reveals me a man of my time and not a man of the future, or even the present. It's time to renew my USCF membership, and I am torn: do I opt for the membership that provides on-line access only to Chess Life?

For the last few years, ever since we moved into a new house and I came face to face with just how much stuff I have, I've been in the process of cutting back. Even before then, I have made some society membership choices based in part on how little I need more piles of paper taking up space in my house and attention in my mind. This is the 21st century, right? I am a computer scientist, who deals daily in digital materials, who has disk space beyond his wildest dreams, whose students have effortlessly grown into a digital world that makes magazines seem like quaint compendia of the past. Right?

Yet I waffle. I can save roughly $7 a year by going paperless, which is a trifle, I know, but a prudent choice nonetheless. Right?

Undoubtedly, my CL&R-turned-CL collection takes up space. If I stem the tide of incoming issues, I can circumscribe the space needed to store my archive and devote future space to more worthy applications. Perhaps I could even convert some of the archive into digital form and recoup space already spent?

This move would save space, but if I am honest, it would not free up all of my attention. My magazines will join my music collection in the river of bits flowing into my future, being copied along from storage device to storage device, from medium to medium, and from software application to software application. I've lived through several generations of storage media, beginning in earnest with 5-1/4" floppies, and I'm sure I'll live through several more.

And what of changing formats? The text files that have followed me from college remain readable, for the most part, but not everything survives. For every few files I've converted from WordPerfect for DOS I have surely lost a file or two. Occasionally I run across one and ask myself, is it worth my time to try to open it and convert it to something more modern? I am sad to say that too often the answer is, well, no. This never happens to my books and magazines and pamphlets from that time. I choose to keep or to discard, and if I have it, I can read it. Where will PDF be in 50 years?

the cover of Bobby Fischer's 'My 60 Memorable Games'

I am also just old enough that I somewhat cherish having a life that is separate from my digital existence. When I have the chance to play chess these days, I still prefer to pull out a board and set up the pieces. The feel of the ivory or plastic or wood in my hands is part of the experience -- not essential to the experience, I suppose, in a cosmic sense, but a huge ingredient in my personal experience. I have been playing chess on computers since 1980 or so, which isn't much later than when I began playing the game in earnest in grade school, so I know that feeling, too. But feeling the pieces in my hand, poring over My 60 Memorable Games (another lifelong treasure from the booty that first brought me Chess Life) line by line in search of Bobby Fischer's magic... these are a part of the game for me.

Ultimately, that's where my renewal dilemma lies, too. My memories of checking the mailbox every day at that time of the month, eager to find the next issue of the magazine. The smell of the ink as I thumbed through the pages, peeking ahead at the delights that awaited me. The feel of the pages as I turned to the next column or article or advertisement. The joy of picking out an old issue, grabbing that magnetic portable set from 30-odd years ago, and settling into a comfortable chair for an evening of reminiscence and future-making. All are a part of what chess has been for me. A cache of PDF files, $22 over three years, and a little closet space hardly seem sufficient consideration.

Alas, we are all creatures of our own times, I no less than any man. Even though I know better, I find myself pulled backward in time just as much as Kurt Vonnegut, who occasionally waxed poetic about the future of the printed book. Both Vonnegut and I realize that the future may well exceed our imaginations, but our presents retain the gifts of days past.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

May 02, 2011 3:52 PM

Thinking and Doing in the Digital Age

Last week, someone I follow tweeted this link in order to share this passage:

You will be newbie forever. Get good at the beginner mode, learning new programs, asking dumb questions, making stupid mistakes, soliciting help, and helping others with what you learn (the best way to learn yourself).

That blog entry is about the inexorable change of technology in the modern world and how, if we want to succeed in this world, we need a mindset that accommodates change. We might even argue that we need a mindset that welcomes or seeks out change. To me, this is one of the more compelling reasons for us to broaden the common definition of the liberal arts to include computing and other digital forms of communication.

As much as I like the quoted passage, I liked a couple of others as much or more. Consider:

Understanding how a technology works is not necessary to use it well. We don't understand how biology works, but we still use wood well.

As we introduce computing and other digital media to more people, we need to balance teaching how to use new ideas and techniques and teaching underlying implementations. Some tools change how we work without us knowing how they work, or needing to know. It's easy for people like me to get so excited about, say, programming that we exaggerate its importance. Not everyone needs to program all the time.

Then again, consider this:

The proper response to a stupid technology is to make a better one yourself, just as the proper response to a stupid idea is not to outlaw it but to replace it with a better idea.

In the digital world as in the physical world, we are not limited by our tools. We can change how our tools work, through configuration files and scripts. We can make our own tools.

Finally, an aphorism that captures differences between how today's youth think about technology and how people my age often think (emphasis added):

Nobody has any idea of what a new invention will really be good for. To evaluate, don't think; try.

This has always been true of inventions. I doubt many people appreciated just how different the world would be after the creation of the automobile or the transistor. But with digital tools, the cost of trying things out has been driven so low, relative to the cost of trying things in the physical world, that the cost is effectively zero. In so many situations now, the net value of trying things exceeds the net value of thinking.

I know that sounds strange, and I certainly don't mean to say that we should all just stop thinking. That's the sort of misinterpretation too many people made of the tenets of extreme programming. But the simple fact is, thinking too much means waiting too long. While you are thinking -- waiting to start -- someone else is trying, learning faster, and doing things that matter.

I love this quote from Elisabeth Hendrickson, who reminded herself of the wisdom of "try; don't think" when creating her latest product:

... empirical evidence trumps speculation. Every. Single. Time.

The scientific method has been teaching us the value of empiricism over pure thought for a long time. In the digital world, the value is even more pronounced.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

April 19, 2011 6:04 PM

A New Blog on Patterns of Functional Programming

(... or, as my brother likes to say about re-runs, "Hey, it's new to me.")

I was excited this week to find, via my Twitter feed, a new blog on functional programming patterns by Jeremy Gibbons, especially an entry on recursion patterns. I've written about recursion patterns, too, though in a different context and for a different audience. Still, the two pieces are about a common phenomenon that occurs in functional programs.

I poked around the blog a bit and soon ran across articles such as Lenses are the Coalgebras for the Costate Comonad. I began to fear that the patterns on this blog would not be able to help the world come to functional programming in the way that the Gang of Four book helped the world come to object-oriented programming. As difficult as the GoF book was for every-day programmers to grok, it eventually taught them much about OO design and helped to make OO programming mainstream. Articles about coalgebras and the costate comonad are certainly of value, but I suspect they will be most valuable to an audience that is already savvy about functional programming. They aren't likely to reach every-day programmers in a deep way or help them learn The Functional Way.

But then I stumbled across an article that explains OO design patterns as higher-order datatype-generic programs. Gibbons didn't stop with the formalism. He writes:

Of course, I admit that "capturing the code parts of a pattern" is not the same as capturing the pattern itself. There is more to the pattern than just the code; the "prose, pictures, and prototypes" form an important part of the story, and are not captured in a HODGP representation of the pattern. So the HODGP isn't a replacement for the pattern.

This is one of the few times that I've seen an FP expert speak favorably about the idea that a design pattern is more than just the code that can be abstracted away via a macro or a type class. My hope rebounds!

There is work to be done in the space of design patterns of functional programming. I look forward to reading Gibbons's blog as he reports on his work in that space.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

April 14, 2011 10:20 PM

Al Aho, Teaching Compiler Construction, and Computational Thinking

Last year I blogged about Al Aho's talk at SIGCSE 2010. Today he gave a major annual address sponsored by the CS department at Iowa State University, one of our sister schools. When former student and current ISU lecturer Chris Johnson encouraged me to attend, I decided to drive over for the day to hear the lecture and to visit with Chris.

Aho delivered a lecture substantially the same as his SIGCSE talk. One major difference was that he repackaged it in the context of computational thinking. First, he defined computational thinking as the thought processes involved in formulating problems so that their solutions can be expressed as algorithms and computational steps. Then he suggested that designing and implementing a programming language is a good way to learn computational thinking.

With the talk so similar to the one I heard last year, I listened most closely for additions and changes. Here are some of the points that stood out for me this time around, including some repeated points:

  • One of the key elements for students when designing a domain-specific language is to exploit domain regularities in a way that delivers expressiveness and performance.
  • Aho estimates that humans today rely on somewhere between 0.5 and 1.0 trillion lines of software. If we assume that the total cost associated with producing each line is $100, then we are talking about a most serious investment. I'm not sure where he found the $100/LOC number, but...
  • Awk contains a fast, efficient regular expression matcher. He showed a figure from the widely read Regular Expression Matching Can Be Simple And Fast, with a curve showing Awk's performance -- quite close to the Thompson NFA curve from the paper. Algorithms and theory do matter. (See the sketch just after this list.)
  • It is so easy to generate compiler front ends these days using good tools in nearly every implementation language. This frees up time in his course for language design and documentation. This is a choice I struggle with every time I teach compilers. Our students don't have as strong a theory background as Aho's do when they take the course, and I think they benefit from rolling their own lexers and parsers by hand. But I'm tempted by what we could do with the extra time, including processing a more compelling source language and better coverage of optimization and code generation.
  • An automated build system and a complete regression test suite are essential tools for compiler teams. As Aho emphasized in both talks, building a compiler is a serious exercise in software engineering. I still think it's one of the best SE exercises that undergrads can do.
  • The language for quantum computing looks cool, but I still don't understand it.
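
As an aside, it is easy to demonstrate for yourself why the engine behind a regex matcher matters. The little Python sketch below is mine, not Aho's: it times Python's backtracking re module on the pathological pattern from that paper -- n optional a's followed by n required a's, matched against a string of n a's. The time roughly doubles with each step, while a Thompson-style NFA simulation handles the same inputs in linear time.

    # Time Python's backtracking regex engine on the pathological pattern
    # from "Regular Expression Matching Can Be Simple And Fast".
    # Each increment of n roughly doubles the running time.
    import re
    import time

    for n in range(16, 24):
        pattern = 'a?' * n + 'a' * n    # e.g., n = 3 gives a?a?a?aaa
        text = 'a' * n
        start = time.time()
        re.match(pattern, text)         # matches, but only after heavy backtracking
        print(n, round(time.time() - start, 3), 'seconds')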

After the talk, someone asked Aho why he thought functional programming languages were becoming so popular. Aho's answer revealed that he, like any other person, has biases that cloud his views. Rather than answering the question, he talked about why most people don't use functional languages. Some brains are wired to understand FP, but most of us are wired for, and so prefer, imperative languages. I got the impression that he isn't a fan of FP and that he's glad to see it lose out in the social Darwinian competition among languages.

If you'd like to see an answer to the question that was asked, you might start with Guy Steele's StrangeLoop 2010 talk. Soon after that talk, I speculated that documenting functional design patterns would help ease FP into the mainstream.

I'm glad I took most of my day for this visit. The ISU CS department and chair Dr. Carl Chang graciously invited me to attend a dinner this evening in honor of Dr. Aho and the department's external advisory board. This gave me a chance to meet many ISU CS profs and to talk shop with a different group of colleagues. A nice treat.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

April 06, 2011 7:10 PM

Programming for All: The Programming Historian

Knowing how to program is crucial for doing advanced research with digital sources.

This is the opening line of William Turkel's how-to for digital researchers in the humanities, A Workflow for Digital Research Using Off-the-Shelf Tools. This manual helps such researchers "Go Digital", which is, he says, the future in disciplines such as history. Turkel captures one of the key changes in mindset that faces humanities scholars as they make the move to the digital world:

In traditional scholarship, scarcity was the problem: travel to archives was expensive, access to elite libraries was gated, resources were difficult to find, and so on. In digital scholarship, abundance is the problem. What is worth your attention or your trust?

Turkel suggests that programs -- and programming -- are the only way to master the data. Once you do master the data, you can be an even more productive researcher than you were in the paper-only world.

I haven't read any of Turkel's code yet, but his how-to shows a level of sophistication as a developer. I especially like that his first step for the newly-digital is:

Start with a backup and versioning strategy.

There are far too many CS grads who need to learn the version control habit, and a certain CS professor has been bitten badly by a lack of back-up strategy. Turkel wisely makes this Job #1 for digital researchers.

The how-to manual does not currently have a chapter on programming itself, but he does talk about using RSS feeds to do work for you and about measuring and refactoring constantly -- though at this point he is talking about one's workflow, not one's programs. Still, it's a start.

As soon as I have some time, I'm going to dig into Turkel's The Programming Historian, "an open-access introduction to programming in Python, aimed at working historians". I think there is a nice market for many books like this.

This how-to pointed me toward a couple of tools I might add to my own workflow. One is Feed43, a web-based tool to create RSS feeds for any web page (an issue I've discussed here before). On first glance, Feed43 looks a little complex for beginners, but it may be worth learning. The manual also reminded me of Mendeley, an on-line reference manager. I've been looking for a new tool to manage bibliographies, so I'll give it another look.

But the real win here is a path for historians into the digital world and then into programming -- because it makes historians more powerful at what they do. Superhuman strength for everyone!


Posted by Eugene Wallingford | Permalink | Categories: Computing

March 24, 2011 10:23 PM

Teachers and Programming Languages as Permission Givers

Over spring break, I read another of William Zinsser's essays at The American Scholar, called Permission Givers. Zinsser talks about the importance of people who give others permission to do, to grow, and to explore, especially in a world that offers so many freedoms but is populated with people and systems that erect barriers at every turn.

My first reaction to the paper was as a father. I have recognized our elementary and high schools as permission-denying places in a way I didn't experience them as a student myself, and I've watched running the gauntlet of college admissions cause a bright, eager, curious child to wonder whether she is good enough after all. But my rawest emotions were fear and hope -- fear that I had denied my children permission too often, and hope that on the whole I had given them permission to do what they wanted to do and become who they can be. I'm not talking about basic rules; some of those are an essential part of learning discipline and even cultivating creativity. I mean encouraging the sense of curiosity and eagerness that happy, productive people carry through life.

The best teachers are permission givers. They show students some of what is possible and then create conditions in which students can run with ideas, put them together and take them apart, and explore the boundaries of their knowledge and their selves. I marvel when I see students creating things of beauty and imagination; often, there is a good teacher to be found there as well. I'm sad whenever I see teachers who care deeply about students and learning but who sabotage their students' experience by creating "a long trail of don'ts and can'ts and shouldn'ts", by putting subtle roadblocks along the path of advancement.

I don't think that by nature I am a permission giver, but over my career as a teacher I think I've gotten better. At least now I am more often aware of when I'm saying 'no' in subtle and damaging ways, so that I can change my behavior, and I am more often aware of the moments when the right words can help a student create something that matters to them.

In the time since I read the essay, another strange connection formed in my mind: Some programming languages are permission givers. Some are not.

Python is a permission giver. It doesn't erect many barriers that get in the way of the novice, or even the expert, as she explores ideas. Ruby is a permission giver, too, but not to the extent that Python is. It is sufficiently more complex, syntactically and semantically, that things don't always work the way one first suspects. As a programmer, I prefer Ruby for the expressiveness it affords me, but I think that Python is the more empowering language for novices.

Simplicity and consistency seem to be important features of permission-giving languages, but they are probably not sufficient. Another of my favorite languages, Scheme, is simple and offers a consistent model of programming and computation, but I don't think of it as a permission giver. Likewise Haskell.

I don't think that the tired argument between static typing and dynamic typing is at play here. Pascal had types but it was a permission giver. Its descendant Ada, not so much.

I know many aficionados of other languages often feel differently. Haskell programmers will tell me that their language makes them so productive. Ada programmers will tell me how their language helps them build reliable software. I'm sure they are right, but it seems to me there is a longer learning curve before some languages feel like permission givers to most people.

I'm not talking about type safety, power, or even productivity. I'm talking about the feeling people have when they are deep in the flow of programming and reach out for something they want but can't quite name... and there it is. I admit, too, that I also have beginners in mind. Students who are learning to program, more than experts, need to be given permission to experiment and persevere.

I also admit that this idea is still new in my mind and is almost surely colored heavily by my own personal experiences. Still, I can't shake the feeling that there is something valuable in this notion of language as permission giver.

~~~~

If nothing else, Zinsser's essay pointed me toward a book I'd not heard of, Michelle Feynman's Reasonable Deviations from the Beaten Track, a collection of the personal and professional letters written by her Nobel Prize-winning father. Even in the most mundane personal correspondence, Richard Feynman tells stories that entertain and illuminate. I've only begun reading and am already enjoying it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 22, 2011 4:45 PM

Encounters with Large Numbers and the Limits of Programs

Yesterday, this xkcd illustration of radiation doses made the rounds on Twitter. My first thought was computational: this graph is a great way to help students see what "order of magnitude" means and how the idea matters to our understanding of a real-world phenomenon.

Late yesterday afternoon, one of my colleagues stopped by to describe a Facebook conversation he had been having with a few of our students, and in particular one of our better students. This student announced that he was going to write a program to generate all possible brackets for the men's NCAA basketball tournament. My colleague said, "Um, there are 2 to the 67th power brackets", to which the student responded, "Yeah, I know, that's why I'm going to write a program. There are too many to do by hand." From this followed a discussion of just how big 2**67 is and how long it would take a program to generate all the brackets. Even using a few heuristics to trim the problem down, such as always picking a 1-seed to beat a 16-seed, the number is astronomically large. (Or, as Richard Feynman suggests, "economically large".)
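
A few lines of code make the scale concrete. This is only a back-of-the-envelope sketch, and the rate of a billion brackets per second is my own (very generous) assumption:

    # How long would it take to enumerate every possible NCAA bracket?
    # Assumes a generous rate of one billion brackets per second.
    brackets = 2 ** 67                       # one outcome per game, 67 games
    rate = 10 ** 9                           # brackets per second (assumed)
    seconds = brackets / rate
    years = seconds / (60 * 60 * 24 * 365)
    print(f"{brackets:.3e} brackets would take about {years:,.0f} years")

Even under those assumptions, the program needs nearly five thousand years -- a fine way to feel what "astronomically large" means.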

Sometimes, even good students can gain a better understanding of a concept by encountering it in the wild. This is perhaps even more often true when the idea is unintuitive or beyond our usual experience.


Posted by Eugene Wallingford | Permalink | Categories: Computing

March 10, 2011 9:21 PM

SIGCSE Day 2 -- Limited Exposure

For a variety of reasons, I am scheduled for only two days at SIGCSE this year. I did not realize just how little time that is until I arrived and started trying to work in all the things I wanted to do: visit the exhibits, attend a few sessions and learn a new thing or two, and -- most important -- catch up with several good friends.

It turns out that's hard to do in a little more than a day. Throw in a bout of laryngitis in the aftermath of a flu-riddled week, and the day passed even more quickly. Here are a few ideas that stood out from sessions on either end of the day.

Opening Keynote Address

Last March I blogged about Matthias Felleisen winning ACM's Outstanding Educator Award. This morning, Felleisen gave the opening address for the conference, tracing the evolution of his team's work over the last fifteen years in a smooth, well-designed talk. One two-part idea stood out for me: design a smooth progression of teaching languages that are neither subset nor superset of any particular industrial-strength language, then implement them, so that your tools can support student learning as well as possible.

Matthias's emphasis on the smooth progression reminds me of Alan Kay's frequent references to the fact that English-speaking children learn the same language used by Shakespeare to write our greatest literature, growing into it over time. One of his goals for Smalltalk, or whatever replaces it, is a language that allows children to learn programming and grow smoothly into more powerful modes of expression as their experience and cognitive skills grow.

Two Stories from Scratch

At the end of the day, I listened in on a birds-of-a-feather session about Scratch, mostly in K-12 classrooms. One HS teacher described how his students learn to program in Scratch and then move onto a "real language". As they learn concepts and vocabulary in the new language, he connects the new terms back to their concrete experiences in Scratch. This reminded me of a story in one of Richard Feynman's books, in which he outlines his father's method of teaching young Richard science. He didn't put much stock in learning the proper names of things at first, instead helping his son to learn about how things work and how they relate to one another. The names come later, after understanding. One of the advantages of a clean language such as Scratch (or one of Felleisen's teaching languages) is that it enables students to learn powerful ideas by using them, not by memorizing their names in some taxonomy.

Later in the session, Brian Harvey told the story of a Logo project conducted back in the 1970s, in which each 5th-grader in a class was asked to write a Logo program to teach a 3rd-grader something about fractions. An assignment so wide open gave every student a chance to do something interesting, whatever they themselves knew about fractions. I need to pull this trick out of my teaching toolbox a little more often.

(If you know of a paper about this project, please send me a pointer. Thanks.)

~~~~

There is one unexpected benefit of a short stay: I am not likely to leave any dynamite blog posts sitting in the queue to be written, unlike last year and 2008. Limited exposure also limits the source of triggers!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 09, 2011 11:31 PM

SIGCSE Day 1 -- Innovative Approaches for Introducing CS

SIGCSE 2011 in Dallas, Texas

I'm in Dallas for a couple of days for SIGCSE 2011. I owe my presence to Jeff Forbes and Owen Astrachan, who organized a pre-conference workshop on innovative approaches for introducing computer science and provided support for its participants, courtesy of their NSF projects.

The Sheraton Dallas is a big place, and I managed to get lost on the way to the workshop this morning. As I entered the room fifteen minutes late, Owen was just finishing up talking about something called the Jinghui Rule. I still don't know what it is, but I assume it had something to do with us not being able to use our laptops during much of the day. This saves you from reading a super-long breakdown of the day, which is just as well. The group will produce a report soon, and I'm sure Jeff and Owen will do a more complete job than I might -- not least of which because we all produced summaries of our discussion throughout the day, presented them to the group as a whole, and submitted them to our leaders for their use.

The topics we discussed were familiar ones, including problems, interdisciplinary approaches, integrative approaches, motivating students, and pedagogical issues. Even still, the discussions were often fresh, as most everyone in the room wrestles with these topics in the trenches and is constantly trying new things.

I did take a few notes the old-fashioned way about some things that stood out to me:

  • Owen captured the distinction between "interdisciplinary" and "integrative" well; here is my take. Interdisciplinary approaches pull ideas from other areas of study into our CS courses as a way to illustrate or motivate ideas. Integrative approaches push CS techniques out into courses in other areas of study where they become a native part of how people in those disciplines work.
  • Several times during the day people mentioned the need to "document best practices" of various sorts. Joe Bergin was surely weeping gently somewhere. We need more than disconnected best practices; we need a pattern language or two for designing certain kinds of courses and learning experiences.
  • Several times during the day talk turned to what one participant termed student-driven discovery learning. Alan Kay's dream of an Exploratorium never strays far from my mind, especially when we talk about problem-driven learning. We seem to know what we need to do!
  • A group of us discussed problems and big data in a "blue sky" session, but the talk was decidedly down-to-earth: the need to format, sanitize, and package data sets for use in the classroom.
  • One of the biggest challenges we face is the invisibility of computing today. Most everyone at the workshop today views computing's ubiquity as a great opportunity, and I often feel the same way. But I fear the reality is that, for most everyone else, computing has disappeared into the background noise of life. Convincing them that it is cool to understand how, say, Facebook works may be a tougher task than we realize.

Finally, Ge Wang demoed some of the cool things you can do with an iPhone using apps like those from Smule. Wow. That was cool.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

February 24, 2011 4:17 PM

What are Java, C++, and Prolog?

That's the correct question for the clue, "The languages used to write Watson", courtesy of the IBM Watson team answering questions in the usual way on Reddit. I'm looking forward to learning more about how Watson operates at the software level, especially how it handles the uncertainty of multiple knowledge sources working in parallel on imprecise data.

I meet with prospective students and their parents frequently as part of on- and off-campus recruiting events. Last week and this are Up Close Days on campus, when students who have been admitted to the university but not yet accepted the offer visit in hopes of making their decision. I usually try to use part of our time together to help them see what computer science is and what graduates of our program can do, because so few of either the students or parents have any idea. With the match on my mind for teaching and research, it occurred to me that the timing of the match offered a perfect opportunity.

Here was a story that everyone had heard about, the kind that captures the attention of even the least technically minded among us: a battle of man versus machine in an arena that has historically been ours alone. Watson is not only a triumph of computer science; it covers the full spectrum of what we do, from racks of servers and an open-source operating system, through algorithms and data structures and programs that embody them, to the dreams of artificial intelligence and the promise of applications that help people all over the world. So I built a short presentation around the match, showing short clips from the show, offering commentary at key moments, taking wide-eyed questions at any time, and tying what the program was doing to what computer science is and what computer science graduates do.

Last week's first session went very well. If nothing else, all of our visitors could see my excitement at what we do and what remains to be done. Sharing the thrill, indeed!

Yesterday, Lance Fortnow tweeted:

Asked: What is Computer Science? Answered: Everything that happens after you ask a question to Google until you get a result.

I don't know what all the future implications of Watson's performance will be for AI, CS, or the world. I do know that, for now, the match has given us a compelling way to talk to others about computer science ripped right from the headlines and our TV screens. When we can connect our story to something like Google, we have a chance of helping others to get what we do. When the story connects to something that has already excited our audience, our job is perhaps a little bit easier.


Posted by Eugene Wallingford | Permalink | Categories: Computing

February 18, 2011 4:51 PM

The Dawn of a New Age

Jeopardy! champ Ken Jennings passes the baton to Watson

My world this week has been agog with the Watson match on Jeopardy!. The victory by a computer over the game's two greatest champions may well signal a new era, in much the same way as the rise of the web and the advent of search. I'd like to collect my thoughts before writing anything detailed about the match itself.

In anticipation of the match, last week my Intelligent Systems students and I began to talk about some of the techniques that were likely being used by Watson under the hood. When I realized how little they had learned in their AI course about reasoning in the face of uncertainty, I dusted off an old lecture from my AI course, circa 2001. With a few tweaks and additions, it held up well. Among its topics were probability and Bayes' Law. In many ways, this material is more timely today than it was then.
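
For readers who would rather see Bayes' Law at work than just hear it named, here is a toy sketch in the spirit of that lecture. The candidate answers, priors, and likelihoods are all invented for illustration; the point is only the mechanics of re-ranking candidates after seeing one piece of evidence, which is the flavor of reasoning Watson's confidence bars hint at.

    # Bayes' Law on a toy question-answering example. All numbers are invented.
    priors = {'Chicago': 0.50, 'Toronto': 0.30, 'Detroit': 0.20}        # P(candidate)
    likelihoods = {'Chicago': 0.80, 'Toronto': 0.10, 'Detroit': 0.40}   # P(evidence | candidate)

    unnormalized = {c: priors[c] * likelihoods[c] for c in priors}
    total = sum(unnormalized.values())
    posteriors = {c: p / total for c, p in unnormalized.items()}        # P(candidate | evidence)

    for candidate, p in sorted(posteriors.items(), key=lambda kv: -kv[1]):
        print(f"{candidate:8s} {p:.3f}")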

Early in the week, as I began to think about what this match would mean for computing and for the world, I was reminded by Peter Norvig's The Machine Age that, in so many ways, the Jeopardy! extravaganza heralds a change already in progress. If you haven't read it yet, you should.

The shift from classical AI to the data-driven AI that underlies the advances Norvig lists happened while I was in graduate school. I saw glimpses of it at conferences on expert systems in finance and accounting, where the idea of mining reams of data seemed to promise new avenues for decision makers in business. The data might be audited financial statements of public corporations or, more tantalizing, collected from grocery store scanners. But I was embedded pretty deeply in a particular way of thinking about AI, and I missed the paradigm shift.

What is most ironic for me is that my own work, which involved writing programs that could construct legal arguments using both functional knowledge of a domain and case law, was most limited by the problem that we see addressed in systems like Google and Watson: the ability to work with the staggering volume of text that makes up our case law. Today's statistical techniques for processing language make extending my work and seeing how well it works in large domains possible in a way I could only dream of. And I never even dreamed of doing it myself, such was my interest in classical AI. (I just needed a programmer, right?)

There is no question that data-driven, statistical AI has proven to be our most successful way to build intelligent systems, particularly those of a certain scale. As an engineering approach, it has won. It may well turn out to be the best scientific approach to understanding intelligence as well, but... For an old classicist like me, there is something missing. Google is more idiot savant than bon vivant; more Rain Man than Man on the Street.

There was a telling scene in one of the many short documentary films about Watson in which the lead scientist on the project said something like, "People ask me if I know why Watson gave the answer it did. I don't know how it got the answer right. I don't know how it got the answer wrong." His point was that Watson is so complex and so data-rich it can surprise us with its answers. This is true, of course, and an important revelation to many people who think that, because a computer can only do what we program it to do, it can never surprise or create.

But immediately I was thinking about another sense of that response. When Watson asks "What is Toronto?" in a Final Jeopardy! on "U.S. Cities", I want to ask it, "What in the world were you thinking?" If it were a human player, it might be able to tell me about its reasoning process. I'd be able to learn from what it did right or wrong. But if I ask most computer programs based on statistical computations over large data sets "Why?" I can't get much more than the ranked lists of candidates we saw at the bottom of the screen on Jeopardy!

There is still a romantic part of me that wants to understand what it means to think and reason at a conscious level. Perhaps my introspection misleads me with explanations constructed post hoc, but it sure seems like I am able to think at a level above my neurons firing. That feeling is especially strong when I perform more complex tasks, such as writing a program or struggling to understand a new idea.

So, when it comes to the science of AI, I still hope for more. Maybe our statistical systems will become complex and data-rich enough that they will be able to explain their reasoning in a meaningful way. It's also possible that my desire is nothing more than a form of chauvinism for my own species, that the desire to sit around and talk about stuff, including how and why we think the way we do, is a quaint feature peculiar to humans. I don't know the answer to this question, but I can't shake the deep belief in an architecture of thought and intelligent behavior that accounts for metacognition.

In any case, it was fun to watch Watson put on such an impressive show!

~~~~

If you are interested in such things, you might want to (re-)read Newell and Simon's 1975 Turing Award lecture, Computer Science as Empirical Inquiry: Symbols and Search. It's a classic of the golden era of AI.


Posted by Eugene Wallingford | Permalink | Categories: Computing

February 11, 2011 4:15 PM

DIY Empirical Analysis Via Scripting

In my post on learning in a code base, I cited Michael Feathers's entry on measuring the closure of code. Michael's entry closes with a postscript:

It's relatively easy to make these diagrams yourself. grep the log of your VCS for lines that depict adds and modifications. Strip everything except the file names, sort it, run it through 'uniq -c', sort it again, and then plot the first column.

Ah, the Unix shell. A few years ago I taught a one-unit course on bash scripting, and I used problems like this as examples in class. Many students are surprised to learn just how much you can do with a short pipeline of Unix commands, operating on plain text data pulled from any source.

You can also do this sort of thing almost as easily in a more full-featured scripting language, such as Python or Ruby. That is one reason languages like them are so attractive to me for teaching programming in context.
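
Here is a rough Python equivalent of Feathers's pipeline, offered only as a sketch. It assumes a Git repository and that git is on the path; adapt the log command for your own VCS.

    # Count how many commits touch each file in a Git repository and list
    # the most frequently modified files -- the raw data for Feathers's plot.
    import subprocess
    from collections import Counter

    log = subprocess.run(['git', 'log', '--name-only', '--pretty=format:'],
                         capture_output=True, text=True, check=True).stdout

    counts = Counter(line for line in log.splitlines() if line.strip())

    for filename, n in counts.most_common(20):
        print(f"{n:5d}  {filename}")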

Of course, using a powerful, fun language in CS1 creates a new set of problems for us. A while back, a CS educator on the SIGCSE mailing list pointed out one:

Starting in Python postpones the discovery that "CS is not for me".

After years of languages such as C++, Java, and Ada in CS1, which hastened the exit of many a potential CS major, it's ironic that our new problem might be students succeeding too long for their own good. When they do discover that CS isn't for them, they will be stuck with the ability to write scripts and analyze data.

With all due concern for not wasting students' time, this is a problem we in CS should willingly accept.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

February 10, 2011 4:04 PM

This and That: Problems, Data, and Programs

Several articles caught my eye this week which are worth commenting on, but at this point none has triggered a full entry of its own. Some of my favorite bloggers do what they call "tab sweeps", but I don't store cool articles in browser tabs. I cache URLs and short notes to myself. So I'll sweep up three of my notes as a single entry, related to programming.

Programmer as Craftsman

Seth Godin writes about:

... the craftsperson, someone who takes real care and produces work for the ages. Everyone else might be a hack, or a factory guy or a suit or a drone, but a craftsperson was someone we could respect.

There's a lot of talk in the software development world these days about craftsmanship. All the conversation and all the hand-waving boil down to this. A craftsman is the programmer we all respect and the programmer we all want to be.

Real Problems...

Dan Meyer is an erstwhile K-12 math teacher who rails against the phony problems we give kids when we ask them to learn math. Textbooks do so in the name of "context". Meyer calls it "pseudocontext". He gives an example in his entry Connect These Two Dots, and then explains concisely what is wrong with pseudocontext:

Pseudocontext sends two signals to our students, both false:
  • Math is only interesting in its applications to the world, and
  • By the way, we don't have any of those.

Are we really surprised that students aren't motivated to practice and develop their craft on such nonsense? Then we do the same things to CS students in our programming courses...

... Are Everywhere These Days

Finally, Greg Wilson summarizes what he thinks "computational science" means in one of his Software Carpentry lessons. It mostly comes down to data and how we understand it:

It's all just data.

Data doesn't mean anything on its own -- it has to be interpreted.

Programming is about creating and composing abstractions.

...

The tool shapes the hand.

We drown in data now. We collect it faster than we can understand it. There is room for more programmers, better programmers, across the disciplines and in CS.

We certainly shouldn't be making our students write Fahrenheit-to-Celsius converters or processing phony data files.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

February 07, 2011 9:03 PM

Teaching and Learning in a Code Base

In a pair of tweets today, Brian Marick offered an interesting idea for designing instruction for programmers:

A useful educational service: examine a person's codebase. Devise a new feature request that would be hard, given existing code and skill...

... Keep repeating as the codebase and skill improve. Would accelerate a programmer's skill at dealing with normal unexpected change.

This could also be a great way to help each programmer develop competencies that are missing from his or her skill set. I like how this technique would create an individualized learning experience for each student. The cost, of course, is in the work needed by the instructor to study the codebases and devise the feature requests. With a common set of problems to work on, over time an instructor might be able to develop a checklist of (codebase characteristic, feature request) pairs that covered a lot of the instructional space. This idea definitely deserves some more thought!

Of course, we can sometimes analyze valuable features of a codebase with relatively simple programs. Last month, Michael Feathers blogged about measuring the closure of code, in which he showed how we can examine the Open/Closed Principle in a codebase by extracting and plotting the per-file commit frequencies of source files in a project's version control repository. Feathers discussed how developers could use this information intentionally to improve the quality of their code. I think this sort of analysis could be used to great effect in the classroom. Students could see the OCP graphically for a number of projects and, combined with their programming knowledge of the projects, begin to appreciate what the OCP means to a programmer.

A serendipitous side effect would be for students to experience CS as an empirical discipline. This would help us prepare developers who, like Feathers, use analytical data in their practice, as well as CS grads who understand the ways in which CS can and should be an empirical endeavor.

I actually blogged a bit about studying program repositories last semester, for the purpose of understanding how to design better programming languages. That work used program repositories for research purposes. What I like about Marick's and Feathers's recent ideas is that they bring to mind how studying a program repository can aid instruction, too. This didn't occur to me so much back when one of my grad students studied relationships among open-source software packages with automated analysis of a large codebase. I'm glad to have received a push in that direction now.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

February 03, 2011 3:30 PM

Science and Engineering in CS

A long discussion on the SIGCSE members listserv about math requirements for CS degrees has drifted, as most curricular discussions seem to do, to "What is computer science?" Somewhere along the way, someone said, "Computer Science *is* a science, by name, and should therefore be one by definition". Brian Harvey responded:

The first thing I tell my intro CS students is "Computer Science isn't a science, and it isn't about computers." (It should be called "information engineering.")

I think that this assertion is wrong, at least without a couple of "only"s thrown in, but it is a great way to start a conversation with students.

I've been seeing the dichotomy between CS as science and CS as system-building again this semester in my Intelligent Systems course. The textbook my students used in their AI course last semester is, like nearly every undergrad AI text, primarily an introduction to the science of AI: a taxonomy of concepts, results of research that help to define and delimit the important ideas. It contains essentially no pragmatic results for building intelligent systems. Sure, students learn about state-space search, logic as a knowledge representation, planning, and learning, along with algorithms for the basic methods of the field. But they are not prepared for the fact that, when they try to implement search or logical inference for a given problem, they still have a huge amount of work to do, with little guidance from the text.

In class today, we discussed this gap in two contexts: the gap one sees between low-level programming and high-level programming languages, and the difference between general-purpose languages and domain-specific languages.

My students seemed to understand my point of view, but I am not sure they really grok it. That happens best after they gain experience writing code and feel the gap while making real systems run. This is one of the reasons I'm such a believer in projects, real problems, and writing code. We don't always understand ideas until we see them in concrete form.

I don't imagine that intro CS students have any of the experience they need to understand the subtleties academics debate about what computer science is or what computer scientists do. We are almost surely better off asking them to do something that matters to them, whether a small problem or a larger project. In these problems and projects, students can learn from us and from their own work what CS is and how computer scientists think.

Eventually, I hope that the students writing large-ish AI programs in my course this semester learn just how much more there is to writing an intelligent system than just implementing a general-purpose algorithm from their text. The teams that are using pre-existing packages as part of their system might even learn that integrating software systems is "more like performing a heart transplant than snapping together LEGO blocks". (Thanks to John Cook for that analogy.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

January 31, 2011 4:12 PM

Side Thoughts on Tasks, Methods, and Intelligent Systems

I've been enjoying teaching Intelligent Systems again this semester. A big part of the fun is re-reading old papers. I had a lot of fun re-reading and thinking about the ideas in William Clancey's classic paper, Heuristic Classification. Long-time readers may recall me mentioning this paper in a long-ago post about one of computing's basic concepts, the unity of data and program.

My students didn't read this paper, but they did read a chapter from Peter Jackson's Introduction to Expert Systems that discusses it in some detail. The chapter is called "Heuristic Classification", too, but it's really about much more: the distinction between tasks and methods, the distinction between analytic tasks and synthesis tasks, and Clancey's heuristic classification method itself.

This one paper set me off on a mini-lecture that could have been much longer. To me, it felt like a technical reminiscence, a story of ideas that set the stage for my own work on design patterns of knowledge-based systems. Eventually, I had to cut my story short, both in the interest of having time to do other things in class that day and in the interest of not boring students with my own deep fascination with these topics. I hope I said enough to give them a sense of how important these ideas are and to whet the appetite of any student who might have his or her own fascination with the ideas.

Our reading and our conversation that day brought to my mind a rich set of ideas for students of intelligent systems to think about, such as:

  • The distinction between task (what) and method (how) is, at one level, no big deal. Students begin learning about this in their intro course, see its idea more clearly in their data structures course, and study the lifecycle issues involved with both in Software Engineering. But understanding the specification for a software system is often a challenge; it's easy to focus on the wrong task. The problem is made more difficult in many AI software projects because we are working in domains we understand less well, on problems at the edge of our intellectual abilities.
  • Many problems in the world are really composites of other, more primitive tasks. For example, a tutoring system must diagnose "bugs" in a student's thinking, develop a plan to fix the bug, and monitor student performance while solving problems. When we decompose a task in this way, we are in effect talking about a method for performing the task, but at a level that commits only to the exchange of information among high-level modules.
  • Where do learning systems fit into our classification of tasks? In one sense, a learning system "goes meta", because it creates a system capable of performing one of the tasks in the taxonomy. In another, though, it, too, is performing a task. But which one?

Many of these ideas go beyond my students' current interests, precisely because they are at the beginning of their studies, focused on building a particular system. They may find their interest in these ideas piqued after they build one of these systems, or three, and realize just how many degrees of freedom they face when writing a program that behaves "intelligently".


Posted by Eugene Wallingford | Permalink | Categories: Computing

December 15, 2010 4:30 PM

You May Be A Programmer If ... #1

... you get to the code generation part of the compiler course you are teaching. You realize that you have forgotten the toy assembly language from the textbook for its toy virtual machine, so you have to relearn it. You think, "It's gonna be fun to write programs in this language again."

Assembly language? A toy assembly language? Really?

I just like to program. Besides, after programming in Klein this semester, a language I termed an integer assembly language, moving to a very RISC-like assembly language doesn't seem like that big a step down. Assembly language can be fun, though I don't think I'd want to program in it for a living!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

December 10, 2010 3:43 PM

Mean Time to Competence and Mean Time to Mastery

I'm on a mailing list with a bunch of sports fans, many of whom are also CS academics and techies. Yesterday, one of the sysadmins posted a detailed question about a problem he was having migrating a Sun Solaris server to Ubuntu, due to a conflict between default /etc/group values on the two operating systems. For example, the staff group has a value of 50 on Ubuntu and a value of 10 on Solaris. On Ubuntu, the value 10 identifies the uucp group.

Another of the sysadmins on the list wrote an even more detailed answer, explaining how group numbers are supposed to work in Unix, describing an ideal solution, and outlining a couple of practical approaches involving files such as /etc/nsswitch.conf.

After I read the question and the answer, I sent my response to the group:

Thank goodness I'm a CS professor and don't have to know how all this works.

I was joking, of course. Some people love to talk about how CS profs are disconnected from computing in the real world, and this is the sort of real-world minutia that CS profs might not know, or even have to know if they teach courses on algorithms, intro programming, or software engineering. After seeing my friends' exchange and seeing all that Unix guru-speak, I was happy to play to the stereotype.

Of course, the numbers used to implement Unix group numbers really are minutia and something only practicing sysadmins would need to know. The professor who teaches our systems courses is deeply versed in these details, as are the prof or two who manage servers for their courses and research. There certainly are CS academics divorced from reality, but you can say that of any group of people. Most know what they need to know, and a bit more.

Later in the day, I became curious about the problem my friends had discussed, so I dug in, studied a bit, and came to understand the problem and candidate solutions. Fun stuff.
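
For what it's worth, a few lines of script can expose the sort of conflict my friend ran into before a migration goes wrong. This is only a sketch, and the two file paths are placeholders for copies of each system's /etc/group file:

    # Report group names whose numeric IDs differ between two /etc/group files.
    # The paths below are placeholders; point them at the real files.
    def read_groups(path):
        groups = {}
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith('#'):
                    continue
                name, _passwd, gid, _members = line.split(':', 3)
                groups[name] = int(gid)
        return groups

    solaris = read_groups('solaris-etc-group')
    ubuntu = read_groups('ubuntu-etc-group')

    for name in sorted(solaris.keys() & ubuntu.keys()):
        if solaris[name] != ubuntu[name]:
            print(f"{name}: Solaris gid {solaris[name]}, Ubuntu gid {ubuntu[name]}")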

Our department has recently been discussing our curriculum and in particular the goals of our B.A. and B.S. programs: What are the desired outcomes of each program? That is jargon for the simpler, "When students graduate, what would we like for them to be able to do?" For departments like ours, that means skills such as being able to write programs of a certain size, design a database, and choose appropriate data structures for solving a specific problem.

I was thinking about the Unix exchange on my mailing list in this context. Let's be honest... There is a lot of computer science, especially CS as it is applied in specific technology, that I don't know. Should I have known to fix my friend's problem? Should our students? What can we reasonably expect of our students or of ourselves as faculty?

Obviously, we professors can't know everything, and neither can our students. That is true in any discipline and especially in one like CS, which changes and grows so fast. This is one of the reasons it is so important for us to define clearly what we expect our programs to achieve. The space of computing knowledge and skills is large and growing. Without a pretty good idea of what we are hoping to achieve with our courses and curricula, our students could wander around in the space aimlessly and not have anything coherent to show for the time or effort. Or the money they spent paying tuition.

So, when I first read my friends' messages about Unix groups, I didn't know how to solve the problem. And that's okay, because I can't know everything about CS, let alone every arcane detail of every Unix distro out in the world. But I do have a CS degree and lots of experience. What difference should that make when I approach problems like this? If one of our graduates confronts this situation or one like it, how will they be different from the average person on the street, or even the average college graduate?

Whatever specific skills our graduates have, I think that they should be able to come up to speed on computing problems relatively quickly. They should have enough experience with a broad set of CS domains and enough theoretical background to be able to make sense of unfamiliar problems and understand candidate solutions. They should be able to propose solutions that make sense to a computer scientist, even if the solutions lack a detailed knowledge of the domain.

That is, CS graduates should have a relatively low mean time to competence in most sub-areas of computing, even the ones they have not studied in detail yet.

For a smaller set of sub-areas, our students should also have a relatively low mean time to mastery. These are the areas they have studied in some detail, either in class or through project work, but which they have not yet mastered. A CS degree should put them in a position to master them more quickly than most educated non-computer scientists.

Mean time to competence (MTTC) and mean time to mastery (MTTM) are actually a big part of how I distinguish a university education from a community college education when I speak to prospective students and their parents, though I have never used those terms before. They always wonder about the value of technical certifications, which community college programs often stress, and why my department does not make study for certification exams an explicit goal for students.

We hope, I tell them, to put the student in a position of being ready to prepare for any certification exam in relatively short order, rather than spending a couple of years preparing them to take a specific exam. We also hope that the knowledge and experience they gain will prepare them for the inevitable developments in our discipline that will eventually make any particular certification obsolete.

I am not certain if mean time to competence and mastery are student learning outcomes in the traditional educational jargon sense of the word, or whether they are abstractions of several more concrete outcomes. In any case, I am left thinking about how we can help to create these outcomes in students and how we can know whether we are successful or not. (The agile developer in me says, if we can't figure out how to test our program, we need to break the feature down into smaller, more concrete steps.)

Whatever the practical challenges of curriculum design and outcomes, I think MTTC and MTTM are essential results for a CS program to generate. Indeed, they are the hallmark of a university education in any discipline and of education in general.

Now, to figure out how to do that.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 30, 2010 5:08 PM

No More Bugs

Maurice Wilkes, one of computer science's pioneers, passed away yesterday at the grand age of 97. Wilkes spent most of his career working on computer systems, so I never studied his work in the detail that I studied Backus's or Floyd's. He is perhaps best known as the creator of the EDSAC, the first practical (if slow!) stored program computer, in May 1949. Wilkes also had an undeniable impact on programming as the primary developer of the idea of microprogramming, whereby the CPU is controlled by a program stored in ROM.

Finally, every programmer must hold a special place in his or her heart for this, Wilkes's most quoted aphorism:

... the realization came over me that a good part of my life was going to be spent finding errors in my own programs.

All programmers have their Wilkes Moment sooner or later.


Posted by Eugene Wallingford | Permalink | Categories: Computing

November 19, 2010 4:45 PM

Debugging the Law

Recently, Kevin Carey's Decoding the Value of Computer Science got a lot of play among CS faculty I know. Carey talks about how taking a couple of computer programming courses way back at the beginning of his academic career has served him well all these years, though he ended up majoring in the humanities and working in the public affairs sector. Some of my colleagues suggested that this article gives great testimony about the value of computational thinking. But note that Carey didn't study abstractions about computation or theory or design. He studied BASIC and Pascal. He learned computer programming.

Indeed, programming plays a central role in the key story within the story. In his first job out of grad school, Carey encountered a convoluted school financing law in my home state of Indiana. He wrote code to simulate the law in SAS and, between improving his program and studying the law, he came to understand the convolution so well that he felt confident writing a simpler formula "from first principles". His formula became the basis of an improved state law.

That's right. The state's legal code was so complicated and hard to maintain that he threw the system away and wrote a new one. Every programmer has lived this experience with computer code. Carey tried to debug a legal code and found its architecture to be so bad that he was better off creating a new one.

CS professors should use this story every time they try to sell the idea of universal computer programming experience to the rest of the university!

The idea of refactoring legal code via a program that implements it is not really new. When I studied logic programming and Prolog in grad school, I read about the idea of expressing law as a Prolog program and using the program to explore its implications. Later, I read examples where Prolog was used to do just that. The AI and law community still works on problems of this sort. I should dig into some of the recent work to see what progress, if any, has been made since I moved away from that kind of work.
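
To make the idea concrete, here is a tiny sketch of "law as code", in Python rather than Prolog and with rules I invented for illustration (nothing like Indiana's actual formula). The point is only the technique: once the statute is executable, you can run hypothetical cases through it and watch the implications fall out.

     # A toy statute encoded as functions. The rules and numbers below are
     # made up; only the technique of executing the law matters.

     def base_grant(enrollment):
         return 5000 * enrollment

     def poverty_supplement(enrollment, poverty_rate):
         # a cliff: districts just under the threshold get nothing extra
         return 1200 * enrollment * poverty_rate if poverty_rate > 0.25 else 0

     def total_funding(enrollment, poverty_rate):
         return base_grant(enrollment) + poverty_supplement(enrollment, poverty_rate)

     for rate in (0.24, 0.26):
         print(rate, total_funding(1000, rate))
     # 0.24 5000000
     # 0.26 5312000.0

Running the two cases side by side exposes a funding cliff that might never jump out of the prose version of the rule. A Prolog encoding would support richer queries, but even this much captures the spirit of debugging a law by executing it.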

My doctoral work involved modeling and reasoning about legal arguments, which are very much like computer programs. I managed to think in terms of argumentation patterns, based on the work of Stephen Toulmin (whose work I have mentioned here before). I wish I had been smart or insightful enough to make the deeper connection from argumentation to software development ideas such as architecture and refactoring. It seems like there is room for some interesting cross-fertilization.

(As always, if you know about work in this domain, please let me know!)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 18, 2010 3:43 PM

The Will to Run, or Do Anything Else

In "How Do You Do It?", an article in the latest issue of Running Times about how to develop the intrinsic motivation to do crazy things like run every morning at 5:00 AM, ultrarunner Eric Grossman writes:

The will to run emerges gradually where we cultivate it. It requires humility -- we can't just decide spontaneously and make it happen. Yet we must hold ourselves accountable for anything about which we can say, "I could have done differently."

Cultivation, humility, patience, commitment, accountability -- all features of developing the habits I need to run on days I'd rather stay in bed. After a while, you do it, because that's what you do.

I think this paragraph is true of whatever habit of thinking and doing you are trying to develop, whether it's object-oriented programming, playing piano, or test-driven design.

~~~~

Eugene speaking at Tech Talk Cedar Valley, 2010/11/17

Or functional programming. Last night I gave a talk at Tech Talk Cedar Valley, a monthly meet-up of tech and software folks in the region. Many of these developers are coming to grips with a move from Java to Scala and are pedaling fast to add functional programming style to their repertoires. I was asked to talk about some of the basic ideas of functional programming. My talk was called "Don't Drive on the Railroad Tracks", referring to Bill Murray's iconic character in the movie Groundhog Day. After hundreds or thousands of days reliving February 2 from the same starting point, Phil Connors finally comes to understand the great power of living in a world without side effects. I hope that my talk can help software developers in the Cedar Valley reach that state of mind sooner than a few years from now.

If you are interested, check out the slides of the talk (also available on SlideShare) and the code, in both Ruby and Scheme, that I used to illustrate some of the ideas.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

November 13, 2010 9:28 AM

Shape, or Be Shaped

Playwright Arthur Miller is often quoted as saying:

Man must shape his tools lest they shape him.

I read this again yesterday, in the online book Focus, while listening to faculty at a highly-ranked local school talk about the value of a liberal arts education. The quote reminds me about one of the reasons I so like being a computer scientist. I can shape my tools. If I need a new tool, or even a new kind of tool, I can make it.

Our languages are tools, too. We can shape them, grow them, change them. We can create new ones. (Thanks, Matz.)

Via the power of the Internet, I am continuously surrounded by colleagues smarter and more motivated than I am, doing the same. I've been enjoying watching Brian Marick tweet about his thoughts and decision making as he implements Midje in Clojure. His ongoing dialog reminds me that I do not have to settle.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

November 03, 2010 2:12 PM

Ideas from Readers on Recent Posts

A few recent entries have given rise to interesting responses from readers. Here are two.

Fat Arrows

Relationships, Not Characters talked about how the most important part of design often lies in the space between the modules we create, whether objects or functions, not the modules themselves. After reading this, John Cook reminded me about an article by Thomas Guest, Distorted Software. Near the end of that piece, which talks about design diagrams, Guest suggests that the arrows in application diagrams should be larger, so that they would be proportional to the time their components take to develop. Cook says:

We typically draw big boxes and little arrows in software diagrams. But most of the work is in the arrows! We should draw fat arrows and little boxes.

I'm not sure that would make our OO class diagrams better, but it might help us to think more accurately!

My Kid Could Do That

Ideas, Execution, and Technical Achievement wistfully admitted that knowing how to build Facebook or Twitter isn't enough to become a billionaire. You have to think to do it. David Schmüdde mentioned this entry in his recent My Kid Could Do That, which starts:

One of my favorite artists is Mark Rothko. Many reject his work thinking that they're missing some genius, or offended that others see something in his work that they don't. I don't look for genius because genuine genius is a rare commodity that is only understood in hindsight and reflection. The beauty of Rothko's work is, of course, its simplicity.

That paragraph connects with one of the key points of my entry: Genius is rare, and in most ways irrelevant to what really matters. Many people have ideas; many people have skills. Great things happen when someone brings these ingredients together and does something.

Later, he writes:

The real story with Rothko is not the painting. It's what happens with the painting when it is placed in a museum, in front of people at a specific place in the world, at a specific time.

In a comment on this post, I thanked Dave, and not just because he discusses my personal reminiscence. I love art but am a novice when it comes to understanding much of it. My family and I saw an elaborate Rothko exhibit at the Smithsonian this summer. It was my first trip to the Smithsonian complex -- a wonderful two days -- and my first extended exposure to Rothko's work. I didn't reject his art, but I did leave the exhibit puzzled. What's the big deal?, I wondered. Now I have a new context in which to think about that question and Rothko's art. I didn't expect the new context to come from a connection a reader made to my post on tech start-up ideas that change the world!

I am glad to know that thinkers like Schmüdde are able to make connections like these. I should note that he is a professional artist (both visual and aural), a teacher, and a recovering computer scientist -- and a former student of mine. Opportunities to make connections arise when worlds collide.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Personal, Software Development

October 28, 2010 4:27 PM

Ideas, Execution, and Technical Achievement

Four or five years ago, my best buddy on campus and I were having lunch at our favorite Chinese buffet. He looked up between bites of General Tsao's and asked, "Why didn't you and I sit down five years ago and write Facebook?"

You see, he is an awesome programmer and has worked with me enough to know that I do all right myself. At various times, both of us have implemented bits and pieces of the technology that makes up Facebook. It doesn't look like all that big a deal.

I answered, "Because we didn't think of it."

The technical details may or may not have been a big deal. Once implemented, they look straightforward. In any case, though, the real reason was that it never occurred to us to write Facebook. We were techies who got along nicely with the tools available to us in 1999 or 2000, such as e-mail, wiki, and the web. If we needed to improve our experience, we did so by improving our tools. Driven by one of his own itches, Troy had done his M.S. research with me as his advisor, writing a Bayesian filter to detect spam. But neither of us thought about supplanting e-mail with a new medium.

We had the technical skills we needed to write Facebook. We just didn't have the idea of Facebook. Turns out, that matters.

That lunch conversation comes into my mind every so often. It came back yesterday when I read Philip Greenspun's blog entry on The Social Network. Greenspun wrote one of my favorite books, Philip and Alex's Guide to Web Publishing, which appeared in 1998 and which describes in great detail (and with beautiful photos) how to implement web community software. When his students ask how he feels about Zuckerberg getting rich without creating anything "new", Greenspun gives a wonderfully respectful and dispassionate answer: "I didn't envision every element of Facebook." Then he explains what he means.

Technically, Greenspun was positioned as well as or better than my buddy and me to write Facebook. But he didn't have the idea, either, at least not to the same level as Zuckerberg. Having the core of an idea is one thing. Developing it to the point that it becomes a platform that changes the world in which it lives is another. Turns out, that matters, too.

I like Lawrence Lessig's most succinct summation of what makes Zuckerberg writing Facebook a notable achievement: He did it. He didn't just have an idea, or talk about it, or dream about it. He implemented it.

That's what great hackers do.

Watch this short video to hear Zuckerberg himself say why he built it. His answer is also central to the hacker ethic: Because he wanted to.

(Also read through to the end of Lessig's article for a key point that many people miss when they think about the success and achievement of things like Facebook and Twitter and Napster: The real story is not the invention.)

Zuckerberg may or may not be a genius; I don't know or care. That is a word that modern culture throws around far too carelessly these days. I will say this: I don't think that creating Facebook is in itself sufficient evidence for concluding so. A lot of people have cool ideas. A more select group of people write the code to make their ideas come alive. Those people are hackers. Zuckerberg is clearly a great hacker.

I'm not a big Facebook user, but it has been on my mind more than usual the last couple of days. Yesterday was my birthday, and I was overwhelmed by all the messages I received from Facebook friends wishing me a happy day. They came from all corners of the country; from old grade-school friends I haven't seen in over thirty years; from high school and college friends; from professional friends and acquaintances. These people all took the time to type a few words of encouragement to someone hundreds of miles away in the middle of the Midwest. I felt privileged and a little humbled.

Clearly, this tool has made the world a different place. The power of the social network lies in the people. The technology merely enables the communication. That's a significant accomplishment, even if most of the effects are beyond what the creator imagined. That's the power of a good idea.

~~~~

All those years ago, my buddy and I talked about how little technical innovation there was in Facebook. Greenspun's answer reminds us that there was some. I think there is another element to consider, something that was a driving force at StrangeLoop: big data. The most impressive technical achievement of Facebook and smaller web platforms such as Twitter is the scale at which they operate. They've long ago outgrown naive implementations and have had to try to offer uninterrupted service in the face of massive numbers of users and exponential growth. Solving the problems associated with operating at such a scale is an ongoing technical challenge and a laudable achievement in its own right.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

October 25, 2010 4:50 PM

A Programming Digression: Farey Sequences

Farey sequences as Ford circles

Last Wednesday, I saw John Cook's article on rational approximations using Farey's algorithm, and it led me on a programming journey. You might find it entertaining.

First, some background. The students in my compilers course are writing a compiler for a simple numerical language I call Klein. Klein is a minimal functional language inspired by Doug Baldwin's MinimL, designed to give compiler students experience with all the important issues in the course while not duplicating very much. The goal is that students be able to write a complete and correct compiler for the language from scratch in a semester. Among Klein's limitations are:

  • Repetition is done with recursion. There are no loops.
  • There are only two data types, integers and booleans.
  • All binding of values to names is done via function calls. There are no local variables, no assignment statements, and no sequences.
  • There is only one primitive function, print, that can be called only at the top of a function definition.

As you can imagine, it is difficult to find good, large problems for which we can write Klein programs. So I'm always on the look-out for numerical algorithms that can be implemented in Klein.

When I saw Cook's article and Python program, I got to thinking... Wouldn't this be a cool application for Klein?

Of course, I faced a few problems. Cook's program takes as arguments a real number in [0..1), r, and an integer N. It returns a fraction that is the rational number closest in value to r with a denominator no bigger than N. Klein does not have floating-point values, so I would have to fake even the real value being approximated! Without assignment statements and sequences of statements, let alone Python's multiple assignments and multiple returns, I would have to convert the algorithm's loop to a set of functions for computing the individual components of the converging bounds. Finally, without floats again, I would also have to compare the bounding fractions to the faked real number as fractions.

Those are just the sort of challenges that intrigue a programmer. I spent a couple of hours on Thursday slowly refactoring Cook's code to a Klein-like subset of Python. Instead of passing r to the main function, I fake it by passing in two arguments: the digits to the right of the decimal point as an integer, and a power of 10 to indicate its magnitude. These serve as numerator and denominator of a rational approximation, albeit with a very large denominator. For example, for r = 0.763548745, I pass in 7635487 and 10,000,000. The third argument to the function is N. Here is an example:

     >>> farey( 127, 1000, 74)
     8
     63

8/63 is the best rational approximation for 0.127 with a denominator ≤ 74. My code prints the 8 and returns the 63, because that is the best way for an equivalent Klein program to work.

Here is Cook's example of 1/e with a denominator bounded by 100:

     >>> farey( 367879, 1000000, 100 )
     32
     87
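
If you want to play along without reading my code, here is a rough sketch of the underlying algorithm in ordinary Python. It is neither Cook's program nor my port, just my own recursive, assignment-free take on the mediant-based binary search, written the way a Klein programmer has to think. It returns the numerator and denominator as a pair, a luxury real Klein does not allow, which is why my version prints one value and returns the other.

     # Best rational approximation to num/den with denominator <= max_den,
     # found by a binary search on mediants between 0/1 and 1/1.
     # All arithmetic is on integers, Klein-style.

     def farey(num, den, max_den):
         return search(num, den, max_den, 0, 1, 1, 1)

     def search(num, den, max_den, a, b, c, d):
         if b + d > max_den:
             return closer(num, den, a, b, c, d)
         if num * (b + d) == den * (a + c):      # hit the mediant exactly
             return (a + c, b + d)
         if num * (b + d) > den * (a + c):       # num/den lies right of the mediant
             return search(num, den, max_den, a + c, b + d, c, d)
         return search(num, den, max_den, a, b, a + c, b + d)

     def closer(num, den, a, b, c, d):
         # choose the nearer of the two bounds, comparing fractions by
         # cross-multiplication so that no floating point is needed
         if abs(num * b - a * den) * d <= abs(c * den - num * d) * b:
             return (a, b)
         return (c, d)

With this sketch, farey(127, 1000, 74) returns (8, 63) and farey(367879, 1000000, 100) returns (32, 87), matching the sessions above.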

Take a look at my Python code if you are easily amused. In one part of the program, I duplicated code with only small changes to the resulting functions, farey_num() and farey_den(). In another section, I avoided creating duplicate functions by adding a selector argument that allowed me to use one bit of code for two different calculations. while_loop_for returns the next value for one of a, b, c, and d, depending on the value of its first argument. For some reason, I am fond of the code that replaces Cook's mediant variable.

At this point, the port to Klein was straightforward.

In the end, there is so much code to do what is basically a simple task: to compute row N of this tree on an as-needed basis, in the service of a binary search:

Farey sequences as a lattice

Programming in Klein feels a lot like programming in an integer assembly language. That can be an interesting experience, but the real point of the language is to exercise students as they write a compiler. Still, I find myself now wanting to implement refactoring support for Klein, in order to make such digressions flow faster in the future!

I enjoyed this little diversion so much that I've been buttonholing every colleague who walks through the door. I decided I'd do the same to you.


Posted by Eugene Wallingford | Permalink | Categories: Computing

October 21, 2010 4:45 PM

On Kazoos and Violins

Coming out of Strange Loop, I have been reading more about Scala and Clojure, both of which bring a lot of cool functional programming ideas to the JVM world. A lot of people seem to have made deeper inroads into FP via Scala than via Clojure. That is certainly true among the companies in my area. That is not too surprising. Scala looks more like what they are used to than Clojure. It feels more comfortable.

This context is probably one of the reasons that I so liked this quote I read yesterday from Brenton Ashworth, a serial contributor to the Clojure community:

I'm now convinced that choosing a language based on how easy it is to read or write is a very bad idea. What if we chose musical instruments based on how easy they are to play? Everyone would be playing kazoos. You can't think deeply about music if all you know is a kazoo.

This reminds me of a story Alan Kay often tells about violins. I have alluded to it at least once before:

And we always need to keep in mind the difference between essential and gratuitous complexity. I am often reminded of Alan Kay's story about learning to play violin. Of course it's hard to learn. But the payoff is huge.

New tools often challenge us, because knowing them well will change us. And such change is a part of progress. Imagine if no one had wanted to learn to fly an airplane because it was too different from driving a car or a horse.

Don't get me wrong: I am not comparing Scala to a kazoo and Clojure to a violin! I don't know either language well enough to make grand pronouncements about them, but my small bit of study tells me that both are powerful, valuable languages, languages worth knowing. I'm simply concerned that too many people opt out of learning Clojure well because of how it looks. As Ashworth says in the sentence immediately following the passage quoted above, "Skill at reading and writing code is learned."

You can do it! Don't settle on something only because something else looks unfamiliar. By taking the easier path in this moment, you may be making your path harder in the long run.


Posted by Eugene Wallingford | Permalink | Categories: Computing

October 21, 2010 8:50 AM

Strange Loop Redux

StrangeLoop 2010 logo

I am back home from St. Louis and Des Moines, up to my neck in regular life. I recorded some of my thoughts and experiences from Strange Loop in a set of entries here:

Unlike most of the academic conferences I attend, Strange Loop was not held in a convention center or in a massive conference hotel. The primary venue for the conference was the Pageant Theater, a concert nightclub in the Delmar Loop:

The Pageant Theater

This setting gave the conference's keynotes something of an edgy feel. The main conference lodging was the boutique Moonrise Hotel a couple of doors down:

The Moonrise Hotel

Conference sessions were also held in the Moonrise and in the Regional Arts Commission building across the street. The meeting rooms in the Moonrise and the RAC were ordinary, but I liked being in human-scale buildings that had some life to them. It was a refreshing change from my usual conference venues.

It's hard to summarize the conference in only a few words, other than perhaps to say, "Two thumbs up!" I do think, though, that one of the subliminal messages in Guy Steele's keynote is also a subliminal message of the conference. Steele talked for half an hour about a couple of his old programs and all of the machinations he went through twenty-five or forty years ago to make them run in the limited computing environments of those days. As he reconstructed the laborious effort that went into those programs in the first place, the viewer couldn't help but feel that the joke was on him. He was programming in the Stone Age!

But then he gets to the meat of his talk and shows us that how we program now is the relic of a passing age. For all the advances we have made, we still write code that transitions from state to state to state, one command at a time, just like our cave-dwelling ancestors in the 1950s.

It turns out that the joke is on us.

The talks and conversations at Strange Loop were evidence that one relatively small group of programmers in the midwestern US are ready to move into the future.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development

October 16, 2010 7:59 PM

Strange Loop This and That

Some miscellaneous thoughts after a couple of days in the mix...

Pertaining to Knowing and Doing

** Within the recurring theme of big data, I still have a few things to study: MongoDB, CouchDB, and FlockDB. I also learned about Pig, a language I'd never heard of before. I think I need to learn it.

** I need to be sure that my students learn about the idea of multimethods. Clojure has brought them back into mainstream discussion.

** Kevin Weil, who spoke about NoSQL at Twitter, told us that his background is math and physics. Not CS. Yet another big-time developer who came to programming from another domain because they had real problems to solve.

Pertaining to the Conference

** The conference served soda all day long, from 8:00 in the morning to the end of the day. Hurray! My only suggestion: add Diet Mountain Dew to the selections.

** The conference venues consist of two rooms in a hotel, two rooms in a small arts building, and the auditorium of the Pageant Theater. The restrooms are all small. During breaks, the line for the men's room was, um, long. The women in attendance came and went without concern. This is exactly opposite of what one typically sees out in public. The women of Strange Loop have their revenge!

** This is the first time I have ever attended a conference with two laptop batteries. And I saw that it was good. Now, I just have to find out why every couple of weeks my keyboard and trackpad freeze up and effectively lock me out. Please, let it not be a failing motherboard...

Pertaining to Nothing But Me

** Like every conference, Strange Loop fills the silence between sessions with a music loop. The music the last two days has been aimed at its audience, which is mostly younger and mostly hipper than I am. I really enjoyed it. I even found a song that will enter my own rotation, "Hey, Julie" by Fountains of Wayne. You can, of course, listen to it on YouTube. I'll have to check out more Fountains of Wayne later.

** On Twitter, I follow a relatively small number of people, mostly professional colleagues who share interesting ideas and links. I also follow a few current and former students. Rounding out the set are a couple connections I made with techies through others, back when Twitter was small. I find that I enjoy their tweets even though I don't know them, or perhaps because I don't.

On Thursday, it occurred to me: Maybe it would be fun to follow some arbitrary incredibly popular person. During one of the sessions, we learned that Lady Gaga has about 6,500,000 followers, surpassing Ashton Kutcher's six million. I wonder what it would be like to have their tweets flowing in a stream with those of Brian Marick and Kevlin Henney, Kent Beck and Michael Feathers.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

October 16, 2010 9:09 AM

Strange Loop, Day 2 Afternoon

After eating some dandy deep-dish BBQ chicken at Pi Pizzeria with the guys from T8 Webware (thanks, Wade!), I returned for a last big run of sessions. I'll save the first session for last, because my report of it is the longest.

Android Squared

I went to this session because so many of my students want me to get down in the trenches with phone programming. I saw a few cool tools, especially RetroFit, a new open-source framework for Android. There are not enough hours in a day for me to explore every tool out there. Maybe I can have a student do an Android project.

Java Puzzlers

And I went to this session because I am weak. I am a sucker for silly programming puzzles, especially ones that take advantage of the dark corners of our programming languages. This session did not disappoint in this regard. Oh, the tortured code they showed us! I draw from this experience a two-part moral:

  1. Bad programmers can write really bad code, especially in a complex language.
  2. A language that is too complex makes bad programmers of us all.

Brian Marick on Outside-In TDD

Marick demoed a top-down | outside-in style of TDD in Clojure using Midje, his homebrew test package. This package and style make heavy use of mock objects. Though I've dinked around a bit in Scala, I've done almost nothing in Clojure, so I'll have to try this out. The best quote of the session echoed a truth about all programming: You should have no words in your test that are not specifically about the test.
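
Midje's notation looks nothing like what follows, and I am not trying to reproduce Marick's demo. But for readers who have never worked outside-in, here is the shape of the idea in plain Python with unittest.mock: write the test for the outermost function first, and hand it a fake for the collaborator that does not yet exist in real form.

     import unittest
     from unittest import mock

     # The outer function under test. Its collaborator, fetch_totals, is
     # passed in, so a test can supply a fake while the real one is unwritten.
     def report(fetch_totals):
         return "total: " + str(sum(fetch_totals()))

     class ReportTest(unittest.TestCase):
         def test_report_sums_the_totals(self):
             fake_totals = mock.Mock(return_value=[2, 3, 5])
             self.assertEqual("total: 10", report(fake_totals))

     if __name__ == "__main__":
         unittest.main()

Every word in the test is about the test, which is the standard Marick's quote sets.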

Douglas Crockford, Open-Source Heretic

Maybe my brain was fried by the end of the two days, or perhaps I'm simply not clever enough. While I was able to chuckle several times through this closing keynote, I never saw the big picture or the point of the talk. There were plenty of interesting, if disconnected, stories and anecdotes. I enjoyed Crockford's coverage of several historical mark-up languages, including Runoff and Scribe. (Runoff was the forebear of troff, a Unix utility I used throughout grad school -- I even produced my wedding invitations using it! Fans of Scribe should take a look at Scribble, a mark-up tool built on top of Racket.) He also told an absolutely wonderful story about Grace Murray Hopper's A-0, the first compiler-like tool and likely the first open-source software project.

Panel on the Future of Languages

Panels like this often don't have a coherent summary. About all I can do is provide a couple of one-liners and short answers to a couple of particularly salient questions.

Joshua Bloch: Today's patterns are tomorrow's language features. Today's bugs are tomorrow's type system features.

Douglas Crockford: Javascript has become what Java was meant to be, the language that runs everywhere, the assembly language of the web.

Someone in the audience asked, "Are changes in programming languages driven by academic discovery or by practitioner pain?" Guy Steele gave the best answer: The evolution of ideas is driven by academics. Uptake is driven by practitioner needs.

So, what is the next big thing in programming languages? Some panelists gave answers grounded in today's problems: concurrency, a language that could provide an end-to-end solution for the web, and security. One panelist offered laziness. I think that laziness will change how many programmers think -- but only after functional programming has blanketed the mainstream. Collectively, several panelists offered variations on sloppy programming, citing as early examples Erlang's approach to error recovery, NoSQL's not-quite-consistent data, and Martin Rinard's work on acceptability-oriented computing.

The last question from the audience elicited some suggestions you might be able to use. What language, obscure or otherwise, should people learn in order to learn the language you really want them to learn? For this one, I'll give you a complete list of the answers:

  • Io. (Bruce Tate)
  • Rebol. (Douglas Crockford)
  • Forth. Factor. (Alex Payne)
  • Scheme. Assembly. (Josh Bloch)
  • Clojure. Haskell. (Guy Steele)

I second all of these suggestions. I also second Steele's more complete answer: Learn any three languages you do not know. The comparisons and contrasts among them will teach you more than any one language can.

Panel moderator Ted Neward closed the session with a follow-up question: "But what should the Perl guys learn while they are waiting for Perl 6?" We are still waiting for the answer.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 15, 2010 11:46 PM

Strange Loop, Day 2 Morning

Perhaps my brain is becoming overloaded, or I have been less disciplined in picking talks to attend, or the slate of sessions for Day 2 is less coherent than Day 1's. But today has felt scattered, and so far less satisfying than yesterday. Still, I have had some interesting thoughts.

Billy Newport on Enterprise NoSQL

This was yet another NoSQL talk, but not really, because it was different from the preceding ones at the conference. This talk was not about any particular technologies. It was about mindsets.

Newport explained that NoSQL means not only SQL. These two general approaches to data storage offer complementary strengths and weaknesses. This means that they are best used in different contexts.

I don't do enough programming for big data apps to appreciate all the details of this talk. Actually, I understood most of the basic concepts, but they soon started blurring in my mind, because I don't have personal experience on which to hang them. A few critical points stood out:

  • In the SQL world, the database is the "system of record" for all data, so consistency is a given. In the NoSQL world, having multiple systems of record is normal. In order to ensure consistency, the application uses business rules to bring data back into sync. This requires a big mind shift for SQL guys.

  • In the SQL world, the row is a bottleneck. In the NoSQL world, any node can handle the request. So there is not a bottleneck, which means the NoSQL approach scales transparently. But see the first bullet.

These two issues are enough to see one of Newport's key points. The differences between the two worlds are not only technical but also cultural. SQL and NoSQL programmers use different vocabulary and have different goals. Consider that "in NoSQL, 'query' is a dirty word". NoSQL programmers do everything they can to turn queries into look-ups. For the SQL programmer, the query is a fundamental concept.

The really big idea I took away from this talk is that SQL and NoSQL solve different problems. The latter optimizes for one dominant question, while the former seeks to support an infinite number of questions. Most of the database challenges facing NoSQL shops boil down to this: "What happens if you ask a different question?"
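
To see the "turn queries into look-ups" mindset concretely, here is a toy sketch in Python, not anything Newport showed: instead of querying for a user's timeline at read time, fan the data out at write time so that reading becomes a single key look-up. The price is exactly the kind of application-level consistency bookkeeping described in the bullets above.

     from collections import defaultdict

     followers = defaultdict(set)    # author -> users who follow them
     timelines = defaultdict(list)   # user   -> precomputed timeline, newest first

     def follow(user, author):
         followers[author].add(user)

     def post(author, text):
         # write-time fan-out: push the tweet onto every follower's timeline
         for user in followers[author] | {author}:
             timelines[user].insert(0, (author, text))

     def timeline(user):
         return timelines[user]      # a look-up, not a query

     follow("alice", "bob")
     post("bob", "hello, world")
     print(timeline("alice"))        # [('bob', 'hello, world')]

This answers one dominant question instantly. Ask a different question, say "everything bob has ever posted, oldest first", and you either precompute another structure or pay for a scan. That is Newport's point.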

Dean Wampler on Scala

The slot in which this tutorial ran was longer than the other sessions at the conference. This allowed Wampler to cover a lot of details about Scala. I didn't realize how much of an "all but the kitchen sink" language Scala is. It seems to include just about every programming language feature I know about, drawn from just about every programming language I know about.

I left the talk a bit sad. Scala contains so much. It performs so much syntactic magic, with so many implicit conversions and so many shortcuts. On the one hand, I fear that large Scala programs will overload programmers' minds the way C++ does. On the other, I worry that its emphasis on functional style will overload programmers' minds the way Haskell does.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 15, 2010 12:34 PM

Guy Steele's Own Strange Loop

The last talk of the afternoon of Day 1 was a keynote by Guy Steele. My notes for his talk are not all that long, or at least weren't when I started writing. However, as I expected, Steele presented a powerful talk, and I want to be able to link directly to it later.

Steele opened with a story about a program he wrote forty years ago, which he called the ugliest program he ever wrote. It fit on a single punch card. To help us understand this program, he described in some detail the IBM hardware on which it ran. One problem he faced as a programmer is that the dumps were undifferentiated streams of bytes. Steele wanted line breaks, so he wrote an assembly language program to do that -- his ugliest program.

Forty years later, all he has is the punch card -- no source. Steele's story then turned into CSI: Mainframe. He painstakingly reverse-engineered his code from the punches on the card. We learned about instruction format, data words, register codes... everything we needed in order to understand how this program managed to dump memory with newlines and fit on a single card. The number of hacks he used, playing on puns between op codes and data and addresses, was stunning. That he could resurrect these memories forty years later was just as impressive.

I am just old enough to have programmed assembly for a couple of terms on punch cards. This talk brought back memories, even how you can recognize data tables on a card by the unused rows where there are no op codes. What a wonderful forensics story.

The young guys in the room liked the story, but I think some were ready for the meat of the talk. But Steele told another story, about a program for computing sin 3x on a PDP-11. To write this program, Steele took advantage of changes in the assembly languages between the IBM mainframe and the PDP-11 to create more readable code. Still, he had to use several idioms to make it run quickly in the space available.

These stories are all about automating resource management, from octal code to assemblers on up to virtual memory and garbage collection. These techniques let the programmer release concerns about managing memory resources to tools. Steele's two stories demonstrate the kind of thinking that programmers had to do back in the days when managing memory was the programmer's job. It turns out that the best way to think about memory management is not to think about it at all.

At this point, Steele closed his own strange loop back to the title of his talk. His thesis is this: the best way to think about parallel programming is not to have to.

If we program using a new set of idioms, then parallelism can be automated in our tools. The idioms aren't about parallelism; they are more like functional programming patterns that commit the program less to a particular underlying implementation.

There are several implications of Steele's thesis. Here are two:

  • Accumulators are bad. Divide and conquer is good.

  • Certain algebraic properties of our code are important. Programmers need to know and preserve them in the code they write.

Steele illustrated both of these implications by solving an example problem that would fit nicely in a CS1 course: finding all the words in a string. With such a simple problem, everyone in the room has an immediate intuition about how to solve it. And nearly everyone's intuition produces a program using accumulators that violates several important algebraic properties that our code might have.

One thing I love about Steele's talks: he grounds ideas in real code. He developed a complete solution to the problem in Fortress, the language Steele and his team have been creating at Sun/Oracle for the last few years. I won't try to reproduce the program or the process. I will say this much. One, the process demonstrated a wonderful interplay between functions and objects. Two, in the end, I felt like we had just used a process very similar to the one I use when teaching students to create this functional merge sort function:

    ; map list turns each element into a one-item sorted list;
    ; merge-all (defined elsewhere) then merges those lists into one
    (define mergesort
       (lambda (lst)
          (merge-all (map list lst))))

Steele closed his talk with the big ideas that his programs and stories embody. Among the important algebraic properties that programs should have whenever possible are ones we all learned in grade school, explicitly or implicitly. Though they may still sound scary, they all have intuitive common meanings:

  • associative -- grouping doesn't matter
  • commutative -- order doesn't matter
  • idempotent -- duplicates don't matter
  • identity -- this value doesn't matter
  • zero -- other values don't matter

Steele said that "wiggle room" was the key buzzword to take away from his talk. Preserving invariants of these algebraic properties gives the compiler wiggle room to choose among alternative ways to implement the solution. In particular, associativity and commutativity give the compiler wiggle room to parallelize the implementation.

(Note that the merge-all operation in my mergesort program satisfies all five properties.)

One way to convert an imperative loop to a parallel solution is to think in terms of grouping and functions:

  1. Bunch mutable state together as a state "value".
  2. Look at the loop as an application of one or more state transformation functions.
  3. Look for an efficient way to compose these transformation functions into a single function.

The first two steps are relatively straightforward. The third step is the part that requires ingenuity!
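
Since I cannot reproduce Steele's Fortress code from memory, here is my own small Python sketch of the word-splitting example in that spirit, assuming a simple chunk/segment representation for partial results. The combine function is associative (its identity is ("chunk", ""), tying back to the bullets above), so a parallel reduction is free to group the work however it likes; the left-to-right reduce below is just one choice.

     from functools import reduce

     # Partial results: ("chunk", text) for a run of non-spaces, or
     # ("segment", left, words, right) once a space has been seen.

     def per_char(ch):
         return ("segment", "", [], "") if ch == " " else ("chunk", ch)

     def combine(x, y):
         # associative combine of two partial results
         if x[0] == "chunk" and y[0] == "chunk":
             return ("chunk", x[1] + y[1])
         if x[0] == "chunk":
             _, left, ws, right = y
             return ("segment", x[1] + left, ws, right)
         if y[0] == "chunk":
             _, left, ws, right = x
             return ("segment", left, ws, right + y[1])
         _, xl, xw, xr = x
         _, yl, yw, yr = y
         middle = [xr + yl] if xr + yl else []
         return ("segment", xl, xw + middle + yw, yr)

     def words(s):
         if not s:
             return []
         result = reduce(combine, map(per_char, s))
         if result[0] == "chunk":
             return [result[1]] if result[1] else []
         _, left, ws, right = result
         return ([left] if left else []) + ws + ([right] if right else [])

     print(words("here is a string of words"))
     # ['here', 'is', 'a', 'string', 'of', 'words']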

In this style of programming, associative combining operators are a big deal. Creating new, more diverse associative combining operators is the future of programming. Creating new idioms -- the patterns of programs written in this style -- is one of our challenges. Good programming languages of the future will provide, encourage, and enforce invariants that give compilers wiggle room.

In closing, Steele summarized our task as this: We need to do for processor allocation what garbage collection did for memory allocation. This is essential in a world in which we have parallel computers of wildly different sizes. (Multicore processors, anyone?)

I told some of the guys at the conference that I go to hear Guy Steele irrespective of his topic. I've been fortunate enough to be in a small OOPSLA workshop on creativity with Steele, gang-writing poetry and Sudoku generators, and I have seen him speak a few times over the years. Like his past talks, this talk makes me think differently about programs. It also crystallizes several existing ideas in a way that clarifies important questions.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 14, 2010 11:17 PM

Strange Loop 2010, Day 1 Afternoon

I came back from a lunchtime nap ready for more ideas.

You Already Use Closures in Ruby

This is one of the talks I chose for a specific personal reason. I was looking for ideas I might use in a Cedar Valley Tech Talk on functional programming later this month, and more generally in my programming languages course. I found one, a simple graphical notation for talking about closures as a combination of code and environment. Something the speaker said about functions sharing state also gave me an idea for how to use the old koan on the dual "a closure is a poor man's object / an object is a poor man's closure".
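
For readers who have not seen the koan played out, here is the smallest version I know, in Python rather than Ruby and in my notation, not the speaker's: the same hidden counter, once as an object and once as a closure, where the closure is just the code plus its captured environment.

     # Two counters that hide a bit of state: one object, one closure.

     class Counter:
         def __init__(self):
             self.count = 0
         def __call__(self):
             self.count += 1
             return self.count

     def make_counter():
         count = 0
         def increment():        # the code...
             nonlocal count      # ...plus the environment it captured
             count += 1
             return count
         return increment

     obj, fn = Counter(), make_counter()
     print(obj(), obj(), fn(), fn())   # 1 2 1 2

Closures returned by separate calls to make_counter do not share state, while functions returned from the same enclosing call do, which is one way to see the "functions sharing state" observation from the talk.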

NoSQL at Twitter

Speaker Kevin Weil started by decrying his own title. Twitter uses MySQL and relational databases everywhere. They use distributed data stores for applications that need specific performance attributes.

Weil traced Twitter's evolution toward different distributed data solutions. They started with Syslog for logging applications. It served their early needs but didn't scale well. They then moved to Scribe, which was created by the Facebook team to solve the same problem and then open-sourced.

This move led to a realization. Scribe solved Twitter's problem and opened new vistas. It made logging data so easy that they started logging more. Having more data gave them better insight into the behavior of their users, behaviors they didn't even know to look for before. Now, data analytics is one of Twitter's most interesting areas.

The Scribe solution works, scaling with no changes to the architecture as data throughput doubles; they just add more servers. But this data creates a 'write' problem: given today's technology, it takes something like 42 hours to write 12 TB to a single hard drive. This led Twitter to add Hadoop to its toolset. Hadoop is both scalable and powerful. Weil mentioned a 4,000-node cluster at Yahoo! that had sorted one terabyte of integers in 62 seconds.

The rest of Weil's talk focused on data analytics. The key point underlying all he said was this: It is easy to answer questions. It is hard to ask the right questions. This makes experimental programming valuable, and by extension it makes a powerful scripting language and short turnaround times valuable, too. They need time to ask a lot of questions, looking for good ones and refining promising questions into more useful ones. Hadoop is a Java platform, which doesn't fit those needs.

So, Twitter added Pig, a high-level language that sits atop Hadoop. Programs written in Pig are easy to read and almost English-like. Equivalent SQL programs would probably be shorter, but Pig compiles to MapReduce jobs that run directly on Hadoop. Pig exacts a performance penalty, but the Twitter team doesn't mind. Weil captured why in another pithy sentence: I don't mind if a 10-minute job runs in 12 minutes if it took me 10 minutes to write the script instead of an hour.
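
For readers who have never written one, the shape of the job that a Pig script ultimately compiles down to looks something like the sketch below. It is plain, single-process Python, not Pig and not anything Twitter runs; Hadoop's contribution is running the map and reduce phases across thousands of machines.

     from collections import defaultdict

     def map_phase(records):
         for user, url in records:
             yield (url, 1)                 # emit (key, value) pairs

     def shuffle(pairs):
         groups = defaultdict(list)
         for key, value in pairs:
             groups[key].append(value)      # group values by key
         return groups.items()

     def reduce_phase(grouped):
         for key, values in grouped:
             yield (key, sum(values))       # one aggregate per key

     clicks = [("ann", "a.com"), ("bob", "a.com"), ("ann", "b.com")]
     print(dict(reduce_phase(shuffle(map_phase(clicks)))))
     # {'a.com': 2, 'b.com': 1}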

Twitter works on several kinds of data-analytic problems. A couple stood out:

  • correlating big data. How do different kinds of users behave -- mobile, web, 3rd-party clients? What features hook users? How do user cohorts work? What technical details go wrong at the same time, leading to site problems?

  • research on big data. What can we learn from a user's tweets, the tweets of those they follow, or the tweets of those who follow them? What can we learn from asymmetric follow relationships about social and personal interests?

As much as Weil had already described, there was more! HBase, Cassandra, FlockDB, .... Big data means big problems and big opportunities, which lead to hybrid solutions that optimize competing forces. Interesting stuff.

Solutions to the Expression Problem

This talk was about Clojure, which interests me for obvious reasons, but the real reason I chose this talk was that I wanted to know what is the expression problem! Like many in the room, I had experienced the expression problem without knowing it by this name:

The Expression Problem is a new name for an old problem. The goal is to define a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, without recompiling existing code, and while retaining static type safety (e.g., no casts).

Speaker Chris Houser used an example in which we need to add a behavior to an existing class that is hard or impossible to modify. He then stepped through four possible solutions: the adapter pattern and monkey patching, which are available in languages like Java and Ruby, and multimethods and protocols, which are available in Clojure.
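
Houser's examples were in Clojure, so take what follows only as a rough Python analogue of the multimethod-style option, using functools.singledispatch, which dispatches on the type of the first argument only. The point is the shape of the solution: a new operation is added to classes we pretend we cannot edit, without touching them.

     from functools import singledispatch

     # Pretend these classes come from a library we cannot modify.
     class Circle:
         def __init__(self, radius):
             self.radius = radius

     class Square:
         def __init__(self, side):
             self.side = side

     # A new operation, added without editing the classes above.
     @singledispatch
     def area(shape):
         raise TypeError("no area rule for " + type(shape).__name__)

     @area.register
     def _(shape: Circle):
         return 3.141592653589793 * shape.radius ** 2

     @area.register
     def _(shape: Square):
         return shape.side ** 2

     print(area(Circle(1.0)), area(Square(2.0)))   # 3.141592653589793 4.0

Clojure's multimethods can dispatch on an arbitrary function of the arguments, not just the first argument's class, but even this much shows how a new case slots in without reopening the original code.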

I liked two things about this talk. First, he taught in a "failure-driven" way: pose a problem, solve it using a known technique, expose a weakness, and move on to a more effective solution. I often use this technique when teaching design patterns. Second, the breadth of the problem and its potential solutions encouraged language geeks to talk about ideas in language design. The conversation included not only Java and Clojure but also Javascript, C#, and macros.

Guy Steele

My notes on this talk aren't that long, but it was important enough to have its own entry. Besides, this entry is already long enough!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

October 14, 2010 10:02 PM

Strange Loop 2010, Day 1 Morning

I arrived in the University City district this morning ready for a different kind of conference. The program is an interesting mix of hot topics in technology, programming languages, and computing. My heuristic for selecting talks to attend was this: Of the talks in each time slot, which discusses the biggest idea I need to know more about? The only exceptions will be talks that fill a specific personal need, in teaching, research, or personal connections.

A common theme among the talks I attended today was big data. Here are some numbers I jotted down. They may be out of date by now, or I may have made an error in some of the details, but the orders of magnitude are impressive:

  • Hilary Mason reported that bit.ly processes 10s of millions of URLs per day, 100s of millions of events per day, and 1,000s of millions of data points each day.
  • Google processes 24 petabytes a day, which amounts to 8 exabytes a year.
  • Digg has a 3 terabyte Cassandra database. This is dwarfed by Facebook's 150-node Cassandra cluster that comprises 150 terabytes.
  • Twitter users generate 12 terabytes of data each day -- the equivalent of 17,000 CDs -- and more than 4 petabytes each year. Also astounding: the data rate is doubling multiple times a year.

All of these numbers speak to the need for more CS programs to introduce students to data-intensive computing.

Hilary Mason on Machine Learning

Mason is a data scientist at bit.ly, a company that faces "wicked hard problems" at scale. These problems arise from the combination of algorithms, on-demand computing, and "data, data, data". Mason gave an introduction to machine learning as applied to some of these problems. She had a few quotable lines:

  • AI was "founded on a remarkable conceit". This reminded me of a recent entry.
  • "Academics like to create new terms, because if they catch on the academics get prizes."
  • "Only a Bayesian can tell you why if there's a 50-50 chance that something will happen, then 90% of the time it will."

She also sprinkled her talk with programming tips. One of the coolest was that you can append a "+" to the end of any bit.ly URL to get analytic metrics for it.

Gradle

I had planned to go to a talk on Flickr's architecture, but I talked so long in the break that the room was full before I got there. So I stopped in on a talk by Ken Sipe on Gradle, a scriptable Java build system built on top of Groovy. He had one very nice quote as well: "I can't think of a real use of this idea. I just like showing it off." The teacher and hacker in me smiled.

Eben Hewitt on Adopting Cassandra

Even if you are not Google, you may have to process a lot of data. Hewitt talked about some of the efforts made to scale relational databases to handle big data (query tuning, indexes, vertical scaling, shards, and denormalization). This sequence of fixes made me think of epicycles, fixes applied to orbits of heavenly bodies in an effort to match observed data. At some point, you start looking for a theory that fits better up front.

That's what happened in the computing world, too. Soon there were a number of data systems defined mostly by what they are not: "NoSQL". That idea is not new. Lotus Notes used a document-oriented storage system; Smalltalk images store universes of objects. Now, as the problem of big data becomes more prevalent, new and different implementations have been proposed.

Hewitt talked about Cassandra and what distinguishes it from other approaches. He called Cassandra the love child of the Google BigTable paper (2006) and the Amazon Dynamo paper (2007). He also pointed out some of the limitations that circumscribe its applicability. In the NoSQL approaches, there is "no free lunch": you give up many relational DB advantages, such as transactions, joins, and ad hoc queries.

He did advocate one idea I'll need to read more about: that we should shift our attention from the CAP theorem of distributed data systems, which is useful but misses some important dynamic distinctions, to Abadi's PACELC model: If the data store is Partitioned, then you trade off between Availability and Consistency; Else you trade off between Latency and Consistency.


Posted by Eugene Wallingford | Permalink | Categories: Computing

October 06, 2010 11:24 AM

Studying Program Repositories

Hackage, through Cabal

Last week, Garrett Morris presented an experience report at the 2010 Haskell Symposium entitled Using Hackage to Inform Language Design, which is also available at Morris's website. Hackage is an online repository for the Haskell programming community. Morris describes how he and his team studied programs in the Hackage repository to see how Haskell programmers work with a type-system concept known as overlapping instances. (The details of this idea aren't essential to this entry, but if you'd like to learn more, check out Section 6.2.3 in this user guide.)

In Morris's search, he sought answers to three questions:

  • What proportion of the total code on Hackage uses overlapping instances?
  • In code that uses overlapping instances, how many instances overlap each other?
  • Are there common patterns among the uses of overlapping instances?

Morris and his team used what they learned from this study to design the class system for their new language. They found that their language did not need to provide the full power of overlapping instances in order to support what programmers were really doing. Instead, they developed a new idea they call "instance chains" that was sufficient to support a large majority of the uses of overlapping instances. They are confident that their design can satisfy programmers because the design decision reflects actual uses of the concept in an extensive corpus of programs.

I love this kind of empirical research. One of the greatest assets the web gives us is access to large corpora of code: SourceForge and GitHub are examples of large public repositories, and there are an amazing number of smaller domain-specific repositories such as Hackage and RubyGems. Why design languages or programming tools blindly when we can see how real programmers work through the code they write?

The more general notion of designing languages via observing behavior, forming theories, and trying them out is not new. In particular, I recommend the classic 1981 paper Design Principles Behind Smalltalk. Dan Ingalls describes the method by which the Smalltalk team grew its language and system as explicitly paralleling the scientific method. Morris's paper talks about a similar method, only with the observation phase grounded in an online corpus of code.

Not everyone in computer science -- or outside CS -- thinks of this method as science. Just this weekend Dirk Riehle blogged about the need to broaden the idea of science in CS. In particular, he encourages us to include exploration as a part of our scientific method, as it provides a valuable way for us to form the theories that we will test using the sort of experiments that everyone recognizes as science.

Dirk Riehle's cycle of theory formation and validation

Unfortunately, too many people in computer science do not take this broad view. Note that Morris published his paper as an experience report at a symposium. He would have a hard time trying to get an academic conference program committee to take such a paper in its research track, without first dressing it up in the garb of "real" research.

As I mentioned in an earlier blog entry, one of my grad students, Nate Labelle, did an M.S. thesis a few years ago based on information gathered from the study of a large corpus of programs. Nate was interested in dependencies among open-source software packages, so he mapped the network of relationships within several different versions of Linux and within a few smaller software packages. This was the raw data he used to analyze the mathematical properties of the dependencies.

In that project, we trolled repositories looking for structural information about the code they contained. Morris's work studied Hackage to learn about the semantic content of the programs. While on sabbatical several years ago, I started a much smaller project of this sort to look for design patterns in functional programs. That project was sidetracked by some other questions I ran across, but I've always wanted to get back to it. I think there is a lot we could learn about functional programming in this way that would help us to teach a new generation of programmers in industry.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

September 29, 2010 9:27 PM

The Manifest Destiny of Computer Science?

The June 2010 issue of Communications of the ACM included An Interview with Ed Feigenbaum, who is sometimes called the father of expert systems. Feigenbaum was always an ardent promoter of AI, and time doesn't seem to have made him less brash. The interview closes with the question, "Why is AI important?" The father of expert systems pulls no punches:

There are certain major mysteries that are magnificent open questions of the greatest import. Some of the things computer scientists study are not. If you're studying the structure of databases -- well, sorry to say, that's not one of the big magnificent questions.

I agree, though occasionally I find installing and configuring Rails and MySQL on my MacBook Pro to be one of the great mysteries of life. Feigenbaum is thinking about the questions that gave rise to the field of artificial intelligence more than fifty years ago:

I'm talking about mysteries like the initiation and development of life. Equally mysterious is the emergence of intelligence. Stephen Hawking once asked, "Why does the universe even bother to exist?" You can ask the same question about intelligence. Why does intelligence even bother to exist?

That is the sort of question that captivates a high school student with an imagination bigger than his own understanding of the world. Some of those young people are motivated by a desire to create an "ultra-intelligent computer", as Feigenbaum puts it. Others are motivated more by the second prize on which AI founders set their eyes:

... a very complete model of how the human mind works. I don't mean the human brain, I mean the mind: the symbolic processing system.

That's the goal that drew the starry-eyed HS student who became the author of this blog into computer science.

Feigenbaum closes his answer with one of the more bodacious claims you'll find in any issue of Communications:

In my view the science that we call AI, maybe better called computational intelligence, is the manifest destiny of computer science.

There are, of course, many areas of computer science worthy of devoting one's professional life to. Over the years I have become deeply interested in questions related to language, expressiveness, and even the science or literacy that is programming. But it is hard for me to shake the feeling, still deep in my bones, that the larger question of understanding what we mean by "mind" is the ultimate goal of all that we do.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

September 26, 2010 7:12 PM

Competition, Imitation, and Nothingness on a Sunday Morning

Joseph Brodsky said:

Another poet who really changed not only my idea of poetry, but also my perception of the world -- which is what it's all about, ya? -- is Tsvetayeva. I personally feel closer to Tsvetayeva -- to her poetics, to her techniques, which I was never capable of. This is an extremely immodest thing to say, but, I always thought, "Can I do the Mandelstam thing?" I thought on several occasions that I succeeded at a kind of pastiche.

But Tsvetayeva. I don't think I ever managed to approximate her voice. She was the only poet -- and if you're a professional that's what's going on in your mind -- with whom I decided not to compete.

Tsvetayeva was one of Brodsky's closest friends in Russia. I should probably read some of her work, though I wonder how well the poems translate into English.

~~~~~~~~~~

absolutely nothing: next 22 miles

When I am out running 22 miles, I have lots of time to think. This morning, I spent a few minutes thinking about Brodsky's quote, in the world of programmers. Every so often I have encountered a master programmer whose work changes my perception of the world. I remember coming across several programs of Ward Cunningham's, including his wiki, and being captivated by its combination of simplicity and depth. Years before that, the ParcPlace Smalltalk image held my attention for months as I learned what object-oriented programming really was. That collection of code seemed anonymous at first, but I later learned its history and became a fan of Ingalls, Maloney, and the team. I am sure this happens to other programmers, too.

Brodsky also talks about his sense of competition with other professional poets. From the article, it's clear that he means not a self-centered or destructive competition. He liked Tsvetayeva deeply, both professionally and personally. The competition he felt is more a call to greatness, an aspiration. He was following the thought, "That is beautiful" with "I can do that" -- or "Can I do that?"

I think programmers feel this all the time, whether they are pros or amateurs. Like artists, many programmers learn by imitating the code they see. These days, the open-source software world gives us so many options! See great code; imitate great code. Find a programmer whose work you admire consistently; imitate the techniques, the style, and, yes, the voice. The key in software, as in art, is finding the right examples to imitate.

Do programmers ever choose not to compete in Brodsky's sense? Maybe, maybe not. There are certainly people whose deep grasp of computer science ideas usually feels beyond my reach. Guy Steele comes to mind. But I think for programmers it's mostly a matter of time. We have to make trade-offs between learning one thing well or another.

~~~~~~~~~~

22 miles is a long run. I usually do only one to two runs that long during my training for a given marathon. Some days I start with the sense of foreboding implied by the image above, but more often the run is just there. Twenty-two miles. Run.

This time the morning was chill, 40 degrees with a bright sun. The temperature had fallen so quickly overnight that the previous day's rain had condensed in the leaves of every tree and bush, ready to fall like a new downpour at the slightest breeze.

This is my last long run before taking on Des Moines in three weeks. It felt neutral and good at the same time. It wasn't a great run, like my 20-miler two weeks ago, but it did what it needed to do: stress my legs and mind to run for about as long as the marathon will be. And I had plenty of time to think through the nothingness.

Now begins my taper, an annual ritual leading to a race. The 52 miles I logged this week will seep into my body for the next ten days or so as it acclimates to the stress. From here, I will pare back my mileage and devote a few more short and medium-sized runs to converting strength into speed.

~~~~~~~~~~

The quote that opens this entry comes from Joseph Brodsky, The Art of Poetry No. 28, an interview in The Paris Review by Sven Birkerts in December 1979. I like very much to hear writers talk about how they write, about other writers, and about the culture of writing. This long interview repaid me several times for the time I spent reading.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Running, Teaching and Learning

September 20, 2010 4:28 PM

Alan Kay on "Real" Object-Oriented Programming

In Alan Kay's latest comment-turned-blog entry, he offers thoughts in response to Moti Ben-Ari's CACM piece about object-oriented programming. Throughout, he distinguishes "real oop" from what passes under the term these days. His original idea of object-oriented software is really quite simple to express: encapsulated modules all the way down, with pure messaging. This statement boils down to its essence many of the things he has written and talked about over the years, such as in his OOPSLA 2004 talks.

This is a way to think about and design systems, not a set of language features grafted onto existing languages or onto existing ways of programming. He laments what he calls "simulated data structure programming", which is, sadly, the dominant style one sees in OOP books these days. I see this style in nearly every OOP textbook -- especially those aimed at beginners, because those books generally start with "the fundamentals" of the old style. I see it in courses at my university, and even dribbling into my own courses.

One of the best examples of an object-oriented system is one most people don't think of as a system at all: the Internet:

It has billions of completely encapsulated objects (the computers themselves) and uses a pure messaging system of "requests, not commands", etc.

Distributed client-server apps make good examples and problems in courses on OO design precisely because they separate control of the client and the server. When we write OO software in which we control both sides of the message, it is often too tempting to take advantage of that control and violate encapsulation. These violations can be quite subtle, even when we take into account idioms such as access methods and the Law of Demeter. To what extent does one component depend on how another component does its job? The larger the answer, the more coupled the components.
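Here is a tiny, made-up illustration of the kind of coupling I mean. The first client reaches into another object's representation, which works only as long as we control both sides; the second just sends a message and lets the object do its job.

    class Order:
        def __init__(self, items):
            self._items = items            # internal representation

        def total(self):
            return sum(price for _, price in self._items)

    # Coupled client: it "knows" the order keeps (name, price) pairs, so any
    # change to Order's representation silently breaks this code.
    def report_coupled(order):
        return sum(price for _, price in order._items)

    # Less coupled client: send a message and let the object do the work.
    def report(order):
        return order.total()

    order = Order([("book", 12.50), ("pen", 1.25)])
    print(report_coupled(order), report(order))    # 13.75 13.75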

Encapsulation isn't an end unto itself, of course. Nor are other features of our implementation:

The key to safety lies in the encapsulation. The key to scalability lies in how messaging is actually done (e.g., maybe it is better to only receive messages via "postings of needs"). The key to abstraction and compactness lies in a felicitous combination of design and mathematics.

I'd love to hear Kay elaborate on this "felicitous combination of design and mathematics"... I'm not sure just what he means!

As an old AI guy, I am happy to see Kay make reference to the Actor model proposed by Carl Hewitt back in the 1970s. Hewitt's ideas drew some of their motivation from the earliest Smalltalk and gave rise not only to Hewitt's later work on concurrent programming but also to Scheme. Kay even says that many of Hewitt's ideas "were more in the spirit of OOP than the subsequent Smalltalks."

Another old AI idea that came to my mind as I read the article was the blackboard architecture. Kay doesn't mention blackboards explicitly, but he does talk about how messaging might work better if, instead of objects sending messages to specific targets, they were to "post their needs". In a blackboard system, objects capable of satisfying needs monitor the blackboard and offer to respond to a request as they are able. The blackboard metaphor maintains some currency in the software world, especially in distributed computing; it even shows up as an architectural pattern in Pattern-Oriented Software Architecture. This is a rich metaphor with much room for exploration as a mechanism for OOP.
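A minimal sketch of the idea, with invented names, just to make the mechanism concrete: components post needs to a shared board, and knowledge sources that can satisfy a need respond when they see one.

    class Blackboard:
        """A shared space where components post needs and read results."""
        def __init__(self):
            self.needs = []       # outstanding requests
            self.results = {}     # need -> answer
            self.sources = []     # knowledge sources watching the board

        def register(self, source):
            self.sources.append(source)

        def post_need(self, need):
            self.needs.append(need)

        def run(self):
            # Let any source that can satisfy a need do so; in this sketch,
            # a need nobody can handle is simply dropped.
            while self.needs:
                need = self.needs.pop(0)
                for source in self.sources:
                    if source.can_handle(need):
                        self.results[need] = source.handle(need)
                        break

    class Squarer:
        def can_handle(self, need):
            return need[0] == "square"

        def handle(self, need):
            return need[1] ** 2

    board = Blackboard()
    board.register(Squarer())
    board.post_need(("square", 7))
    board.run()
    print(board.results)    # {('square', 7): 49}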

Finally, as a CS educator, I couldn't help but notice Kay repeating a common theme of his from the last decade, if not longer:

The key to resolving many of these issues lies in carrying out education in computing in a vastly different way than is done today.

That is a tall order all its own, much harder in some ways than carrying out software development in a vastly different way than is done today.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 10, 2010 2:22 PM

Recursion, Trampolines, and Software Development Process

Yesterday I read an old article on tail calls and trampolining in Scala by Rich Dougherty, which summarizes nicely the problem of recursive programming on the JVM. Scala is a functional language, which lends itself to recursive and mutually recursive programs. Compiling those programs to run on the JVM presents problems because (1) the JVM's control stack is shallow and (2) the JVM doesn't support tail-call optimization. Fortunately, Scala supports first-class functions, which enables the programmer to implement a "trampoline" that avoids growing the stack. The resulting code is harder to understand, and so harder to maintain, but it runs without growing the control stack. This is a nice little essay.
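Dougherty's examples are in Scala, but the trampolining idea is language-agnostic. Here is a sketch of it in Python, which, like the JVM, does not eliminate tail calls: the mutually recursive functions return thunks instead of calling each other directly, and a small driver loop bounces on the thunks so the control stack never grows.

    def trampoline(result):
        # Keep calling until we get a value instead of a thunk.
        while callable(result):
            result = result()
        return result

    # Mutually recursive definitions in trampolined style: instead of calling
    # each other directly, they return a zero-argument function (a thunk).
    def is_even(n):
        return True if n == 0 else (lambda: is_odd(n - 1))

    def is_odd(n):
        return False if n == 0 else (lambda: is_even(n - 1))

    print(trampoline(is_even(100001)))    # False, without a deep call stack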

Dougherty's conclusion about trampoline code being harder to understand reminded me of a response by reader Edward Coffin to my July entry on CS faculty sending bad signals about recursion. He agreed that recursion usually is not a problem from a technical standpoint but pointed out a social problem (paraphrased):

I have one comment about the use of recursion in safety-critical code, though: it is potentially brittle with respect to changes made by someone not familiar with that piece of code, and brittle in a way that makes breaking the code difficult to detect. I'm thinking of two cases here: (1) a maintainer unwittingly modifies the code in a way that prevents the compiler from making the formerly possible tail-call optimization and (2) the organization moves to a compiler that doesn't support tail-call optimization from one that did.

Edward then explained how hard it is to warn the programmers that they have just made changes to the code that invalidate essential preconditions. This seems like a good place to comment the code, but we can't rely on programmers paying attention to such comments, or even on the comments accompanying the code forever. The compiler may not warn us, and it may be hard to write test cases that reliably fail when the optimization is missed. Scala's @tailrec annotation is a great tool to have in this situation!

"Ideally," he writes, "these problems would be things a good organization could deal with." Unfortunately, I'm guessing that most enterprise computing shops are probably not well equipped to handle them gracefully, either by personnel or process. Coffin closes with a pragmatic insight (again, paraphrased):

... it is quite possible that [such organizations] are making the right judgement by forbidding it, considering their own skill levels. However, they may be using the wrong rationale -- "We won't do this because it is generally a bad idea." -- instead of the more accurate "We won't do this because we aren't capable of doing it properly."

Good point. I don't suppose it's reasonable for me or anyone to expect people in software shops to say that. Maybe the rise of languages such as Scala and Clojure will help both industry and academia improve the level of developers' ability to work with functional programming issues. That might allow more organizations to use a good technical solution when it is suitable.

That's one of the reasons I still believe that CS educators should take care to give students a balanced view of recursive programming. Industry is beginning to demand it. Besides, you never know when a young person will get excited about a problem whose solution feels most natural as a recursion and set off to write a program to pursue that excitement. We also want our graduates to be able to create solutions to hard problems that leverage the power of recursion. We need students to grok the Big Idea of recursion as a means for decomposing problems and composing systems. The founding of Google offers an instructive example of recursion as inductive definition, as discussed in this Scientific American article on web science:

[Page and Brin's] big insight was that the importance of a page -- how relevant it is -- was best understood in terms of the number and importance of the pages linking to it. The difficulty was that part of this definition is recursive: the importance of a page is determined by the importance of the pages linking to it, whose importance is determined by the importance of the pages linking to them. [They] figured out an elegant mathematical way to represent that property and developed an algorithm they called PageRank to exploit the recursiveness, thus returning pages ranked from most relevant to least.

Much like my Elo ratings program that used successive approximation, PageRank may be implemented in some other way, but it began as a recursive idea. Students aren't likely to have big recursive ideas if we spend years giving them the impression it is an esoteric technique best reserved for their theory courses.
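For the curious, here is a sketch of the successive-approximation approach on a made-up four-page web. It is not Page and Brin's code, just the recursive idea computed iteratively until the ranks stop changing by more than an epsilon.

    # links[page] = pages that page links to (a made-up four-page web)
    links = {
        "A": ["B", "C"],
        "B": ["C"],
        "C": ["A"],
        "D": ["C"],
    }

    def pagerank(links, damping=0.85, epsilon=1e-8):
        pages = list(links)
        n = len(pages)
        rank = {p: 1.0 / n for p in pages}      # initial guess: all pages equal
        while True:
            new_rank = {}
            for p in pages:
                # A page's importance is the sum of the shares of importance
                # passed along by the pages that link to it.
                incoming = sum(rank[q] / len(links[q])
                               for q in pages if p in links[q])
                new_rank[p] = (1 - damping) / n + damping * incoming
            if max(abs(new_rank[p] - rank[p]) for p in pages) < epsilon:
                return new_rank
            rank = new_rank

    for page, score in sorted(pagerank(links).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))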

So, yea! for Scala, Clojure, and all the other languages that are making recursion respectable in practice.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 07, 2010 4:53 PM

Quick Thoughts on "Computational Thinking"

When Mark Guzdial posted his article Go to the Data, offering "two stories of (really) computational thinking", I thought for sure that I'd write something on the topic soon. Over four and a half weeks have passed, and I have not written yet. That's because, despite thinking about it on and off ever since, I still don't have anything deep or detailed to say. I'm as confused as Mark says he and Barb are, I guess.

Still, I can't seem to shake these two simple thoughts:

  • Whatever else it may be, "computational thinking" has to involve, well, um, computation. As several commenters on his piece point out, lots of scientists collect and analyze data. Computer science connects data and algorithm in a way that other disciplines don't.

  • If a bunch of smart, well-informed people can discuss a name like "computational thinking" for several years and still be confused, maybe the problem is with the name. Maybe it really is just another name for a kind of computer science, generalized so that we can speak meaningfully of non-CS types and non-programmers doing it. I believe strongly that computing offers a new medium for expressing and creating ideas but, as for the term computational thinking, maybe there's no there there.

I do agree with Mark that we runners can learn a lot by looking back at our running logs! An occasional short program can often expose truths that elude the naked eye.
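For instance, a few lines of Python over a made-up log can surface patterns that are hard to see while scanning the file by eye:

    # Hypothetical log entries: (date, miles, minutes)
    log = [
        ("2010-09-06",  8.0,  64.0),
        ("2010-09-08",  5.0,  42.0),
        ("2010-09-12", 20.0, 172.0),
        ("2010-09-14",  8.0,  63.0),
        ("2010-09-19", 12.0,  99.0),
        ("2010-09-26", 22.0, 192.0),
    ]

    total_miles = sum(miles for _, miles, _ in log)
    total_minutes = sum(minutes for _, _, minutes in log)
    average_pace = total_minutes / total_miles
    print(f"{total_miles:.1f} miles at {average_pace:.2f} minutes per mile")

    # Which runs were faster than the average pace?
    for date, miles, minutes in log:
        if minutes / miles < average_pace:
            print(date, f"{minutes / miles:.2f} min/mile over {miles:.0f} miles")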


Posted by Eugene Wallingford | Permalink | Categories: Computing

September 01, 2010 9:12 PM

The Beginning of a New Project

"Tell me you're not playing chess," my colleague said quizzically.

But I was. My newest grad student and I were sitting in my office playing a quick couple of games of progressive chess, in which I've long been interested. In progressive chess, white makes one move, then black makes two; white makes three moves, then black makes four. The game proceeds in this fashion until one of the players delivers checkmate or until the game ends in any other traditional way.

This may seem like a relatively simple change to the rules of the game, but the result is something that almost doesn't feel like chess. The values of the pieces change radically, as do the value of space and the meaning of protection. That's why we needed to play a couple of games: to acquaint my student with how different it is from the classical chess I know and love and have played since I was a child.

For his master's project, the grad student wanted to do something in the general area of game-playing and AI, and we both wanted to work on a problem that is relatively untouched, where a few cool discoveries are still accessible to mortals. Chess, the fruit fly of AI from the 1950s into the 1970s, long ago left the realm where newcomers could make much of a contribution. Chess isn't solved in the technical sense, as checkers is, but the best programs now outplay even the best humans. To improve on the state of the art requires specialty hardware or exquisitely honed software.

Progressive chess, on the other hand, has a funky feel to it and looks wide open. We are not yet aware of much work that has been done on it, either in game theory or automation. My student is just beginning his search of the literature and will know soon how much has been done and what problems have been solved, if any.

That is why we were playing chess in my office on a Wednesday afternoon, so that we could discuss some of the ways in which we will have to think differently about this problem as we explore solutions. Static evaluation of positions is most assuredly different from what works in classical chess, and I suspect that the best ways to search the state space will be quite different, too. After playing only a few games, my student proposed a new way to do search to capitalize on progressive chess's increasingly long sequences of moves by one player. I'm looking forward to exploring it further, giving it a try in code, and finding other new ideas!

I may not be an AI researcher first any more, but this project excites me. You never know what you will discover until you wander away from known territory, and this problem offers us a lot of unknowns.

And I'll get to say, "Yes, we are playing chess," every once in a while, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

August 31, 2010 7:18 PM

Notes from a Master Designer

Someone tweeted recently about an interview with Fred Brooks in Wired magazine. Brooks is one of the giants of our field, so I went straight to the page. I knew that I wanted to write something about the interview as soon as I saw this exchange, which followed up questions about how a 1940s North Carolina schoolboy ended up working with computers:

Wired: When you finally got your hands on a computer in the 1950s, what did you do with it?

Brooks: In our first year of graduate school, a friend and I wrote a program to compose tunes. The only large sample of tunes we had access to was hymns, so we started generating common-meter hymns. They were good enough that we could have palmed them off to any choir.

It never surprises me when I learn that programmers and computer scientists are first drawn to software by a desire to implement creative and intelligent tasks. Brooks was first drawn to computers by a desire to automate data retrieval, which at the time must have seemed almost as fantastic as composing music. In a Communications of the ACM interview printed earlier this year, Ed Feigenbaum called AI the "manifest destiny" of computer science. I often think he is right. (I hope to write about that interview soon, too.)

But that's not the only great passage in Brooks's short interview. Consider:

Wired: You say that the Job Control Language you developed for the IBM 360 OS was "the worst computer programming language ever devised by anybody, anywhere." Have you always been so frank with yourself?

Brooks: You can learn more from failure than success. In failure you're forced to find out what part did not work. But in success you can believe everything you did was great, when in fact some parts may not have worked at all. Failure forces you to face reality.

As an undergrad, I took a two-course sequence in assembly language programming and JCL on an old IBM 370 system. I don't know how much the JCL on that machine had advanced beyond Brooks's worst computer programming language ever devised, if it had at all. But I do know that the JCL course gave me a negative-split learning experience unlike any I had ever had before or have had since. As difficult as that was, I will be forever grateful for Dr. William Brown, a veteran of the IBM 360/370 world, and what he taught me that year.

There are at least two more quotables from Brooks that are worth hanging on my door some day:

Great design does not come from great processes; it comes from great designers.

Hey to Steve Jobs.

The insight most likely to improve my own work came next:

The critical thing about the design process is to identify your scarcest resource.

This one line will keep me on my toes for many projects to come.

If great design comes from great designers, then how can the rest of us work toward the goal of becoming a great designer, or at least a better one?

Design, design, and design; and seek knowledgeable criticism.

Practice, practice, practice. But that probably won't be enough. Seek out criticism from thoughtful programmers, designers, and users. Listen to what they have to say, and use it to improve your practice.

A good start might be to read this interview and Brooks's books.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

August 24, 2010 4:30 PM

Dreaming, Doing, Perl, and Language Translation

Today, I quoted Larry Wall's 2000 Atlanta Linux Showcase Talk on the first day of my compilers course. In that talk, he gives a great example of using a decompiler to port code -- in this case, from Perl 5 to Perl 6. While re-reading the talk, I remembered something that struck me as wrong when I read it the first time:

["If you can dream it, you can do it"--Walt Disney]

"If you can dream it, you can do it"--Walt Disney. Now this is actually false (massive laughter). I think Walt was confused between necessary and sufficient conditions. If you *don't* dream it, you can't do it; that is certainly accurate.

I don't think so. I think this is false, too. (Laugh now.)

It is possible to do things you don't dream of doing first. You certainly have to be open to doing things. Sometimes we dream something, set out to do it, and end up doing something else. The history of science and engineering is full of accidents and incidental results.

I once was tempted to say, "If you don't start it, you can't do it; that is certainly accurate." But I'm not sure that's true either, because of the first "it". These days, I'm more inclined to say that if you don't start doing something, you probably won't do anything.

Back to Day 1 of the compilers course: I do love this course. The Perl quote in my lecture notes is but one element in a campaign to convince my students that this isn't just a compilers course. The value in the course material and in the project itself goes far beyond the creation of an old-style source language-to-machine language translator. Decompilers, refactoring browsers, cross-compilers, preprocessors, interpreters, and translators for all sorts of domain-specific languages -- a compilers course will help you learn about all of these tools, both how they work and how to build them. Besides, there aren't many better ways to consolidate your understanding of the breadth of computer science than to build a compiler.

The official title of my department's course is "Translation of Programming Languages". Back in 1994, before the rebirth of mainstream language experimentation and the growth of interest in scripting languages and domain-specific languages, this seemed like a daring step. These days, the title seems much more fitting than "Compiler Construction". Perhaps my friend and former colleague Mahmoud Pegah and I had a rare moment of foresight. More likely, Mahmoud had the insight, and I was simply wise enough to follow.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

August 20, 2010 3:30 PM

Simplicity, Complexity, and Good Enough Chess Ratings

Andrew Gelman writes about a competition offered by Kaggle to find a better rating system for chess. The Elo system has been used for the last 40+ years with reasonable success. In the era of big data, powerful ubiquitous computers, and advanced statistical methods, it turns out that we can create a rating system that predicts more accurately the performance of players in games in the near future. Very cool. I'm still enough of a chess geek that I want to know just when Capablanca surpassed Lasker and how much better Fischer was than his competition in the 1972 challenger's matches. I've always had an irrational preference for ridiculously precise values.

Even as we find systems that perform better, I find myself still attached to Elo. I'm sure part of it is that I grew up with Elo ratings as a player, and read Elo's The Rating of Chess Players, Past and Present as a teen.

But there's more. I've also written programs to implement the rating system, including the first program I ever wrote out of passion. Writing the code to assign initial ratings to a pool of players based on the outcomes of games played among them required me to do something I didn't even know was possible at the time: start a process that wasn't guaranteed to stop. I learned about the idea of successive approximations and how my program would have to settle for values that fit the data well enough. This was my first encounter with epsilon, and my first non-trivial use of recursion. Yes, I could have written a loop, but the algorithm seemed so clear written recursively. Such experiences stick with a person.
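In that spirit, here is a sketch of the idea (certainly not the BASIC original). It uses the common performance-rating formula and a 1500-point anchor, and it caps the number of passes because, as I said, the process isn't guaranteed to settle.

    def rate_pool(games, epsilon=1.0, max_passes=1000, anchor=1500.0):
        # games: list of (player_a, player_b, score_a), where score_a is
        # 1 for a win by player_a, 0 for a loss, and 0.5 for a draw.
        players = {p for a, b, _ in games for p in (a, b)}
        ratings = {p: anchor for p in players}

        for _ in range(max_passes):      # the process isn't guaranteed to
            new = {}                     # converge, so cap the passes
            for p in players:
                opp_sum, score, n = 0.0, 0.0, 0
                for a, b, s in games:
                    if p not in (a, b):
                        continue
                    opp_sum += ratings[b if p == a else a]
                    score += s if p == a else 1 - s
                    n += 1
                # Performance rating: average opponent rating plus a bonus
                # (or penalty) proportional to wins minus losses.
                new[p] = opp_sum / n + 400.0 * (2 * score - n) / n
            # Re-center so the pool's average stays at the anchor value.
            shift = anchor - sum(new.values()) / len(new)
            new = {p: r + shift for p, r in new.items()}
            if max(abs(new[p] - ratings[p]) for p in players) < epsilon:
                return new
            ratings = new
        return ratings

    games = [("kim", "lee", 1), ("lee", "pat", 0.5), ("pat", "kim", 0)]
    print(rate_pool(games))    # kim on top, lee and pat even, average 1500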

There is still more, though, beyond my personal preferences and experiences. Compared to most of the alternatives that do a better job objectively, the Elo system is simple. The probability curve is simple enough for anyone to understand, and the update process is basic arithmetic. Even better, there is a simple linear approximation of the curve that made it possible for a bunch of high school kids with no interest in math to update ratings based on games played at the club. We posted a small table of expected values based on rating differences at the front of the room and maintained the ratings on index cards. (This is a different sort of index-card computing than I wrote about long ago.) There may have been more accurate systems we could have run, but the math behind this one was so simple, and the ratings were more than good enough for our purposes. I am guessing that the Elo system is more than good enough for most people's purposes.

Simple and good enough is a strong combination. Perhaps the Elo system will turn out to be the Newtonian physics of ratings. We know there are better, more accurate models, and we use them whenever we need something very accurate. Otherwise, we stick to the old model and get along just fine almost all the time.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

August 17, 2010 5:33 PM

Does Computer Science Have a Culture?

Zed Shaw is known for his rants. I've enjoyed many of them over the years. However, his Go To University, Not For CS hits awfully close to home. I love his defense of a university education, but he doesn't have much good to say about computer science programs. This is the punchline:

This is why you go to a University and also why you should not study Computer Science. Except for a few places like MIT, Computer Science is a pointless discipline with no culture. Everything is borrowed from somewhere else. It's the ultimate post modern discipline, and nobody teaching it seems to know what the hell "post-modernism" even means.

He is perhaps a bit harsh, yet what counterargument might we offer? If you studied computer science, did your undergrad alma mater or your graduate school have a CS culture? Did any of your professors offer a coherent picture of CS as a serious intellectual discipline, worthy of study independent of specific technologies and languages?

In graduate school, my advisor and I talked philosophically about CS, artificial intelligence, and knowledge in a way that stoked my interest in computing as a coherent discipline. A few of my colleagues shared our interests, but many of my fellow graduate students were more interested in specific problems and solutions. They viewed our philosophical explorations as diversions from the main attraction.

Unfortunately, when I look around at undergrad CS programs, I rarely see a CS culture. This is true of what I see at my own university, at my friends' schools, and at schools I encounter professionally. Some programs do better than others, but most of us could do better. Some of our students would appreciate the intellectual challenge that is computer science beyond installing the latest version of Linux or making Eclipse work with SVN.

Shaw offers one sentence of great advice for those of us thinking about undergrad curriculum:

... the things that are core to Computer Science like language design, parsing, or state machines, aren't even taught unless you take an "advanced" course.

I feel his pain. Few schools seem willing to design a curriculum built around core ideas packaged any differently from the way they were packaged in 1980. Students can graduate from most CS programs in this country without studying language design or parsing in any depth.

I can offer one counterpoint: some of us do know what post-modernism is and means. Larry Wall calls Perl the first postmodern computer language. More insightful to me, though, is work by James Noble and Robert Biddle, in particular Notes on Notes on Postmodern Programming, which I mentioned briefly a few years ago.

Shaw is right: there can be great value in studying at a university. We need to make sure that computer science students receive all of the value they should.


Posted by Eugene Wallingford | Permalink | Categories: Computing

August 13, 2010 8:42 PM

Learning from Projects in Industry Training

Former student and current ThoughtWorker Chris Turner sent me an article on ThoughtWorks University's new project-based training course. I mentioned Chris once before, soon after he joined ThoughtWorks. (I also mentioned his very cool research on "zoomable" user interfaces, still one of my all-time favorite undergrad projects.)

Chris took one of my early offerings of agile software development, one that tried to mix traditional in-class activities with a "studio approach" to a large team project. My most recent offering of the course turned the knobs a bit higher, with two weeks of lecture and pair learning exercises followed by two weeks of intensive project. I really like the results of the new course but wonder how I might be able to do the same kind of thing during the regular semester, when students take five courses and typically spend only three hours a week in class over fifteen weeks.

The ThoughtWorks U. folks do not work under such constraints and have even more focused time available than my one-month course. They bring students in for six weeks of full-time work. Not surprisingly they came to question the effectiveness of their old approach: five weeks of lecture and learning activities followed by a one-week simulation of a project. Most of the learning, it seemed, happened in context during the week-long project. Maybe they should expand the project? But... there is so much content to teach!

Eventually they asked themselves the $64,000 Question:

"What if we don't teach this at all? What's the worst that can happen?"

I love this question. When trying to balance practice in context with yet another lecture, university professors should ask this question about each element of the courses they teach. Often the answer is that students will have to learn the concept from their experience on real projects. Maybe students need more experience on real projects, not more lecture and more homework problems from the back of the textbook chapter!

The folks at TWU redesigned their training program for developers to consist of two weeks of training and four weeks of project work. And they -- and their students -- seem pleased with the results.

... information in context trumped instruction out of context in a huge way. The project was an environment for students to fail in safety. Failure created the need for people to learn and a catalyst for us to coach and teach. A real project environment also allowed students to learn to learn.

This echoes my own experience and is one of the big reasons I think so much about project-based courses. Students still need to learn ideas and concepts, and some will need more direct individual assistance to pick them up. The ThoughtWorks folks addressed that need upfront:

We also created several pieces of elearning to help students gain some basic skills when they needed them. Coupled with a social learning platform and a 6:1 student-coach ratio, we were looking at a program that focussed heavily on individualisation as against an experience that was one-size-fits-all-but-fits-nobody. Even with the elearning, we ensured that we were pragmatic in partnering with external content providers whose content met our quality standards.

This is a crucial step, and one that I would like to improve before I teach my agile course again. I found lots of links to on-line resources students could use to learn about agile and XP, but I need to create better materials in some areas and create materials to fill gaps in the accessible web literature. If I want to emphasize the project in my compiler course even more, I will need to create a lot of new materials. What I'd really like to do is create active e-learning resources, rather than text to read. The volume, variety, and quality of supporting materials are even more important if we want to make projects the central activity in courses for beginners.

By the way, I also love the phrase "one-size-fits-all-but-fits-nobody".

When faculty who teach more traditional courses in more traditional curricula hear stories such as this one from TWU, they always ask me the same question: How much does the success of such an industry training program depend on "basic knowledge" students learned in traditional courses? I wonder the same thing. Could we start CS1 or CS2 with two weeks of regular classes followed by four weeks of project? What would work and what wouldn't? Could we address the weaknesses to make the idea work? If we could, student motivation might reach a level higher than we see now. Even better, student learning might improve as students encounter ideas when they need them to solve problems that matter. (For an opinion to the contrary, see Moti Ben-Ari's comments as reported by Mark Guzdial.)

School starts in a week, so my thoughts have turned to my compiler course. This course is already based on one of the classic project experiences that CS students can have. There is a tendency to think all is well with the basic structure of the course and that we should leave it alone. That's not really my style. Having taught compilers many times, I know my course's strengths and weaknesses and know that it can be improved. The extent to which I change it is always an open question.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

August 10, 2010 3:36 PM

In Praise of Attacking Hard Problems

With the analysis of Deolalikar's P != NP paper now under way in earnest, I am reminded of a great post last fall by Lance Fortnow, The Humbling Power of P v NP. Why should every theorist try to prove P = NP and P != NP?

Not because you will succeed but because you will fail. No matter what teeth you sink into P vs NP, the problem will bite back. Think you solved it? Even better. Not because you did but because when you truly understand why your proof has failed you will have achieved enlightenment.

You might even succeed, though I'm not sure if the person making the attempt achieves the same kind of enlightenment in that case.

Even if Deolalikar's proof holds up, Fortnow's short essay will still be valuable and true.

We'll just use a different problem as our standard.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

August 02, 2010 1:54 PM

Computing and Education in My News

My newsreader and inbox are full of recent articles about computing and education in the news. First, there is a New York Times Technology section piece on Scott McNealy's open-source textbook project, Curriki. When I first read this, I thought for sure I would blog on the idea of free and open-source textbooks today. The more I thought about it, and especially the more I tried to write about it, the less I found I have to say right now. Mark Guzdial has already responded with a few concerns he has about open-source textbooks. Guzdial conflates "open source" with "free", as does the Times piece, though McNealy's project seems to be mostly about offering low-cost or free alternatives to increasingly expensive school books. Most of Guzdial's concerns echo the red flags people have raised about free and open-source software in the past, and we see the effect FOSS has had in the world.

Maybe I'll have something cogent to say some other day, but for now, all I can think is, "Is that a MacBook Pro in the photo of McNealy and his son?" If so, even a well-placed pencil holder can't hide the truth!

Then there is a blog entry at Education Week on the Computer Science Education Act, a bill introduced in the U.S. House of Representatives last week aimed at improving the state of K-12 CS education. Again, any initial excitement to write at length on this topic faded as I thought more about it. This sort of bill is introduced all the time in Congress with little or no future, so until I see this one receive serious attention from House leaders I'll think of it as mostly good PR for computer science. I do not generally think that legislation of this kind has a huge effect on practice in the schools, which are much too complicated to be pushed off course by a few exploratory grants or a new commission. That said, it's nice that a few higher-ups in education might think deeply about the role CS could play in 21st-century K-12 education. This ain't 1910, folks.

Finally, here's one that I can blog about with excitement and a little pride: One of my students, Nick Cash, has been named one of five finalists in Entrepreneur Magazine's Entrepreneur of 2010 contest. Nick is one of those bright guys for whom our education system is a poor fit, because he is thinking bigger thoughts than "when is the next problem set due?" He has been keeping me apprised of his start-up every so often, but things change so fast that it is hard for me to keep up.

One of the things that makes me proud is the company he is keeping in that final five. Maryland and Michigan are big-time universities with big-time business schools. Though you may not have heard of Babson College, it has long had one of the top-ranking undergraduate entrepreneurship programs in the country. (I know that in part because I double-majored in accounting at Ball State University, which also has a top-ranked entrepreneurship center for undergrads.) UNI has been doing more to support student entrepreneurship over the last few years, including an incubator for launching start-ups. Still, Nick has made it to the finals against students who come from better-funded and better-known programs. That says even more about his accomplishment.

Nick's company, Book Hatchery, is certainly highly relevant in today's digital publishing market. I'll be wishing him well in the coming years and helping in any way he asks. Check out the link above and, if you are so inclined, cast a vote for his start-up in the contest!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 28, 2010 2:26 PM

Sending Bad Signals about Recursion

A couple of weeks ago there was a sustained thread on the SIGCSE mailing list on the topic of teaching recursion. Many people expressed strongly held opinions, though most of those were not applicable outside the poster's own context. Only a few of the views expressed were supported by educational research.

My biggest take-away from the discussion was this: I can understand why CS students at so many schools have a bad impression of recursion. Like so many students I encounter, the faculty who responded expressed admiration for the "beauty and elegance" of recursion but seemed to misunderstand at a fundamental level how to use it as a practical tool.

The discussion took a brief side excursion into the merits of Towers of Hanoi as a useful example for teaching recursion to students in the first year. It is simple and easy for students to understand, said the proponents, so that makes it useful. In my early years of teaching CS1/CS2, I used Towers as an example, but long ago I came to believe that real problems are more compelling and provide richer context for learning. (My good friend Owen Astrachan has been sounding the clarion call on this topic for a few years now, including a direct dig at the Towers of Hanoi!)

My concern with the Towers is more specific when we talk about recursion. One poster remarked that this problem helped students to see that recursion can be slow:

Recursion *is* slow if you're solving a problem that is exponentially hard like Hanoi. You can't solve it faster than the recursive solution, so I think Hanoi is a perfectly fine example of recursion.

M. C. Escher's 'Drawing Hands'

This, I think, is one of the attitudes that gives our students an unduly bad impression of recursion, because it confounds problem and solution. Most students leave their first-year courses thinking that recursion is slow and computationally expensive. This is in part an effect of the kinds of problems we solve with recursion there. The first examples we show students of loops tend not to solve exponentially hard problems. This leads students to infer that loops are fast and recursion is slow, when the computational complexity was a feature of the problems, not the solutions. A loop-based solution to Towers would be slow and use a lot of space, too! We can always tell our students about the distinction, but they see so few examples of recursion that they are surely left with a misimpression, through no fault of their own.
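To make the problem-versus-solution point concrete: the recursive solution sketched below is about as clear as Hanoi gets, and any correct solution, loop or recursion, must produce the same 2^n - 1 moves, because that is what the problem demands.

    def hanoi(n, source, target, spare, moves):
        # Append the moves that transfer n disks from source to target.
        if n == 0:
            return
        hanoi(n - 1, source, spare, target, moves)    # clear the way
        moves.append((source, target))                # move the largest disk
        hanoi(n - 1, spare, target, source, moves)    # pile the rest back on top

    moves = []
    hanoi(10, "A", "C", "B", moves)
    print(len(moves))    # 1023 == 2**10 - 1, no matter how we code the solution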

Another poster commented that he had once been a consultant on a project at a nuclear reactor. One of the programmers proudly showed one of their programs that used recursion to solve one of their problems. By using recursion, they had been able to construct a straightforward inductive proof of the code's correctness. The poster chided the programmer, because the code was able to overflow the run-time stack and fail during execution. He encouraged them to re-write the code using a subset of looping constructs that enables proofs over the limited set of programs it generates. Recursion cannot be used in real-time systems, he asserted, for just this reason.

Now, I don't want run-time errors in the code that runs our nuclear reactors or any other real-time system, for that matter, but that conclusion is a long jump from the data. I wrote to this faculty member off-list and asked whether the programming language in question forbids, allows, or requires the compiler to optimize recursive code or, more specifically, tail calls. With tail call optimization, a compiler can convert a large class of recursive functions to a non-recursive run-time implementation. This means that the programmer could have both a convincing inductive proof of the code's correctness and a guarantee that the run-time stack will never grow beyond the initial stack frame.

The answer was, yes, this is allowed, and the standard compilers provide this as an option. But he wasn't interested in discussing the idea further. Recursion is not suitable for real-time systems, and that's that.

It's hard to imagine students developing a deep appreciation for recursion when their teachers believe that recursion is inappropriate independent of any evidence otherwise. Recursion has strengths and weaknesses, but the only strengths most students seem to learn about are its beauty and its elegance. Those are code words in many students' minds for "impractical" and, when combined with a teacher's general attitude toward the technique, surely limit our students' ability to get recursion.

I'm not claiming that it's easy for students to learn recursion, especially in the first year, when we tend to work with data that make it hard to see when recursion really helps. But it's certainly possible to help students move from naive recursive solutions to uses of an accumulator variable that enable tail-recursive implementations. Whether that is a worthwhile endeavor in the first year, given everything else we want to accomplish there, is the real question. It is also the question that underlay the SIGCSE thread. But we need to make sure that our students understand recursion and know how to use it effectively in code before they graduate. It's too powerful a tool to be missing from their skill set when they enter the workforce.
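Here is a small illustration of that move, written in Python only for the sake of a common notation. (CPython itself does not eliminate tail calls; the payoff comes in languages and compilers that do.) The first version piles up deferred additions on the stack; the second carries the partial result along in an accumulator, so the recursive call is the last thing the function does.

    def total(ns):
        # Naive recursion: the addition happens after the recursive call
        # returns, so every element leaves a frame of pending work on the stack.
        if not ns:
            return 0
        return ns[0] + total(ns[1:])

    def total_acc(ns, acc=0):
        # Accumulator version: the partial sum rides along in acc, and the
        # recursive call is in tail position, so a compiler that eliminates
        # tail calls can run it in constant stack space.
        if not ns:
            return acc
        return total_acc(ns[1:], acc + ns[0])

    print(total([1, 2, 3, 4, 5]), total_acc([1, 2, 3, 4, 5]))    # 15 15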

As I suggested at the opening of this entry, though, I left the discussion not very hopeful. The general attitude of many instructors may well get in the way of achieving that goal. When confronted with evidence that one of their beliefs is a misconception, too many of them shrugged their shoulders or actively disputed the evidence. The facts interfered with what they already know to be true!

There is hope, though. One of Ursula Wolz's messages was my favorite part of the conversation. She described a study she conducted in grad school teaching recursion to middle-schoolers using simple turtle graphics. From the results of that study and her anecdotal experiences teaching recursion to undergrads, she concluded:

Recursion is not HARD. Recursion is accessible when good models of abstraction are present, students are engaged and the teacher has a broad rather than narrow agenda.

Two important ideas stand out from this quote for me. First, students need to have access to good models of abstraction. I think this can be aided by using problems that are rich enough to support abstractions our students can comprehend. Second, the teacher must have a broad agenda, not a narrow one. To me, this agenda includes not only the educational goals for the lesson but also the general message that we want to send our students. Even young learners are pretty good at sensing what we think about the material we are teaching. If we convey to them that recursion is beautiful, elegant, hard, and not useful, then that's what they will learn.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 26, 2010 3:28 PM

Digital Cameras, Electric Guitars, and Programming

Eric Clapton at the Crossroads Festival 2010

I often write here about programming for everyone, or at least for certain users. Won't that put professional programmers out of business? Here is a great answer to a related question from a screenwriter using an analogy to music:

Well, if I gave you an electric guitar, would you instantly become Eric Clapton?

There is always room for specialists and artists, people who take literacy to a higher level. We all can write a little bit, and pretty soon everyone will be blogging, tweeting, or Facebooking, but we still need Shakespeare, James Joyce, and Kurt Vonnegut. There is more to writing than letters and sentences. There is more to programming than tokens and procedures. A person with ideas can create things we want to read and use.

Sometimes the idea is as simple as hooking up two existing ideas. I may be late to the party, but @bc_l is simply too cool:

I'm GNU bc on twitter! DM me your math and I'll tell you the answer. (by @hmason)

@hmason is awesome.

On a more practical note, I use dc as the target language for a simple demo compiler in my compilers course, following the lead of Fischer, Cytron, and LeBlanc in Crafting a Compiler. I'm considering using the new edition of this text in my course this fall, in part because of its support for virtual machines as targets and especially the JVM. I like where my course has been the last couple of offerings, but this seems like an overdue change in my course's content. I may as well start moving the course. Eventually, targeting multi-core architectures will be essential.
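For a taste of why dc makes such a friendly first target: it is a reverse-Polish calculator, so "compiling" a little arithmetic expression for it amounts to walking the tree and emitting postfix. A toy version, with a tuple representation of expressions invented just for this example:

    # An expression is either a number or a tuple (op, left, right);
    # for example, ("*", ("+", 2, 3), 4) stands for (2 + 3) * 4.
    def compile_to_dc(expr):
        if isinstance(expr, (int, float)):
            return [str(expr)]
        op, left, right = expr
        return compile_to_dc(left) + compile_to_dc(right) + [op]

    program = " ".join(compile_to_dc(("*", ("+", 2, 3), 4))) + " p"
    print(program)    # "2 3 + 4 * p" -- piped to dc, it prints 20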

If I want to help students who dream of being Eric Clapton with a keyboard, I gotta keep moving.

~~~~

(The image above is courtesy of PedalFreak at flickr, with an Attribution-NoDerivs 2.0 Generic license.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 22, 2010 4:19 PM

Days of Future Passed: Standards of Comparison and The Value of Limits

This morning, a buddy of mine said something like this as part of a group e-mail discussion:

For me, the Moody Blues are a perfect artist for iTunes. Obviously, "Days of Future Passed" is a classic, but everything else is pretty much in the "one good song on an album" category.

This struck me as a reflection of an interesting way in which iTunes has re-defined how we think about music. There are a lot of great albums that everyone should own, but even for fans most artists produce only a song or two worth keeping over the long haul. iTunes makes it possible to cherry-pick individual songs in a way that relegates albums to an afterthought. A singer or band has achieved something notable if people want to buy the whole album.

That's not the only standard measure I encountered in that discussion.

After the same guy said, "The perfect Moody Blues disc collection is a 2-CD collection with the entirety of 'Days of Future Passed' and whatever else you can fit", another buddy agreed and went further (again paraphrased):

"Days of Future Passed" is just over half an 80-minute CD, and then I grabbed another 8 or 9 songs. That worked out right for me.

Even though CDs are semi-obsolete in this context, they still serve a purpose, as a sort of threshold for asking the question "How much music from this band do I really want to rip?"

When I was growing up, the standard was the 90-minute cassette tape. Whenever I created a collection for a band from a set of albums I did not want to own, I faced two limits: forty-five minutes on a side, and ninety minutes total. Those constraints caused me many moments of uncertainty as I tried to cull my list of songs into two lists that fit. Those moments were fun, though, too, because I spent a lot of time thinking about the songs on the bubble, listening and re-listening until I could make a comfortable choice. Some kids love that kind of thing.

Then, for a couple of decades the standard was the compact disc. CDs offered better quality with no halfway cut, but only about eighty minutes of space. I had to make choices.

When digital music leapt from the CD to the hard drive, something strange happened. Suddenly we were talking about gigabytes. And small numbers of gigabytes didn't last long. From 4- and 8-gigabyte devices we quickly jumped to iPods with a standard capacity of 160GB. That's several tens of thousands of songs! People might fill their iPods with movies, but most people won't ever need to fill them with the music they listen to on any regular basis. If they do, they always have the hard drive on the computer they sync the iPod with. Can you say "one terabyte", boys and girls?

The computer drives we use for music got so large so fast that they are no longer useful as the arbitrary limit on our collections. In the long run, that may well be a good thing, but as someone who has lived on both sides of the chasm, I feel a little sadness. The arbitrary limits imposed by LPs, cassettes, and CDs caused us to be selective and sometimes even creative. This is the same thing we hear from programmers who had to write code for machines with 128K of memory and 8 MHz processors. Constraints are a source of creativity and freedom.

It's funny how the move to digital music has created one new standard of comparison via the iTunes store and destroyed another via effectively infinite hard drives. We never know quite how we and our world will change in response to the things we build. That's part of the fun, I think.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

July 21, 2010 4:17 PM

Two Classic Programs Available for Study

I just learned that the Computer History Museum has worked with Apple Computer to make source code for MacPaint and QuickDraw available to the public. Both were written by Bill Atkinson for the original Mac, drawing on his earlier work for the Lisa. MacPaint was the iconic drawing program of the 1980s. The utility and quality of this one program played a big role in the success of the Mac. Andy Hertzfeld, another Apple pioneer, credited QuickDraw for the success of the Mac because of its speed at producing the novel interface that defined the machine to the public. These programs were engineering accomplishments of a different time:

MacPaint was finished in October 1983. It coexisted in only 128K of memory with QuickDraw and portions of the operating system, and ran on an 8 Mhz processor that didn't have floating-point operations. Even with those meager resources, MacPaint provided a level of performance and function that established a new standard for personal computers.

Though I came to Macs in 1988 or so, I was never much of a MacPaint user, but I was aware of the program through friends who showed me works they created using it. Now we can look under the hood to see how the program did what it did. Atkinson implemented MacPaint in one 5,822-line Pascal program and four assembly language files for the Motorola 68000 totaling 3,583 lines. QuickDraw consists of 17,101 lines of Motorola 68000 assembly in thirty-seven modules.

I speak Pascal fluently and am eager to dig into the main MacPaint program. What patterns will I recognize? What design features will surprise me, and teach me something new? Atkinson is a master programmer, and I'm sure to learn plenty from him. He was working in an environment that constrained his code's size so tightly that he had to do things differently from the way I have ever thought about programming.

This passage from the Computer History Museum piece shares a humorous story that highlights how Atkinson spent much of his time tightening up his code:

When the Lisa team was pushing to finalize their software in 1982, project managers started requiring programmers to submit weekly forms reporting on the number of lines of code they had written. Bill Atkinson thought that was silly. For the week in which he had rewritten QuickDraw's region calculation routines to be six times faster and 2000 lines shorter, he put "-2000" on the form. After a few more weeks the managers stopped asking him to fill out the form, and he gladly complied.

This reminded me of one of my early blog entries about refactoring. Code removed is code earned!

I don't know assembly language nearly as well as I know Pascal, let alone Motorola 68000 assembly, but I am intrigued by the idea of being able to study more than 20,000 lines of assembly language that work together on a common task and also expose a library API for other graphics programs. Sounds like great material for a student research project, or five...

I am a programmer, and I love to study code. Some people ask why anyone would want to read listings of any program, let alone a dated graphics program from more than twenty-five years ago. If you use software but don't write it, then you probably have no reason to look under this hood. But keep in mind that I study how computation works and how it solves problems in a given context, especially when it has limited access to time, space, or both.

But... People write programs. Don't we already know how they work? Isn't that what we teach CS students, at least ones in practical undergrad departments? Well, yes and no. Scientists from other disciplines often ask this question, not as a question but as an implication that CS is not science. I have written on this topic before, including this entry about computation in nature. But studying even human-made computation is a valuable activity. Building large systems and building tightly resource-constrained programs are still black arts.

Many programmers could write a program with the functionality of MacPaint these days, but only a few could write a program that offers such functionality under similar resource limitations. That's true even today, more than two decades after Atkinson and others of his era wrote programs like this one. Knowledge and expertise matter, and much of both is hidden away in code that most of us never get to see. Many of the techniques used by masters are documented poorly or not at all. One of the goals of the software patterns community is to document techniques and the design knowledge needed to use them effectively. And one of the great services of the free and open-source software communities is to make programs and their source code accessible to everyone, so that great ideas are available to anyone willing to work to find them -- by reading code.

Historically, engineering has almost always run ahead of science. Software scientists study source code in order to understand how and why a program works, in a qualitatively different way than is possible by studying a program from the outside. By doing so, we learn about both engineering (how to make software) and science (the abstractions that explain how software works). Whether CS is a "natural" science or not, it is science, and source code embodies what it studies.

For me, encountering the release of source code for programs such as MacPaint feels something like a biologist discovering a new species. It is exciting, and an invitation to do new work.

Update: This is worth an update: a portrait of Bill Atkinson created in MacPaint. Well done.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

July 19, 2010 4:48 PM

"I Blame Computers."

I spent the weekend in southwestern Ohio at Hueston Woods State Park lodge with a bunch of friends from my undergrad days. This group is the union of two intersecting groups of friends. I'm a member of only one but was good friends with the two main folks in the intersection. After more than twenty years, with contact only every few years, we remain bonded by experiences we shared -- and created -- all those years ago.

The drives to and from the gathering were more eventful than usual. I was stopped by a train at the same railroad crossing going both directions. On the way there, a semi driver intentionally ran me off the road while I was passing him on the right. I don't usually do that, but he had been driving in the left lane for quite a while, and none too fast. Perhaps I upset him, but I'm not sure how. Then, on the way back, I drove through one of the worst rainstorms I've encountered in a long while. It was scarier than most because it hit while I was on a five-lane interstate full of traffic in Indianapolis. The drivers of my hometown impressed me by slowing down, using their hazard lights, and cooperating. That was a nice counterpoint to my experience two days earlier.

Long ago, my mom gave me the New Testament of the Bible on cassette tape. (I said it was long ago!) When we moved to a new house last year, I came across the set again and have had it in a pile of stuff to handle ever since. I was in an unusual mood last week while packing for the trip and threw the set in the car. On the way to Ohio, I listened to the Gospel of Matthew. I don't think I have ever heard or read an entire gospel in one sitting before. After hearing Matthew, I could only think, "This is a hard teaching." (That is a line from another gospel, John's, whose words and imagery have always intrigued me more than the other gospels'.)

When I arrived on Friday, I found that the lodge did offer internet service to the rooms, but at an additional cost. That made it easier for me to do what I intended, which was to spend the weekend off-line and mostly away from the keyboard. I enjoyed the break. I filled my time with two runs (more on them soon) and long talks with friends and their families.

Ironically, conversation late on Saturday night turned to computers. The two guys I was talking with are lawyers, one for the Air Force at Wright Patterson Air Force Base and one for a U.S. district court in northern Indiana. Both lamented the increasing pace of work expected by their clients. "I blame computers," said one of the guys.

In the old days, documents were prepared, duplicated, and mailed by hand. The result was slow turnaround times, so people came to expect slow turnaround. Computers in the home and office, the Internet, and digital databases have made it possible to prepare and communicate documents almost instantly. This has contributed to two problems they see in their professional work. First, the ease of copy-and-paste has made it even easier to create documents that are bloated or off-point. This can be used to mislead, but in their experience the more pernicious problem is lack of thoughtfulness and understanding.

Second, the increased speed of communication has led to a change in peoples' expectations about response. "I e-mailed you the brief this morning. Have you resolved the issue this morning?" There is increasing pressure to speed up the work cycle and respond faster. Fortunately, both report that these pressures come only from outside. Neither the military brass nor the circuit court judges push them or their staff to work faster, and in fact encourage them to work with prudence and care. But the pressure on their own staff from their clients grows.

Many people lash out and blame "computers" for whatever ills of society trouble them. These guys are bright, well-read, and thoughtful, and I found their concerns about our legal system to be well thought out. They are deeply concerned by what the changes mean for the cost and equitability of the justice the system can deliver. The problem, of course, is not with the computers themselves but with how we use them, and perhaps with how they change us. For me as a computer scientist, that conversation was a reminder that writing a program does not always solve our problems, and sometimes it creates new ones. The social and cultural environments in which programs operate are harder to understand and control than our programs. People problems can be much harder to solve than technical problems. Often, when we solve technical problems, we need to be prepared for unexpected effects on how people work and think.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

July 04, 2010 3:49 PM

Sharing Skills and Joys with Teachers

For many of the departments in my college, a big part of summer is running workshops for K-12 teachers. Science education programs reside in the science departments here, and summer is the perfect time for faculty in biology, physical science, and the like to work with their K-12 colleagues and keep them up to date on both their science and the teaching of it.

Not so in Computer Science; we don't have a CS education program. My state does not certify CS teachers, so there is no demand from teachers for graduate credits to keep licenses up to date. Few schools teach any CS, or even computer programming, at all these days, so there aren't even many teachers interested in picking up CS content or methods for their own courses. My department did offer a teaching minor for undergrads and a master's degree in CS education for many years in the 1980s and 1990s, but the audience slowly dried up and we dropped the programs.

Still, several CS faculty and I have long talked about how valuable it might be for recruiting to share the thrill of computing with high school teachers. With a little background, they might be more motivated -- and better equipped -- to share that excitement with their students.

[Image: Google CS4HS logo]

That's what makes this summer so exciting. With support from Google's CS4HS program, we are offering a workshop for middle and high school teachers, Introducing Computing via Scratch and Simulation. Because there are not many CS-only teachers in our potential audience, we pitched the workshop to science and math teachers, with a promise that it would help them learn how to use computing to demonstrate concepts in their disciplines and to build simple simulations for their students. We have also had some positive experiences working with middle-school social science students at the local lab school, so we included social science and humanities teachers in our call for participants. (Scratch is a great tool for storytelling!)

Our workshop will reflect a basic tenet my colleagues and I hold: the best way to excite people about computing is to show them its power. Our main focus is on how they can use CS to teach their science, math, and other courses better. But we will also begin to hint at how they can use simple scripts to make their lives better, whether to prepare data for classroom examples or to handle administrative tasks. Scratch will be the main medium we teach them, but we will also point them toward something like Python. Glitz gets people's attention, but glitz fades quickly when the demands of daily life return to the main stage. The power of computing may help keep their attention.

Google is expanding this program, which they piloted last year at a couple of top schools. This year, several elite schools are offering workshops, but so are several schools like mine, schools working in the trenches of both CS and teacher preparation. As the old teachers' college in my state, we prepare a large percentage of the K-12 teachers here.

The grant from Google, awarded through a competitive proposal process, helped us to take the big step of developing a workshop and trying to sell it to the teachers of our area. We were not sure how many teachers would be willing to spend four days (three this summer and a follow-up day in the fall) studying computer science. The Google award allowed us to offer small stipends to lower the barrier to attendance. We also kept expenses low by donating instructor time, which allowed us to offer the workshop at the lowest cost possible to teachers. The result was promising: more teachers signed up than we have stipends for.

Next comes the fun part: preparing and teaching the workshop!


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 08, 2010 4:46 PM

No, the RICO Act Does Not Apply

The mafiosi over at PLT have done it. They have renamed their suite of languages, editors, tools, and teaching materials from the collective "PLT Scheme" to Racket. As they explain on their web site, the Scheme part of the old name had grown increasingly less useful for identifying their work and the languages they had created. As a result, they had to continually distinguish their work from the work of the mainstream Scheme community, which was similarly disadvantaged by having the PLT community's work consume much of the mindshare around the name Scheme. A name change solves both problems... with the new cost of having to re-brand PLT Scheme, Dr. Scheme, Mr. Ed, mzscheme, and the rest of their suite.

This is the first time I have ever seen a programming language community try to rebrand its language, so this will be fun. I suspect the PLT folks will be just fine. They are both great researchers and tool builders, and they are remarkably energetic in disseminating their ideas. I don't think there is any danger they will lose many of their current devotees, if any. The real question is whether the new name will make it easier or harder to bring new users on board. Because "Racket is a Scheme" and Racket is not called Scheme, I figure the name change is at worst a wash and at best a perceptive move. Over the next couple of years, we'll see which is true.

If nothing else, this is likely to force me out of my holding pattern with PLT Scheme 4.2.0 and into the modern era of PLT, er, Racket 5.0. The #lang construct is such a nice solution to several different issues in creating languages and building large systems. Now I'll have to try it out for real! I'll also have to convert all of the code for my programming languages course, which will give me a little more incentive to make some larger changes in what I teach in the course and how I teach it. That's good news, too -- an extra shot of energy.
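For readers who have not yet played with it, here is a minimal sketch of what the #lang line looks like in practice. The module below is a toy of my own, not anything from the PLT distribution:

    #lang racket
    ;; The #lang line names the language in which the rest of this
    ;; file is written. Swap it for another -- a teaching language,
    ;; a typed dialect, or a language you define yourself -- and the
    ;; same file is read and run under different rules.

    (provide square)          ; export one binding from this module

    (define (square n)
      (* n n))

That one line of syntax carries a surprising amount of the weight of defining, packaging, and mixing languages.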


Posted by Eugene Wallingford | Permalink | Categories: Computing

May 25, 2010 4:16 PM

A Book on Alan Kay / Alan Kay on Science and Engineering

Everyone already seems to know about this, but I didn't until yesterday, when I saw a note from Hal Perkins...

In honor of Alan Kay's 70th birthday, a bunch of his colleagues and best friends have published Points of View, "a collection of previously-unpublished essays". The list of authors ranges from CS greats such as Ivan Sutherland, Butler Lampson, and Leonard Kleinrock, to luminaries from other human endeavors such as Betty Edwards and Quincy Jones. The list of essay titles makes me want to drop all of my work and read it from cover to cover right now. There may be a second printing of the book, but it is available in PDF from the website for free.

Very cool.

I've been meaning to post a short snippet from Kay for the last few weeks. He wrote a guest entry on Mark Guzdial's blog about the danger of trying to make learning 'too simple', a common theme in how Kay talks about understanding and building systems. The passage that really attracted me was on something else, though, in a comment responding to other commenters' comments. Considering that Kay's blog entry itself began as a comment intended for another entry, we have more than enough indirection to make us computer scientists happy for weeks!

In his comment, Kay describes some of his team's current work at Viewpoints, in which they extract from programs of millions of lines of code an essential core that is executable, readable, and yet several orders of magnitude smaller than the original. He uses this description as a vehicle for explaining science and engineering:

In this direction -- where we have phenomena already, and our task is to make a much more compact, understandable and debuggable model -- we are doing actual science (this is what science does and is).

But we can also use the model idea to go in the other direction -- to come up with an idea, make a runnable debuggable model of it -- and then use this as a guide and reference ... for manifesting the ... artifact -- here, we are doing invention and engineering.

Computer scientists are often asked to explain why CS is a science. Kay has always been very clear that science is about method, not subject matter. His explanation of science and engineering nicely captures how CS can be science, how CS can be engineering, and how CS can be both. The last of these is part of what distinguishes computer science from other disciplines, precisely because CS is about models of systems.

For me, this simple example was the big win from Kay's guest post on Mark's blog. It's also a fitting example of how Kay thinks about the world and why accomplished people would take the time to write essays to celebrate his birthday.


Posted by Eugene Wallingford | Permalink | Categories: Computing

April 22, 2010 8:36 PM

At Some Point, You Gotta Know Stuff

A couple of days ago, someone tweeted a link to Are you one of the 10% of programmers who can write a binary search?, which revisits a passage by Jon Bentley from twenty-five years ago. Bentley observed back then that 90% of professional programmers were unable to produce a correct version of binary search, even with a couple of hours to work. I'm guessing that most people who read Bentley's article put themselves in the elite 10%.

Mike Taylor, the blogger behind The Reinvigorated Programmer, challenged his readers. Write your best version of binary search and report the results: is it correct or not? One of his conditions was that you were not allowed to run tests and fix your code. You had to make it run correctly the first time.

Writing a binary search is a great little exercise, one I solve every time I teach a data structures course and occasionally in courses like CS1, algorithms, and any programming language- or style-specific course. So I picked up the gauntlet.

You can see my solution in a comment on the entry, along with a sheepish admission: I inadvertently cheated, because I didn't read the rules ahead of time! (My students are surely snickering.) I wrote my procedure in five minutes. The first test case I ran pointed out a bug in my stop condition, (>= lower upper). I thought for a minute or so, changed the condition to (= lower (- upper 1)), and the function passed all my tests.

In a sense, I cheated the intent of Bentley's original challenge in another way. One of the errors he found in many professional developers' solutions was an overflow when computing the midpoint of the array's range. The solution that popped into my mind immediately, (lower + upper)/2, fails when lower + upper exceeds the size of the variable used to store the intermediate sum. I wrote my solution in Scheme, which handles bignums transparently. My algorithm would fail in any language that doesn't. And to be honest, I did not even consider the overflow issue; having last read Bentley's article many years ago, I had forgotten about that problem altogether! This is yet another good reason to re-read Bentley occasionally -- and to use languages that do the heavy lifting for you.
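For concreteness, here is a sketch in Scheme of the kind of solution I describe above -- not my exact submission, just one way to arrange the pieces. It uses exclusive bounds, so the stop test is the (= lower (- upper 1)) condition mentioned earlier, and it computes the midpoint in the overflow-safe form for the benefit of readers working in languages without bignums:

    ;; Search a sorted vector for target. lower and upper are
    ;; exclusive bounds on the region that might still hold target.
    (define (binary-search vec target)
      (let loop ((lower -1)
                 (upper (vector-length vec)))
        (if (= lower (- upper 1))
            #f                                   ; empty region: not found
            ;; lower + (upper - lower)/2 avoids the overflow Bentley
            ;; warned about; in Scheme, either form is safe.
            (let* ((mid (+ lower (quotient (- upper lower) 2)))
                   (x   (vector-ref vec mid)))
              (cond ((= x target) mid)
                    ((< x target) (loop mid upper))
                    (else         (loop lower mid)))))))

    ;; (binary-search (vector 2 3 5 7 11) 7)   => 3
    ;; (binary-search (vector 2 3 5 7 11) 6)   => #f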

But.

One early commenter on Taylor's article said that the no-tests rule took away some of his best tools and his usual way of working. Even if he could go back to basics, working in an unfamiliar mode probably made him less comfortable and less likely to produce a good solution. He concluded that, for this reason, a challenge with a no-tests rule is not a good test of whether someone is a good programmer.

As a programmer who prefers an agile style, I felt the same way. Running that first test, chosen to encounter a specific possibility, did exactly what I had designed it to do: expose a flaw in my code. It focused my attention on a problem area and caused me to re-examine not only the stopping condition but also the code that changed the values of lower and upper. After that test, I had better code and more confidence that my code was correct. I ran more tests designed to examine all of the cases I knew of at the time.

As someone who prides himself in his programming-fu, though, I appreciated the challenge of trying to design a perfect piece of code in one go: pass or fail.

This is a conundrum to me. It is similar to a comment that my students often make about the unrealistic conditions of coding on an exam. For most exams, students are away from their keyboards, their IDEs, their testing tools. Those are big losses to them, not only in the coding support they provide but also in the psychological support they provide.

The instructor usually sees things differently. Under such conditions, students are also away from Google and from the buddies who may or may not be writing most of their code in the lab. To the instructor, this nakedness is a gain. "Show me what you can do."

Collaboration, scrapheap programming, and search engines are all wonderful things for software developers and other creators. But at some point, you gotta know stuff. You want to know stuff. Otherwise you are doomed to copy and paste, to having to look up the interface to basic functions, and to being able to solve only those problems Google has cached the answers to. (The size of that set is growing at an alarming rate.)

So, I am of two minds. I agree with the commenter who expressed concern about the challenge rules. (He posted good code, if I recall correctly.) I also think that it's useful to challenge ourselves regularly to solve problems with nothing but our own two hands and the cleverness we have developed through practice. Resourcefulness is an important trait for a programmer to possess, but so are cleverness and meticulousness.

Oh, and this was the favorite among the ones I read:

I fail. ... I bring shame to professional programmers everywhere.

Fear not, fellow traveler. However well we delude ourselves about living in a Garrison Keillor world, we are all in the same boat.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

April 20, 2010 9:58 PM

Computer Code in the Legal Code

You may have heard about a recent SEC proposal that would require issuers of asset-backed securities to submit "a computer program that gives effect to the flow of funds". What a wonderful idea!

I have written a lot in this blog about programming as a new medium, a way to express processes and the entities that participate in them. The domain of banking and finance is a natural place for us to see programming enter into the vernacular as the medium for describing problems and solutions more precisely. Back in the 1990s, Smalltalk had a brief moment in the sunshine as the language of choice used by financial houses in the creation of powerful, short-lived models of market behavior. Using a program to describe models gave the investors and arbitrageurs not only a more precise description of the model but also a live description, one they could execute against live data, test and tinker with, and use as an active guide for decision-making.

We all know about the role played by computer models in the banking crisis over the last few years, but that is an indictment of how the programs were used and interpreted. The use of programs itself was and is the right way to try to understand a complex system of interacting, independent agents manipulating complex instruments. (Perhaps we should re-consider whether to traffic in instruments so complex that they cannot be understood without executing a complex program. But that is a conversation for another blog entry, or maybe a different blog altogether!)

What is the alternative to using a program to describe the flow of funds engendered by a particular asset-backed security? We could describe these processes using text in a natural language such as English. Natural language is supremely expressive but fraught with ambiguity and imprecision. Text descriptions rely on the human reader to do most of the work figuring out what they mean. They are also prone to gratuitous complexity, which can be used to mislead unwary readers.

We could also describe these processes using diagrams, such as a flow chart. Such diagrams can be much more precise than text, but they still rely on the reader to "execute" them as she reads. The more complex the diagram becomes, the more difficult it is for the reader to interpret it correctly.

A program has the virtue of being both precise and executable. The syntax and semantics of a programming language are (or at least can be) well-defined, so that a canonical interpreter can execute any program written in the language and determine its actual value. This makes describing something like the flow of funds created by a particular asset-backed security as precise and accurate as possible. A program can be gratuitously complex, which is a danger. Yet programmers have at their disposal tools for removing gratuitous complexity and focusing on the essence of a program, more so than we have for manipulating text.

The behavior of the model can still be complex and uncertain, because it depends on the complexity and uncertainty of the environment in which it operates. Our financial markets and the economic world in which asset-backed securities live are enormously complex! But at least we have a precise description of the process being proposed.

As one commentator writes:

When provisions become complex beyond a point, computer code is actually the simplest way to describe them... The SEC does not say so, but it would be useful to add that if there is a conflict between the software and textual description, the software should prevail.

Using a computer program in this way is spot on.

After taking this step, there are still a couple of important issues to decide. One is: What programming language should we use? A lot of CS people are talking about the proposal's choice of Python as the required language. I have grown to like Python quite a bit for its relative clarity and simplicity, but I am not prepared to say that it is the right choice for programs that are in effect "legal code". I'll let people who understand programming language semantics better than I do make technical recommendations on the choice of language. My guess is that a language with a simpler, more precisely defined semantics would work better for this purpose. I am, of course, partial to Scheme, but a number of functional languages would likely do quite nicely.
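Just to make the idea concrete, here is a toy sketch in Scheme -- entirely my own invention, not anything from the SEC proposal -- of a sequential "waterfall" that pays a list of tranches in priority order out of the funds collected in a period:

    ;; Each tranche is a (name . amount-due) pair. The result is a
    ;; list of (name . amount-paid) pairs, paid in priority order
    ;; until the funds run out.
    (define (waterfall funds tranches)
      (if (null? tranches)
          '()
          (let* ((tranche (car tranches))
                 (due     (cdr tranche))
                 (paid    (min funds due)))
            (cons (cons (car tranche) paid)
                  (waterfall (- funds paid) (cdr tranches))))))

    ;; Example: $1,000,000 collected; senior tranche owed $800,000,
    ;; junior tranche owed $500,000.
    ;; (waterfall 1000000 '((senior . 800000) (junior . 500000)))
    ;;   => ((senior . 800000) (junior . 200000))

A real deal's flow of funds is vastly more involved, of course, but even this little function is unambiguous in a way that a paragraph of prose is not.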

Fortunately, the SEC proposal invites comments, so academic and industry computer scientists have an opportunity to argue for a better language. (Computer programmers seem to like nothing more than a good argument about language, even writing programs in their own favorite!)

The most interesting point here, though, is not the particular language suggested but that the proposers suggest any programming language at all. They recognize how much more effectively a computer program can describe a process than text or diagrams can. This is a triumph in itself.

Other people are reminding us that the mortgage-backed CDOs at the root of the recent financial meltdown were valued by computer simulations, too. This is where the proposal's suggestion that the code be implemented as open-source software shines. By making the source code openly available, everyone has the opportunity and ability to understand what the models do, to question assumptions, and even to call the authors to account for the code's correctness or complexity. The open-source model has worked well in the growth of so much of our current software infrastructure, including the simple-in-concept but complex-in-scale Internet. Having the code for financial models be open brings to bear a social mechanism for overseeing the program's use and evolution that is essential in a market that should be free and transparent.

This is also part of the argument for a certain set of languages as candidates for the code. If the language standard and implementations of interpreters are open and subject to the same communal forces as the software, this will lend further credibility to the processes and models.

I spend a lot of time talking about code in this blog. This is perhaps the first time I have talked about legal code -- and even still I get to talk about good old computer code. It's good to see programs recognized for what they are and can be.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

April 06, 2010 7:36 PM

Shame at Causing Loved Ones Harm

The universe tends toward maximum irony.
Don't push it.
-- JWZ

I have had Jamie Zawinski's public service announcement on backups sitting on my desk since last fall. I usually keep my laptop and my office machine pretty well in sync, so I pretty much always have a live back-up. But some files live outside the usual safety zone, such as the temporary audio files on my desktop, which also contains one or two folders of stuff. I knew I needed to be more systematic and complete in safeguarding myself from disk failure, so I printed Zawinski's warning and resolved to Do the Right Thing.

Last week, I read John Gruber's ode to backups and disk recovery. This article offers a different prescription but the same message. You must be 100% backed up, including even the files that you are editing now in the minutes or hours before the next backup. Drives fail. Be prepared.

Once again, I was energized to Do the Right Thing. I got out a couple of external drives that I had picked up at a good price recently. The plan was to implement a stable, complete backup process this coming weekend.

The universe tends toward maximum irony. Don't push it.

If the universe were punishing me for insufficient respect for its power, you would think that the hard drive in either my laptop or my office machine would have failed. But both chug along just fine. Indeed, I still have never had a hard drive fail in any of my personal or work computers.

It turns out that the universe's sense of irony is much bigger than my machines.

On Sunday evening, the hard drive in our family iMac failed. I rarely use this machine and store nothing of consequence there. Yet this is a much bigger deal.

My wife lost a cache of e-mail, an address book, and a few files. She isn't a big techie, so she didn't have a lot to lose there. We can reassemble the contact information at little cost, and she'll probably use this as a chance to make a clean break from Eudora and POP mail and move to IMAP and mail in the cloud. In the end, it might be a net wash.

My teenaged daughters are a different story. They are from a new generation and live a digital life. They have written a large number of papers, stories, and poems, all of which were on this machine. They have done numerous projects for schools and extracurricular activities. They have created artwork using various digital tools. They have taken photos. All on this machine, and now all gone.

I cannot describe how I felt when I first realized what had happened, or how I feel now, two days later. I am the lead techie in our house, the computer science professor who knows better and preaches better, the husband and father who should be taking care of what matters to his family. This is my fault. Not that the hard drive failed, because drives fail. It is my fault that we don't have a reliable, complete backup of all the wonderful work my daughters have created.

Fortunately, not all is lost. At various times, we have copied files to sundry external drives and servers for a variety of reasons. I sometimes copy poetry and stories and papers that I especially like onto my own machines, for easy access. The result is a scattering of files here and there, across a half dozen machines. I will spend the next few days reassembling what we have as best I can. But it will not be all, and it will not be enough.

The universe maximized its irony this time around by getting me twice. First, I was gonna do it, but didn't.

That was just the head fake. I was not thinking much at all about our home machine. That is where the irony came squarely to rest.

Shut up. I know things. You will listen to me. Do it anyway.

Trust Zawinski, Gruber, and every other sane computer user. Trust me.

Do it. Run; don't walk. Whether your plan uses custom tools or a lowly cron job running rsync, do it now. Whether or not you go as far as using a service such as Dropbox to maintain working files, set up an automatic, complete, and bootable backup of your hard drives.

I know I can't be alone. There must be others like me out there. Maybe you used to maintain automatic and complete system backups and for whatever reason fell out of the habit. Maybe you have never done it but know it's the right thing to do. Maybe, for whatever reason, you have never thought about a hard drive failing. You've been lucky so far and don't even know that your luck might change at any moment.

Do it now, before dinner, before breakfast. Do it before someone you love loses valuable possessions they care deeply about.

I will say this: my daughters have been unbelievable through all this. Based on what happened Sunday night, I certainly don't deserve their trust or their faith. Now it's time to give them what they deserve.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

March 31, 2010 3:22 PM

"Does Not Play Well With Others"

Today I ran across a recent article by Brian Hayes on his home-baked graphics. Readers compliment him all the time on the great graphics in his articles. How does he do it? they ask. The real answer is that he cares what they look like and puts a lot of time into them. But they want to know what tools he uses. The answer to that question is simple: He writes code!

His graphics code of choice is PostScript. But, while PostScript is a full-featured postfix programming language, it isn't the sort of language that many people want to write general-purpose code in. So Hayes took the next natural step for a programmer and built his own language processor:

... I therefore adopted the modus operandi of writing a program in my language of choice (usually some flavor of Lisp) and having that program write a PostScript program as its output. After doing this on an ad hoc basis a few times, it became clear that I should abstract out all the graphics-generating routines into a separate module. The result was a program I named lips (for Lisp-to-PostScript).

Most of what lips does is trivial syntactic translation, converting the parenthesized prefix notation of Lisp to the bracketless postfix of PostScript. Thus when I write (lineto x y) in Lisp, it comes out x y lineto in PostScript. The lips routines also take care of chores such as opening and closing files and writing the header and trailer lines required of a well-formed PostScript program.

Programmers write code to solve problems. More often than many people, including CS students, realize, programmers write a language processor or even create a little language of their own to make solving the problem more convenient. We have been covering the idea of syntactic abstractions in my programming languages course for the last few weeks, and Hayes offers us a wonderful example.
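To give a flavor of the kind of translation Hayes describes -- this is my own guess at the idea, not his lips code -- a few lines of Scheme suffice to turn prefix expressions into PostScript's postfix form:

    ;; Walk a Lisp expression and emit its operator after its
    ;; arguments, PostScript-style.
    (define (emit-postscript expr)
      (if (pair? expr)
          (string-append
            (apply string-append
                   (map (lambda (arg)
                          (string-append (emit-postscript arg) " "))
                        (cdr expr)))
            (symbol->string (car expr)))
          (if (number? expr)
              (number->string expr)
              (symbol->string expr))))

    ;; (emit-postscript '(lineto x y))          => "x y lineto"
    ;; (emit-postscript '(moveto 72 (add y 10))) => "72 y 10 add moveto"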

Hayes describes his process and programs in some detail, both lips and his homegrown plotting program plot. Still, he acknowledges that the world has changed since the 1980s. Nowadays, we have more and better graphics standards and more and better tools available to the ordinary programmer -- many for free.

All of which raises the question of why I bother to roll my own. I'll never keep up -- or even catch up -- with the efforts of major software companies or the huge community of open-source developers. In my own program, if I want something new -- treemaps? vector fields? the third dimension? -- nobody is going to code it for me. And, conversely, anything useful I might come up with will never benefit anyone but me.

Why, indeed? In my mind, it's enough simply to want to roll my own. But I also live in the real world, where time is a scarce resource and the list of things I want to do grows seemingly unchecked by any natural force. Why then? Hayes answers that question in a way that most every programmer I know will understand:

The trouble is, every time I try working with an external graphics package, I run into a terrible impedance mismatch that gives me a headache. Getting what I want out of other people's code turns out to be more work than writing my own. No doubt this reveals a character flaw: Does not play well with others.

That phrase stood me up in my seat when I read it. Does not play well with others. Yep, that's me.

Still again, Hayes recognizes that something will have to give:

In any case, the time for change is coming. My way of working is woefully out of date and out of fashion.

I don't doubt that Hayes will make a change. Programmers eventually get the itch, even with their homebrew code. As technology shifts and the world changes, so do our needs. I suspect, though, that his answer will not be to start using someone else's tools. He is going to end up modifying his existing code, or writing new programs altogether. After all, he is a programmer.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 30, 2010 8:57 PM

Matthias Felleisen Wins the Karl Karlstrom Award

Early today, Shriram Krishnamurthi announced on the PLT Scheme mailing list that Matthias Felleisen had won the Karl Karlstrom Outstanding Educator Award. The ACM presents this award annually

to an outstanding educator who is ... recognized for advancing new teaching methodologies, or effecting new curriculum development or expansion in Computer Science and Engineering; or making a significant contribution to the educational mission of the ACM.

A short blurb in the official announcement touts Felleisen "for his visionary and long-term contributions to K-12 outreach programs, innovative textbooks, and pedagogically motivated software". Krishnamurthi, in his message to the community borne out of Felleisen's leadership and hard work, said it nicely:

Everyone on this list has been touched, directly or indirectly, by Matthias Felleisen's intellectual leadership: from his books to his software to his overarching vision and singular execution, as well as his demand that the rest of us live up to his extraordinary standards.

[Photo: Matthias Felleisen]

As an occasional Scheme programmer and a teacher of programmers, I have been touched by Felleisen's work regularly over the last 20 years. I first read "The Little Lisper" long before I knew Matthias, and it changed how I approached programming with inductive data types. I assign "The Little Schemer" as the only textbook for my programming languages course, which introduces and uses functional programming. I have always felt as if I could write my own materials to teach functional programming and the languages content of the course, but "The Little Schemer" is a tour de force that I want my students to read. Of course, we also use Dr. Scheme and all of its tools for writing Scheme programs, though we barely scratch the surface of what it offers in our one-semester course.

We have never used Felleisen's book "How to Design Programs" in our introductory courses, but I consider its careful approach to teaching software design one of the most important intro CS innovations of the last twenty years. Back in the mid-1990s, when my department was making one of its frequent changes to the first-year curriculum, I called Matthias to ask his advice. Even after he learned that we were not likely to adopt his curriculum, he chatted with me for a while and offered me pedagogical advice, and even strategic advice about making a case for a curriculum based on a principle outside any given language.

That's one of the ironic things about Felleisen's contribution: He is most closely associated with Scheme and tools built in and for Scheme, but his TeachScheme! project is explicitly not about Scheme. (The "!" is even pronounced "not", a programming pun using the standard C meaning of the symbol.) TeachScheme! uses Scheme as a tool for creating languages targeted at novices who progress through levels of understanding and complexity. Just today in class, I talked with my students about Scheme's mindset of bringing to users of a language the same power available to language creators. This makes it an ideal intellectual tool for implementing Felleisen's curriculum, even as its relative lack of popularity has almost certainly hindered adoption of the curriculum more widely.
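A small example of that mindset, of my own devising: in Racket or any recent Scheme, an ordinary user can add a new control construct with the same tools the language's designers use.

    ;; A C-style while loop, defined by a user rather than a
    ;; language implementor.
    (define-syntax while
      (syntax-rules ()
        ((_ test body ...)
         (let loop ()
           (when test
             body ...
             (loop))))))

    ;; (define i 0)
    ;; (while (< i 3)
    ;;   (display i)
    ;;   (set! i (+ i 1)))    ; prints 012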

As my department has begun to reach out to engage K-12 students and teachers, I have come to appreciate just how impressive the TeachScheme! outreach effort is. This sort of engagement requires not only a zeal for the content but also sustained labor. Felleisen has sustained both his zeal and his hard work, all the while building a really impressive group of graduate students and community supporters. The grad students all seem to earn their degrees, move on as faculty to other schools, and yet remain a part of the effort.

Closer to my own work, I continue to think about the design recipe, which is the backbone of the HtDP curriculum. I remain convinced that this idea is compatible with the notion of elementary patterns, and that the design recipe can be integrated with a pattern language of novice programs harmoniously to create an even more powerful model for teaching new programmers how to design programs.
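For readers who have not seen HtDP, here is the spirit of the design recipe applied to a tiny function -- my paraphrase of the steps, not the book's exact wording:

    ;; 1. Signature:  wage : number -> number
    ;; 2. Purpose:    compute gross pay for hours worked at $12/hour
    ;; 3. Examples:   (wage 0) should be 0, (wage 20) should be 240
    ;; 4. Template:   (define (wage hours) (... hours ...))
    ;; 5. Definition:
    (define (wage hours)
      (* 12 hours))
    ;; 6. Tests: run the examples and confirm the expected answers.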

As Krishnamurthi wrote to the PLT Scheme developer and user communities, Felleisen's energy and ideas have enriched my work. I'm happy to see the ACM honor him for his efforts.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 28, 2010 6:15 PM

SIGCSE Day 3 -- Interdisciplinary Research

[A transcript of the SIGCSE 2010 conference: Table of Contents]

My SIGCSE seemed to come to an abrupt end Saturday morning. After a week of long days, I skipped a last run in Milwaukee and slept in a bit. The front desk messed up my bill, so I spent forty-five minutes wangling for my guaranteed rate. As a result, I missed the Nifty Assignments session and arrived just in time to meet with colleagues at the exhibits during the morning break. This left me time for one last session before my planned pre-luncheon departure.

I chose to attend the special session called Interdisciplinary Computing Education for the Challenges of the Future, with representatives of the National Science Foundation who have either carried out funded interdisciplinary research or who have funded and managed interdisciplinary programs. The purpose of the session was to discuss...

... the top challenges and potential solutions to the problem of educating students to develop interdisciplinary computing skills. This includes a computing perspective to interdisciplinary problems that enables us to think deeply about difficult problems and at the same time engage well under differing disciplinary perspectives.

This session contributed to the conference buzz of computer science looking outward, both for research and education. I found it quite interesting and, of course, think the problems it discussed are central to what people in CS should be thinking about these days. I only wish my mind had been more into the talk that morning.

Three ideas stayed with me as my conference closed:

  • One panelist made a great comment in the spirit of looking outward. Paraphrase: While we in CS argue about what "computational thinking" means, we should embrace the diversity of computational thinking done out in the world and reach out to work with partners in many disciplines.

  • Another panelist commented on the essential role that computing plays in other disciplines. He used biology as his example. Paraphrase: To be a biologist these days requires that you understand simulation, modeling, and how to work with large databases. Working with large databases is the defining characteristic of social science these days.

  • Many of the issues that challenge computer scientists who want to engage in interdisciplinary research of this sort are ones we have encountered for a long time. For instance, how can a computer scientist find the time to gain all of the domain knowledge she needs?

    Other challenges follow from how people on either side of the research view the computer scientist's role. Computer science faculties that make tenure and promotion decisions often do not see research value in interdisciplinary work. The folks on the applied side often contribute to this by viewing the computer scientist as a tool builder or support person, not as an integral contributor to solving the research problem. I have seen this problem firsthand while helping members of my department's faculty try to contribute to projects outside of our department.

This panel was a most excellent way to end my conference, with many thoughts about how to work with CS colleagues to develop projects that engage colleagues in other disciplines.

Pretty soon after the close of this session, I was on the road home, where I repacked my bags and headed off for a few days of spring break with my family in St. Louis. This trip was a wonderful break, though ten days too early to see my UNI Panthers end their breakout season with an amazing run.


Posted by Eugene Wallingford | Permalink | Categories: Computing

March 24, 2010 7:42 PM

SIGCSE Day 2 -- Al Aho on Teaching Compiler Construction

[A transcript of the SIGCSE 2010 conference: Table of Contents]

Early last year, I wrote a blog entry about using ideas from Al Aho's article, Teaching the Compilers Course, in the most recent offering of my course. When I saw that Aho was speaking at SIGCSE, I knew I had to go. As Rich Pattis told me in the hallway after the talk, when you get a chance to hear certain people speak, you do. Aho is one of those guys. (For me, so is Pattis.)

The talk was originally scheduled for Thursday, but persistent fog over southeast Wisconsin kept several people from arriving at the conference on time, including Aho. So the talk was rescheduled for Friday. I still had to see it, of course, so I skipped the attention-grabbing "If you ___, you might be a computational thinker".

Aho's talk covered much of the same ground as his inroads paper, which gave me the luxury of being able to listen more closely to his stories and elaborations than to the details. The talk did a nice job of putting the compiler course into its historical context and tried to explain why we might well teach a course very different from -- yet in many ways similar to -- the course we taught forty, twenty-five, or even ten years ago.

He opened with lists of the top ten programming languages in 1970 and 2010. There was no overlap, which introduced Aho's first big point: the landscape of programming languages has changed in a big way since the beginning of our discipline, and there have been corresponding changes in the landscape of compilers. The dimensions of change are many: the number of languages, the diversity of languages, and the number and kinds of applications we write. The growth in number and diversity applies not only to the programming languages we use, which are the source languages for compilers, but also to the target machines and the target languages produced by compilers.

From Aho's perspective, one of the most consequential changes in compiler construction has been the rise of massive compiler collections such as gcc and LLVM. In most environments, writing a compiler is no longer so much a matter of "writing a program" as a software engineering exercise: work with a large existing system, and add a new front end or back end.

So, what should we teach? Syntax and semantics are fairly well settled as a matter of theory. We can thus devote time to the less mathematical parts of the job, such as the art of writing grammars. Aho noted that in the 2000s, parsing natural languages is mostly a statistical process, not a grammatical one, thanks to massive databases of text and easy search. I wonder if parsing programming languages will ever move in this direction... What would that mean in terms of freer grammars, greater productivity, or confusion?

With the availability of modern tools, Aho advocates an agile "grow a language" approach. Using lex and yacc, students can quickly produce a compiler in approximately 20 lines of code. Due to the nature of syntax-directed translation, which is closely related to structural recursion, we can add new productions to a grammar with relative ease. This enables us to start small, to experiment with different ideas.
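As a rough illustration of why that growth is easy -- my sketch, not an example from Aho's talk -- a syntax-directed translator in Scheme is just structural recursion over the tree, with one clause per production; adding a production to the language means adding a clause:

    ;; A toy evaluator for an expression language. Each cond clause
    ;; corresponds to one grammar production.
    (define (evaluate expr env)
      (cond ((number? expr) expr)                      ; E -> num
            ((symbol? expr) (cdr (assq expr env)))     ; E -> id
            ((eq? (car expr) '+)                       ; E -> (+ E E)
             (+ (evaluate (cadr expr) env)
                (evaluate (caddr expr) env)))
            ((eq? (car expr) '*)                       ; E -> (* E E)
             (* (evaluate (cadr expr) env)
                (evaluate (caddr expr) env)))
            (else (error "unknown form:" expr))))

    ;; (evaluate '(+ x (* 2 3)) '((x . 4)))   => 10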

The Dragon book circa 2010 adds many new topics to its previous editions. It just keeps getting thicker! It covers much more material, in both breadth and depth, than can be covered in the typical course, even with graduate students. This gives instructors lots of leeway in selecting a subset around which to build a course. The second edition already covers too much material for my undergrad course, and without enough of the examples that many students need these days. We end up selecting such a small subset of the material that the price of the book is too high for the number of pages we actually use.

The meat of the talk matched the meat of his paper: the compiler course he teaches these days. Here are a few tidbits.

On the Design of the Course

  • Aho claims that, through all the years, every team has delivered a working system. He attributes this to experience teaching the course and the support they provide students.
  • Each semester, he brings in at least one language designer as a guest speaker, someone like Stroustrup or Gosling. I'd love to do this but don't have quite the pull, connections, or geographical advantage of Aho. I'll have to be creative, as I was the last time I taught agile software development and arranged a phone conference with Ken Auer.
  • Students in the course become experts in one language: the one they create. They become much more knowledgeable in several others: the languages they use to write, build, and test their compiler.

On System Development

  • Aho sizes each student project at 3,000-6,000 LOC. He uses Boehm's model to derive a team size of 5, which fits nicely with his belief that 5 is the ideal team size.
  • Every team member must produce at least 500 lines of code on the project. I have never had an explicit rule about this in the past, but experience in my last two courses with team projects tells me that I should.
  • Aho lets teams choose their own technology, so that they can work in the way that makes them most comfortable. One serendipitous side effect of this choice is that it requires him to stay current with what's going on in the world.
  • He also allows teams to build interpreters for complex languages, rather than full-blown compilers. He feels that the details of assembly language get in the way of other important lessons. (I have not made that leap yet.)

On Language Design

  • One technique he uses to scope the project is to require students to identify an essential core of their language along with a list of extra features that they will implement if time permits. In 15 years, he says, no team has ever delivered an extra feature. That surprises me.
  • In order to get students past the utopian dream of a perfect language, he requires each team to write two or three programs in their language to solve representative problems in the language's domain. This makes me think of test-first design -- but of the language, not the program!
  • Aho believes that students come to appreciate our current languages more after designing a language and grappling with the friction between dreams and reality. I think this lesson generalizes to most forms of design and creation.

I am still thinking about how to allow students to design their own language and still have the time and energy to produce a working system in one semester. Perhaps I could become more involved early in the design process, something Aho and his suite of teaching assistants can do, or even lead the design conversation.

On Project Management

  • "A little bit of process goes a long way" toward successful delivery and robust software. The key is finding the proper balance between too much process, which stifles developers, and too little, which paralyzes them.
  • Aho has experimented with different mechanisms for organizing teams and selecting team leaders. Over time, he has found it best to let teams self-organize. This matches my experience as well, as long as I keep an eye out for obviously bad configurations.
  • Aho devotes one lecture to project management, which I need to do again myself. Covering more content is a siren that scuttles more student learning than it buoys.

~~~~

Aho peppered his talk with several reminiscences. He told a short story about lex and how it was extended with regular expressions from egrep by Eric Schmidt, now Google's CEO. Schmidt worked for Aho as a summer intern. "He was the best intern I ever had." Another interesting tale recounted the effort of one of his doctoral students to build a compiler for a quantum computer. It was interesting, yes, but I need to learn more about quantum computing to really appreciate it!

My favorite story of the day was about awk, one of Unix's great little languages. Aho and his colleagues Weinberger and Kernighan wrote awk for their own simple data manipulation tasks. They figured they'd use it to write throwaway programs of 2-5 lines each. In that context, you can build a certain kind of language and be happy. But as Aho said, "A lot of the world is data processing." One day, a colleague came in to his office, quite perturbed at a bug he had found. This colleague had written a 10,000-line awk program to do computer-aided design. (If you have written any awk, you know just how fantabulous this feat is.) In a context where 10K-line programs are conceivable, you want a very different sort of language!

The awk team fixed the bug, but this time they "did it right". First, they built a regression test suite. (Agile Sighting 1: continuous testing.) Second, they created a new rule. To propose a new language feature for awk, you had to produce regression tests for it first. (Agile Sighting 2: test-first development.) Aho has built this lesson into his compiler course. Students must write their compiler test-first and instrument their build environments to ensure that the tests are run "all of the time". (Agile Sighting 3: continuous integration.)

An added feature of Aho's talk over his paper was three short presentations from members of a student team that produced PixelPower, a language which extends C to work with a particular graphics library. They shared some of the valuable insights from their project experience:

  • They designed their language to have big overlap with C. This way, they had an existing compiler that they understood well and could extend.
  • The team leader decided to focus the team, not try to make everyone happy. This is a huge lesson to learn as soon as you can, one the students in my last compiler course learned perhaps a bit too late. "Getting things done," Aho's students said, "is more important than getting along."
  • The team kept detailed notes of all their discussions and all their decisions. Documentation of process is in many ways much more important than documentation of code, which should be able to speak for itself. My latest team used a wiki for this purpose, which was a good idea they had early in the semester. If anything, they learned that they should have used it more frequently and more extensively.

One final note to close this long report. Aho had this to say about the success of his course:

If you make something a little better each semester, after a while it is pretty good. Through no fault of my own this course is very good now.

I think Aho's course is good precisely because he adopted this attitude about its design and implementation. This attitude serves us well when designing and implementing software, too: Many iterations. Lots of feedback. Collective ownership of the work product.

An hour well spent.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

March 22, 2010 4:35 PM

SIGCSE Day 2 -- Reimagining the First Course

[A transcript of the SIGCSE 2010 conference: Table of Contents]

The title of this panel looked awfully interesting, so I headed off to it despite not knowing just what it was about. It didn't occur to me until I saw the set of speakers at the table that it would be about an AP course! As I have written before, I don't deal much with AP, though most of my SIGCSE colleagues do. This session turned out to be a good way to spend ninety minutes, both as a window into some of the conference buzz and as a way to see what may be coming down the road in K-12 computer science in a few years.

What's up? A large and influential committee of folks from high schools, universities, and groups such as the ACM and NSF is designing a new course. It is intended as an alternative to the traditional CS1 course, not as a replacement. Rather than starting with programming or mathematics as the foundation of the course, the committee is first identifying a set of principles of computing and then designing a course to teach those principles. Panel leader Owen Astrachan said that they are engineering a course, given the national scale of the project and the complexity of creating something that works at lots of schools and for lots of students.

Later, I hope to discuss the seven big ideas and the seven essential practices of computational thinking that serve as the foundation for this course, but for now you should read them for yourself. At first blush, they seem like a reasonable effort to delineate what computing means in the world and thus what high school graduates these days should know about the technology that circumscribes their lives. They emphasize creativity, innovation, and connections across disciplines, all of which can be lost when we first teach students a programming language and whip out "Hello, World" and the Towers of Hanoi.

Universities have to be involved in the design and promotion of this new course because it is intended for advanced placement, and that means that it must earn college credit. Why does the AP angle matter? Right now, because AP CS is the only high school CS course that counts at most universities. It turns out that advanced placement into a major matters less to many parents and HS students than the fact that the course carries university credit. Placement is the #1 reason that HS students take AP courses, but university credit is not too far behind.

For this reason, any new high school CS course that does not offer college credit will be hard to sell to any K-12 school district. (This is especially true in a context where even the existing AP CS is taught in only 7% of our high schools.) That's not too high a hurdle. At the university level, it is much easier to have an AP course approved for university or even major elective credit than it is to have a course approved for advanced placement in the major. So the panel encouraged university profs in the audience to do what they can at their institutions to prepare the way.

Someone on the panel may have mentioned the possibility of having a principles-based CS AP course count as a general education course. At my school we were successful a couple of years ago at having a CS course on simulation and modeling added as one of the courses that satisfies the "quantitative reasoning" requirement in our Liberal Arts Core. I wonder how successful we could be at having a course like the new course under development count for LAC credit. Given the current climate around our core, I doubt we could get a high school AP course to count, because it would not be a part of the shared experience our students have at the university.

The most surprising part of this panel was the vibe in the room. Proposals such as this one that tinker with the introductory course in CS usually draw a fair amount of skepticism and outright opposition. This one did not. The crowd seemed quite accepting, even when the panel turned its message into one of advocacy. They encouraged audience members to become advocates for this course and for AP CS more generally at their schools. They asked us not to tear down these efforts, but to join in and help make the course better. Finally, they asked us to join the College Board, the CS Teachers Association, and the ACM in presenting a united front to our universities, high schools, and state governments about the importance and role of computing in the K-12 curriculum. The audience seemed as if it was already on board.

In closing, there were two memorable quotes from the panel. First, Jan Cuny, currently a program officer for CISE at the National Science Foundation, addressed concern that all the talk these days about the "STEM disciplines" often leaves computing out of the explicit discussion:

There is a C in STEM. Nothing will happen in the S, the T, the E, or the M without the C.

I've been telling everyone at my university this for the last several years, and most are open to the broadening of the term when they are confronted with this truth.

Second, the front-runner for syllogism of the year is this gem from Owen Astrachan. Someone in the audience asked, "If this new course is not CS1, then is it CS0?" (CS0 is a common moniker for university courses taken by non-majors before they dive into the CS major-focused CS1 course.) Thus spake Owen:


     This course comes before CS1.
     0 is the only number less than 1.
     Therefore, this course is CS0.

This was only half of Owen's answer, but it was the half that made me laugh.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 20, 2010 9:04 PM

SIGCSE -- What's the Buzz?

[A transcript of the SIGCSE 2010 conference: Table of Contents]

Some years, I sense a definite undercurrent to the program and conversation at SIGCSE. In 2006, it was Python on the rise, especially as a possible language for CS1. Everywhere I turned, Python was part of the conversation or a possible answer to the questions people were asking. My friend Jim Leisy was already touting a forthcoming CS1 textbook that has since risen to the top of its niche. It pays for me to talk to people who are looking into the future!

Other years, the conference is a melange of many ideas with little or no discernible unifier. SIGCSE 2008 seemed to want to be about computing with big data, but even a couple of high-profile keynote addresses couldn't put that topic onto everyone's lips. 2007 brought lots of program content about "computational thinking", but there was no buzz. Big names and big talks don't create buzz; buzz comes from the people.

This year's wanna-be was concurrency. There were papers and special sessions about its role in the undergrad curriculum, and big guns Intel, Google, and Microsoft were talking it up. But conversation in the hallways and over drinks never seemed to gravitate toward concurrency. The one place where I saw a lot of people gathering to discuss it was a BoF on Thursday night, but even then the sense was more anticipation than activity. What are others doing? Maybe next year.

The real buzz this year was CS, and CS ed, looking outward. Consistent with recent workshops like SECANT, SIGCSE 2010 was full of talk about computer science interacting with other disciplines, especially science but also the arts. Some of this talk was about how CS can affect science education, and some was about how other disciplines can affect CS education.

But there was also a panel on integrating computing research and development with R&D in other sciences. While this may look like a step outside of SIGCSE's direct sphere of interest, it is essential that we keep the sights of our CS ed efforts out in the world where our graduates work. Increasingly, that work is in applications where computing is integral to how others do their jobs.

An even bigger outward-looking buzz coalesced around CS educators working with K-12 schools, teachers, and students. A big part of this involved teaching computer science in middle schools and high schools, including AP CS. But it also involved the broader task of teaching all students about "computational thinking": what it means, how to do it, and maybe even how to write programs that automate it. Such a focus on the general education of students is real evidence of looking outward. This isn't about creating more CS majors in college, though few at SIGCSE would object to that. It's about creating college students better prepared for every major and career they might pursue.

To me, this is a sign of how we in CS ed are maturing, going from a concern primarily for our own discipline to one for computing's larger role in the world. That is an ongoing theme of this blog, it seems, so perhaps I am suffering from confirmation bias. But I found it pretty exciting to see so many people working so hard to bring computing into classrooms and into research labs as a fundamental tool rather than as a separate discipline.


Posted by Eugene Wallingford | Permalink | Categories: Computing

March 11, 2010 8:33 PM

SIGCSE Day One -- The Most Influential CS Ed Papers

[A transcript of the SIGCSE 2010 conference: Table of Contents]

This panel aimed to start the discussion of how we might identify which CS education papers have had the greatest influence on our practice of CS education. Each panelist produced a short list of candidates and also suggested criteria and principles that the community might use over time. Mike Clancy made explicit the idea that we should consider both papers that affect how we teach and papers that affect what we teach.

This is an interesting process that most areas of CS eventually approach. A few years ago, OOPSLA began selecting a paper from the OOPSLA conference ten years prior that had had the most influence on OO theory or practice. That turns out to be a nice starting criterion for selection: wait ten years so that we have some perspective on a body of work and an opportunity to gather data on the effects of the work. Most people seem to think that ten years is long enough to wait.

You can see the list of papers, books, and websites offered by the panelists on this page. The most impassioned proposal was Eric Roberts's tale of how much Rich Pattis's Karel the Robot affects Stanford's intro programming classes to this day, over thirty years after Rich first created Karel.

I was glad to see several papers by Eliot Soloway and his students on the list. Early in my career, Soloway had a big effect on how I thought about novice programmers, design, and programming patterns. My patterns work was also influenced strongly by Linn and Clancy's The Case for Case Studies of Programming Problems, though I do not think I have capitalized on that work as much as I could have.

Mark Guzdial based his presentation on just this idea: our discipline in general does not fully capitalize on great work that has come before. So he decided to nominate the most important papers, not the most influential. What papers should we be using to improve our theory and practice?

I know Anderson's work on cognitive tutors well from the mid-1990s, when I was preparing to move my AI research toward intelligent tutoring systems. The depth and breadth of that work is amazing.

Some of my favorite papers showed up as runners-up on various lists, including Gerald Weinberg's classic The Psychology of Computer Programming. But I was especially thrilled when, in the post-panel discussion, Max Hailperin suggested Robert Floyd's Turing Award lecture, The Paradigms of Programming. I think this is one of the all-time great papers in CS history, with so many important ideas presented with such clarity. And, yes, I'm a fan.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 11, 2010 7:47 PM

SIGCSE Day One -- What Should Everyone Know about Computation?

[A transcript of the SIGCSE 2010 conference: Table of Contents]

This afternoon session was a nice follow up to the morning session, though it had a focus beyond the interaction of computation and the sciences: What should every liberally-educated college graduate know about computation? This almost surely is not the course we start CS majors with, or even the course we might teach scientists who will apply computing in a technical fashion. We hope that every student graduates with an understanding of certain ideas from history, literature, math, and science. What about computation?

Michael Goldweber made an even broader analogy in his introduction. In the 1800s, knowledge about farming was pervasive throughout society, even among non-farmers. This was important for people, even city folk, to understand the world in which they lived. Just as agriculture once dominated our culture, so does technology now. To understand the world in which they live, people these days need to understand computation.

Ultimately, I found this session disappointing. We heard a devil's advocate argument against teaching any sort of "computer literacy"; a proposal that we teach all students what amounts to an applied, hand-waving algorithms course; and a course that teaches abstraction in contexts that connect with students. There was nothing wrong with these talks -- they were all entertaining enough -- but they didn't shed much new light on what is a difficult question to answer.

Henry Walker did say a few things that resonated with me. One, he reminded us that there is a difference between learning about science and doing science. We need to be careful to design courses that do one of these well. Two, he tried to explain why computer science is the right discipline for teaching problem solving as a liberal art: how a computer program can illustrate the consequences of specific choices, the interaction of effects, and especially the precision with which we must use language to describe processes. Walker was the most explicit of the panelists in treating programming as fundamental to what we offer the world.

In a way unlike many other disciplines, writing programs can affect how we think in other areas. A member of the audience pointed out that CS also fundamentally changes other disciplines by creating new methodologies that are unlike anything that had been practical before. His example was the way in which Google processes and translates language. Big data and parallel processing have turned the world of linguistics away from the Chomskian approach and toward statistical models of understanding and generating language.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 11, 2010 7:04 PM

SIGCSE Day One -- Computation and The Sciences

[A transcript of the SIGCSE 2010 conference: Table of Contents]

I chose this session over a paper session on Compilers and Languages. I can always check those papers out in the proceedings to see if they offer any ideas I can borrow. This session connects back to my interests in the role of computing across all disciplines, and especially to my previous attendance at the SECANT workshops on CS and science in 2007 and 2008. I'm eager to hear about how other schools are integrating CS with other disciplines, doing data-intensive computing with their students, and helping teachers integrate CS into their work. Two of these talks fit this bill.

In the first, Ali Erkan talked about the need to prepare today's students to work on large projects that span several disciplines. This is more difficult than simply teaching programming languages, data structures, and algorithms. It requires students to have disciplinary expertise beyond CS, the ability to do "systems thinking", and the ability to translate problems and solutions across the cultural boundaries of the disciplines. A first step is to have students work on problems that are bigger than the students of any single discipline can solve. (Astrachan's Law marches on!)

Erkan then described an experiment at Ithaca College in which four courses run as parallel threads against a common data set of satellite imagery: ecology, CS, thermodynamics, and calculus. Students from any course can consult students in the other courses for explanations from those disciplines. Computer science students in a data structures course use the data not only to solve the problem but also to illustrate ideas, such as memory usage of depth-first and breadth-first searches of a grid of pixels. They can also apply more advanced ideas, such as data analysis techniques to smooth curves and generate 3D graphs.
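
To make that memory comparison concrete, here is a rough sketch of the kind of instrumentation such an assignment might use. It is my own illustration, not Erkan's code: it walks a grid of "pixels" from one corner and reports the largest size the auxiliary queue (breadth-first) or stack (depth-first) ever reaches.

    # A rough sketch, not from the talk: measure the peak size of the
    # auxiliary structure while traversing a rows-by-cols grid of "pixels".
    from collections import deque

    def traverse(rows, cols, breadth_first=True):
        """Visit every cell reachable from (0, 0); return the peak size
        of the queue (breadth-first) or stack (depth-first) along the way."""
        frontier = deque([(0, 0)])
        seen = {(0, 0)}
        peak = 1
        while frontier:
            r, c = frontier.popleft() if breadth_first else frontier.pop()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in seen:
                    seen.add((nr, nc))
                    frontier.append((nr, nc))
            peak = max(peak, len(frontier))
        return peak

    print("breadth-first peak:", traverse(200, 200, True))
    print("depth-first peak:  ", traverse(200, 200, False))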

I took away two neat points from this talk. The first was a link to The Cryosphere Today, a wonderful source of data on arctic and antarctic ice coverage for students to work with. The second was a reminder that writing programs to solve a problem or illustrate a data set helps students to understand the findings of other sciences. Raw data become real for them in writing and running their code.

In the second paper, Craig Struble described a three-day workshop for introducing computer science to high school science teachers. Struble and his colleagues at Marquette offered the workshop primarily for high school science teachers in southeast Wisconsin, building on the ideas described in A Novel Approach to K-12 CS Education: Linking Mathematics and Computer Science. The workshop had four kinds of sessions:

  • tools: science, simulation, probability, Python, and VPython
  • content: mathematics, physics, chemistry, and biology
  • outreach: computing careers, lesson planning
  • fun: CS unplugged activities, meals and other personal interaction with the HS teachers

This presentation echoed some of what we have been doing here. Wisconsin K-12 education presents the same challenge that we face in Iowa: there are very few CS courses in the middle- or high schools. The folks at Marquette decided to attack the challenge in the same way we have: introduce CS and the nebulous "computational thinking" through K-12 science classes. We are planning to offer a workshop for middle- and high school teachers. We are willing to reach an audience wider than science teachers and so will be showing teachers how to use Scratch to create simulations, to illustrate math concepts, and even to tell stories.

I am also wary of one of the things the Marquette group learned in follow-up with the teachers who attended their workshop. Most teachers liked it and learned a lot, but many are not able to incorporate what they learned into their classes. Some face time constraints from a prescribed curriculum, while others are limited by miscellaneous initiatives external to their curriculum that are foisted on them by their school. That is a serious concern for us as we try to help teachers do cool things with CS that change how they teach their usual material.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 10, 2010 8:35 PM

SIGCSE Day 0 -- Media Computation Workshop

[A transcript of the SIGCSE 2010 conference: Table of Contents]

I headed to SIGCSE a day early this year in order to participate in a couple of workshops. The first draw was Mark Guzdial's and Barbara Ericson's workshop using media computation to teach introductory computing to both CS majors and non-majors. I have long been a fan of this work but have never seen them describe it. This seemed like a great chance to learn a little from first principles and also to hear about recent developments in the media comp community.

Because I taught a media computation CS1 course in Java four years ago, the other media comp old-timers counted me as one of their own. They decided, with Mark's blessing, to run a parallel morning session with the goal of pooling experience and producing a resource of value to the community.

When the moment came for the old-timers to break out on their own, I packed up my laptop, stood to leave -- and stopped. I felt like spending the morning as a beginner. This was not an entirely irrational decision. First, while I have done Java media comp, I have never worked with the original Python materials or the JES programming environment students use to do media comp in Python. Second, I wanted to see Mark present the material -- material he has spent years developing and for which he has great passion. I love to watch master teachers in action. Third, I wanted to play with code!

Throughout the morning, I diddled in JES with Python code to manipulate images, doing things I've done many times in Java. It was great fun. Along the way, I picked up a few nice quotes, ideas, and background facts:

  • Mark: "For liberal arts students and business students, the computer is not a tool of calculation but a tool of communication.

  • The media comp data structures book is built largely on explaining the technology needed to create the wildebeest stampede in The Lion King. (Check out this analysis, which contains a description of the scene in the section "Building the Perfect Wildebeests".)

  • We saw code that creates a grayscale version of an image attuned to human perception. The value used for each color in a pixel weights its original values as 0.299*red + 0.587*green + 0.114*blue, giving green the largest say and blue the smallest. (See the sketch after this list.) This formula reinforces the idea that there are an infinite number of weightings we can use to create grayscale. There are, of course, only a finite number of grayscale versions of an image, though that number is very large: 256 raised to a power equal to the number of pixels in the image.

  • After creating several Python methods that modify an image, non-majors eventually bump into the need to return a value, often a new image. Deciding when a function should return a value can be tough, especially for non-CS folks. Mark uses this rule of thumb to get them started: "If you make an image in the method, return it."

  • Mark and Barb use Making of "The Matrix" to take the idea of chromakey beyond the example everyone seems to know, TV weather forecasters.

  • Using mirroring to "fix" a flawed picture leads to a really interesting liberal arts discussion: How do you know when a picture is fake? This is a concept that every person needs to understand these days, and understanding the computations that can modify an image enables us to understand the issues at a much deeper level.

  • Mark showed an idea proposed to him by students at one of his workshops for minority high school boys: when negating an image, change the usual 255 upper bound to something else, say, 180. This forces many of the resulting values to 0 and behaves like a strange posterize function! (The sketch below includes this variation.)
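
For readers who want to see the arithmetic behind two of the bullets above, here is a small standalone sketch. It works on a plain list of (r, g, b) rows rather than on JES pictures, and the function names are mine, not the workshop's: one applies the perception-weighted grayscale, the other the capped negation that produces the posterize-like effect.

    # A standalone sketch of two manipulations mentioned above; it uses
    # plain (r, g, b) tuples instead of JES pictures.

    def grayscale(pixels):
        """Weight green most and blue least, to match human perception."""
        out = []
        for row in pixels:
            new_row = []
            for (r, g, b) in row:
                lum = int(0.299 * r + 0.587 * g + 0.114 * b)
                new_row.append((lum, lum, lum))
            out.append(new_row)
        return out

    def negate(pixels, bound=255):
        """Invert each channel against bound; channels above it clamp to 0.
        With bound=180, many values hit 0, giving the posterize-like effect."""
        flip = lambda v: max(0, bound - v)
        return [[(flip(r), flip(g), flip(b)) for (r, g, b) in row]
                for row in pixels]

    image = [[(255, 0, 0), (0, 255, 0)],
             [(0, 0, 255), (200, 200, 200)]]
    print(grayscale(image)[0][0])    # (76, 76, 76)
    print(negate(image, 180)[1][1])  # (0, 0, 0)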

I also learned about Susan Menzel's work at Indiana University to port media computation to Scheme. This is the second such project I've heard of, after Sam Rebelsky's work at Grinnell College connecting Scheme to Gimp.

Late in the morning, we moved on to sound. Mark demonstrated some wonderful tools for playing with and looking at sound. He whistled, sang, hummed, and played various instruments into his laptop's microphone, and using their MediaTools (written in Squeak) we could see the different mixes of tones available in the different sounds. These simple viewers enable us to see that different instruments produce their own patterns of sounds. As a relative illiterate in music, I only today understood how it is that different musical instruments can produce sounds of such varied character.

The best quote of the audio portion of the morning was, "Your ears are all about logarithms." That note systems halve and double frequencies from octave to octave is not an artifact of culture but an artifact of how the human ear works!
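
A few lines of code make the point, under the usual equal-temperament assumption with A4 = 440 Hz as the reference (my choice, not something from the workshop): each octave doubles the frequency, and each semitone multiplies it by the same fixed ratio.

    # A small sketch of "your ears are all about logarithms": equal steps
    # of pitch are equal *ratios* of frequency, not equal differences.
    A4 = 440.0                          # reference pitch, in Hz

    def note(semitones_from_a4):
        """Frequency of the note that many semitones above (or below) A4."""
        return A4 * 2 ** (semitones_from_a4 / 12.0)

    for octave in range(-1, 3):
        print("A%d = %7.2f Hz" % (4 + octave, note(12 * octave)))
    # A3 = 220, A4 = 440, A5 = 880, A6 = 1760: each octave doubles.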

This was an all-day workshop, but I also had a role as a sage elder at the New Educators Roundtable in the afternoon, so I had to duck out for a few hours beginning with lunch. I missed out on several cool presentations, including advanced image processing ideas such as steganography and embossing, but did get back in time to hear how people are now using media computation to teach data structures ideas such as linked lists and graphs. Even with a gap in the day, this workshop was a lot of fun, and valuable as we consider expanding my department's efforts to teach computing to humanities students.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 10, 2010 7:40 PM

Notes on SIGCSE 2010: Table of Contents

The set of entries cataloged here records some of my thoughts and experiences at SIGCSE 2010, in Milwaukee, Wisconsin, March 10-13. I'll update it as I post new essays about the conference.

Primary entries:

Ancillary entries:


Posted by Eugene Wallingford | Permalink | Categories: Computing, Running, Software Development, Teaching and Learning

March 07, 2010 5:45 PM

Programming as Inevitable Consequence

My previous entry talked about mastering tools and improving process on the road to achievement. Garry Kasparov wrote of "programming yourself" as the way to make our processes better. To excel, we must program ourselves!

One way to do that is via computation. Humans use computers all the time now to augment their behavior. Chessplayers are a perfect example. Computers help us do what we do better, and sometimes they reconfigure us, changing who we are and what we do. Reconfigured well, a person or group of people can push their capabilities beyond even what human experts can do -- alone, together, or with computational help.

But what about our tools? How many chessplayers, or any other people for that matter, program their computers these days as a means of making the tools they need, or the tools they use better? This is a common lament among certain computer scientists. Ian Bogost reminds us that writing programs used to be an inevitable consequence of using computers. Computer manufacturers used to make writing programs a natural step in our mastery of the machine they sold us. They even promoted the personal computer as part of how we became more literate. Many of us old-timers tell stories of learning to program so that we could scratch some itch.

It's not obvious that we all need to be able to program, as long as the tools we need to use are created for us by others. Mark Guzdial discusses his encounters with the "user only" point of view in a recent entry motivated by Bogost's article. As Mark points out, though, the computer may be different than a bicycle and our other tools. Most tools extend our bodies, but the computer extends our minds. We can program our bodies by repetition and careful practice, but the body is not as malleable as the mind. With the right sort of interaction with the world, we seem able to amplify our minds in ways much different than what a bicycle can do for our legs.

Daniel Lemire expresses it nicely and concisely: If you understand an idea, you can implement it in software. To understand an idea is to be able to write a program. The act of writing itself gives rise to a new level of understanding, to a new way of describing and explaining the idea. But there is more than being able to write code. Having ideas and being able to program is, for so many people, a sufficient condition to want to program: Sometimes to scratch an itch; sometimes to understand better; and sometimes simply to enjoy the act.

This feeling is universal. As I wrote not long ago, computing has tools and ideas that make people feel superhuman. But there is more! As Thomas Guest reminds us, "ultimately, the power of the programmer is what matters". The tools help to make us powerful, true, but they also unleash power that is already within us.

By the way, I strongly recommend Guest's blog, Word Aligned. Guest doesn't write as frequently as some bloggers, but when he does, it is technically solid, deep, and interesting.


Posted by Eugene Wallingford | Permalink | Categories: Computing

March 05, 2010 9:21 PM

Mastering Tools and Improving Process

Today, a student told me that he doesn't copy and paste code. If he wants to reuse code verbatim, he requires himself to type it from scratch, character by character. This way, he forces himself to confront the real cost of duplication right away. This may motivate him to refactor as soon as he can, or to reconsider copying the code at all and write something new. In any case, he has paid a price for copying and so has to take it seriously.

The human mind is wonderfully creative! I'm not sure I could make this my practice (I use duplication tactically), but it solves a very real problem and helps to make him an even better programmer. When our tools make it too easy to do something that can harm us -- such as copy and paste with wild abandon, no thought of the future pain it will cause us -- a different process can restore some balance to the world.

The interplay between tools and process came to mind as I read Clive Thompson's Garry Kasparov, cyborg this afternoon. Last month, I read the same New York Review of Books essay by chess grandmaster Garry Kasparov, The Chess Master and the Computer, that prompted Thompson's essay. When I read Kasparov, I was drawn in by his analysis of what it takes for a human to succeed, as contrasted to what makes computers good at chess:

The moment I became the youngest world chess champion in history at the age of twenty-two in 1985, I began receiving endless questions about the secret of my success and the nature of my talent. ... I soon realized that my answers were disappointing. I didn't eat anything special. I worked hard because my mother had taught me to. My memory was good, but hardly photographic. ...

Kasparov understood that, talent or no talent, success was a function of working and learning:

There is little doubt that different people are blessed with different amounts of cognitive gifts such as long-term memory and the visuospatial skills chess players are said to employ. One of the reasons chess is an "unparalleled laboratory" and a "unique nexus" is that it demands high performance from so many of the brain's functions. Where so many of these investigations fail on a practical level is by not recognizing the importance of the process of learning and playing chess. The ability to work hard for days on end without losing focus is a talent. The ability to keep absorbing new information after many hours of study is a talent. Programming yourself by analyzing your decision-making outcomes and processes can improve results much the way that a smarter chess algorithm will play better than another running on the same computer. We might not be able to change our hardware, but we can definitely upgrade our software.

"Programming yourself" and "upgrading our software" -- what a great way to describe how it is that so many people succeed by working hard to change what they know and what they do.

While I focused on the individual element in Kasparov's story, Thompson focused on the social side: how can we "program" a system larger than a single player? He relates one of Kasparov's stories, about a chess competition in which humans were allowed to use computers to augment their analysis. Several groups of strong grandmasters entered the competition, some using several computers at the same time. Thompson then quotes this passage from Kasparov:

The surprise came at the conclusion of the event. The winner was revealed to be not a grandmaster with a state-of-the-art PC but a pair of amateur American chess players using three computers at the same time. Their skill at manipulating and "coaching" their computers to look very deeply into positions effectively counteracted the superior chess understanding of their grandmaster opponents and the greater computational power of other participants. Weak human + machine + better process was superior to a strong computer alone and, more remarkably, superior to a strong human + machine + inferior process.

Thompson sees this "algorithm" as an insight into how to succeed in a world that consists increasingly of people and machines working together:

[S]erious rewards accrue to those who figure out the best way to use thought-enhancing software. ... The process matters as much as the software itself.

I see these two stories -- Kasparov the individual laboring long and hard to become great, and "weak human + machine + better process" conquering all -- as complements to one another, and related back to my student's decision not to copy and paste code. We succeed by mastering our tools and by caring about our work processes enough to make them better in whatever ways we can.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

February 23, 2010 6:34 PM

Strength Later, Weakness Now

One of the things I've always liked about Scheme is that the procedures I write are indistinguishable from the primitive procedures of the language. All procedures are applied prefix and named using the same binding mechanism. This similarity extends beyond simple procedure definitions to other features such as variable arity and special forms, which aren't procedures at all.

All this makes Scheme an especially handy tool for creating domain-specific languages. I can create a whole new language for the domain I'm working in, say, language processing or financial accounting, simply by defining the procedures and forms that embody the domain's concepts. When I write programs using these procedures, they mix seamlessly with Scheme's primitive procedures and forms to create a nearly friction-less writing -- and reading -- experience.

I've always touted this feature of the language to my students, who usually learn Scheme as a tool for writing interpreters and other simple language processors in our programming languages course. However, over the last couple of offerings of the course, I am beginning to realize that this strength is a weakness for many beginners.

At the outset of their journey into functional programming, students have so many things to think about (language syntax, new programming principles, idioms to accomplish tasks they understand well, and so on) that a lack of friction may actually hurt them. They have trouble finding the boundary between the language Scheme and the language we build on top of it. For example, when we first began to implement recursive procedures, I gave students a procedure sequence:

    ;; (sequence 1 5) evaluates to the list (1 2 3 4 5)
    (define sequence
      (lambda (start finish)
        (if (> start finish)
            '()                                          ; empty range
            (cons start (sequence (+ start 1) finish)))))

as a simple example and to use for testing code that processes lists of numbers. For weeks afterwards, I had students e-mailing me because they wrote code that referred to sequence: "Why am I getting errors?" Well, because it's not a primitive and you don't define it. "But we use it in class all the time." Well, I define it each time I need it. "Oh, I guess I forgot." This sequence has re-played itself many times already this semester, with several other pieces of code.

I suppose you could say the students ought to be more disciplined in their definition and use of code, or that I ought to do a better job of teaching them how to write and reuse code, or simply that the students need better memories. One or all of these may be true, but I think there is more happening here. A language with no friction between primitives and user-defined code places one more burden on students who are already juggling a lot of new ideas in their minds.

As students become more experienced with the language and the new style of programming, they have a better chance to appreciate the value of seamless layers of code as they grow a program. They begin to notice that the lack of friction helps them, as they don't have to slow down to handle special cases, or to change how a piece of code works when they decide to add an abstraction between the code and its clients. Whereas before the lack of friction slowed them down while they pondered boundaries and looked up primitives, now it helps them move faster.

This phenomenon is not peculiar to functional programming or Scheme. I think it is also true of OOP and Java. Back when Java was first becoming part of CS 1 at many schools, many of my colleagues objected to the use of home-grown packages and libraries. The primary argument was that students would not be learning to write "real Java" (which is so wrong!) and that their code would not be as portable (which is true). In retrospect, I think a more compelling case can be made that the use of homegrown packages might interfere with students cognitively as they learn the language and the boundaries around it. There are elements of this in both of their objections, but I now think of it as the crux of the issue.

This phenomenon is also not a necessary condition to learning functional programming or Scheme. A number of schools use Scheme in their first-year courses and do just fine. Perhaps instructors at these schools have figured out ways to avoid this problem entirely, or perhaps they rely on some disciplines to help students work around it. I may need to learn something from them.

I have noticed my students having this difficulty the last two times we've offered this course and not nearly as much before, so perhaps our students are changing. On the other hand, maybe this reflects an evolution in my own recognition and understanding of the issue. In either case, my job is pretty much the same: find ways to help students succeed. All suggestions are welcome.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

February 11, 2010 5:40 PM

Creativity and the Boldness of Youth

While casting about Roy Behrens's blog recently, I came across a couple of entries that connected with my own experience. In one, Behrens discusses Arthur Koestler and his ideas about creativity. I enjoyed the entire essay, but one of its vignettes touched a special chord with me:

In 1971, as a graduate student at the Rhode Island School of Design, I finished a book manuscript in which I talked about art and design in relation to Koestler's ideas. I mailed the manuscript to his London home address, half expecting that it would be returned unopened. To my surprise, not only did he read it, he replied with a wonderfully generous note, accompanied by a jacket blurb.

My immediate reaction was "Wow!", followed almost imperceptibly by "I could never do such a thing." But then my unconscious called my bluff and reminded me that I had once done just such a thing.

Back in 2004, I chaired the Educators' Symposium at OOPSLA. As I first wrote back then, Alan Kay gave the keynote address at the Symposium. He also gave a talk at the main conference, his official Turing Award lecture. The Educators' Symposium was better, in large part because we gave Kay the time he needed to say what he wanted to say.

2004 was an eventful year for Kay, as he won not only the Turing Award but also the Draper Prize and Kyoto Prize. You might guess that Kay had agreed to give his Turing address at OOPSLA, given his seminal influence on OOP and the conference, and then consented to speak a second time to the educators.

But his first commitment to speak was to the Educators' Symposium. Why? At least in part because I called him on the phone and asked.

Why would an associate professor at a medium-sized regional public university dare to call the most recent Turing Award winner on the phone and ask him to speak at an event on the undercard of a conference? Your answer is probably as good as mine. I'll say one part boldness, one part hope, and one part naivete.

All I know is that I did call, hoping to leave a message with his secretary and hoping that he would later consider my request. Imagine my surprise when his secretary said, "He's across the hall just now; let me get him." My heart began to beat in triple time. He came to the phone, said hello, and we talked.

For me, it was a marvelous conversation, forty-five minutes chatting with a seminal thinker in my discipline, of whose work I am an unabashed fan. We discussed ideas that we share about computer science, computer science education, and universities. I was so caught up in our chat that I didn't consider just how lucky I was until we said our goodbyes. I hung up, and the improbability of what had just happened soaked in.

Why would someone of Kay's stature agree to speak at a second-tier event before he had even been contacted to speak at the main event? Even more, why would he share so much time talking to me? There are plenty of reasons. The first that comes to mind is most important: many of the most accomplished people in computer science are generous beyond my ken. This is true in most disciplines, I am sure, but I have experienced it firsthand many times in CS. I think Kay genuinely wanted to help us. He was certainly willing to talk to me at some length about my hopes for the symposium and the role he could play.

I doubt that this was enough to attract him, though. The conference venue being Vancouver helped a lot; Kay loves Vancouver. The opportunity also to deliver his Turing Award lecture at OOPSLA surely helped, too. But I think the second major reason was his longstanding interest in education. Kay has spent much of his career working toward a more authentic kind of education for our children, and he has particular concerns with the state of CS education in our universities. He probably saw the Educators' Symposium as an opportunity to incite revolution among teachers on the front-line, to encourage CS educators to seek a higher purpose than merely teaching the language du jour and exposing students to a kind of computing calcified since the 1970s. I certainly made that opportunity a part of my pitch.

For whatever reason, I called, and Kay graciously agreed to speak. The result was a most excellent keynote address at the symposium. Sadly, his talk did not incite a revolt. It did plant seeds in the minds of at least a few of us, so there is hope yet. Kay's encouragement, both in conversation and in his talk, inspires me to this day.

Behrens expressed his own exhilaration "to be encouraged by an author whose books [he] had once been required to read". I am in awe not only that Behrens had the courage to send his manuscript to Koestler but also that he and Koestler continued to correspond by post for over a decade. My correspondence with Kay since 2004 has been only occasional, but even that is more than I could have hoped for as an undergrad, when I first heard of Smalltalk, or as a grad student, when I first felt the power of Kay's vision by living inside a Smalltalk image for months at a time.

I have long hesitated to tell this story in public, for fear that crazed readers of my blog would deluge his phone line with innumerable requests to speak at conferences, workshops, and private parties. (You know who you are...) Please don't do that. But for a few moments once, I felt compelled to make that call. I was fortunate. I was also a recipient of Kay's generosity. I'm glad I did something I never would do.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Personal

February 10, 2010 6:43 PM

Recent Connections: Narrative and Computation

Reader Clint Wrede sent me a link to A Calculus of Writing, Applied to a Classic, another article about author Zachary Mason and his novel The Lost Books of the Odyssey. I mentioned Mason and his book in a recent entry, Diverse Thinking, Narrative, Journalism, and Software, which considered the effect of Mason's CS background on his approach to narrative. In "A Calculus of Writing", Mason makes that connection explicit:

"What I'm interested in scientifically is understanding thought with computational precision," he explained. "I mean, the romantic idea that poetry comes from this deep inarticulable ur-stuff is a nice idea, but I think it is essentially false. I think the mind is articulable and the heart probably knowable. Unless you're a mystic and believe in a soul, which I don't, you really don't have any other conclusion you can reach besides that the mind is literally a computer."

I'm not certain whether the mind is or is not a computer, but I share Mason's interest in "understanding thought with computational precision". Whether poets and novelists create through a computational process or not, building ever-more faithful computational models of what they do interests people like Mason and me. It also seems potentially valuable as a way to understand what it means to be human, a goal scientists and humanists share.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

January 30, 2010 6:02 PM

The Evolution of the Textbook

Over the last week, there has been a long thread on the SIGCSE listserv about writing textbooks. Most interesting to me was Kim Bruce's note, "Thinking about electronic books". Kim noted Apple's announcement of the iPad, which comes with software for reading electronic books. Having written dead-tree books before, he wondered how the evolution of technology might help us to enhance our students' learning experience.

If we can provide books on a general-purpose computer, we have so many options available. Kim mentions one: replacing graphics with animations. Rather than seeing a static picture of the state of some computation, they could watch the computation unfold, with a direct connection to the code that produces it. This offers a huge improvement in the way that students can experience the ideas we want them to learn. You can see this difference in examples Kim posted of his printed textbook and his on-line lecture notes.

Right now, authors face a challenging practical obstacle: the lack of a standard platform. If a book requires features specific, say, to an iPad or to Windows, then its audience is limited. Even if it doesn't, a particular device may not support some near-standard technology, such as Flash on Apple products, leaving its users unable to read the book on their devices. It would be nice to have an authoring system that runs across platforms, transparently, so that writers can focus on what they want to write, not on compatibility issues.

As Kim points out, we can accomplish some of this already on the web, writing for a browser. This isn't good enough, though. Reading long-ish documents at a desktop computer through a browser changes the reading experience in important ways. Our eyes -- and the rest of our bodies -- need something more.

With the evolution of handheld devices toward providing the full computational power we see on the desktop, our ability to write cross-platform books grows. The folks working on Squeak, Croquet, Sophie, and other spin-off technologies have this in mind. They are creating authoring systems that run across platforms and that rely less and less on underlying OS and application software for support.

As we think about how to expand the book-reading experience using new technologies, we can also see a devolution from the other side. Fifteen years ago, I spent a few years thinking about intelligent tutoring systems (ITS). My work on knowledge-based systems in domains such as engineering and business had begun to drift toward instruction. I hoped that we could use what we'd learned about knowledge representation and generic problem-solving patterns to build programs that could help people learn. These systems would encode knowledge from expert teachers in much the way that our earlier systems encoded knowledge from expert tax accountants, lawyers, and engineers.

Intelligent tutoring systems come at learning from the AI side of things, but the goal is the same as that of textbooks: to help people learn. AI promised something more dynamic than what we could accomplish on the printed page. I have not continued in that line of work, but I keep tabs on the ITS community to see what sort of progress they have been making. As with much of AI, the loftiest goals we had when we started are now grounded better in pragmatics, but the goal remains. I think Mark Guzdial has hit upon the key idea in his article Beat the book, not the teacher. The goal of AI systems should not be (at least immediately) to improve upon the performance of the best human teachers, or even to match it; the goal should be to improve upon the performance of the books we ask our students to read. This idea is the same one that Kim Bruce encourages us to consider.

As our technology evolves in the direction of reasonably compact mobile devices with full computational power and high-fidelity displays, we have the ability to evolve how and what we write toward the dream of a dynabook. We should keep in mind that, with computation and computer programming, we are creating a new medium. Ultimately, how and what we write may not look all that much like a traditional book! The results may be something new, something we haven't thought of yet. There is no reason to limit ourselves to producing the page-turning books that have served us so well for the last few centuries. That said, a great way to move forward is to try to evolve our books to see where our new technology can lead us, and to find out where we come up short.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 13, 2010 7:18 PM

Programming, Literacy, and Superhuman Strength

I've written occasionally here about programming as a new communications medium and the need to empower as many people as possible with the ability to write little programs for themselves. So it's probably not surprising that I read Clay Shirky's The Shock of Inclusion, which appears in Edge's How Has The Internet Changed The Way You Think?, with a thought about programming. Shirky reminds us that the revolution in thought created by the Internet is hardly in its infancy. We don't have a good idea of how the Internet will ultimately change how we think because the most important change -- to "cultural milieu of thought" -- has not happened yet. This sounds a lot like Alan Kay on the computer revolution, and like Kay, Shirky makes an analogy to the creation of the printing press.

When we consider the full effect of the Internet, as Shirky does in his essay, we think of its effect on the ability of individuals to share their ideas widely and to connect those ideas to the words of others. From the perspective of a computer scientist, I think of programming as a form of writing, as a medium both for accomplishing tasks and for communicating ideas. Just as the Internet has lowered the barriers to publishing and enables 8-year-olds to become "global publishers of video", it lowers the barriers to creating and sharing code. We don't yet have majority participation in writing code, but the tools we need are being developed and communities of amateur and professional programmers are growing up around languages, tools, and applications. I can certainly imagine a YouTube-like community for programmers -- amateurs, people we should probably call non-programmers who are simply writing for themselves and their friends.

Our open-source software communities have taught us not only that "collaboration between loosely joined parties can work at scales and over timeframes previously unimagined", as Shirky notes, but also several of his other lessons from the Internet: that sharing is possible in ways far beyond the 20th-century model of publishing, that "post-hoc peer review can support astonishing creations of shared value", that whole areas of human exploration "are now best taken on by groups", that "groups of amateurs can sometimes replace single experts", and that the involvement of users accelerates the progress of research and development. The open-source software community is a microcosm of the Internet. In its own way, with some conscious intent by its founders, it is contributing to the creation of the sort of Invisible College that Shirky rightly points out is vital to capitalizing on this 500-year advance in man's ability to communicate. The OSS model is not perfect and has much room for improvement, but it is a viable step in the right direction.

All I know is, if we can put the power of programming into more people's hands and minds, then we can help more people to have the feeling that led Dan Meyer to write Put THAT On The Fridge:

... rather than grind the solution out over several hours of pointing, clicking, and transcribing, for the first time ever, I wrote twenty lines of code that solved the problem in several minutes.

I created something from nothing. And that something did something else, which is such a weird, superhuman feeling. I've got to chase this.

We have tools and ideas that make people feel superhuman. We have to share them!


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

January 05, 2010 3:25 PM

In Programming Style, as in Most Things, Moderation

As I prepare for a new semester of teaching programming languages, I've been enjoying getting back into functional programming. Over break, someone somewhere pointed me toward a set of blog entries on why functional programming doesn't work. My first thought as I read the root entry was, "Just use Scheme or Lisp", or for that matter any functional language that supports mutation. But the author explicitly disallows this, because he is talking about the weakness of pure functional programming.

This is common but always seems odd to me. Many of the arguments one sees against FP are against "pure" functional programming: all FP, all the time. No one ever seems to talk in the same way about stateful imperative programming in, say, C. No one seems to place a purity test on stateful programming: "Try writing that without any functions!". Instead, we use state and sequencing and mutation throughout programs, and then selectively use functional style in the parts of the program where it makes sense. Why should FP be any different? We can use functional style throughout, and then selectively use state where it makes sense.

Mixing state and functions is the norm in imperative programming. The same should be true when we discuss functional programming. In the Lisp world, it is. I have occasionally read Lispers say that their big programs are about 90% functional and 10% imperative. That ratio seems a reasonable estimate for the large functional programs I have written, give or take a few percent either way.

Once we get to the point of acknowledging the desirability of mixing styles, the question becomes which proportion will serve us best in a particular environment. In game programming, the domain used as an example in the set of blog entries I read, perhaps statefulness plays a larger role than 10%. My own experience tells me that whenever I can emphasize functional style (or tightly-constrained stateful style, à la objects), I am usually better off. If I have to choose, I'll take 90:10 functional over 90:10 imperative any day.

If we allow ourselves to mix styles, then solving the author's opening problem -- making two completely unrelated functions interdependent -- becomes straightforward in a functional program: define the functions (or doppelgangers for them) in a closure and 'export' only the functions. To me, this is an improvement over the "pure" stateful approach, as it gives us state and dependent behavior without global variables mucking up the namespace of the program or the mindshare of the programmer.
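
Here is a minimal sketch of that closure idea, written in Python rather than Scheme and using a made-up, game-flavored example of my own rather than the functions from the original article: the two functions share a piece of state through their enclosing scope, and only the functions escape, so nothing leaks into the global namespace.

    # A minimal sketch of the closure idea: shared, hidden state with no
    # global variable. The names here are illustrative, not the article's.
    def make_scorekeeper():
        score = 0                      # state visible only to the two functions

        def record_hit(points):
            nonlocal score
            score += points
            return score

        def report():
            return "current score: %d" % score

        return record_hit, report      # 'export' only the functions

    record_hit, report = make_scorekeeper()
    record_hit(10)
    record_hit(5)
    print(report())                    # current score: 15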

Maybe part of the problem lies in how proponents of functional programming pitch things. Some are surely overzealous about the virtues of a pure style. But I think as much of the problem lies in how limited people with vast experience and deep understanding of one way to think feel when they move outside their preferred style. Many programmers still struggle with object-oriented programming in much the same way.

Long ago, I learned from Ralph Johnson to encourage people to think in terms of programming style rather than programming paradigm. Style implies choice and freedom of thought, whereas paradigm implies rigidity and single-mindedness. I like to encourage students to develop facility with multiple styles, so that they will feel comfortable moving seamlessly in and out of styles, across borders whenever that suits the program they are writing. It is better for what we build to be defined by what we need, not our limitations.

(That last is a turn of phrase I learned from the book Art and Fear, which I have referenced a couple of times before.)

I do take to heart one piece of advice derived from another article in the author's set of articles on FP. People who would like to see functional programming adopted more widely could help the cause by providing more guidance to people who want to learn. What happens if we ask a professional programmer to rewrite a video game (the author's specialty) in pure FP, or

... just about any large, complex C++ program for that matter[?] It's doable, but requires techniques that aren't well documented, and it's not like there are many large functional programs that can be used as examples ...

First, both sides of the discussion should step away from the call for pure FP and allow a suitable mix of functional and stateful programming. Meeting in the middle better reflects how real programmers work. It also broadens considerably the set of FP-style programs available as examples, as well as the set of good instructional materials.

But let's also give credence to the author's plea. We should provide better and more examples, and do a better job of documenting the functional programming patterns that professional programmers need. How to Design Programs is great, but it is written for novices. Maybe Structure and Interpretation of Computer Programs is part of the answer, and I've been excited to see so many people in industry turning to it as a source of professional development. But I still think we can do better helping non-FP software developers make the move toward a functional style from what they do now. What we really need is the functional programming equivalent of the Gang of Four book.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

December 12, 2009 10:15 PM

The Computer Reconfigured Me

Joe Haldeman is a writer of some renown in the science fiction community. I have enjoyed a novel or two of his myself. This month he wrote the Future Tense column that closes the latest issue of Communications of the ACM, titled Mightier Than the Pen. The subhead really grabbed my attention.

Haldeman still writes his novels longhand, in bound volumes. I scribble lots of notes to myself, but I rarely write anything of consequence longhand any more. In a delicious irony, I am writing this entry with pen and paper during stolen moments before a basketball game, which only reminds me how much my penmanship has atrophied from disuse! Writing longhand gives Haldeman the security of knowing that his first draft is actually his first draft, and not the result of the continuous rewriting in place that word processors enable. Even a new-generation word processor like WriteBoard, with automatic versioning of every edit, cannot ensure that we produce a first draft without constant editing quite as well as a fountain pen. We scientists might well think as much about the history and provenance of our writing and data.

Yet Haldeman admits that, if he had to choose, he would surrender his bound notebooks and bottles of ink:

... although I love my pens and blank books with hobbyist zeal, if I had to choose between them and the computer there would be no contest. The pens would have to go, even though they're so familiar they're like part of my hand. The computer is part of my brain. It has reconfigured me.

We talk a lot about how the digital computer changes how we work and live. This passage expresses that idea as well as any I've seen and goes one step more. The computer changes how we think. The computer is part of my brain. It has reconfigured me.

Unlike so many others, Haldeman -- who has tinkered with computers in order to support his writing since the Apple II -- is not worried about this new state of the writer's world. This reconfiguration is simply another stage in the ongoing development of how humans think and work.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

December 01, 2009 9:51 PM

Computing Something of Consequence

My daughter sent me a link to Pranav Mistry's TED India talk, which has apparently been making the rounds among media-savvy high school students and teachers. In it, Mistry demonstrates some very cool technology that blurs the boundary between human experience in the world and human experience mediated by a computer. The kids and teachers turned on by the video are all media-savvy, many are tech-savvy, but few are what we would consider computer science-style techies. They are so excited by Mistry's devices because these devices amplify what humans can do and create qualitatively different kinds of experience.

I loved Mistry's own way of accounting for the excitement his technology causes in people who see and experience it:

We humans actually are not interested in computing. What we are interested in is information. We want to know about things.

Spot on. People want to use computers to compute something of consequence. This is true of most non-techies, but I think it's also true of people who are inclined to study computer science. This is one of the key insights behind Astrachan's Law and its corollary, the Pixar Effect. Students want to do something worth doing. Programming with data and algorithms that are interesting enough to challenge students' expectations can be enough to satisfy these laws, but I have to admit that when we hook our programs up to devices that mediate between the world and our human experience -- wow, amazing things can happen.

If nothing else, Mistry's video has raised the bar on what my daughter would like for a Christmas present. I'll have to send him a thank-you note...


Posted by Eugene Wallingford | Permalink | Categories: Computing

November 23, 2009 2:53 PM

Personality and Perfection

Ward Cunningham recently tweeted about his presentation at Ignite Portland last week. I enjoyed both his video and his slides.

Brian Marick has called Ward a "gentle humanist", which seems so apt. Ward's Ignite talk was about a personal transformation in his life, from driver to cyclist, but as is often the case he uncovers patterns and truths that transcend a single experience. I think that is why I always learn so much from him, whether he is talking about software or something else.

From this talk, we can learn something about change in habit, thinking, and behavior. Still, one nugget from the talk struck me as rather important for programmers practicing their craft:

Every bike has personality. Get to know lots of them. Don't search for perfection. Enjoy variety.

This is true about bikes and also true about programming languages. Each has a personality. When we know but one or two really well, we have missed out on much of what programming holds. When we approach a new language expecting perfection -- or, even worse, that it have the same strengths, weaknesses, and personality as one we already know -- we cripple our minds before we start.

When we get to know many languages personally, down to their personalities, we learn something important about "paradigms" and programming style: They are fluid concepts, not rigid categories. Labels like "OO" and "functional" are useful from some vantage points and exceedingly limiting from others. That is one of the truths underlying Anton van Straaten's koan about objects and closures.
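
For readers who have not seen the koan, here is its flavor in a few lines of Ruby (my own toy example, not van Straaten's): the same little counter can be written as an object or as a closure, and from the right vantage point the two are hard to tell apart.

    # A counter as an object: the state lives in an instance variable.
    class Counter
      def initialize
        @count = 0
      end

      def call
        @count += 1
      end
    end

    # The "same" counter as a closure: the state lives in a captured local variable.
    def make_counter
      count = 0
      lambda { count += 1 }
    end

    object_counter  = Counter.new
    closure_counter = make_counter

    3.times { object_counter.call }
    3.times { closure_counter.call }

    puts object_counter.call    # => 4
    puts closure_counter.call   # => 4

Both respond to call. Whether we see an object or a closure depends on where we stand.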

We should not let our own limitations limit how we learn and use our languages -- or our bikes.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

November 11, 2009 1:08 PM

Time Waits for No One

OR: for all p, ◇ passed(p)

~~~~

Last week saw the passing of computer scientist Amir Pnueli. Even though Pnueli received the Turing Award, I do not have the impression that many computer scientists know much about his work. That is a shame. Pnueli helped to invent an important new sub-discipline of computing:

Pnueli received ACM's A. M. Turing Award in 1996 for introducing temporal logic, a formal technique for specifying and reasoning about the behavior of systems over time, to computer science. In particular, the citation lauded his landmark 1977 paper, "The Temporal Logic of Programs," as a milestone in the area of reasoning about the dynamic behavior of systems.

I was fortunate to read "The Temporal Logic of Programs" early in my time as a graduate student. When I started at Michigan State, most of its AI research was done in the world-class Pattern Recognition and Image Processing lab. That kind of AI didn't appeal to me much, and I soon found myself drawn to the Design Automation Research Group, which was working on ways to derive hardware designs from specs and to prove assertions about the behavior of systems from their designs. This was a neat application area for logic, modeling, and reasoning about design. I began to work under Anthony Wojcik, applying the idea of modal logics to reasoning about hardware design. That's where I encountered the work of Pnueli, which was still relatively young and full of promise.

Classical propositional logic allows us to reason about the truth and falsehood of assertions. It assumes that the world is determinate and static: each assertion must be either true or false, and the truth value of an assertion never changes. Modal logic enables us to express and reason about contingent assertions. In a modal logic, one can assert "John might be in the room" to demonstrate the possibility of John's presence, regardless of whether he is or is not in the room. If John were known to be out of the country, one could assert "John cannot be in the room" to denote that it is necessarily true that he is not in the room. Modal logic is sometimes referred to as the logic of possibility and necessity.

These notions of contingency are formalized in the modal operators ◇p, "possibly p," and □p, "necessarily p." Much like the propositional operators "and" and "or", ◇ and □ can each be used to express the other in combination with ¬, because necessity is really nothing more than possibility "turned inside out". The fundamental identities of modal logic embody this relationship:

    ◇p ≡ ¬□¬p        □p ≡ ¬◇¬p

Modal logic extends the operator set of classical logic to permit contingency. All the basic relationships of classical logic are also present in modal logic. ◇ and □ are not themselves truth functions but quantifiers over possible states of a contingent world.

When you begin to play around with modal operators, you start to discover some fun little relationships. Here are a few I remember enjoying:

[image: modal relationships]

The last of those is an example of a distributive property for modal operators. Part of my master's research was to derive or discover other properties that would be useful in our design verification tasks.

The notion of contingency can be interpreted in many ways. Temporal logic interprets the operators of modal logic as reasoning over time. □p becomes "always p" or "henceforth p," and ◇p becomes "sometimes p" or "eventually p." When we use temporal logic to reason over circuits, we typically think in terms of "henceforth" and "eventually." The states of the world represent discrete points in time at which one can determine the truth value of individual propositions. One need not assume that time is discrete by its nature, only that we can evaluate the truth value of an assertion at distinct points in time. The fundamental identities of modal logic hold in this temporal logic as well.
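
Here is a small sketch of the temporal reading in code (my own toy example, not from the master's work): a finite trace of states, with "henceforth" and "eventually" evaluated over it. A real checker would evaluate formulas at every suffix of the trace, not just from its first point; this is only the flavor.

    # A finite trace: each state maps proposition names to truth values.
    trace = [
      { requested: true,  granted: false },
      { requested: true,  granted: false },
      { requested: false, granted: true  }
    ]

    # "Henceforth p": p holds at every point from here on.
    def henceforth(trace, prop)
      trace.all? { |state| state[prop] }
    end

    # "Eventually p": p holds at some point from here on.
    def eventually(trace, prop)
      trace.any? { |state| state[prop] }
    end

    puts henceforth(trace, :requested)   # => false
    puts eventually(trace, :granted)     # => true

    # The duality from the fundamental identities: eventually p == not henceforth not p.
    puts eventually(trace, :granted) == !trace.all? { |state| !state[:granted] }   # => true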

In temporal logic, we often define other operators that have specific meanings related to time. Among the more useful temporal logical connectives are:

[image: temporal logic operators]

My master's research focused specifically on applications of interval temporal logic, a refinement of temporal logic that treats sequences of points in time as the basic units of reasoning. Interval logics consider possible states of the world from a higher level. They are especially useful for computer science applications, because hardware and software behavior can often be expressed in terms of nested time intervals or sequences of intervals. For example, the change in the state of a flip-flop can be characterized by the interval of time between the instant that its input changes and the instant at which its output reflects the changed input.

Though I ultimately moved into the brand-new AI/KBS Lab for my doctoral work, I have the fondest memories of my work with Wojcik and the DARG team. It resulted in my master's paper, "Temporal Logic and its Use in the Symbolic Verification of Hardware", from which the above description is adapted. While Pnueli's passing was a loss for the computer science community, it inspired me to go back to that twenty-year-old paper and reminisce about the research a younger version of myself did. In retrospect, it was a pretty good piece of work. Had I continued to work on symbolic verification, it might have produced an interesting result or two.

Postscript. When I first read of Pnueli's passing, I didn't figure I had a copy of my master's paper. After twenty years of moving files from machine to machine, OS to OS, and external medium to medium, I figured it would have been lost in the ether. Yet I found both a hardcopy in my filing cabinet and an electronic version on disk. I wrote the paper in nroff format on an old SPARC workstation. nroff provided built-in character sequences for all of the special symbols I needed when writing about modal logic, and they worked perfectly -- unlike HTML, whose codes I've been struggling with for this entry. Wonderful! I'll have to see whether I can generate a PDF document from the old nroff source. I am sure you all would love to read it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

November 03, 2009 7:48 PM

Parts of Speech in Programming Languages

I enjoyed Reg Braithwaite's talk Ruby.rewrite(Ruby) (slides available on-line). It gives a nice survey of some metaprogramming hacks related to Ruby's syntactic and semantic structure.

To me, one of the most thought-provoking things Reg says is actually a rather small point in the overall message of the talk. Object-oriented programming is, he summarizes, basically a matter of nouns and verbs, objects and their behaviors. What about other parts of speech? He gives a simple example of an adverb:

blitz.not.blank?

In this expression, not is an adverb that modifies the behavior of blank?. At the syntactic level, we are really telling blitz to behave differently in response to the next message, which happens to be blank?, but from the programmer's semantic level not modifies the predicate blank?. It is an adverb!

Reg notes that some purists might flag this code as a violation of the Law of Demeter, because it appears to send a message to an object received from another message send. But it only looks that way at the syntax level. We aren't chaining two requests together; we are modifying how one of the requests works, how its result is to be interpreted. Being able to talk about adverbs, and thus to distinguish among different kinds of message, helps to make this clear.

It also helps us to program better in at least two ways. First, we are able to use our tools without unnecessary guilt at breaking the letter of a law that doesn't really apply. Second, we are freed to think more creatively about how our programs can say what we mean. I love that Ruby allows me to create constructs such as not and weave them seamlessly into my code. Many of my favorite gems and apps use this feature to create domain-specific languages that look and feel like what they are and look and feel like Ruby -- at the same time. Treetop is an example. I'd love to hear about your favorite examples.
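
I do not know how blitz implements its not, but the usual Ruby trick is a tiny proxy that remembers to negate the next message. Here is a sketch of the idea (mine, not the library's actual code):

    # A proxy that forwards the next message to its target and negates the answer.
    class Negator
      def initialize(target)
        @target = target
      end

      def method_missing(message, *args, &block)
        !@target.send(message, *args, &block)
      end

      def respond_to_missing?(message, include_private = false)
        @target.respond_to?(message, include_private)
      end
    end

    class Object
      # The "adverb": receiver.not.message? negates message? for the receiver.
      # (A global monkey patch; fine for a sketch.)
      define_method(:not) { Negator.new(self) }
    end

    puts "".not.empty?       # => false
    puts "text".not.empty?   # => true

Syntactically it looks like a chain of two calls; semantically, not simply changes how the next answer is to be read.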

So, our OO programs have nouns and verbs and adverbs. What about other parts of speech? I can think of at least two from Java. One is pronouns. In English, this is a demonstrative pronoun. It is in Java, too. I think that super is also a demonstrative pronoun, though it's not a word we use similarly in English. As an object, I consist of this part of me and that (super) part of me.

Another is adjectives. When I teach Java to students, I usually make an analogy from access modifiers -- public, private, and protected -- to adjectives. They modify the variables and methods which they accompany. So do synchronized and volatile.
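
Ruby, the language of Reg's talk, has rough analogs of both. Here is my own mapping, not Reg's: self and super as pronouns, and the visibility modifiers as adjectives.

    class Account
      def initialize(balance)
        @balance = balance
      end

      def to_s
        # self: a pronoun meaning "this very object".
        "#{self.class.name} with balance #{@balance}"
      end

      private

      # private: an adjective modifying the methods defined after it.
      def audit
        puts "auditing #{@balance}"
      end
    end

    class SavingsAccount < Account
      def to_s
        # super: a pronoun meaning "that part of me defined in my parent".
        "Savings -- #{super}"
      end
    end

    puts SavingsAccount.new(100)   # => Savings -- SavingsAccount with balance 100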

Once we free ourselves to think this way, though, I think there is something more powerful afoot. We can begin to think about creating and using our own pronouns and adjectives in code. Do we need to say something in which another part of speech helps us to communicate better? If so, how can we make it so? We shouldn't be limited to the keywords defined for us five or fifteen or fifty years ago.

Thinking about adverbs in programming languages reminds me of a wonderful Onward! talk I heard at the OOPSLA 2003 conference. Cristina Lopes talked about naturalistic programming. She suggested that this was a natural step in the evolution from aspect-oriented programming, which had delocalized references within programs in a new way, to code that is concise, effective, and understandable. Naturalistic programming would seek to take advantage of elements in natural language that humans have been using to think about and describe complex systems for thousands of years. I don't remember many of the details of the talk, but I recall discussion of how we could use anaphora (expressions that refer back to something already mentioned, the way pronouns do) and temporal references in programs. Now that my mind is tuned to this wavelength, I'll go back to read the paper and see what other connections it might trigger. What other parts of speech might we make a natural part of our programs?

(While writing this essay, I have felt a strong sense of deja vu. Have I written a previous blog entry on this before? If so, I haven't found it yet. I'll keep looking.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 02, 2009 6:59 PM

It's All Just Programming

One of my colleagues is an old-school C programmer. He can make the machine dance using C. When C++ came along, he tried it for a while, but many of the newly-available features seemed like overkill to him. I think templates fell into that category. Other features disturbed him. I remember him reporting some particularly bad experiences with operator overloading. They made code unreadable! Unmaintainable! You could never be sure what + was doing, let alone operators like () and casts. His verdict: Operator overloading and its ilk are too powerful. They are fine in theory, but real languages should not provide so much freedom.

Some people don't like languages with features that allow them to reconfigure how the language looks and works. I may have been in that group once, long ago, but then I met Lisp and Smalltalk. What wonderful friends they were. They opened themselves completely to me; almost nothing was off limits. In Lisp, most everything was open to inspection, code was data that I could process, and macros let me define my own syntax. In Smalltalk, everything was an object, including the integers and the classes and the methods. Even better, most of Smalltalk was implemented in Smalltalk, right there for me to browse and mimic... and change.

Once I was shown a world bigger than Fortran, PL/I, and Pascal, I came to learn something important, something Giles Bowkett captures in his inimitable, colorful style:

There is no such thing as metaprogramming. It's all just programming.

(Note: "Colorful" is a euphemism for "not safe to read aloud at work, nor to be read by those with tender sensibilities".)

Ruby fits nicely with languages such as Common Lisp, Scheme, and Smalltalk. It doesn't erect too many boundaries around what you can do. The result can be disorienting to someone coming from a more mainstream language such as Java or C, where boundaries between "my program" and "the language" are so much more common. But to Lispers, Schemers, and Smalltalkers, the freedom feels... free. It empowers them to express their ideas in code that is direct, succinct, and powerful.
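
A tiny example of the sort of freedom I mean (my own example, not Giles's): in Ruby, defining a batch of methods at run time is just more code, not a special "meta" level that lives somewhere else.

    # "Metaprogramming" here is ordinary programming: a loop that defines methods.
    class Temperature
      attr_reader :celsius

      def initialize(celsius)
        @celsius = celsius
      end

      { fahrenheit: ->(c) { c * 9.0 / 5 + 32 },
        kelvin:     ->(c) { c + 273.15 } }.each do |name, conversion|
        define_method(name) { conversion.call(celsius) }
      end
    end

    t = Temperature.new(100)
    puts t.fahrenheit   # => 212.0
    puts t.kelvin       # => 373.15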

Actually, when you program in C, you learn the same lesson, only in a different way. It's all just programming. Good C programmers often implement their own little interpreters and their own higher-order procedures as a part of larger programs. To do so, they simply create their own data structures and code to manipulate them. This truth is the raw material out of which Greenspun's Tenth Rule of Programming springs. And that's the point. In languages like C, if you want to use more powerful features, and you will, you have to roll them for yourself. My friends who are "C weenies" -- including the aforementioned colleague -- take great pride in their ability to solve any problem with just a bit more programming, and they love to tell us the stories.

Metaprogramming is not magic. It is simply another tool in the prepared programmer's toolbox. It's awfully nice when that tool is also part of the programming language we use. Otherwise, we are limited in what we can say conveniently in our programs by the somewhat arbitrary lines drawn between real and meta.

You know what? Almost everything in programming looks like magic to me. That may seem like an overstatement, but it's not. When I see a program of a few thousand lines or more generate music, play chess, or even do mundane tasks like display text, images, and video in a web browser, I am amazed. When I see one program convert another into the language of a particular machine, I am amazed. When people show me shorter programs that can do these things, I am even more amazed.

The beauty of computer science is that we dig deeper into these programs, learn their ideas, and come to understand how they work. We also learn how to write them ourselves.

It may still feel like magic to me, but in my mind I know better.

Whenever I bump into a new bit of sorcery, a new illusion or a new incantation, I know what I need to do. I need to learn more about how to write programs.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 27, 2009 9:47 PM

William Cook on Industry and Academia

I really enjoyed reading the text of William Cook's banquet speech at ECOOP 2009. When I served as tutorials chair for OOPSLA 2006, Cook was program chair, and that gave me a chance to meet him and learn a bit about his background. He has an interesting career story to tell. In this talk, he tells us this story as a way to compare and contrast computer science in academia and in industry. It's well worth a read.

As a doctoral student, I thought my career path might look similar to Cook's. I applied for research positions in industry, with two of the Big Six accounting firms and with an automobile manufacturer, at the same time I applied for academic positions. In the end, after years of leaning toward industry, I decided to accept a faculty position. As a result, my experience with industrial CS research and development is limited to summer positions and to interactions throughout grad school.

Cook's talk needs no summary; you should read it for yourself. Here are a few points that stood out to me as I read:

Venture capitalists talk about pain killers versus vitamins. A vitamin is a product that might make you healthier over the long run, but it's hard to tell. Pain killers are products where the customer screams "give it to me now" and asks what it costs later. Venture capitalists only want to fund pain killers.

This is true not only of venture capitalists, but of lots of people. As a professor, I recognize this in myself and in my students all of the time. Cook points out that most software development tools are vitamins. So are many of the best development practices. We need to learn tools and practices that will make us most productive and powerful in the long run, but without short-term pain we may not generate the resolve to do so.

We read all the standard references on OO for business applications. It didn't make sense to us. We started investigating model-driven development styles. We created modeling languages for data, user interfaces, security and workflow. These four aspects are the core of any enterprise application. We created interpreters to run the languages, and our interpreters did many powerful optimizations by weaving together the different aspects.

To me, this part of the talk exemplifies best how a computer scientist thinks differently than a non-computer scientist, whether experienced in software development or not. Languages are tools we create to help us solve problems, not merely someone else's solutions we pluck off the shelf. Language processors are tools we create to make our languages come to life in solving instances of actual problems.

The way I see it is that industry generally has more problems than they do solutions, but academia often has more solutions than problems.

Cook makes a great case for a bidirectional flow between industry, with its challenging problems in context, and academia, with its solutions built of theory, abstraction, and design. This transfer can be mutually beneficial, but it is complicated by context:

Industrial problems are often messy and tied to specific technical situations or domains. It is not easy to translate these complex problems into something that can be worked on in academia. This translation involves abstraction and selection.

The challenge is greatest when we then take solutions to problems abstracted from real-world details and selected for interestingness more than business value and try to re-inject them into the wild. Too often, these solutions fail to take hold, not because people in industry are "stupid or timid" but because the solution doesn't solve their problem. It solves an ideal problem, not a real one. The translation process from research to development requires a finishing step that people in the research lab often have little interest in doing and that people in the development studio have little time to implement. The result is a disconnect that can sour the relationship unnecessarily.

Finally, the talk is full of pithy lines that I hope to steal and use to good effect sometime soon. Here is my favorite:

Simplicity is not where things start. ... It is where they end.

Computer scientists seek simplicity, whether in academia or in industry. Cook gives us some clues in this talk about how people in these spheres can understand one another better and, perhaps, work better together toward their common goal.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 23, 2009 3:11 PM

Universal Ideas of Harmonious Design

I didn't write this comment on the blog entry Wa: The key to clear, harmonious design:

I know that you are talking about visual design, but I am struck by how this approach applies to many other domains.

But I could have.

I started university life intending to become an architect, and my interest in visual design has remained strong through the years. I was delighted when I learned of Christopher Alexander's influence on some in the software world, because it gave me more opportunities to read and think about architectural design -- and to think about how its ideas relate to how we design software. I am quite interested in the notion that there are universal truths about design and, even if there are not, in what we can learn from designers in other disciplines.

Garr Reynolds identifies seven principles of the Zen aesthetic of harmony. Like the commenter, my thoughts turned quickly from the visual world to another domain. For me, the domain is software. How well do these principles of harmony apply to software? Several are staples of software design. Others require more thought.

(1) Embrace economy of materials and means
(3) Keep things clean and clutter-free

These are no-brainers. Programmers want to keep their code clean, and most prefer an economical style, even when using a language that requires more overhead than they would like.

(6) Think not only of yourself, but of the other (e.g., the viewer).

When we develop software, we have several kinds of others to consider. The most obvious are our users. We have disciplines, such as human-computer interaction, and development styles, such as user-centered design, focused squarely on the people who will use our programs.

We also need to think of other programmers. These are the people who will read our code. Software usually spends much more time in maintenance than in creation, so readability pays off in a huge way over time. We can help our readers by writing good documentation, an essential complement to our programs. However, the best way to help our readers is to write readable code. In this we are more like Reynolds's presenters. We need to focus on the clarity and beauty of our primary product.

Finally, let's not forget our customers and our clients, the people who pay us to write software. To me, one of the most encouraging contributions of XP was its emphasis on delivering tangible value to our customers every day.

(7) Remain humble and modest.

This is not technical advice. It is human advice. And I think it is underrated in too many contexts.

I have worked with programmers who were not humble enough. Sadly, I have been that programmer, too.

A lack of humility almost always hurts the project and the team. Reynolds is right in saying that true confidence follows from humility and modesty. Without humility, a programmer is prone to drift into arrogance, and arrogance is more dangerous than incompetence.

A programmer needs to balance humility against gumption, the hubris that empowers us to tackle problems which seem insurmountable. I have always found that humility is a great asset when I have the gumption to tackle a big problem. Humility keeps me alert to things I don't understand or might not see otherwise, and it encourages me to take care at each step.

... Now come a couple of principles that cause me to think harder.

(2) Repeat design elements.

Duplication is a bane to software developers. We long ago recognized that repetition of the same code creates so many problems for writing and modifying software that we have coined maxims such as "Don't repeat yourself" and "Say it once and only once." We even create acronyms such as DRY to get the idea across in three spare letters.

However, at another level, repetition is unavoidable. A stack is a powerful way to organize and manipulate data, so we want to use one whenever it helps. Rather than copy and paste the code, we create an abstract data type or a class and reuse the component by instantiating it.
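
A small Ruby sketch of what I mean (my example): the recurring element is written once and "repeated" by instantiation wherever the design calls for it.

    # One definition of the recurring design element...
    class Stack
      def initialize
        @items = []
      end

      def push(item)
        @items.push(item)
        self
      end

      def pop
        @items.pop
      end

      def empty?
        @items.empty?
      end
    end

    # ...repeated by instantiation, not by copy and paste.
    undo_history = Stack.new
    call_frames  = Stack.new

    undo_history.push("insert paragraph").push("delete word")
    call_frames.push("main").push("build_report")

    puts undo_history.pop   # => delete word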

Software reuse of this sort is how programmers repeat design elements. Indeed, one of the most basic ideas in all of programming is the procedure, an abstraction of a repeated behavioral element. It is fundamental to our discipline, and one of the contributions that computer science made as we moved away from our roots in mathematics.

In the last two decades, programmers have begun to embrace repeatable design units at higher levels. Design patterns recur across contexts, and so now we do our best to document them and share them with others. Architectures and protocols and, yes, even our languages are ways to reify recurring patterns in a way that makes using them as convenient as possible.

(4) Avoid symmetry.

Some programmers may look at this principle and say, "Huh? How can this apply? I'm not even sure what it means in the context of software."

When linguistic structures and data structures repeat, they repeat just as they are, bringing a natural symmetry to the algorithms we use and the code we write. But at the level of design patterns and architectures, things are not so simple. Christopher Alexander, the building architect who is the intellectual forefather of the software patterns community, famously said that a pattern appears a million times, but never exactly the same. The pattern is molded to fit the peculiar forces at play in each system. This seems to me a form of breaking symmetry.

But we can take the idea of avoiding symmetry farther. In the mathematical and scientific communities, there has long been a technical definition of symmetry in groups, as well as a corresponding definition of breaking symmetry in patterns. Only a few people in the software community have taken this formal step with design patterns. Chief among them are Jim Coplien and Liping Zhao. Check out their book chapter, Symmetry Breaking in Software Patterns, if you'd like to learn more.

A few years ago I was able to spend some time looking at this paper and at some of the scientific literature on patterns and symmetry breaking. Unfortunately, I have not been able to return to it since. I don't yet fully understand these ideas, but I think I understand enough to see that there is something important here. This glimmer convinces me that avoiding symmetry is perhaps an important principle for us software designers, one worthy of deeper investigation.

... This leaves us with one more principle from the Presentation Zen article:

(5) Avoid the obvious in favor of the subtle

This is the one principle out of the seven that I think does not apply to writing software. All other things being equal, we should prefer the obvious to the subtle. Doing something that isn't obvious is the single best reason to write a comment in our code. When we must do something unexpected by our readers, we must tell them what we have done and why. Subtlety is an impediment to understanding code.

Perhaps this is a way in which we who work in software differ from creative artists. Subtlety can enhance a work of art, by letting -- even requiring -- the mind to sense, explore, and discover something beyond the surface. As much art as there is in good code, code is at its core a functional document. Requiring maintenance programmers to mull over a procedure and to explore its hidden treasures only slows them down and increases the chances that they will make errors while changing it.

I love subtlety in algorithms and designs, and I think I've learned a lot from reading code that engages me in a way I've not experienced before. But there is something dangerous about code in which subtlety becomes more important than what the program does.

Blaine Buxton recently wrote a nice entry on the idea of devilishly clever code:

But, it got me thinking about clever and production code. In my opinion, clever is never good or wanted in production code. It's great to learn and understand clever code, though. It's a great mental workout to keep you sharp.

Maybe I am missing something subtle here; I've been accused of not seeing nuance before. This may be like the principle of avoiding symmetry, but I haven't reached the glimmer of understanding yet. Certainly, many people speak of Apple's great success with subtle design that engages and appeals to users in a way that other design companies do not. Others, though, attribute its success to creating products that are intuitive to use. To me, intuitiveness points more to obviousness and clarity than to subtlety. And besides, Apple's user experience is at the level of design Reynolds is talking about, not at the level of code.

I would love to hear examples, pro and con, of subtlety in code. I'd love to learn something new!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 20, 2009 8:21 PM

AP Computer Science, Back on People's Minds

A while back -- a year? two? -- the folks at the College Board announced some changes to the way they would offer the AP exams in computer science. I think the plan was to eliminate the B exam (advanced) and redesign the A exam (basic). At the time, there was much discussion among CS educators, at conferences, on the SIGCSE mailing list, and in a few blogs. In 2008 sometime, I read a CACM article by a CS educator on the issue. Her comments were interesting enough that I made some notes in the margins and set the article aside. I also collected a few of my thoughts about the discussions I had read and heard into a text file. I would write a blog article!

But I never did.

I went looking for that text file today. I found it in a folder named recent/, but it is not recent. The last time I touched the file was Tuesday, December 9, 2008.

I guess it wasn't all that urgent after all.

Actually, this isn't all that uncommon for blog article ideas. Many come to mind, but few make it to the screen. Yet this seems different. When the original news was announced, the topic seemed so urgent to many of my close friends and colleagues, and that made it seem urgent to me. The Future of Computer Science in the High Schools was at stake. Yet I could never bring myself to write about the article.

To be honest, it is hard for me to care much about AP. I have been teaching at my university for over seventeen years, and I cannot recall a single student who asked us about AP CS credit. We simply never see it.

Computer programming courses long ago disappeared from most high schools in my state. I am willing to wager that no Iowa schools ever taught computer science qua computer science; if any did, the number was nearly zero. Even back in the early to mid-1990s when dedicated CS courses existed, they were always about learning to program, usually in Basic or Pascal. That made sense, because the best way to help high school students get ready for the first college CS course is to introduce them to programming. Whatever you think about programming as the first course, that is the way most universities work, as well as nearly every college in Iowa. Those programming courses could have been AP courses, but most were not.

Unfortunately, falling budgets, increasing demands in core high school subjects, and a lack of certified CS teachers led many schools to cut their programming courses. If students in my state see a "computer course" in high school these days, it is almost always a course on applications, usually productivity tools or web design.

Maybe I am being self-centered in finding it hard to care about the AP CS exams. We do not see students with AP CS credit or receive inquiries about its availability here. AP CS matters a lot to other people, and they are better equipped to deal with the College Board and the nature and content of the exams.

Then again, maybe I am being short-sighted. Many argue that AP CS is the face of computer science in the high schools, and for better or worse it defines what most people in the K-12 world think CS is. I am less bothered with programming as the focus of that course than many of my friends and colleagues. I'm even sympathetic to the ardent defense of the current exam structure that Stuart Reges makes at the site he maintains, in the penumbra of the University of Washington, to preserve it. But I do think that the course and exam could do a better job of teaching and testing programming than they have over the last decade or so.

Should the course be more than programming, or different altogether? I am open to that, too; CS certainly is more than "just programming". Alas, I am not sure that the academic CS world can design a non-programming high school CS course that satisfies enough of the university CS departments to garner widespread adoption.

But for someone at a university like mine, and in a state like mine, all of the money and mindshare spent on AP Computer Science seems to go for naught. It may benefit the so-called top CS programs, the wealthier school districts, and the students in states where computing already has more of a presence in the high school classroom. In my context? It's a no-op.

Why did I dig a ten-month old text file out for blogging now? There is much ado again about AP CS in light of the Georgia Department of Education announcing that AP Computer Science would no longer count towards high school graduation requirements. This has led to a wide-ranging discussion about whether CS should count as science or math (the TeachScheme! folks have a suggestion for this), the content of the course, and state curriculum standards. Ultimately, the issue comes down to two things: politics, both educational and governmental, and the finite number of hours available in the school day.

So, I will likely return to matters of greater urgency to my university and my state. Perhaps I am being short-sighted, but the simple fact is this. The AP CS curriculum has been around for a long time, and its existence has been of no help in getting my state to require or endorse high school CS courses, certify high school CS teachers, or even acknowledge the existence of computer science as a subject or discipline essential to the high school curriculum. We will continue to work on ways to introduce K-12 students to computer science and to help willing and interested schools offer more and better CS-related material. The AP CS curriculum is likely to have little or no effect on our success or failure.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

September 29, 2009 5:51 PM

Life, Artificial and Oh So Real

Artificial. Tyler Cowen writes about a new arena of battle for the Turing Test:

I wonder when the average quality of spam comment will exceed the average quality of a non-spam comment.

This is not the noblest goal for AI, but it may be one for which the economic incentive to succeed drives someone to work hard enough to do so.

Oh So Real. I have written periodically over the last sixteen months about being sick with an unnamed and undiagnosed malady. At times, I was sick enough that I was unable to run for stretches of several weeks. When I tried to run, I was able to run only slowly and only for short distances. What's worse, the symptoms always returned; sometimes they worsened. The inability of my doctors to uncover a cause worried me. The inability to run frustrated and disappointed me.

Yesterday I read an essay by a runner about the need to run through a battle with cancer:

I knew, though, if I was going to survive, I'd have to keep running. I knew it instinctively. It was as though running was as essential as breathing.

Jenny's essay is at turns poetic and clinical, harshly realistic and hopelessly romantic. It puts my own struggles into a much larger context and makes them seem smaller. Yet in my bones I can understand what she means: "... that is why I love running: nothing makes me feel more alive. I hope I can run forever."


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Running

September 28, 2009 8:32 PM

Two Sides of My Job

Today brought high contrast to my daily duties.

I spent my morning preparing a talk on computer science, careers, and job prospects for an audience of high school counselors from around the state. Then I gave the talk and lunched with them. Both prep and presentation were a lot of fun! I get to take some element of computer science and use it as a way to communicate how much fun CS is and why these counselors should encourage their students to consider CS as a major. Many of the adults in my audience were likely prepared for the worst, to be bored by a hard topic they don't understand. Seeing the looks in their eyes when they saw how an image could be disassembled and reassembled to hide information -- I hope the look in my eyes reflected it.

Early afternoon found me working on next semester's schedule of courses and responding to requests for consultations on curricular changes being made by another department. The former is in most ways paperwork, seemingly for paperwork's sake. The latter requires delicate language and, all too frequently, politics. Every minute seems a grind. This is an experience of getting caught up in stupid details, administration style.

I ended the day by visiting with a local company about a software project that one of our senior project courses might work on. I learned a little about pumps and viscosity and flow, and we discussed the challenges they face in deploying an application that meets all their needs. Working on real problems with real clients is still one of the great things about building software.

Being head of our department has brought me more opportunities like the ones that bookended my day, but it has also thrust me into far too many battles with paperwork and academic process and politics like the ones that filled the in-between time. After four-plus years, I have not come to enjoy that part of the job, even when I appreciate the value it can bring to our department and university when I do it well. I know it's a problem when I have to struggle to maintain my focus on the task at hand just to make progress. Such tasks offer nothing like the flow that comes from preparing and giving a talk, writing code, or working on a hard problem.

A few months ago I read about a tool called Freedom, which "frees you from the distractions of the internet, allowing you time to code, write, or create." (It does this by disabling one's otherwise functional networking for up to eight hours.) I don't use Freedom or any tool like it, but there are moments when I fear I might need one to keep doing my work. Funny, but none of those moments involve preparing and giving a talk, writing code, or working on a hard problem.

Tim Bray said it well:

If you need extra mental discipline or tool support to get the focus you need to do what you have to do, there's nothing wrong with that, I suppose. But if none of your work is pulling you into The Zone, quite possibly you have a job problem not an Internet problem.

Fortunately, some of my work pulls me into The Zone. Days like today remind me how much different it feels. When I am mired in some tarpit outside of the zone, it's most definitely not an Internet problem.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Managing and Leading

September 19, 2009 9:09 PM

Quick Hits with an Undercurrent of Change

Yesterday evening, in between volleyball games, I had a chance to do some reading. I marked several one-liners to blog on. I planned a disconnected list of short notes, but after I started writing I realized that they revolve around a common theme: change.

Over the last few months, Kent Beck has been blogging about his experiences creating a new product and trying to promote a new way to think about his design. In his most recent piece, Turning Skills into Money, he talks about how difficult it can be to create change in software service companies, because the economic model under which they operate actually encourages them to have a large cohort of relatively inexperienced and undertrained workers.

The best line on that page, though, is a much-tweeted line from a comment by Niklas Bjørnerstedt:

A good team can learn a new domain much faster than a bad one can learn good practices.

I can't help thinking about the change we would like to create in our students through our software engineering course. Skills and good practices matter. We cannot overemphasize the importance of proficiency, driven by curiosity and a desire to get better.

Then I ran across Jason Fried's The Next Generation Bends Over, a salty and angry lament about the sale of Mint to Intuit. My favorite line, with one symbolic editorial substitution:

Is that the best the next generation can do? Become part of the old generation? How about kicking the $%^& out of the old guys? What ever happened to that?

I experimented with Mint and liked it, though I never convinced myself to go all the way with it. I have tried Quicken, too. It seemed at the same time too little and too much for me, so I've been rolling my own. But I love the idea of Mint and hope to see the idea survive. As the industry leader, Intuit has the leverage to accelerate the change in how people manage their finances, compared to the smaller upstart it purchased.

For those of us who use these products and services, the nature of the risk has just changed. The risk with the small guy is that it might fold up before it spreads the change widely enough to take root. The risk with the big power is that it doesn't really get it and wastes an opportunity to create change (and wealth). I suspect that Intuit gets it and so hold out hope.

Still... I love the feistiness that Fried shows. People with big ideas need not settle. I've been trying to encourage the young people with whom I work, students and recent alumni, to shoot for the moon, whether in business or in grad school.

This story meshed nicely with Paul Graham's Post-Medium Publishing, in which Graham joins in the discussion of what it will be like for creators no longer constrained by the printed page and the firms that have controlled publication in the past. The money line was:

... the really interesting question is not what will happen to existing forms, but what new forms will appear.

Change will happen. It is natural that we all want to think about our esteemed institutions and what the change means for them. But the real excitement lies in what will grow up to replace them. That's where the wealth lies, too. That's true for every discipline that traffics in knowledge and ideas, including our universities.

Finally, Mark Guzdial ruminates on what changes CS education. He concludes:

My first pass analysis suggests that, to make change in CS, invent a language or tool at a well-known institution. Textbooks or curricula rarely make change, and it's really hard to get attention when you're not at a "name" institution.

I think I'll have more to say about this article later, but I certainly know what Mark must be feeling. In addition to his analysis of tools and textbooks and pedagogies, he has his own experience creating a new way to teach computing to non-majors and majors alike. He and his team have developed a promising idea, built the infrastructure to support it, and run experiments to show how well it works. Yet... The CS ed world looks much like it always has, as people keep doing what they've always been doing, for as many reasons as you can imagine. And inertia works against even those with the advantages Mark enumerates. Education is a remarkably conservative place, even our universities.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development, Teaching and Learning

September 14, 2009 10:53 PM

Old Dreams Live On

"Only one, but it's always the right one."
-- Jose Raoul Capablanca,
when asked how many moves ahead
he looked while playing chess

When I was in high school, I played a lot of chess. That's also when I first learned about computer programming. I almost immediately was tantalized by the idea of writing a program to play chess. At the time, this was still a novelty. Chess programs were getting better, but they couldn't compete with the best humans yet, and I played just well enough to know how hard it was to play the game really well. Like so many people of that era, I thought that playing chess was a perfect paradigm of intelligence. It looked like such a wonderful challenge to the budding programmer in me.

I never wrote a program that played chess well, yet my programming life often crossed paths with the game. My first program written out of passion was a program to implement a ratings system for our chess club. Later, in college, I wrote a program to run the Swiss pairing system commonly used in chess tournaments as a part of my senior project. This was a pretty straightforward program, really, but it taught me a lot about data structures, algorithms, and how to model problems.

Though I never wrote a great chess-playing program, that was the problem that mesmerized me and ultimately drew me to artificial intelligence and a major in computer science.

In a practical sense, chess has been "solved", but not in the way that most of us who loved AI as kids had hoped. Rather than reasoning symbolically about positions and plans, attacks and counterattacks, Deep Blue, Fritz, and all of today's programs win by deep search. This is a strategy that works well for serial digital computers but not so well for the human mind.

To some, the computer's approach seems uncivilized even today, but those of us who love AI ought be neither surprised nor chagrined. We have long claimed that intelligence can arise from any suitable architecture. We should be happy to learn how it arises most naturally for machines with fast processors and large memories. Deep Blue's approach may not help us to understand how we humans manage to play the game well in the face of its complexity and depth, but it turns out that this is another question entirely.

Reading David Mechner's All Systems Go last week brought back a flood of memories. The Eastern game of Go stands alone these days among the great two-person board games, unconquered by the onslaught of raw machine power. The game's complexity is enormous, with a branching factor at each ply so high that search-based programs soon drown in a flood of positions. As such, Go encourages programmers to dream the Big Dream of implementing a deliberative, symbolic reasoner in order to create a program that plays the game well. The hubris and well-meaning naivete of AI researchers have promised huge advances throughout the years, only to have ambitious predictions go unfulfilled in the face of unexpected complexity. Well-defined problems such as chess turned out to be complex enough that programs reasoning like humans were unable to succeed. Ill-defined problems involving human language and the interconnected network of implicit knowledge that humans seem to navigate so easily -- well, they are even more resistant to our solutions.

Then, when we write programs to play games like chess well, many people -- including some AI researchers -- move the goal line. Schaeffer et al. solved checkers with Chinook, but many say that its use of fast search and a big endgame database is unfair. Chess remains unsolved in the formal sense, but even inexpensive programs available on the mass market play far, far better than all but a handful of humans in the world. The best programs play better than the best humans.

Not so with Go. Mechner writes:

Go is too complex, too subtle a game to yield to exhaustive, relatively simpleminded computer searches. To master it, a program has to think more like a person.

And then:

Go sends investigators back to the basics--to the study of learning, of knowledge representation, of pattern recognition and strategic planning. It forces us to find new models for how we think, or at least for how we think we think.

Ah, the dream lives!

Even so, I am nervous when I read Mechner talking about the subtlety of Go, the depth of its strategy, and the impossibility of playing it well by search and power. The histories of AI and CS have demonstrated repeatedly that what we think difficult often turns out to be straightforward for the right sort of program, and that what we think easy often turns out to be achingly difficult to implement. What Mechner calls 'subtle' about Go may well just be a name for our ignorance, for our lack of understanding today. It might be wise for Go aficionados to remain humble... Man's hubris survives only until the gods see fit to smash it.

We humans draw on the challenge of great problems to inspire us to study, work, and create. Lance Fortnow wrote recently about the mystique of the open problem. He expresses the essence of one of the obstacles we in CS face in trying to excite the current generation of students about our discipline: "It was much more interesting to go to the moon in the 60's than it is today." P versus NP may excite a small group of us, but when kids walk around with iPhones in their pockets and play on-line games more realistic than my real life, it is hard to find the equivalent of the moon mission to excite students with the prospect of computer science. Isn't all of computing like Fermat's last theorem: "Nothing left to dream there"?

For old fogies like me, there is still a lot of passion and excitement in the challenge of a game like Go. Some days, I get the urge to ditch my serious work -- work that matters to people in the world -- return to my roots, and write a program to play Go. Don't tell me it can't be done.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

August 14, 2009 3:13 PM

Whither Programming?

I've been thinking a lot about the Software Engineering course I'm teaching this fall, which commences a week from Tuesday. Along the way, I have been looking at a lot of books and articles with "software engineering" in the title. It's surprising how few of them get anywhere near code. I know I shouldn't be surprised. I remember taking courses on software engineering, and I've stayed close enough to the area all these years to know what matters. There are issues other than programming that we need to think about while building big systems. And there is plenty of material out there about programming and programming tools and the nuts and bolts of programming.

Still, I think it is important when talking about software engineering to keep in mind what the goal is: a working program, or collection of programs. When we forget that, it's too easy to spin off into a land of un-reality. It's also important to keep in mind that someone has to actually write code, or no software will ever be engineered. I hope that the course I teach can strike a good balance.

In the interest of keeping code in mind, I share with you an assortment of programming news. Good, bad, ugly, or fun? You decide.

Hiding the Rainforest. Mark Guzdial reports that Georgia Tech is eliminating yet another language from its computing curriculum. Sigh. Thought experiments notwithstanding, variety in language and style is good for programmers. On a pragmatic note, someone might want to tell the GT folks that programming for the JVM may soon look more like Lisp and feel more like ML than Java or C++.

Programming meets the Age of Twitter. A Processing programming contest with a twist: all programs must be 200 characters or less. I'll give extra credit for any program that is a legal tweet.

Power to the Programmer! Phil Windley enjoys saving $60 by writing his own QIF->CSV converter. But the real hero is the person who wrote Finance::QIF.
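
For the curious, the core of such a converter is small. Here is a minimal Ruby sketch (mine, nothing like Windley's Perl or Finance::QIF); it assumes only the common D/T/P record codes and the ^ record separator from the QIF format.

    require "csv"

    # Convert bank-style QIF records to CSV rows: date, amount, payee.
    # QIF is line-oriented: D = date, T = amount, P = payee, ^ ends a record.
    def qif_to_csv(qif_text)
      CSV.generate do |csv|
        csv << %w[date amount payee]
        record = {}
        qif_text.each_line do |line|
          line = line.chomp
          case line[0]
          when "D" then record[:date]   = line[1..-1]
          when "T" then record[:amount] = line[1..-1]
          when "P" then record[:payee]  = line[1..-1]
          when "^"
            csv << [record[:date], record[:amount], record[:payee]]
            record = {}
          end
        end
      end
    end

    sample = <<~QIF
      D10/23/2009
      T-42.17
      PBook Store
      ^
    QIF

    puts qif_to_csv(sample)   # date,amount,payee / 10/23/2009,-42.17,Book Store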

Why Johnny Can't Read Perl. Courtesy of Lambda the Ultimate comes news we all figured had to be true: a formal proof that Perl cannot be parsed. Who said the Halting Theorem wasn't useful? I guess I'll stop working on my refactoring browser for Perl.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

August 12, 2009 5:01 PM

One Giant Leap for Computing

[photo: Charles Duke walking on the moon, Apollo 16]

Last month, in honor of the Apollo 11 mission's fortieth anniversary, Google Code announced the open-sourcing of code for the mission's command and lunar modules. Very cool indeed. This project creates opportunities for many people. CS historians can add this code to their record of computing and can now study computing at NASA in a new way. Grady Booch will be proud. He has been working for many years on the task of preserving code and other historic artifacts of our discipline that revolve around software.

Software archeologists can study the code to find patterns and architectural decisions that will help us understand the design of software better. What we learn can help us do more, just as the Apollo 11 mission prepared the way for future visitors to the moon, such as Charles Duke of Apollo 16 (pictured here).

This code could help CS educators convince a few students to take assembly language programming seriously. This code isn't Java or even C, folks. Surely some young people are still mesmerized enough by space travel that they would want to dig into this code?

As a person who appreciates assembly-level programming but prefers working at a higher level, I can't help but think that it would be fun to reverse-engineer these programs to code at a more abstract level and then write compilers that could produce equivalent assembly that runs on the simulator. The higher-level programs created in this way would be a great way for us to record the architecture and patterns we find in the raw source.

Reading this code and about the project that surrounds it, I am in awe of the scale of the programming achievement. For a computer scientist, this achievement is beautiful. I'd love to use this code to share the excitement of computing with non-computer scientists, but I don't know how. It's assembly, after all. I'm afraid that most people would look at this code and say, "Um, wow, very impressive" while thinking, "Yet another example of how computer science is beyond me."

If only those people knew that many computer scientists feel the same way. We are in awe. At one level, we feel like this is way over our heads, too. How could these programmers have done so much with so little? Wow. But then we take a breath and realize that we have the tools we need to dig in and understand how this stuff works. Having some training and experience, we can step back from our awe and approach the code in a different way. Like a scientist. And anyone can have the outlook of a scientist.

When I wonder how the programmers of the 1960s could have done so much with so little, I feel another emotion, too: sheepishness. How far have we as a discipline progressed in forty years? Stepping back from the sheepishness, I can see that since that time programmers have created some astoundingly complex systems in environments as harsh as or harsher than the one the Apollo programmers faced. It's not fair to glorify the past more than it deserves.

But still... Wow. Revisiting this project forty years later ought to motivate all of us involved with computer science in 2009 -- software professionals, academics, and students -- to dream bigger dreams and tackle more challenging projects.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

August 10, 2009 2:57 PM

Scratching Itches and the Peril of Chapter 1

In my entry on my advice to students interested in web development, I quoted an alum who dared suggest that undergrads study one language in depth for four years. This seems extreme to me, but it is often useful to examine the axioms on which we base our curricula. Wade's idea fits well with the idea that we would like to teach students enough that they can scratch an itch. I think that this approach overvalues what we can learn from a single language, and any language that takes four years to master is probably the wrong language for undergrads. Even if we accept that our goal is to help students scratch their own itches in the service of motivation, can't we do better? I think so.

My goal here is not to add to the seemingly endless cacophony about which first language(s) we should use in CS1. But I was impressed by a comment about textbooks that Terry Chay made on the entry of his to which I linked in my "itch to scratch" post. Although he is a proponent of PHP for web development, he argues that it isn't suitable for people learning their first programming language. One of the reasons is that the best books about PHP assume too much background.

Consider Welling and Thomson's PHP and MySQL Web Development. It is about web programming and so assumes that students are familiar with HTML, starting and restarting Apache, activating PHP, installing and setting up MySQL, and all the editor- and OS-specific details of writing, storing, and running scripts. That's a lot of hidden prerequisites, and it is one of the challenges we face anytime we try to embed CS1 in a context. Context is context, and students have to have it before they move on.

However, Chay claims that, after its first chapter, PHP and MySQL Web Development "offers more 'immersion' gratification (at the least cost) than any other language's textbook." But it's too late. The first chapter is what beginners see first and what they must make sense of before moving on. Unfortunately,

It's that first chapter that does the first timer in.

Many people who don't teach CS1 and who have never tried writing for that audience underestimate just how important this is, and just how difficult an obstacle it is to overcome. I occasionally see really good books about programming written to solve problems in a specific domain or context that might work well for beginners -- but only as a second course, after someone has taught a course that gets them through the introduction.

Right now I have examination copies of three books sitting on my desk or desktop that are relevant to this discussion.

  • A Web-Based Introduction to Programming, by Mike O'Kane. I requested this when it was the first text book I'd seen that aimed to use PHP in context to teach a CS1 course. Often, books written specifically for CS1 lose the appeal and flavor of books written for motivated practitioners with an itch to scratch. Can this book be as good as the book Chay recommends?
  • Using Google App Engine: Building Web Applications, by Charles Severance. I've seen this book criticized for covering too many low-level details, but it aims to be a self-contained introduction to programming. The only way to do that is to cover all the knowledge usually assumed by Chapter 1. The combination of web applications and Google seems like a potential winner.
  • Practical Programming: An Introduction to Computer Science Using Python, by Campbell, Gries, Montojo, and Wilson. This book was motivated at least in part by Greg Wilson's efforts to teach programming to scientists. Unlike the previous two, Practical Programming uses several themes to introduce the ideas of CS and the programming tools needed to play with them. Will the same ideas work as well when brought to the CS1 level, outside of a single unifying context?

I'm disappointed that I haven't taken the time to study these in detail. I am familiar with drafts of Practical Programming after having reviewed them in the book's early stages and know it to be a well-written book. But that's not enough to say whether it works as well as I hope. Severance's book also promises big things, but I need to dig deeper to see how well it works. O'Kane's looks like the most traditional CS1 book of the bunch, with a twist: if-statements don't arrive until Chapter 7, and loops until Chapter 9.

Gotta make time! But then there is my own decidedly non-freshman course to think about. Fifteen days and counting...


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

August 07, 2009 2:18 PM

A Loosely-Connected Friday Miscellany

An Addition to My News Aggregator

Thanks to John Cook, I came across the blog of Dan Meyer, a high school math teacher. Cook pointed to an entry with a video of Meyer speaking pecha kucha-style at OSCON. One of the important messages for teachers conveyed in these five minutes is Be less helpful. Learning happens more often when people think and do than when they follow orders in a well-defined script.

While browsing his archive I came across this personal revelation about the value of the time he was spending on his class outside of the business day:

I realize now that the return on that investment of thirty minutes of my personal time isn't the promise of more personal time later. ... Rather it's the promise of easier and more satisfying work time now.

Time saved later is a bonus. If you depend on that return, you will often be disappointed, and that feeds the emotional grind that is teaching. Kinda like running in the middle. I think it also applies more than we first realize to reuse and development speed in software.

Learning and Doing

One of the underlying themes in Meyer's writing seems to be the same idea as in this line from Gerd Binnig, which I found at Physics Quote of Day:

Doing physics is much more enjoyable than just learning it. Maybe 'doing it' is the right way of learning ....

Programming can be a lot more fun than learning to program, at least the way we often try to teach it. I'm glad that so many people are working on ways to teach it better. In one sense, the path to better seems clear.

Knowing and Doing

One of the reasons I named my blog "Knowing and Doing" was that I wanted to explore the connection between learning, knowing, and doing. Having committed to that name so many years ago, I decided to stake its claim at Posterous, which I learned about via Jake Good. Given some technical issues with using NanoBlogger, at least an old version of it, I've again been giving some thought to upgrading or changing platforms. Like Jake, I'm always tempted to roll my own, but...

I don't know if I'll do much or anything more with Knowing and Doing at Posterous, but it's there if I decide that it looks promising.

A Poignant Convergence

Finally, a little levity laced with truth. Several people have written to say they liked the name of my recent entry, Sometimes, Students Have an Itch to Scratch. On a whim, I typed it into Translation Party, which alternately translates a phrase from English into Japanese and back until it reaches equilibrium. In only six steps, my catchphrase settles onto:

Sometimes I fear for the students.

Knowing how few students will try to scratch their own itches with their new-found power as a programmer, and how few of them will be given a chance to do so in their courses on the way to learning something valuable, I chuckled. Then I took a few moments to mourn.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development, Teaching and Learning

August 03, 2009 8:51 PM

Sometimes, Students Have an Itch to Scratch

Mark Guzdial recently wrote:

It's interesting how the contextualized approach [to teaching intro CS] impacted future student choice. Did we convince students that computing is interesting? I don't think so. Instead, we convinced students that computing was relevant to the things that they already had value for, that it had utility for them. Thus, we had students who were willing to take "More Media Computation" classes (as long as we kept explaining the utility), but not "More Computer Science" classes.

This came back to mind while I was reading Terry Chay's 5 million, which was recommended to me by alumnus Tony Bibbs in response to my recent request for assistance. While discussing how to recommend what language programmers should learn first, Chay wrote something in the same vein. I have cleaned up what I assume to be a typographical glitch in what he posted:

You know you can learn it in a classroom, but immersion is a much faster way to learn.

The best way to learn to program is to have an itch that needs scratching.

Together, these passages brought to mind advice that Alistair Cockburn once gave for introducing agile software development ideas into an organization. I recall his advice as this: Don't go into an organization talking about your solutions. Ask them about the problems they are having producing software. What causes them pain? Then look for a change they could make that would reduce or eliminate the pain. Oftentimes, an agile practice will be the ticket, and this will serve as an opportunity to help them do something that helps them, not something that merely pulls a play from the playbook you are trying to sell.

I once ran across a quote on a blog at JavaRanch that seems not to exist anymore, which talked about the change in mindset that should accompany adopting Alistair's advice:

Changing other people in ways that I deem appropriate, that's hard. Asking people how they want to change, and how I can help them change, that's easy. Why don't I do more of the latter?

Those of us who teach students to program and who hope to attract creative and interested minds to CS cannot rely just on scratching the itches that students have, but that does seem like a useful prong in a multi-pronged effort. As Mark points out, many students interested in programming within a context are really interested in that context, not in programming or CS more generally. That's okay. Teaching a broad set of people how to do media computation is valuable on its own. But there are students like the ones Terry Chay describes who will immerse themselves in programming to scratch their own itches and then find they want to go deeper or broader than the one context.

Even with all the thinking out loud I do here, I am not sure yet which students will be the ones who go all the way with CS or how we can begin to identify them. Perhaps the best thing we can do is to draw them in with their own interests and see what happens. Teaching a generic, CS-centric intro sequence is not the best way to reach all students, even the ones who come in thinking they want to do CS. Empowering students to solve problems that matter to them seems like a promising way for us to approach the issue.

One reader commented on my CS/basketball fantasy that a CS1 course built around an obsession with sports would be a frivolous waste of time. That is probably true, but I have seen a fair number of students over the years in our CS1 courses and in BASIC programming courses who invested significant numbers of hours into writing programs related to football, baseball, and basketball. I'm glad that those students engaged themselves with a programming language and set out to solve problems they cared about. If I could engage such students with my assignments, that would be an excellent use of our time in class, not a frivolous waste. I may not want to build an entire course around a particular student's interest in ranking NFL teams, but I am always looking for ways to incorporate student interests into what we need to do anyway.

Among other things, teachers need to keep in mind that students have itches, too. It never hurts to ask them every once in a while what those itches are.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 28, 2009 12:40 PM

CS in Everything: On the Hardwood

The economics blog Marginal Revolution has an occasional series of posts called "Markets in Everything", in which the writers report examples of markets at work in various aspects of everyday life. I've considered doing something similar here with computing, as a way to document some concrete examples of computational thinking and -- gasp! -- computer programs playing a role in how we live, work, and play. Perhaps this will be a start.

Courtesy of Wicked Teacher of the West, I came across this story about NBA player Shane Battier, who stands out in an unusual way: by not standing out with his stats. A parallel theme of the story is how the NBA's Houston Rockets are using data and computer analysis in an effort to maximize their chances of victory. The connection to Battier is that the traditional statistics we associate with basketball -- points, rebounds, assists, blocked shots, and the like -- do not reflect his value. The Rockets think that Battier contributes far more to their chance of winning than his stat line shows.

The Rockets collect more detailed data about players and game situations, and Battier is able to use it to maximize his value. He has developed great instincts for the game, but he is an empiricist at heart:

The numbers either refute my thinking or support my thinking, and when there's any question, I trust the numbers. The numbers don't lie.

For an Indiana boy like myself, nothing could be more exciting than knowing that the Houston Rockets employ a head of basketball analytics. This sort of data analysis has long been popular among geeks who follow baseball, a game of discrete events in which the work of Bill James and like-minded statistician-fans of the American Pastime finds a natural home. I grew up a huge baseball fan and, like all boys my age, lived and died on the stats of my favorite players. But Indiana is basketball country, and basketball is my first and truest love. Combining hoops with computer science -- could there be a better job? There is at least one guy living the dream, in Houston.

I have written about the importance of solving real problems in CS courses, and many people are working to redefine introductory CS to put the concepts and skills we teach into context. Common themes include bioinformatics, economics, and media computation. Basketball may not be as important as sequencing the human genome, but it is real and it matters to enough people to support a major entertainment industry. If I were willing to satisfy my own guilty pleasures, I would design a CS1 course around Hoosier hysteria. Even if I don't, it's comforting to know that some people are beginning to use computer science to understand the game better.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

July 08, 2009 4:38 PM

Miscellaneous Notes on Using Computers

Good Question

Last week, I gave a talk on careers in computing to thirty or so high school kids in a math and science program on campus this summer. Because it's hard to make sense out of computing careers if one doesn't even know what computer science is, I started off with half an hour or so talking about CS. Part of that was distinguishing between discovering things, creating things, and studying things.

At the end, we had time for the usual question-and-answer session. The first question came from a young man who had looked quite disinterested throughout the talk: What is the most important thing you have discovered or invented?

Who says kids don't pay attention?

The Age of Fire

Yesterday, I took my laptop with me to do advising at freshman orientation. It allows me to grab course enrollment data off the university web site (processed, but raw enough), rather than look at the print-outs the advising folks provide every morning. With that data and little more than grep and sorting on columns, I can find courses for my students much more easily than thumbing back and forth in the print-outs. And the results are certainly of a higher quality than my thumbing would give.
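
(For the curious, the script amounts to little more than this. The sketch below is in Python rather than grep, and the file layout and column positions are made up for illustration, not the registrar's actual format.)

    # a rough sketch: keep the lines for our department's courses and
    # sort them by a (hypothetical) meeting-time column
    def our_courses(lines, prefix="CS", time_column=3):
        rows = [line.split() for line in lines if line.startswith(prefix)]
        return sorted(rows, key=lambda row: row[time_column])

    # with open("enrollment.txt") as data:
    #     for row in our_courses(data):
    #         print(" ".join(row))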

The looks on the other advisors' faces at our table made me think of how a group of prehistoric men must have looked when one of their compatriots struck two rocks together to make fire.

Computer Science's Dirty Little Secret

An alumnus sent me a link to an MSNBC article about Kodu, a framework for building Xbox-like games aimed at nine-year-olds.

I like how Matthew MacLaurin, lead developer, thinks:

MacLaurin ... says he hopes it doesn't just teach programming, but teaches us to appreciate programming as a modern art form.

(Emphasis added.)

The piece talks about "the growing importance of user-generated content in gaming" and how most people assume "that all of the creativity in video games takes place in the graphics and art side of the gaming studios, while the programming gets done by a bunch of math guys toiling over dry code." Author Winda Benedetti writes (emphasis added):

I had asked [MacLaurin] if [Kodu] was like putting chocolate on broccoli -- a means of tricking kids into thinking the complex world of programming was actually fun.

But he insists that's not the case at all.

"It's teaching them that it was chocolate the whole time, it just looked like a piece of broccoli," he explains. "We're really saying that programming is the most fun part of creating games because of the way it surprises you. You do something really simple, and you get something really complex and cool coming back at you."

Programming isn't our dirty little secret. It is a shining achievement.

Afterthoughts

I am still amazed when lay people react to my using a computer to solve daily problems as if I had brought a computation machine from the future. Shocking! Yes, I actually use it to compute. The fact that people are surprised even when a computer scientist uses it that way should help us keep in mind just how little people understand what computer science is and what we can do with it.

Have an answer to the question, "What is the most important thing you have made?" ready at hand, and suitable for different audiences. When someone asks, that is the moment when you might be able to change a mind.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 07, 2009 4:39 PM

What Remains Is What Will Matter

Quoted by Harry Lewis in Excellence Without a Soul:

A liberal education is what remains after you have forgotten the facts that were first learned while becoming educated.
-- Jorge Dominguez

I think this applies not only to a liberal education broadly construed but also to specialized areas of study -- and even to a "narrow" technical field such as computer science. What is left five or ten years from now will be the education our students have received. Students may not remember the intricacies of writing an equals method in Java. I won't mind one bit. What will they remember? This is the true test of the courses we create and of the curricula we design. Let's set our sights high enough to hit the target we seek.

Lately I've been trying to swear off scare quotes and other writing affectations. I use them above with sincere intention. Computer science is not as narrow as most people think. Students usually think it is, and so do many of their parents. I hope that what we teach and do alleviates this misconception. Sadly, too often those of us who study computer science -- and teach it -- think of the discipline too narrowly. We may not preach it that way, but we often practice it so.

With good courses, a good curriculum, and a little luck, students may even remember some of their CS education. I enjoyed reading how people like Tim O'Reilly have been formed by elements of their classical education. How are we forming our students in the spirit of a classical CS education? If any discipline needs to teach enduring truths, it is ours! The details disappear with every new chip, every new OS, every new software trend.

What is most likely to remain from our stints in school are habits. Sure, CS students must take with them some facts and truths: trade-offs matter; in some situations, the constant dominates the polynomial; all useful programming languages have primitives, means for combining them, and means for abstracting away detail. Yes, facts matter, but our nature is tied to our habits. I said last time that publishing the data I collect and use would be a good habit because habits direct how we think. I am a pragmatist in the strong sense that knowledge is habit of thought. Habit of action creates habit of thought. Knowledge is not the only value born in habit. As Aristotle taught us,

Excellence is an art won by training and habituation. We do not act rightly because we have virtue or excellence, but rather we have those because we have acted rightly.

Even an old CS student can remember some of his liberal arts education...

Finally, we will do well to remember that students learn as much or more from the example we set as from what we say in the classroom, or even in our one-on-one mentoring. All the more reason to create habits of action we don't mind having our students imitate.

~~~~

Note. Someone might read Excellence Without a Soul and think that Harry Lewis is a classicist or a humanities scholar. He is a computer scientist, who just happened to spend eight years as Dean of Harvard College. Dominguez, whom Lewis quotes, is a political science professor at Harvard, but he claims to be paraphrasing Alfred North Whitehead -- a logician and mathematician -- in the snippet above. Those narrow technical guys...

My favorite Lewis book is, in fact, a computer science book, Elements of the Theory of Computation, which I mentioned here a while back. I learned theory of computation from that book -- as well as a lot of basic discrete math, because my undergrad CS program didn't require a discrete course. Often, we learn well enough what we need to learn when we need it. Elements remains one of my favorite CS books ever.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 06, 2009 3:26 PM

Cleaning Data Off My Desk

As I mentioned last time, this week I am getting back to some regular work after mostly wrapping up a big project, including cleaning off my desk. It is cluttered with a lot of loose paper that the Digital Age had promised to eliminate. Some is my own fault: paper copies of notes and agendas I should probably find a way not to print. Old habits die hard.

But I also have a lot of paper sent to me as department head. Print-outs -- old-style print-outs from a mainframe. The only thing missing from a 1980s flashback is the green bar paper.

Some of these print-outs are actually quite interesting. One set is of grade distribution reports produced by the registrar's office, which show how many students earned As, Bs, and so on in each course we offered this spring and for each instructor who taught a course in our department. This sort of data can be used to understand enrollment figures and maybe even performance in later courses. Some upper administrators have suggested using this data in anonymous form as a subtle form of peer pressure, so that profs who are outliers within a course might self-correct their own distributions. I'm not ready to think about going there yet, but the raw data seems useful, and interesting in its own right.

I might want to do more with the data. This is the first time I recall receiving this, but in the fall it would be interesting to cross-reference the grade distributions by course and instructor. Do the students who start intro CS in the fall tend to earn different grades than those who start in the spring? Are there trends we can see over falls, springs, or whole years? My colleagues and I have sometimes wondered aloud about such things, but having a concrete example of the data in hand has opened new possibilities in my mind. (A typical user am I...)

As a programmer, I have the ability to do such analyses with relatively straightforward scripts, but I can't. The data is closed. I don't receive actual data from the registrar's office; I receive a print-out of one view of the data, determined by people in that office. Sadly, this data is mostly closed even to them, because they are working with an ancient mainframe database system for which there is no support and a diminishing amount of corporate memory here on campus. The university is in the process of implementing a new student information system, which should help solve some of these problems. I don't imagine that people across campus will have much access to this data, though. That's not the usual M.O. for universities.
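
To give a sense of what I mean by straightforward, here is a sketch of the kind of script I have in mind, assuming the registrar could hand me even a flat CSV file. (The column names here are hypothetical, not the registrar's.)

    import csv
    from collections import Counter, defaultdict

    def grade_distributions(path):
        """Tally grades per (course, term) from a hypothetical CSV export."""
        tallies = defaultdict(Counter)
        with open(path, newline="") as export:
            for row in csv.DictReader(export):
                tallies[(row["course"], row["term"])][row["grade"]] += 1
        return tallies

    # e.g., compare fall and spring offerings of intro CS:
    # dist = grade_distributions("grades.csv")
    # print(dist[("CS1", "Fall 2008")])
    # print(dist[("CS1", "Spring 2009")])

Cross-referencing falls, springs, or whole years is then just a matter of which keys we tally on.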

Course enrollment and grade data aren't the only ones we could benefit from opening up a bit. As a part of the big project I just wrapped up, the task force I was on collected a massive amount of data about expenditures on campus. This data is accessible to many administrators on campus, but only through a web interface that constrains interaction pretty tightly. Now that we have collected the data, processed almost all of it by hand (the roughness of the data made automated processing an unattractive alternative), and tabulated it for analysis, we are starting to receive requests for our spreadsheets from others on campus. These folks all have access to the data, just not in the cleaned-up, organized format into which we massaged it. I expressed frustration with our financial system in a mini-rant a few years ago, and other users feel similar limitations.

For me, having enrollment and grade data would be so cool. We could convert data into information that we could then use to inform scheduling, teaching assignments, and the like. Universities are inherently information-based institutions, but we don't always put our own understanding of the world into practice very well. Constrained resources and intellectual inertia slow us down or stop us altogether.

Hence my wistful hope while reading Tim Bray's "Hello-World" for Open Data. Vancouver has a great idea:

  • Publish the data in a usable form.
  • License it in a way that turns people loose to do whatever they want, but doesn't create unreasonable liability risk for the city.
  • See what happens. ...

Would anyone on campus take advantage? Maybe, maybe not. I can imagine some interesting mash-ups using only university data, let alone linking to external data. But this isn't likely to happen. GPA data and instructor data are closely guarded by departments and instructors, and throwing light on it would upset enough people that any benefits would probably be shouted down. But perhaps some subset of the data the university maintains, suitably anonymized, could be opened up. If nothing else, transparency sometimes helps to promote trust.

I should probably do this myself, at the department level, with data related to schedule, budget, and so on. I occasionally share the spreadsheets I build with the faculty, so they can see the information I use to make decisions. This spring, we even discussed opening up the historic formula used in the department to allocate our version of merit pay.

(What a system that is -- so complicated that I've feared making more than small editorial changes to it in my time as head. I keep hoping to find the time and energy to build something meaningful from scratch, but that never happens. And it turns out that most faculty are happy with what we have now, perhaps for "the devil you know" reasons.)

I doubt even the CS faculty in my department would care to have open data of this form. We are a small crew, and they are busy with the business of teaching and research. It is my job to serve them by keeping as much of this sort of thinking out of their way as I can. Then again, who knows for sure until we try? If the cost of sharing can be made low enough, I'll have no reason not to share. But whether anyone uses the data might not even be the real point. Habits change when we change them, when we take the time to create new ones to replace the old ones. This would be a good habit for me to have.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Managing and Leading

June 26, 2009 4:01 PM

The Why of X

Where did the title of my previous entry come from? Two more quick hits tell a story.

Factoid of the Day

On a walk the other night, my daughter asked why we called variables x. She is reviewing some math this summer in preparation to study algebra this fall. All I could say was, "I don't know."

Before I had a chance to look into the reason, one explanation fell into my lap. I was reading an article called The Shakespeare of Iran, which I ran across in a tweet somewhere. And there was an answer: the great Omar Khayyam.

Omar was the first Persian mathematician to call the unknown factor of an equation (i.e., the x) shiy (meaning thing or something in Arabic). This word was transliterated to Spanish during the Middle Ages as xay, and, from there, it became popular among European mathematicians to call the unknown factor either xay, or more usually by its abbreviated form, x, which is the reason that unknown factors are usually represented by an x.

However, I can't confirm that Khayyam was first. Both Wikipedia and another source report the Arabic-language connection, and the latter mentions Khayyam, but not specifically as the source. That author also notes that "xenos" is the Greek word for "unknown" and so could be the root. Then again, I haven't found a reference for this use of x that predates Khayyam, either. So maybe.

My daughter and I ended up with as much of a history lesson as a mathematical terminology lesson. I like that.

Quote of the Day

Yesterday afternoon, the same daughter was listening in on a conversation between me and a colleague about doing math and science, teaching math and science, and how poorly we do it. After we mentioned K-12 education and how students learn to think of science and math as "hard" and "for the brains", she joined the conversation with:

Don't ask teachers, 'Why?' They don't know, and they act like it's not important.

I was floored.

She is right, of course. Even our elementary school children notice this phenomenon, drawing on their own experiences with teachers who diminish or dismiss the very questions we want our children to ask. Why? is the question that makes science and math what they are.

Maybe the teacher knows the answer and doesn't want to take the time to answer it. Maybe she knows the answer but doesn't know how to answer it in a way that a 4th- or 6th- or 8th-grader can understand. Maybe he really doesn't know the answer -- a condition I fear happens all too often. No matter; the damage is done when the teacher doesn't answer, and the child figures the teacher doesn't know. Science and math are so hard that the teacher doesn't get it either! Better move on to something else. Sigh.

This problem doesn't occur only in elementary school or high school. How often do college professors send the same signal? And how often do college professors not know why?

Sometimes, truth hits me in the face when I least expect it. My daughters keep on teaching me.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

June 17, 2009 9:48 PM

Another Connection to Journalism

a picture of Dave Winer

While reading about the fate of newspapers prior to writing my recent entry on whether universities are next, I ran across a blog entry by Dave Winer called If you don't like the news.... Winer had attended a panel discussion at the UC-Berkeley school of journalism. After hearing what he considered the standard "blanket condemnation of the web" by the journalists there, he was thinking about all the blogs he would love to have shown them -- examples of experts and citizens alike writing about economics, politics, and the world; examples of a new sort of journalism, made possible by the web, which give him hope for the future of ideas on the internet.

Here is the money quote for me:

I would also say to the assembled educators -- you owe it to the next generations, who you serve, to prepare them for the world they will live in as adults, not the world we grew up in. Teach all of them the basics of journalism, no matter what they came to Cal to study. Everyone is now a journalist. You'll see an explosion in your craft, but it will cease to be a profession.

Replace "journalism" with "computer science", and "journalist" with "programmer", and this statement fits perfectly with the theme of much of this blog for the past couple of years. I would be happy to say this to my fellow computer science educators: Everyone should now be a programmer. We'll see an explosion in our craft.

Will programming cease to be a profession? I don't think so, because there is still a kind and level of programming that goes beyond what most people will want to do. Some of us will remain the implementors of certain tools for others to use, but more and more we will empower others to make the tools they need to think, do, and maybe even play.

Are academic computer scientists ready to make this shift in mindset? No more so than academic journalists, I suspect. Are practicing programmers? No more so than practicing journalists, I suspect.

Purely by happenstance, I ran across another quote from Winer this week, one that expresses something about programming from the heart of a programmer:

i wasn't born a programmer.
i became one because i was impatient.
-- @davewiner

I suspect that a lot of you know just what he means. How do we cultivate the right sort of impatience in our students?


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 04, 2009 8:38 PM

The Next 700 ...

I have regarded it as the highest goal
of programming language design to enable
good ideas to be elegantly expressed.
-- Tony Hoare, The Emperor's Old Clothes

One of computing's pioneers has passed. Word came this morning from Queen Mary, University of London, that Peter Landin died yesterday. Landin is perhaps not as well known on this side of the pond as he should be. His university web page lists his research interests as "logic and foundations of programming; theoretical computations; programming foundations; computer languages", but it doesn't say much about his seminal contributions in those areas. Nor does it mention his role helping to establish computer science as a discipline in the UK.

In the world of programming languages, though, Landin is well-known. He was one of the early researchers influenced by McCarthy's Lisp, and he helped to develop the connection between the lambda calculus and the idea of a programming language. In turn, he influenced Tony Hoare and Hoare's creation of Quicksort. This followed his involvement in the design of Algol 60, which introduced recursion to a wider world of computer scientists and programmers. Algol 60 was in many ways the alpha of modern programming languages.

I am probably like many computer scientists in having read only one of Landin's papers, The Next 700 Programming Languages. I remember first running across this paper while studying functional languages a decade or so ago. Its title intrigued me, and its publication date -- July 1965 -- made me wonder just what he could mean by it. I was blown away. He distinguished among four different levels of features that denote a language: physical, logical, abstract, and "applicative expressions". The last of these abstracted even more grammatical detail away from what many of us tend to think of as the abstract syntax tree of a program. He also wrote about the role of where clauses in specifying local bindings naturally, just as mathematicians have long done.

Before reading this paper, I had never seen a discussion of the physical appearance of programs written in a language. Re-reading the paper now, I had forgotten that Landin used an analogy to soccer, the off-side rule, to define a class of physical appearances in which indentation mattered. After we as a discipline left the punch card behind, for many years it was unstylish at best, and heresy at worst, for whitespace to matter in programming language design. These days, languages such as Python and Haskell have sidestepped this tradition and put whitespace to good use.
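
For readers who have never seen the off-side rule in action, here is a tiny Python illustration (my example, not Landin's). The indentation alone determines which statements belong to the loop:

    total = 0
    for n in [1, 2, 3]:
        total += n       # indented: part of the loop body
    print(total)         # not indented: runs once, after the loop; prints 6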

On a lighter note, Landin also coined the term syntactic sugar, a fact I learned only while reading about Landin after his passing. What whimsy! A good name is sometimes worth as much as a good idea. I join Hoare in praising Landin for showing him the elegance of recursion, but also reserve a little laud for giving us such a wonderful term to use when talking about the elegance of small languages.

This isn't quite an END DO moment for me. I heard about John Backus throughout my undergrad and graduate careers, and his influence on the realization of the compiler has affected me deeply for as long as I've been a student of computer science. I came to Landin later, through his theoretical contributions. Yet it's interesting that they shared a deep appreciation for functional languages. For much of the discipline's history, functional programming has remained within the province of the academic, looked upon disdainfully by practitioners as impractical, abstract, and a distraction from the real business of programming. Now the whole programming world is atwitter with new languages, and new features for old languages, that draw on the abstraction, power, and beauty of functional programming. The world eventually catches up with the ideas of visionaries.

An e-mail message from Edmund Robinson shared the sad news of Landin's passing with many people. In it, Robinson wrote:

The ideas in his papers were truly original and beautiful, but Peter never had a simplistic approach to scientific progress, and would scoff at the idea of individual personal contribution.

Whatever Landin's thoughts about "the idea of individual personal contribution", the computing world owes him a debt for what he gave us. Read "The Next 700 Programming Languages" in his honor. I am reading it again and plan next to look into his other major papers, to see what more he has to teach me.


Posted by Eugene Wallingford | Permalink | Categories: Computing

May 26, 2009 7:18 PM

The Why of Lambda

The Lambda Chair

For the programming languages geek who has everything, try the Lambda Chair. The company took the time to name it right, then sadly took the promotional photo from the wrong side. Still, an attractive complement to any office -- and only $2,000. Perfect for my university budget!

I found out about this chair on the PLT mailing list today. The initial frivolity led to an interesting excursion into history when someone asked:

Does anyone know if Church had anything in mind for lambda to stand for, or was it just an arbitrary choice?

In response, Matthias Felleisen shared a story that is similar to one I'd heard in the languages community before. At the beginning of the last century, mathematicians used ^ to indicate class abstractions, such as î : i is prime. Church used ^`, the primed version of the hat, to indicate function abstraction, because a function is a special kind of set. Church's secretary read this notation as λ, and Church let it stand.

Later in the thread, Dave Herman offered pointers to a couple of technical references that shed further light on the origin of lambda. On Page 7 of History of Lambda-Calculus and Combinatory Logic, Cardone and Hindley cite Church himself:

(By the way, why did Church choose the notation "λ"? In [Church, 1964, §2] he stated clearly that it came from the notation "î" used for class-abstraction by Whitehead and Russell, by first modifying "î" to "∧i" to distinguish function-abstraction from class-abstraction, and then changing "∧" to "λ" for ease of printing. This origin was also reported in [Rosser, 1984, p.338]. On the other hand, in his later years Church told two enquirers that the choice was more accidental: a symbol was needed and "λ" just happened to be chosen.)

The two internal references are to an unpublished letter from Church to Harald Dickson, dated July 7, 1964, and to J. B. Rosser's 1984 paper Highlights of the History of the Lambda Calculus from the Annals of the History of Computing.

Herman also pointed to Page 182 of The Impact of the Lambda Calculus in Logic and Computer Science:

We end this introduction by telling what seems to be the story how the letter 'λ' was chosen to denote function abstraction. In [100] Principia Mathematica the notation for the function f with f(i) = 2i + 1 is 2 î + 1. Church originally intended to use the notation î.2i + 1. The typesetter could not position the hat on top of the i and placed it in front of it, resulting in ∧i.2i + 1. Then another typesetter changed it into λi.2i + 1.

(I changed the variable x to an i in the preceding paragraph, because, much like the alleged trendsetting typesetter, I don't know how to position the circumflex on top of an x in HTML!)

Even in technical disciplines, history can be an imprecise endeavor. Still, it's fun when we go from anecdote to a more reliable source. I don't know that I'll ever need to tell the story of the lambda, but I like knowing it anyway.


Posted by Eugene Wallingford | Permalink | Categories: Computing

May 22, 2009 4:05 PM

Parsing Expression Grammars in the Compiler Course

Yesterday, a student told me about the Ruby gem Treetop, a DSL for writing language grammars. This DSL is based on parsing expression grammars, which turn our usual idea of a grammar inside out. Most compiler theory is built atop the context-free and regular grammars of Chomsky. These grammars are generative: they describe rules that allow us to create strings which are part of the language. Parsing expression grammars describe rules that allow us to recognize strings which are part of the language.

This new kind of grammar offers a lot of advantages for working with programming languages, such as unifying lexical and syntactic descriptions and supporting the construction of linear-time parsers. I remember seeing Bryan Ford talk about packrat parsing at ICFP 2002, but at that point I wasn't thinking as much about language grammars and so didn't pay close attention to the type of grammar that underlay his parsing ideas.

While generative grammars are a fundamental part of computing theory, they don't map directly onto the primary task for which many software people use them: building scanners and parsers for programming languages. Our programs recognize strings, not generate them. So we have developed mechanisms for building and even generating scanners and parsers, given grammars that we have written under specific constraints and then massaged to fit our programming mechanisms. Sometimes the modified grammars aren't as straightforward as we might like. This can be a problem for anyone who comes to the grammar later, as well as a problem for the creators of the grammar when they want to change it in response to changes requested by users.

A recognition-based grammar matches our goals as compiler writers more closely, which could be a nice advantage. A parsing expression grammar is, in effect, a more explicit specification of the recognizer code we will write against it.
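
To make the recognition-based view concrete, here is a small sketch in Python rather than Treetop. The grammar and function names are mine, invented for illustration: each rule becomes a function that either consumes a prefix of the input and reports how far it got, or fails.

    def match_term(s, pos):
        """term <- [0-9]+ : consume one or more digits, or fail with None."""
        start = pos
        while pos < len(s) and s[pos].isdigit():
            pos += 1
        return pos if pos > start else None

    def match_expr(s, pos=0):
        """expr <- term ('+' term)* : consume a sum of terms, or fail with None."""
        pos = match_term(s, pos)
        if pos is None:
            return None
        while pos < len(s) and s[pos] == "+":
            nxt = match_term(s, pos + 1)
            if nxt is None:      # the ('+' term) group fails; the repetition stops
                break
            pos = nxt
        return pos

    print(match_expr("12+34"))   # 5: the whole string is recognized
    print(match_expr("12+"))     # 2: recognizes "12" and leaves the dangling '+'
    print(match_expr("+12"))     # None: no leading term

Notice how directly the functions mirror the rules they recognize; that is the kind of correspondence I would like my students' hand-written parsers to have.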

For those of us who teach compiler courses, something like a parsing expression grammar raises another question. Oftentimes, we hope that the compiler course can do double duty: teach students how to build a compiler, and help them to understand the theory, history, and big issues of language processors. I think of this as a battle between two forces, "learning to" versus "learning about", a manifestation of epistemology's distinction between "knowing that" and "knowing how".

Using recognition-based grammars as the foundation for a compiler course introduces a trade-off: students may be empowered more quickly to create language grammars and parsers but perhaps not learn as much about the standard terminology and techniques of the discipline. These standard ways are, of course, our historical ways of doing things. There is much value in learning history, but at what point do we take the step forward to techniques that are more practical than reminiscent?

This is a choice that we have to make all the time in a compiler course: top-down versus bottom-up parsing, table-driven parsers versus recursive-descent parsers, writing parsers by hand versus using parser generators... As I've discussed here before, I still ask students to write their parser by hand because I think the experience of writing this code teaches them more than just about compilers.

Now that I have been re-introduced to this notion of recognition-based grammars, I'm wondering whether they might help me to balance some of the forces at play more satisfactorily. Students would have the experience of writing a non-trivial parser by hand, but against a grammar that is more transparent and easier to work with. I will play with parsing expression grammars a bit in the next year or so and consider making a change the next time I teach the course. (If you have taught a compiler course using this approach, or know someone who has, please let me know.)

Going this way would not commit me to having students write their parsers by hand. The link that started this thread of thought points to a tool for automating the manipulation of parsing expression grammars. Whatever I do, I'll add that tool to the list of tools I share with students.

Oh, and a little Ruby Love to close. Take a look at Treetop. Its syntax is beautiful. A Treetop grammar reads cleanly, crisply -- and is executable Ruby code. This is the sort of beauty that Ruby allows, even encourages, and is one of the reasons I remain enamored of the language.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 14, 2009 9:10 PM

Computer as Medium

While waiting for a school convocation to start last night, I was digging through my bag looking for something to read. I came across a print-out of Personal Dynamic Media, which I cited in my entry about Adele Goldberg. I gladly re-re-read it. This extended passage is a good explanation of the idea that the digital computer is more than a tool -- that it is a medium, and a more powerful medium than any other we humans have created and used:

"Devices" which variously store, retrieve, or manipulate information in the form of messages embedded in a medium have been in existence for thousands of years. People use them to communicate ideas and feelings both to others and back to themselves. Although thinking goes on in one's head, external media serve to materialize thoughts and, through feedback, to augment the actual paths the thinking follows. Methods discovered in one medium provide metaphors which contribute new ways to think about notions in other media.

For most of recorded history, the interactions of humans with their media have been primarily nonconversational and passive in the sense that marks on paper, paint on walls, even "motion" pictures and television, do not change in response to the viewer's wishes. A mathematical formulation -- which may symbolize the essence of an entire universe -- once put down on paper, remains static and requires the reader to expand its possibilities.

Every message is, in one sense or another, a simulation of some idea. It may be representational or abstract. The essence of a medium is very much dependent on the way messages are embedded, changed, and viewed. Although digital computers were originally designed to do arithmetic computation, the ability to simulate the details of any descriptive model means that the computer, viewed as a medium itself, can be all other media if the embedding and viewing methods are sufficiently well provided. Moreover, this new "metamedium" is active -- it can respond to queries and experiments -- so that the messages may involve the learner in a two-way conversation. This property has never been available before except through the medium of an individual teacher. We think the implications are vast and compelling.

I agree. But after reading this paper again, all I can think is: no wonder Kay is so disappointed by what we are doing in the world of computing in 2009. Look at what he, Goldberg, and their team were doing back in the 1970s, with technology that seems so primitive to us now -- not only the interactivity of the medium they were creating, but also the creations of the people working in the medium, even elementary school students. Even if I think only in terms of how they viewed and created language... We have not done a good job living up to the promise of that work.

If you are eager to embrace this promise, perhaps you will be inspired by this passage from the Education in the Digital Age interview I mentioned in Making Language:

Music is in the person.
An instrument amplifies it.
The computer is like that.

How are you using the computer to amplify the music inside of you? What can you do to help the computer amplify what is inside others?


Posted by Eugene Wallingford | Permalink | Categories: Computing

May 06, 2009 4:16 PM

Making Language

I've been catching up on some reading while not making progress on other work. I enjoyed this interview with Barbara Liskov, which discusses some of the work that earned her the 2008 Turing Award. I liked this passage:

I then developed a programming language that included this idea [of how to abstract away from the details of data representation in programs]. I did that for two reasons: one was to make sure that I had defined everything precisely because a programming language eventually turns into code that runs on a machine so it has to be very well-defined; and then additionally because programmers write programs in programming languages and so I thought it would be a good vehicle for communicating the idea so they would really understand it.

Liskov had two needs, and she designed a language to meet them. First, she needed to know that her idea for how to organize programs was sound. She wanted to hold herself accountable. A program is an effective way to implement an idea and show that it works as described. In her case, her idea was about _writing_ programs, so she created a new language that embodied the idea and wrote a processor for programs written in that language.

Second, she needed to share her idea with others. She wanted to teach programmers to use her idea effectively. To do that, she created a language. It embodied her ideas about encapsulation and abstraction in language primitives that programmers could use directly. This made it possible for them to learn how to think in those terms and thus produce a new kind of program.

This is a great example of what language can do, and why having the power to create new languages makes computer science different. A program is an idea and a language is a vehicle for expressing ideas. We are only beginning to understand what this means for how we can learn and communicate. In the video Education in the Digital Age, Alan Kay talks about how creating a new language changes how we learn:

The computer allows us to put what we are thinking into a dynamic language and probe it in a way we never could before.

We need to find a way to help CS students see this early on so that they become comfortable with the idea of creating languages to help themselves learn. Mark Guzdial recently said much the same thing: we must help students see that languages are things you build, not just use. Can we introduce students to this idea in their introductory courses? Certainly, under the right conditions. One of my colleagues loves to use small BASIC-like interpreters in his intro course or his assembly language courses. This used to be a common idea, but as curricula and introductory programming languages have changed over time, it seems to have fallen out of favor. Some folks persist, perhaps with a simple command language. But we need to reinforce the idea throughout the curriculum. This is less a matter of course content than the mindset of the instructor.
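
As a sketch of how little it takes, here is the sort of toy command language I have in mind -- a few lines of Python with a made-up syntax of my own, not my colleague's actual interpreter:

    def run(program, env=None):
        """Interpret lines like 'set x 5', 'add x 3', 'print x'."""
        env = {} if env is None else env
        for line in program.splitlines():
            parts = line.split()
            if len(parts) < 2:
                continue                    # skip blank or malformed lines
            op, name = parts[0], parts[1]
            if op == "set":
                env[name] = int(parts[2])
            elif op == "add":
                env[name] += int(parts[2])
            elif op == "print":
                print(env[name])
        return env

    run("set x 5\nadd x 3\nprint x")        # prints 8

Even something this small gives students a concrete place to talk about parsing, environments, and what it would take to add a new kind of statement.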

After reading so much recently about Liskov, I am eager to spend some time studying CLU. I heard of CLU as an undergraduate but never had a chance for in-depth study. Even with so many new languages to dive into, I still have an affinity for older languages and for original literature on many CS topics. (If I were in the humanities, I would probably be a classicist, not a scholar of modern lit or pop culture...)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

April 28, 2009 9:27 AM

Follow Up to "Bug or Feature"

Thanks to several of you who have pointed out that Scheme behaves exactly like Python on the example in my previous post:

    > (define (f x) (if (> x 0) (f (- x 1)) 0))
    > (define g f)
    > (define (f x) x)
    > (g 5)
    4

Sigh. Indeed, it does. That is what I get for writing code for a blog and not test-driving it first. As an instructor, I learned long ago about the dangers of making claims about code that I have not executed and explored a bit -- even seemingly simple code. Now I can re-learn that lesson here.

The reason this code behaves as it does in both Scheme and Python is that the recursive reference to f doesn't involve a closure at all. It is a free variable that must be looked up in the top-level environment when the function is executed.

While writing that entry, I was thinking of a Scheme example more like this:

    (define f
      (lambda (x)
        (letrec ((f (lambda (x)
                      (if (> x 0)
                          (f (- x 1))
                          0))))
          (f x))))

... in which the recursive call (f (- x 1)) takes place in the context of a local binding. It is also a much more complex piece of code. I do not know whether there is an idiomatic Python program similar to this or to my earlier Scheme example:

    (define f
      (let ((i 100))
        (lambda (x)
           (+ x i))))
    (define i 1)
    (f 1)

If there is, I suspect that real Python programmers would say that they simply don't program in this way very often. As van Rossum's The History of Python blog points out, Python was never intended as a functional language, even when it included features that made functional-style programming possible. So one ought not be too surprised when a purely functional idiom doesn't work out as expected. Whether that is better or worse for students who are learning to program in Python, I'll leave up to the people responsible for Python.
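
For what it's worth, here is one rough analogue of that last Scheme example, made by nesting the definition inside a function so that i really is a local variable. (This is my own sketch; whether a Python programmer would call it idiomatic is another matter.)

    def make_f():
        i = 100
        def f(x):
            return x + i     # refers to the enclosing local i, not the global one
        return f

    f = make_f()
    i = 1
    print(f(1))              # prints 101, just as the Scheme version does

Here the inner function really does close over the local i, so the later global i has no effect.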

One can find all kinds of discussion on the web about whether Python closures are indeed "broken" or not, such as here (the comments are as interesting as the article). Until I have a little more time to dig deeper into the design and implementation of Python, though, I will have to accept that this is just one of those areas where I am not keeping up as well as I might like. But I will get to the bottom of this.
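
For readers who have not waded through those discussions, one of the examples most often cited as evidence of "broken" closures involves closing over a loop variable. A small sketch of the usual surprise (my example, not drawn from the article):

    # Each lambda closes over the variable i itself, not its value at creation
    # time, so by the time we call them, i holds its final value.
    fs = [lambda: i for i in range(3)]
    print([f() for f in fs])          # [2, 2, 2], not [0, 1, 2]

    # The usual workaround freezes the current value in a default argument.
    fs = [lambda i=i: i for i in range(3)]
    print([f() for f in fs])          # [0, 1, 2]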

Back to the original thought behind my previous post: It seems that Python is not the language to use as an example of dynamic scope. Until I find something else, I'll stick with Common Lisp and maybe sneak in a little Perl for variety.


Posted by Eugene Wallingford | Permalink | Categories: Computing

April 27, 2009 7:24 PM

Dynamic Scope as Bug or Feature

When I teach programming languages, we discuss the concepts of static and dynamic scoping. Scheme, like most languages these days, is statically scoped. This means that a variable reference resolves to the binding that existed where the code containing it was created, not where it is called. For example,

> (define f
    (let ((i 100))
      (lambda (x)
        (+ x i))))
> (define i 1)
> (f 1)
101

This displays 101, not 2, because the reference to i in the body of function f is to the local variable i that exists when the function was created, not to the i that exists when the function is called. If the interpreter looked to the calling context to find its binding for i, that would be an example of dynamic scope, and the interpreter would display 2 instead.

Most languages use static scoping these days for a variety of reasons, not the least of which is that it is easier for programmers to reason about code that is statically scoped. It is also easier to decompose programs and create modules that programmers can understand easily and use reliably.

In my course, when looking for an example of a dynamically-scoped language, I usually refer to Common Lisp. Many old Lisps were scoped dynamically, and Common Lisp gives the programmer the ability to define individual variables as dynamically-scoped. Lisp does not mean much to students these days, though. If I were more of a Perl programmer, I would have known that Perl offers the same ability to choose dynamic scope for a particular variable. But I'm not, so I didn't know about this feature of the language until writing this entry. Besides, Perl itself is beginning to fade from the forefront of students' attention these days, too. I could use an example closer to my students' experience.

A recent post on why Python does not optimize tail calls brought this topic to mind. I've often heard it said that in Python closures are "broken", which is to say that they are not closures at all. Consider this example drawn from the linked article:

IDLE 1.2.1
>>> def f(x):
    if x > 0:
       return f(x-1)
    return 0

>>> g = f
>>> def f(x): return x

>>> g(5)
4

g is a function defined in terms of f. By the time we call g, f refers to a different function at the top level. The result is something that looks a lot like dynamic scope.

I don't know enough about the history of Python to know whether such dynamic scoping is the result of a conscious decision of the language designer or not. Reading over the Python history blog, I get the impression that it was less a conscious choice and more a side effect of having adopted specific semantics for other parts of the language. Opting for simplicity and transparency as overarching goals sometimes means accepting their effects downstream. As my programming languages students learn, it's actually easier to implement dynamic scope in an interpreter, because you get it "for free". To implement static scope, the interpreter must go to the effort of storing the data environment that exists at the time a block, function, or other closure is created. This leads to a trade-off: a simpler interpreter supports programs that can be harder to understand, and a more complex interpreter supports programs that are easier to understand.
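
To make that trade-off concrete, here is a toy sketch of my own -- not Python's implementation, and not the interpreter from my course -- in which environments are chains of dictionaries. Dynamic scope requires no bookkeeping when a function is defined; static scope requires the interpreter to capture the definition-time environment and carry it along:

# Environments are chains of dicts, innermost frame first. All names are made up.
def lookup(name, env):
    for frame in env:
        if name in frame:
            return frame[name]
    raise NameError(name)

# Dynamic scope comes "for free": a call just pushes a frame for the parameters
# onto whatever environment the caller happens to have.
def call_dynamic(params, args, body, caller_env):
    return body([dict(zip(params, args))] + caller_env)

# Static scope costs more: when the function is created, the interpreter must
# save the definition environment and use it -- not the caller's -- at call time.
def make_closure(params, body, def_env):
    def call_static(args, caller_env_ignored):
        return body([dict(zip(params, args))] + def_env)
    return call_static

body = lambda env: lookup("x", env) + lookup("i", env)    # stand-in for (+ x i)
f = make_closure(["x"], body, def_env=[{"i": 100}])
print(f([1], [{"i": 1}]))                          # 101, like the Scheme example
print(call_dynamic(["x"], [1], body, [{"i": 1}]))  # 2, the dynamically scoped answer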

So for now I will say that dynamic scope is a feature of Python, not a bug, though it may not have been one of the intended features at the time of the language's design.

If any of your current favorite languages use or allow dynamic scope, I'd love to hear about it -- and especially whether and how you ever put that feature to use.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

April 23, 2009 8:51 PM

Getting Caught Up In Stupid Details

I occasionally sneak a peek at the Learning Curves blog when I should be working. Yesterday I saw this entry, with a sad bullet point for us CS profs:

Keep getting caught up in stupid details on the computer science homework. String handling. Formatting times. That sort of thing. The problem no longer interests me now that I grasp the big idea.

This is an issue I see a lot in students, usually the better ones. In some cases, the problem is that the students feel they have a right not to be bothered with any details, stupid or otherwise. But a lot of programming involves stupid details. So do most other activities, like playing the piano, playing a sport, reading a book, closing a company's financial books, or running a chemistry experiment.

Life isn't a matter only of big ideas that never come into contact with the world. Our fingers have to strike the keys and play the right notes, in the correct order at the proper tempo. I can understand the big ideas of shooting a ball through a hoop, but players succeed because they shoot thousands of shots, over and over, paying careful attention to details such as their point of release, the rotation of the ball, and the bending of their knees.

There may be an element of this in Hirta's lament, but I do not imagine that this is the whole of her problem. Some details really are stupid. For the most part, basketball players need not worry about the lettering on the ball, and piano players need not think about whether their sheet music was printed on 80% or 90% post-consumer recycled paper. Yet too often people who write programs have to attend to details just as silly, irrelevant, and disruptive.

This problem is even worse for people learning to write programs. "Don't worry what public static void main( String[] args ) means; just type it in before you start." Huh? Java is not alone here. C++ throws all sorts of silly details into the faces of novice programmers, and even languages touted for their novice-friendliness, such as Ada, push all manner of syntax and convention into the minds of beginners. Let's face it: learning to program is hard enough. We don't need to distract learners with details that don't contribute to learning the big idea, and maybe even get in the way.

If we hope to excite people with the power of programming, we will do it with big ideas, not the placement of periods, spaces, keywords, and braces. We need to find ways so that students can solve problems and write programs by understanding the ideas behind them, using tools that get in the way as little as possible. No junk allowed. That may be through simpler languages, better libraries, or something else that I haven't learned about yet.

(And please don't post a link to this entry on Reddit with a comment saying that that silly Eugene fella thinks we should dumb down programming and programming languages by trying to eliminate all the details, and that this is impossible, and that Eugene's thinking it is possible is a sign that he is almost certainly ruining a bunch of poor students in the heartland. Re-read the first part of the entry first...)

Oh, and for my agile developer friends: Read a little farther down the Learning Curves post to find this:

Email from TA alleges that debugging will be faster if one writes all the test cases ahead of time because one won't have to keep typing things while testing by hand.

Hirta dismisses the idea, saying that debugging will still require diagnosis and judgment, and thus be particular to the program and to the bug in question. But I think her TA has re-discovered test-first programming. Standing ovation!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

April 13, 2009 7:36 PM

Keeping Up Versus Settling Down

The last few days I have run across several pointers to Scala and Clojure, two languages that support a functional programming style on the JVM. Whenever I run into a new programming language, I start thinking about how much time I should put into learning and using it. If it is a functional language, I think a bit harder, and naturally wonder whether I should consider migrating my Programming Languages course from the venerable but now 30-plus-years-old Scheme to the new language.

My time is finite and in scarce supply, so I have to choose wisely. If I try to chase every language thread that pops up, I'll end up completely lost and making no progress on anything important. Choosing which threads to follow requires good professional judgment and, frankly, a lot of luck. In the worst case, I'd at least like to learn something new for the time I invest.

Scala and Clojure have been on my radar for a while and, like books that receive multiple recommendations, are nearing critical mass for a deeper look. With summer around the corner and my usual itch to learn something new, chances go up even more.

Could one of these languages, or another, ever displace Scheme from my course? That's yet another major issue. A few years ago I entertained the notion of using Haskell in lieu of Scheme for a short while, but Scheme's simplicity and dynamic typing won out. Our students need to see something as different as possible from what they are used to, whatever burden that places on me and the course. My own experience with Lisp and Scheme surely had some effect on my decision. For every beautiful idea I could demonstrate in Haskell, I knew of a similar idea or three in Scheme.

My circumstance reminds me of a comment by Patrick Lam to a blog entry at Learning Curves:

I've noticed that computer science faculty technology usage often seems to be frozen to when they start their first tenure-track job. Unclear yet if I'll get stuck to 2008 technology.

Lam is wise to think consciously of this now. I know I did not. Then again, I think my track record learning new technologies, languages, and tools through the 1990s, in my first decade as a tenure-track professor, holds up pretty well. I picked up several new programming languages, played with wikis, adopted and used various tools from the agile community, taught several new courses that required more than passing familiarity with the tools of those subdisciplines, and did a lot of work in the software patterns world.

My pace learning new technologies may have slowed a bit in the 2000s, but I've continued to learn new things. Haskell, Ruby, Subversion, blogs, RSS, Twitter, ... All have become part of my research, teaching, or daily practice in the last decade. And not just as curiosities next to my real languages and tools; Ruby has become one of my favorite programming languages, alongside old-timers Smalltalk and Scheme.

A language that doesn't affect
the way you think about programming,
is not worth knowing.

-- Alan Perlis,
Epigrams on Programming

At some point, though, there is something of a "not again..." feeling that accompanies the appearance of new tools on the scene. CVS led to Subversion, which led to ... Darcs, Mercurial, Git, and more. Which new tool is most worth the effort and time? I've always had a fondness for classics, for ideas that will last, so learning yet another tool of the same kind looks increasingly less exciting as time passes. Alan Perlis was right. We need to spend our time and energy learning things that matter.

This approach carries one small risk for university professors, though. Sticking with the classics can leave one's course materials, examples, and assignments looking stale and out of touch. Any CS 1 students care to write a Fahrenheit-to-Celsius converter?

In the 1990s, when I was learning a lot of new stuff in my first few years on the faculty, I managed to publish a few papers and stay active. However, I am not a "research professor" at a "research school", which is Lam's situation. Hence the rest of his comment:

Also unclear if getting stuck is actually necessary for being successful faculty.

As silly as this may sound, it is a legitimate question. If you spend all of your time chasing the next technology, especially for teaching your courses, then you won't have time to do your research, publish papers, and get grants. You have to strike a careful balance. There is more to this question than simply the availability of time; there is also a matter of mindset:

Getting to the bottom of things -- questioning assumptions, investigating causes, making connections -- requires a different state of mind than staying on top of things.

This comes from John Cook's Getting to the Bottom of Things. In that piece, Cook concerns himself mostly with multitasking, focus, and context switching, but there is more. The mindset of the scientist -- who is trying to understand the world at a deep level -- is different than the mindset of the practitioner or tool builder. Time and energy devoted to the latter almost certainly cannibalizes the time and energy available for the former.

As I think in these terms, one advantage that some so-called teaching faculty have over research faculty in the classroom becomes clearer to me. I've always had great respect for the depth of curiosity and understanding that active researchers bring to the classroom. If they are also interested in teaching well, they have something special to share with their students. But teaching faculty have a complementary advantage. Their ability to stay on top of things means that their courses can be on the cutting edge in a way that many research faculty's courses cannot. Trade-offs and balance yet again.

For what it's worth, I really am intrigued by the possibilities offered by Scala and Clojure for my Programming Languages course. If we can have all of the beauty of other functional languages at the same time as a connection to what is happening out in the world, all the better. Practical connections can be wonderfully motivating to students -- or seem cloyingly trendy. Running on top of the JVM creates a lot of neat possibilities not only for the languages course but also for the compilers course and for courses in systems and enterprise software development. The JVM has become something of a standard architecture that students should know something about -- but we don't want to give our students too narrow an experience. Busy, busy, busy.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

April 08, 2009 6:32 PM

Quick Hits on the Way Out of Dodge

Well, Carefree. But it plays the Western theme to the hilt.

This was a shorter conference visit than usual. Due to bad weather on the way here, I arrived on the last flight in on Sunday. Due to work constraints of my workshop colleagues, I am heading out before the Wednesday morning session. Yet it was a productive trip -- like last year, but this time on our own work, as originally planned. We produced

  • the beginnings of a catalog of data-driven real-world problems used in CS1 courses across the country, and
  • half of a grant proposal to NSF's CPATH program, to fund some of our more ambitious ideas about programming for everyone, including CS majors.
A good trip.

Yesterday over our late afternoon break, we joined with the other workshop group and had an animated discussion started by a guy who has been involved with the agile community. He claimed that XP and other agile approaches tell us that "thinking is not allowed", that no design is allowed. A straw man can be fun and useful for exploring the boundaries of a metaphor. But believing it for real? Sigh.

A passing thought: Will professionals in other disciplines really benefit from knowing how to program? Why can't they "just" use a spreadsheet or a modeling tool like Shazam? This question didn't come to mind as a doubt, but as a realization that I need a variety of compelling stories to tell when I talk about this with people who don't already believe my claim.

While speaking of spreadsheets... My co-conspirator Robert Duvall was poking around Swivel, a web site that collects and shares open data sets, and read about the founders' inspiration. They cited something Dan Bricklin said about his own inspiration for inventing the spreadsheet:

I wanted to create a word processor for data.

Very nice. Notice that Bricklin's word processor for data exposes a powerful form of end-user programming.

When I go to conferences, I usually feel as if the friends and colleagues I meet are doing more, and more interesting, things than I -- in research, in class, in life. It turns out that a lot of my friends and colleagues seem to think the same thing about their friends and colleagues, including me. Huh.

I write this in the air. I was booked on a 100% full 6:50 AM PHX-MSP flight. We arrive at the airport a few minutes later than planned. Rats, I have been assigned a window seat by the airline. Okay, so I get on the plane and take my seat. A family of three gets on and asks me hopefully whether there is any chance I'd like an aisle seat. Sure, I can help. (!) I trade out to the aisle seat across the aisle so that they can sit together. Then the guy booked into the middle seat next to me doesn't show. Surprise: room for my Macbook Pro and my elbows. Some days, the world smiles on me in small and unexpected ways.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

March 29, 2009 11:39 AM

Looking Forward to Time Working

In real life there is no such thing as algebra.

-- Fran Lebowitz

At this time next week, I will be on my way to ChiliPLoP for a working session. Readers here know how much I enjoy my annual sojourn to this working conference, but this year I look forward to it with special fervor.

First, my day job the last few months -- the last year, really -- has been heavier than usual with administrative activities: IT task force, program review, budget concerns. These are all important tasks, with large potential effects on my university, my department, and our curriculum and faculty. But they are not computer science, and I need to do some computer science.

Second, I am still in a state of hopeful optimism that my year-long Running Winter is coming to an end. I put in five runs this week and reached 20 miles for the first time since October. The week culminated this morning in a chilly, hilly 8 miles on a fresh dusting of snow and under a crystal clear blue sky. ChiliPLoP is my favorite place to run away from home. I never leave Carefree without being inspired, unless I am sick and unable to run. Even if I manage only two short runs around town, which is what I figure is in store, I think that the location will do a little more magic for me.

Our hot topic group will be working at the intersection of computer science and other disciplines, stepping a bit farther from mainstream CS than it has in recent years. We all see the need to seek something more transformative than incremental, and I'd like to put into practice some of the mindset I've been exploring in my blog the last year or so.

The other group will again be led by Dave West and Dick Gabriel, and they, too, are thinking about how we might re-imagine computer science and software development around Peter Naur's notion of programming as theory building. Ironically, I mentioned that work recently in a context that crosses into my hot topic's focus. This could lead to some interesting dinner conversation.

Both hot topics' work will have implications for how we present programming, software development, and computer science to others, whether CS students or professionals in other disciplines. Michael Berman (who recently launched his new blog) sent a comment on my Sweating the Small Stuff entry that we need to keep in mind whenever we want people to learn how to do something:

I think that's an essential observation, and one that needs to be designed into the curriculum. Most people don't learn something until they need it. So trying to get students to learn syntax by teaching them syntax and having them solve toy problems doesn't teach them syntax. It's a mistake to think that there's something wrong with the students or the intro class -- the problem is in the curriculum design.

I learned algebra when I took trig, and trig when I took calculus, and I learned calculus in my physics class and later in queueing theory and probability. (I never really learned queueing theory.)

One of the great hopes of teaching computation to physicists, economists, sociologists, and anyone else is that they have real problems to solve and so might learn the tool they need to solve them. Might -- because we need to tell them a story that compels them to want to solve them with computation. Putting programming into the context of building theories in an applied discipline is a first step.

(Then we need to figure out the context and curriculum that helps CS students learn to program without getting angry...)


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

March 24, 2009 3:45 PM

Meta-Blog: Follow-Up to My Adele Goldberg Entry

When I was first asked to consider writing a blog piece for the Ada Lovelace Day challenge, I wasn't sure I wanted to. I don't usually blog with any particular agenda; I just write whatever is in my mind at the time, itching to get out. This was surely a topic I have thought and written about before, and it's one that I have worked on with people at my university and across the state. I think it is in the best interest of computer science to be sure that we are not missing out on great minds who might be self-selecting away from the discipline for the wrong reasons. So I said yes.

Soon afterwards, ACM announced Barbara Liskov as the winner of the Turing Award. I had written about Fran Allen when she won the Turing Award, and here was another female researcher in programming languages whose work I have long admired. I think the Liskov Substitution Principle is one of the great ideas in software development, a crucial feature of object-oriented programming, of any kind of programming, really. I make a variant of the LSP the centerpiece of my undergraduate courses on OOP. But Liskov has done more -- CLU and encapsulation, Thor and object-oriented databases, the idea of Byzantine fault tolerance in distributed computing, ... It was a perfect fit for the challenge.

But my first thought, Adele Goldberg, would not leave me. That thought grew out of my long love affair with Smalltalk, to which she contributed, and out of a memory I have from my second OOPSLA Educators' Symposium, where she gave a talk on learning environments, programming, and language. Goldberg isn't a typical academic Ph.D.; she is versatile, having worked in technical research, applications, and business. She has made technical contributions and contributions to teaching and learning. She helped found companies. In the end, that's the piece I wanted to write.

So, if my entry on Goldberg sounds stilted or awkward, please cut me a little slack. I don't write on assigned topics much any more, at least not in my blog. I should probably have set aside more time to write that entry, but I wrote it much as I might write any other entry. If nothing else, I hope you can find value in the link to her Personal Dynamic Media article, which I was so happy to find on-line.

At this point, one other person has written about Goldberg for the Lovelace Day challenge. That entry has links to a couple of videos, including one of Adele demonstrating a WIMP interface using an early implementation of Smalltalk. A nice piece of history. Mark Guzdial mentions Adele in his Lovelace Day essay, but he wrote about three women closer to home. One of his subjects is Janet Kolodner, who did groundbreaking research on case-based reasoning that was essential to my own graduate work. I'm a fan!


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

March 24, 2009 6:13 AM

Adele Goldberg, Computer Scientist and Entrepreneur

My slogan is:
computing is too important to be left to men.

-- Karen Sparck-Jones, 1935-2007

We talk a lot about the state of women in computing. Girls have deserted computer science as an academic major in recent years, and female undergrad enrollment is at a historic low relative to boys. Some people say, "Girls don't like to program," but I don't think that explains all of the problem. At least a few women agree with me... During a session of the Rebooting Computing Summit in January, one of the men asserted that girls don't like to program, and one of the women -- Fran Allen, I think -- asked, "Says who?" From the back of the room, a woman's voice called out, "Men!"

A lot of people outside of computer science do not know how much pioneering work in our discipline was done by women. Allen won a Turing Award for her work on languages and compilers, and the most recent Turing Award was given to Barbara Liskov, also for work in programming languages. Karen Sparck-Jones, quoted above, discovered the idea of inverse document frequency, laying the foundation for a generation of advances in information retrieval. And these are just the ones ready at hand; there are many more.

Adele Goldberg

When people assert that women don't like (just) to program, they usually mean that women prefer to do computer science in context, where they can see and influence more directly the effects that their creations will have in the world. One of my heroes in computing, Adele Goldberg, has demonstrated that women can enjoy -- and excel on -- both sides of the great divide.

(Note: I am not speaking of this Adele Goldberg, who is, I'm sure, a fine computer scientist in her own right!)

Goldberg is perhaps best known as co-author of several books on Smalltalk. Many of us fortunate enough to come into contact with Smalltalk back in the 1980s cut our teeth on the fabulous "blue book", Smalltalk-80: The Language and Its Implementation. You can check out a portion of the blue book on-line. This book taught many a programmer how to implement a language like Smalltalk. It is still one of the great books about a language implementation, and it still has a lot to teach us about object-oriented languages.

But Goldberg didn't just write about Smalltalk; she was in the lab doing the work that created it. During the 1970s, she was one of the principal researchers at Xerox PARC. The team at PARC not only developed Smalltalk but also created and experimented with graphical user interfaces and other features of the personal computing experience that we all now take for granted.

Goldberg's legacy extends beyond the technical side of Smalltalk. She worked with Alan Kay to develop an idea of computing as a medium for everyone and a new way for young people to learn, using the computer as a dynamic medium. They described their vision in Personal Dynamic Media, a paper that appeared in the March 1977 issue of IEEE Computer. This was a vision that most people did not really grasp until the 1990s, and it inspired many people to consider a world far beyond what existed at the time. But this paper did not just talk about vision; it also showed their ideas implemented in hardware and software, tools that children were already using to create ideas. When I look back at this paper, it reminds me of one reason I admire Goldberg's work: it addresses both the technical and the social, the abstract and the concrete, idea and implementation. She and Kay were thinking Big Thoughts and also then testing them in the world.

(A PDF of this paper is currently available on-line as part of the New Media Reader. Read it!)

After leaving PARC, Goldberg helped found ParcPlace, a company that produced a very nice Smalltalk product suitable for corporate applications and CS research alike. The Intelligent Systems Lab I worked in as a grad student at Michigan State was one of ParcPlace's first clients, and we built all of our lab's software infrastructure on top of its ObjectWorks platform. I still have ObjectWorks on 3.5" floppies, as well as some of the original documentation. (I may want to use it again some day...)

Some academics view founding a business as antithetical to the academic enterprise, or at least as not very interesting, but Goldberg sees it as a natural extension of what computer science is:

The theoretical and practical knowledge embodied in CS is interesting as standalone study. But the real opportunity lies in equipping oneself to partner with scientists or business experts, to learn what they know and, together, to change how research or business is conducted.

(I found this quote as a sidebar in Women in Computing -- Take 2, an article in a recent issue of Communications of the ACM.)

I suppose that the women-don't-like-to-program crowd might point to Goldberg's career in industry as evidence that she prefers computing in its applied context to the hard-core technical work of computer science, but I don't think that is true. Her work on Smalltalk and real tools at PARC was hard-core technical, and her work at ParcPlace on Smalltalk environments was hard-core technical, too. And she has the mentality of a researcher:

Don't ask whether you can do something, but how to do it.

When no one knows the answer, you figure it out for yourself. That's what Goldberg has done throughout her career. And once she knows how, she does it -- both to test the idea and make it better, and to get the idea out into the world where people can benefit from it. She seems to like working on both sides of the divide. No, she would probably tell us that the divide is an artificial barrier of our own making, and that more of us should be doing both kinds of work.

When we are looking for examples of women who have helped invent computer science, we find researchers and practitioners. We find women working in academia and in industry, working in technical laboratories and in social settings where applications dominate theory. We don't have to limit our vision of what women can do in computing to any one kind of work or work place. We can encourage young women who want to be programmers and researchers, working on the most technical of advances. We can encourage young women who want to work out in the world, changing how people do what they do via the dynamic power of software. If you are ever looking for one person to serve as an example of all these possibilities, Adele Goldberg may be the person you seek.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 20, 2009 9:09 PM

At Least It's Not Too Easy

Tim Bray talks about how photography is too easy to learn:

Quoting from About Photography (1949) by American photographer Will Connell (hat tip Brendan MacRae): "Every medium suffers from its own particular handicap. Photography's greatest handicap is the ease with which the medium as such can be learned. As a result, too many budding neophytes learn to speak the language too long before they have anything to say."

Programming doesn't seem to suffer from this problem! Comments to Bray's entry about books like "C for Dummies" notwithstanding, there are not many people walking around who think programming is too easy. Mark Guzdial has described the reaction of students taking a non-majors course with a computational economics theme when they found out they would have to do a little scripting in Python. Most people who do not already have an interest in CS express disdain for programming's complexity, or fear of it. No one likes to feel stupid. Perhaps worst of all, even students who do want to major in CS don't want to program.

We in the business seem almost to have gone out of our way to make programming hard. I am not saying that programming is or can be "easy", but we should stop erecting artificial barriers that make it harder than it needs to be -- or that create an impression that only really smart people can write code. People who have ideas can write. We need to extend that idea to the realm of code. We cannot make professional programmers out of everyone, any more than piano and violin lessons can make professional musicians out of everyone. But we ought to be able to do what music teachers can do: help anyone become a competent, if limited, practitioner -- and come to appreciate the art of programming along the way.

The good news is that we can solve this "problem", such as it is. As Guzdial wrote in another fine piece:

An amazing thing about computing is that there are virtually no ground rules. If we don't like what the activity of programming is like, we can change it.

We need to create tools that expose powerful constructs to novices and hide the details until later, if they ever need to be exposed. Scratch and Alice are currently popular platforms in this vein, but we need more. We also need to connect the ability to program with people's desires and interests. Scripting Facebook seems like the opportunity du jour that is begging to be grasped.

I'm happy to run across good news about programming, even if it is only the backhanded notion that programming is not too easy. Now we need to keep on with the business of making certain that programming is not too hard.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

February 24, 2009 12:01 PM

Even More on Programming and Computational Thinking

Confluence... The Education column in the February 2009 issue of Communications of the ACM, Human Computing Skills: Rethinking the K-12 Experience, champions computational thinking in lieu of programming:

Through the years, despite our best efforts to articulate that CS is more than "just programming," the misconception that the two are equivalent remains. This equation continues to project a narrow and misleading image of our discipline -- and directly impacts the character and number of students we attract.

I remain sympathetic to this concern. Many people, including lost potential majors, think that CS == programming. I don't know any computer scientists who think that is true. I'd like for people to understand what CS is and for potential majors who end up not wanting to program for a living to know that there is room for them in our discipline. But pushing programming aside altogether is the wrong way to do that, and will do more harm than good -- even for non-computer scientists.

It seems to me that the authors of this column conflate CS with programming at some level, because they equate writing a program with "scholarly work" in computer science:

While being educated implies proficiency in basic language and quantitative skills, it does not imply knowledge of or the ability to carry out scholarly English and mathematics. Indeed, for those students interested in pursuing higher-level English and mathematics, there exist milestone courses to help make the critical intellectual leaps necessary to shift from the development of useful skills to the academic study of these subjects. Analogously, we believe the same dichotomy exists between CT, as a skill, and computer science as an academic subject. Our thesis is this: Programming is to CS what proof construction is to mathematics and what literary analysis is to English.

In my mind, it is a big -- and invalid -- step from saying "CT and CS are different" to saying that programming is fundamentally the domain of CS scholars. I doubt that many professional software developers will agree with a claim that they are academic computer scientists!

I am familiar with Peter Naur's Programming as Theory Building, which Alistair Cockburn brought to the attention of the software development world in his book, Agile Software Development. I'm a big fan of this article and am receptive to the analogy; I think it gives us an interesting way to look at professional software development.

But I think there is more to it than what Naur has to say. Programming is writing.

Back to the ACM column. It's certainly true that, at least for many areas of CS, "The shift to the study of CS as an academic subject cannot ... be achieved without intense immersion in crafting programs." In that sense, Naur's thesis is a good fit. But consider the analogy to English. We all write in a less formal, less intense way long before we enter linguistic analysis or even intense immersion in composition courses. We do so as a means of communicating our ideas, and most of us succeed quite well doing so without advanced formal training in composition.

How do we reach that level? We start young and build our skills slowly through our K-12 education. We write every year in school, starting with sentences and growing into larger and larger works as we go.

I recall that in my junior year English class we focused on the paragraph, a small unit of writing. We had written our first term papers the year before, in our sophomore English course. At the time, this seemed to me like a huge step backward, but I now recognize this as part of the Spiral pattern. The previous year, we had written larger works, and now we stepped back to develop further our skills in the small after seeing how important they were in the large.

This is part of what we miss in computing: the K-8 or K-12 preparation (and practice) that we all get as writers, done in the small and across many other learning contexts.

Likewise, I disagree that proof is solely the province of mathematics scholars:

Just as math students come to proofs after 12 or more years of experience with basic math, ...

In my education, we wrote our first proofs in geometry -- as sophomores, the same year we wrote our first term papers.

I do think one idea from the article and from the CT movement merits more thought:

... programming should begin for all students only after they have had substantial practice acting and thinking as computational agents.

Practice is good! Over the years, I have learned from CS colleagues many effective ways to introduce students, whether at the university or earlier, to ideas such as sorting algorithms, parallelism, and object-oriented programming by role play and other active techniques -- through the learner acting as a computational agent. This is an area in which the Computational Thinking community can contribute real value. Projects such as CS Unplugged have already developed some wonderful ways to introduce CT to young people.

Just as we grow into more mature writers and mathematical problem solvers throughout our school years, we should grow into more mature computational thinkers as we develop. I just don't want us to hold programming out of the mix artificially. Instead, let's look for ways to introduce programming naturally where it helps students understand ideas better. Let's create languages and build tools to make this work for students.

As I write this, I am struck by the different noun phrases we are using in this conversation. We speak of "writers", not "linguistic thinkers". People learn to speak and write, to communicate their ideas. What is it that we are learning to do when we become "computational thinkers"? Astrachan's plea for "computational doing" takes on an even more compelling tone.

Alan Kay's dream for Smalltalk has always been that children could learn to program and grow smoothly into great ideas, just as children learn to read and write English and grow smoothly into the language and great ideas of, say, Shakespeare. This is a critical need in computer science. The How to Design Programs crowd have shown us some of the things we might do to accomplish this: language levels, tool support, thinking support, and pedagogical methods.

Deep knowledge of programming may not be essential to understanding all basic computer science, but some knowledge of programming adds so very much even to our basic ideas.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

February 19, 2009 4:32 PM

More on Programming and Computational Thinking

I've heard from a few of you about my previous post. People have strong feelings in both directions. If you haven't seen it already, check out Mark Guzdial's commentary on this topic. Mark explores a bit further what it means to understand algorithms and data structures without executing programs, and perhaps without writing them. I'm glad that he is willing to stake out a strong position on this issue.

Those of you who receive inroads, SIGCSE's periodical, should watch for a short article by Owen Astrachan in the next issue, called "Cogito Ergo Hack". Owen hits the target spot-on: without what he calls "computational doing", we miss a fantastic opportunity to help people understand computational ideas at a deeper level by seeing them embodied in something they themselves create. Computational doing might involve a lot of different activities, but programming is one of the essential activities.

We need as many people as possible, and especially clear thinkers and writers like Mark and Owen, to ask the questions and encourage others to think about what being a computational thinker means. Besides, catchy phrases like "computational doing" and "Cogito Ergo Hack" are likely to capture the attention of more people than my pedestrian prose!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

February 17, 2009 3:53 PM

Posts of the Day

Tweet of the Day

Haskell is a human-readable program compression format.
-- Michael Feathers

Maybe we should write a generator that produces Haskell.

Non-Tech Blog of the Day

Earlier in my career I worked hard to attract attention. ... The problem with this approach is that eventually it all burns down to ashes and no one knows a thing more about software development than they did before.
-- Kent Beck

Seek truth. You will learn to focus your life outside your own identity, and it makes finding out you're wrong not only acceptable, but desirable.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

February 17, 2009 9:31 AM

Computational Thinking without Programming

Last week, I read a paper on changes in how statistics is taught. In the last few decades, more schools have begun to teach stats conceptually, so that the general university graduate might be able to reason effectively about events, variation, and conditions. This contrasts with the older style in which it was taught as a course for mathematicians, with the focus on formulas and mastery of underlying theorems. The authors said that the new style emphasized statistical thinking, rather than traditional statistics.

For some reason, this brought to mind the buzz around "computational thinking" in the CS world. I have to be honest: I don't know exactly what people mean when they talk about computational thinking. I think the idea is similar to what the stats guys are talking about: using the ideas and methods discovered in computer science to reason about processes in the real world. I certainly agree that most every college graduate could benefit from this, and that popularizing these notions might do wonders for helping students to understand why CS is important and worth considering as a major and career.

But when I look at the work that passes under the CT banner, I have a hard time distinguishing computational thinking from what I would call a publicly-accessible view of computer science. Maybe that's all it is: an attempt to offer a coherent view of CS for the general public, in a way that all could begin to think computationally.

The one thing that stands out in all the papers and presentations about CT I've seen is this: no programming. Perhaps the motivation for leaving programming out of the picture is that people find it scary and hard, so omitting it makes for a more palatable public view. Perhaps some people think that programming isn't an essential part of computational thinking. If it's the former, I'm willing to cut them some slack. If it's the latter, I disagree. But that's not surprising to readers here.

While thinking on this, I came across this analogy: computational thinking with no programming is like statistical thinking without any mathematics. That seems wrong. We may well want stats courses aimed at the general populace to emphasize application and judgment, but I don't think we want students to see statistics devoid of any calculation. When we reason about means and variance, we should probably have some idea how these terms are grounded in arithmetic that people understand and can do.

When I tried my analogy out on a colleague, he balked. We don't need much math to reason effectively in a "statistical" way, he said; that was precisely the problem we had before. Is he overreacting? How well can people understand the ideas of mean and standard deviation without knowing how to compute them? How little math can they know and still reason effectively? He offered as an example the idea of a square root. We can understand what a square root is and what it means without knowing how to calculate one by hand. Nearly every calculator has a button for the square root, and most students' calculators these days have buttons for the mean -- and maybe the variance; I'll have to look at my high school daughter's low-end model to see.

For the most part, my colleague feels similarly about programming for everyone. His concern with CT is not eliminating programming but what would be taught in lieu of programming. Many of the articles we have seen on CT seem to want to replace programming with definitions and abstractions that are used by professional computer scientists. The effect is to replace teaching a programming language with teaching a relatively formal "computational thinking" language. In his mind, we should replace programming with computational skills and ideas that are useful for people involved in everyday tasks. He fears that, if we teach CT as if the audience is a group of budding computer scientists, we will make the same mistake that mathematics often has: teaching for the specialists and finding out that "everyone else is rotten at it".

The stats teaching paper I read last week says all the right things. I should look at one of the newer textbooks to see how well they carry it out, and how much of the old math skills they still teach and require.

I'm still left thinking about how well people can think computationally without learning at least a little programming. To the extent that we can eliminate programming, how much of what is left is unique to computer science? How far can we take the distinction between formula and algorithm without seeing some code? And what is lost when we do? There is something awfully powerful about watching a program in action, and being able to talk about and see the difference between dynamic behavior and static description.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 29, 2009 7:33 AM

Using Code to Document Lab Procedure

a lab notebook

I've been following the idea of open notebook science for a while, both for its meaning to science and for the technological need it creates. Yesterday I read Cameron Neylon's piece on a web-native lab notebook. It was an interesting read, though it did contain a single paragraph that ran for two pages... After he describes how to use a blog to organize an integrated record of lab activities, he says:

What we're left with is the procedures which after all is the core of the record, right? Well no. Procedures are also just documents. Maybe they are text documents, but perhaps they are better expressed as spreadsheets or workflows (or rather the record of running a workflow). Again these may well be better handled by external services, be they word processors, spreadsheets, or specialist services. They just need to be somewhere where we can point at them.

Procedures are, well, procedures. It would be very cool if we could help scientists record their procedures as code. Programs require more precision than free-form text, which would make sharing scientific procedures more reliable, and they support naturally the distinction between static and dynamic presentation. Neylon sees this opportunity when he talks about recording a procedure as a workflow or as a record of running one.
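
To make the idea concrete, here is a purely hypothetical sketch of a procedure written as code; the procedure, names, and numbers are all invented:

    # A made-up procedure expressed as code rather than prose. Reading the function
    # documents the procedure; running it produces the record of one particular run.
    def serial_dilution(stock_concentration, factor=10, steps=3):
        log = []
        concentration = stock_concentration
        for step in range(1, steps + 1):
            concentration = concentration / factor
            log.append(f"step {step}: dilute 1:{factor} -> concentration {concentration:g}")
        return log

    for line in serial_dilution(stock_concentration=1000.0):
        print(line)
    # step 1: dilute 1:10 -> concentration 100
    # step 2: dilute 1:10 -> concentration 10
    # step 3: dilute 1:10 -> concentration 1

The static text is the procedure; each run produces a dynamic record of it.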

Just another wild-hair idea at the boundary where CS meets people doing their work.


Posted by Eugene Wallingford | Permalink | Categories: Computing

January 20, 2009 4:27 PM

Rebooting Computing Workshop Approach Redux

I commented a couple of times in my review of the Rebooting Computing summit about the process we used and the reaction it created among the participants. We used an implementation of Appreciative Inquiry that emphasized collectively exploring our individual experiences and visions before springing into action.

Some people were skeptical from the outset, as some people will be about anything so "touchy-feely". Perhaps some assumed that we all know what the problem is and that enough of us know what the solution should be and so wanted to get down to business. That seems unlikely to me. Were this problem easily understood or solved, we wouldn't have needed a summit to form action groups.

The process began to unravel on the second day, and many people were losing patience. After lunch -- more than halfway through the summit -- the entire group worked together to construct an overlaid set of societal, technical, and personal timelines of computing. For me, time began to drag during this exercise, though I enjoyed hearing the stories and seeing what so many people thought stood out in our technical history.

The 'construct a timeline of computing' exercise was the final straw for some, especially some of the bigger names. While constructing the timeline, we found that many people in the room didn't know the right dates for the events they were adding. One of my colleagues finally couldn't stop himself and moved a couple of items to their right spots. It must have been especially frustrating for the people in the room who were there when these events were happening -- or who were the principal players! This is an example of how collaboration for collaboration's sake often doesn't work.

Later, Alan Kay referred to the summit's process as "participation without progress". Our process emphasized the former but in many ways impeded (or prevented) the latter. Kay said that this approach assumes all people's ideas are equally valid or useful. He called for something more like a representative democracy than what we had, which I would liken to a town hall form of government.

Kay may well be right, though a representative democracy-like approach risks losing buy-in and enthusiasm from the wider audience. Our representatives would need to be able either to solve the problems themselves or to energize "we, the people" to do the work. I'm not sure whether what ails computing can be solved by only a few, so I think bringing a large community on board at some point is necessary.

The next question is, who has the ideas that we should be acting on? I think the summit was a least partly successful in giving people with good ideas a chance to express them to a large group and in particular to let some of the more influential people in the room know about the people and the ideas. Unfortunately, the process throttled some other ideas with lack of interest. One that stands out to me now is the issue of outsourcing and intellectual property, which has been a dominant topic of discussion on the listserv since we left San Jose. Fortunately, people are talking about it now.

(I have to admit that I do not yet fully understand the problem. I either need more time to read or a bigger brain -- maybe both.)

In the end, when we finally identified and joined action groups, some people asked, "How different is this set of action groups than if we had started the summit with this exercise?" Many people thought we would have ended up with the same groups. I think they might be right, though I'm not sure this means there was no value in the work that led to them.

Perhaps one problem was that this process does not scale well to 200+ people. If it does, then perhaps it just didn't scale well to these 200+ people. The room was full of Type A personalities. The industry people were the kind with strong opinions and a drive to solve problems. The faculty were... Well, most faculty are control freaks, and this group was representative.

Personally, I found the process to be worth at least some of the time we spent. I enjoyed looking back at my life in computing, reflecting on my own history, reliving a few stories, and thinking about what has influenced me. I realized that my interest in computer science wasn't driven by math or CS teachers in high school or my undergraduate years. I had a natural affinity for computing and what it means. The teachers who most affected me were ones who encouraged me to think abstractly and to take ideas seriously, who gave me reason to think I could do those things. The key was to find my passion and run.

I first saw that sort of passion in William Magrath, my honors humanities prof, as a freshman in college. As much as I loved humanities and political science and all the rest, I had a sense that CS was where my passion lay. The one undergrad CS prof who comes to mind as an influence, William Brown, was not a researcher. He was a serious systems guy who had come from IBM. In retrospect, I credit him with showing me that CS had serious ideas and was worth deep thought. He encouraged me subtly to go to grad school and answered a lot of questions from a naive student whose background made grad school seem as far away as the moon.

I can't give a definitive review of the Rebooting Computing workshop process because, sadly, I had to return home to do my other day job and so missed the third day. From what I hear, the last day seemed to have clicked better for more people. I will say this. We have to realize that the goal of "rebooting" computing is a big task, and not everyone who needs to be involved shares the same context, history, motivation, or goals. It was worth trying to figure out some of that background before trying to make plans. Even if the process moved slower than some of us thought it should, it did get us all talking. That is a start, and the agile developer in me knows the value in that.


Posted by Eugene Wallingford | Permalink | Categories: Computing

January 19, 2009 9:53 AM

Rebooting the Public Image of Computing

In addition to my more general comments on the Rebooting Computing summit, I made a lot of notes about the image of the discipline, which was, I think, one of the primary motivations for many summit participants. The bad image that most kids and parents have of careers in computing these days came up frequently. How can we make computing as attractive as medicine or law or business?

One of my table mates told us a story of seeing brochures for two bioinformatics programs at the same university. One was housed in the CS department, and the other was housed with the life sciences. The photos used in the two brochures painted strikingly different images in terms of how people were dressed and what the surroundings looked like. One looked like a serious discipline, while the other was "scruffy". Which one do you think ambitious students will choose? Which one will appeal to the parents of prospective students? Which one do you think was housed in CS?

Sometimes, the messages we send about our discipline are subtle, and sometimes not.

Too often, what K-12 students see in school these days under the guise of "computing" is applications. It is boring, full of black boxes with no mystery. It is about tools to use, not ideas for making things. After listening to several people relate their dissatisfaction with this view of computing, it occurred to me that one thing we might do to immediately improve the discipline's image is to get what currently passes for computing out of our schools. It tells the wrong stories!

The more commonly proposed solution is to require CS in K-12 schools and do it right. Cutting computing would be easier... Adding a new requirement to the crowded K-12 curriculum is a tall task fraught with political and economic roadblocks. And, to be honest, our success in presenting a good image of computing through introductory university courses doesn't fill me with confidence that we are ready to teach required CS in K-12 everywhere.

Don't take any of these thoughts too seriously. I'm still thinking out loud, in the spirit of the workshop. But I don't think there are any easy or obvious answers to the problems we face. One thing I liked about the summit was spending a few days with many different kinds of people who care about the problems and who all seem to be trying something to make things better.

The problems facing computing are not just about image. Some seem to think so, but they aren't. Yet image is part of the problem. And the stories we tell -- explicitly and implicitly -- matter.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 17, 2009 3:09 PM

Notes on the Rebooting Computing Summit

the Rebooting Computing logo

[This is the first of several pieces on my thoughts at the Rebooting Computing summit, held January 12-14 in San Jose, California. Later articles talk about image of CS, thoughts on approach, thoughts on personal stories and, as always, a little of this and that.]

This was unusual: a workshop with no laptops out. Some participants broke the rule almost immediately, including some big names, but I decided to follow my leaders. That meant no live note-taking, except on paper, nor any interleaved blogging. I decided to extend this opportunity and take advantage of an Internet-free trip, even back in the hotel!

The workshop brought together 225 people or so from all parts of computer science: industry, universities, K-12 education, and government. We had a higher percentage of women than the discipline as a whole, which made sense given our goals, and a large international contingent. Three Turing Award winners joined in: Alan Kay, Vinton Cerf, and Fran Allen, who was an intellectual connection to the compiler course I put on hold for a few days in order to participate. There were also a number of other major contributors to computing, people such as Grady Booch, Richard Gabriel, Dan Ingalls, and the most famous of the three Eugenes in the room, Gene Spafford.

Most came without knowing what in particular we would do for these three days, or how. The draw was the vision: a desire to reinvigorate computing everywhere.

a photo of the Computer History Museum's Difference Engine

The location was a perfect backdrop, the Computer History Museum. The large printout of a chess program on old green and white computer paper hanging from the rafters near our upper-floor meeting room served as a constant reminder of a day when everything about computers and programming seemed new and exciting, when everything was a challenge to tackle. The working replica of Charles Babbage's Difference Engine on the main floor reminded us that big dreams are good -- and may go unfulfilled in one's lifetime.

The Introduction

Peter Denning opened with a few remarks on what led him to organize the workshop. He expressed his long-standing dissatisfaction with the idea that computer science == programming. Dijkstra famously proclaimed that he was a programmer, and proud of it, but Denning always knew there was something bigger. What is missing if we think only of programming? Using his personal experience as an example, he claimed that the outliers in the general population who enter CS have invested many hours in several kinds of activity: math, science, and engineering.

Throughout much of the history of computer science, people made magic because they didn't know that what they were doing was impossible. This gave rise to the metaphor driving the workshop and his current effort -- to reboot computing, to clean out the cruft. We have non-volatile memory, though, so we can start fresh with wisdom accumulated over the first 60-70 years of the discipline. (Later the next day, Alan Kay pointed out that rebooting leaves the same operating system and architecture in place, and that what we need to do is redesign them from the bottom up, too!)

At some point, a spark ignited inside each of us that turned us on to computing. What was it? Why did it catch? How? One of the goals of the summit was to find out how we can create the same conditions for others.

The Approach

The goal of the workshop was to change the world -- to change how people think about computing. The planning process destroyed the stereotypes that the non-CS facilitators held about CS people.

The workshop was organized around Appreciative Inquiry (AI, but not the CS one -- or the one from the ag school!), a process I first heard about in a paper by Kent Beck. It uses an exploration of positive experiences to help understand a situation and choose action. For the summit, this meant developing a shared positive context about computing before moving on to choosing specific goals and plans.

Our facilitators used an implementation they characterized as Discovery-Dream-Design-Destiny. The idea was to start by examining the past, then envision the future, and finally to return to the present and make plans. Understanding the past helps us put our dreams into context, and building a shared vision of the future helps us to convert knowledge into actions.

One of our facilitators, Frank Barrett, is a former professional jazz musician. He said that an old saying from his past career is "Great performances create great listeners." He believes, though, that the converse is true: "Great listeners create great performances." He encouraged us to listen to one another's stories carefully and look for common understanding that we could convert into action.

Frank also said that the goal of the workshop is really to change how people talk, not just think, about computing. Whenever you propose that kind of change, people will see you as a revolutionary -- or as a lunatic.

a photo of one of the workshop posters drawn live

An unusual complement to this humanistic approach to the workshop was a graphic artist who was recording the workshop live before our eyes, in image and word. Even when the record was mostly catchphrases that could have become part of a slide presentation, the colors and shapes added a nice touch to the experience.

The Spark

What excited most of us about computing was solving a problem -- having something that was important to us, sometimes bigger than we could do easily by hand, and doing it with a computer. Sometimes we enjoyed the making of things that could serve our needs. Sometimes we were enlivened by making something we found beautiful.

Still, a lot of people in the room "stumbled" or "drifted" into CS from other places. Those words carry quite different images of people's experiences. However subtle the move, they all seemed to have been working on real problems.

One of the beauties of computer science is that it is in and about everything. For many, computing is a lens through which to study problems and create solutions. Like mathematics, but more.

One particular comment made the first morning stood out in my mind. The gap between what people want to make with a computer and what they can reasonably make has widened considerably in the last thirty years. What they want to make is influenced by what they see and use every day. Back in 1980 I wanted to write a program to compute chess ratings, and a bit of BASIC was all I needed. Kids these days walk around with computational monsters in their pockets, sometimes a couple, and their desires have grown to match. Show them Java or Python, let alone BASIC, and they may well feel deflated before considering just what they could do.

Computing creates a new world. It builds new structures on top of old, day by day. Computing is different today than it was thirty years ago -- and so is the world. What excited us may well not excite today's youth.

What about what excited us might?

(Like any good computer scientist, I have gone meta.)

Educators cannot create learning. Students do that. What can educators provide? A sense of quality. What is good? What is worth doing? Why?

History

A lot of great history was made and known by the people in this room. The rest of us have lived through some of it. Just hearing some of these lessons reignited the old spark inside of me.

Consider Alan Turing's seminal 1936 paper on the Halting Problem. Part of the paper is Turing "thumbing his nose" at his skeptical mathematician colleagues, saying "The very question is computation. You can't escape it."

One time, Charles Babbage was working with his friend, the astronomer Herschel. They were using a set of astronomy tables to solve a problem, and Babbage became frustrated by errors in the tables. He threw the book at a wall and said, "I wish these calculations had been executed by steam!" Steam.

Ada Lovelace referred to Babbage's Analytical Engine as a machine that "weaves patterns of ideas".

Alan Kay reminded us of the ultimate importance of what Turing taught us: If you don't like the machine you have, you can make the machine you want.

AI -- the computing kind, which was the source of many of my own initial sparks -- has had two positive effects on the wider discipline. First, it has always offered a big vision of what computing can be. Second, even when it struggles to reach that vision, it spins off new technologies along the way.

At some point in the two days, Alan Kay chastised us. Know your history! Google has put all seventy-five or so of Douglas Engelbart's papers at your fingertips. Do you even type the keywords into the search box, let alone read them?

About Computing

A common thread throughout the workshop was, what are the key ideas and features of computing that we should not lose as we move forward? There were some common answers. One was that people want to solve real problems for real people and to find ideas in experience and applications. Another was the virtue of persistence, which one person characterized as "embracing failure" -- a twisted but valuable perspective. Yet another was the idea of "no more black boxes", whether hardware or software. Look inside, and figure out what makes it tick. None of these are unique to CS, but they are in some way essential to it.

Problem solving came up a lot, too. I think that people in most disciplines "solve problems" and so would claim problem solving as an essential feature of the discipline. Is computer science different? I think so. CS is about the process of solving problems. We seek to understand the nature of algorithms and how they manipulate information. Whichever real problem we have just solved, we ask, "How?" and "Why?" We try to generalize and understand the data and the algorithm.

Another common feature that many thought essential to computing is that it is interdisciplinary. CS reaches into everything. This has certainly been one of the things that has attracted me to computing all these years. I am interested in many things, and I love to learn about ideas that transcend disciplines -- or that seem to but don't. What is similar and dissimilar between two problems or solutions? Much of AI comes down to knowing what is similar and what is not, and that idea held my close attention for more than a decade.

While talking with one of my table mates at the summit, I realized that this was one of the biggest influences my Ph.D. advisor, Jon Sticklen, had on me. He approached AI from all directions, from the perspectives of people solving problems in all disciplines. He created an environment that sought and respected ideas from everywhere, and he encouraged that mindset in all who studied in his lab.

Programming

While I respect Denning's dissatisfaction with the idea that computer science == programming, I don't think we should lose the idea of programming whatever we do to reinvigorate computing. Whatever else computing is, in the end, it all comes down to a program running somewhere.

When it was my turn to describe part of my vision for the future of computing, I said something like this:

When they have questions, children will routinely walk to the computer and write a program to find the answers, just as they now use Google, Wikipedia, or IMDB to look up answers.

I expected my table mates to look at me funny, but my vision went over remarkably well. People embraced the idea -- as long as we put it in the context of "solving a problem". When I ventured further to using a program "to communicate an idea", I met resistance. Something unusual happened, though. As the discussion continued, every once in a while someone would say, "I'm still thinking about communicating an idea with a program...". It didn't quite fit, but they were intrigued. I consider that progress.

Closing

At the end of the second day, we formed action groups around a dozen or so ideas that had a lot of traction across the whole group. I joined a group interested in using problem-based learning to change how we introduce computing to students.

That seemed like a moment when we would really get down to business, but unfortunately I had to miss the last day of the summit. This was the first week of winter semester classes at my university, and I could not afford to miss both sessions of my course. I'm waiting to hear from other members of my group, to see what they discussed on Wednesday and what we are going to do next.

As I was traveling back home the next day, I thought about whether the workshop had been worth missing most of the first week of my semester. I'll talk more about the process and our use of time in a later entry. But whatever else, the summit put a lot of different people from different parts of the discipline into one room and got us talking about why computing matters and how we can help to change how the world thinks -- and talks -- about it. That was good.

Before I left for California, I told a colleague that this summit held the promise of being something special, and that it also bore the risk of being the same old thing, with visionaries, practitioners, and career educators chasing their tails in a vain effort to tame this discipline of ours. In the end, I think it was -- as so many things turn out to be -- a little of both.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 10, 2008 6:27 AM

Echoes

This is the busy end to a busier-than-usual semester. As a result, my only opportunity and drive to blog come from echoes. Sometimes that's okay.

Running

... or not. After six weeks or so of 26-28 miles a week -- not much by most standards, but a slow and steady stream -- November hit me hard. Since 11/03 I've managed only 6-8 miles a week and have not felt well the days after. My doctors are running out of possible diagnoses, which is good in one way but bad in another. In addition to the blog echo, I have an actual echo running through my head, from John Mellencamp's "Junior": Sometimes I feel better / But I never do feel well.

Building Tools

As we wrap up the semester's study of programming languages, my students took their final quiz today. I used the free time before the quiz to show them how we could add imperative features -- an assignment operator and sequences of statements -- to a simple functional interpreter that they have been building over the course of the last few homework assignments. After writing a simple cell data type (10 lines of code) to support mutable data, we added 25 or so lines of code to their interpreter and modified 5 or so more. That's all it took.

I'm always amazed by what we can do in a few lines of code. Those few lines also managed to illustrate several of the ideas students encountered this semester: map, currying, and even a higher-order type predicate. Today's mini-demonstration has me psyched to add more features to the language, to make it more useful both as a language and as an example of how language works. If only we had more time...
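
If you were not in class, here is a rough sketch of the idea, written in Python purely for illustration. It is not our interpreter -- the expression forms and names below are made up for this post -- but it shows how one small mutable cell is enough to support assignment and sequencing on top of an otherwise functional evaluator:

    # A rough sketch, not our class interpreter: a tiny evaluator for a toy
    # expression language. The only mutable state is the Cell, which is what
    # makes set! and begin possible.

    class Cell:
        """A one-slot mutable box. Environments map names to Cells."""
        def __init__(self, value):
            self.value = value

    def evaluate(expr, env):
        """Evaluate an expression represented as a nested tuple."""
        kind = expr[0]
        if kind == "lit":                       # ("lit", 5)
            return expr[1]
        elif kind == "var":                     # ("var", "x")
            return env[expr[1]].value
        elif kind == "set!":                    # ("set!", "x", expr)
            value = evaluate(expr[2], env)
            env[expr[1]].value = value          # assignment = mutate the cell
            return value
        elif kind == "begin":                   # ("begin", expr, ..., expr)
            result = None
            for sub in expr[1:]:                # evaluate in order,
                result = evaluate(sub, env)     # return the last value
            return result
        elif kind == "+":                       # ("+", expr, expr)
            return evaluate(expr[1], env) + evaluate(expr[2], env)
        else:
            raise ValueError("unknown expression: " + repr(expr))

    # Increment x, then look it up.
    env = {"x": Cell(41)}
    program = ("begin",
               ("set!", "x", ("+", ("var", "x"), ("lit", 1))),
               ("var", "x"))
    print(evaluate(program, env))               # prints 42

All of the mutation lives in the cell; the evaluator itself stays a simple recursive function, which is why the change to the students' interpreter was so small.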

After class, I was talking with a student about this and that related to class, internships, and programming. He commented that he now groks what The Pragmatic Programmer says about writing your own tools and writing programs to generate code for you. I smiled and thought, yep, that's what programmers do.

40th Anniversaries

Today was one of the 40th anniversaries I mentioned six weeks ago: Douglas Engelbart's demonstration of a mouse-controlled, real-time, interactive, networked computer. SFGate heralds this as the premiere of the PC, but this event has always seemed more about interaction than personal computing. Surely, the kind of interactivity that Engelbart showed off was a necessary precursor to the PC, but this demonstration was so much more -- it showed that people can interact with digital media and, yes, programs in a way that connects with human needs and wants. Engelbart's ideas will outlive what we know as the personal computer.

No matter, though. The demonstration inspired a generation. A friend of mine sent a note to all his friends today, asking us to "drink a toast to Douglas Engelbart" and reminiscing on what personal computing means to many of us:

Think how much this has changed our lives... The communication capabilities allow us to communicate extremely quickly, throughout the globe. The PC, and Internet, allow me to have friends in Australia, Belfast, Brazil, China, Holland, India, Japan, London, Switzerland, and many other places. ... Can you even picture a world without PC's? I've seen and used them in remote places like Nosy Be, Madagascar, and Siem Reap, Cambodia.

The world is a different place, and many of us -- my friend included -- contribute to that. That humbles and awes me.

Programming Matters

Don't need programmers, only smart people? Or people with good "people skills"? Today on the Rebooting Computing mailing list, Peter Norvig wrote:

At Google, a typical team might be 1 software architect, 5 software designers who are also responsible for development, testing, and production, and one product manager. All 7 would have CS BS degrees, and maybe 2 MS and 2 PhDs. Programming is a significant part of the job for everyone but the product manager, although s/he typically has programming experience in the past (school or job). Overall, programming is a very large part of the job for the majority of the engineering team.

Sure, Google is different. But at the end of the day, it's not that different. The financial services companies that hire many of my university's graduates are producing business value through information technology. Maximizing value through computing is even more important in this climate of economic uncertainty. Engineering and scientific firms hire our students, too, where they work with other CS grads and with scientists and engineers of all sorts. Programming matters there, and many of the programmers are scientists. The code that scientists produce is so important to these organizations that people such as Greg Wilson would like to see us focus more on helping scientists build better software than on high-performance computing.

Those who can turn ideas into code are the masters of this new world. Such mastery can begin with meager steps, such as adding a few lines of code to an interpreter to make imperative programming come alive. It continues when a programmer looks at the result and says, "I wonder what would happen if..."


Posted by Eugene Wallingford | Permalink | Categories: Computing, Running, Teaching and Learning

November 22, 2008 7:19 AM

Code, and Lots Of It

Today, I was asked the best question ever by a high-school student.

During the fall, we host weekly campus visits by prospective students who are interested in majoring in CS. Most are accompanied by their parents, and most of the dialogue in the sessions is driven by the parents. Today's visitors were buddies from school who attended sans parents. As a part of our talking about careers open to CS grads, I mentioned that some grads like to move into positions where they don't deal much with code. I told them that two of the things I don't like about my current position are that I only get to teach one course each semester and that I don't have much time to cut code. Off-handedly, I said, "I'm a programmer."

Without missing a beat, one of the students asked me, "What hobby projects are you working on?"

Score! I talked about a couple of things I work on whenever I can: little utilities I'm growing for myself in Ruby and Scheme, and some refactoring support in Scheme. But the question was much more important than the answer.

Some people like to program. Sometimes we discover the passion in unexpected ways, as we saw in the article I referred to in my recent entry:

[Leah] Culver started out as an art major at the University of Minnesota, but found her calling in a required programming class. "Before that I didn't even know what programming was," she admits. ... She built Pownce from scratch using a programming language called Python.

Programmers find a way to program, just as runners find a way to run. I must admit, though, that I am in awe of the numbers Steve Yegge uses when talking about all the code he has written when you take into account his professional and personal projects:

I've now written at least 30,000 lines of serious code in both Emacs Lisp and JavaScript, which pales next to the 750,000 or so lines of Java I've [spit] out, and doesn't even compare to the amount of C, Python, assembly language or other stuff I've written.

Wow. I'll have to do a back-of-the-envelope estimate of my total output sometime... In any case, I am willing to stipulate to his claim that:

... 30,000 lines is a pretty good hunk of code for getting to know a language. Especially if you're writing an interpreter for one language in another language: you wind up knowing both better than you ever wanted to know them.

The students in our compiler course will get a small taste of this next semester, though even I -- with the reputation of a slave driver -- can't expect them to produce 30 KLOC in a single project! I can assure them that they will make a non-trivial dent in the 10,000 hours of practice they need to master their discipline. And most will be glad for it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 20, 2008 8:22 PM

Agile Thoughts: Humans Plus Code

Courtesy of Brian Marick and his Agile Development Practices keynote:

Humans plus running code are smarter than humans plus time.

We can sit around all day thinking, and we may come up with something good. But if we turn our thoughts into code and execute it, we will probably get there faster. Running a piece of code gives us information, and we can use that feedback to work smarter. Besides, the act of writing the code itself tends to make us smarter, because writing code forces us to be honest with ourselves in places where abstract thought can get away with being sloppy.

Brian offers this assertion as an assumption that underlies the agile software value of working software, and specifically as an assumption that underlies a guideline he learned from Kent Beck:

No design discussion should last more than 15 minutes without someone turning to a computer to do an experiment.

An experiment gives us facts, and facts have a way of shutting down the paths to fruitless (and often strenuous) argument we all love to follow whenever we don't have facts.

I love Kent Beck. He has a way of capturing great ideas in simple aphorisms that focus my mind. Don't make the mistake that some people make, trying to turn one of his aphorisms into more than it is. This isn't a hard and fast rule, and it probably does not apply in every context. But it does capture an idea that many of us in software development share: executing a real program usually gives us answers faster and more reliably than a bunch of software developers sitting around pontificating about a theoretical program.

As Brian says:

Rather than spending too much time predicting the future, you take a stab at it and react to what you learn from writing and running code...

This makes for a nice play on Alan Kay's most famous bon mot, "The best way to predict the future is to invent it." The best way to predict the future of a program is to invent it: to write the program, and to see how it works. Agile software development depends on the fact that software is -- or should be -- soft, malleable, workable in our hands. Invent the future of your program, knowing full well that you will get some things wrong, and use what you learn from writing and running the program to make it better. Pretty soon, you will know what the program should look like, because you will have it in hand.

To me, this is one of the best lessons from Brian's keynote, and well worth an Agile Thought.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 19, 2008 4:36 PM

Where Influential Women in Computing Come From

A recent article at Fast Company highlights some of the Most Influential Women in Web 2.0. This list "wasn't chosen by star power, nor by career altitude" but for the biggest innovations in the nebulous sphere of Web 2.0. This list is also not a random sample, which makes drawing conclusions from it dicey. Yet I could not help noticing how few of them have a CS background:

  • Sinha: cognitive neuroscience
  • Kaplan: economics, government, philosophy, and history
  • Page/Des Jardins/Stone: theatre/English lit/political science
  • Huffington: economics
  • Banister: HS drop-out
  • Bianchini: political science, then MBA
  • Fake: English
  • Trott: English
  • Hamlin: political economy and human rights
  • Mayer: systems science (after starting in biology and chemistry), then M.S. in CS
  • Culver: computer science (after starting in art)

Perhaps this is a non-representative sample of women in computing or Web 2.0. Or perhaps the range of backgrounds we see here, especially its tilt toward the humanities, says more about the social and interactive nature of Web 2.0 than about computer science. Perhaps it says something about women and what they want out of their academic majors.

Somehow, though, I think this list is a great example of why we need to broaden the public's conception of computing, in hopes that more might choose to major in CS. I think it is also a great example of why we as a discipline need to engage students all across the campus. We need to expose more (all?) students to the power of the ideas of computing and give them some of the skills they will need to apply it to whatever they decide to do.

I'm glad these innovators and women found computing along the way to turning their ideas into reality, but I'd like for us to eliminate some of the barriers that we erect between computer science and tomorrow's innovators.


Posted by Eugene Wallingford | Permalink | Categories: Computing

November 13, 2008 6:35 PM

Lest We Forget the Mathematicians

Just before I left for the SECANT workshop, I wrote an entry on programming based on a conversation with a colleague in the math department. Then I went off to SECANT, which gave me a chance to think about the intersection of CS and science, which took my mind off of the role CS plays in mathematics. But the gods are reminding us: Computer science is changing how some mathematicians work, and not just via programming.

The December 2008 issue of the Notices of the American Mathematical Society features four papers on the use of computers in mathematical proof, both to create new results and to ask and explore "new mathematical questions about the nature and technique of such proofs". This idea is known as formal proof. All four papers of the special issue are available as PDF files on the web site. (This is an excellent feature of Notices: the current issue is openly available to the community!)

The lead article of the issue, by Thomas Hales, lays out the problem for which Formal Proof is the solution. In a nutshell: mathematicians verify their results by a social process, which means those results are fallible.

Traditional mathematical proofs are written in a way to make them easily understood by mathematicians. Routine logical steps are omitted. An enormous amount of context is assumed on the part of the reader.

Many of you probably already know that this is true, from your own negative experience. I know that I often felt as if I was missing a lot of context when I was reading proofs in the abstract algebra book we used in grad school!

How are these proofs validated? By other mathematicians reading them and accepting them as valid. The best known and most important proofs have been read and understood by many mathematicians, so we can trust that they are probably correct. If there were something wrong with the result, someone would have found out by now -- either by finding an error in the proof or by discovering that something else breaks when we rely on it. Programmers know that this is how most programs are "proven" correct: by using them reliably for a long time under many different conditions.

One problem with this process is that some proofs are so long and so complicated that the number of people who can read and understand them is quite small. For the results that break the most new ground, even the best mathematicians have to learn a lot to understand the proof. These "proof assistants" may not have the time, energy, or attention span to validate a result well enough that we "know" it is correct.

A formal proof is an attempt to bypass the social process of mathematicians reading a result, providing the necessary context, filling in the details, and approving the result by automating those steps:

A formal proof is a proof in which every logical inference has been checked all the way back to the fundamental axioms of mathematics. All the intermediate logical steps are supplied, without exception. No appeal is made to intuition, even if the translation from intuition to logic is routine.

A formal proof will be less intuitive to most human readers, especially the advanced ones, but it should be less susceptible to errors by relying on formal specification of the context and of all the intermediate steps. "Show your work!"

Specifying all of the extra detail completely is something humans are not so good at. Many great mathematicians -- and programmers -- are great in some part because they are able to take improbable intuitive leaps onto new ground. But computers thrive on such tedious detail, so work on formal proof comes naturally to rest in the realm of programs that aid the process.

Computerized "proof assistants" are nothing new. Hales cites examples going back to 1954 and the Johniac computer. I can't go that far back (my parents were still in grade school!), but I remember working with programs of this sort back in the mid- to late-1980s as a grad student in AI. Prof. Rich Hall, a philosophy professor specializing in epistemology and philosophy of mind, was a treasured member of my Ph.D. committee. For the philosophy component of my comprehensive, Hall asked me to study two such programs, Tarski's World and Computer Assisted Logic Lessons (CALL), intended as logic tutors and proof assistants for beginners. (Tarski's World is discussed here. CALL was a program home-brewed at Michigan State. I found one reference to it via Google, in an MSU prof's syllabus from 1998.) My task was to identify the fundamental distinctions between the two programs, especially with regard to their respective assumptions, and to evaluate their instructional utility. Remarkably, I still have the paper I wrote for my exam -- formatted in nroff!

Formal proof sounds perfect: Let a program do the grunt work to validate even our most complex proofs. But formal proof offers up two problems. The first is that the human has to specify some of the initial detail for the program in some logical language. One thing that many years in AI taught me is that logic languages are just like programming languages, and writing proofs in them encounters all the same perils as writing a program. Writing proofs is the cross borne by mathematicians, though, and this problem is exactly the one formal proof seeks to solve (a recursive problem!): making sure that our proofs contain all the necessary detail, written and used correctly.

The second problem is that now we have to consider whether the computer program itself is correct. To use the program to validate a proof, we need first to validate the program. This is a different recursion. Fortunately, we have some things on our side. One is scale, both relative to large programs and to the proofs we hope to construct. As Hales points out:

The computer code that implements the axioms and rules of inference is referred to as the kernel of the system. It takes fewer than 500 lines of computer code to implement the kernel of HOL Light [a particular computer proof assistant]. (By contrast, a Linux distribution contains approximately 283 million lines of computer code.)

The "kernel" of the system is small enough to be amenable to validation in several ways. One is the social process used for other programs: make the code open source and let everyone and his brother study it and use it. I love the way Hales embraces this approach:

I wish to see a poster of the lines of the kernel, to be taught in undergraduate courses, and published throughout the world, as the bedrock of mathematics. It is math commenced afresh as executable code.

(The emphasis is mine. More on that later.)

This replaces the social process for validating proofs, up one level. Even if we validate the proof assistant only informally, it will be used repeatedly to validate proofs, and every use is an opportunity to find errors in the program. We can even use the program to validate proofs we already understand well just for the purpose of validating the program. (The world of mathematics is full of test data!) We use a social process to build a tool which then serves us over and over. Building tools is an essence of computer science.
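
To make the kernel idea concrete, here is a minimal sketch of an LCF-style kernel, written in Python purely for illustration -- HOL Light's real kernel is written in OCaml and implements higher-order logic, not the toy propositional fragment shown here. The shape of the idea is what matters: a theorem type that can be produced only by a handful of axioms and inference rules, so every theorem is backed by a chain of primitive inferences.

    # A toy LCF-style kernel: implication-only propositional logic, with
    # axioms K and S and modus ponens as the only way to build a Theorem.
    # This is a sketch of the idea, not any real proof assistant's code.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Var:
        name: str
        def __str__(self):
            return self.name

    @dataclass(frozen=True)
    class Implies:
        left: object
        right: object
        def __str__(self):
            return "(" + str(self.left) + " -> " + str(self.right) + ")"

    class Theorem:
        """A formula tagged as proved. In a real LCF-style system this
        constructor is hidden behind a module boundary, so only the
        rules below can create theorems."""
        def __init__(self, formula):
            self.formula = formula
        def __str__(self):
            return "|- " + str(self.formula)

    def axiom_k(p, q):
        """|- p -> (q -> p)"""
        return Theorem(Implies(p, Implies(q, p)))

    def axiom_s(p, q, r):
        """|- (p -> (q -> r)) -> ((p -> q) -> (p -> r))"""
        return Theorem(Implies(Implies(p, Implies(q, r)),
                               Implies(Implies(p, q), Implies(p, r))))

    def modus_ponens(pq, p):
        """From |- p -> q and |- p, conclude |- q."""
        if not isinstance(pq.formula, Implies) or pq.formula.left != p.formula:
            raise ValueError("modus ponens does not apply")
        return Theorem(pq.formula.right)

    # Derive |- p -> p, every step checked by the kernel.
    p = Var("p")
    s  = axiom_s(p, Implies(p, p), p)
    k1 = axiom_k(p, Implies(p, p))
    k2 = axiom_k(p, p)
    print(modus_ponens(modus_ponens(s, k1), k2))    # prints |- (p -> p)

Everything outside such a kernel -- tactics, libraries, search procedures -- can be as large and as clever as we like, because in the end it must hand its work to these few trusted rules. That is why validating a few hundred lines buys us confidence in proofs of enormous size.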

Given the small size of the program, we could also use formal methods to prove its correctness or at least offer evidence that it is correct. John Harrison used HOL Light to do something akin to a formal proof of its own soundness. This is the first time in a long while that I have read about Gödel's incompleteness theorem coming into play with a real program... This approach made me think of the compiler-writing technique known as bootstrapping, though that's not quite what Harrison did.

Finally, we might try to validate the program using another proof assistant. Hales calls this "exporting" a proof. The idea is this. Translate a proof written for one assistant into the language of another.

If a proof in one system is incorrect because of an underlying flaw in the theorem-proving program itself, then the export to a different system fails, and the underlying flaw is exposed.

For some reason, this reminded me of cross-compilation, where we use a compiler on one platform to generate code for another platform. The purpose of cross compiling is to propagate programs onto systems where they do not exist yet. The purpose of exporting is different, to increase our confidence in one or both proof assistants. When we combine the confidence we have in the program via social acceptance with the confidence we gain from validating exported proofs, we have even more reason to trust the program, and thus the proofs it validates. Our confidence grows.

This process reminds us all that math, while not a natural science, is imbued with the spirit of science:

With a computer -- indeed with any physical artifact, whether a codex, transistor, or a flash drive made of proteins from salt-marsh bacteria -- it is never a matter of achieving philosophical certainty. It is a scientific knowledge of the regularity of nature and human technology, akin to the scientific evidence that Planck's constant lies reliably within its experimental range. Technology can push the probability of a false certification ever closer to zero: 10^-6, 10^-9, 10^-12, ....

We never know our proofs are correct. We only have good reason to believe so. The same is true of programs. That's another reason for a computer scientist like me to be fascinated by this direction in mathematics.

The connection to the topic of the SECANT workshop is strong. Computing is helping to revolutionize how mathematicians work, just as it is revolutionizing how scientists work. Part of the math revolution will resemble the one in science, because some math research is itself inherently computational. The changes we talk about with formal proofs are a bit different, in that they are about how we validate results, not how we create them.

Both traditional mathematics and computational mathematics depend ultimately on validation. Formal proof is aimed at addressing the former, but what of the latter? Certainly some scientists have recognized the problem and embarked on efforts to solve it. I wrote about one such effort early this year, a simulation-and-documentation system that interleaves programs, their execution, and the papers written to publish the results. Not surprisingly, such thinking and the systems that implement it require a change in mindset, one that will likely come only after a long... social process.

Hales recognizes what the use of computing to generate and check proofs means for his discipline: It is math commenced afresh as executable code. I think many disciplines will find themselves redefined in just this way.


Posted by Eugene Wallingford | Permalink | Categories: Computing

November 10, 2008 7:31 PM

Workshop 6: The Next Generation of Scientists in the Workforce

[A transcript of the SECANT 2008 workshop: Table of Contents]

The last session of an eventful workshop consisted of two speakers. One was a last-minute sub for a science speaker who had to pull out. The sub, from Microsoft Research, didn't add much science content, but did say something I wish undergrads would pick up on. What do all companies look for these days? Short ramp-up time and self-starters. These boil down to curiosity and initiative.

The second speaker gave the sort of industry report I so enjoyed last year. David Spellmeyer, a Purdue computer science and chemistry grad, is CTO and CIO at Nodality. He titled his talk, "Computational Thinking as a Competitive Advantage in Industry". I love that title, because I love the ways computing confers a competitive advantage over companies that don't get it yet. The downside of Spellmeyer talking about his own company's competitive advantage: he can't post his slides.

Spellmeyer did tell us a bit about his company's science at various points in his story. Nodality works on patient-specific classification of disease and response to therapies. At least part of that involves evaluating phosphoprotein-signaling networks. (I hope that doesn't give too much away.)

He looks for computational thinking skills in all of the scientists Nodality hires. His CT wish list included items familiar and surprising:

  • familiarity with the complexity of computing
  • exposure to programming languages
  • analytical methods for experimental studies
  • familiarity with the technology and inner workings of the computer, especially databases

Edvard Munch's Scream

These skills give a competitive advantage to his company -- and also to the individual! The company is able to do more, better, and faster. The individual has better judgment across a wider range of problems. These advantages intersect at a point where computational thinking demystifies the computer, computer systems, and programming. Understanding even a little about computers and programs helps to dispel the myth of the perfect computer and the perfect computer system. Those myths create frustrations that grow into more. (Spellmeyer used another image to drive this point home: Hitchcock's North by Northwest.)

How does computational thinking help the company do more, better, and faster? By...

  • ... letting scientists spend more time doing what they love.
  • ... eliminating low-value-add transactional activities in the business process.
  • ... boosting the speed and scalability of their systems.

Notice that these advantages range from the scientific to business process to the technical. It's not only about techies sitting in front of monitors.

On the scientific side of the equation, Nodality has a data problem. A robust assay produces a flood of data:

10^6 cells/patient x 50 patients/experiment x 20 challenges x 20 markers
→ 10^10 data points per experiment

Thereafter followed a lot of detail that I couldn't follow in real time, which is probably just as well. There is a reason that Spellmeyer can't post his slides...

How do they eliminate low-value-added transactional activities?

  • Talk to customers.
  • Find patterns of practice.
  • Propose computational tools to improve practice.
  • Use an agile approach to gather requirements, design a system, field it, get feedback, and iterate in short cycles.

Computational thinking enables scientists and techies to think of their experiments, and how to set them up, in different ways. For example, they might conceive of a way to set up a cytometer differently. They also think differently about experiment analysis and inventory management.

As Spellmeyer wrapped up, he included a few snippets to motivate his ideas and the scale of the problems that he and his company face. He quoted Margaret Wheatley as saying that all science is a metaphor, a description of a reality we can never fully know. As a pragmatist, this is something I believe almost from the outset. He also said that in business, learning occurs naturally through normal interactions in work practices. Not in classes. "Context, community, and content" are the triumvirate that drives all they do. For this reason, his company puts a lot of effort into its community software tools.

The problem ultimately comes down to an issue at the intersection of combinatorics, pragmatics, and even ethics. We can make billions of unique molecules. Which ones should we make? We need to consider molecules similar enough to ones we understand but dissimilar enough to offer hope of a new result. This leads to a question of similarity and dissimilarity, one of those AI-complete tasks. There is room for a lot of great algorithm exploration here.

Finally, Spellmeyer weighed in on a hot topic from the previous session: Excel is a basic tool in his company. The business guys have developed an extremely complex business model, and all of their work is in Excel. But it's not just a work horse on the business side; scientists use Excel to transform data. He is happy to find scientists and techies alike who know how to use Excel at full strength.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 07, 2008 8:59 AM

Workshop 5: Curriculum Development

[A transcript of the SECANT 2008 workshop: Table of Contents]

This session was labeled "birds-of-a-feather", likely a track for short talks that didn't fit very well elsewhere. The most common feather was curriculum, efforts to develop it and determine its effect.

First up was Ruth Chabay, on looking in detail at students' computational thinking skills. She is involved in a pilot study that aims to answer the question, "Do students think differently about physics after programming?" This is the sort of outcomes assessment that people who develop curriculum rarely do. Even CS faculty -- despite the fact that we would never think of writing programs and not checking to see whether they ran correctly. This study is mostly question and method at this point, with only hints at answers.

The research methodology is a talk-aloud protocol with videotaping of the participants' behavior. Chabay showed an illustrative video of a student reasoning through a very simple program, talking about the problem. I'd love to be able to observe students from some of my courses in this way. It would be hard to gather useful quantitative data, but the qualitative results would surely give some insight into what some students are thinking when they are going their own way.

Next up was Rubin Landau, who developed a Computational Physics program at Oregon State. He started with a survey from the American Physical Society which reported what physics grads do 5 years after they leave school. A large percentage are involved in developing software, but alumni said that the number one skill they use is "scientific problem solving". Even for those working in scientific positions, the principles of physics are far from the most important skill. Landau stressed that this does not mean that physics isn't important; it's just that students don't graduate to repeat what they learned as an undergrad. In Landau's opinion, much of physics education is driven by the needs of researchers and graduate students. Undergraduate curriculum is often designed as a compromise between those forces and the demands of the university and its undergraduates.

Landau described the Computational Physics curriculum they created at Oregon State with the needs of undergrad education as the driving force. I don't know enough physics to follow his description in real time, but I noticed a few features. Students should learn two "compiled languages"; it doesn't really matter which, though he still likes it if one is Fortran. The intro courses introduce many numerical analysis concepts involving computation (underflow, rounding). This course is now so well settled that they offer it on-line with candid video mini-lectures. Upper-division courses include content that students may well work with in industry but which has disappeared from other schools' curricula, such as fluid dynamics.

Landau is fun to listen to, because he has an arsenal of one-liners at the ready. He reported one of his favorite computational physics student comments:

Now I know what's "dynamic" in thermodynamics!

Bruce Sherwood reported a physics student comment of his own: "I don't like computers." Sherwood responded, "That's okay. You're a physicist. I don't like them either." But physics students and professors need to realize that saying they don't like computers is like saying, "I don't like voltmeters." If you can't work with a voltmeter or a computer, you are in the wrong business. That's just the way the world is.

My favorite line of Landau's is one that applies as well to computer science as to physics:

We need a curriculum for doers, not monks.

The next two speakers were computer scientists. James Early described a project in which students are developing learner-centered content for their introductory computer science course. This project was motivated by last year's SECANT workshop. Most of the students in their intro course are not CS majors. The goal of the project is to excite these students about computation, so they'll take it back to their majors and put it to good use. I immediately thought, "I'd like to have CS majors get excited about computation and take it back to their major, too!" Too few CS students take full advantage of their programming skills to improve their own academic lives...

Resource link to explore: the Solomon-Felder Index of Learning Styles, which has gained some market share in the engineering world. Besides, it's on-line and free!

Mark Urban-Lurain closed the session by describing the CPATH project at Michigan State, my old graduate school stomping grounds. This project is aimed at creating a process for redesigning engineering curriculum. But much of the interesting discussion revolved around the fact that most engineering firms request that students have computational skills in... Excel! Several of the CS faculty in the room nodded their heads, because they have pointed this out to their colleagues and run into a stone wall. CS departments balk at such "tools". Now, Excel is not my tool of choice, but macros really are a form of programming. I've been following with interest some work in the Haskell community on programming in spreadsheets (see some of the papers here). We in CS have more powerful tools to use and teach, but we also need to meet users of computation where they live. And in many domains, that is the spreadsheet.

I ended the workshop by chatting with Urban-Lurain, with whom I came into contact as a teaching assistant. His colleague on this CPATH project is my doctoral advisor. It is a small world.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 03, 2008 7:19 PM

Workshop 4: Computer Scientists on CS Education Issues

[A transcript of the SECANT 2008 workshop: Table of Contents]

The first day of the workshop ended with two panels of two computer scientists each. The first described two current projects on introductory CS courses, and the second presented two CPATH projects related to the goals of SECANT. I either knew about these projects already or was familiar with their lessons from my department's experiences, so I didn't take notes in quite as much detail. Then again, maybe I was just tiring after a long day of good stuff.

On intro CS, Deepak Kumar talked about Learning Computing with Robots, which has developed a course that serves primarily non-majors, with a goal of broadening interest in computing, even as a general education course. This course teaches computing, not robotics. Kumar mentioned that the cost of materials is no longer the issue it once was. They have built the course around a robot kit that costs in the neighborhood of $110 -- about the same price as a textbook these days!

Next, Tom Cortina talked about Teaching Key Principles of Computer Science Without Programming. In many ways, Cortina was swimming against the tide of this workshop, as he argued that non-majors could (should?) learn CS minus the programming. There certainly is a lot of cool stuff that students can learn using canned tools, talking about history, and doing some light math and logic. Cortina's course in particular covers a lot of neat material about algorithms. But still I think students miss out on something useful -- even central to computing -- when they bypass programming altogether. However, if the choice is between this course and a majors-style course that leaves non-majors confused, frustrated, or hating CS, well, then, I'll take this!

The second "panel" presented two related CPATH projects. Valerie Barr of Union College described efforts creating a course in computational science across the curriculum at Union and Lafayette College. The key experience she reported was on how to build an initial audience for the course, so that later word of mouth can spread. Barr's experience sound familiar: blanket e-mail to faculty tends not to work well, but one-on-one conversations with faculty do -- especially ongoing contact and continued conversation. This sort of human contact is time-intensive, which makes it hard to scale as you move to schools much larger than Union or Lafayette. Barr said that they had had good luck dealing with people in their Career Center, who could tell students how useful computational skills are across all the majors on campus. At my school, we have had similar good results working with people in Academic Advising and Career Services. They seem to get the value of computational skills as well as or better than faculty across campus, and they have different channels than we do for reaching students over the long term.

Finally, Lenny Pitt described the iCUBED project at the University of Illinois. The one content fact I remember from Pitt's talk is that they are working to develop applied CS programs and "CS + <X>" programs within other departments. The most memorable part of his talk for me, though, was how he had reconfigured the project's acronym (which they inherited from enabling policy or legislation) based on the workshop's theme and 2008 mantra: "Infiltration: Computing Used By Every Discipline." Creative!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 31, 2008 10:52 AM

SECANT This and That

[A transcript of the SECANT 2008 workshop: Table of Contents]

As always, at this workshop I have heard lots of one-liners and one-off comments that will stick with me after the event ends. This entry is a place for me to think about them by writing, and to share them in case they click with you, too.

The buzzword of this year's workshop: infiltration. Frontal curricular assaults often fail, so people here are looking for ways to sneak new ideas into courses and programs. An incremental approach creates problems of its own, but agile software proponents understand its value.

College profs like to roll their own, but high-school teachers are great adapters. (And adopters.)

Chris Hoffman, while describing his background: "When you reach my age, the question becomes, 'What haven't you done?' Or maybe, 'What have you done well?'"

Lenny Pitt: "With Python, we couldn't get very far. Well, we could get as far as we wanted, but students couldn't get very far." Beautiful. Imagine how far students will get with Java or Ada or C++.

Rubin Landau: "multidisciplinary != interdisciplinary". Yes! Ideas that transform a space do more than bring several disciplines into the same room. The discipline is new.

It's important to keep in mind the relationship between modeling and computing. We can do modeling without computing. But analytical models aren't feasible for all problems, and increasingly the problems we are interested in fall into this set.

Finally, let me re-run an old rant by linking to the original episode. People, when you are second or third or sixth, you look really foolish.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

October 31, 2008 10:26 AM

Workshop 3: Computational Thinking in Physics

[A transcript of the SECANT 2008 workshop: Table of Contents]

As much as computation is now changing biology, it has already changed physics. Last year's workshop had a full complement of physicists and astronomers. In their minds, it is already clear that physicists must program -- even students learning intro physics. The question is, what problems do they face in bringing more computation to physics education? This panel session shared some physicists' experience in the trenches. Bruce Sherwood, the panel chair, set the stage: We used to be able to describe physics as theory, experiment, and the interplay between the two. This is no longer true, and it hasn't been for a while. Physics is now theory, experiment, simulation, and the interplay among the three! Yet this truth is not reflected in the undergraduate physics curriculum -- even at so-called "respectable schools".

Rubin Landau described a systemic approach, a Computational Physics major he designed and implemented at Oregon State. He was motivated by what he saw as a turning inward of physics, efforts to cover all of the history of physics in the undergrad curriculum, with a focus on mathematics from the 19th century, and not looking outward to how physics is done today. (This CS educator felt immediate empathy for Landau's plight.) He noted his own embarrassment: computational physicists at major physics conferences who refuse to discuss their algorithms or the verification of their programs. This is simply not part of the culture of physics.

Students learn by doing, so projects are key to this Computational Physics curriculum. Students use a "compiled language", which is Landau's way to distinguish programming in a CS-style language from Mathematica and Maple. For him, the key is to separate the program from the engine; students need to see the program as an idea. Two languages are better than one, as that gives students a chance to generalize the issues at play in using computation for modeling.

The OSU experience is that the political issues in changing the curriculum are much tougher to solve than the academic issues: the need for budget, the resistance of senior faculty, the reluctance of junior faculty to risk tenure, and so on.

Landau closed by saying that, for physics-minded students, using computation in physics and then taking a CS course seems to work best. He likened this to the use of Feynman diagrams in grad school: students learn to calculate with them, and then learn the field theory behind them the next year. His undergrads have several "A-ha!" moments throughout CS1. I suspect that this approach would work for a lot of CS students, too, if we can get them to use computation. Media computation is one avenue I've seen work with some.

Next up was Robert Swendsen, from Carnegie-Mellon. In the old days, physicists wrote programs because they did not know how to solve a problem analytically. Now, they compute to solve problems that no one knows how to solve analytically. (Mental note: It also lets them ask new questions.) The common problem many of us face: we tend to teach the course we took -- something of a recursion problem. (Mental note: Where is the base case? Aristotle, I suppose.)

Swendsen identified a few other challenges. Students are used to looking at equations, even if they don't get as much from them as we do, but they have no experience looking at and reasoning about data. They struggle even with low-level issues such as accuracy in terms of the number of significant digits. Further, many students do not think that computational physics is "real" physics. To them, physics == equations.

This is a cultural expectation across the sciences, a product of the last few centuries of practice. Nor is it limited to students; people out in the world think of science as equations. Perhaps they pick this notion up in their high-school courses, or even in their college courses. I think that faculty in and out of the sciences share this misperception as well. The one exception is probably biology, which may account for part of its popularity as a major -- no math! no equations! I couldn't help but think of Bernard Chazelle's efforts to popularize the notion that the algorithm is the idiom of modern science.

Listening to Swendsen, I also had an overriding sense of deja vu, back to when CS faculty across the country were trying to introduce OO thinking into the first-year CS curriculum. Curriculum change must share some essential commonalities due to human nature.

Physicist Mark Haugan focused on a particular problem he sees: a lack of continuity across courses in the physics curriculum with respect to computation. Students may use computation in one course and then see no follow-through in their next courses. In his mind, students need to learn that computation is a medium for expressing ideas -- a theme regular readers of this blog will recognize. Mathematical equations are one medium, and programs are another. I think the key is that we need to discuss and work with problems where computation matters -- think Astrachan's Law -- problems for which the lack of computation would limit our ability to understand and solve the problem. This, too, echoes the OO experience in computer science education. We still face the issue that other courses and other professors will do things in a more traditional way. This is another theme common to both SECANT workshops: we need to help students feel so empowered by computation that they use it unbidden in their future courses.

The Q-n-A session contained a wonderful thread on the idea of physics as a liberal art. One person reported a comment made by a student who had taken a computational physics course and then read a newspaper article on climate modeling:

Wow. Now I know what that means.

I can think of no higher "student learning outcome" we in computer science can have for our general education and introductory programming courses: Wow. Now I know what that means.

There are many educated people who don't know what "computer model" means. They don't understand what is reported in the news. There are many educated people reporting the news who don't understand the news they are reporting.

That's not right.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 30, 2008 8:39 PM

Workshop 2: Computational Thinking in the Health Sciences

[A transcript of the SECANT 2008 workshop: Table of Contents]

The next session of the workshop was a panel of university faculty working in the health sciences, talking about how they use computation in their disciplines and what the key issues are. The panel chair, Raj Acharya, from Penn State's Computer Science and Engineering department, opened with the bon mot "all science is computer science", a reference to a 2001 New York Times piece that I have been using for the last few years when speaking to prospective students, their parents, and other faculty. By itself, this statement sounds flip, but it is true in many ways. The telescope astronomers use today is as much a computational instrument as a mechanical one. Many of the most interesting advances in biology these days are really bioinformatics.

The dawn of big data is changing what we do in CS, but it's having an even bigger effect in some other sciences by creating a new way to do science. Modeling is a nascent research method based in computation: propose a model, test it against the data, and iterate. Data mining is an essential step in this new process: all of the data goes into a box, and the box has to make sense of the data. This swaps two steps in the traditional scientific method... Instead of forming a hypothesis and then testing it by collecting data, a scientist can mine a large collection of data to find candidate hypotheses, and then confirm with more traditional bench science and by checking models against other and larger data sets.

Tony Hazbun, who works in the School of Pharmacy at Purdue, talked about work in systems biology. He identified four key ideas that biologists need to learn from computer science, which echoed a talk from last year's workshop:

  • data visualization
  • database management (relational, not flat)
  • data classification (cluster analysis)
  • modeling

Hazbun made one provocative claim that I think hits the heart of why this sort of science is important. We mine data sets to see patterns that we probably would not have seen otherwise. This approach is more objective than traditional science, in which the hypotheses we test are the ones we create out of our own experience. That is a much more personal approach -- and thus more subjective. Data mining helps us to step outside our own experience.

Next up was Daisuke Kihara, a Purdue bioinformatician who was educated in Japan. He talked about the difficulties he has had building a research group of graduate students. The main problem is that biology students have few or no skills in mathematics and programming, and CS students know little or no biology. In the US, he said, education is often too discipline-specific, with not enough breadth, which limits the kind of cross-fertilization needed by researchers in bioinformatics. My university created an undergraduate major in Bioinformatics three years ago in an effort to bridge this gap, in part because biotechnology is an industry targeted for economic development in our state.

(My mind wandered a bit as I thought about Kihara's claim about US education. If he is right, then perhaps the US grew strong technically and academically during a time when the major advances came within specific disciplines. Now that the most important advances are coming in multidisciplinary areas, we may well need to change our approach, or lose our lead. I've been concerned about this for a year or so, because I have seen the problem of specializing too soon creeping down into our high schools. But then I wondered, is Kihara's claim true? Computer science has a history grounded in applications that motivate our advances; I think it's a relatively recent phenomenon that we spend most of our time looking inward.)

In addition to technical skills and domain knowledge, scientists of the future need the elusive "problem-solving skills" we all talk about and hope to develop in our courses. Haixu Tang, from the Informatics program at Indiana, contrasted the mentality of what he called information technology and scientific computing:

  • technique-driven versus problem-driven
  • general models versus specific, even novel, models
  • robust, scalable, and modular software versus accurate, efficient programs

These distinctions reflect a cultural divide that makes integrating CS into science disciplines tough. In Tang's experience, domain knowledge is not the primary hurdle, but he has found it easier to teach computer scientists biology than to teach biologists computer science.

Tang also described the shift in scientific method that computing enables. In traditional biology, scientists work from hypothesis to data to knowledge, with a cycle from data back to hypothesis. In genome science, science can proceed from data to hypothesis to knowledge, with a cycle from hypothesis back to data. The shift is from hypothesis-driven science to data-driven science. Simulation has joined theory and statistics in the methodological toolbox.

In the Q-n-A session that followed the panel, someone expressed concern with data-driven research. Too many people don't go back to do the experiments needed to confirm hypotheses found via data mining or to verify their data by independent means. The result is bad science. Olga Vitek, a statistical bioinformatician, replied that the key is developing skill in experimental design. Some researchers in this new world are learning the hard way.

The last speaker was Peter Waddell, a comparative biologist who is working to reconstruct the tree of life based on genome sequences. One example he offered was that the genome record shows primates' closest relatives to be... tree lemurs and shrews! This process is going slowly but gaining speed. He told a great story about shotgun sequencing, BLAST, and the challenges in aligning and matching sequences. I couldn't follow it, because I am a computer scientist who needs to learn more biology.

When Waddell began to talk about some of the computing challenges he and his colleagues face, I could follow the details much better. They are working with a sparse matrix that will have between 10^2 and 10^3 rows and between 10^2 and 10^9 (!!) columns. The row and column sums will differ, but he needs to generate random matrices having the same row and column sums as the original matrix. In his estimation, students almost need to have a triple major in CS, math, and stats, with lots of biology and maybe a little chemistry thrown in, in order to contribute to this kind of research. The next best thing is cross-fertilization. His favorite places to work have been where all of the faculty lunch together, where they are able to share ideas and learn to speak each other's languages.
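Waddell didn't show code, but the constraint he describes is a classic one. For the common case of a binary presence/absence matrix, one standard trick is swap randomization: repeatedly find a 2x2 "checkerboard" submatrix and flip it, which leaves every row and column sum unchanged. Here is a minimal Python sketch of that idea -- my own illustration, assuming a 0/1 matrix, not anything from his talk:

    import random

    def swap_randomize(matrix, attempts=100000):
        """Shuffle a 0/1 matrix while preserving all row and column sums."""
        m = [row[:] for row in matrix]              # work on a copy
        n_rows, n_cols = len(m), len(m[0])
        for _ in range(attempts):
            r1, r2 = random.sample(range(n_rows), 2)
            c1, c2 = random.sample(range(n_cols), 2)
            a, b = m[r1][c1], m[r1][c2]
            p, q = m[r2][c1], m[r2][c2]
            if a == q and b == p and a != b:        # found a 2x2 checkerboard
                m[r1][c1], m[r1][c2] = b, a         # flip it to the other pattern;
                m[r2][c1], m[r2][c2] = q, p         # every row and column sum is unchanged
        return m

Of course, for a matrix with 10^9 columns, blind sampling like this is hopeless; you need exactly the kind of mathematical and statistical cleverness Waddell was asking for in his cross-trained students.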

Waddell's remark about faculty lunching together led to another question, because it "raised the hobgoblin of multidisciplinary research": an undergraduate needs seven years of study in order to prepare for a research career -- and that is only for the best students. Average undergrads will need more, and even that might not be enough. What can we do? One idea: redesign the whole curriculum to be interdisciplinary, with problems, mathematics, computational thinking, and research methods taught and reinforced everywhere. Graduating students will not be as well-versed in any one area, but perhaps they will be better at solving problems across the boundaries of any single discipline.

This isn't just a problem for multidisciplinary science preparation. We face the same problem in computer science itself, where the software development side of our discipline requires a variety of skills that are often best learned in context. The integrated curriculum suggestion made here makes me think of the integrated apprenticeship-style curriculum that ChiliPLoP produced this year.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 30, 2008 1:31 PM

Workshop 1: A Course in Computational Thinking

[A transcript of the SECANT 2008 workshop: Table of Contents]

To open the workshop, the SECANT faculty at Purdue described an experimental course they taught last spring, Introduction to Computational Thinking. It was designed by a multi-disciplinary team from physics, chemistry, biology, and computer science for students from across the sciences.

The first thing that jumped out to me from this talk was that the faculty first designed the projects that they wanted students to do, and then figured out what students would need to know in order to do the projects. This is not a new idea (few ideas are), but while many people talk about doing this, I don't see as many actually doing it. It's always interesting to see how the idea works in practice. Owen Astrachan would be proud.

The second was the focus on visualization of results as essential to science and as a powerful attractor for students. It is not yet lunch time on Day 1, but I have heard enough already to say that visualization will be a key theme of the workshop. That's not too surprising, because visualization was also a recurring theme in last year's workshop. Again, though, I am glad to be reminded of just how important this issue is outside the walls of the Computer Science building. It should affect how we prepare students for careers applying CS in the world.

The four projects in the Purdue course's first offering were:

  • manipulating digital audio -- Data representation is a jump for many students.
  • percolation in grids -- Recursion is very hard, even for very bright students. Immediate feedback, including visualization, is helpful. (A recursive sketch follows this list.)
  • Monte Carlo simulation of a physical system
  • protein-protein interaction -- Graph abstractions are also challenging for many students.
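To give a flavor of the percolation project, here is a minimal sketch in Python, the course's language -- my own toy, not the Purdue team's code. The heart of it is a recursive flood fill that asks whether open cells connect the top of a grid to the bottom:

    import random
    import sys

    sys.setrecursionlimit(100000)        # the naive recursion gets deep on large grids

    def percolates(grid):
        """Return True if open cells (1s) connect the top row to the bottom row."""
        n = len(grid)
        visited = [[False] * n for _ in range(n)]

        def fill(r, c):
            if r < 0 or r >= n or c < 0 or c >= n:   # off the grid
                return
            if grid[r][c] == 0 or visited[r][c]:     # blocked, or already seen
                return
            visited[r][c] = True
            fill(r + 1, c); fill(r - 1, c)           # recurse into the four neighbors
            fill(r, c + 1); fill(r, c - 1)

        for c in range(n):                           # pour in along the whole top row
            fill(0, c)
        return any(visited[n - 1])                   # did anything reach the bottom row?

    # immediate feedback: how often does a random 20x20 grid percolate when 60% of cells are open?
    trials = 1000
    grids = ([[1 if random.random() < 0.6 else 0 for _ in range(20)] for _ in range(20)]
             for _ in range(trials))
    print(sum(percolates(g) for g in grids) / trials)

Even in twenty-odd lines, students meet nested functions, recursion, and the call stack -- and they see why the stack matters the first time a big grid blows past the recursion limit.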

This looks like a broad set of problems, the sort of interdisciplinary science that the core natural sciences share and which we computer scientists often miss out on. For CS students to take this course, they will need to know a little about the several sciences. That would be good for them, too.

Teaching CS principles to non-CS students required the CS faculty to take an approach unlike what they are used to. They took advantage of Python's strengths as a high-level, dynamic scripting language to use powerful primitives, plentiful libraries, and existing tools for visualizing results. (They also had to deal with its weaknesses, not the least of which for them was the delayed feedback about program correctness that students encounter in a dynamically-typed language.) They delayed teaching the sort of software engineering principles that we CS guys love to teach early. Instead, they tried to introduce abstractions only on a need-to-know basis.

Each project raised particular issues that allowed the students to engage with principles of computing. Audio manipulation exposed the idea of binary representation, and percolation introduced recursion, which exposed the notion of the call stack. Other times, the mechanics of writing and running programs exposed underlying computing issues. For example, when a program ran slower than students expected on the basis of previous programs, they got to learn about the difference in performance between primitive operations and user-defined functions.
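That last lesson is easy to recreate. A throwaway sketch -- mine, not the course's -- that times Python's built-in sum against a hand-written loop makes the gap plain:

    from timeit import timeit

    data = list(range(1000000))

    def my_sum(xs):
        total = 0
        for x in xs:                 # one interpreted step per element
            total += x
        return total

    print(timeit(lambda: sum(data), number=10))      # built-in, implemented in C
    print(timeit(lambda: my_sum(data), number=10))   # user-defined: several times slower

Students don't need to know what a bytecode interpreter is to read those two numbers; the numbers are what make them ask.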

The panelists reported lessons from their first experience that will inform their offering next spring:

  • The problem-driven format was a big win.
  • Having students write meaningful programs early was a big win.
  • Having students see the results of their programs early via visualization was a big win.
  • Python worked well in these regards.
  • The science students' interest in computing is bimodal. Computing either has a strong appeal to them almost immediately, or the student exhibits strong resistance to computing as a tool.
  • On the political front, interaction with science faculty is essential to succeeding. They have to buy in to this sort of course, as do administrators who direct resources.

One of the open questions they are considering is, do they need or want to offer different sections of this course for different majors? This is a question many of us are facing. Having a more homogeneous student base would allow the use of different kinds of problem and more disciplinary depth. But narrowing the problem set would lose the insight available across disciplines. At a school like mine, we also risk spreading the student base so thin that we are unable to offer the courses at all.

Somewhere in this talk, speaker Susanne Hambrusch, the workshop organizer and leader, said something that made me think about what in my mind is the key to bringing computation to the other disciplines most naturally: We need to leave students thinking, "This helps me answer questions in my discipline -- better, or faster, or ...". This echoed something that Ruth Chabay said at the end of last year's workshop. Students who see the value of computation and can use computation effectively will use computation to solve their own problems. That should be one of the primary goals of any course in computing we teach for students outside of CS.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 30, 2008 10:08 AM

Notes on the SECANT Workshop: Table of Contents

This set of entries records my experiences at the 2008 SECANT workshop, October 30-31, hosted by the Department of Computer Science at Purdue University.

Primary entries:

  • Workshop 1: A Course in Computational Thinking
    -- SECANT a year later
  • Workshop 2: Computational Thinking in the Health Sciences
    -- big data is changing the research method of science
  • Workshop 3: Computational Thinking in Physics
    -- bringing computation to the undergrad physics curriculum
  • Workshop 4: Computer Scientists on CS Education Issues
    -- bringing science awareness to computer science departments
  • Workshop 5: Curriculum Development
    -- some miscellaneous projects in the trenches
  • Workshop 6: The Next Generation of Scientists in the Workforce
    -- computational thinking as competitive advantage

Ancillary entries:


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 29, 2008 9:11 PM

Information, Dystopia, and a Hook

On my drive to Purdue today, I listened to the first 3/4 of Caleb Carr's novel, "Killing Time". This is not a genre I read or listen to often, so it's hard for me to gauge the book's quality. If you are inclined, you can read reviews on-line. At this point, I would say that it is not a very good book, but it delivered fine escapism for a car ride on a day when I needed a break more than deep thought. But it did get me to thinking about... computer science. The vignette that sets up the novel's plot is based on a typical use case for Photoshop, or a homework assignment in a media computation CS1 course.

Carr describes a world controlled by "information barons", a term intended to raise the specter of the 19th century's rail barons and their control of wealth and commerce. The central feature of his world in 2023 is deception -- the manipulation of information, whether digital or physical, to control what people think and feel. The novel's opening involves the role a doctored video plays in a presidential assassination, and later episodes include doctored photos, characters manufactured via the data planted on the internet, the encryption of data on disk, and real-time surveillance of encrypted communication.

If students are at all interested in this kind of story, whether for the science fiction, the intrigue, or the social implications of digital media and their malleability, then we have a great way to engage them in computing that matters. It's CSI for the computer age.

Carr seems to have an agenda on the social issues, and as is often the case, such an agenda interferes with the development of the story. His characters are largely cut-outs in service of the message. Carr paints a dystopian view striking for its unremitting focus on the negatives of digital media and of science's increasing understanding of the world at a molecular level. The book seems unaware that biology and chemistry are helping us to understand diseases, create new drugs, and design new therapies, or that computation and digital information create new possibilities in every discipline and part of life. Perhaps it is more accurate to say that Carr starts with these promises as his backdrop and chooses to paint a world in which everything that could go wrong has. That makes for an interesting story but ultimately an unsatisfying thought experiment. For escapism, that may be okay.

After my previous entry, I couldn't help but wonder whether I would have the patience to read this book. I have to think not. How many pages? 274 pages -- almost slender compared to Perec's book. Still, I'm glad I'm listening and not reading.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

October 25, 2008 7:33 AM

40th Anniversaries

There are two big 40th anniversary events coming up for those of us in computer science. On November 5, the Computer History Museum is hosting the 40th Anniversary of the Dynabook, with Alan Kay, Charles Thacker, and Mary Lou Jepsen. Then on December 9, Stanford is hosting Engelbart and the Dawn of Interactive Computing: SRI's Revolutionary 1968 Demo. Much of the last 40 years of technology has been an evolution toward the ideas embodied in Kay's FLEX machine and Engelbart's mouse-controlled, real-time, interactive, networked computer. These ideas showed us what was possible. In addition to technological vision, Kay and Engelbart also shared a greater goal: "to use computing to augment society's collective intellect and ability to solve the complex issues of our time".

I expect that 40 is going to be a common number in computer science celebrations in the next few years.


Posted by Eugene Wallingford | Permalink | Categories: Computing

October 24, 2008 12:10 PM

I've Been Reddited

I don't know if "reddited" is a word like "slashdotted" yet, but I can say that yesterday's post, No One Programs Any More, has reached #1 on Reddit's programming channel. This topic seems to strike a chord with a lot of people, both in the software business and in other technology pursuits. Here are my favorite comments so far:

I can't think of a single skill I've learned that has had more of an impact on my life than a semi-advanced grasp of programming.

This person received some grief for ranking learning how to program ahead of, say, learning how to eat, but I know just what the commenter means. Learning to program changes one's mind in the same way that learning to read and write does. Another commenter agreed:

It's amazing how after a year of programming at university, I began to perceive the world around me differently. The way I saw things and thought about them changed significantly. It was almost as if I could feel the change.

Even at my advanced age, I love when that feeling strikes. Beginning to understand call/cc felt like that (though I don't think I fully grok it yet).

My favorite comment is a bit of advice I recommend for you all:

I will not argue with a man named Eugene.

Reddit readers really are smart!


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

October 23, 2008 9:58 PM

No One Programs Any More

One of my colleagues in the Math department sent me some e-mail today:

I am a constant advocate for our (math) majors receiving some sort of 'computer-programming experience' before they graduate.

Of course not all of my colleagues are as enthusiastic about this... In fact, at a recent meeting, someone stated: "No one programs any more." This was the basis of their argument against requiring a programming course... and it turns out that several people believe this statement.

He asked for my reaction to their stance.

This request comes as I prepare to attend the second SECANT workshop at Purdue next week. Last fall I wrote several articles about the inaugural workshop for this NSF-funded project. The NSF must think that programming and, more generally, computer science are important beyond the walls of the CS building, because it has funded projects like SECANT, the goal of which is to:

... bring together computer scientists and natural scientists who recognize that computing has become indispensable to scientific inquiry and is set to permeate science in a transformative manner.

Most of the attendees last year, and many on the roster for this year, are scientists: physicists, biologists, chemists, and astronomers. They all program in some form, because CS has redefined how they do science. Some of them are developing curricula in their disciplines that are programming-based so that future grads are better prepared for their careers.

In the time since I joined the faculty here, many departments have dropped the computer programming requirement from their majors. Part of the reason is probably that the intro programming courses were not meeting their students' needs, and our department needs to take responsibility for that. But a big part of the reason is that many faculty across campus believe as the Math faculty do, that their students don't need to learn computer programming anymore. Not too surprisingly, I disagree.

We have started to see some movement in the other direction. The Physics department now requires an introductory programming course because so many physicists need to know how to write and modify simulation programs that serve as their experiments. One result has been a steady stream of students in our intro C course, which focuses on scientific applications. Another is an ongoing research relationship among a member of the Physics faculty, a member of the CS faculty, and undergraduates from both departments that has produced several papers (with undergrad co-authors) and occasional award recognition. None of this research is possible without physics students being able to program complex molecular system simulations.

Scientists are not the only non-CS people who need to program -- or want to. People working in finance and other areas of business program, even if only in the form of complex spreadsheets, which are constraint propagation programs. Even further afield, artists are beginning to use computational media to create art and to explore concepts of form and color in a new way.

Saying all this, I can understand how mathematicians who work at a distance from computational applications might think that programming is passé. They have little experience with code themselves, and then they read vague articles in the newspapers about off-shoring and the demise of programming. Even computer scientists who work with scientists know "surprisingly little about how scientists develop and use software in their research", which is why some of them are conducting sponsored research to survey scientists on how they use computers.

But surely mathematicians are aware of the computational work on number theory that requires a nearly global network of computers to perform massive calculations, for instance, to find large prime numbers. One might dismiss such work as "merely" applications, not real math, but these applications are testing mathematical theorems about numbers in ways we could only have dreamed of in past times.

Math profs at mid-sized universities are not the only ones with the impression that programming is disappearing or less important than it used to be. Mark Guzdial recently wrote that some think programming isn't essential even for computer scientists:

I was at a meeting a couple weeks ago where an anecdote was related that speaks to this concern. A high-ranking NSF official made the argument that programming is not a critical skill for computer scientists. "Google doesn't want smart programmers! They want smart people!" A Google executive was in the audience and responded, "No, we want people who program."

I'm glad that Google knows better what Google needs than this particular high-ranking NSF official, and I realize that said official may only have meant that the smart people Google hires can become good programmers. But I do think that this story indicates the breadth of the misunderstanding people have about the role programming plays in the world today.

Perhaps the math profs here who said that "no one programs any more" were speaking only of math graduates from this university. But even that very limited claim is false. I suggested to my friend that they should probably survey their own alumni. I know several of them who program for a living. And some of them came back after graduation to learn how.


Posted by Eugene Wallingford | Permalink | Categories: Computing

October 07, 2008 5:49 AM

Databases and the Box

Last time I mentioned a Supreme Court justice's thoughts on how universal access to legal case data changes the research task associated with the practice of the law. Justice Roberts's comments brought to mind two thoughts, one related to the law and one not.

As a graduate student, I worked on the representation and manipulation of legal arguments. This required me to spend some time reading legal journals for two different purposes. First, I needed to review the literature on applying computers to legal tasks, and in particular how to represent knowledge of statutes and cases. Second, I needed to find, read, and code cases for the knowledge base of my program. I'm not that old, but I'm old enough that my research preceded the Internet Age's access to legal cases. I went to the campus library to check out thick volumes of the Harvard Law Review and other legal collections and journals. These books became my companions for several months, as I lay on the floor of my study and pored over them.

When I could not find a resource I needed on campus, I rode my bike to the Michigan State Law Library in downtown Lansing to use law reviews in its collection. I was not allowed to take these home, so I worked through them one at a time in carrels there. I was quite an anomalous sight there, in T-shirt and shorts with a bike helmet at my side!

I loved that time, reading and learning. I never considered studying the law as a profession, but this work was a wonderful education in a fascinating domain where computing can be applied. My enjoyment of the reading almost certainly extended my research time in grad school by a couple of months.

The second thought was of the changes in chess brought about by the application of simple database technology. I've written about chess before, but not about computing applications to it. Of course, the remarkable advances in chess-playing computers that came to a head in Hitech and Deep Thought have now reached the desktop in the form of cheap and amazingly strong programs. This has affected chess in so many ways, from eliminating the possibility of adjournments in most tournaments to providing super-strong partners for every player who wants to play, day or night. The Internet does the same, though now we are never sure if we are playing against a person or a person sitting next to a PC running Fritz.

But my thoughts turned to the same effect Justice Roberts talked about, the changes created by opening databases on how players learn, study, and stay abreast of opening theory. If you have never played tournament chess, you may not be aware of how much knowledge of chess openings has been recorded. Go to a big-box bookstore like Amazon or Barnes and Noble or Borders and browse the library of chess titles. (You can do that on-line now, of course!) You will see encyclopedias of openings like, well, the Encyclopedia of Chess Openings; books on classes of openings, such as systems for defending against king pawn openings; and books upon books about individual openings, from the most popular Ruy Lopez and Sicilian Defense to niche openings like my favorites, Petroff's Defense and the Center Game.

In the olden days of the 1980s, players bought books on their objects of study and pored over them with the same vigor as legal theorists studying law review articles. We hunted down games featuring our openings so that we could play through them to see if there was a novelty worth learning or if someone had finally solved an open problem in a popular variation. I still have a binder full of games with Petroff's Defense, cataloged using my own system, variation by variation with notes by famous players and my own humble notes from unusual games. My goal was to know this opening so well that I could always get a comfortable game out of the opening, against even stronger players, and to occasionally get a winning position early against a player not as well versed in the Petroff as I.

Talk about a misspent youth.

Chessplayers these days have the same dream, but they rarely spend hours with their heads buried inside opening books. These days, it is possible to subscribe to a database service that puts at our fingertips, via a computer keyboard, every game played with any opening -- anywhere in the recorded chess world, as recently as the latest update a week ago. What is the study of chess openings like now? I don't know, having grown up in the older era and not having kept up with chess study in many years. Perhaps Justice Roberts feels a little like this these days. Clerks do a lot of his research, and when he needs to do his own sleuthing, those old law reviews feel warm and inviting.

I do know this. Opening databases have so changed chess practice, from grandmasters down to patzers like me, that the latest issue of Chess Life, the magazine of U.S. Chess, includes a review of the most recent revision of Modern Chess Openings -- the opening bible on which most players in the West once relied as the foundation of broad study -- whose primary premise is this: What role does MCO play in a world where the computer database is king? What is the use of this venerable text?

From our gamerooms to our courtrooms, applications of even the most straightforward computing technology have changed the world. And we haven't even begun to talk about programs.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

October 05, 2008 8:47 PM

The Key Word is "Thinking"

The Chief Justice of the U.S. Supreme Court, John Roberts, spoke last week at Drake University, which merited an article in our local paper. Roberts spoke on the history of technology in the law, and in particular on how the internet is changing, in fundamental ways, how the law is practiced. He likened the change to that created by the printing press, an analogy I use whenever I speak with parents and prospective CS majors.

The detective work that was important and rewarding when I was starting out is now almost ... irrelevant.

I wonder if this will have an effect on the kind of students who undertake study of the law, or the kind of lawyers who succeed in the profession. I don't imagine that it will affect the attractiveness of the law for a while, because I doubt that a desire to spend countless hours poring through legal journals is the primary motivator for most law students. Prestige and money are certainly more prominent, as is a desire to "make a difference". But who performs best may well change, as the circumstances under which lawyers work change. This sort of transformation is almost unavoidable when a new medium redefines even part of a discipline.

Roberts is perhaps concerned about this part of the change himself. Technology makes information more accessible, which means skill in finding it is no longer as valuable. How about skill at manipulating it? Being able to find information more readily can liberate practitioners, but only if they know what to do with it.

There's a lot of value in thinking outside the box. But the key word is "thinking". ... You cannot think effectively outside the box if you don't know where the box is.

I love that sentence! It's a nice complement to a phrase of Twyla Tharp's that I wrote about over three years ago: Before you can think out of the box, you have to start with a box. Tharp and Roberts are speaking of different boxes, and both are so right about both boxes.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

September 30, 2008 4:35 PM

Radical Code

I received an interesting request in e-mail today. I was asked to identify a few...

software innovators--people who have challenged, disrupted, and redefined disciplines through radical code.

The request is specifically for programmers, thinkers, and writers who revolutionized my field, computer science. The first people who come to mind, of course, are John McCarthy and Alan Kay. McCarthy may not himself have written code for the first Lisp interpreter back in 1958, but this page certainly revolutionized computing. I know that Kay didn't write all of Smalltalk, even at the beginning (Dan Ingalls is responsible for most or all of the VM), but the ideas in Smalltalk certainly changed how I and many other people think about programming. So they are at the top of my list.

Why limit my response to the products of my tiny brain? I am interested in hearing whom you think changed computing with "radical code". Don't limit yourself to CS, either. I'd love to hear about people who changed other disciplines with their code. Drop me a message with your ideas!


Posted by Eugene Wallingford | Permalink | Categories: Computing

September 16, 2008 9:43 PM

More on the Nature of Computer Science

Another entry generated from a thread on a mailing list...

A recent thread on the SIGCSE list began as a discussion of how programming language constructs are abstractions of the underlying hardware, and what that means for how students understand the code they write. For example, this snippet of Java:

    int x = 1;
    while (x > 0)   // x eventually overflows to Integer.MIN_VALUE, which ends the loop
        x++;

does not result in an infinite loop, because Java ints are not integers.

This is one of many examples that remind us how important it is to study computer organization and architecture, and more generally to learn that abstractions are never 100% faithful to the details they hide. If they were, they would not be abstractions! A few good abstractions make all the difference in how we work, but -- much like metaphor -- we have to pay attention to what happens at their edges.

Eventually, the thread devolved toward a standard old discussion on this list, "What is Computer Science?" I conjecture that every mailing list, news group, and bulletin board has a topic that is its "fixed point", the topic toward which every conversation ultimately leads if left to proceed long enough, unfettered by an external force. Just about every Usenet newsgroup in which I participated during the late 1980s and early 1990s had one, and the SIGCSE list does, too. It is, "What is Computer Science?"

This question matters deeply to many people, who believe that graduates of CS programs have a particular role to play in the world. Some think that the primary job of undergraduate CS programs is to produce software engineers. If CS is really engineering (or at least should be thought of that way for practical reasons), then the courses we teach and the curricula we design should have specific outcomes, teach specific content, and imbue in students the mindset and methodology of an engineer. If CS is some sort of liberal art, then our courses and curricula will look quite different.

Much of this new thread was unremarkable if only because it all sounded so familiar to me. One group of people argued that CS is engineering, and another argued that it was more than engineering, perhaps even a science. I must have been in an ornery mood, because one poster's assertion provoked me to jump into the fray with a few remarks. He claimed that CS was not a science, because it is not a "natural science", and that it is not a natural science because the object of its study is not a natural phenomenon:

I don't believe that I have ever seen a general purpose, stored-program computing device that occurs in nature... unless we want to claim that humans are examples of such devices.

This seems like such a misguided view of computer science, but many people hold it. I'm not surprised that non-computer scientists believe this, but I am still surprised to learn that someone in our discipline does, too. Different people have different backgrounds and experiences, and I guess those differences can lead people to widely diverging viewpoints.

Computer science does not study the digital computer. Dijkstra told us so a long time ago, and if we didn't believe him then, we should now, with the advent of ideas such as quantum computing and biological computing.

Computer science is about processes that transform information. I see many naturally-occurring processes in the world. It appears now that life is the result of an information process, implemented in the form of DNA. Chemical processes involve information as well as matter. And some physicists now believe that the universe as we experience it is a projection of two-dimensional information embodied in the interaction of matter and energy.

When we speak of these disciplines, we are saying more than that computer scientists use their tool -- a general-purpose computation machine -- to help biologists, chemists, and physicists do science in their areas. We are talking about a more general view of processes and information, how they behave in theory and under resource constraints. Certainly, computer scientists use their tools to help practitioners of other disciplines do their jobs differently. But perhaps more important, computer scientists seek to unify our understanding of processes and information across the many disciplines in which they occur, in a way that sheds light on how information processing works in each discipline. We are still at the advent of the cycle feeding back what we learn from computing into the other disciplines, but many believe that this is where the greatest value of computer science ultimately lies. This means that computer science is wonderful not only because we help others by giving them tools but also because we are studying something important in its own right.

If we broaden our definition of "naturally occurring" to include social phenomena in large, complex systems that were not designed by anyone in particular, then the social sciences give rise to a whole new class of information processes. Economic markets, political systems, and influence networks all manifest processes that manipulate and communicate information. How do these processes work? Are they bound by the same laws as physical information processing? These are insanely interesting questions, whose answers will help us to understand the world we live in so much better than we do now. Again, study of these processes from the perspective of computer science is only just beginning, but we have to start somewhere. Fortunately, some scientists are taking the first steps.

I believe everything I've said here today, but that doesn't mean that I believe that CS is only science. Much of what we do in CS is engineering: of hardware systems, of software systems, of larger systems in which the manipulation of information is but one component. Much of what we do is mathematics: finding patterns, constructing abstractions, and following the implications of our constructions within a formal system. That doesn't mean computer science is not also science. Some people think we use the scientific method only as a tool to study engineered artifacts, but I think that they are missing the big picture of what CS is.

The fact that people within our discipline still grapple with this sense of uncertainty about its fundamental nature does not disconcert me. We are a young discipline and unlike any of the disciplines that came before (which are themselves human constructs in trying to classify knowledge of the world). We do not need to hide from this unique character, but should embrace it. As Peter Denning has written over the years: Is computer science science? Engineering? Mathematics? The answer need not be one of the above. From different perspectives, it can be all three.

Of course, we are left with the question of what it is like for a discipline to comprise all three. Denning's Rebooting Computing summit will bring together people who have been thinking about this conundrum in an effort to make progress, or chart a course. On the CS education front, we need to think deeply about the implications of CS's split personality for the design of our curricula. Owen Astrachan is working on innovating the image of CS in the university by turning our view outward again to the role of computer science in understanding a world bigger than the insides of our computers or compilers. Both of these projects are funded by the NSF, which seems to appreciate the possibilities.

I can't think about the relationship between computer science and natural science without thinking of Herb Simon's seminal Sciences of the Artificial. I don't know whether reading it would change enough minds, but it deeply affected how I think about complex systems, intentionality, and science.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

September 09, 2008 6:24 PM

Language, Patterns, and Blogging

My semester has started with a busy bang, complicated beyond usual by a colleague's family emergency, which has me teaching an extra course until he returns. The good news is that my own course is programming languages, so I am getting to think about fun stuff at least a couple of days a week.

Teaching Scheme to a typical mix of eager, indifferent, and skeptical students brought to mind a blog entry I read recently on Fluent Builders in Java. This really is a neat little design pattern for Java or C++ -- a way to make code written in these languages look and feel so much better to the reader. But looking at the simple example:

Car car = Car.builder()
   .year(2007)
   .make("Toyota")
   .model("Camry")
   .color("blue")
   .build();

... I can't help but think of the old snark that we are reinventing Smalltalk and Lisp one feature at a time. A language extension here, a design pattern there, and pretty soon you have the language people want to use. Once again, I am turning into an old curmudgeon before my time.

As the author points out in a comment, Ruby gives us a more convenient way to fake named parameters: passing a hash of name/value pairs to the constructor. This is a much cleaner hack for programmers, because we don't have to do anything special; hashes are primitives. From the perspective of teaching Programming Languages this semester, what I like most about the Ruby example is that it implements the named parameters in data, not code. The duality of data and program is one of those Big Ideas that all CS students should grok before they leave us, and now I have a way to talk about the trade-off using Java, Scheme, and an idiomatic construction in Ruby, a language gaining steam in industry.
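Python offers a similar data-not-code idiom: keyword arguments arrive in what is effectively a dictionary. A rough analog of the builder example above might look like this -- my sketch, not code from the original post:

    class Car:
        def __init__(self, **options):
            # the "named parameters" arrive as a plain dictionary of data
            self.year = options.get("year")
            self.make = options.get("make")
            self.model = options.get("model")
            self.color = options.get("color")

    car = Car(year=2007, make="Toyota", model="Camry", color="blue")

No builder class, no boilerplate -- the names live in data.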

Of course, we know that Scheme programmers don't need patterns... This topic came up in a recent thread on the PLT Scheme mailing list. Actually, the Scheme guys gave a reasonably balanced answer, in the context of a question that implied an unnecessary insertion of pattern-talk into Scheme programming. How would a Scheme programmer solve the problem that gives rise to fluent builders? Likely, write a macro: extend the language with new syntax that permits named parameters. This is the "pattern as language construct" mentality that extensible syntax allows. (But this leaves other questions unanswered, including: When is it worth the effort to use named parameters in this way? What trade-offs do we face among various ways to implement the macro?)

Finally, thinking ahead to next semester's compilers class, I can't help but think of ways to use this example to illustrate ideas we'll discuss there. A compiler can look for opportunities to optimize the cascaded message send shown above into a single function call. A code generator could produce a fluent builder for any given class. The latter would allow a programmer to use a fluent builder without the tedium of writing boilerplate code, and the former would produce efficient run-time code while allowing the programmer to write code in a clear and convenient way. See a problem; fix it. Sometimes that means creating a new tool.

Sometimes I wonder whether it is worth blogging ideas as simple as these. What's the value? I have a new piece of evidence in favor. Back in May 2007, I wrote several entries about a paper on the psychology of security. It was so salient to me for a while that I ended up suggesting to a colleague that he might use the paper in his capstone course. Sixteen months later, it is the very same colleague's capstone course that I find myself covering temporarily, and it just so happens that this week the students are discussing... Schneier's paper. Re-reading my own blog entries has proven invaluable in reconnecting with the ideas that were fresh back then. (But did I re-read Schneier's paper?)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

September 01, 2008 2:31 PM

B.B. King, CS Wannabe

The Parade magazine insert to my Sunday paper contained an interview with legendary bluesman B.B. King that included this snippet:

There's never a day goes by that I don't miss having graduated and gone to college. If I went now, I would major in computers and minor in music. I have a laptop with me all the time, so it's my tutor and my buddy.

CS and music are, of course, a great combination. Over the years, I've had a number of strong and interesting students whose backgrounds included heavy doses of music, from alt rock to marching band to classical. But B.B. King is from a different universe. Maybe I can get this quote plastered on the walls of all the local haunts for musicians.

I wonder what B.B. calls his laptop?


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

August 28, 2008 4:22 PM

The Universe is a Visualization

I'm not a physicist and don't keep up on the latest -- or even much of the not-so-latest -- in string theory. But recently a colleague pointed me toward Leonard Susskind's 2008 book, The Black Hole War. Before tracking down the book, I read a review from the Los Angeles Times. The review introduced me to Susskind's "holographic principle", which holds that

... our universe is a three-dimensional projection of information stored in two dimensions at the boundary of space ...

Suddenly, data and algorithm visualizations seem so much more important than just ways to make pretty graphs. When we tell the world that computation is everywhere, we may be more right than I ever realized. Add in the principle of the conservation of information that lies at the center of the dispute between Susskind's work and Hawking's, and "computing for everyone" takes on a whole new meaning.

Note to self: read this book.


Posted by Eugene Wallingford | Permalink | Categories: Computing

August 26, 2008 3:58 PM

The Start of the Semester

I taught my first session of Programming Languages today, so the semester is officially underway. Right now, my mind is a swirl of Scheme, closures, continuations, MapReduce, and lazy evaluation. I've been teaching this course for a dozen years based on functional programming (a style new to our students at this point) and writing interpreters in Scheme. This makes me very comfortable with the material. Over the years I have watched ideas work their way from the niche of PL courses into mainstream languages. The resurgence of scripting languages has been both a result of this change and a trigger. The discussion of true closures in languages such as Ruby and Java is one example.

This evolution is fun to watch, even if it moves haltingly and perhaps slower than I'd prefer. In order to keep my course current, I need to incorporate some of these changes into my course. This time around, I find myself thinking about what ideas beyond the "edge" of practical languages I should highlight in my course. I'd like for my students to learn about some of the coolest ideas that will be appearing in their professional practice in the near future. For some reason, lazy evaluation seems ripe for deeper consideration. Working it into my class more significantly will be a project for me this semester.
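For readers who haven't met lazy evaluation, Python generators give a quick taste of the idea -- a toy of mine, not course material. Values are produced only when demanded, so we can work with a conceptually infinite stream:

    def naturals():
        n = 0
        while True:          # an "infinite" sequence; nothing runs until a value is asked for
            yield n
            n += 1

    def take(k, stream):
        return [next(stream) for _ in range(k)]

    evens = (n for n in naturals() if n % 2 == 0)   # still nothing has been computed
    print(take(5, evens))                           # [0, 2, 4, 6, 8], computed on demand

Haskell takes this idea all the way down, and Scheme lets us build it with delay and force. Either way, it is the sort of idea that starts at the edge and works its way into everyday practice.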

Delving headlong into a new semester's teaching makes Jorge Cham's recent cartoon seem all the more true:

How Professors Spend Their Time -- Jorge Cham

For faculty at a "teaching university", the numbers are often skewed even further. Of course, I am an administrator now, so I teach but one course a semester, not three. Yet the feeling is the same, and the desire to spend more time on real CS -- teaching and research -- is just as strong. Maybe I can add a few hours to each day?


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

August 22, 2008 4:39 PM

Unexpected Computer Science Reference

Today I'm listening to the August 21 episode of The Bob and Tom Show, a syndicated morning radio show. One of the guests on this episode is comedian Dwayne Perkins. There I was, working on my previous entry, when I heard Perkins say "I was a computer science student in college...". Surprise! Of course, the reference was part of a skit on his dating life and ended up mentioning a preponderance of Asian-American males in his classes, but still.

I don't know if CS plays any role in his comedy more generally, but he at least is tech-savvy enough to have a blog with both discursive entries and short videos.

I'm guessing that it's a good thing to have a performer say "computer science" every so often as part of his work. Whatever he says about CS, people aren't likely to remember much of the content, but they might remember hearing the words "computer science". Something that puts our discipline in the mainstream may well help demythologize it.

Coincidentally, I heard this episode on a day when we did final registration for fall courses, and every introductory programming course we offer is full. This includes our intro course for majors, which is a good 50% larger than last year, and non-majors courses in VB and C++. Perhaps the tide has turned. Or perhaps some of the efforts we have made in the last three years are beginning to pay off.


Posted by Eugene Wallingford | Permalink | Categories: Computing

August 20, 2008 2:19 PM

Stalking the Wily Misconception

Recently, someone sent me a link to Clifford Stoll's TED talk from February 2006, and yesterday I finally watched. Actually, I listened more than I watched, for two reasons. First, because I was multitasking in several other windows, as I always am at the machine. Second, because Stoll's manic style of jumping around the stage isn't much to my liking.

As a university professor and a parent, I enjoyed the talk for its message about science and education. It's worth listening to simply for the epigram he gives in the first minute or so, about science, engineering, and technology, and for the quote he recites to close the talk. (Academic buildings have some of the coolest quotes engraved right on their walls.) But the real meat of the talk doesn't start until midway through.

Prodded by schoolteachers to whom he was talking about science in the schools, Stoll decided that he should put his money where his mouth is: he became a science teacher. Not just giving a guest lecture at a high school, but teaching a real junior-high science class four days a week. He doesn't do the "turn to Chapter 7 and do all the odd problems" kind of teaching either, but real physics. For example, his students measure the speed of light. They may be off by 25%, but they measured the speed of light, using experiments they helped design and real tools. This isn't the baking soda volcano, folks. Good stuff. And I'll bet that junior-high kids love his style; he's much better suited for that audience than I!

One remark irked me, even if he didn't mean it the way I heard it. At about 1:38, he makes a short little riff on his belief that computers don't belong in schools. "No! Keep them out of schools", he says.

In one sense, he is right. Educators, school administrators, and school boards have made "integrating technology" so big a deal that computers are put into classrooms for their own sake. They become devices for delivering lame games and ineffective simulations. We teach Apple Keynote, and students think they have learned "computers" -- and so do most teachers and parents. When we consider what "computers in schools" means to most people, we probably should keep kids away from them, or at least cut back their use.

At first, I thought I was irked at Stoll for saying this, but now I realize that I should be irked at my profession for not having done a better job both educating everyone about what computers really mean for education and producing the tools that capitalize on this opportunity.

Once again I am shamed by Alan Kay's vision. The teachers working with Alan have their students do real experiments, too, such as measuring the speed of gravity. Then they use computers to build executable models that help students to formalize the mathematics for describing the phenomenon. Programming is one of their tools.
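The kind of executable model I have in mind is tiny. Kay's teachers use Squeak Etoys; here is the same idea in a few lines of Python, my own sketch of the sort of model a student might build after measuring gravity: step time forward and update velocity and position with the very numbers they just measured.

    dt = 0.1             # seconds per tick
    g = 9.8              # the acceleration the students measured, in m/s^2
    y, v = 100.0, 0.0    # drop a ball from 100 m, starting at rest

    while y > 0:
        v = v + g * dt   # velocity grows a little each tick
        y = y - v * dt   # position falls by the current velocity
        print(round(y, 2), round(v, 2))

The program is the formalized mathematics; running it is an experiment you can repeat as many times as you like.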

Imagine saying that we should keep pencils and paper out of our schools, returning to the days of chalk slates. People would laugh, scoff, and revolt. Saying we should keep computers out of schools should elicit the same kind of response. And not because kids wouldn't have access to e-mail, the web, and GarageBand.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

August 18, 2008 5:27 PM

Inquisitive Computing

I've written here a lot in the last year or so about ideas in the vein of "programming for everyone", so I feel I should point you toward Calculemus!, Brian Hayes's latest column in The American Scientist. This short paper is something of a hodgepodge in celebration of the 25th anniversary of Hayes writing monthly articles on the joys of computation.

First, he talks about the history of his columns, which explore the realm of inquisitive computing -- writing programs as a way to explore ideas that interest him and his readers. This isn't software development, with "requirements" and "process" and "lifecycle management". This is asking a cool question (say, "Is there any pattern in the sequence of numbers that are perfect medians?") and writing some code in search of answers, and more questions. This is exactly how I conceive of programming as an essential intellectual skill of the future. I don't imagine that most people will ask themselves such mathematical questions (though they might), but they might be inquisitive at work or at home in their own areas of interest.

They may be sitting on a plane talking with a fellow passenger and have a question about football overtimes. I lived that story once many years ago, talking about sudden-death overtime in the NFL. My seatmate was a pilot who liked to follow football, and after we discussed some of the day's scores he asked out loud, "I wonder how frequently the team that wins the coin toss wins the game?" He figured that was the end of the question, because how could we answer it? I whipped out my laptop, fired up DrScheme, and built a little model. We experimented with several parameters, including a percentage we had read for how often the first team scores on its opening drive, until we had a good sense of how much of an advantage winning the coin toss is. He was amazed that we could do what we did. I could only say, it would be great if more people knew this was possible, and learned the little bit of programming they need to do it. I'm not sure he believed that.
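The model we built that day is long gone, but the same kind of simulation is easy to reconstruct. This sketch is in Ruby rather than Scheme, ignores the possibility of time expiring in a tie, and uses made-up scoring probabilities rather than the numbers we found that day:

    # Simulate sudden-death overtimes: the teams alternate possessions,
    # and the first team to score wins. p_score is the (made-up)
    # probability that a team scores on any given possession.
    def coin_toss_winner_wins?(p_score)
      receiving_team_up = true        # the toss winner receives first
      loop do
        return receiving_team_up if rand < p_score
        receiving_team_up = !receiving_team_up
      end
    end

    def advantage(p_score, trials = 100_000)
      wins = trials.times.count { coin_toss_winner_wins?(p_score) }
      wins.to_f / trials
    end

    [0.25, 0.35, 0.45].each do |p|
      puts format("p(score on a possession) = %.2f  ->  toss winner wins %.3f of games", p, advantage(p))
    end

Twenty lines, a few minutes of typing, and suddenly an idle question has numbers attached to it.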

Hayes then gives three examples of the kinds of problems he likes to explore and how programs can help. I'm tempted to elaborate on his examples, but that would make this post as long as the paper. Just read it. I can say that all three were fun for me to play with, and two of them admit almost trivial implementations for getting started in the search for answers. (I was surprised to learn that what he calls abc-hits has implications for number theory.)
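For instance, assuming the usual definition of an abc-hit (coprime positive integers with a + b = c and rad(abc) < c, where the radical rad(n) is the product of the distinct primes dividing n), a brute-force search fits in a few lines of Ruby:

    require 'prime'

    # radical of n: the product of its distinct prime factors
    def rad(n)
      n.prime_division.map(&:first).reduce(1, :*)
    end

    # all abc-hits with c no larger than limit
    def abc_hits(limit)
      hits = []
      (3..limit).each do |c|
        (1..c / 2).each do |a|
          b = c - a
          next unless a.gcd(b) == 1
          hits << [a, b, c] if rad(a * b * c) < c
        end
      end
      hits
    end

    p abc_hits(150)    # the hits include [1, 8, 9], [5, 27, 32], [1, 48, 49], ...

Slow, yes, but more than fast enough to start asking questions.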

Finally, Hayes closes with a discussion of what sort of programming environments and languages we need to support inquisitive programming by the masses. He laments the passing of Dartmouth BASIC into the bowels of structured programming, object-oriented programming, and, dare I add, VB.NET -- from a language for everyone to a language for a small group of professionals writing serious programs in a serious setting. (He also laments that "GUI-coddled computer users have forgotten how to install an executable in /usr/local/bin and add it to their $PATH", so he's not completely consistent in aiming at the computing non-professional!)

He hopes to be language-agnostic, though he confesses to being a Lisp weenie and suggests that Python may be the language best positioned to fill the needs of inquisitive programmers, with its avid user community in the sciences. He is probably right, as I have noted Python's ascendancy in CS ed and the physics community before. Most encounters I have with Python leave me thinking "Boy, I like Ruby", so I would love to see Ruby grow into this area, too. For that to happen, we need more books like this one to introduce real people to Ruby in their own contexts. I'm looking forward to getting an exam copy of this title to see whether it or one like it can be useful among students and professionals working in the life sciences.

I've long enjoyed Hayes's column and still follow it on-line after having dropped my subscription to The American Scientist a few years ago. (It's a fine publication, but I have only so many hours in my day to read!) You can find many of his articles on-line now at bit-player. If you can make time to explore a bit, I encourage you to look there...


Posted by Eugene Wallingford | Permalink | Categories: Computing

August 08, 2008 4:11 PM

SIGCSE Day 2 -- This and That

[A transcript of the SIGCSE 2008 conference: Table of Contents]

(Okay, so I am over four months behind posting my last couple of entries from SIGCSE. Two things I've read in the last week or so jolted my memory about one of these items. I'll risk that they are no longer of much interest and try to finish off my SIGCSE reports before classes start.)

A Discipline, Not Just a Job

During his talk, I think, Owen Astrachan said:

Don't talk about where the jobs are. We do not need to kowtow to the market. CS is ideas, a discipline.

We do, of course, need to keep in mind that the first motivation for many of our students is to get a job. But Owen is right. To the extent that we "sell" anything, let's sell that CS is a beautiful and powerful set of ideas. We can broaden the minds of our job-seeking students -- and also attract thinking students who are looking for powerful ideas.

When Good Students Are Too Good

Rich Pattis tossed out an apparently old saw I had never heard: Don't give your spec to the best programmer in the room. She will make it work, even if the spec isn't what you want and doesn't make sense. Give it to a mediocre programmer. If the spec is bad, he will fail and come back with questions.

This applies to homework assignments, too. Good students can make anything work, and most will. That good students solved your problem is not evidence of a well-written spec.

Context Complicates

I've talked a lot here about giving students problems in context, whether in the context of large projects or in the context of "real" problems. As I was listening to Marissa Mayer's talk and lunchtable conversation, I was reminded that context complicates matters, for both teacher and students. We have to be careful when designing instruction to be sure that students are able to attend to what we want them to learn, and not be constantly distracted by details in the backstory. Otherwise, a task being in context hurts more than it helps.

The solution: Start with problems in context, then simplify to a model that captures the essence of the context and eliminates unnecessary complexity and distraction. Joe Bergin has probably already written a pedagogical pattern for this, but I don't see it after a quick glance at some of his papers. I've heard teachers like Owen, Nick Parlante, and Julie Zelensky talk about this problem in a variety of settings, and they have some neat approaches to solving it.

Overshooting Your Mark in the Classroom

It is easy for teachers to dream bigger than they can deliver when they lose touch with the reality of teaching a course. I see this all the time when people talk about first-year CS courses -- including myself. In my piece on the Nifty Assignments session, I expressed disappointment that one of the assignments had a write-up of four pages and suggested that I might be able to get away with giving students only the motivating story and a five-line assignment statement. Right. It is more likely that the assignment's creator knows what he is doing from the experience of actually using the assignment in class. From the easy chairs of the Oregon Convention Center, everything looks easier. (I call this the Jeopardy! Effect.)

The risk of overshooting is even bigger when the instructor has not been in the trenches, ever or even for a long while. Mark Guzdial recently told the story of Richard Feynman's freshman physics course, which is a classic example of this phenomenon. Feynman wrote a great set of lectures, but they don't really work as a freshman text, except perhaps with the most elite students.

I recently ran across a link to a new CS1 textbook for C++ straight from Bjarne Stroustrup himself. Stroustrup has moved from industry to academia and has had the opportunity to develop a new course for freshmen. "We need to improve the education of our software developers," he says. When one of my more acerbic colleagues saw this, his response was sharp and fast: "Gee, that quick! Seems those of us in 'academia' don't catch on as well as the newbies."

For all I know, Stroustrup's text will be just what every school that wants to teach C++ in CS1 needs, but I am also skeptical. A lot of smart guys with extensive teaching experience -- several of them my friends -- have been working on this problem for a long time, and it's hard. I look forward to seeing a copy of the book and to hearing how it works for the early adopters.

Joe, is there a pedagogical pattern called "In the Trenches"? If not, there should be. Let's write it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 30, 2008 12:40 PM

Scripting, CS1, and Language Theory

Yesterday, I wrote a bit about scripting languages. It seems odd to have to talk about the value of scripting languages in 2008, as Ronald Loui does in his recent IEEE Computer article, but despite their omnipresence in industry, the academic world largely continues to prefer traditional systems languages. Some of us would like to see this change. First, let's consider the case of novice programmers.

Most scripting languages lack some of the features of systems languages that are considered important for learners, such as static typing. Yet these "safer" languages also get in the way of learning, as Loui writes, by imposing "enterprise-sized correctness" on the beginner.

Early programmers must learn to be creative and inventive, and they need programming tools that support exploration rather than production.

This kind of claim has been made for years by advocates of languages such as Scheme for CS1, but those languages were always dismissed by "practical" academics as toy languages or niche languages. Those people can't dismiss scripting languages so easily. You can call Python and Perl toy languages, but they are used widely in industry for significant tasks. The new ploy of these skeptics is to speak of the "scripting language du jour" and to dismiss them as fads that will disappear while real languages (read: C) remain.

What scripting language would be the best vehicle for CS1? Python has had the buzz in the CS ed community for a while. After having taught a little PHP last semester, I would deem it too haphazard for CS1. Sure, students should be able to do powerful things, but the pocket-protected academic in me prefers a language that at least pretends to embody good design principles, and the pragmatist in me prefers a language that offers a smoother transition into languages beyond scripting. JavaScript is an idea I've seen proposed more frequently of late, and it is a choice with some surprising positives. I don't have enough experience with it to say much, but I am a little concerned about the model that programming in a browser creates for beginning students.

Python and Ruby do seem like the best choices among the scripting languages with the widest and deepest reach. As Loui notes, few people dislike either, and most people respect both, to some level. Both have been designed carefully enough to be learned by beginners and to support a reasonable transition as students move to the next level of the curriculum. Having used both, I prefer Ruby, not only for its OO-ness but also for how free I feel when coding in it. But I certainly respect the attraction many people have to Python, especially for its better developed graphics support.

Some faculty ask whether scripting languages scale to enterprise-level software. My first reaction is: For teaching CS1, why should we care? Really? Students don't write enterprise-level software in CS1; they learn to program. Enabling creativity and supporting exploration are more important than the speed of the interpreter. If students are motivated, they will write code -- a lot of it. Practice makes perfect, not optimized loop unrolling and type hygiene.

My second reaction is that these languages scale quite nicely to real problems in industry. That is why they have been adopted so widely. If you need to process a large web access log, you really don't want to use Java, C, or Ada. You want Perl, Python, or Ruby. This level of scale gives us access to real problems in CS1, and for these tasks scripting languages do more than well enough. Add to that their simplicity and the ability to do a lot with a little code, and student learning is enhanced.
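To make that concrete, here is the sort of thing I have in mind, sketched in Ruby; the file name and the assumption that the requesting host is the first field on each line are placeholders for whatever your log actually looks like:

    # Tally requests per host in a web access log and report the top ten.
    counts = Hash.new(0)
    File.foreach("access.log") do |line|        # placeholder file name
      host = line.split.first
      counts[host] += 1 if host
    end

    counts.sort_by { |_, n| -n }.first(10).each do |host, n|
      puts format("%8d  %s", n, host)
    end

Ten lines, no build step, and a student can read every one of them.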

Loui writes, "Indeed, scripting languages are not the answer for long-lasting, CPU-intensive nested loops." But then, Java and C++ and Ada aren't the answer for all the code we write, either. Many of the daily tasks that programmers perform lie in the space better covered by scripting languages. After learning a simpler language that is useful for these daily tasks, students can move on to larger-scale problems and learn the role of a larger-scale language in solving them. That seems more natural to me than going in the other direction.

Now let's consider the case of academic programming languages research. A lot of interesting work is being done in industry on the design and implementation of scripting languages, but Loui laments that academic PL research still focuses on syntactic and semantic issues of more traditional languages.

Actually, I see a lot of academic work on DSLs -- domain-specific languages -- that is of value. One problem is that this research is so theoretical that it is beyond the interest of programmers in the trenches. Then again, it's beyond the mathematical ability and interest of many CS academics, too. (I recently had to comfort a tech entrepreneur friend of mine who was distraught that he couldn't understand even the titles of some PL theory papers on the resume of a programmer he was thinking of hiring. I told him that the lambda calculus does that to people!)

Loui suggests that PL research might profitably move in a direction taken by linguistics and consider pragmatics rather than syntax and semantics. Instead of proving something more about type systems, perhaps a languages researcher might consider "the disruptive influence that Ruby on Rails might have on web programming". Studying how well "convention over configuration" works in practice might be of as much use as incrementally extending a compiler optimization technique. The effect of pragmatics research would further blur the line between programming languages and software engineering, a line we have seen crossed by some academics from the PLT Scheme community. This has turned out to be practical for PL academics who are interested in tools that support the programming process.

Loui's discussion of programming pragmatics reminds me of my time studying knowledge-based systems. Our work was pragmatic, in the sense that we sought to model the algorithms and data organization that expert problem solvers used, which we found to be tailored to specific problem types. Other researchers working on such task-specific architectures arrived at models consistent with ours. One particular group, John McDermott's lab at Carnegie Mellon, went beyond modeling cognitive structures to the sociology of problem solving. I was impressed by McDermott's focus on understanding problem solvers in an almost anthropological way, but at the time I was too hopelessly in love with the algorithm and language side of things to incorporate this kind of observation into my own work. Now, I recognize it as the pragmatics side of knowledge-based systems.

(McDermott was well known in the expert systems community for his work on the pioneering programs R1 and XCON. I googled him to find out what he is up to these days and didn't find much, but through some publications I infer that he must now be with the Center for High Assurance Computer Systems at the Naval Research Laboratory. I guess that accounts for the sparse web presence.)

Reading Loui's article was an enjoyable repast, though even he admits that much of the piece reflects old arguments from proponents of dynamic languages. It did have, I think, at least one fact off track. He asserts that Java displaced Scheme as the primary language used in CS1. If that is true, it is so only for a slender subset of more elite schools, or perhaps Scheme made inroads during a brief interregnum between Java and ... Pascal, a traditional procedural language that was small and simple enough to mostly stay out of the way of programmers and learners.

As with so many current papers, one of the best results of reading it is a reminder of a piece of classic literature, in this case Ousterhout's 1998 essay. I usually read this paper again each time I teach programming languages, and with my next offering of that course to begin in three weeks, the timing is perfect to read it again.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

July 29, 2008 2:03 PM

Scripting Languages, Software Development, and Novice Programmers

Colleague and reader Michael Berman pointed me to the July 2008 issue of IEEE Computer, which includes an article on the virtues of scripting languages, Ronald Loui's In Praise of Scripting: Real Programming Pragmatism. Loui's inspiration is an even more important article in praise of scripting, John Ousterhout's classic Scripting: Higher Level Programming for the 21st Century. Both papers tell us that scripting deserves more respect in the hierarchy of programming and that scripting languages deserve more consideration in the programming language and CS education communities.

New programming languages come from many sources, but most are created to fill some niche. Sometimes the niche is theoretical, but more often the creators want to be able to do something more easily than they can with existing languages. Scripting languages in particular tend to originate in practice, to fill a niche in the trenches, and grow from there. Sometimes, they come to be used just like a so-called general-purpose programming language.

When programmers have a problem that they need to solve repeatedly, they want a language that gives them tools that are "ready at hand". For these programming tasks, power comes from the level of abstraction provided by built-in tools. Usually these tools are chosen to fill the needs of a specific niche, but they almost always include the ability to process text conveniently, quickly, and succinctly.

Succinctness is a special virtue of scripting languages. Loui mentions the virtue of short source code, and I'm surprised that more people don't talk about the value of small programs. Loui suggests one advantage that I rarely see discussed: languages that allow and even encourage short programs enable programmers to get done with a task before losing motivation or concentration. I don't know how important this advantage is for professional programmers; perhaps some of my readers who work in the real world can tell me what they think. I can say, though, that, when working with university students, and especially novice programmers, motivation and concentration are huge factors. I sometimes hear colleagues say that students who can't stay motivated and concentrate long enough to solve an assignment in C++, Ada, or Java probably should not be CS majors. This seems to ignore reality, both of human psychology and of past experience with students. Not to mention the fact that we teach non-majors, too.

Another advantage of succinctness Loui proposes relates to programmer error. System-level languages include features intended to help programmers make fewer errors, such as static typing, naming schemes, and verbosity. But they also require programmers to spend more time writing code and to write more code, and in that time programmers find other ways to err. This, too, is an interesting claim if applied to professional software development. One standard answer is that software development is not "just" programming and that such errors would disappear if we simply spent more time up-front in analysis, modeling, and design. Of course, these activities add even more time and more product to the lifecycle, and create more space for error. They also put farther in the future the developers' opportunity to get feedback from customers and users, which in many domains is the best way to eliminate the most important errors that can arise when making software.

Again, my experience is that students, especially CS1 students, find ways to make mistakes, regardless of how safe their language is.

One way to minimize errors and their effects is to shrink the universe of possible errors. Smaller programs -- less code -- are one way to do that. It's harder to make as many or even the same kinds of errors in a small piece of code. It's also easier to find and fix errors in a small piece of code. There are exceptions to both of these assertions, but I think that they hold in most circumstances.

Students also have to be able to understand the problem they are trying to solve and the tools they are using to solve it. This places an upper bound on the abstraction level we can allow in the languages we give our novice students and the techniques we teach them. (This has long been an argument made by people who think we should not teach OO techniques in the first year, that they are too abstract for the minds of our typical first-year students.) All other things equal, concrete is good for beginning programmers -- and for learners of all kinds. The fact that scripting languages were designed for concrete tasks means that we are often able to make the connection for students between the language's abstractions and tasks they can appreciate, such as manipulating images, sound, and text.

My biases resonate with this claim in favor of scripting languages:

Students should learn to love their own possibilities before they learn to loathe other people's restrictions.

I've always applied this sentiment to languages such as Smalltalk and Scheme which, while not generally considered scripting languages, share many of the features that make scripting languages attractive.

In this regard, Java and Ada are the poster children in my department's early courses. Students in the C++ track don't suffer from this particular failing as much because they tend not to learn C++ anyway, but a more hygienic C. These students are more likely to lose motivation and concentration while drowning in an ocean of machine details.

When we consider the problem of teaching programming to beginners, this statement by Loui stands out as well:

Students who learn to script early are empowered throughout their college years, especially in the crucial Unix and Web environments.

Non-majors who want to learn a little programming to become more productive in their disciplines of choice don't get much value at all from one semester learning Java, Ada, or C++. (The one exception might be the physics majors, who do use C/C++ later.) But even majors benefit from learning a language that they might use sooner, say, in a summer job. A language like PHP, JavaScript, or even Perl is probably the most valuable in this regard. Java is the one "enterprise" language that many of our students can use in the summer jobs they tend to find, but unfortunately one or two semesters are not enough for most of them to master enough of the language to be able to contribute much in a professional environment.

Over the years, I have come to think that even more important than usefulness for summer jobs is the usefulness a language brings to students in their daily lives, and the mindset it fosters. I want CS students to customize their environments. I want them to automate the tasks they do every day when compiling programs and managing their files. I want them to automate their software testing.
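Even a tiny script makes the point. Here is a sketch of the sort of automation I mean, in Ruby; the file pattern and the test command are placeholders, not a recommendation of any particular tool:

    # Watch the Ruby files in this directory and re-run the tests
    # whenever one of them changes.
    def newest_mtime
      Dir["*.rb"].map { |f| File.mtime(f) }.max || Time.at(0)
    end

    last_run = Time.at(0)
    loop do
      if newest_mtime > last_run
        system("ruby run_tests.rb")     # placeholder test command
        last_run = Time.now
      end
      sleep 2
    end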

When students learn a big, verbose, picky language, they come to think of writing a program as a major production, one that may well cause more pain in the short term than it relieves in the long term. Even if that is not true, the student looks at the near-term pain and may think, "No, thanks." When students learn a scripting language, they can see that writing a program should be as easy as having a good idea -- "I don't need to keep typing these same three commands over and over", or "A program can reorganize this data file for me." -- and writing it down. A program is an idea, made manifest in an executable form. Programs can make our lives better. Of all people, computer scientists should be able to harness their power -- even CS students.

This post has grown to cover much more than I had originally planned, and taken more time to write. I'll stop here for now and pick up this thread of thought in my next entry.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

July 22, 2008 8:50 PM

Computing and Modern Culture

The recent rescue of hostages in Colombia relied on a strategy familiar to people interested in computer network security: a man-in-the-middle attack.

... for months, in an operation one army officer likened to a "broken telephone," military intelligence had been able to convince Ms. Betancourt's captor, Gerardo Aguilar, a guerrilla known as "Cesar," that he was communicating with his top bosses in the guerrillas' seven-man secretariat. Army intelligence convinced top guerrilla leaders that they were talking to Cesar. In reality, both were talking to army intelligence.

As Bruce Schneier reports in Wired magazine, this strategy is well-known on the internet, both to would-be system crackers and to security experts. The risk of man-in-the-middle attacks is heightened on-line because the primary safeguard against them -- shared social context -- is so often lacking. Schneier describes some of the technical methods available for reducing the risk of such attacks, but his tone is subdued... Even when people have a protection mechanism available, as they do in SSL, they usually don't take advantage of it. Why? Using the mechanism requires work, and most of us are just too lazy.

Then again, the probability of being victimized by a man-in-the-middle attack may be small enough that many of us can rationalize that the cost is greater than the benefit. That is a convenient thought, until we are victimized!

The feature that makes man-in-the-middle attacks possible is unjustified trust. This is not a feature of particular technical systems, but of any social system that relies on mediated communication. One of the neat things about the Colombian hostage story is that it shows that some of the problems we study in computer science are relevant in a wider context, and that some of our technical solutions can be relevant, too. A little computer science can amplify the problem solving of almost anyone who deals with "systems", whatever their components.
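The shape of the attack is easy to sketch in code. This toy Ruby version is mine, not anything from Schneier's article, and the names are only for flavor; the point is that each endpoint trusts whatever object it holds a reference to, which is exactly the unjustified trust a relay in the middle exploits:

    # Two parties who never authenticate the channel they are using.
    class Endpoint
      def initialize(name)
        @name = name
      end

      def deliver(message, to:)
        to.receive(message)           # trusts 'to' to be the real peer
      end

      def receive(message)
        puts "#{@name} reads: #{message}"
      end
    end

    # A relay that reads every message and passes it along unchanged,
    # so neither side ever suspects anything.
    class Middleman
      def initialize(real_peer)
        @real_peer = real_peer
      end

      def receive(message)
        puts "  [middleman copies: #{message}]"
        @real_peer.receive(message)
      end
    end

    field_commander = Endpoint.new("Cesar")
    headquarters    = Endpoint.new("Secretariat")

    # The commander believes this is a direct channel to headquarters.
    channel = Middleman.new(headquarters)
    field_commander.deliver("Status report at dawn", to: channel)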

This story shows a potential influence from computing on the wider world. Just so that you know the relationship runs both ways, I point you to Joshua Kerievsky's announcement of "Programming with the Stars", one of the events on the Developer Jam stage at the upcoming Agile 2008 conference. Programming with the Stars adapts the successful formula of Dancing with the Stars, a hit television show, to the world of programming. On the TV show, non-dancers of renown from other popular disciplines pair with professional dancers for weekly dance competitions. Programming with the Stars will work similarly, only with (pair) programming plugged in for dancing. Rather than competitions involving samba or tango, the competitions will be in categories such as test-driven development of new code and refactoring a code base.

As in the show, each pair will include an expert and a non-expert, and there will be a panel of three judges.

I've already mentioned Uncle Bob in this blog, even in a humorous vein, and I envision him playing the role of Simon Cowell from "American Idol". How his fellow judges, Davies and Hill, compare to Paula Abdul and Randy Jackson, I don't know. But I expect plenty of sarcasm, gushing praise, and hip lingo from the panel, dog.

Computer scientists and software developers can draw inspiration from pop culture and have a little fun along the way. Just don't forget that the ideas we play with are real and serious. Ask those rescued hostages.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

July 19, 2008 3:04 PM

Papadimitriou, the Net, and Writing

I've been trying to take a break from the office and spend some time at home and with my family. Still, I enjoy finding time to read an occasional technical article and be inspired. While waiting for my daughter's play rehearsal to end last night, I read A Conversation with Christos Papadimitriou, a short interview in the August 2008 issue of Dr. Dobb's Journal. I first learned of Papadimitriou from a textbook of his that we used in one of my earliest graduate courses, Elements of the Theory of Computation. Since that time, he has done groundbreaking work in computational complexity and algorithms, with applications in game theory, economics, networks, and most recently bioinformatics. It seems that many of the best theoreticians have a knack for grounding their research in problems that matter.

The article includes several tidbits that might interest computer scientists and professional programmers of various sorts. Some are pretty far afield from my work. For instance, Papadimitriou and two of his students recently produced an important result related to the Nash Equilibrium in game theory. (Have you seen A Beautiful Mind?) Nash's theorem tells us that an equilibrium exists in every game, but it does not tell us how to find the equilibrium. Is it possible to produce a tractable algorithm for finding it? Papadimitriou and his students showed that finding the equilibrium depends intrinsically on the theorem which is the basis of Nash's proof, which strongly suggests that, in practice, we cannot produce such an algorithm; finding the Nash equilibrium for an arbitrary game appears to be intractable.

The interview spent considerable time discussing Papadimitriou's recent work related to the Internet and the Web, which are ideas I will likely read more about. Papadimitriou sees the net as an unusual opportunity for computer scientists: a chance to study a computational artifact we didn't design. Unlike our hardware and software systems, it "emerged from an interaction of millions of entities on the basis of deliberately simple protocols". The result is a "mystery" that our designed artifacts can't offer. For a theoretical CS guy, the net and web serve as a research lab of unprecedented size.

It also offers a platform for research at the intersection of computing and other disciplines, such as communication, where my CS grad student Sergei Golitsinski is taking his research. The interviewer quoted net pioneer John Gilmore in the same arena: "The Net interprets censorship as damage and routes around it." This leads to open questions about how rumors spread, an area that Papadimitriou calls "information epidemiology". One of my former grad students, Nate Labelle, worked in this area for a particular part of the designed world, open-source software packages, and I'd love to have a student delve into the epidemiology of more generalized information.

I would also like to read Papadimitriou's novel, Turing. I recall when it came out and just haven't gotten around to asking my library to pick it up or borrow it. In the interview, Papadimitriou said,

I discovered this [novel] was inside me and had to come out, so I took time to write it. I couldn't resist it. ... If I had not done it, I would be a less happy man.

Powerful testimony, and the chance to read CS-themed fiction doesn't come along every day.


Posted by Eugene Wallingford | Permalink | Categories: Computing

July 11, 2008 11:12 AM

Wadler on Abelson and Sussman

After reading Lockhart, I read Matthias Felleisen's response to Lockhart, and from there I read Matthias's design for a second course to follow How to Design Programs. From an unlinked reference there, I finally found A Critique of Abelson and Sussman, also known as "Why Calculating Is Better Than Scheming" (and also available from the ACM Digital Library). I'm not sure why I'd never run into this old paper before; it appeared in a 1987 issue of the SIGPLAN Notices. In any case, I am glad I did now, because it offers some neat insights on teaching introductory programming. Some of you may recall its author, Philip Wadler, from his appearance in this blog as Lambda Man a couple of OOPSLAs ago.

In this paper, Wadler argues that Structure and Interpretation of Computer Programs, which I have lauded as one of the great CS books, could be improved as a vehicle for teaching introductory programming by using a language other than Scheme. In particular, he thinks that four language features are helpful, if not essential:

  • pattern matching
  • a more mathematics-like syntax
  • types, both static and user-defined
  • lazy evaluation

Read the paper for an excellent discussion of each, but I will summarize. Pattern matching pulls the syntax of many decisions out of a single function and creates a separate expression for each case. This is similar to writing separate functions for each case, and in some ways resembles function overloading in languages such as Java and C++. A syntax more like traditional math notation is handy when teaching students to derive expressions and to reason about values and correctness. Static typing requires code to state clearly the kinds of objects it manipulates, which eliminates a source of confusion for students. Finally, lazy evaluation allows programs to express meaningful ideas in a natural way without forcing evaluations that are not strictly necessary. This can also be useful when doing derivation and proof, but it also opens the door to some cool applications, such as infinite streams.
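Infinite streams are easy to demo even outside the languages Wadler had in mind. Here is a small sketch using Ruby's lazy enumerators, chosen only because Ruby is the language I reach for, not because Wadler suggested it; it gives the flavor of describing an infinite structure and taking only what you need:

    # An infinite stream of squares, evaluated only on demand.
    naturals = (1..Float::INFINITY).lazy
    squares  = naturals.map { |n| n * n }

    p squares.first(10)                           # [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
    p squares.select { |n| n % 7 == 0 }.first(3)  # [49, 196, 441]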

We teach functional programming and use some of these concepts in a junior-/senior-level programming languages course, where many of Wadler's concerns are less of an issue. (They do come into play with a few students, though; Wadler might say we wouldn't have these problems if we taught our intro course differently!) But for freshmen, the smallest possibilities of confusion become major confusions. Wadler offers a convincing argument for his points, so much so that Felleisen, a Scheme guy throughout, has applied many of these suggestions in the TeachScheme! project. Rather than switching to a different language, the TeachScheme! team chose to simplify Scheme through a series of "teaching languages" that expose concepts and syntax just-in-time.

If you want evidence that Wadler is describing a very different way to teach introductory programming, consider this from the end of Section 4.1:

I would argue that the value of lazy evaluation outweighs the value of being able to teach assignment in the first course. Indeed, I believe there is great value in delaying the introduction of assignment until after the first course.

The assignment statement is like mom and apple pie in most university CS1 courses! The typical CS faculty member could hardly conceive of an intro course without assignment. Abelson and Sussman recognized that assignment need not be introduced so early, waiting until the middle of SICP to introduce set!. But for most computer scientists and CS faculty, postponing assignment would require a Kuhn-like paradigm shift.

Advocates of OOP in CS1 encountered this problem when they tried to do real OOP in the first course. Consider the excellent Object-Oriented Programming in Pascal: A Graphical Approach, which waited until the middle of the first course to introduce if-statements. From the reaction of most faculty I know, you would have thought that Conner, Niguidula, and van Dam were asking people to throw away The Ten Commandments. Few universities adopted the text despite its being a wonderful and clear introduction to programming in an object-oriented style. As my last post noted, OOP causes us to think differently, and if the faculty can't make the jump in CS1 then students won't -- even if the students could.

(There is an interesting connection between the Conner, Niguidula, and van Dam approach and Wadler's ideas. The former postpones explicit decision structures in code by distributing them across objects with different behavior. The latter postpones explicit decision structures by distributing them across separate cases in the code, which look like overloaded function definitions. I wonder if CS faculty would be more open to waiting on if-statements through pattern matching than they were through the dynamic polymorphism of OOP?)

Wadler indicates early on that his suggestions do not presuppose functional programming except perhaps for lazy evaluation. Yet his suggestions are not likely to have a wide effect on CS1 in the United States any time soon, because even if they were implemented in a course using an imperative language, most schools simply don't teach CS1 in a way compatible with these ideas. Still, we would be wise to take them to heart, as Felleisen did, and use them where possible to help us make our courses better.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 10, 2008 1:54 PM

Object-Oriented Algorithm Flashback

Most papers presented at SIGCSE and the OOPSLA Educators' Symposium are about teaching methods, not computational methods. When the papers do contain new technical content, it's usually content that isn't really new, just new to the audience or to mainstream use in the classroom. The most prominent example of the latter that comes to mind immediately is the series of papers by Zung Nguyen and Stephen Wong at SIGCSE on design patterns for data structures. Those papers were valuable in principle because they showed that how one conceives of containers changes when one is working with objects. In practice, they sometimes missed their mark because they were so complex that many teachers in the audience said, "Cool! But I can't do that in class."

However, the OOPSLA Educators' Symposium this year received a submission with a cool object-oriented implementation of a common introductory programming topic. Unfortunately, it may not have made the cut for inclusion based on some technical concerns of the committee. Even so, I was so happy to see this paper and to play with the implementation a little on the side! It reminded me of one of the first efforts I saw in a mainstream CS book to show how we think differently about a problem we all know and love when working with objects. That was Tim Budd's implementation of the venerable eight queens problem in An Introduction to Object-Oriented Programming.

Rather than implement the typical procedural algorithm in an object-oriented language, Budd created a solution that allowed each queen to solve the problem for herself by doing some local computation and communicating with the queen to her right. I remember first studying his code to understand how it worked and then showing it to colleagues. Most of them just said, "Huh?" Changing how we think is hard, especially when we already have a perfectly satisfactory solution for the problem in mind. You have to want to get it, and then work until you do.

You can still find Budd's code from the "download area" link on the textbook's page, though you might find a more palatable version in the download area for the book's second edition. I just spent a few minutes creating a Ruby version, which you are welcome to. It is slightly Ruby-ized but mostly follows Budd's solution for now. (Note to self: have fun this weekend refactoring that code!)
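To give the flavor without making you download anything, here is a stripped-down sketch of the approach. It is a reconstruction of the idea from memory, not Budd's code, and the details (names, which direction the queens talk) surely differ from his and from the version linked above:

    # Each queen occupies one column, knows her current row, and knows
    # the neighboring queen placed before her. She finds a row that no
    # neighbor can attack, asking the neighbors to shift when she runs
    # out of rows herself.
    class Queen
      attr_reader :row, :column

      def initialize(column, neighbor)
        @column   = column
        @neighbor = neighbor   # the previously placed queen, or nil
        @row      = 1
      end

      # can this queen, or any queen before her, attack the given square?
      def can_attack?(test_row, test_column)
        column_diff = test_column - @column
        return true if test_row == @row || (test_row - @row).abs == column_diff
        @neighbor ? @neighbor.can_attack?(test_row, test_column) : false
      end

      # settle on a row consistent with all previously placed queens
      def find_solution
        while @neighbor && @neighbor.can_attack?(@row, @column)
          return false unless advance
        end
        true
      end

      # try the next row; if none is left, ask the neighbors to move first
      def advance
        if @row < 8
          @row += 1
          return find_solution
        end
        return false unless @neighbor && @neighbor.advance
        @row = 1
        find_solution
      end
    end

    queens   = []
    neighbor = nil
    (1..8).each do |column|
      neighbor = Queen.new(column, neighbor)
      queens << neighbor
      raise "no solution" unless neighbor.find_solution
    end
    puts queens.map { |q| "(#{q.column}, #{q.row})" }.join(" ")

There is no central loop over the board; the search emerges from queens negotiating with their neighbors, which is exactly the shift in thinking that makes the example worth showing.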

Another thing I liked about "An Introduction to Object-Oriented Programming" was its linguistic ecumenism. All examples were given in four languages: Object Pascal, C++, Objective C, and Smalltalk. The reader could learn OOP without tying it to a single language, and Budd could point out subtle differences in how the languages worked. I was already a Smalltalk programmer and used this book as a way to learn some Objective C, a skill which has been useful again this decade.

(Budd's second edition was a step forward in one respect, by adding Java to the roster of languages. But it was also the beginning of the end. Java soon became so popular that the next version of his book used Java only. It was still a good book for its time, but it lost some of its value when it became monolingual.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

July 07, 2008 12:48 PM

More on Problems and Art in Computer Science

Last week I wrote about an essay by Paul Lockhart from a few years ago that has been making the rounds this year. Lockhart lamented that math is so misrepresented in our schools that students grow up missing out on its beauty, while still not being able to perform the skills in whose name we have killed scholastic math. I've long claimed that we would produce more skilled students if we allowed them to approach these skills from the angle of engaging problems. For Lockhart, such problems come from the minds of students themselves and may have no connection to the "real world".

In computer science, I think letting students create their own problems is also quite valuable. It's one of the reasons that open-ended project courses and independent undergraduate research so often lead to an amazing level of learning. When a group of students wants to train a checkers-playing program to learn from scratch, they'll figure out ways to do it. Along the way, they learn a ton -- some of it material I would have selected for them, some beyond what I would have guessed.

The problems CS students create for themselves often do come straight out of their real world, and that's okay, too. Many of us in CS love abstract problems such as the why of Y, but most of us -- even the academics who make a living in the abstractions -- came to computing from concrete problems. I think I was this way, starting when I learned BASIC in high school and wanted to make things, like crosstables for chess tournaments and ratings for the players in our club. From there, it wasn't that far a journey into Gödel, Escher, Bach and grad school! Along the way, I had professors and friends who introduced me to a world much larger than the one in which I wrote programs to print pages on which to record chess games.

This is one reason that I tout Owen Astrachan's problem-based learning project for CS. Owen is interested in problems that come from the real world, outside the minds of the crazy but harmless computer scientists he and I know, love, and are. These are the problems that matter to other people, which is good for the long-term prospects of our discipline and great for hooking the minds of kids on the beauty and power of computing. For computer science students, I am a proponent of courses built around projects, because they are big enough to matter to CS students and big enough to teach them lessons they can't learn working on smaller pieces of code.

With an orientation toward the ground, discussions of functional programming versus object-oriented programming seem almost not to matter. Students can solve any problem in either style, right? So who cares? Well, those of us who teach CS care, and our students should, too, but it's important to remember that this is an inward-looking discussion that won't mean much to people outside of CS. It also won't matter much to our students as they first begin to study computer science, so we can't turn our first-year courses into battlegrounds of ideology. We need to be sure that, whatever style we choose to teach first, we teach it in a way that helps students solve problems -- and create the problems that interest them. The style needs to feel right for the kind of problems we expose them to, so that the students can begin to think naturally about computational solutions.

In my department we have for more than a decade introduced functional programming as a style in our programming languages course, after students have seen OOP and procedural programming. I see a lot of benefit in teaching FP sooner, but that would not fit our faculty all that well. (The students would probably be fine!) Functional programming has a natural home in our languages course, where we teach it as an especially handy way of thinking about how programming languages work. This is a set of topics we want students to learn anyway, so we are able to introduce and practice a new style in the context of essential content, such as how local variables work and how to write a parser. If a few students pick up on some of the beautiful ideas and go do something crazy, like fire up a Haskell interpreter and try to grok monads, well, that's just fine.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 02, 2008 10:23 AM

Math and Computing as Art

So no, I'm not complaining about the presence
of facts and formulas in our mathematics classes,
I'm complaining about the lack of mathematics
in our mathematics classes.
-- Paul Lockhart

A week or so ago I mentioned reading a paper called A Mathematician's Lament by Paul Lockhart and said I'd write more on it later. Yesterday's post, which touched on the topic of teaching what is useful, reminded me of Lockhart, a mathematician who stakes out a position that is at once diametrically opposed to the notion of teaching what is useful about math and yet grounded in a way that our K-12 math curriculum is not. This topic is especially salient for me right now because our state is these days devoting some effort to the reform of math and science education, and my university and college are playing a leading role in the initiative.

Lockhart's lament is not that we teach mathematics poorly in our K-12 schools, but rather that we don't teach mathematics at all. We teach definitions, rules, and formal systems that have been distilled away from any interesting context, in the name of teaching students skills that will be useful later. What students do in school is not what mathematicians do, and that's a shame, because mathematics is fun, creative, beautiful -- art.

As Lockhart described his nightmare of music students not being allowed to create or even play music, having to copy and transpose sheet music, I cringed, because I recognized how so many of our introductory CS courses work. As he talked about how elementary and HS students never get to "hear the music" in mathematics, I thought of Brian Greene's Put a Little Science in Your Life, which laments the same problem in science education. How have we managed to kill all that is beautiful in these wonderful ideas -- these powerful and even useful ideas -- in the name of teaching useful skills? So sad.

Lockhart sets out an extreme stance. Make math optional. Don't worry about any particular content, or the order of topics, or any particular skills.

Mathematics is the music of reason. To do mathematics is to engage in an act of discovery and conjecture, intuition and inspiration; to be in a state of confusion--not because it makes no sense to you, but because you gave it sense and you still don't understand what your creation is up to; to have a breakthrough idea; to be frustrated as an artist; to be awed and overwhelmed by an almost painful beauty; to be alive, damn it.

I teach computer science, and this poetic sense resonates with me. I feel these emotions about programs all the time!

In the end, Lockhart admits that his position is extreme, that the pendulum has swung so far to the "useful skills" side of the continuum he feels a need to shout out for the "math is beautiful" side. Throughout the paper he tries to address objections, most of which involve our students not learning what they need to know to be citizens or scientists. (Hint: Does anyone really think that most students learn that now? How much worse off could we be to treat math as art? Maybe then at least a few more students would appreciate math and be willing to learn more.)

This paper is long-ish -- 25 pages -- but it is a fun read. His screed on high school geometry is unrestrained. He calls geometry class "Instrument of the Devil" because it so thoroughly and ruthlessly kills the beauty of proof:

Other math courses may hide the beautiful bird, or put it in a cage, but in geometry class it is openly and cruelly tortured.

His discussion of proof as a natural product of a student's curiosity and desire to explain an idea is as well written as any I've read. It extends another idea from earlier in the paper that fits quite nicely with something I have written about computer science: Mathematics is the art of explanation.

By concentrating on what, and leaving out why, mathematics is reduced to an empty shell. The art is not in the "truth" but in the explanation, the argument. It is the argument itself which gives the truth its context, and determines what is really being said and meant. Mathematics is the art of explanation. If you deny students the opportunity to engage in this activity--to pose their own problems, make their own conjectures and discoveries, to be wrong, to be creatively frustrated, to have an inspiration, and to cobble together their own explanations and proofs--you deny them mathematics itself.

I am also quite sympathetic to one of the other themes that runs deeply in this paper:

Mathematics is about problems, and problems must be made the focus of a student's mathematical life.

(Ditto for computer science.)

... you don't start with definitions, you start with problems. Nobody ever had an idea of a number being "irrational" until Pythagoras attempted to measure the diagonal of a square and discovered that it could not be represented as a fraction.

Problems can motivate students, especially when students create their own problems. That is one of the beautiful things about math: almost anything you see in the world can become a problem to work on. It's also true of computer science. Students who want to write a program to do something -- play a game, predict a sports score, track their workouts -- will go out of their way to learn what they need to know. I'm guessing anyone who has taught computer science for any amount of time has experienced this first hand.

As I've mentioned here a few times, my colleague Owen Astrachan is working on a big project to explore the idea of problem-based learning in CS. (I'm wearing the project's official T-shirt as I type this!) This idea is also right in line with Alan Kay's proposal for an "exploratorium" of problems for students who want to learn to communicate via computation, which I describe in this entry.

I love this passage from one of Lockhart's little dialogues:

SALVIATI:     ... people learn better when the product comes out of the process. A real appreciation for poetry does not come from memorizing a bunch of poems, it comes from writing your own.

SIMPLICIO:     Yes, but before you can write your own poems you need to learn the alphabet. The process has to begin somewhere. You have to walk before you can run.

SALVIATI:     ... No, you have to have something you want to run toward.

You just have to have something you want to run toward. For teenaged boys, that something is often a girl, and suddenly the desire to write a poem becomes a powerful motivator. We should let students find goals to run toward in math and science and computer science, and then teach them how.

It's interesting that I end with a running metaphor, and not just because I run. My daughter is a sprinter and now hurdler on her school track team. She sprints because she likes to run short distances and hates to run anything long (where, I think, "long" is defined as anything longer than her race distance!). The local runners' club leads a summer running program for high school students, and some people thought my daughter would benefit. One benefit of the program is camaraderie; one drawback is that it involves serious workouts. Each week the group does a longer run, a day of interval training, and a day of hill work.

I suggested that she might benefit more from simply running more -- not doing workouts that kill her, just building up a base of mileage and getting stronger while enjoying some longer runs. My experience is that it's possible to get over the hump and go from disliking long runs to enjoying them. Then you can move on to workouts that make you faster. So she and I are going to run together a couple of times a week this summer, taking it easy, enjoying the scenery, chatting and otherwise not stressing about "long runs".

There is an element of beauty versus duty in learning most things. When the task is all duty, you may do it, but you may never like it. Indeed, you may come to hate it and stop altogether when the external forces that keep you on task (your teammates, your sense of belonging) disappear. When you enjoy the beauty of what you are doing, everything else changes. So it is with math, I think, and computer science, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Running, Teaching and Learning

July 01, 2008 4:21 PM

A Small Curricular Tempest

A couple of weeks ago I linked to Shriram Krishnamurthi, who mentioned a recent SIGPLAN-sponsored workshop that has proposed a change to ACM's curriculum guidelines. The change is quite simple, shifting ten hours of instruction in programming languages from small topics into a single ten-hour category called "functional programming". Among the small topics that would be affected, coverage of recursion and event-driven programming would be halved, and coverage of virtual machines and language translation would no longer be mandated separately, nor would an "overview" of programming languages.

In practice, the proposal to eliminate coverage of some areas has less effect than you might think. Recursion is a natural topic in functional programming, and event-driven programming is a natural topic in object-oriented programming. The current recommendation of three hours total to cover virtual machines and language translation hardly does them justice anyway; students can't possibly learn any of the valuable ideas in depth in that amount of time. If schools adopt this change, they would spend that time more productively, helping students to understand functional programming well. Many schools will probably continue to teach those topics as part of their principles of programming languages course anyway.

I didn't comment on the proposal in detail earlier because it seemed more like the shuffling of deck chairs than a major change in stance. I do approve of the message the proposal sends, namely that functional programming is important enough to be a core topic in computer science. Readers of this blog already know where I stand on that.

Earlier this week, though, Mark Guzdial blogged Prediction and Invention: Object-oriented vs. functional, which has created some discussion in several circles. He starts with "The goal of any curriculum is to prepare the students for their future." Here is my take.

Mark seems to be saying that functional programming is not sufficiently useful to our students to make it a core programming topic. Mandating that schools teach ten hours each of functional and object-oriented programming, he thinks, tells our students that we faculty believe functional programming is -- or will be -- as important as object-oriented programming to their professional careers. Our students get jobs in companies that primarily use OO languages and frameworks, and our curricula should reflect that.

This piece has a vocational tone that I find surprising coming from Mark, and that is perhaps what most people are reacting to when they read it. When he speaks of making sure the curriculum teaches what is "real" to students, or how entry-level programmers often find themselves modifying existing code with an OO framework, it's easy to draw a vocational theme from his article. A lot of academics, especially computer scientists, are sensitive to such positions, because the needs of industry and the perceptions of our students already exert enough pressure on CS curriculum. In practical terms, we have to find the right balance between practical skills for students and the ideas that underlie those skills and the rest of computing practice. We already know that, and "esoteric" topics such as functional programming and computing theory are already part of that conversation.

Whether Mark is willing to stand behind the vocational argument or not, I think there is another theme in his piece that also requires a balance he doesn't promote. It comes back to the role of curriculum guidelines in shaping what schools teach and expressing what we think students should learn. Early on, he says,

I completely disagree that we should try to mandate that much functional programming through manipulation of the curriculum standards.

And later:

Then, when teaching more functional programming becomes a recognized best practice, it will be obvious that it should be part of the curriculum standards.

The question is whether curriculum standards should be prescriptive or descriptive. Mark views the current SIGPLAN proposal as prescribing an approach that contradicts both current best practice and the needs of industry, rather than describing best practice in schools around the country. And he thinks curriculum standards should be descriptive.

I am sensitive to this sort of claim myself, because -- like Mark! -- I have been contending for many years with faculty who think OOP is a fad and has no place in a CS curriculum, or at least in our first-year courses. These faculty, both at my university and throughout the country, argue that our courses should be about what students "really do" in the world, not about esoteric design patterns and programming techniques. In the end, these people claim that people like me are trying to prescribe a paradigm for how our students should think.

The ironic thing, of course, is that over the last fifteen years OOP and Java have gone from being something new to the predominant tools in industry. It's a good thing that some schools started teaching more OOP, even in the first year, and developing the texts and teaching materials that other schools could use to join in later.

(The people arguing against OOP in the first year have not given up the case; they've now shifted to claiming that we should teach even Java "fundamentals first", going "back to basics" before diving into all that complicated stuff about data and procedures bearing some relation to one another. I've written about that debate before and have tremendous respect for many of the people on the front line of "basics" argument. I still disagree.)

As in the case of vocational versus theoretical content, I think we need to find the right balance between prescriptive and descriptive curriculum standards. These two dimensions are not wholly independent of each other, but they are different and so call for different balances. I agree with Mark that at least part of our curriculum standard should be descriptive of current practice, both in universities and in industry. Standard curricular practice is important in helping to create some consistency across universities and helping to keep schools that are not in the know on a solid and steady path. And the simple fact is that our students do graduate into professional careers and need to be prepared to participate in an economy that increasingly depends on information technology. For those of us at state-supported universities, this is a reasonable expectation of the people who pay our bills.

However, I think that we also need some prescriptive elements to our curricula. As Alan Kay says in a comment on Mark's blog, universities have a responsibility not only to produce graduates capable of participating in the economy but also to help students become competent, informed citizens in a democracy. This is perhaps even more important at state-supported universities, which serve the citizenry of the state. This may sound too far from the ground when talking about computer science curriculum, but it's not. The same ideas apply -- to growing informed citizens, and to growing informed technical professionals.

The notion that curriculum standards are partly prescriptive is not all that strange, because it's not that different from how curriculum standards have worked in the past, really. Personally, I like having experts in areas such as programming languages and operating systems helping us keep our curricular standards up to date. I certainly value their input for what they know to be current in the field. I also value their input because they know what is coming, what is likely to have an effect on practice in the near future, and what might help students understand better the more standard content we teach.

At first I had a hard time figuring out Mark's position, because I know him to grok functional programming. Why was he taking this position? What were his goals? His first paragraph seems to lay out his goal for the CS curriculum:

The goal of any curriculum is to prepare the students for their future. In just a handful of years, teachers aim to give the students the background to be successful for several decades.

He then recognizes that "the challenge of creating a curriculum is the challenge of predicting the future."

These concerns seem to sync quite nicely with the notion of encouraging all students to learn a modicum of functional programming! I don't have studies to cite, but I've often heard and long believed that the more different programming styles and languages a person learns, the better a programmer she will be. Mark points to studies showing little direct transfer from skills learned in one language to skills learned in another, and I do not doubt their truth. But I'm not even talking about direct transfer of knowledge from functional programming to OOP; I'm thinking of the sort of expansion of the mind that happens when we learn different ways to think about problems and implement solutions. A lot of the common OO design patterns borrow ideas from other domains, including functional programming. How can we borrow interesting ideas if we don't know about them?

It is right and good that our curriculum standards push a little beyond current technical and curricular practice, because then we are able to teach ideas that can help computing evolve. This evolution is just as important in the trenches of a web services group at an insurance company as it is to researchers doing basic science. In the particular case of functional programming, students learn not only beautiful ideas but also powerful ideas, ideas that are germinating now in the development of programming languages in practice, from Ruby and Python to .NET. Our students need those ideas for their careers.

As I mentioned, Alan Kay chimed in with a few ideas. I think he believes we can predict the future by inventing it, through curriculum as much as anything else. His idealism on these issues seems to frustrate some people, but I find it refreshing. We can set our sights higher and work to make something better. When I used the allusion to "shuffling the deck chairs" above, I was thinking of Kay, who is on record as saying that how we teach CS is broken. He has also talked to CS educators and exhorted us to set our sights higher. Kay supports the idea of prescriptive curricula for a number of reasons, the most relevant of which to this conversation is that we don't want to hard-code accidental or misguided practice, even if it's the "best" we have right now. Guzdial rightly points out that we don't want to prescribe new accidental or misguided practices, either. That's where the idea of striking a balance comes in for me. We have to do our best to describe what is good now and prescribe at least a little of what is good for the future.

I see no reason we can't invent good futures through judiciously defined curricula, just as we invent futures in other arenas. Sure, we face social, societal, and political pressures, but how many arenas don't?

So, what about the particular curriculum proposal under discussion? Unlike Guzdial, I like the message it sends, that functional programming is an important topic for all CS grads to learn about. But in the end I don't think it will cause any dramatic changes in how CS departments work. I used the word "encourage" above rather than Guzdial's more ominous "mandate", because even ACM's curriculum standards have no force of law. Under the proposed plan, a few schools might try to present a coherent treatment of functional programming where now they don't, at the expense of covering a few good ideas at a shallow level. There will continue to be plenty of diversity, one of the values that guides Guzdial's vision. On this, he and I agree strongly. Diversity in curricula is good, both for the vocational reasons he asserts and because we learn even better how to teach CS well from the labors and explorations of others.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 10, 2008 4:09 PM

Fall Semester Seems Far Away Right Now

I'm not yet thinking about my Programming Languages course this fall, but I wish I were. The first temptation came via my newsreader and Martin Fowler's recent piece ParserFear. Martin is writing a book on domain-specific languages and blogging occasionally on his ideas. This article is about just what the title says, programmers' often irrational fears of rolling a parser, which deter them from implementing their own DSLs. He speculates:

So why is there an unreasonable fear of writing parsers for DSLs? I think it boils down to two main reasons.

  • You didn't do the compiler class at university and therefore think parsers are scary.
  • You did do the compiler class at university and are therefore convinced that parsers are scary.

The first is easy to understand, people are naturally nervous of things they don't know about. The second reason is the one that's interesting. What this boils down to is how people come across parsing in universities.

CS students tend to learn about parsing for the first time in a compiler class, working on a large language with a full range of constructs. Parsing such languages is hard. In a few compiler courses, including my own, students still learn to build a parser by hand and so don't even have the chance to use parser generators as a labor- and pain-saving device. That's okay in a compiler course, which students take in large part to learn how their tools really work.

Martin doesn't suggest that we change the compiler course, and I don't either (though I'm open to possibilities). He does seem to think it's a shame that students are turned off to parsing and language design by first, and perhaps only, seeing them at their most complex. I agree and think that we should do more to introduce these ideas to students earlier.

I introduce students to the idea of parsing in my Programming Languages course and ask them to write a few very small parsers to handle simple languages. I've been thinking for a while that I should do more in this area, and reading Martin's article has me itching to redesign the latter part of my course to allow more work with parsing and parsers. Another possibility is to use parsing as the content of some of our early programming exercises, when students are nominally learning to program in a functional style. Students can certainly apply the programming techniques they are learning to translate simple data formats into more abstract forms. This might help them begin to see that parsing is an idea broader than just general-purpose programming languages. It might ease their transition a couple of weeks later to the idea of syntax-as-data structure and allow us to do some simple work with DSLs.
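
Here is the flavor of what I have in mind, sketched in Ruby rather than the language we use in class, with a made-up expression format: tokenize a string of nested prefix expressions and build the corresponding nested data structure.

```ruby
# A sketch of "syntax as data": turn a small, hypothetical prefix-expression
# format into nested Ruby arrays. Not course code -- just the shape of the
# exercise.

def tokenize(text)
  text.gsub('(', ' ( ').gsub(')', ' ) ').split
end

def parse(tokens)
  token = tokens.shift
  raise 'unexpected end of input' if token.nil?
  if token == '('
    list = []
    list << parse(tokens) until tokens.first == ')'
    tokens.shift                               # consume the closing ')'
    list
  elsif token == ')'
    raise 'unexpected )'
  else
    token =~ /\A-?\d+\z/ ? token.to_i : token  # numbers become numbers
  end
end

p parse(tokenize('(+ 1 (* 2 3))'))             # => ["+", 1, ["*", 2, 3]]
```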

I have tried the idea of parsing at its simplest with relative novices as early as CS 1. When I taught a media computation CS1, one of the last programming assignments was to write a program to read a file of simple graphics commands and produce the desired graphical image. The idea was to bypass Java's verbose syntax for doing simple AWT/Swing graphics and allow non-programmers (artists) to make images. I asked students to implement a couple of simple commands, such as "draw line", and create at least one command of their own. I expected all of their graphics languages to be "straight-line", with no control or data structures, but that didn't mean that the resulting DSLs were not useful. They were just simple. A couple of students did really interesting work, creating very high-level commands for swirls and splashes. These students wrote methods to interpret those commands using control and data structures that their "programmers" didn't have to know about.
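
For the curious, here is a rough sketch of the shape of such an interpreter, in Ruby rather than the Java/AWT code my students worked with. The command names and the text output are stand-ins I made up, not the assignment's actual commands.

```ruby
# A straight-line graphics DSL interpreter: read one command per line,
# split it into words, and dispatch on the first word. The commands and
# the text output are invented for illustration.
COMMANDS = {
  'line'   => ->(x1, y1, x2, y2) { puts "draw line from (#{x1},#{y1}) to (#{x2},#{y2})" },
  'circle' => ->(x, y, r)        { puts "draw circle at (#{x},#{y}) with radius #{r}" }
}

def interpret(lines)
  lines.each do |line|
    words = line.split
    next if words.empty? || words.first.start_with?('#')   # skip blanks and comments
    name, *args = words
    handler = COMMANDS.fetch(name) { raise "unknown command: #{name}" }
    handler.call(*args.map(&:to_i))
  end
end

interpret(['# a tiny straight-line "program"',
           'line 10 10 90 90',
           'circle 50 50 20'])
```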

Any change to my Programming Languages course needs to be "footprint-neutral", to use Shriram Krishnamurthi's phrase. Anything I add has to fit in my fifteen-week course and either displace existing material or work in parallel. Shriram used this phrase in the broader context of rejiggering the teaching of programming languages within the core ACM curriculum, which a recent SIGPLAN-sponsored workshop tried to do. After having just read Martin's article on parsing, I was eager to refresh my memory on where parsing fits into the core curriculum proposal. Parsing falls under knowledge unit PL3, Language Translation, which has two hours allotted to it in the core. (An optional knowledge unit on Language Translation Systems includes more.) Interestingly, the working group Shriram reports on recommends cutting those two hours to zero, on the grounds that the current coverage is too superficial, and using those hours to build up a 10-hour unit on Functional Programming. That's a worthy goal, though I haven't had a chance to think deeply about the proposal yet.

Working with constrained resources sometimes requires making tough choices. I know that Shriram and the people with whom he worked think that parsing is a worthwhile topic for CS students to know, so perhaps they have in mind something like what I suggested above: piggybacking some coverage of parsing on top of the coverage of functional programming. In any case, I think I'll work on finding more ways for my Programming Languages students to engage parsing and domain-specific languages.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 31, 2008 12:33 AM

K-12 Road Show Summit, Day Two

The second half of the workshop opened with one of the best sessions of the event, the presentation "What Research Tells Us About Best Practices for Recruiting Girls into Computing" by Lecia Barker, a senior research scientist at the National Center for Women and IT. This was great stuff, empirical data on what girls and boys think and prefer. I'll be spending some time looking into Barker's summary and citations later. Some of the items she suggested confirm common sense, such as not implying that you need to be a genius to succeed in computing; you only need to be capable, as in anything else. I wonder if we realize how often our actions and examples implicitly say "CS is difficult" to interested young people. We can also use implicit cues to connect with the interests of our audience, such as applications that involve animals or the health sciences, or images of women serving in positions of leadership.

Other suggestions were newer to me. For example, evidence shows that Latina girls differ more from white and African-American girls than white and African-American girls differ from each other. This is good to know for my school, which is in the Iowa metro area with the highest percentage of African-Americans and a burgeoning Latina population. She also suggested that middle-school girls and high-school girls have different interests and preferences, so outreach activities should be tailored to the audience. We need to appeal to girls now, not to who they will be in three years. We want them to be making choices now that lead to a career path.

A second Five-Minute Madness session had less new information for me. I thought most about funding for outreach activities, such as ongoing support for an undergraduate outreach assistant whom we have hired for next year using a one-time grant from the university's co-op office. I had never considered applying for a special projects grant from the ACM for outreach, and the idea of applying to Avon was even more shocking!

The last two sessions were aimed at helping people get a start on designing an outreach project. First, the whole group brainstormed ideas for target audiences and goals, and then the newbies in the room designed a few slides for an outreach presentation with guidance from the more experienced people. Second, the two groups split, with the newbies working more on design and the experienced folks discussing the biggest challenges they face and ways to overcome them.

These sessions again made clear that I need to "think bigger". One, outreach need not aim only at schools; we can engage kids through libraries, 4-H (which has broadened its mission to include technology teams), the FFA, Boys and Girls Clubs, and the YMCA and YWCA. Some schools report interesting results from working with minority girls through mother/daughter groups at community centers. Sometimes, the daughters end up encouraging the moms to think bigger themselves and seek education for more challenging and interesting careers. Two, we have a lot more support from upper administration and from CS faculty at my school than most outreach groups have at their schools. This means that we could be more aggressive in our efforts. I think we will next year.

The workshop ended with a presentation by Gabe Cohen, the project manager for Google Apps. This was the only sales pitch we received from Google in the time we were here (other than being treated and fed well), and it lasted only fifteen minutes. Cohen showed a couple of new-ish features of the free Apps suite, including spreadsheets with built-in support for web-based form input. He closed hurriedly with a spin through the new AppEngine, which debuted to the public on Wednesday. It looks cool, but do I have time?

The workshop was well-done and worth the trip. The main point I take away is to be more aggressive on several fronts, especially in seeking funding opportunities. Several companies we work with have funded outreach activities at other schools, and our state legislative and executive branches have begun to take this issue seriously from the standpoint of economic development. I also need to find ways to leverage faculty interest in doing outreach and interest from our administration in both STEM education initiatives and community service and outreach.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

May 30, 2008 7:23 PM

K-12 Road Show Summit, Day One

The workshop has ended. Google was a great host, from beginning to end. They began offering food and drinks almost immediately, and we never hungered or thirsted for long. That part of the trip made Google feel like the young person's haven it is. Wherever we went, the meeting tables included recessed power and ethernet cables for every kind of laptop imaginable, including my new Mac. (Macbook Pros were everywhere we went at Google.) But we learned right away that visitors must stay within bounds. No wandering around was allowed; we had to remain within sight of a Googler. And we were told not to take any photos on the grounds or in the buildings.

The workshop was presented live from within Google Docs, which allowed the leaders and presenters to display from a common tool and to add content as we went along. The participants didn't have access to the doc, but we were given it as a PDF file -- on the smallest flash drive I've ever owned. It's a 1GB stick with the dimensions of the delete key on my laptop (including height).

The introduction to the workshop consisted of a linked-list game in which each person introduced the person to his left, followed by remarks from Maggie Johnson, the Learning and Development Director at Google Engineering, and Chris Stephenson, the executive director of ACM's Computer Science Teachers Association. The game ran a bit long, but it let everyone see how many different kinds of people were in the room, including a lot of non-CS faculty who lead outreach activities for some of the bigger CS departments. Chris expressed happiness that K-12, community colleges, and universities were beginning to work together on the CS pipeline. Outreach is necessary, but it can also be joyful. (This brought to mind her panel statement at SIGCSE, in a session I still haven't written up...)

Next up was Liz Adams reporting on her survey of people and places doing road shows or thinking about it. She has amassed a lot of raw data, which is probably most useful as a source of ideas. During her talk, someone asked, does anyone know if what they are doing is working? This led to a good discussion of assessment and just what you can learn. The goals of these road shows are many. When we meet with students, are we recruiting for our own school? Or are we trying to recruit for the discipline, getting more kids to consider CS as a possible major? Are we working to reach more girls and underrepresented groups, or do we seek a rising tide? Perhaps we are doing service for the economy of our community, region, or state? The general answer is 'yes' to all of these things, which makes measuring success all the more difficult. While it's comforting to shoot wide, this may not be the most effective strategy for achieving any goal at all!

One idea I took away from this session was to ask students to complete a short post-event evaluation. I view most of our outreach activities these days as efforts to broaden interest in computer science generally, and to broaden students' views of the usefulness and attractiveness of computing even more generally. So I'd like to ask students about their perceptions of computing after we work with them. Comparing these answers to ones gathered before the activity would be even better. My department already asks students declaring CS majors to complete a short survey, and I plan to ensure it includes a question that will allow us to see whether our outreach activities have had any effect on the new students we see.

Then came a session called Five-Minute Madness, in which three people from existing outreach programs answered several questions in round-robin fashion, spending five minutes altogether on each. I heard a few useful nuggets here:

  • Simply asking a young student "What will you be when you grow up?" and then talking about what we do can be a powerful motivator for some kids.

  • Guidance counselors in the high schools are seriously misinformed about computing. No surprise there. But they often don't have access to the right information, or the time to look for it. The outreach program at UNC-Charlotte has produced a packet of information specifically for school counselors, and they visit with the counselors on any school visit they can.

  • Reaching the right teacher in a high school can be a challenge. It is hard to find "CS teachers" because so few states certify that specialty. Don't send a letter addressed to "the computing teacher"; it will probably end up in the trash can!

  • We have to be creative in talking to people in the Department of Education, as well as making sure we time our mailings and offerings carefully. Know the state's rules about curriculum, testing, and the like.

  • It's a relationship. Treat initial contacts with a teacher like a first date. Take the time to connect with the teacher and cultivate something that can last. One panelist said, "We HS teachers need a little romance." If we do things right, these teachers can become our biggest advocates and do a lot of "recruiting" for us through their everyday, long-term relationship with the students.

Dinner in one of the Google cafeterias was just like dinner in one of my university's residence halls, only with more diverse fare. A remarkable number of employees were there. Ah, to be young again.

Our first day closed with people from five existing programs telling us about their road shows. My main thought throughout this session was that these people spend a lot of time talking to -- at -- the kids. I wonder how effective this is with high school students and imagine that as the audience gets younger, this approach becomes even less effective. That said, I saw a lot of good slides with information that we can put to use. The presenters have developed a lot of good material.

Off to bed. Traveling west makes for long, productive days, but it also makes me ready to sleep!


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

May 30, 2008 6:13 PM

Geohashing in Ruby

A while back I read this xkcd comic, which introduced the idea of geohashing, selecting a meet-up location based on a date, the Dow-Jones Industrial Average, and MD5 hashing. Last month I ran across this wiki page on geohashing, which offers a reference implementation in Python. That's fine, even with all those underscores, but I decided to write a Ruby implementation for kicks. In particular, I had never worked with Ruby's MD5 digests and was glad to have a reason.

So, during a few stolen moments at the roadshow workshop (summary soon...), I knocked off an implementation. Here's my code. It's very simple and can certainly be improved. I tried to use idiomatic Ruby where I knew it, but some bits feel awkward. In other places, I mimicked the reference implementation perhaps too closely, so they still feel Python-y. Please send me your suggestions!
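
For readers who don't want to click through, the core of the algorithm fits in a few lines. What follows is a simplified sketch, not the implementation linked above, and the parameter names are my own.

```ruby
require 'digest/md5'

# A simplified sketch of the xkcd geohashing algorithm: hash the date and
# that day's Dow opening, then use the two halves of the MD5 digest as the
# fractional parts of the meet-up coordinates.
def geohash(latitude, longitude, date_string, dow_opening)
  digest   = Digest::MD5.hexdigest("#{date_string}-#{dow_opening}")
  lat_frac = digest[0, 16].to_i(16) / 16.0**16
  lon_frac = digest[16, 16].to_i(16) / 16.0**16
  [latitude.to_i + (latitude <=> 0) * lat_frac,
   longitude.to_i + (longitude <=> 0) * lon_frac]
end

# Prints a meet-up point in the same 1-degree-by-1-degree cell as the inputs.
p geohash(37.42, -122.08, '2005-05-26', '10458.68')
```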


Posted by Eugene Wallingford | Permalink | Categories: Computing

May 28, 2008 1:14 PM

Off to Visit Google

I'm preparing for a quick visit to the Google campus tomorrow and Friday. This is my first trip to the Google campus, and I have to admit that I'm looking forward to it. To this wide-eyed Midwestern computer scientist, it feels as if I am visiting Camelot.

The occasion of my trip is a "roadshow summit" co-sponsored by the Computer Science Teachers Association and SIGCSE, and hosted by Google. The CSTA is a unit of the ACM "that supports and promotes the teaching of computer science and other computing disciplines" in K-12 schools. The goal of the workshop is:

... to bring together faculty and students who are currently offering, or planning to develop, outreach "road shows" to local K-12 schools. Our goal is to improve the quality and number of college and university-supported careers and equity outreach programs by helping to develop a community that will share research, expertise, and best practices, and create shared resources.

My selfish goal in wanting to attend the workshop initially was to steal lots of good ideas from people with more experience and creativity than I have. My contribution will be to share what we have done in our department, especially over the last semester. I asked two faculty members to develop curricula for K-12 outreach activities, in lieu of one of their usual course assignments. The curriculum materials should be useful whether we take them on the road to the schools or use them when we have students on campus for visits. One professor started with robotics in mind but quickly switched to some simple programming activities with the Scratch programming environment. The other worked on high-performance and parallel computing for pre-college students, an education thread he has been working in for much of this decade. I do not have a link to materials he developed specifically for our outreach efforts yet, but I can point you to LittleFe, one of his ongoing projects.

I'm curious to see what other schools have done and still plan to steal as many ideas as I can! And, while I'm looking forward to the workshop and seeing Google's campus, I am not looking forward to the fast turnaround... My flight leaves tomorrow morning; we work Thursday afternoon, Thursday evening, Friday morning, and Friday early afternoon; and then I start the sojourn back home. I'll cover a lot of miles in forty-eight hours, but I hope they prove fruitful.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 20, 2008 12:47 PM

Cognitive Surplus and the Future of Programming

the sitcom All in the Family

I grew up on the sitcoms of the 1970s and 1980s. As kids, we saw almost everything in reruns, whether from the '60s or the '70s, and I enjoyed so many of them. By the time I got to college, I had well-thought-out ideas on why The Dick Van Dyke Show remains one of the best sitcoms ever, why WKRP in Cincinnati was underrated for its quality, and why All in the Family was _the_ best sitcom ever. I still hold all these biases in my heart. Of course, I didn't limit myself to sitcoms; I also loved light-action dramas, especially The Rockford Files.

Little did I know then that my TV viewing was soaking up a cognitive surplus in a time of social transition, or that it had anything in common with gin pushcarts in the streets of London at the onset of the Industrial Revolution.

Clay Shirky has published a wonderful little essay, Gin, Television, and Social Surplus, that taught me these things and put much of what we see happening on the web into the context of a changing social, cultural, and economic order. Shirky contends that, as our economy and technology evolve, a "cognitive surplus" is created. Energy that used to be spent on activities required in the old way is now freed for other purposes. But society doesn't know what to do with this surplus immediately, and so there is a transition period where the surplus is dissipated in (we hope) harmless ways.

My generation, and perhaps my parents', was part of this transition. We consumed media content produced by others. Some denigrate that era as one of mindless consumption, but I think we should not be so harsh. Shows like All in the Family and, yes, WKRP in Cincinnati often tackled issues on the fault lines of our culture and gave people a different way to be exposed to new ideas. Even more frivolous shows such as The Dick Van Dyke Show and The Rockford Files helped people relax and enjoy, and this was especially useful for those who were unprepared for the expectations of a new world.

We are now seeing the advent of the new order in which people are not relegated to consuming from the media channels of others but are empowered to create and share their own content. Much attention is given by Shirky and many, many others to the traditional media such as audio and video, and these are surely where the new generation has had its first great opportunities to shape its world. As Shirky says:

Here's something four-year-olds know: A screen that ships without a mouse ships broken. Here's something four-year-olds know: Media that's targeted at you but doesn't include you may not be worth sitting still for.

But as I've been writing about here, let's not forget the next step: the power to create and shape the media themselves via programming. When people can write programs, they are not relegated even to using the media they have been given but are empowered to create new media, and thus to express and share ideas that may otherwise have been limited to the abstraction of words. Flickr and YouTube didn't drop from the sky; people with ideas created new channels of dissemination. The same is true of tools like Photoshop and technologies such as wikis: they are ideas turned into reality through code.

Do read Shirky's article, if you haven't already. It has me thinking about the challenge we academics face in reaching this new generation and engaging them in the power that is now available to them. Until we understand this world better, I think that we will do well to offer young people lots of options -- different ways to connect, and different paths to follow into futures that they are creating.

One thing we can learn from the democratized landscape of the web, I think, is that we are not offering one audience many choices; we are offering many audiences the one or two choices each that they need to get on board. We can do this through programming courses aimed at different audiences and through interdisciplinary major and minor programs that embed the power of computing in the context of problems and issues that matter to our students.

Let's keep around the good old CS majors as well, for those students who want to go deep creating the technology that others are using to create media and content -- just as we can use the new technologies and media channels to keep great old sitcoms available for geezers like me.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

May 19, 2008 3:55 PM

"Rebooting Computing" Summit

I still have a couple of entries to make on SIGCSE 2008 happenings. If I don't hurry up, it will be time for reports from PLoP or OOPSLA! But I do have a bit of related good news to post...

I've received an invitation to Peter Denning's "Rebooting Computing" summit, which I first mentioned when covering Denning's talk at SIGCSE. The summit is scheduled for January 2009 and is part of Denning's NSF-funded Resparking Innovation in Computing Education project. This will be a chance to spend a few days with others thinking about this issue to outline concrete steps that we all might take to make change. I've written about this issue frequently here, most recently in the form of studio-based computing, project-based and problem-based learning, and programming for non-CS folks who bring computing into their own work (like scientists and artists). I'm excited about this chance.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 13, 2008 9:15 AM

Solid and Relevant

I notice a common rhetorical device in many academic arguments. It goes like this. One person makes a claim and offers some evidence. Often, the claim involves doing something new or seeing something in a new way. The next person rebuts the argument with a claim that the old way of doing or seeing things is more "fundamental" -- it is the foundation on which other ways of doing and seeing are built. Oftentimes, the rebuttal comes with no particular supporting evidence, with the claimant relying on many in the discussion to accept the claim prima facie. We might call this The Fundamental Imperative.

This device is standard issue in the CS curriculum discussions about object-oriented programming and structured programming in first-year courses. I recently noticed its use on the SIGCSE mailing list, in a discussion of what mathematics courses should be required as part of a CS major. After several folks observed that calculus was being de-emphasized in some CS majors, in favor of more discrete mathematics, one frequent poster declared:

(In a word, computer science is no longer to be considered a hard science.)

If we know [the applicants'] school well we may decide to treat them as having solid and relevant math backgrounds, but we will no longer automatically make that assumption.

Often, the conversation ends there; folks don't want to argue against what is accepted as basic, fundamental, good, and true. But someone in this thread had the courage to call out the emperor:

If you want good physicists, then hire people who have calculus. If you want good computer scientists, then hire people who have discrete structures, theory of computation, and program verification.

I don't believe that people who are doing computer science are not doing "hard science" just because it is not physics. The world is bigger than that.

...

You say "solid and relevant" when you really should be saying "relevant". The math that CS majors take is solid. It may not be immediately relevant to problems [at your company]. That doesn't mean it is not "solid" or "hard science".

I sent this poster a private "thank you". For some reason, people who drop the The Fundamental Imperative into an argument seem to think that it is true absolutely, regardless of context. Sure, there may be students who would benefit from learning to program using a "back to the basics" approach, and there may be CS students for whom calculus will be an essential skill in their professional toolkits. But that's probably not true of all students, and it may well be that the world has changed enough that most students would benefit from different preparation.

"The Fundamental Imperative" is a nice formal name for this technique, but I tend to think of it as "if it was good enough for me...", because so often it comes down to old fogies like me projecting our experience onto the future. Both parties in such discussions would do well not to fall victim to their own storytelling.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

May 12, 2008 12:24 PM

Narrative Fallacy on My Mind

In his recent bestseller The Black Swan: The Impact of the Highly Improbable, Nassim Nicholas Taleb uses the term narrative fallacy to describe man's penchant for creating a story after the fact, perhaps subconsciously, in order to explain why something happened -- to impute a cause for an event we did not expect. This fallacy derives from our habit of imposing patterns on data. Many view this as a weakness, but I think it is a strength as well. It is good when we use it to communicate ideas and to push us into backing up our stories with empirical investigation. It is bad when we let our stories become unexamined truth and when we use the stories to take actions that are not warranted or well-founded.

Of late, I've been thinking of the narrative fallacy in its broadest sense, telling ourselves stories that justify what we see or want to see. My entry on a response to the Onward! submission by my ChiliPLoP group was one trigger. Those of us who believe strongly that we could and perhaps should be doing something different in computer science education construct stories about what is wrong and what could be better; we're like anyone else. That one OOPSLA reviewer shed a critical light on our story, questioning its foundation. That is good! It forces us to re-examine our story, to consider to what extent it is narrative fallacy and to what extent it matches reality. In the best case, we now know more about how to tell the story better and what evidence might be useful in persuading others. In the worst, we may learn that our story is a crock. But that's a pretty good worst case, because it gets us back on the path to truth, if indeed we have fallen off.

A second trigger was finding a reference in Mark Guzdial's blog to a short piece on universal programming literacy at Ken Perlin's blog. "Universal programming literacy" is Perlin's term for something I've discussed here occasionally over the last year, the idea that all people might want or need to write computer programs. Perlin agrees but uses this article to consider whether it's a good idea to pursue the possibility that all children learn to program. It's wise to consider the soundness of your own ideas every once in a while. While Perlin may not be able to construct as challenging a counterargument as our OOPSLA reviewer did, he at least is able to begin exploring the truth of his axioms and the soundness of his own arguments. And the beauty of blogging is that readers can comment, which opens the door to other thinkers who might not be entirely sympathetic to the arguments. (I know...)

It is essential to expose our ideas to the light of scrutiny. It is perhaps even more important to expose the stories we construct subconsciously to explain the world around us, because they are most prone to being self-serving or simply convenient screens to protect our psyches. Once we have exposed the story, we must adopt a stance of skepticism and really listen to what we hear. This is the mindset of the scientist, but it can be hard to take on when our cherished beliefs are on the line.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Patterns

May 09, 2008 8:03 AM

Verdict Is In On One OOPSLA Submission

The verdict is in on the paper we wrote at ChiliPLoP and submitted to Onward!: rejected. (We are still waiting to hear back on our Educators' Symposium submission.) The reviews of our Onward! paper were mostly on the mark, both on surface features (e.g., our list of references was weak) and on the deeper ideas we offer (e.g., questions about the history of studio approaches, and questions about how the costs will scale). We knew that this submission was risky; our time was simply too short to afford enough iterations and legwork to produce a good enough paper for Onward!.

I found it interesting that the most negative reviewer recommended the paper for acceptance. This reviewer was clearly engaged by the idea of our paper and ended up writing the most thorough, thoughtful review, challenging many of our assumptions along the way. I'd love to have the chance to engage this person in conversation at the conference. For now, I'll have to settle for pointing out some of the more colorful and interesting bits of the review.

In at least one regard, this reviewer holds the traditional view about university education. When it comes to the "significant body of knowledge that is more or less standard and that everyone in the field should acquire at some point in time", "the current lecture plus problem sets approach is a substantially more efficient and thorough way to do this."

Agreed. But isn't it more efficient to give the students a book to read? A full prof or even a TA standing in a big room is an expensive way to demonstrate standard bodies of knowledge. Lecture made more sense when books and other written material were scarce and expensive. Most evidence on learning is that lecture is actually much less effective than we professors (and the students who do well in lecture courses) tend to think.

The reviewer does offer one alternative to lecture: "setting up a competition based on mastery of these skills". Actually, this approach is consistent with the spirit of our paper's studio-based, apprenticeship-based, and project-based approach. Small teams working to improve their skills in order to win a competition could well inhabit the studio. Our paper tended to overemphasize the softer collaboration of an idyllic large-scale team.

This comment fascinated me:

Another issue is that this approach, in comparison with standard approaches, emphasizes work over thinking. In comparison with doing, for example, graph theory or computational complexity proofs, software development has a much lower ratio of thought to work. An undergraduate education should maximize this ratio.

Because I write a blog called Knowing and Doing, you might imagine that I think highly of the interplay between working and thinking. The reviewer has a point: an education centered on projects in a studio must be certain to engage students with the deep theoretical material of the discipline, because it is that material which provides the foundation for everything we do and which enables us to do and create new things. I am skeptical of the notion that an undergrad education should maximize the ratio of thinking to doing, because thinking unfettered by doing tends to drift off into an ether of unreality. However, I do agree that we must try to achieve an appropriate balance between thinking and doing, and that a project-based approach will tend to list toward doing.

One comment by the reviewer reveals that he or she is a researcher, not a practitioner:

In my undergraduate education I tried to avoid any course that involved significant software development (once I had obtained a basic mastery of programming). I believe this is generally appropriate for undergraduates.

Imagine the product of an English department saying, "In my undergraduate education I tried to avoid any course that involved significant composition (once I had obtained a basic mastery of grammar and syntax). I believe this is generally appropriate for undergraduates." I doubt this person would make much of a writer. He or she might be well prepared, though, to teach lit-crit theory at a university.

Most of my students go into industry, and I encourage them to take as many courses as they can in which they will build serious pieces of software with intellectual content. The mixture of thinking and doing stretches them and keeps them honest.

An education system that produces both practitioners and theoreticians must walk a strange line. One of the goals of our paper was to argue that a studio approach could do a better job of producing both researchers and practitioners than our current system, which often seems to do only a middling job by trying to cater to both audiences.

I agree wholeheartedly, though, with this observation:

A great strength of the American system is that it keeps people's options open until very late, maximizing the ability of society to recognize and obtain the benefits of placing able people in positions where they can be maximally productive. In my view this is worth the lack of focus.

My colleagues and I need to sharpen our focus so that we can communicate more effectively the notion that a system based on apprenticeship and projects in a studio can, in fact, help learners develop as researchers and as practitioners better than a traditional classroom approach.

The reviewer's closing comment expresses rather starkly the challenge we face in advocating a new approach to undergraduate education:

In summary, the paper advocates a return to an archaic system that was abandoned in the sciences for good reason, namely the inefficiency and ineffectiveness of the advocated system in transmitting the required basic foundational information to people entering the field. The write-up itself reflects naive assumptions about the group and individual dynamics that are required to make the approach succeed. I would support some of the proposed activities as part of an undergraduate education, but not as the primary approach.

The fact that so many university educators and graduates believe our current system exists in its current form because it is more efficient and effective than the alternatives -- and that it was designed intentionally for these reasons -- is a substantial cultural obstacle to any reform. Such is the challenge. We owe this reviewer our gratitude for laying out the issues so well.

In closing, I can't resist quoting one last passage from this review, for my friends in the other sciences:

The problem with putting students with no mastery of the basics into an apprenticeship position is that, at least in computer science, they are largely useless. (This is less true in sciences such as biology and chemistry, which involve shallower ideas and more menial activities. But even in these sciences, it is more efficient to teach students the basics outside of an apprenticeship situation.)

The serious truth behind this comment is the one that explains why building an effective computer science research program around undergraduates can be so difficult. The jocular truth behind it is that, well, CS is just plain deeper and harder! (I'll duck now.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

May 07, 2008 3:20 PM

Patterns as Descriptive Grammar

I've tried to explain the idea of software patterns in a lot of different ways, to a lot of different kinds of people. Reading James Tauber's Grammar Rules reminds me of one of my favorites: a pattern language is a descriptive grammar. Patterns describe how (good) programmers "really speak" when they are working in the trenches.

Talking about patterns as grammar creates the potential for the sort of misunderstanding that Tauber discusses in his entry. Many people, including many linguists, think of grammar rules as, well, rules. I was taught to "follow the rules" in school and came to think of the rules as beyond human control. Linguists know that the rules of grammar are man-made, yet some still seem to view them as prescriptive:

It is as if these people are viewing rules of grammar like they would road rules--human inventions that one may disagree with, but which are still, in some sense, what is "correct"...

Software patterns are rarely prescriptive in this sense. They describe a construct that programmers use in a particular context to balance the forces at play in the problem. Over time, they have been found useful and so recur in similar contexts. But if a programmer decides not to use a pattern in a situation where it seems to apply, the programmer isn't "wrong" in any absolute sense; he'll just have to resolve the competing forces in some other way.

While the programmer isn't wrong, other programmers might look at him (or, more accurately, his program) funny. They will probably ask "why did you do it that way?", hoping to learn something new, or at least to confirm that the programmer has done something odd.

This is similar to how human grammar works. If I say, "Me wrote this blog", you would be justified in looking at me funny. You'd probably think that I was speaking incorrectly.

Tauber points out that, while I might be violating the accepted rules of grammar, I'm not wrong in any absolute sense:

... most linguists focus on modeling the tacit intuitions native speakers have about their language, which are very often at odds with the "rules of grammar" learnt at school.

He gives a couple of examples of rules that we hear broken all of the time. For example, native speakers of English almost always say "It's me", not "It's I", though that violates the rules of nominative and accusative case. Are we all wrong? In Sr. Jeanne's 7th-grade English class, perhaps. But English grammar didn't fall from the heavens as incontrovertible rules; it was created by humans as a description of accepted forms of speech.

When a programmer chooses not to use a pattern, other programmers are justified in taking a second look at the program and asking "why?", but they can't really say he's guilty of anything more than doing things differently.

Like grammar rules, some patterns are more "right" than others, in the sense that it's less acceptable to break some than others. I can get away with "It's me", even in more formal settings, but I cannot get away with "Me wrote this blog", even in the most informal settings. An OO programmer might be able to get away with not using the Chain of Responsibility pattern in a context where it applies, but not using Strategy or Composite in appropriate contexts just makes him look uninformed, or uneducated.

A few more thoughts:

So, patterns are not like the grammar of a programming language, which is prescriptive: to speak Java at all, you have to follow the rules. They are like the grammar of a human language, which models observations about how people speak in the wild.

As a tool for teaching and learning, patterns are so useful precisely because they give us a way to learn accepted usages that go beyond the surface syntactic rules of a language. Even better, the pattern form emphasizes documenting when a construct works and why. Patterns are better than English grammar in this regard, at least better than the way English grammar is typically taught to us as schoolchildren.

There are certainly programmers, software engineers, and programming language theorists who want to tell us how to program, to define prescriptive rules. There can be value in this approach. We can often learn something from a model that has been designed based on theory and experience. But to me prescriptive models for programming are most useful when we don't feel like we have to follow them to the letter! I want to be able to learn something new and then figure out how I can use it to become a better programmer, not a programmer of the model's kind.

But there is also a huge, untapped resource in writing the descriptive grammar of how software is built in practice. It is awfully useful to know what real people do -- smart, creative people; programmers solving real problems under real constraints. We don't understand programming or software development well enough yet not to seek out the lessons learned by folks working in the trenches.

This brings to mind a colorful image, of software linguists venturing into the thick rain forest of a programming ecosystem, uncovering heretofore unexplored grammars and cultures. This may not seem as exotic as studying the Pirahã, but we never know when some remote programming tribe might upend our understanding of programming...


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

May 06, 2008 4:40 PM

Optimizing Education

Brian Marick lamented recently that his daughter's homework probably wasn't affecting her future in the same way that some of his school experiences affected his. I've had that feeling, too, but sometimes wonder whether (1) my memory is good enough to draw such conclusions and (2) my daughters will remember key experiences from their school days anyway. After teaching for all these years I am sometimes surprised by what former students remember from their time in my courses, and how those memories affect them.

Brian's mention of New Math elicited some interesting comments. Kevin Lawrence hit on a point that has been on my mind in two contexts lately:

A big decision point in education is whether you are optimizing for people who will go on to be very good at a subject or for people who find it difficult.

In the context of university CS curricula, I often field complaints from colleagues here and everywhere about how the use of (graphics | games | anything post-1980 | non-scientific applications) in CS courses is dumbing down our curriculum. These folks claim that we are spending too much time catering to students who won't succeed in the discipline, or at least excel, and that at the same time we drive away the folks who would be good at CS but dislike the "softness" of the new approach.

In the context of reaching out to pre-university students, to show folks cool and glitzy things that they might do in computer science, I sometimes hear the same sort of thing. Be careful, folks say, not to popularize the science too much. We might mislead students into thinking that CS is not serious, or that it is easy.

I fully agree that we don't want to mislead middle schoolers or CS majors about the content or rigor of our discipline, or to give the impression that we cannot do serious and important work. But physics students and math geeks are not the only folks who can or should use computing. They are most definitely not the only folks who can make vital contributions to the discipline. (We can even learn from people who quote "King Lear".)

By not reaching out to students with different views and interests, we do computer science a disservice. Once they are attracted to the discipline and excited to learn, we can teach them all about rigor and science and math. Some of those folks won't succeed in CS, but then again neither will some of the folks who come in with the more traditional "geeky" interests.

If this topic interests you, follow the trail from Brian's blog to two blog entries by Kevin Lawrence, one old and one new. Both are worth a read. (I always knew there was a really good reason to enable comments on my blog -- Alan Kay might drop by!)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 03, 2008 10:10 PM

Another Thesis Defense

I may not be a web guy, but some of my students are -- and very good ones. Back in December, I wrote about one of my students, Sergei Golitsinski, defending an MA thesis in Communications, which used computing to elucidate a problem in that discipline. For that study, he wrote tools that allowed him to trace the threads of influence in a prominent blog-driven controversy.

Sergei finally defended his MS thesis in computer science yesterday. Its title -- "Specification and Automatic Code Generation of the Data Layer for Data-Intensive Web-Based Applications" -- sounds like the usual thesis title, but as is often the case the idea behind it is really quite accessible. This thesis shows how you can use knowledge about your web application to generate much of the code you need for your site.

I like this work for several reasons. First, it was all about finding patterns in real applications and using them to inform software development. Second, it focused on how to use domain knowledge to get leverage from the patterns. Third, it used standard language-processing ideas to create a modeling language and then use models written in it to generate code. This thesis demonstrates how several areas of computer science -- database, information storage and retrieval, and programming languages among them -- can work together to help us write programs to do work for us. I also like it because Sergei applied his ideas to his own professional work and took a critical look at what the outcome means for his own practice.
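
Sergei's system is far richer than anything I can show here, but a toy sketch conveys the flavor of generating a data layer from a declarative model. The model format and every name below are invented for illustration; none of it comes from his thesis.

```ruby
# A toy sketch of model-driven code generation for a data layer: a small
# declarative description of the data, from which we emit the boilerplate
# SQL it implies. All names and the model format are invented.
MODEL = {
  'users'    => { 'id' => 'INTEGER PRIMARY KEY', 'name' => 'TEXT', 'email' => 'TEXT' },
  'articles' => { 'id' => 'INTEGER PRIMARY KEY', 'user_id' => 'INTEGER', 'title' => 'TEXT' }
}

def create_table_sql(table, columns)
  cols = columns.map { |name, type| "  #{name} #{type}" }.join(",\n")
  "CREATE TABLE #{table} (\n#{cols}\n);"
end

def insert_sql(table, columns)
  names = columns.keys.join(', ')
  slots = (['?'] * columns.size).join(', ')
  "INSERT INTO #{table} (#{names}) VALUES (#{slots});"
end

MODEL.each do |table, columns|
  puts create_table_sql(table, columns)
  puts insert_sql(table, columns)
  puts
end
```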

Listening to the defense, I had two favorite phrases. The first was recursive weakness. He used this term in reference to weak entities in a database that are themselves parents to weak entities. But it brought to mind so many images for the functional programmer in me. (I'm almost certainly recursively weak myself, but where is the base case?) The second arose while discussing alternative approaches to a particular problem. Referring to one, he said trivial approach; non-trivial implementation. It occurred to me that so many ideas fall into this category, and part of understanding your domain well is recognizing them. Sometimes we need to avoid their black holes; other times, we need their challenges. Another big part of becoming a master is knowing which path to choose once you have recognized them.

Sergei is a master, and soon he will have a CS degree that says so. But like all masters, he has much to learn. When I wrote about his previous defense, his plan was up in the air but pointing toward applying CS in the world of communications. Since then, he has accepted admission to a Ph.D. program in communications at the University of Maryland, where he hopes to be in the vanguard of a new discipline he calls computational communications. I look forward to watching his progress.

You can read his CS thesis on-line, and all of the code used in his study will soon be available as well.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

May 01, 2008 7:11 PM

Some Lessons from the Ruby Iteration

I am not a web guy. If I intend to teach languages (PHP) or frameworks (Rails) with the web as their natural home, I need to do a lot more practice myself. It's too easy to know how to do something and still not know it well enough to teach it well. Unexpected results under time pressure create too much trouble.

MySQL, no. PostgreSQL, yes. This, twenty years after cutting my database teeth on Ingres.

Ruby's dynamic features are so, so nice. Still, I occasionally find myself wishing for Smalltalk. First love never dies.

Fifteen hours of instruction -- 5 weeks at 3 hours per week -- is plenty of time to teach most or all of the ideas in a language like bash, PHP, or Ruby. But the instructor still needs to select specific examples and parts of the class library carefully. It's too easy to start down a path of "Now look at this [class, method, primitive]..."

When I succeeded in selecting carefully, I suffered from persistent omitter's remorse: "But I wish I could have covered that..." Sometimes that is what students wanted to see. But most of the time they can figure that out. What they want is some insight. What insight could I have shared had I covered that instead of this?

If you want to know what students want, ask them. Easy to say, hard to do unless I slow down occasionally to reflect.

Practice, practice, practice. That's where students learn. It's also where students who don't learn don't learn.

Oh, and professor: That "Practice, practice, practice" thing -- it applies to you, too. You'll remember just how much fun programming can be.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

April 25, 2008 7:56 AM

Programming in Several Guises

I remember learning in courses on simulation, operating systems, and networking that, for a given period, the number of events such as cars arriving at an intersection or processes arriving at a scheduler is often best modeled using the Poisson distribution. Mostly, I recall being surprised that these events often occur in clumps, rather than being uniformly distributed over a larger time period. Sometimes, it feels like ideas work this way... When I encounter an idea once during the day, I often seem to bump into it again and again. I'm sure that it's just that my mind is sensitized to the idea and recognizes -- or projects -- it more easily, much as magic books affect us. In any case, yesterday was such a day.
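
If you want to see the clumping for yourself, here is a minimal simulation sketch -- my own example, not from any of those courses. Exponential inter-arrival times produce Poisson-distributed counts per minute, and the per-minute counts come out lumpy rather than a steady two per minute:

    import random

    random.seed(7)
    rate = 2.0       # average arrivals per minute
    minutes = 10

    # exponential inter-arrival times give Poisson-distributed counts per interval
    t, arrivals = 0.0, []
    while True:
        t += random.expovariate(rate)
        if t >= minutes:
            break
        arrivals.append(t)

    per_minute = [sum(1 for a in arrivals if m <= a < m + 1) for m in range(minutes)]
    print(per_minute)   # clumps and gaps, not a uniform two arrivals every minute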

At 3:30 PM I attended a department seminar on bioinformatics by a colleague. I asked him what sort of questions he and his students could ask about bacteriophages in a data-rich environment that they could not ask before. He said that they could now quantify the notions of similarity and difference between phages in ways inaccessible to them before and write programs to apply their metrics. Eventually, he talked about how digital processing of large data sets enforced a more disciplined approach to problems, in order to battle complexity. Now, they convert big questions into a sequence of smaller, well-defined steps that can be tackled in a clear way. For him as a biologist, this was a surprising and wonderful phenomenon.

I stayed in the same room for a 5:00 PM class taught by one of our adjuncts, whose teaching I was to evaluate. He was teaching a "skills and concepts" course for non-majors, and the day's topic was databases. They talked about the similarities and differences between spreadsheets and databases, especially how the structural integrity of a database makes it possible to formulate concise queries that can find useful answers. He demonstrated some of the ideas using an Access database, first using a wizard to query the system and then looking at a raw SQL query. For many queries, he told them, the wizard does all you need. But there will be times when you want to ask a question the wizard doesn't support, and then the ability to write your own select statements in SQL becomes a valuable skill.
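
To give a flavor of the kind of question that outgrows a wizard, here is a tiny sketch -- the table and data are made up, and I use SQLite from Python only to keep the example self-contained:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)",
                     [("ann", 40.0), ("ann", 15.0), ("bob", 99.0), ("cal", 5.0)])

    # a question a point-and-click wizard may not offer directly:
    # which customers have spent more than 50 dollars in total?
    rows = conn.execute("""
        SELECT customer, SUM(amount) AS total
        FROM orders
        GROUP BY customer
        HAVING total > 50
        ORDER BY total DESC
    """).fetchall()
    print(rows)   # [('bob', 99.0), ('ann', 55.0)]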

After class, I caught upon some paperwork in my office until 7:00 PM, when I attended a panel presentation entitled "Visual Art, the Big Screen, and Orchestral Performance". (Here is a poster for the talk, in PDF.) Three local artists -- illustrator Gary Kelley, conductor Jason Weinberger, and videographer Scott Smith -- shared parts of their recent multimedia presentation of Gustav Holst's The Planets and discussed the creative forces that drove them individually and collectively to produce the work. I learned that multimedia presentations of The Planets are relatively common but that this show differed in significant ways from the usual, not the least of which was Kelley's creation of thirty new paintings and monotypes for the show.

(You may recall Smith's name from an earlier post... He had a small acting role in the play I did last winter!)

The panel ended with a discussion of how changes in technology were fundamentally changing how artists' works are created and distributed. Not long ago, Hollywood and other media centers produced the entertainment that we all consumed, but now it is possible for folks in the middle of nowhere -- Iowa! -- to create and export their work to a global audience. This is, of course, nothing new in the age of the Internet and YouTube, but it is still cause for marvel to artists who recently lived and worked in a different world.

One of the central themes of the panel was the level of trust and surrender that this kind of presentation required, especially of the symphony members and conductor Weinberger. The timing of the video required the orchestra to hit certain marks in the music on a dot, and Weinberger, who usually controls tempo and shapes the sound of the performance, had to give up control to the artwork produced by Kelley and Smith. The visual artists expressed a willingness to turn the tables and find a way to cede control to Weinberger in a future collaboration.

This set me to thinking... The reason that the musicians had to surrender control was essentially technological. Once a video is produced, it is set. Performance of the music was the more malleable medium, as the players could speed up or slow down in real-time to stay in sync. Ideally, of course, they would play at a steady, predefined pace, but that is quite difficult. But these days, "video" is much more malleable because it is digital. Why not let the musicians play however they and the conductor see fit, and adjust the pace of the video playback to keep in sync with the music? I don't know if such a digital tool exists already, but if not, what fun it would be to write! Then in performance, the videographer could "play" the video by reacting in real-time to the music.
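
Here is a minimal sketch of that control idea, not a real video pipeline: the score-following and playback hooks are hypothetical, and the point is only that a simple feedback rule on playback rate could keep the two media in sync.

    def adjusted_rate(music_pos, video_pos, window=2.0):
        # return a playback rate that closes the gap over the next `window` seconds
        gap = music_pos - video_pos          # positive: the video is behind the music
        rate = 1.0 + gap / window
        return max(0.5, min(2.0, rate))      # keep the correction gentle

    # toy simulation: the players rush, then relax, and the video follows
    music_pos, video_pos, dt = 0.0, 0.0, 0.1
    for step in range(100):
        music_pos += dt * (1.05 if step < 50 else 0.95)
        video_pos += dt * adjusted_rate(music_pos, video_pos)
        # a real tool would set the player's rate here instead of simulating
    print(round(music_pos - video_pos, 2))   # the gap stays small as the tempo drifts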

All three of these stories had me thinking the same thing: "Now there's programming." I know well the feeling my biologist colleague expressed, because both of his answers come down to programming as discipline and medium. When our adjunct instructor told his non-CS students from all over campus about the power of knowing a little SQL, I smiled at the thought of non-programmers writing programs, albeit small ones, to scratch their own itches. Likewise, the ability to imagine how the orchestra might turn the tables on the visual artists in their multimedia collaborations, and then implement the vision in a working tool, is nothing more or less than programming.


Posted by Eugene Wallingford | Permalink | Categories: Computing

April 11, 2008 4:44 PM

SIGCSE Day 3 -- CS Past, Present, and Future

[A transcript of the SIGCSE 2008 conference: Table of Contents]

Of the 20 greatest engineering achievements of the 20th century, two lie within computing: computers (#8) and the Internet (#13). Those are broad categories defined around general-purpose tools that have affected the lives of almost every person and the practice of almost every job.

In 2004, human beings harvested 10 quintillion grains of rice. In 2004, human beings fabricated 10 quintillion transistors. 10 quintillion is a big number: 10,000,000,000,000,000,000.

Ed Lazowska

Ed Lazowska of the University of Washington opened his Saturday luncheon talk at SIGCSE with these facts, as a way to illustrate the broad effect that our discipline has had on the world and the magnitude of the discipline today. He followed by putting the computational power available today into historical context. The amount of computational power that was available in the mainstream in the 1950s is roughly equivalent to an electronic greeting card today. Jump forward to the Apollo mission to the moon, and that computational power is now available in a Furby. Lazowska didn't give sources for these claims or data to substantiate them, but they sound reasonable to me, within an order of magnitude.

The title of Lazowska's talk was "Computer Science: Past, Present, and Future", and it was intended to send conference attendees home energized about our discipline. He energized folks with cool facts about computer science's growth and effect. Then he looked to the future, at some of the challenges and some of the steps being taken to address them.

One of the active steps being taken within computing is the Computing Community Consortium, a joint venture of the National Science Foundation and the Computing Research Association, whose mission is to "support the computing research community in creating compelling research visions and the mechanisms to realize these visions". According to Lazowska, the CCC hopes to inspire "audacious and inspiring research" while at the same time articulating visions of the discipline to the rest of the world. Lazowska is one of the leaders of the group. The group's twin goals are both worth the attention of our discipline's biggest thinkers.

As I listened to Lazowska describe the CCC's initiatives, I was reminded of our discipline's revolutionary effect on other disciplines and industries. Lazowska reported that two or two and a half of the 20th century's greatest engineering results were computing, but take a look at the rest of the list. Over the last half century, computers and the Internet have played an increasingly important role in many of these greatest achievements, from embedded computers in automobiles, airplanes, and spacecraft to the software that has opened new horizons in radio and television, telephones, health technologies, and most of the top 20.

Now take a look at the Grand Challenges for Engineering in the 21st Century, which Lazowska pointed us to. Many of these challenges depend crucially upon our discipline. Here are seven:

  • secure cyberspace
  • enhance virtual reality
  • advance personalized learning
  • engineer the tools of scientific discovery
  • advance health informatics
  • reverse-engineer the brain
  • engineer better medicines

But imagine doing any of the other seven without involving computing in an intimate way!

I've written a few times about how science has come to be a computational endeavor. Lazowska gave an example of what he sees as part of the next generation of science: databases. A database makes it possible to answer questions that you think of next year, not just the ones you thought of five years ago, when you wrote your proposal to NSF and when you later defined the format of your flat text file. He illustrated his idea with examples of projects at the Ocean Observatories Initiative and the Quality of Life Technology Center. He also mentioned the idea of prosthetics as the "future of interfaces", which is a natural research and entrepreneurial opportunity for CS students. You may recall having read about this entrepreneurial connection in this blog way back!

For his part, Lazowska suggested advancing personalized learning as an area in which computing could have an immeasurable effect. Adaptive one-on-one tutoring is something that could reach an enormous unserved population and help develop the human capital that could revolutionize the world. This is actually the area into which I was evolving back when I was doing AI research, intelligent tutoring systems. I remain immensely interested in the area and what it could mean for the world. Many folks are uncomfortable with the idea of "computers teaching our children", but I think it's simply a part of the evolution of communication that computer science embodies. The book is a means of educating, communicating, and sharing information, but it is a one-track medium. The computer is a multiple-track medium, a way to deliver interactive and dynamic content to a wide audience. A "dynabook"... I wonder if anyone has been promoting this idea for say, oh, thirty years?

Fear of computers playing a human-like role in human interaction is nothing new. It reminds me of another story Lazowska told, from Time Magazine's article on the computer as the 1982 Machine of the Year. The article mentions CADUCEUS, one of the medical expert systems that was at the forefront of AI's focus on intelligent systems in the '70s and '80s. Here's the best passage:

... while it is possible that a family doctor would recognize 4,000 different symptoms, CADUCEUS is more likely to see patterns in what patients report and can then suggest a diagnosis. The process may sound dehumanized, but in one hospital where the computer specializes in peptic ulcers, a survey of patients showed that they found the machine "more friendly, polite, relaxing and comprehensible" than the average physician.

There are days when I am certain that we can create an adaptive tutoring system that is more relaxing and comprehensible than I am as a teacher, and probably friendlier and politer to boot.

Lazowska closed with an exhortation that computer scientists adopt the stance of the myth buster in trying to educate the general population, whether myths about programming (e.g., "Programming is a solitary activity"), employment ("Computing jobs will all go overseas."), or intrinsic joy ("There are no challenges left."). He certainly gave his audience plenty of raw material for busting one of the myths about the discipline not being interesting: "Computer science lacks opportunities to change the world." Not only do we change the world directly in the form of things like the Internet; these days, when almost anyone changes the world, they do so by using computing!

Lazowska's talk was perhaps too long, trying to pack more information into an hour than we could comfortably digest. But it was a good way to close out SIGCSE, given that one of its explicit themes seemed to be engaging the world and that the buzz everywhere I went at the conference was about how we need to reach out more and communicate more effectively.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

April 08, 2008 9:11 PM

The Worst Kind of Job

Comedian Rodney Laney

Yesterday I was listening to comedian Rodney Laney do a bit called "Old Jobs". He explained that the best kind of job to have is a one-word job that everyone understands. Manager, accountant, lawyer -- that's good. But if you have to explain what you do, then you don't have a good job. "You know the inside of the pin has these springs? I put the springs on the inside of the pins." And then you have to explain why you matter... "Without me, the pins wouldn't go click." Bad job.

Okay, computer scientists and software developers, raise your hands if you've had to explain what you do to a new acquaintance at a party. To a curious relative? I should say "tried to explain", because my own attempts come up short far too often.

I think good Rodney has nailed a major flaw in being a computer scientist.

Sadly, going with the one-word job title of "programmer" doesn't help, and the people who think they know what a programmer is often don't really know.

Even still, I like what I do and know why it's a great job.

(Thanks to the wonders of the web, you can watch another version of Laney's routine, Good Jobs, on-line at Comedy Central. I offer no assurance that you'll like it, but I did.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

April 03, 2008 4:39 PM

Astrachan's Law for a New Generation?

Owen is reputed to have said something like "Don't give as a programming assignment something the student could just as easily do by hand." (I am still doing penance, even though Lent ended two weeks ago.) This has been dubbed Astrachan's Law, perhaps by Nick Parlante. In the linked paper, Parlante says that showmanship is the key to the Law, that

A trivial bit of code is fine for the introductory in-lecture example, but such simplicity can take the fun out of an assignment. As jaded old programmers, it's too easy to forget the magical quality that software can have, especially when it's churning out an unimaginable result. Astrachan's Law reminds us to do a little showing off with our computation. A program with impressive output is more fun to work on.

I think of Astrachan's Law in a particular way. First, I think that it reaches beyond showmanship: Not only do students have less fun working on trivial programs, they don't think that trivial programs are worth doing at all -- which means they may not practice enough or at all. Second, I most often think of Astrachan's Law as talking about data. When we ask students to convert Fahrenheit to Celsius, or to sum ten numbers entered at the keyboard, we waste the value of a program on something that can be done faster with a calculator or -- gasp! -- a pencil and paper. Even if students want to know the answer to our trivial assignment, they won't see a need to master Java syntax to find it. You don't have to go all the way to data-intensive computing, but we really should use data sets that matter.
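
As a sketch of the contrast, here is the sum-some-numbers loop pointed at data that matter; the file and column names are hypothetical stand-ins, and any real weather, sports, or census data set would serve:

    import csv

    # the same loop as "sum ten numbers typed at the keyboard",
    # aimed at a year's worth of real measurements instead
    highs = []
    with open("daily_weather_2007.csv", newline="") as f:
        for row in csv.DictReader(f):
            highs.append(float(row["high_temp_f"]))

    print("days recorded:", len(highs))
    print("average high:", round(sum(highs) / len(highs), 1))
    print("hottest day:", max(highs))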

Yesterday, I encountered what might be a variant or extension of Astrachan's Law.

John Zelle of Wartburg College gave a seminar for our department on how to do virtual reality "on a shoestring" -- for $2000 or less. He demonstrated some of his equipment, some of the software he and his students have written, and some of the programs written by students in his classes. His presentation impressed me immensely. The quality of the experience produced by a couple of standard projectors, a couple of polarizing filters, and a dollar pair of paper 3D glasses was remarkable. On top of that, John and his students wrote much of the code driving the VR, including the VR-savvy presentation software.

Toward the end of his talk, John was saying something about the quality of the VR and student motivation. He commented that it was hard to motivate many students when it came to 3D animation and filmmaking these days because (I paraphrase) "they grow up accustomed to Pixar, and nothing we do can approach that quality". In response to another question, he said that a particular something they had done in class had been quite successful, at least in part because it was something students could not have done with off-the-shelf software.

These comments made me think about how, in the typical media computation programming course, students spend a lot of time writing code to imitate what programs such as Photoshop and Audacity do. To me, this seems empowering: the idea that a freshman can write code for a common Photoshop filter in a few lines of Java or Python, at passable quality, tells me how powerful being able to write programs makes us.
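
Here is a sketch of just how few lines such a filter can take -- my own toy example, with the image as a plain grid of RGB tuples rather than any particular media library:

    # a "negative" filter in the Photoshop sense: invert each color channel
    def negative(image):
        return [[(255 - r, 255 - g, 255 - b) for (r, g, b) in row] for row in image]

    tiny = [[(0, 0, 0), (128, 64, 32)],
            [(255, 255, 255), (10, 200, 90)]]
    print(negative(tiny))
    # [[(255, 255, 255), (127, 191, 223)], [(0, 0, 0), (245, 55, 165)]]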

But maybe to my students, Photoshop filters have been done, so that problem is solved and not worthy of being done again. Like so much of computing, such programs are so much a part of the background noise of their lives that learning how to make them work is as appealing to them as making a ball-point pen is to people of my age. I'd hope that some CS-leaning students do want to learn such trivialities, on the way to learning more and pushing the boundaries, but there may not be enough folks of that bent any more.

On only one day's thought, this is merely a conjecture in search of supporting evidence. I'd love to hear what you think, whether pro, con, or other.

I do have some anecdotal experience that is consistent in part with my conjecture, in the world of 2D graphics. When we first started teaching Java in a third-semester object-oriented programming course, some of the faculty were excited by what we could do graphically in that course. It was so much more interesting than some of our other courses! But many students yawned. Even back in 1997 or 1998, college students came to us having experienced graphics much cooler than what they could do in a first Java course. Over time, fewer and fewer students found the examples knock-out compelling; the graphics became just another example.

If this holds, I suppose that we might view it as a new law, but it seems to me a natural extension of Astrachan's Law, a corollary, if you will, that applies the basic idea to the realm of applications rather than data.

My working title for this conjecture is the Pixar Effect, from the Zelle comment that crystallized it in my mind. However, I am open to someone else dubbing it the Wallingford Conjecture or the Wallingford Corollary. My humility is at battle with my ego.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

March 26, 2008 7:27 AM

Data-Intensive Computing and CS Education

An article in the March 2008 issue of Computing Research News describes a relatively new partnership among NSF, Google, and IBM to help the academic computing community "explore innovative research and education ideas in data-intensive computing". They define data-intensive computing as a new paradigm in which the size of the data dominates all other performance features. Google's database of the web is one example, but so are terabytes and petabytes of scientific data collected from satellites and earth-bound sensors. On the hardware side of the equation, we need to understand better how to assemble clusters of computers to operate on the data and how to network them effectively. Just as important is the need to develop programming abstractions, languages, and tools that are powerful enough so that we mortals can grasp and solve problems at this massive scale. Google's Map-Reduce algorithm (an idea adapted from the functional programming world) is just a start in this direction.
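
The shape of that idea fits in a few lines. This is only a toy sketch of the map-reduce style over an in-memory list -- nothing like Google's distributed machinery -- but it shows the map step producing local results and the reduce step combining them:

    from collections import Counter
    from functools import reduce

    documents = ["the quick brown fox",
                 "the lazy dog",
                 "the quick dog"]

    # map: each document produces a local count of its own words
    mapped = [Counter(doc.split()) for doc in documents]

    # reduce: merge the local counts into one global tally
    def merge(total, local):
        total.update(local)
        return total

    word_counts = reduce(merge, mapped, Counter())
    print(word_counts.most_common(3))   # [('the', 3), ('quick', 2), ('dog', 2)]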

This notion of data-intensive computing came up in two of the plenary addresses at the recent SIGCSE conference. Not surprisingly, one was the talk by Google's Marissa Mayer, who encouraged CS educators to think about how we can help our students prepare to work within this paradigm. The second was the banquet address by Ed Lazowska, the chair of Washington's Computer Science department. Lazowska's focus was more on the need for research into the hardware and software issues that undergird computing on massive data sets. (My notes on Lazowska's talk are still in the works.)

This recurring theme is one of the reasons that our Hot Topic group at ChiliPLoP began its work on the assembly and preparation of large data sets for use in early programming courses. What counts as "large" for a freshman surely differs from what counts as "large" for Google, but we can certainly begin to develop a sense of scale in our students' minds as they write code and see the consequences of their algorithms and implementations. Students already experience large data in their lives, with 160 GB video iPods in their pockets. Having them compute on such large sets should be a natural step.

The Computing Research News also has an announcement of a meeting of the Big-Data Computing Study Group, which is holding a one-day Data-Intensive Computing Symposium today in Sunnyvale. I don't know how much of this symposium will report new research results and how much will share background among the players, in order to forge working relationships. I hope that someone writes up the results of the symposium for the rest of us...

Though our ChiliPLoP group ended up working on a different project this year, I expect that several of us will continue with the idea, and it may even be a theme for us at a future ChiliPLoP. The project that we worked on instead -- designing a radically different undergraduate computer science degree program -- has some currency, though, too. In this same issue of the CRN, CRA board chair Dan Reed talks about the importance of innovation in computing and computing education:

As we debate the possible effects of an economic downturn, it is even more important that we articulate -- clearly and forcefully -- the importance of computing innovation and education as economic engines.

[... T]he CRA has created a new computing education committee ... whose charge is to think broadly about the future of computing education. We cannot continue the infinite addition of layers to the computing curriculum onion that was defined in the 1970s. I believe we need to rethink some of our fundamental assumptions about computing education approaches and content.

Rethinking fundamental assumptions and starting from a fresh point of view is just what we proposed. We'll have to share our work with Reed and the CRA.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 20, 2008 4:21 PM

SIGCSE Day 2 -- Plenary Address by Marissa Mayer

[A transcript of the SIGCSE 2008 conference: Table of Contents]

Marissa Mayer

The second day of the conference opened with the keynote address by Google's VP of Search Products and User Experience, Marissa Mayer. She was one of the early hires as the company expanded beyond the founders, and from her talk it's clear that she has been involved with a lot of different products in her time there. She is also something the discipline of computer science could use more of, a young woman in highly-visible technical and leadership roles. Mayer is a great ambassador for CS as it seeks to expand the number of female high-school and college students.

This talk was called Innovation, Design, and Simplicity at Google and illustrated some of the ways that Google encourages creativity in its employees and gets leverage from their ideas and energy. I'll drop some of her themes into this report, though I imagine that the stories I relate in between may not always sync up. Such is the price of a fast-moving talk and five days of receding memory.

Creativity loves constraint.

I have written on this topic a few times, notably in the context of patterns, and it is a mantra for Google, whose home page remains among the least adorned on the web. Mayer said that she likes to protect its minimalist feel even when others would like to jazz it up. The constraints of a simple page force the company to be more creative in how it presents results. I suspect it also played a role in Google developing its cute practice of customizing the company logo in honor of holidays and other special events. Mayer said that minimalism may be a guide now, but it was not necessarily a reason for simplicity in the beginning. Co-founder Sergey Brin created the first Google home page, and he famously said, "I don't do HTML."

Mayer has a strong background both in CS and in CS education, having worked with the undergrad education folks at Stanford as a TA while an undergrad. (She said that it was Eric Roberts who first recommended Google to her, though at the time he could not remember the company's name!) One of her first acts as an employee was to run a user study on doing search from the Google homepage. She said that when users first sat down and brought up the page, they just sat there. And sat there. They were "waiting for the rest of it"! Users of the web were already accustomed to fancy pages and lots of graphics and text. She said Google added its copyright tag line at the bottom of the page to serve as punctuation, to tell the user that that's all there was.

Search is a fixed focus at Google, not a fancy user interface. Having a simple UI helps to harness the company's creativity.

Work on big problems, things users do every day.

Work on things that are easy to explain and understand.

Mayer described in some detail the path that a user's query follows from her browser to Google and back again with search results. Much of that story was as expected, though I was surprised by the fact that there are load balancers to balance the load on the load balancers that hand off queries to processors! Though I might have thought that another level of indirection would slow the process down, indeed it is necessary in order to ensure that the system doesn't slow down. Even with the web server and the ad server and the mixers, users generally see their results in about 0.2 seconds. How is that for a real-time constraint to encourage technical creativity?

Search is the #2 task performed on the web. E-mail is (still) #1. Though some talking heads have begun to say that search is a mature business in need of consolidation, Google believes that search is just getting started. We know so little about how to do it well, how to meet the user's needs, and how to uncover untapped needs. Mayer mentioned a problem familiar to this old AI guy: determining the meaning of the words used in a query so that they can serve pages that match the user's conceptual intent. She used a nice example that I'll probably borrow the next time I teach AI. When a user asks for an "overhead view", he almost always wants to see a picture!

This points in the direction of another open research area, universal search. The age in which users want to search for text pages, images, videos, etc., as separate entities has likely passed. That partitioning is a technical issue, not a user's issue. The search companies will have to find a way to mix in links to all kinds of media when they serve results. For Google, this also means figuring out how to maintain or boost ad revenue when doing so.

Ideas come from everywhere.

Mayer gave a few examples. One possible source is office hours, which are usually thought of as an academic concept but which she has found useful in the corporate world. She said that the idea for Froogle walked through her office door one day with the scientist who had it.

Another source is experiments. Mayer told a longer story about Gmail. The company was testing it in-house and began to discuss how they could make money from it. She suggested the industry-standard model of giving a small amount of storage for free and then charging for more. This might well have worked, because Google's cost structure would allow it to offer much larger amounts at both pricing levels. But a guy named Paul -- he may be famous, but I don't know his last name -- suggested advertising. Mayer pulled back much as she expected users to do; do people really want Google to read their e-mail and serve ads? Won't that creep them out?

She left the office late that night believing that the discussion was on hold. She came back to work the next morning to find that Paul had implemented an experimental version in a few hours. She was skeptical, but the idea won her over when the system began to serve ads that were perfectly on-spot. Some folks still prefer to read e-mail without ads, but the history of Gmail has shown just how successful the ad model can be.

The insight here goes beyond e-mail. The search ad database can be used on any page on the web. This is huge... Search pages account for about 5% of the pages served on the web. Now Google knew that they could reach the other 95%. How's that for a business model?

To me, the intellectual lesson is this:

If you have an idea, try it out.

This is a power that computer programmers have. It is one of the reasons that I want everyone to be able to program, if only a little bit. If you have an idea, you ought to be able to try it out.

Not every idea will lead to a Google, but you never know which ones will.

Google Books started as a simple idea, too. A person, a scanner, and a book. Oh, and a metronome -- Mayer said that when she was scanning pages she would get out of rhythm with the scanner and end up photocopying her thumbs. Adding a metronome to the system smoothed the process out.

... "You're smart. We're hiring." worked remarkably well attracting job candidates. We programmers have big egos! Google is one of the companies that has made it okay again to talk about hiring smart people, not just an army of competent folks armed with a software development process, and giving them the resources they need to do big things.

Innovation, not instant perfection.

Google is also famous for not buying into the hype of agile software development. But that doesn't mean that Google doesn't encourage a lot of agile practices. For example, at the product level, it has long practiced a "start simple, then grow" philosophy.

Mayer contrasted two kinds of programmers, castle builders and nightly builders. Companies are like that, too. Apple -- at least to outside appearances -- is a castle-building company. Every once in a while, Steve Jobs et al. go off for a few years, and then come back with an iPod or an iPhone. This is great if you can do it, but only a few companies can make it work. Google is more of a nightly builder. Mayer offered Google News as a prime example -- it went through 64 iterations before it reached its current state. Building nightly and learning from each iteration is often a safer approach, and even companies that are "only good" can make it work. Sometimes, great companies are the result.

Data is a-political.

Mayer didn't mean Republican versus Democrat here, rather that well-collected data provide a more objective basis for making decisions than the preferences of a manager or the guesses of a designer or programmer. Designing an experiment that will distinguish the characteristics you are interested in, running it, and analyzing the data dispassionately are a reliable way to make good decisions. Especially when a leader's intuition is wrong, such as Mayer's was on Gmail advertising.

She gave a small plug for using Google Trends as a way to observe patterns in search behavior when they might give an idea about a question of interest. Query volume may not change much, but the content of the queries does.

Users, users, users.

What if some users want more than the minimalist white front page offered by Google? In response to requests from a relatively small minority of users -- and the insistent voices of a few Google designers -- iGoogle is an experiment in offering a more feature-filled portal experience. How well will it play? As is often the case, the data will tell the story.

Give license to dream.

Mayer spent some time talking about the fruits of Google's well-known policy of 20% Time, whereby every employee is expected to spend 1/5 of his or her time working on projects of personal interest. While Google is most famous for this policy these days, like most other ideas it isn't new. At ChiliPLoP this week, Richard Gabriel reported that Eric Schmidt took this idea to Google with him when he left Sun, and Pam Rostal reported that 3M had a similar policy many years ago.

But Google has rightly earned its reputation for the fruits of 20% Time. Google News. Google Scholar. Google Alerts. Orkut. Froogle Wireless. Much of Google Labs. Mayer said that 50% of Google's new products come from these projects, which sounds like a big gain in productivity, not the loss of productivity that skeptics expect.

I have to think that the success Google has had with this policy is tied pretty strongly with the quality of its employees, though. This is not meant to diss the rest of us regular guys, but you have to have good ideas and the talent to carry them out in order for this to work well. That said, these projects all resulted from the passions of individual developers, and we all have passions. We just need the confidence to believe in our passions, and a willingness to do the work necessary to implement them.

Most of the rest of Mayer's talk was a litany of these projects, which one wag in the audience called a long ad for the goodness of Google. I wasn't so cynical, but I did eventually tire of the list. One fact that stuck with me was the description of just how physical the bits of Google Earth are. She described how each image of the Earth's surface needs to be photographed at three or four different elevations, which requires three or four planes passing over every region. Then there are the cars driving around taking surface-level shots, and cameras mounted to take fixed-location shots. A lot of physical equipment is at work -- and a lot of money.

Share the data.

This was the last thematic slogan Mayer showed, though based on the rest of the talk I might rephrase it as the less pithy "Share everything you can, especially the data." Much of Google's success seems based in a pervasive corporate culture of sharing. This extends beyond data to ideas. It also extends beyond Google campus walls to include users.

The data-sharing talk led Mayer to an Easter Egg she could leave us. If you check Google's Language Tools page, you will see Bork, Bork, Bork, a language spoken (only?) by the Swedish chef on the Muppets. Nonetheless, the Bork, Bork, Bork page gets a million hits a year (or was it a day?). Google programmers aren't the only ones having fun, I guess.

Mayer closed with suggestions for computer science educators. How might we prepare students better to work in the current world of computing? Most of her recommendations are things we have heard before: build and use applications, work on large projects and in teams, work with legacy code, understand and test at a large scale, and finally pay more attention to reliability and robustness. Two of her suggestions, though, are ones we don't hear as often and link back to key points in her talk: work with and understand messy data, and understand how to use statistics to analyze the data you collect.

After the talk, folks in the audience asked a few questions. One asked how Google conducts user studies. Mayer described how they can analyze data of live users by modding the key in user cookies to select 1/10 or 1/1000 of the user population, give those users a different experience, and then look at characteristics such as click rate and time spent on page by these users compared to the control group.
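
The bucketing trick is easy to sketch. This is my own illustration of the general technique, not Google's code: hash a stable cookie id into, say, 1000 buckets and give the variant experience only to users who land in a chosen slice.

    import hashlib

    def bucket(cookie_id, buckets=1000):
        # deterministically map a stable cookie id to one of `buckets` buckets
        digest = hashlib.md5(cookie_id.encode("utf-8")).hexdigest()
        return int(digest, 16) % buckets

    def in_experiment(cookie_id, slice_size=1):
        # put roughly slice_size / 1000 of the users into the experimental experience
        return bucket(cookie_id) < slice_size

    # users who land in the chosen slice see the variant; everyone else is the control group
    print(in_experiment("cookie-abc123"))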

The best question was in fact a suggestion for Google. Throughout her talk, Mayer referred repeatedly to "Google engineers", the folks who come up with neat ideas, implement them in code, test them, and then play a big role in selling them to the public. The questioner pointed out that most of those "engineers" are graduates of computer science programs, including herself, Sergey Brin, and Larry Page. He then offered that Google could do a lot to improve the public perception of our discipline if it referred to its employees as computer scientists.

I think this suggestion caught Mayer a little off-guard, which surprised me. But I hope that she and the rest of Google's leadership will take it to heart. In a time when it is true both that we need more computer science students and that public perception of CS as a discipline is down, we should be trumpeting the very cool stuff that computer scientists are doing at places like Google.

All in all, I enjoyed Mayer's talk quite a bit. We should try to create a similarly creativity-friendly environment for our students and ourselves. (And maybe work at a place like Google every so often!)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

March 19, 2008 12:40 AM

A Change in Direction at ChiliPLoP

As I mentioned in my last SIGCSE entry, I have moved from carefree Portland to Carefree, Arizona, for ChiliPLoP 2008. The elementary patterns group spent yesterday, its first day, working on the idea of integrating large datasets into the CS curriculum. After a few years of working on specific examples, both stand-alone and running, we started this year thinking about how CS students can work on real problems from many different domains. In the sciences, that often means larger data sets, but more important it means authentic data sets, and data sets that inspire students to go deeper. On the pedagogical side of the ledger, much of the challenge lies in finding and configuring data sets so that they can be used reliably and without unnecessary overhead placed on the adopting instructor.

This morning, we volunteered to listen to a presentation by the other hot topic group on its work from yesterday: a "green field" thought experiment designing an undergrad CS program outside of any constraints from the existing university structure. This group consists of Dave West and Pam Rostal, who presented an earlier version of this work at the OOPSLA 2005 Educators' Symposium, and Richard Gabriel, who brings to the discussion not only an academic background in CS and a career in computer science research and industry but also an MFA in poetry. Perhaps the key motivation for their hot topic is that most CS grads go on to be professional software developers or CS researchers, and that our current way of educating them doesn't do an ideal job of preparing grads for either career path.

Their proposal is much bigger than I can report here. They started by describing a three-dimensional characterization of different kinds of CS professionals, including such provocative and non-traditional labels as "creative builder", "imaginative researcher", and "ordinologist". The core of the proposal is the sort of competency-based curriculum that West and Rostal talked about at OOPSLA, but I might also describe it as studio-based, apprenticeship-based, and project-based. One of their more novel ideas is that students would learn everything they need for a liberal arts, undergraduate computer science education through their software projects -- including history, English, writing, math, and social science. For example, students might study the mathematics underlying a theorem prover while building an inference engine, study a period of history in order to build a zoomable timeline on the web for an instructional web site, or build a whole Second Life world set in ancient Rome.

In the course of our discussion, the devil's advocates in the room raised several challenging issues, most of which the presenters had anticipated. For example, how do the instructors (or mentors, as they called them) balance the busy work involved in, say, the students implementing some Second Life chunk with the content the students need to learn? Or how does the instructional environment ensure that students learn the intellectual process of, say, history, and not just impose a computer scientist's worldview on history? Anticipating these concerns does not mean that they have answers, only that they know the issues exist and will have to be addressed at some point. But this isn't the time for self-censorship... When trying to create something unlike anything we see around us, the bigger challenge is trying to let the mind imagine the new thing without prior restraint from the imperfect implementations we already know.

We all thought that this thought experiment was worth carrying forward, which is where the change of direction comes in. While our group will continue to work on the dataset idea from yesterday, we decided in the short term to throw our energies into the wild idea for reinventing CS education. The result will be two proposals to OOPSLA 2008: one an activity at the Educators' Symposium, and the other an Onward! paper. This will be my first time as part of a proposal to the Onward! track, which is both a cool feeling and an intimidating prospect. We'll see what happens.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

March 17, 2008 4:18 PM

SIGCSE Day 1 -- Innovating our Image

[A transcript of the SIGCSE 2008 conference: Table of Contents]

Owen Astrachan and Peter Denning

I ended the first day by attending a two-man "panel" of NSF's inaugural CISE Distinguished Educational Fellows, Owen Astrachan and Peter Denning. The CDEF program gives these two an opportunity to work on wild ideas having the potential "to transform undergraduate computing education on a national scale, to meet the challenges and opportunities of a world where computing is essential to U.S. leadership and economic competitiveness across all sectors of society." A second-order effect of these fellowships is to have a soapbox (Denning) or bully pulpit (Astrachan) to push their ideas out into the world and, more important, to encourage the rest of the community to think about how to transform computing education, whether on the CDEF projects or on their own.

Owen went first. He opened by saying that the CDEF call for proposals had asked for "untested ideas", and if he couldn't propose one of those, well... I've mentioned his Problem Based Learning project here before, as part of an ongoing discussion of teaching that is built around real problems, projects, and meaningful context. Owen's talk described some of the motivation for his project and the perceived benefits. I'll try to report his talk faithfully, but I suspect that some of my own thoughts will creep in.

When we talk to the world as we talk among ourselves -- about choices of programming language, IDE, operating systems, ... -- we tell the world that tools are the coin of our realm. When an engineer comes to us and asks for a course using MATLAB, we may be 100% technically correct to respond "Python is better than MATLAB. Here's why..." -- but we send the wrong message, a damaging message, which makes it the wrong response.

We can engage those folks better if we talk to them about the problems they need help solving. In many ways, that is how computing got its start, and it is time for us again to look outside our discipline for problems to solve, ideas to explore, and motivation to advance the discipline. Increasingly, outside our discipline may well be the place for us to look for learners, as fewer folks express interest in computing for computing's sake and as more non-computing folks look for ways to integrate computing into their own work. (Like scientists and even artists.)

That is one motivation for the PBL project. Another was the school Owen's children attend, where all learning is project-based. Students work on "real" problems that interest them, accumulating knowledge, skill, and experience as they work on increasingly challenging and open-ended projects. Problems have also driven certain kinds of scientific professional education at the university level for many decades.

For the first time in any of his talks I've seen, Owen took some time to distinguish problem-based learning from project-based learning. I didn't catch most of the specific differences, but Owen pointed us all to the book Facilitating Problem-Based Learning by Maggi Savin-Baden for its discussion of the differences. This is, of course, of great interest to me, as I have been thinking a lot in the last year or more about the role project-based courses can play in undergraduate CS. Now I know where to look for more to think about.

As a part of eating his own dog food, Owen is trying to ratchet up the level of dialogue in his own courses this year by developing assignments that are based on problems, not implementation techniques. One concrete manifestation of this change is shorter write-ups for assignments, which lay out only the problem to be solved and not so much of his own thinking about how students should think about the problem. He likened this to giving his students a life jacket, not a straitjacket. I struggle occasionally with a desire to tie my students' thinking up with my own, and so wish him well.

Where do we find problems for our CS majors to work on? Drawing explicitly from other disciplines is one fruitful way, and it helps CS students see how computing matters in the world. We can also draw from applications that young people see and use everyday, which has the potential to reach an even broader audience and requires less "backstory". This is something the elementary patterns folks have worked on at ChiliPLoP in recent years, for example 2005. (Ironically, I am typing this in my room at the Spirit in the Desert Retreat Center, as I prepare for ChiliPLoP 2008 to begin in the morning. Outside my window is no longer rainy Portland but a remarkably cold Arizona desert.)

Owen said we only need to be alert to the possibilities. Take a look at TinyURL -- there are several projects and days of lecture there. Google the phrase dodger ball; why do we get those results? You can talk about a lot of computer science just by trying to reach an answer.

After telling us more about his NSF-funded project, Owen closed with some uplifting words. He hopes to build a community around the ideas of problem-based learning that will benefit from the energy and efforts of us all. Optimism is essential. Revolutionizing how we teach computing, and how others see computing, is a daunting task, but we can only solve problems if we try.

Denning took the stage next. He has long been driven by an interest in the great principles of computing, both as a way to understand our discipline better and as a way to communicate our discipline to others more effectively. His CDEF project focuses on the different "voices" of computing, the different ways that the world hear people in our discipline speak. In many ways, they correspond to the different hats that computing people wear in their professional lives -- or perhaps our different personalities in a collective dissociative identity disorder.

Denning identifies seven voices of computing: the programmer, the computational thinker, the user, the mathematician, the engineer, the scientist, and the "catalog". That last one was a mystery to us all until he explained it, when it became our greatest fear... The catalog voice speaks to students and outsiders in terms of the typical university course descriptions. These descriptions partition our discipline into artificial little chunks of wholly uninteresting text.

What makes these voices authentic? Denning answered the question in terms of concepts and practices. To set up his answer, he discussed three levels in the development of a person's understanding of a technology, from mystical ("it works") to operational (concrete models) to formal (abstractions). Our courses often present formal abstractions of an idea before students have had a chance to develop solid concrete models. We often speak in terms of formal abstraction to our colleagues from other disciplines. We would be more effective if instead we worked on their problems with them and helped them create concrete results that they can see and appreciate.

One advantage of this is that the computing folks are speaking the language of the problem, rather than the abstract language of algorithms and theory. Another is that it grounds the conversation in practices, rather than concepts. Academics like concepts, because they are clean and orderly, more than practices, which are messy and often admit no clean description. Denning asserts that voices are more authentic when grounded in practices, and that computing hurts itself whenever it grounds its speech in concepts.

His project also aims to create a community of people around his ideas. He mentioned something like a "Rebooting Computing" summit that will bring together folks interested in his CDEF vision and, more broadly, in inspiring a return to the magic and beauty of computing. Let's see what happens.

I heard several common threads running through Astrachan's and Denning's presentations. One is that we need to be more careful about how we talk about our discipline. Early on, Denning said that we should talk about what computing is and how we do it, and not how we think about things. We academics may care about that, but no one else does. Later, Owen said that we should talk about computational doing, not computational thinking. These both relate to the intersection of their projects, where solving real problems in practice is the focus.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 15, 2008 1:06 PM

SIGCSE Day 1 -- Nifty Assignments

[A transcript of the SIGCSE 2008 conference: Table of Contents]

Last year's Nifty Assignments at SIGCSE merited only a retro-chic reference in my blog. I thought that was unusual until I looked back and noticed that I mentioned it not at all in 2006 or 2005. I guess that's not too surprising. The first few years of the now-annual panel were full of excitement, as many of the best-known folks in the SIGCSE community shared some of their best assignments. These folks are also among the better teachers you'll find, and even when their assignments were only okay their presentations alone were worth seeing. As the years have passed, the excitement of the assignments and presentations alike has waned, even as the panel remains a must-see for many at the conference.

Still, each year I seem to find some nugget to take home with me. This year's panel actually offered a couple of neat little assignments, and one good idea.

The good idea came late in Cay Horstmann's presentation: have students build a small application using an open-source framework, and then critique the design of the framework. In courses where we want students to discuss design, too often we only read code and talk in the abstract. The best way to "get" a design is to live with it for a while. One small app may not be enough, but it's a start.

One of the niftier assignments was a twist on an old standard, Huffman coding, that made it accessible to first-semester students. Creating a Huffman code is usually reserved for a course in data structures and algorithms because the technique involves growing a tree of character sets, and the idea of trees -- let alone implementing them -- is considered by most people to be beyond CS1 students. McGuire and Murtagh take advantage of a nifty fact: If you are interested only in determining how many bits you need to encode a sequence, not in doing the encoding itself, then all you need to do is execute the loop that replaces the two smallest values in the collection with their sum. A simple linear structure suffices, and the code comes to resemble the selection part of a selection sort.
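
Here is a quick sketch of that loop as I understand it -- my own example, not the authors' handout -- using a plain list; the running sum of the merged weights is exactly the number of bits the full Huffman encoding would use:

    def encoded_bit_count(frequencies):
        # frequencies: raw symbol counts (for an image, a histogram of pixel values)
        counts = list(frequencies)
        total_bits = 0
        while len(counts) > 1:
            # the selection-sort-style scan: pull out the two smallest values
            first = min(counts)
            counts.remove(first)
            second = min(counts)
            counts.remove(second)
            merged = first + second
            total_bits += merged     # each merge adds one bit to every symbol beneath it
            counts.append(merged)
        return total_bits

    print(encoded_bit_count([5, 9, 12, 13, 16, 45]))   # 224 bits for this classic example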

This assignment gives you a way to talk to students about data compression and how different techniques give different compression rates. The other twist in this assignment is that McGuire and Murtagh apply the coding to images, not text strings. This is something I tried in my media computation CS1 in the fall of 2006, with both sounds and images. I liked the result and will probably try something like it the next time I teach CS1.

Catching Plagiarists image

The other assignment that grabbed my attention was Baker Franke's Catching Plagiarists. This assignment isn't all that much different from many of the common text processing tasks folks use in CS1, but it is framed in a way that students find irresistible: detecting plagiarism. I used to tell my intro students, "Copy and paste is your life", which always drew a few laughs. They knew just how true it was. With universal web access and the growth of Wikipedia, I think my claim is more true now than it was then, to the point that students think nothing of liberal re-use of others' words. This assignment gets students to think about just how easy it is to detect certain kinds of copying.

So, putting the task in a context that is meaningful and way relevant ups the niftiness factor by a notch or two. The fact that it can use data sets both small and large means that the students can run head-first into the idea that some algorithms use much more time or space than others, and that their program may not have enough of one or both of these resources to finish the job on an input that they consider tiny -- a MB or two. (Heck, that's not even enough space for a decent song.)

Another thing I liked about this assignment is that it is, by most standards, underspecified. You could tell students this little: "Write a program to find the common word sequences among a set of documents. Your input will be a set of plain-text documents and a number n, and your output will be a display showing the number of n-word sequences each document has in common with every other document in the set." Franke presented his assignment as requiring little prep, with a simple problem statement of this sort, so I was a little disappointed to see that his assignment write-up is four pages with a lot more verbiage. I think it would be neat to simply tell the students the motivating story and then give them the five-line assignment. After students have had a chance to think about their approach, I'd like to talk to them about possibilities and help them think through the design.

Then again, I have to cut Franke some slack. He is a high school instructor, so his students are even younger than mine. I'm encouraged to think that high school students anywhere are working on such a cool problem.
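
For the record, here is roughly the sort of naive solution I have in mind when I imagine a student's first attempt -- my own toy Java sketch, not anything from Franke's materials. It hashes every n-word sequence of each document into a set and then intersects the sets pairwise, which works fine on small inputs and bogs down instructively on large ones:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CommonPhrases {

    // Returns the set of n-word sequences appearing in a document.
    static Set<String> phrases(String text, int n) {
        String[] words = text.toLowerCase().split("\\s+");
        Set<String> result = new HashSet<>();
        for (int i = 0; i + n <= words.length; i++) {
            result.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return result;
    }

    // Counts the n-word sequences shared by each pair of documents.
    static int[][] sharedCounts(List<String> docs, int n) {
        List<Set<String>> sets = new ArrayList<>();
        for (String doc : docs) {
            sets.add(phrases(doc, n));
        }
        int[][] counts = new int[docs.size()][docs.size()];
        for (int i = 0; i < docs.size(); i++) {
            for (int j = i + 1; j < docs.size(); j++) {
                Set<String> common = new HashSet<>(sets.get(i));
                common.retainAll(sets.get(j));
                counts[i][j] = counts[j][i] = common.size();
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> docs = List.of(
            "the quick brown fox jumps over the lazy dog",
            "a quick brown fox jumps over a sleeping dog");
        // the two documents share three 3-word sequences
        System.out.println(Arrays.deepToString(sharedCounts(docs, 3)));
    }
}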


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 14, 2008 3:36 PM

SIGCSE Day 1 -- The Mystery Problem

[A transcript of the SIGCSE 2008 conference: Table of Contents]

During the second session of the day, I bounced among several sessions, but the highlight was Stuart Reges talking about an interesting little problem -- not a problem he had, but a problem that appeared on the advanced placement exam in CS taken by high-school students. This problem stood out in an analysis of data drawn from student performance on the 1988 exam. Yes, that is twenty years ago!

Donald Knuth is famous for saying that perhaps only 2% of people have the sort of mind needed to be a computer scientist. He actually wrote, "I conclude that roughly 2% of all people 'think algorithmically', in the sense that they can reason rapidly about algorithmic processes." But what is it that those 2% have that the rest do not? In his talk, Stuart read another passage from Knuth that speculates about one possible separator:

The other missing concept that seems to separate mathematicians from computer scientists is related to the 'assignment operation' :=, which changes values of quantities. More precisely, I would say that the missing concept is the dynamic notion of the state of a process. 'How did I get here? What is true now? What should happen next if I'm going to get to the end?' Changing states of affairs, or snapshots of a computation, seem to be intimately related to algorithms and algorithmic thinking.

In studying student performance on the 1988 AP exam, Reges found that performance on a small set of five "powerhouse questions" was inordinately predictive of success on the exam as a whole, and of those five one stood out as most predictive. This question offers evidence in support of Knuth's speculation about "getting" assignment. Here it is:

If b is a Boolean variable, then the statement b := (b = false) has what effect?
  1. It causes a compile-time error message.
  2. It causes a run-time error message.
  3. It causes b to have value false regardless of its value just before the statement was executed.
  4. It always changes the value of b.
  5. It changes the value of b if and only if b had value true just before the statement was executed.

What a fun little question -- so simple, but with layers. It involves assignment, but also sequencing of operations, because the temporary result of (b = false) must be computed before assigning to b. (Do you know the answer?)
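
If you want to check your answer empirically, the statement translates into a couple of lines of Java, with == standing in for the Pascal test for equality; run it for both starting values of b and compare with your reasoning:

public class ToggleCheck {
    public static void main(String[] args) {
        for (boolean start : new boolean[] { true, false }) {
            boolean b = start;
            b = (b == false);   // the Java spelling of b := (b = false)
            System.out.println(start + " -> " + b);
        }
    }
}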

You can read about the correlations and much more about the analysis in Stuart's paper and slides, which are available on this resource page. The full analysis may be interesting only to a subset of us, perhaps as few as 2%... I really enjoyed seeing the data and reading about how Stuart thought through it. But here I'd like to think more about what this implies for how students reason, and how we teach.

This notion of state, the ability to take and manipulate "snapshots of a computation", does seem to be one of the necessary capabilities of students who succeed in computer science. With his speculation, Knuth is suggesting that how people think about computation matters. Stuart also quoted one of my heroes, Robert Floyd, who upon hearing an earlier version of this talk commented:

These questions seem to test whether a student has a model of computation; whether they can play computer in their head.

This is something folks in CS education think a lot about, but unfortunately we in the trenches teaching intro CS often don't apply what we know consistently or consciously. Whether we think about it or not, or whether we act on it or not, students almost certainly bring a naive computational model with them when they enter our courses. In the world of math and CS, we might refer to this as a naive operational semantics. How do variables work? What happens when an if statement executes? Or a loop? Or an assignment statement? I have read a few papers that investigate novice thinking about these issues, but I must confess to not having a very firm sense of what CS education researchers have studied and what they have learned.

I do have a sense that the physics education community has a more complete understanding of their students' naive (mis)understanding of the physical world and how to engage students there. (Unfortunately, doing that doesn't always help.)

Someone in the crowd suggested that we teach a specific operational semantics to students in CS1, as some authors do. That's a good idea complicated by the kinds of languages and programming models that we often teach. I think that we can do better just by remembering that our students have a naive computational model in their heads and trying to find out how they understand variables, selection, loops, and assignment statements.

Stuart gave a great example of how he does this. He now sometimes asks his students this question:

public static int mystery(int n) {
    int x = 0;
    while (n % 2 == 0) {
        // Point A
        n = n / 2;
        x++;
        // Point B
    }
    // Point C
    return x;
}

Is (n % 2 == 0) always true, never true, or sometimes true/sometimes false at points A, B and C?

Stuart reported that many of his students think (n % 2 == 0) is always true at Point B because it's inside the loop, and the while loop condition guarantees that the condition is always true inside the loop. One wonders what these students think about infinite loops.
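
A single concrete run is enough to see why that belief cannot hold. Here is a quick instrumented copy of the function above -- just a throwaway driver I wrote to print the test at each labeled point, nothing from Stuart's talk:

public class MysteryTrace {

    // A copy of the mystery function with a print statement at each point.
    static int mystery(int n) {
        int x = 0;
        while (n % 2 == 0) {
            System.out.println("A: " + (n % 2 == 0));   // Point A
            n = n / 2;
            x++;
            System.out.println("B: " + (n % 2 == 0));   // Point B
        }
        System.out.println("C: " + (n % 2 == 0));       // Point C
        return x;
    }

    public static void main(String[] args) {
        // prints A: true, B: true, A: true, B: false, C: false and returns 2
        mystery(12);
    }
}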

If we understand what students think in such basic situations, we are much better positioned to help them debug their programs -- and to write programs in a more reliable way in the first place. One way to help students learn and enjoy more is to give them tools to succeed. And recognizing when and how students' current thinking goes wrong is a prerequisite to that.

Plus, these are multiple-choice questions, which will make some students and professors even happier!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 14, 2008 10:55 AM

SIGCSE Day 1 -- Randy Pausch and Alice

[A transcript of the SIGCSE 2008 conference: Table of Contents]

Last September, the CS education community was abuzz with the sad news that Randy Pausch, creator of the Alice programming environment, had suffered a recurrence of his cancer and that his condition was terminal. Pausch approached his condition directly and with verve, delivering a last lecture that became an Internet phenomenon. Just google "lecture"; as of today, Pausch's is the second link returned.

Because Pausch, his team, and Alice have had such an effect on the CS education community, and not just the virtual reality community in which they started, the folks at SIGCSE gambled that his health would hold out long enough for him to accept the SIG's 2008 Award for Outstanding Contribution to Computer Science Education in person and deliver a plenary address. Alas, it did not. Pausch is relatively well but unable to travel cross country. In his stead, Dennis Cosgrove, lead project scientist on the Alice project, and Wanda Dann, one of the curriculum development leads on the project, gave a talk on the history of Pausch's work and on what's next with Alice 3.0. The theme of the talk was building bridges, from virtual reality research to cognitive psychology and then to the fine arts, and a parallel path to CS education.

I admire Cosgrove and Dann for taking on this task. It is impossible to top Pausch's now-famous last lecture, which nearly everyone has seen by now. (If you have not yet, you should. It's an inspiring 104 minutes.) I'll let the video speak for Pausch and only report some of the highlights from Cosgrove and Dann.

Pausch's work began like so many new assistant professors' work does: on the cheap. He wanted to start a virtual reality lab but didn't have the money to "do it right". So he launched a quest to do "VR on $5 a day". Alice itself began as a rapid prototyping language for VR systems and new interaction techniques. As his lab grew, Pausch realized that to do first-class VR, he needed to get into perceptual research, to learn how better to shape the user's experience.

This was the first bridge he built, to cognitive psychology. The unexpected big lesson that he learned was this: What you program is not what people see. I think the teacher in all of us recognizes this phenomenon.

Next came an internship at Disney Imagineering, a lifelong dream of his. There, he saw the power of getting artists and engineers to work together, not just in parallel on the same project. One of the big lessons he learned was that it's not easy to do. Someone has to work actively to keep artists and engineers working together, or they will separate into their own element. But the benefits of the collaboration are worth the work.

Upon his return to CMU, he designed a course called Building Virtual Worlds that became a campus phenomenon. Students came to view building their worlds as a performing art -- not from the perspective of the "user", but thinking about how an audience would respond. I think this shows that computer science students are more than just techies, and that, placed in the right conditions, they will respond with a much broader set of interests and behaviors.

In the last phase of his work, Pausch has been working more in CS education than in VR. In his notes on this talk, he wrote, "Our quest (which we did not even realize in the beginning) was to revolutionize the way computer programming is taught." So often, we start with one goal in mind and make discoveries elsewhere. Sometimes we get lost, and sometimes we just wander in an unexpected direction. I think many folks in CS education first viewed Alice as a way to teach non-majors, but increasingly folks realize that it may have a profound effect on how we teach -- and recruit -- majors. I was glad to be pointed in the direction of Pausch's student Caitlin Kelleher, whose PhD dissertation, "Motivating Programming: Using Storytelling to Make Computer Programming Attractive to Middle School Girls" is of great interest to me. (And not just as father to two girls!)

Cosgrove wrapped up his talk with a cartoon that seems to express Pausch's Think Big outlook on life. I won't try to show you the image (who needs another pre-cease-and-desist message from the same artist?), but will describe it: Two spiders have built a web across the bottom of the playground slide. One turns to the other and says, "If we pull this off, we will eat like kings." Pausch and his team have been weaving a web of Alice, and we may well reap big benefits.

Pausch's career teaches us one more thing. To accomplish big things, you need both a strong research result, to convince folks your idea might work, and strong personal connections, so that funders can trust you with their money and resources.

Thanks, Randy.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 13, 2008 7:22 PM

Notes on SIGCSE 2008: Table of Contents

This set of entries records my experiences at SIGCSE 2008, in Portland, Oregon, March 12-16. I'll update it as I post new pieces about the conference. One of those entries will explain why my posts on SIGCSE may come more slowly than they might.

Primary entries:

Ancillary entries:


Posted by Eugene Wallingford | Permalink | Categories: Computing, Running, Software Development, Teaching and Learning

March 06, 2008 8:07 PM

An Odd Dinner Conversation

Last night, at dinner with my family, I casually mentioned this YouTube video in which Barack Obama answers a question from a Google interviewer about how to sort a million 32-bit integers. Obama gets a good laugh when he says that "the bubble sort would be the wrong way to go". My family knows that I enjoy pointing out pop references to CS, so I figured they'd take this one in stride and move on.

But they didn't. Instead, they asked questions. What is "bubble sort"? Why do they call it that? As I described the idea, they followed with more questions and ideas of their own. I told them that bubble sort was the first sorting algorithm I ever learned, programming in BASIC as a junior in high school. My wife mentioned something like the selection sort, so I told them a bit about selection sort and insertion sort, and how they are considered "better" than bubble sort.

Why? they asked. That led us to Big-Oh notation and O(n²) versus O(n log n), and why the latter is better. We talked about how we can characterize an algorithm by its running time as proportional to n² or n log n, with some constant factor k, and the role that k plays in complicating our comparisons. I mentioned that O(n²) and a big k are part of the reason that bubble sort is considered bad, and that's what made the answer in the video correct -- and also why I am pretty sure that Obama did not understand any of the reasoning behind his answer, which is what made his deadpan confidence worth a chuckle.
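
For readers who, like my family before last night, have never met bubble sort, here is the textbook version in a few lines of Java -- nothing from the video, just the classic sketch. The nested loops are the n² in the story:

import java.util.Arrays;

public class BubbleDemo {

    // Classic bubble sort: repeatedly swap adjacent out-of-order elements,
    // letting the largest remaining value "bubble" to the end on each pass.
    static void bubbleSort(int[] a) {
        for (int pass = 0; pass < a.length - 1; pass++) {
            for (int i = 0; i < a.length - 1 - pass; i++) {
                if (a[i] > a[i + 1]) {
                    int tmp = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = tmp;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] numbers = { 5, 1, 4, 2, 8 };
        bubbleSort(numbers);
        System.out.println(Arrays.toString(numbers));   // [1, 2, 4, 5, 8]
    }
}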

(If you would like to learn more about bubble sort and have a chuckle of your own, read Owen Astrachan's Bubble Sort: An Archaeological Algorithmic Analysis (PDF), available from his web site.)

As the conversation wound down, we talked about how we ourselves sort things, and I got to mention my favorite sorting algorithm for day-to-day tasks, mergesort.

I suspect that my younger daughter enjoyed this conversation mostly for hearing daddy the computer scientist answer questions, but my wife and freshman daughter seemed to have grokked some of what we talked about. Honest -- this wasn't just me prattling on unprovoked. It was fun, yet strange. Maybe conversations like this one can help my daughters have a sense of the many kinds of things that computer scientists think about. Even if it was just bubble sort.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

March 02, 2008 2:03 PM

Them's Fighting Words

This provocative statement appears in the Roly Perera essay that I mentioned recently:

All I know is that if there is a place for post-modernist, lit-crit, social constructivist thinking in the modern world, it's nowhere near the field of computing.

Robert Biddle and James Noble may have a duel on their hands (PDF)! I think that we have the makings of an outstanding OOPSLA 2008 panel.

I'd certainly like to hear more from Perera if he has more ideas of this sort:

... emergent behaviours are in a sense dual to the requirements on a solution. Requirements are known and obligate the system in certain ways, whereas emergent behaviours ("emergents", one could call them) are those which are permitted by the system, but which were not known a priori.

This is an interesting analogy, one I'd like to think more about. For some reason, it reminds me of Jon Postel's dictum about protocol design, "Be liberal in what you accept, and conservative in what you send", and Bertrand Meyer's ideas about design by contract.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 01, 2008 3:39 PM

Toward Less Formal Software

This week I ran across Jonathan Edwards's review of Gregor Kiczales's OOPSLA 2007 keynote address, "Context, Perspective and Programs" (which is available from the conference podcast page). Having recently commented on Peter Turchi's keynote, I figured this was a good time to go back and listen to all of Gregor's again. I first listened to part of it back when I posted it to the page, but I have to admit that I didn't understand it all that well then. So a second listen was warranted. This time I had access to his slides, which made a huge difference.

In his talk, Kiczales tries to bring together ideas from philosophy, language, and our collective experience writing software to tackle a problem that he has been working around his whole career: programs are abstractions, and any abstraction represents a particular point of view. Over time, the point of view changes, which means that the program is out of sync. Software folks have been thinking about ways to make programs capable of responding naturally as their contexts evolve. Biological systems have offered some inspiration in the last decade or so. Kiczales suggests that computer science's focus on formality gets in the way of us finding a good answer to our problem.

Some folks took this suggestion as meaning that we would surrender all formalism and take up models of social negotiation as our foundation. Roly Perera wrote a detailed and pointed review of this sort. While I think Perera does a nice job of placing Kiczales's issues in their philosophical context, I do not think Kiczales was saying that we should go from being formal to being informal. He was suggesting that we shouldn't have to go from being completely formal to being completely informal; there should be a middle ground.

Our way of thinking about formality is binary -- is that any surprise? -- but perhaps we can define a continuum between the two. If so, we could write our program at an equilibrium point for the particular context it is in and then, as the context shifts, allow the program to move along the continuum in response.

Now that I understand a little better what Kiczales is saying, his message resonates well with me. It sounds a lot like the way a pattern balances the forces that affect a system. As the forces change, a new structure may need to emerge to keep the code in balance. We programmers refactor our code in response to such changes. What would it be like for the system to recognize changes in context and evolve? That's how natural systems work.

As usual, Kiczales is thinking thoughts worth thinking.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

February 27, 2008 5:48 PM

We Are Not Alone

In case you think me odd in my recent interest in the idea of computer science for all students, even non-majors, check out an interview with Andries van Dam in the current issue of The Chronicle of Higher Education on-line:

Q: What do you hope to bring to computer-science education?

A: We'll try to figure out "computing in the broad sense" -- not just computer-science education, but computing education in other fields as well. What should high-school students know about computation? What should college students know about computation? I think these are all questions we're going to ask.

van Dam is a CS professor at Brown University and the current chair of the Computing Research Association's education committee. I look forward to seeing what the CRA can help the discipline accomplish in this space.

Do keep in mind that when I say things like "computer science for all students", I mean it for some yet-undetermined value of "computer science". I certainly don't think that current CS curricula or even intro courses are suited for helping all university or high school students learn the power of computing. (Heck, I'm not even sure that most of our intro courses are the best way to teach our majors.)

That's one of the concerns that I have with the proposed Computing and the Arts major at Yale that I mentioned last time. It's not at all clear to me that a combination of courses from the existing CS and art majors is what is really needed to educate a new audience of intellectuals or professionals empowered to use computation in a new way. Then again, I do not know what such a new major or its courses might look like, so this experiment may be a good way to get started. But the faculty there -- and here, and on the CRA education committee, and everywhere else -- should be on the look-out for how we can best prepare an educated populace, one that is computation-savvy for a world in which computation is everywhere.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

February 26, 2008 6:00 PM

A Few Thoughts on Artists as Programmers

I've written so much about scientists as programmers, I'm a little disappointed that I haven't made time to write more about artists as programmers in response to Ira Greenberg's visit last month. The reason is probably two-pronged. First, I usually have less frequent interaction with artists, especially artists with this sort of mentality. This month, I did give a talk with an artistic connection to an audience of art students, but even that wasn't enough to prime the pump. That can be attributed to the second prong, which is teaching five-week courses in languages I have never taught before -- one of which, PHP, I've never even done much programming in. I've been busy preparing course materials and learning.

Before I lose all track of the artists-as-programmers thread for now, let me say a few things that I still have in waiting.

Processing is really just a Java IDE. I don't mean that in a dismissive sense; it's very useful and provides some neat tools to hide the details of Java -- including classes and the dread "public static void main" -- from programmers who don't care. But there is not all that much to it in a technical sense, which means that CS folks don't need to obsess about whether they are using it or not.
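
To see what I mean, here is a complete Processing sketch -- a minimal example of my own, not anything from Greenberg's materials. The programmer writes only the two methods below; the environment wraps them in a Java class (a subclass of its PApplet) and supplies the main method:

// A complete Processing sketch: no class declaration, no main method.
// Behind the scenes, the Processing tool wraps this code in a Java class
// and provides the plumbing that Java normally demands up front.
void setup() {
    size(200, 200);
}

void draw() {
    background(255);
    ellipse(mouseX, mouseY, 20, 20);   // a circle that follows the mouse
}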

Keller McBride's color spray artwork

For example, you can do many of the same things in JES, the Jython environment created for Guzdial's media computation stuff. When I taught media computation using Erickson and Guzdial's Java materials, I had my students implement an interpreter for the simplest of graphics languages and then asked them to show off their program with a piece of art produced with the language. One result was the image to the right, produced by freshman Keller McBride using a program processed by his own interpreter.

During his talk, Greenberg mentioned that he had a different take on the idea of web "usability". Later I commented that I was glad he had said that, because I found that his website was a little bit funky. His response was interesting in a way that meshes with some of the things I occasionally say about computing as a new paradigm for expressing ideas. (This is not an original idea, of course; Alan Kay has been trying to help us understand this for forty years.)

Greenberg doesn't see computation only as an extension of the engineering metaphor that has defined computing in the age of electronics; he sees it as the "dawn of a new age". When we think of computation in the engineering context, issues such as usability and ergonomics become a natural focus. But in this new age, computing can mean and be something different:

Where I want my toaster to "disappear" and simply render perfectly cooked bread, I don't want that same experience when I compute--especially since I don't often have an initial goal/purpose.

He mentioned, too, that his ideas are not completely settled in this area, but I don't think that anyone has a complete handle on what the new age of computing really means. It sounds as if his ideas are as well formed as most anyone's, and I'm excited when I hear what non-CS people think in this space.

Finally, when it comes to teaching art and computer science together, some schools are already working in that direction. For example, faculty at Yale recently announced that they are putting together a major in computing and the arts. I am not sure what to think about their proposal, which aims to be "rigorous" by requiring students to take existing courses in the arts and computer science. There are no courses created especially for the major. That is probably a good idea for some audiences, but what about artists who don't want a full CS-specific experience? Do they need the same technical depth as your average CS student? Somehow, I don't think so. A new kind of discipline may well require a new kind of major. But it's neat that someone is taking steps in this direction. We will probably learn something useful from their experience.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

February 24, 2008 12:48 PM

Getting Lost

While catching up on some work at the office yesterday -- a rare Saturday indeed -- I listened to Peter Turchi's OOPSLA 2007 keynote address, available from the conference podcast page. Turchi is a writer with whom conference chair Richard Gabriel studied while pursuing his MFA at Warren Wilson College. I would not put this talk in the same class as Robert Hass's OOPSLA 2005 keynote, but perhaps that has more to do with my listening to an audio recording of it and not being there in the moment. Still, I found it to be worth listening to as Turchi encouraged us to "get lost" when we want to create. We usually think of getting lost as something that happens to us when we are trying to get somewhere else. That makes getting lost something we wish wouldn't happen at all. But when we get lost in a new land inside our minds, we discover something new that we could not have seen before, at least not in the same way.

As I listened, I heard three ideas that captured much of the essence of Turchi's keynote. First was that we should strive to avoid preconception. This can be tough to do, because ultimately it means that we must work without knowing what is good or bad! The notions of good and bad are themselves preconceptions. They are valuable to scientists and engineers as they polish up a solution, but they often are impediments to discovering or creating a solution in the first place.

Second was the warning that a failure to get lost is a failure of imagination. When we work deeply in an area for a while, we sometimes feel as if we can't see anything new and creative because we know and understand the landscape so well. We have become "experts", which isn't always as dandy a status as it may seem. It limits what we see. In such times, we need to step off the easy path and exercise our imaginations in a new way. What must I do in order to see something new?

This leads to the third theme I pulled from Turchi's talk: getting lost takes work and preparation. When we get stuck, we have to work to imagine our way out of the rut. For the creative person, though, it's about more than getting out of a rut. The creative person needs to get lost in a new place all the time, in order to see something new. For many of us, getting lost may seem like something that just happens, but the person who wants to be lost has to prepare for it.

Turchi mentioned Robert Louis Stevenson as someone with a particular appreciation for "the happy accident that planning can produce". But artists are not the only folks who benefit from these happy accidents or who should work to produce the conditions in which they can occur. Scientific research operates on a similar plane. I am reminded again of Robert Root-Bernstein's ideas for actively engaging the unexpected. Writers can't leave getting lost to chance, and neither can scientists.

Turchi comes from the world of writing, not the world of science. Do his ideas apply to the computer scientist's form of writing, programming? I think so. A couple of years ago, I described a structured form of getting lost called air-drop programming, which adventurous programmers use to learn a legacy code base. One can use the same idea to learn a new framework or API, or even to learn a new programming language. Cut all ties to the familiar, jump right in, and see what you learn!

What about teaching? Yes. A colleague stopped by my office late last week to describe a great day of class in which he had covered almost none of what he had planned. A student had asked a question whose answer led to another, and then another, and pretty soon the class was deep in a discussion that was at least as valuable as the planned activities, and maybe more so. My colleague couldn't have planned this unexpectedly good discussion, but his and the class's work put them in a position where it could happen. Of course, unexpected exploration takes time... When will they cover all the material of the course? I suspect the students will be just fine as they make adjustments downstream this semester.

What about running? Well, of course. The topic of air-drop programming came up during a conversation about a general tourist pattern for learning a new town. Running in a new town is a great way to learn the lay of the land. Sometimes I have to work not to remember landmarks along the way, so that I can see new things on my way back to the hotel. As I wrote after a glorious morning run at ChiliPLoP three years ago, sometimes you run to get from Point A to Point B; sometimes, you should just run. That applies to your hometown, too. I once read about an elite women's runner who recommended being dropped off far from your usual running routes and working your way back home through unfamiliar streets and terrain. I've done something like this myself, though not often enough, and it is a great way to revitalize my running whenever the trails start to look like the same old same old.

It seems that getting lost is a universal pattern, which made it a perfect topic for an OOPSLA keynote talk.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Patterns, Running, Software Development, Teaching and Learning

February 23, 2008 3:41 PM

Door No. 2

My dean recently distributed copies of "Behind Door No. 2: A New Paradigm for Undergraduate Science Education", from the 2006 annual report of Research Corporation. It says many of the things we have all heard about revitalizing science education, summarizing some of the challenges and ideas that people have tried. The report speaks in terms of the traditional sciences, but most of what it says applies well to computer science.

I don't think I learned all that much new from this report, but it was nice to see a relatively concise summary of these issues. What I enjoyed most were some of the examples and quotes from respected science researchers, such as physics Nobel laureate Carl Wieman. One of the challenges that universities face in re-forming how they teach CS, math, and science is that research faculty are often resistant to changing how they teach or think about their classrooms. (Remember, we have material to cover.) These faculty are often tenured full professors who wield significant power in the department over curriculum and program content.

At a comprehensive university such as mine, the problem can be accentuated by the fact that even the research faculty teach a full load of undergraduate courses! At the bigger research schools, there are often faculty and instructors who focus almost entirely on undergraduate instruction and especially the courses in the undergraduate core and for non-majors. The research faculty, who may not place too much confidence in "all that educational mumbo-jumbo", don't have as much contact with undergrads and non-majors.

I also enjoyed some of the passages that close the article. First, Bruce Alberts suggests that we in the universities worry about the mote in our own eye:

I used to blame all the K-12 people for everything, but I think we [in higher education] need to take a lot of responsibility. ... K-12 teachers who teach science learned it first from science courses in college. You really want to be able to start with school teachers who already understand good science teaching, ...

Leon Lederman points to the central role that science plays in the modern world:

Once upon a time the knowledge of Latin and Greek was essential to being educated, but that's no longer true. Everywhere you look in modern society in the 21st century, science plays a role that's crucial. It's hard to think of any policy decision on the national level that doesn't have some important scientific criteria that should weigh in on the decisions you make.

He probably wasn't thinking of computer science, but when I think such thoughts I surely am.

Finally, Dudley Herschbach reminds us that the need for better science education is grounded in more than just the need for economic development. We owe our students and citizens more:

So often the science education issue is put in terms of workforce needs and competitiveness. Of course, that's a factor. But for me it's even more fundamental. How can you have a democracy if you don't have literacy? Without scientific literacy, citizens don't know what to believe.... It is so sad that in the world's richest country, a country that prides itself on being a leader in science and technology, we have a large fraction of the population that might as well live in the 19th, 18th or 17th century. They aren't getting to live in the 21st century except in the superficial way of benefiting from all the gadgets. But they don't have any sense of the human adventure...

That is an interesting stance: much of our population doesn't live in the 21st century, because they don't understand the science that defines their world.

Yesterday, I represented our department at a recruitment open house on campus. One mom pulled her high-school senior over to the table where computer science and chemistry stood and asked him, "Have you considered one of these majors?" He said, "I don't like science." Too many students graduate high school feeling that way, and it is a tragedy. It's bad for the future of technology; it's bad for the future of our economy. And they are missing out on the world they live in. I tried to share the thrill, but I don't think I'll see him in class next fall.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

February 20, 2008 2:55 PM

You Know You're Doing Important Work...

... when Charlie Eppes invokes your research area on Numb3rs. In the episode I saw last Friday, the team used a recommender system, among other snazzy techie glitz, to track down a Robin Hood who was robbing from the dishonestly rich and giving to the poor through a collection of charities. A colleague of mine does work in recommender systems and collaborative filtering, so I thought of him immediately. His kind of work has entered the vernacular now.

I don't recall the Numb3rs crew ever referring to knowledge-based systems or task-specific architectures, which was my area in the old days. Nor do I remember any references to design patterns or to programming language topics, which is where I have spent my time in the last decade or so. Should I feel left out?

But Charlie and Amita did use the idea of steganography in an episode two years ago, to find a pornographic image hidden inside an ordinary image. I have given talks on steganography on campus occasionally in the last couple of years. The first time was at a conference on camouflage, and most recently I spoke to a graphic design class, earlier this month. (My next engagement is at UNI's Saturday Science Showcase, a public outreach lecture series my college runs in the spring.) So I feel like at least some of my intellectual work has been validated.

Coincidentally, I usually bill my talks on this topic as "Numb3rs Meets The Da Vinci Code: Information Masquerading as Art", and one of the demonstrations I do is to hide an image of Numb3rs guys in a digitized version of the Mona Lisa. The talk is a lot of fun for me, but I wonder if college kids these days pay much attention to network television, let alone da Vinci's art.
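
For the curious, the heart of the Mona Lisa demo is ordinary least-significant-bit steganography, which takes only a few lines of Java. This is an illustrative sketch of the technique, not the actual code I use in the talk; it assumes the two images are the same size and hides two bits of each color channel:

import java.awt.image.BufferedImage;

public class LsbHide {

    // Hides the high-order bits of each pixel of 'secret' in the low-order
    // bits of the corresponding pixel of 'cover': two bits per color channel.
    static BufferedImage hide(BufferedImage cover, BufferedImage secret) {
        BufferedImage out = new BufferedImage(
            cover.getWidth(), cover.getHeight(), BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < cover.getHeight(); y++) {
            for (int x = 0; x < cover.getWidth(); x++) {
                int c = cover.getRGB(x, y);
                int s = secret.getRGB(x, y);
                int rgb = 0;
                for (int shift = 0; shift <= 16; shift += 8) {
                    int coverChannel  = (c >> shift) & 0xFF;
                    int secretChannel = (s >> shift) & 0xFF;
                    // keep the top six bits of the cover, replace the bottom
                    // two with the top two bits of the secret
                    int merged = (coverChannel & 0xFC) | (secretChannel >> 6);
                    rgb |= merged << shift;
                }
                out.setRGB(x, y, rgb);
            }
        }
        return out;
    }
}

Recovering the hidden image is the mirror of this loop: shift the bottom two bits of each channel back to the top and leave the rest zero, which yields a coarse but recognizable copy of the original.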

Lest you think that only we nth-tier researchers care to have our areas trumpeted in the pop world, even the great ones can draw such pleasure. Last spring, Grady Booch gave a keynote address at SIGCSE. As a part of his opening, he played for us a clip from a TV show that had brightened his day, because it mentioned, among other snazzy techie glitz, the Unified Modeling Language he had helped to create. Oh, and that video clip came from... Numb3rs!


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

February 19, 2008 5:11 PM

Do We Need Folks With CS Degrees?

Are all the open jobs in computing that we keep hearing about going unfilled?

Actually -- they're not. Companies do fill those jobs. They fill them with less expensive workers, without computing degrees, and then train them to program.

Mark Guzdial is concerned that some American CEOs and legislators are unconcerned -- "So? Where's the problem?" -- and wonders how we make the case that degrees in CS matter.

I wonder if the US would be better off if we addressed a shortage of medical doctors by starting with less expensive workers, without medical degrees, and then trained them to practice medicine? We currently do face a shortage of medical professionals willing to practice in rural and underprivileged areas.

The analogy is not a perfect one, of course. A fair amount of the software we produce in the world is life-critical, but a lot is not. Still, I'm not sure we want to live in a world where our financial, commercial, communication, educational, and entertainment systems depend on software to run, with that software written by folks who have only a shallow understanding of software and computing more generally.

Maybe an analogy to the law or education is more on-point. For example, would the US be better off if we addressed a shortage of lawyers or teachers by starting with less expensive workers, without degrees in those areas, and then trained them? A shortage of lawyers -- ha! But there is indeed a critical shortage of teachers in many disciplines looming in the near future, especially in math and science. This might lead to an interesting conversation, because many folks advocate loosening the restrictions on professional training for folks who teach in our K-12 classrooms.

I do not mean to say that folks who are trained "on the job" to write software necessarily have a shallow understanding of software or programming. Much valuable learning occurs on the job, and there are many folks who believe strongly in a craftsmanship approach to developing developers. My friend and colleague Ken Auer built his company on the model of software apprenticeship. I think that our university system should adopt more of a project-based and apprenticeship-based approach to educating software developers. But I wonder about the practicality of a system that develops all of its programmers on the job. Maybe my view is colored by self-preservation, but I think there is an important role for university computing education.

Speaking of practicality, perhaps the best way to speak to the CEOs and legislators who doubt the value of academic CS degrees is in their language of supply and productivity. First, decentralized apprenticeship programs are probably how most people really become programmers, but they operate on a small scale. A university program is able to operate on a slightly larger scale, producing more folks who are ready for apprenticeship in industry sooner than industry can grow them from scratch. Second, the best-prepared folks coming out of university programs are much more productive than the folks being retrained, at least while the brightest trainees catch up. That lack of productivity is at best an opportunity cost, and at worst an invitation for another company to eat your lunch.

Of course, I also think that in the future more and more programmers will be scientists and engineers who have learned how to program. I'm inclined to think that these folks and the software world will be better off being educated by folks with a deep understanding of computing. Artists, too. And not only for immediately obvious economic reasons.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

February 15, 2008 4:48 PM

Catching a Meme at the End of a Long Week

I don't usually play meme games on my blog, but as I am winding down for the week I ran across this one on Brian Marick's blog: grab the nearest book, open it to page 123, go to the 5th sentence, and type up the three sentences beginning there.

With my mind worn out from a week in which I caught something worse than a meme, I fell prey and swung my arm around. The nearest book was Beautiful Code. Technically, I suppose that a stack of PHP textbooks is a couple of inches closer to me, but does anyone really want to know what is on Page 123 of any of them?

Here is the output:

The resultant index (which was called iSrc in FilterMethodCS) might be outside the bounds of the array. The following code loads an integer 0 on the stack and branches if iSrc is less than 0, effectively popping both operands from the stack. This is a partial equivalent of the if statement conditional in line 19 of Example 8-2:

Okay, that may not be much more interesting than what a PHP book might have to say, at least out of context. I'm a compiler junkie, though, and I was happy to find a compiler-style chapter in the book. So I turned to the beginning of the chapter, which turns out to be "On-the-Fly Code Generation for Image Processing" by Charles Petzold. I must admit that this sounds pretty interesting to me. The chapter opens with something that may be of interest to others, too:

Among the pearls of wisdom and wackiness chronicled in Steve Levy's classic "Hackers: Heroes of the Computer Revolution" (Doubleday), my favorite is this one by Bill Gosper, who once said, "Data is just a dumb kind of programming."

Petzold then goes on to discuss the interplay between code and data, which is something I've written about as one of the big ideas computer science has taught the world.

What a nice way for me to end the week. Now I have a new something to read over the weekend. Of course, I should probably spend most of my time with those PHP textbooks; that language is iteration 2 in my course this semester. But I've avoided "real work" for a lot less in the past.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

February 08, 2008 6:08 PM

An Honest Question Deserves An Honest Answer

Today I spent the morning meeting with prospective CS majors and their parents. These prospective majors were high school seniors visiting campus as part of their deciding whether to come to my university. Such mornings are exhausting for me, because I'm not a natural glad-hander. Still, it is a lot of fun talking to folks about computer science and our programs. We end up spending a lot of time with relatively few folks, but I think of it like the retail politics of Iowa's caucus campaign: the word-of-mouth marketing is more valuable than any one or two majors we might attract, and these days, every new major counts.

Twice today I was surprised by a question that more high school students and parents could ask, but don't:

Me: So, you're interested in computer science?

Student: I think so, but I don't really know what computer science is.

Parent: Can you tell us what computer science is and what computer scientists do?

I'm glad that kids are now interested enough in CS to ask. In recent years, most have simply bypassed us for majors they understood already.

My answer was different each time but consistent in theme to what I talk about here. It occurs to me that "What Is Computer Science?" could make a good blog entry, and that writing it in concise form would probably be good practice for encounters such as the ones I had today.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Managing and Leading

February 04, 2008 6:51 AM

The Program's the Thing

Readers of this blog know that programming is one of the topics I most like to write about. In recent months I've had something of a "programming for everyone" theme, with programming as a medium of expression, as a way to create new forms, ideas, and solutions. But programming is also for computer scientists, being the primary mode for communicating their ideas.

To the non-CS folks reading this, that may seem odd. Isn't CS about programming? Most non-CS folks seem to take as a given that computer science is, but these days it is de rigueur for us in the discipline to talk about "computing" and how much bigger it is than "just programming".

To some extent, I am guilty of this myself. I often use the term "computing" generically in this blog to refer to the fuzzy union of computer science, software engineering, applications, and related disciplines. This term allows me to talk about issues in their broadest context without limiting my focus to any one sub-discipline. But it also lets me be fuzzy in my writing, by not requiring that I commit.

Sometimes, that breadth can give the impression that I think programming is but a small part of the discipline. But most of my writing comes back to programming, and when I teach a CS course, programming is always central. When I teach introductory computer science, programming is a way for us to explore ideas. When I teach compilers, it all comes down to the project. My students learn Programming Languages and Paradigms by writing code in a new style and then using what they learn to explore basic ideas about language in code. When I taught AI for eight or ten straight years, as many of the big ideas as possible found their way into lab exercises. Even when I taught one course that had no programming component -- an amalgam of design, HCI, and professional responsibility called Software Systems -- I had students read code: simple implementations of model-view-controller, written in Java, C++, Common Lisp, or Ada!

I love to program, and I hope students leave my courses seeing that programming is medium for expressing and creating ideas -- as well as creating "business solutions", which will be the professional focus for most of them. Then again, the best business solutions are themselves ideas that need to be discovered, explored, and evolved. Programming is the perfect medium to do that.

So when I ran across The Nerd Factor is Huge, via Chuck Hoffman at Nothing Happens, I found myself to be part of the choir, shouting out "Amen, brother!" every so often. The article is a paean to programming, in a blog "dedicated to the glory of software programming". It claims that programming needs an academic home, a discipline focused on understanding it better and teaching people how to do it. (And that discipline should have its own conference!)

In Yegge-like fashion, the author uses expressive prose to make big claims, controversial claims. I agree with many of them, and feel a little in harmony even with the ones I wouldn't likely stake my professional reputation on.

  • The shortage of women in computer science hurts our discipline, and it limits the opportunities available to the women, both intellectual and economic. I would broaden this statement to include other underrepresented groups.

  • Trying to interest girls and other underrepresented groups by "expanding the non-programming ghettos of computer science" is misguided and insults these people. We can certainly do more to communicate how becoming a computer scientist or programmer empowers a person to effect change and produce ideas that help people, but we should not remove from the equation the skill that makes all that possible.

  • Human-computer interaction doesn't belong in computer science. Including it there "is as insulting to the study of interaction design ... as it is to computer science". This is perhaps the most controversial claim in the article, but the author makes clear that he is not showing disrespect to HCI. Instead, he wants due respect shown to it -- and to programming, which tends to get lost in a computer science that seeks to be too many things.

I agree with the central thesis of this article. However, separating programming as a discipline from HCI and some of the other "non-programming ghettos" of CS creates a challenge for university educators. Most students come to us not only for enlightenment and great ideas but also for professional preparation. With several distinct disciplines involved, we need to help students put them all together to prepare them to be well-rounded pros.

How should we encourage more kids -- girls and boys alike -- to study computer science? "Nerd Factor" is right: don't shy away from programming; teach it, and sooner rather than later. Show them "how easy it is to create something. Because that is what programming is all about: making things." And making things is the key. Being able to program gives anyone the power to turn ideas into reality.

One of my early memories of OOPSLA comes from a panel discussion. I don't recall the reason for the panel, but several big names were talking about what we should call people who create software. There were probably some folks who supported the term "software engineer", because there always are. Kent Beck spoke heresy: "I'm just a programmer." I hope I muttered a little "Amen, brother" under my breath that day, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

January 30, 2008 8:39 AM

What is a Tree?

Ira Greenberg's Cobalt Rider

I can talk about something other than science. As I write this, I am at a talk called "What is a Tree?", by computational artist Ira Greenberg. In it, Greenberg is telling his story of going from art to math -- and computation -- and back.

Greenberg started as a traditional artist, based in drawing and focused in painting. He earned his degrees in Visual Art, from Cornell and Penn. His training was traditional, too -- no computation, no math. He was going to paint.

In his earliest work, Greenberg was caught up in perception. He found that he could experiment only with the motif in front of him. Over time he evolved from more realistic natural images to images that were more "synthetic", more plastic. His work came to be about shape and color. And pattern.

Just one word -- plastics.  From The Graduate.

Alas, he wasn't selling anything. Like all of us, he needed to make some money. His uncle told him to "look into computers -- they are the future". (This is 1993 or so...) Greenberg could not have been less interested. Working with computers seemed like a waste of time. But he got a computer, some software, and some books, and he played. In spite of himself, he loved it. He was fascinated.

Soon he got paying gigs at places like Conde Nast. He was making good money doing computer graphics for marketing and publishing folks. At the time, he said, people doing computer graphics were like mad scientists, conjuring works with mystical incantations. He and his buddies found work as hired guns for older graphic artists who had no computer skills. They would stand over his shoulder, point at the screen, and say in rapid-fire style, "Do this, do this, do this." "We did, and then they paid us."

All the while, Greenberg was still doing his "serious work" -- painting -- on the side.

Ira Greenberg's Blue Flame

But he got good at this computer stuff. He liked it. And yet he felt guilty. His artist friends were "pure", and he felt like a sell-out. Even still, he felt an urge to "put it all together", to understand what this computer stuff really meant to his art. He decided to sell out all the way: to go to NYC and sell these marketable skills for big money. The time was right, and the money was good.

It didn't work. Going to an office to produce commercial art for hire changed him, and his wife noticed. Greenberg sees nothing wrong with this kind of work; it just wasn't for him. Still, he liked at least one thing about doing art in the corporate style: collaboration. He was able to work with designers, writers, marketing folks. Serious painters don't collaborate, because they are doing their own art.

The more he worked with computers in the creative process, the more he began to feel as if using tools like Photoshop and LightWave was cheating. They provide an experience that is too "mediated". With any activity, as you get better you "let the chaos guide you", but these tools -- their smoothness, their engineered perfection, their Undo buttons -- were too neat. Artists need fuzziness. He wanted to get his hands dirty. Like painting.

So Greenberg decided to get under the hood of Photoshop. He started going deeper. His artist friends thought he was doing the devil's work. But he was doing cool stuff. Oftentimes, he felt that the odd things generated by his computer programs were more interesting than his painting!

Ira Greenberg's Hoffman Plasticity Visualizer

He went deeper with the mathematics, playing with formulas, simulating physics. He began to substitute formulas inside formulas inside formulas. He -- his programs -- produced "sketches".

At some point, he came across Processing, "an open source programming language and environment for people who want to program images, animation, and interactions". This is a domain-specific language for artists, implemented as an IDE for Java. It grew out of work done by John Maeda's group at the MIT Media Lab. These days he programs in ActionScript, Java, Flash, and Processing, and promotes Processing as perhaps the best way for computer-wary artists to get started computationally.

What is a Tree? poster

With his biographical sketch done, he moved on to the art that inspired his talk's title. He showed a series of programs that demonstrated his algorithmic approach to creativity. His example was a tree, a double entendre that nodded both to his past as a painter of natural scenes and to his embrace of computer science.

He started with the concept of a tree in a simple line drawing. Then he added variation: different angles, different branching factors. These created asymmetry in the image. Then he added more variation: different scales, different densities. Then he added more variation: different line thickness, "foliage" at the end of the smallest branches. With random elements in the program, he gets different outputs each time he runs the code. He added still more variation: color, space, dimension, .... He can keep going along as many different conceptual dimensions as he likes to create art. He can strive for verisimilitude, representation, abstraction, ... any artistic goal he might seek with a brush and oils.
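
To give a flavor of the approach, here is a toy recursive tree in Processing -- my own little sketch written after the talk, emphatically not Greenberg's code. A recursive branch function plus a little randomness already produces a different tree on every run:

// A toy recursive tree in the spirit of Greenberg's demos.
// Each run draws a different tree because of the random angles and lengths.
void setup() {
    size(400, 400);
    background(255);
    stroke(0);
    branch(width / 2, height, -HALF_PI, 80, 8);
}

void branch(float x, float y, float angle, float len, int depth) {
    if (depth == 0 || len < 2) return;
    float x2 = x + cos(angle) * len;
    float y2 = y + sin(angle) * len;
    line(x, y, x2, y2);
    // two children, each with a randomly varied angle and a shorter length
    branch(x2, y2, angle - random(0.2, 0.6), len * random(0.6, 0.8), depth - 1);
    branch(x2, y2, angle + random(0.2, 0.6), len * random(0.6, 0.8), depth - 1);
}

Add a stroke weight that shrinks with depth, a touch of color at the tips, and a few more dimensions of variation, and you begin to see how far the idea can go.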

Greenberg's artistic medium is code. He writes some code. He runs it. He changes some things and runs it again. The process is an ongoing interaction with his medium. He evolves not a specific work of art, but an algorithm that can generate an infinite number of works.

I would claim that in a very important sense his work is the algorithm. For most artists, the art is in the physical work they produce. For Greenberg, there is a level of indirection -- which is, interestingly, one of the most fundamental concepts of computer science. For me, perhaps the algorithm is the artistic work! Greenberg's program is functional, not representational, and what people want to see is the art his programs produce. But code can be beautiful, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

January 24, 2008 6:39 AM

More on Computational Simulation, Programming, and the Scientific Method

As I was running through some very cold, snow-covered streets, it occurred to me that my recent post on James Quirk's AMRITA system neglected to highlight one of the more interesting elements of Quirk's discussion: Computational scientists have little or no incentive to become better programmers, because research papers are the currency of their disciplines. Publications earn tenure and promotion, not to mention street cred in the profession. Code is viewed by most as merely a means to an end, an ephemeral product on the way to a citation.

What I take from Quirk's paper is that code isn't -- or shouldn't be -- ephemeral, or only a means to an end. It is the experiment and the source of data on which scientific claims rest. As I thought more about the paper I began to wonder, can computational scientists do better science if they become better programmers? Even more to the point, will it become essential for a computational scientist to be a good programmer just to do the science of the future? That's certainly what I heard some of the scientists at the SECANT workshop saying.

While googling to find a link to Quirk's article for my entry (Google is the new grep. TM), I found the paper Computational Simulations and the Scientific Method (pdf), by Bil Kleb and Bill Wood. They take the programming-and-science angle in a neat software direction, suggesting that

  • the creators of a new simulation technique should publish unit tests that specify the technique's intended behavior, and
  • the developers of scientific simulation code for a given technique should use its unit tests to demonstrate that their component correctly implements the technique.

Publishing a test fixture offers several potential benefits, including:

  • a way to communicate a technique or algorithm better
  • a way to share the required functionality and performance features of an implementation
  • a way to improve repeatability of computational experiments, by ensuring that scientists using the same technique are actually getting the same output from their component modules
  • a way to improve comparison of different experiments
  • a way to improve verification and validation of experiments

These are not about programming or software development; they are about a way to do science.
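Kleb and Wood describe the idea in general terms; the following is only my own minimal sketch of what a published test fixture for a technique might look like, using Python's standard unittest module. The technique (the composite trapezoidal rule) and the tolerances are placeholders I chose for illustration; the point is that the tests, not the prose, pin down the intended behavior.

    import math
    import unittest

    def trapezoid(f, a, b, n):
        # Composite trapezoidal rule: approximate the integral of f over [a, b]
        # using n panels.
        h = (b - a) / n
        total = 0.5 * (f(a) + f(b))
        for i in range(1, n):
            total += f(a + i * h)
        return total * h

    class TrapezoidFixture(unittest.TestCase):
        # A "published" specification of the technique's intended behavior.
        # Anyone implementing the technique should be able to run these tests
        # against their own component.

        def test_exact_for_linear_integrands(self):
            # The trapezoidal rule is exact for straight lines, whatever n is.
            self.assertAlmostEqual(trapezoid(lambda x: 3 * x + 1, 0.0, 2.0, 4), 8.0)

        def test_converges_on_a_smooth_integrand(self):
            # The integral of sin over [0, pi] is exactly 2; error shrinks as n grows.
            coarse = abs(trapezoid(math.sin, 0.0, math.pi, 8) - 2.0)
            fine = abs(trapezoid(math.sin, 0.0, math.pi, 64) - 2.0)
            self.assertLess(fine, coarse)
            self.assertLess(fine, 1e-3)

    if __name__ == "__main__":
        unittest.main()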

This is a really neat connection between (agile) software development and doing science. The idea is not necessarily new to folks in the agile software community. Some of these folks speak of test-driven development in terms of being a "more scientific" way to write code, and agile developers of all flavors believe deeply in the observation/feedback cycle. But I didn't know that computational scientists were talking this way, too.

After reading the Kleb and Wood paper, I was not surprised to learn that Bil has been involved in the Agile 200? conferences over the years. I somehow missed the 2003 IEEE Software article that he and Wood co-wrote on XP and scientific research and so now have something new to read.

I really like the way that Quirk and Kleb & Wood talk about communication and its role in the practice of science. It's refreshing and heartening.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

January 22, 2008 4:37 PM

Busy Days, Computational Science

Some days, I want to write but don't have anything to say. The start of the semester often finds me too busy doing things to do anything else. Plowing through the arcane details of Unix basic regular expressions that I tend not to use very often is perhaps the most interesting thing I've been doing.

Over the weekend, I did have a chance to read this paper by James Quirk, a computational scientist who has built a sophisticated simulation-and-documentation system called AMRITA. (You can download the paper as a PDF from his web site, by following About AMRITA and clicking on the link "Computational Science: Same Old Silence, Same Old Mistakes".) Quirk's system supports a writing style that interleaves code and commentary in the spirit of literate programming using the concept of a program fold, which at one level is a way to specify a different processor for each item in a tree of structured text items. The AMRITA project is an ambitious mixture of (1) demonstrating a new way to communicate computational science results and (2) arguing for software standards that make it possible for computational scientists to examine one another's work effectively. The latter point is essential if computational science is to behave like science, yet the complexity of most simulation programs almost precludes examination, replication, and experimentation with the ideas they implement.

Much of what Quirk says about scientists as programmers meshes with what I wrote in my reports on November's SECANT workshop. The paragraph that made me sit up, though, was this lead-in to his argument:

The AMR simulation shown in Figure 1 was computed July 1990.... It took just over 12 hours to run on a Sun SPARCstation 1. In 2003 it can be run on an IBM T30 laptop in a shade over two minutes.

It is sobering occasionally to see the advances in processors and other hardware presented in such concrete terms. I remember programming on a Sun SPARCstation 1 back in the early '90s, and how fast it seemed! By 2003 a laptop could perform more than 300 times faster on a data-intensive numeric simulation. How much faster still by 2008?

Quirk is interested in what this means for the conduct of computational science. "What should the computational science community be doing over and above scaling up the sizes of the problems it computes?" That is the heart of his paper, and much of the motivation behind AMRITA.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

January 09, 2008 4:31 PM

Follow Up to Recent Entry on Intro Courses

After writing my recent entry on some ideas about CS intro courses gleaned from a few of the columns and editorials in the latest edition of inroads, I had a few stray thoughts that seem worth expressing.

On Debate

I always feel a little uneasy when I write a piece like that. It comments specifically on an article written by a CS colleague, and in particular criticizes some of the ideas expressed in the article. Even when I include disclaimers to the effect that I really like an article, or respect the author's work, there is a chance that the author will take offense. That can happen even with more substantial written works, and I think that the looser, more "thinking out loud" nature of a blog only compounds the chance of a misunderstanding.

I certainly mean no offense when I write such pieces. Almost always, I am grateful for the author having triggered me to think and write something down. And always I intend for my articles to play a part in a healthy academic debate on an issue that both the author and I think is important.

Without healthy discussion of what we do -- and sometimes even contentious debate -- how can we hope to know whether we are on a good path? Or get onto a good path in the first place? Debate, or the friendlier "discussion", is one of the best ways we have for getting feedback on our thinking and for improving our ideas.

I like healthy discussion, even contentious debate, because I don't mean or take personal offense. I've always liked a good hearty discussion, but still I am indebted to my graduate advisor for modeling how academics approach ideas. In meetings of our research group and in seminars, he was often like a bulldog, challenging claims and offering counterarguments. At times, he could be contentious, and often the room became a lot hotter as he, I, and a couple of likewise combative fellow grad students went at it.

But those discussions were always about ideas. When the meeting ended, the discussion was over, and we all went back to being friends and colleagues. My advisor never held a grudge, even if I spent an hour telling him he was wrong, wrong, wrong. He enjoyed the debate, and knew that we all were getting better from the discussion.

Sometimes we have to work not to take personal offense, and to express our thoughts in a way that won't cause offense to a reasonable person. Sometimes, we even have to adjust our approach to account for the people in the room. But the work is worth it, and debate is necessary.

On Exciting Bright Minds

When deciding how to teach our intro courses, we need to ask several questions. What works best for our student population overall? What helps our weakest students learn as much of value as they can? What helps our strongest students see the exciting ideas of computing deeply and come to love the power they afford us?

My dream is that we can offer all of our students, but especially our brightest ones, the sort of excitement that Jon Bentley felt, as he writes in his Communications of the ACM 50th anniversary piece In the realm of insight and creativity. The pages of each issue ignited a young mind that went on to help define our discipline. Maybe our courses can do the same.

On Simplicity

In my article, I withheld comment on one of the "language war" elements of Manaris's argument, namely the argument for using a simpler language. In the editorial spirit of his article, I will share my own opinions, based almost wholly in personal preference and anecdote.

I remain a strong believer that OOP can be the foundation of a solid CS curriculum, but Java and certainly C++ are not the best vehicles for learning to program. I agree with Manaris that a conceptually simpler language is preferable. He cites Alan Kay, whom regular readers here know I cite frequently. I agree with Bill and Alan! Then again, Smalltalk or a variant thereof might be an even better choice for CS1 than Python. It has a simpler grammar, no goofy indentation rules, none of Python's strange double-underscore names, ... but it is quite different from most mainstream languages in feel, style, and environment. Scheme? Can't get much simpler than that. We have learned, though, that simplicity itself can create its own sort of difficulty for students while they learn. Many folks are successful teaching Scheme early, and I wish we had more evidence on Smalltalk's use in CS 1.

Conceptual simplicity is a good thing, but it is but one of several forces at play in the decision.

For what it's worth, as I mentioned in the same entry, we will be offering a Python-based media computation CS 1 course this semester. I am eager to see how it goes, both on its own and in comparison to the Java-based media comp CS 1 we taught once before.

On Patterns

Finally, Manaris quotes Alan Kay on the non-obvious ideas that can trip up a novice programmer, some of which are

... like the concept of the arch in building design: very hard to discover, if you don't already know them.

I do not advocate hanging novices up on non-obvious ideas, but it occurs to me that instruction driven by elementary patterns would address this difficulty head-on. Sometimes, a concept we think obvious is not so to a student, and other times we want students to encounter a non-obvious idea as a part of their growth. Patterns are all about communicating such ideas, in context, with explanation of some of the thinking that underlies their utility.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 07, 2008 7:07 PM

Teaching Compilers by Example

The most recent issue of inroads contained more than just some articles on how to introduce computer science to majors and non-majors alike. It also contained an experience report at the intersection of topics I've discussed recently: a compiler course using a sequence of examples.

José de Oliveira Guimarães describes a course he has taught annually since 2002. In this course, he presents a sequence of ten compilers, each of which introduces a new idea or ideas beyond the previous one. The "base case" isn't a full compiler but rather a parser for a Scheme-like expression language. Students see these examples -- full code for compilers that can be executed -- before they learn any theory behind the techniques they embody. All ten are implemented from scratch, using recursive descent for parsing.
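I have not yet studied his code, so what follows is only my own guess at the flavor of that base case: a recursive descent parser, in Python rather than his Java, for a tiny Scheme-like expression language. The grammar and the names are mine, chosen for illustration.

    # Toy grammar:  expr ::= NUMBER | SYMBOL | "(" expr* ")"

    def tokenize(source):
        # Split a Scheme-like expression into parentheses, numbers, and symbols.
        return source.replace("(", " ( ").replace(")", " ) ").split()

    def parse_expr(tokens, pos=0):
        # Recursive descent: parse one expression starting at tokens[pos],
        # returning (tree, next_position).
        token = tokens[pos]
        if token == "(":
            children = []
            pos += 1
            while tokens[pos] != ")":
                child, pos = parse_expr(tokens, pos)
                children.append(child)
            return children, pos + 1      # skip the closing ")"
        if token == ")":
            raise SyntaxError("unexpected ')'")
        try:
            return int(token), pos + 1    # a number literal
        except ValueError:
            return token, pos + 1         # a symbol

    if __name__ == "__main__":
        tree, _ = parse_expr(tokenize("(+ 1 (* 2 3))"))
        print(tree)                       # ['+', 1, ['*', 2, 3]]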

If I understand the paper correctly, de Oliveira Guimarães teaches these examples, followed by five examples implemented using a parser generator, in the first eight weeks of a sixteen-week course. The second half of the course teaches the theory that underlies compiler construction. This seems to be a critical feature of a by-example course: The students study examples first and only then learn theory -- but now in the context of the code that they have already seen.

One thing that makes this course different than mine is that students do not implement a compiler during this semester. Instead, they take a second course in which they do a traditional compiler project. I don't have this luxury -- I'm lucky to be able to teach one compiler course at all, let alone a two-semester sequence. I also feel so strongly about the power of students working on a major project while learning about compilers that I'm not sure how I feel about the examples-and-theory-first course. But what a neat idea! I look forward to studying de Oliveira Guimarães's code in order to understand his course better. (He has all of the code on-line, along with the beginnings of a descriptive "textbook" for the course, translated into English.)

In some ways, I suppose that what I do here bears some resemblance to this approach.

In the first semester, students study programming languages using an interpreter-based approach, which gives them a basic understanding of techniques for processing programs. They see several small program processors and implement a few others, in order to understand these techniques. They then take the compiler course, where we "go deep" both in implementation and theory. But there is no question that their compiler project requires some of the theory we study that semester.

At the beginning of the second semester, before students begin work on their project and before I introduce any theory of lexical or syntax analysis, I spend two sessions presenting a whole compiler. This compiler is my implementation of a variant of a compiler described by Fischer, LeBlanc, and Cytron. I present this example first in order for students to see all of the phases of a compiler working in concert to translate a program. Because it comes so early in the semester, it is necessarily a simple example, but it does give me and the students a reference point for most of what we see later in the term.

This is only one example, but I make incremental refinements to it as we go along. As we learn new techniques for each phase of the compiler, I try to plug a new component into the reference compiler that demonstrates the theory we have just learned. For example, my initial example uses an ad hoc scanner and a recursive descent parser; later, I plug in a regular expression-based scanner and a table-driven parser. Again, the language compiled by the reference program is pretty simple, so these new components are still quite simple, but I hope they give the students some grounding before they implement more complex versions in their project.
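This isn't the code from my course, but here is a minimal sketch in Python of the kind of regular expression-based scanner that can be plugged in to replace an ad hoc, character-by-character one. The token set is made up for illustration.

    import re

    # A made-up token set for a small language, one named group per token class.
    TOKEN_SPEC = [
        ("NUMBER", r"\d+"),
        ("IDENT",  r"[A-Za-z_]\w*"),
        ("OP",     r"[+\-*/=()]"),
        ("SKIP",   r"\s+"),
        ("ERROR",  r"."),
    ]
    MASTER_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

    def scan(source):
        # Yield (token_class, lexeme) pairs, skipping whitespace.
        for match in MASTER_RE.finditer(source):
            kind = match.lastgroup
            if kind == "SKIP":
                continue
            if kind == "ERROR":
                raise SyntaxError(f"unexpected character {match.group()!r}")
            yield kind, match.group()

    if __name__ == "__main__":
        print(list(scan("x1 = (3 + 4) * y")))
        # [('IDENT', 'x1'), ('OP', '='), ('OP', '('), ('NUMBER', '3'), ...]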

All that said, de Oliveira Guimarães's approach takes the idea of examples to another level. I look forward to digging into it further and seeing what I can learn from it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 03, 2008 3:25 PM

New Year, Old Topics

I have had a more relaxing break than last year. With no traveling and only an occasional hour or two in my office knocking around, not working, my mind has had a chance to clear out a bit. This is a good way to start the year.

I've even managed not to do much professional reading these past two weeks. About all I've read is a stack of old MacWorld magazines. Every time I or my department buys a Mac, I receive a complimentary 6-month subscription. I like to read about and try out all sorts of software, and this magazine is full of links. But I don't take time to read all of the issues as they roll in, which explains why I still had a couple of issues from late 2005 in my stack! (My favorite new app from this expedition is CocoThumbX.)

New Year's Eve did bring in my mail the latest issue of inroads, the periodic bulletin of SIGCSE. Rather than add it to my stack of things to read, I decided to browse through it while watching college football on TV. I found a few items of value to my work. In the first twenty pages or so, I encountered several short articles that set the stage for ongoing discussion in the new year of issues that have been capturing mind share among CS educators for the last year or so.

(Unfortunately, the latest issue of inroads is not yet available in the ACM Digital Library, so I'll have to add later the links to the articles discussed below.)

First was Bill Manaris's Dropping CS Enrollments: Or The Emperor's New Clothes, which claims that the drop in CS enrollments may well be a result of the switch to C++, Java, and object-oriented programming in introductory courses. He includes a form of disclaimer late in his article by saying that his argument "is about the possibility that there is no absolute best language/paradigm for CS1". This is now an almost standard disclaimer in papers about the choice of language and paradigm for introductory CS courses, perhaps as an attempt to avoid the contention of the debates that swirl around the topic. But the heart of Manaris's article is a claim that what most of us do in CS1 these days is not good, disclaimer notwithstanding. That's okay; ideas are meant to be expressed, examined, and evaluated.

Manaris backs his position from an interesting angle: the use of usability to judge programming languages. He quotes one of Jakob Nielsen's papers:

On the Web, usability is a necessary condition for survival. If a website is difficult to use, people leave. If the homepage fails to clearly state what a company offers and what users can do on the site, people leave. If users get lost on a website, they leave. If a website's information is hard to read or doesn't answer users' key questions, they leave. Note a pattern here? There's no such thing as a user reading a website manual or otherwise spending much time trying to figure out an interface. There are plenty of other websites available; leaving is the first line of defense when users encounter a difficulty.

Perhaps usability is a necessary condition for CS1 to retain students? Maybe students approach their courses and majors in the same way? They certainly have many choices, and if our first course raises unexpected difficulties, maybe students will exercise their choices.

Manaris moves quickly from this notion to the idea that we should consider the usability of the programming language we adopt for our intro courses. The idea of studying how novices learn a language and using that knowledge to inform language design and choice isn't new. The psychology of programming folks have been asking these questions for many years, and Brad Myers's research group at CMU has published widely in this area. But Manaris is right that not enough folks take this kind of research into account when we think about CS1-2. And when we do discuss the ideas of usability and learnability, we usually rely on our own biases and interests as "evidence".

Relying more on usability studies of programming languages would be a good thing. But they need to be real studies, not just more of the same old "here's why I think my favorite language (or paradigm) is best..." Unfortunately, both of the examples Manaris gives in his article are the sort we see too often in SIGCSE circles: anecdotal reports of personal experiences, which are unavoidably biased toward one person's knowledge and preferences. In one, he tells of his experience coding some task in languages A and B and reports that he had "9 compiler errors, one semantic error, and one 'headache' error" in A. (Hmm, could A be Java?) But I wonder if even some of my CS1 students would encounter these same difficulties; the good ones can be pretty good.

In the other, a single evaluator collects data from his own experiences solving a simple task in several different programming environments. I believe in reporting quantitative results, but quantitative data from one individual's experience is of limited value. How many introductory CS1 students would feel the same? Or intro instructors? We are the ones who usually make claims about languages and paradigms based almost solely on our own experiences and the experiences of like-minded colleagues. (Guess what? The folks I meet with at the OOPSLA educators' symposium find OOP to be a great way to teach CS1. Shocking!)

And we always need to keep in mind the difference between essential and gratuitous complexity. I am often reminded of Alan Kay's story about learning to play violin. Of course it's hard to learn. But the payoff is huge.

To be fair, Manaris is writing an editorial, not a research paper, so his examples can be taken as hints, not exact recommendations. When conducting usability studies of languages, we need to be sure to seek answers to several different questions. What works best for the general population of students? What helps the weakest students learn as much of value as they can? What helps the strongest students come to see and appreciate -- even love -- the deep ideas and power of our discipline?

Manaris closes his paper with an interesting claim:

In the minds of our beginning students, the programming language/paradigm we expose them to in CS1 is computer science.

A good CS education should help students overcome this limited mindset as soon as possible, but for students who are living through an introduction to computing, this limitation is reality. Most importantly, for students who never go beyond CS1 -- and this includes students who might have gone on but who have an unsatisfying experience and leave -- it is a reality that defines our discipline for much of the rest of the world.

This idea leads nicely into Henry Walker's column, What Image Do CS1/CS2 Present to Our Students? a few pages later in the issue. Walker contrasts the excitement many practitioners and instructors express about computing with the reality of most introductory courses for our students, which are often inward-looking and duller than they should be. We define our discipline for students in the paradigm and language we teach them, but also in many other ways: in the approach to programming we model, in the kinds of assignments we give, and in the kinds of exam questions we ask. We also define it in the ideas we expose them to. Is computing "just programming"? If that's all we show students, then for them it may well be. What ideas do we talk about in class? What activities do we ask students to do? How much creativity do we allow them?

CS1 can't be and do everything, but it should be something more than just a programming language and an arcane set of rules for formatting and commenting code.

Finally, this idea leads naturally into Owen Astrachan's Head in the Clouds only two pages later. In the last few years, Owen has become an evangelist for the position that computing education has to be grounded in real problems -- not the Towers of Hanoi, but problems that real people (not computer scientists) solve in the world as a part of doing their jobs and living their lives. This is a way to get out of the inward-looking mindset that dominates many of our intro courses -- by designing them to look outward at the world of problems just waiting to be attacked in a computational way.

Owen also has been railing against the level of discourse in which CS educators partake, the sort of discourse that asks "What language should I use in CS1?" rather than "How can I help my students use computing as a tool to solve problems?" In that sense, he may not have much interest in Manaris's article, though he may appreciate the fact that the article seeks to put our focus on how the students who take our courses learn a language, rather than on the language itself.

I think that we should still start from courses designed in a meaningful context. There is a lot of power in context, both for course design and for motivation. Besides, working from a problem-driven focus gives us an interesting opportunity for evaluating the effects of features such as a language's usability. Consider Mark Guzdial's media computation approach. His team has developed course materials for complete CS1 courses in two languages, Python and Java (which are, not coincidentally, two of the languages that Manaris discusses in his article). These materials have been developed by the creators of the approach, with equal care and concern for the success of students using the approach. Python is in many ways the simpler language, so it will be interesting to see whether students find one version of the course more attractive than the other. I have taught media comp in Java, as has a colleague, and this semester he will teach it using Python. While preparing for the semester, he has already commented on some of the ways in which Python "gets out of the way" and lets the class get down to the business of computing with images and sounds sooner. But that is early anecdote; what will students think and learn?
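For readers who haven't seen media computation, the spirit of an early exercise looks something like the sketch below. This is my own version using the Pillow imaging library, not the course's actual support code, and the filenames are placeholders.

    from PIL import Image    # Pillow, a third-party imaging library

    def to_grayscale(filename):
        # Replace each pixel with its average brightness -- a classic first
        # media computation exercise.
        img = Image.open(filename).convert("RGB")
        pixels = img.load()                  # direct pixel access
        width, height = img.size
        for x in range(width):
            for y in range(height):
                r, g, b = pixels[x, y]
                gray = (r + g + b) // 3
                pixels[x, y] = (gray, gray, gray)
        return img

    if __name__ == "__main__":
        to_grayscale("photo.jpg").save("photo-gray.jpg")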

I hope that 2008 finds the CS education community asking the right questions and moving toward answers.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 21, 2007 4:05 PM

A Panoply of Languages

As we wind down into Christmas break, I have been thinking about a lot of different languages.

First, this has been a week of anniversaries. We celebrated the 20th anniversary of Perl (well, as much as that is a cause for celebration). And BBC World ran a program celebrating the 50th birthday of Fortran, in the same year that we lost John Backus, its creator.

Second, I am looking forward to my spring teaching assignment. Rather than teaching one 3-credit course, I will be teaching three 1-credit courses. Our 810:151 course introduces our upper-division students to languages that they may not encounter in their other courses. In some departments, students see a smorgasbord of languages in the Programming Languages course, but we use an interpreter-building approach to teach language principles in that course. (We do expose them to functional programming in Scheme in that course, but in more depth than a survey allows.)

I like that way of teaching Programming Languages, but I also miss the experience of exposing students to the beauty of several different languages. In the spring, I'll be teaching five-week modules on Unix shell programming, PHP, and Ruby. Shell scripting and especially Ruby are favorites of mine, but I've never had a chance to teach them. PHP was thrown in because we thought it would appeal to students interested in web development. These courses will be built around small and medium-sized projects that explore the power and shortcomings of each language. This will be fun for me!

As a result, I've been looking through a lot of books, both to recommend to students and to find good examples. I even did something I don't do often enough... I bought a book, Brian Marick's Everyday Scripting with Ruby. Publishers send exam copies to instructors who use a text in a course, and I'm sent many, many others to examine for course adoption. In this case, though, I really wanted the book for myself, irrespective of using it in a course, so I decided to support the author and publisher with my own dollars.

Steve Yegge got me to thinking about languages, too, in one of his recent articles. The article is about the pitfalls of large code bases but, while I may have something to say about that topic in the future, what jumped out to me while reading this week were two passages on programming languages. One mentioned Ruby:

Java programmers often wonder why Martin Fowler "left" Java to go to Ruby. Although I don't know Martin, I think it's safe to speculate that "something bad" happened to him while using Java. Amusingly (for everyone except perhaps Martin himself), I think that his "something bad" may well have been the act of creating the book Refactoring, which showed Java programmers how to make their closets bigger and more organized, while showing Martin that he really wanted more stuff in a nice, comfortable, closet-sized closet.

For all I know, Yegge's speculation is spot on, but I think it's safe to speculate that he is one of the better fantasy writers in the technical world these days. His fantasies usually have a point worth thinking about, though, even when they are wrong.

This is actually the best piece of language advice in the article, taken at its most general level and not a slam at Java in particular:

But you should take anything a "Java programmer" tells you with a hefty grain of salt, because an "X programmer", for any value of X, is a weak player. You have to cross-train to be a decent athlete these days. Programmers need to be fluent in multiple languages with fundamentally different "character" before they can make truly informed design decisions.

We tell our students that all the time, and it's one of the reasons I'm looking forward to three 5-week courses in the spring. I get to help a small group of our undergrads cross-train, to stretch their language and project muscles in new directions. That one of the courses helps them to master a basic tool and another exposes them to one of the more expressive and powerful languages in current use is almost a bonus for me.

Finally, I'm feeling the itch -- sometimes felt as a need, other times purely as desire -- to upgrade the tool I use to do my blogging. Should I really upgrade to a newer version of my current software? (v3.3 >> v2.8...) Should I hack my own upgrade? (It is 100% shell script...) Should I roll my own, just for the fun of it? (Ruby would be perfect...) Language choices abound.

Merry Christmas!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

December 13, 2007 2:47 PM

Computing in Yet Another Discipline

Last month there was lots of talk here about how computing changes science. In that discussion I mentioned economics and finance as other disciplines that will be fundamentally different in the future as computation and massive data stores affect methodology. Here is another example, in a social science -- or perhaps the humanities, depending on your viewpoint.

Sergei Golitsinski is one of my MS students in Computer Science. Before catching the CS bug, via web design and web site construction, he had nearly completed an MA in Communication Studies (another CS!). For the last year or more, he has been at the thesis-writing stage in both departments. He finally defended his MA thesis yesterday afternoon.

The title of his thesis is "Significance of the General Public for Public Relations: A Study of the Blogosphere's Impact on the October 2006 Edelman/Wal-Mart Crisis". You'll have to read the thesis for the whole story, but here's a quick summary.

In October 2006, Edelman started a blog "as a publicity stunt on behalf of Wal-Mart" yet claimed it to be "an independent blog maintained by a couple traveling in their RV and writing stories about happy Wal-Mart employees". Eventually, bloggers got hold of the story and ran with it, creating a fuss that resulted in "significant negative consequences" for Edelman. Sergei collected data from these blogs and their comments, studied the graph of relationships among them, and argued that actions of the "general public" accounted for the effects felt by Edelman. This is significant in the PR world, it seems, because the PR world largely believes that the "general public" either does not exist or is insignificant. Only specific publics defined as coherent segments are able to effect change.

Sergei used publicly-available data from the blogosphere to drive an empirical study to support his claim that "new communication technologies have given the general public the power to cause direct negative consequences for organizations". Collecting data for an empirical study is not unusual in the Communication Studies world, but collecting it on this scale is unusual, and using computer programs to study the data as a highly-interconnected graph is rarer still.
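I don't know the details of Sergei's programs, but the core computational move -- treating who-links-to-whom as a directed graph -- is easy to sketch in plain Python. The post names below are made up; in the real study the pairs would come from crawled blog posts and comments.

    from collections import defaultdict

    # Hypothetical (source_post, linked_post) pairs harvested from the blogosphere.
    links = [
        ("blogA/edelman-flog", "edelman/statement"),
        ("blogB/walmart-rv",   "blogA/edelman-flog"),
        ("blogC/pr-ethics",    "blogA/edelman-flog"),
        ("blogC/pr-ethics",    "edelman/statement"),
    ]

    outgoing = defaultdict(set)    # post -> posts it links to
    incoming = defaultdict(set)    # post -> posts that link to it
    for src, dst in links:
        outgoing[src].add(dst)
        incoming[dst].add(src)

    # A simple graph-theoretic question: which posts attracted the most inbound links?
    for post in sorted(incoming, key=lambda p: len(incoming[p]), reverse=True):
        print(post, len(incoming[post]))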

I was not on this thesis committee, but I did ask a question at his defense: Does this type of research make it possible to ask questions in communication studies that were heretofore not askable? I suspected that the answer was yes but hoped to hear some specific examples. I also hoped that these examples would help the Comm Studies folks in the room to see what computation will do to at least part of their discipline. His answer was okay, but not a grand slam; to be fair, I'm sure I caught him a bit off-guard.

From the questions that the Comm Studies committee members asked and the issues they discussed (mostly on the nature of "publics"), it wasn't clear whether they quite understood the full implication of the kind of work Sergei did. It will change how research in their field is done, from statistical analyses of relatively small data sets to graph-theoretic analyses of large data sets. Computational research will make it possible to ask entirely new questions -- both ones that were askable before but not feasible to answer and ones that would not have been conceived before. This isn't news -- Duncan Watts's Small World Project is seminal work in this area -- but the time is right for this kind of work to explode into the mainstream.

What's next for Sergei? He has tried a Ph.D. program in Computer Science and found it not to his liking; it seemed too inward-looking, focused on narrow mathematics and not big problems. He may well stay in Communications and pursue a Ph.D. there. As I've told him, he could be part of the vanguard in that discipline, helping to revolutionize methodology and ask some new and deeply interesting questions there.


Posted by Eugene Wallingford | Permalink | Categories: Computing

December 12, 2007 1:02 PM

Not Your Father's Data Set

When I became department head, I started to receive a number of periodicals unrequested. One is Information Week. I never read it closely, but I usually browse at least the table of contents and perhaps a few of the news items and articles. (I'm an Apple/Mac/Steve Jobs junkie, if nothing else.)

The cover story of the December 10 issue is on the magazine's 2007 CIO of the Year, Tim Stanley of Harrah's Entertainment. This was an interesting look into a business for which IT is an integral component. One particular line grabbed my attention, in a sidebar labeled "Data Churn":

Our data warehouse is somewhat on the order of 20 Tbytes in terms of structure and architecture, but the level of turn of that information is many, many, many times that each day.

The database contains information on 42 million customers, and it turns over multiple tens of terabytes of data a day.

Somehow, teaching our students to work with data sets of 10 or 50 or 100 integers or strings seems positively 1960s. It also doesn't seem to be all that motivating an approach for students carrying iPods with megapixel images and gigabytes of audio and video.

An application with 20 terabytes of data churning many times over each day could serve as a touchstone for almost an entire CS curriculum, from programming and data structures to architecture and programming languages, databases and theory. As students learn how to handle larger problems, they see how much more they need to learn in order to solve common IT problems of the day.

sample graph of open data set, of Chinese food consumption

I'm not saying that we must use something on the order of Harrah's IT problem to do a good job or to motivate students, but we do need to meet our students -- and potential students -- closer to where they live technologically. And today we have so many opportunities -- Google, Google Maps, Amazon, Flickr, ... These APIs are plenty accessible to more advanced students. They might need a layer of encapsulation to be suitable for beginning students; that's something a couple of us have worked on occasionally at ChiliPLoP. But we all have even more options available these days, as data-sharing sites à la Flickr become more common. (See, for example, Swivel; that's where I found the graph shown above, derived from data available at the USDA's Economic Research Service website.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 11, 2007 1:46 PM

Thoughts While Killing Time with the Same Old Ills

I have to admit to not being very productive this morning, just doing some reading and relaxing. There are a few essential tasks to do yet this finals week, some administrative (e.g., making some adjustments to our teaching schedule based on course enrollments) and some teaching (e.g., grading compiler projects and writing a few benchmark programs). But finals week is a time when I can usually afford half a day to let my mind float aimlessly in the doldrums.

Of course, by "making some adjustments to our teaching schedule based on course enrollments", I mean canceling a couple of classes due to low enrollments and being sure that our faculty have other teaching or scholarly assignments for the spring. The theme of low enrollments is ongoing even as we saw a nontrivial bump in the number of majors last fall. Even if a trend develops in that direction, we still have to deal with smaller upper-division enrollments from the small incoming classes of the recent past.

Last month, David Chisnall posted an article called Is Computer Science Dying?, which speculates on the decline in majors, the differences between CS and software development, and the cultural change that has turned prospective students' interests to other pursuits. There isn't anything all that new in the article, but it's a thoughtful piece of the sort that is, sadly, all too common these days. At least he is hopeful about the long-term viability of CS as an academic discipline, though he doesn't have much to say about how CS and its applied professional component might develop together -- or apart.

In that regard, I like several of Shriram Krishnamurthi's theses on undergraduate CS. When he speaks of the future -- from a vantage point of two years ago -- he recommends that CS programs admit enough flexibility that they can evolve in ways that we see make sense.

(I also like his suggestion that we underspecify the problems we set before students:

Whether students proceed to industrial positions or to graduate programs, they will have to deal with a world of ambiguous, inconsistent and flawed demands. Often the difficulty is in formulating the problem, not in solving it. Make your assignments less clear than they could be. Do you phrase your problems as tasks or as questions?

This is one of the ways students learn best from big projects!)

Shriram also mentions forging connections between CS and the social sciences and the arts. One group of folks who is doing that is Mark Guzdial's group at Georgia Tech, with their media computation approach to teaching computer science. This approach has been used enough at enough different schools that Mark now has some data on how well it might help to reverse the decline in CS enrollments, especially among women and other underrepresented groups. As great as the approach is, the initial evidence is not encouraging: "We haven't really changed students' attitudes about computer science as a field." Even students who find that they love to design solutions and implement their designs in programs retain the belief that CS is boring. Students who start a CS course with a favorable attitude toward computing leave the university with a favorable attitude; those who start with an unfavorable attitude leave with the same.

Sigh. Granted, media computation aims at the toughest audience to "sell", the folks most likely to consider themselves non-technical. But it's still sad to think we haven't made headway at least in helping them to see the beauty and intellectual import of computing. Mark's not giving up -- on computing for all, or on programming as a fundamental activity -- and neither am I. With this project, the many CPATH projects, Owen Astrachan's Problem Based Learning in Computer Science project, and so many others, I think we will make headway. And I think that many of the ideas we are now pursuing, such as domain-specific applications, problems, and projects, are on the right path.

Some of us here think that the media computation approach is one path worth pursuing, so we are offering a CS1/CS2 track in media computation beginning next semester. This will be our third time, with the first two being in the Java version. (My course materials, from our first media comp effort, are still available on-line.) This time, we are offering the course in Python -- a decision we made before I ended up hearing so much about the language at the SECANT workshop I reported on last month. I'm an old Smalltalk guy, and a fan of Scheme and Lisp, who likes the feel of a small, uniform language. We have been discussing the idea of using a so-called scripting language in CS1 for a long time, at least as one path into our major, and the opportunity is right. We'll see how it goes...

The secret to happiness is low expectations.
-- Barry Schwartz

In addition to reading this morning, I also watched a short video of Barry Schwartz from TED a couple of years ago. I don't know how I stumbled upon a link to such an old talk, but I'm glad I did. The talk was short, colorful, and a succinct summary of the ideas from Schwartz's oft-cited The Paradox of Choice. Somehow, his line about low expectations seemed a nice punctuation mark to much of what I was thinking about CS this morning. I don't feel negative when I think this, just sobered by the challenge we face.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

December 07, 2007 4:41 PM

At the End of Week n

Classes are over. Next week, we do the semiannual ritual of finals week, which keeps many students on edge while at the same time releasing most of the tension in faculty. The tension for my compiler students will soon end, as the submission deadline is 39 minutes away as I type this sentence.

The compiler course has been a success several ways, especially in the most important: students succeeded in writing a compiler. Two teams submitted their completed programs earlier this week -- early! -- and a couple of others have completed the project since. These compilers work from beginning to end, generating assembly language code that runs on a simple simulated machine. Some of the language design decisions contributed to this level of success, so I feel good. (And I already know several ways to do better next time!)

I've actually wasted far too much time this week writing programs in our toy functional language, just because I enjoy watching them run under the power of my students' compilers.

More unthinkable: There is a greater-than-0% chance that at least one team will implement tail call optimization before our final exam period next week. They don't have an exam to study for in my course -- the project is the reason we are together -- so maybe...

In lieu of an exam, we will debrief the project -- nothing as formal as a retrospective, but an opportunity to demo programs, discuss their design, and talk a bit about the experience of writing such a large, non-trivial program. I have never found or made the time to do this sort of studio work during the semester in the compilers course, as I have in my other senior project courses. This is perhaps another way for me to improve this course next time around.

The end of week n is a good place to be. This weekend holds a few non-academic challenges for me: a snowy 5K with little hope for the planned PR and my first performances in the theater. Tonight is opening night... which feels as much like a scary final exam as anything I've done in a long time. My students may have a small smile in their hearts just now.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

November 27, 2007 5:45 PM

Comments on "A Program is an Idea"

Several folks have sent interesting comments on recent posts, especially A Program is an Idea. A couple of comments on that post, in particular, are worth following up on here.

Bill Tozier wrote that learning to program -- just program -- is insufficient.

I'd argue that it's not enough to "learn" how to program, until you've learned a rigorous approach like test-driven development or behavior-driven development. Many academic and scientific colleagues seem to conflate software development with "programming", and as a result they live their lives in slapdash Dilbertesque pointy-haired boss territory.

This is a good point, and goes beyond software development. Many, many folks, and especially university students, conflate programming with computer science, which disturbs academic CS folks to no end, and for good reason. Whatever we teach scientists about programming, we must teach it in a broader context, as a new way of thinking about science. This should help ameliorate some of the issues with conflating programming and computer science. The need for this perspective is also one of the reasons that the typical CS1 course isn't suitable for teaching scientists, since its goals relative to programming and computer science are so different.

I hadn't thought as much about Bill's point of conflating programming with software development. This creates yet another reason not to use CS1 as the primary vehicle for teaching scientists to ("program" | "develop software"), because in this regard the goals of CS1 for computer science majors differ even more from the goals of a course for scientists.

Indeed, there is already a strong tension between the goals of academic computer science and the goals of professional software development that makes the introductory CS curriculum hard to design and implement. We don't want to drag that mess into the design of a programming course for scientists, or economists, or other non-majors. We need to think about how best to help the folks who will not be computer scientists learn to use computation as a tool for thinking about, and doing work in, their own discipline.

And keep in mind that most of these folks will also not be professional software developers. I suspect that folks who are not professional software developers -- folks using programs to build models in their own discipline -- probably need to learn a different set of skills to get started using programming as a tool than professional software developers do. Should they ever want to graduate beyond a few dozen lines of code, they will need more and should then study more about developing larger and longer-lived programs.

All that said, I think that scientists as a group might well be more amenable than many others to learning test-driven development. It is quite scientific in spirit! Like the scientific method, it creates a framework for thinking and doing that can be molded to the circumstances at hand.

The other comment that I must respond to came from Mike McMillan. He recalled having run across a video of Gerald Sussman giving a talk on the occasion of Daniel Friedman's 60th birthday, and hearing Sussman comment on expressing scientific ideas in Scheme. This brought to his mind Minsky's classic paper "Why Programming Is a Good Medium for Expressing Poorly-Understood and Sloppily-Formulated Ideas", which is now available on-line in a slightly revised form.

This paper is the source of the oft-quoted passage that appears in the preface of Structure and Interpretation of Computer Programs:

A computer is like a violin. You can imagine a novice trying first a phonograph and then a violin. The latter, he says, sounds terrible. That is the argument we have heard from our humanists and most of our computer scientists. Computer programs are good, they say, for particular purposes, but they aren't flexible. Neither is a violin, or a typewriter, until you learn how to use it.

(Perhaps Minsky's paper is where Alan Kay picked up his violin metaphor and some of his ideas on computation-as-medium.)

Minsky's paper is a great one, worthy of its own essay sometime. I should certainly have mentioned it in my essay on scientists learning to program. Thanks to Mike for the reminder! I am glad to have had reason to track down links to the video and the PDF version of the paper, which until now I've only had in hardcopy.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 24, 2007 5:43 AM

A Program is an Idea

Recently I claimed that scientists should learn to program. Why? I think that there are reasons beyond the more general reasons that others should learn.

As more and more science becomes computational, or relies on computation in some fundamental way, scientists will build models of their theories using programs. There are, of course, many modeling tools to support this activity. But a scientist should understand something about how these tools work in order to understand better the models built with them. Scientists may not need to understand their software tools as well as the computer scientists who build them, but they should probably understand them better than my grandma knows how her e-mail tool works.

Actually, scientists may want to know a lot about their tools. My physicist and chemist friends talk and write often about how they build their own instruments and mock up equipment for special-purpose experiments. There is great power in building the instruments one uses -- something I've written about as a computer scientist writing his own tools. Then again, my scientist friends don't blow their own glass or make their own electrical wire, so there is some point at which the process of tool-building bottoms out. I'm not sure we have a very good sense where the process of a scientist writing her own software will bottom out just yet.

Being able to write code, even if only short scripts, puts a lot of power into the scientist's hands. It is the power to express an idea. And that's what programs are: ideas. I read that line in an article by Paul Graham about how programmers work best. But the crux of his whole argument comes down to that one line: a program is an idea.

As a scientist creates an idea, explores it, shapes it, and communicates it, a program can be the evolving idea embodied -- concrete, unambiguous, portable. That is part of what I inferred from the presentations of earth scientist Noah Diffenbaugh and physics educators Ruth Chabay and Bruce Sherwood at the SECANT workshop. Programs are part of how the scientist works and part of how the scientist communicates the result.

Reader Darius Bacon sent me some neat comments in response to my Programming Scientists entry on just this point. Darius expressed frustration with the way that many scientific papers formally explain their experiments and results. "Why don't you just *write a program*?" he asks. He agrees that scientists especially should learn how to program, because it may help them learn to communicate their ideas better. I look forward to him saying more about this in his blog.

A scientist can communicate an idea to others with a program, but they can also think better with a program. Graham captured this from the perspective of the software developer in the article I quoted earlier:

Your code is your understanding of the problem you're exploring.

Every programmer knows what this means. I can think all the Big Thoughts I like, but until I have a working program, I'm never quite certain that these thoughts make sense. Writing a program forces me to specify and clarify; it rejects fuzziness and incoherence. As my program grows and evolves, it reflects the growth and evolution of my idea. Graham says it more strongly: the program is the growth and evolution of my idea. Whether this is truth or merely literary device, thinking of the program as the idea is a useful mechanism for holding myself accountable to making my idea clear enough to execute.

This is one of the senses in which Darius Bacon's notion of program-as-explanation works. A program might be clearer than the prose, "and even if it isn't at first, at least I could reverse-engineer it, because it has to be precise enough to execute" (my emphasis). And what helps us communicate better can also help us to think better, because the program in which we embody our thoughts has to be precise enough to execute.

I know this sounds like a lot of high-falutin' philosophizing. But I think it is one of the Big Ideas that computing has given the world. It's one of the reasons that teaching everyone a little programming would be good.

There are practical reasons for learning to program, too, and I think we can be more practical and concrete in saying why programming is especially important to scientists. In science, the 20th century was the century of the mathematical formula. The ideas we explored, and the level at which we understand them, were in most ways expressible with small systems of equations. But as we tackle more complex problems and try to understand more complex systems, individual formulae become increasingly unsatisfactory as models. The 21st century will be the century of the algorithm! And the best way to express and test an algorithm is... write a program.
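A toy illustration of the difference, entirely my own and not from any of the SECANT talks: unconstrained growth has a tidy closed-form formula, but add a resource limit and the natural way to express the model is as a program that steps it forward in time. (This particular variant still has a closed form; the point is that the program keeps working as the model grows messier.)

    import math

    def exponential(p0, r, t):
        # 20th-century style: a closed-form formula, P(t) = P0 * e^(r*t).
        return p0 * math.exp(r * t)

    def logistic_steps(p0, r, capacity, steps, dt=0.1):
        # 21st-century style: the model as an algorithm, stepped forward in time.
        # Euler steps of dP/dt = r * P * (1 - P/K).
        p = p0
        history = [p]
        for _ in range(steps):
            p += r * p * (1 - p / capacity) * dt
            history.append(p)
        return history

    if __name__ == "__main__":
        print(exponential(100, 0.03, 50))               # evaluate the formula once
        print(logistic_steps(100, 0.03, 1000, 500)[-1]) # run the algorithm forward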


Posted by Eugene Wallingford | Permalink | Categories: Computing

November 21, 2007 3:28 PM

Small World

the cover of David Lodge's 'Small World'

Recently I mentioned the big pharmaceutical company Eli Lilly in an entry on the next generation of scientists, because one of its scientists spoke at the SECANT workshop I was attending. I have some roundabout personal connections to Lilly. It is based in my hometown.

When I was in high school and had moved to a small town in the next county, I used to go with some adult friends to play chess at the Eli Lilly Chess Club, which was the only old-style corporate chess club of its kind that I knew of. (Clubs like it used to exist in many big cities in the 19th and early 20th centuries. I don't know how common they are these days. The Internet has nearly killed face-to-face chess.) I recall quite a few Monday nights losing quarters for hours while playing local masters at speed chess, at 1:30-vs-5:00 odds!

Coincidentally, my high school hometown was also home to a Lilly Research Laboratories facility, which does work on vaccines, toxins, and agricultural concerns. Parents of several friends worked there, in a variety of capacities. When I was in college, I went out on a couple of dates with a girl from back home. Her father was a research scientist at Lilly in Greenfield. (A quick google search on his name even uncovers a link to one of his papers.) He is the sort of scientist that Kumar, our SECANT presenter, works with at Lilly. Interesting connection.

But I can go one step further and bring this even closer to my professional life these days. My friend's last name was Gries. It turns out that her father, Christian Gries, is brother to none other than distinguished computer scientist David Gries. I've mentioned Gries a few times in this blog and even wrote an extended review of one of his classic papers.

I don't think I was alert enough at the time to be sufficiently impressed that Karen's uncle was such a famous computer scientist. In any case, hero worship is hardly the basis for a long-term romantic relationship. Maybe she was wise enough to know that dating a future academic was a bad idea...


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

November 20, 2007 4:30 PM

Workshop 5: Wrap-Up

[A transcript of the SECANT 2007 workshop: Table of Contents]

The last bit of the SECANT workshop focused on how to build a community at this intersection of CS and science. The group had a wide-ranging discussion which I won't try to report here. Most of it was pretty routine and would not be of interest to someone who didn't attend. But there were a couple of points that I'll comment on.

On how to cause change.     At one point the discussion turned philosophical, as folks considered more generally how one can create change in a larger community. Should the group try to convince other faculty of the value of these ideas first, and then involve them in the change? Should the group create great materials and courses first and then use them to convince other faculty? In my experience, these don't work all that well. You can attract a few people who are already predisposed to the idea, or who are open to change because they do not have their own ideas to drive into the future. But folks who are predisposed against the idea will remain so, and resist, and folks who are indifferent will be hard to move simply because of inertia. If it ain't broke, don't fix it.

Others shared these misgivings. Ruth Chabay suggested that perhaps the best way to move the science community toward computational science is by producing students who can use computation effectively. Those students will use computation to solve problems. They will learn more deeply. This will catch the eye of other instructors. As a result, these folks will see an opportunity to change how they teach, say, physics. We wouldn't have to push them to change; they would pull change in. Her analogy was to the use of desktop calculators in math, chemistry, and physics classes in the 1970s and 1980s. Such a guerilla approach to change might work, if one could create a computational science course good enough to change students and attractive enough to draw students to take it. This is no small order, but it is probably easier than trying to move a stodgy academic establishment with brute force.

On technology for dissemination.     Man, does the world change fast. Folks talked about Facebook and Twitter as the primary avenues for reaching students. Blogs and wikis were almost an afterthought. Among our students, e-mail is nearly dead, only 20 years or so after it began to enter the undergraduate mainstream. I get older faster than the calendar says because the world is changing faster than the days are passing.

Miscellaneous.     Purdue has a beautiful new computer science building, the sort of building that only a large, research school can have. What we might do with a building at an appropriate scale for our department! An extra treat for me was a chance to visit a student lounge in the building that is named for the parents of a net acquaintance of mine, after he and his extended family made a donation to the building fund. Very cool.

I might trade my department's physical space for Purdue CS's building, but I would not trade my campus for theirs. It's mostly buildings and pavement, with huge amounts of auto traffic in addition to the foot traffic. Our campus is smaller, greener, and prettier. Being large has its ups and its downs.

Thanks to a recommendation of the workshop's local organizer, I was able to enjoy some time running on campus. Within a few minutes I found my way to some trails that head out into more serene places. A nice way to close the two days.

All in all, the workshop was well worth my time. I'll share some of the ideas among my science colleagues at UNI and see what else we can do in our own department.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

November 20, 2007 1:28 PM

Workshop 4: Programming Scientists

[A transcript of the SECANT 2007 workshop: Table of Contents]

Should scientists learn to program? This question arose several times throughout the SECANT workshop, and it was an undercurrent to most everything we talked about.

Most of the discipline-area folks at the workshop currently use programming in their courses. Someone pointed out that this can be an attractive feature in an elective science course that covers computational material -- even students in the science disciplines want to learn a tool or skill they know to be marketable. (I am guessing that at least some of the time it may be the only thing that convinces them to take the course!)

Few require a standard programming course from the CS catalog, or a CS-like course heavy in abstraction. That is usually not the most useful skill for the science students to learn. In practice, the scientists need to learn to write only small procedural programs. They don't really need OOP (which was the topic of a previous talk at the workshop), though they almost certainly will be clients of rich and powerful objects.

Python was a popular choice among attendees and is apparently quite popular as a scripting language in the sciences. The Matter and Interactions curriculum in physics, developed by Chabay and Sherwood, depends intimately on simulations programmed -- by physics students -- in VPython, which provides an IDE and modules for fast array operations and some quite beautiful 3-D visualization. I'm not running VPython yet, because it currently requires X11 on my Mac and I've tried to stay native Mac whenever possible. This package looks like it might be worth the effort.
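
To make that concrete, here is the flavor of a small VPython program -- a minimal bouncing-ball sketch of my own, not anything from the Matter and Interactions materials, and it assumes the classic 'visual' module API, so treat the details as approximate:

    from visual import sphere, vector, rate, color

    ball = sphere(pos=vector(0, 5, 0), radius=0.5, color=color.red)
    ball.velocity = vector(0, 0, 0)     # attach a velocity to the object, a common VPython idiom

    dt = 0.01
    while True:
        rate(100)                       # cap the loop at about 100 steps per second
        ball.velocity.y -= 9.8 * dt     # gravity changes the velocity...
        ball.pos += ball.velocity * dt  # ...and the velocity changes the position
        if ball.pos.y < 0.5:            # bounce when the ball reaches the floor
            ball.velocity.y = abs(ball.velocity.y)

All of the physics lives in the loop body, and the library handles the 3-D display. That is exactly the appeal for science students.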

A scripting language augmented with the right libraries seems like a slam-dunk for programmers in this context. Physics, astronomy, and other science students don't want to learn the overhead of Java or the gotchas of C pointers; they want to solve problems. Maybe CS students would benefit from learning to program this way? We are trying a Python-based version of a media computation CS1 in the spring and should know more about how our majors respond to this approach soon. The Java-based CS1 using media computation that we ran last year went well. In the course that followed, we did observe a few gaps that CS students don't usually have after CS1, so we will need to address those in the data structures course that will follow the Python-based offering. But that was to be expected -- programming for CS students is different than programming for end users. Non-computer scientists almost certainly benefit from a scripting language introduction. If they ever need more, they know where to go...

The next question is, should CS faculty teach the programming course for non-CS students? CS faculty almost always say, "Yes!" Someone at the workshop said that otherwise programming will be "taught badly" -- say, BioPerl in a biology course by someone who only knows how to hack Perl. Owen Astrachan dared to ask, "'Taught badly' -- but is it?" The science students learn what they need to solve problems in their lab. CS profs responded, well, but their knowledge will be brittle, and it won't scale, and .... But do the scientists know that -- or even care!? If they ever need to write bigger, less brittle, more scalable programs, they know where to go to learn more.

I happen to believe that science students will benefit by learning to program from computer science professors -- but only if we are willing to develop courses and materials for them. I see no reason to believe that the run-of-the-mill CS1 course at most schools is the right way to introduce non-CS students to programming, and lots of reasons to believe otherwise.

And, yes, I do think that science students should learn how to program, for two reasons. One is that science in the 21st century is science of computation. That was one of the themes of this workshop. The other is that -- deep in my heart -- I think that all students should learn to program. I've written about this before, in terms of Alan Kay's contributions, and I'll write about it again soon. In short, I have at least two reasons for believing this:

  • Computation is a new medium of communication, and one with which we should empower everyone, not just a select few.
  • Computer programming is a singular intellectual achievement, and all educated people should know that, and why.

Big claims to close an entry!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 19, 2007 4:41 PM

Workshop 3: The Next Generation

[A transcript of the SECANT 2007 workshop: Table of Contents]

The highlight for me of the final morning of the SECANT workshop was a session on the "next generation of scientists in the workforce". It consisted of presentations on what scientists are doing out in the world and how computer scientists are helping them.

Chris Hoffman gave a talk on applications of geometric computing in industry. He gave examples from two domains, the parametric design of mechanical systems and the study of the structure and behaviors of proteins. I didn't follow the virus talk very well, but the issue seems to lie in understanding the icosahedral symmetry that characterizes many viruses. The common theme in the two applications is constraint resolution, a standard computational technique. In the design case, the problem is represented as a graph, and graph decomposition is used to create local plans. Arriving at a satisfactory design requires solving a system of constraint equations, locally and globally. A system of constraints is also used to model the features of a virus capsid and then solved to learn how the virus behaves.

Essential to both domains is the notion of a constraint solver. This won't be a surprise to folks who work on design systems, even CAD systems, but the idea that a biologist needs to work with a constraint solver might surprise many. Hoffman's take-home point was that we cannot do our work locked within our disciplinary boundaries, because we don't usually know where the most important connections lie.
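
For readers who have not met a constraint solver, here is a toy illustration of the idea -- my own sketch, nothing like the industrial systems Hoffman described: enforce a single distance constraint between two points in the plane by iterative relaxation. Real geometric solvers decompose a whole graph of such constraints and solve the resulting systems of equations, locally and globally.

    import math

    def satisfy_distance(p, q, target, iterations=50):
        """Move q until the distance from the fixed point p to q equals target."""
        q = list(q)
        for _ in range(iterations):
            dx, dy = q[0] - p[0], q[1] - p[1]
            dist = math.hypot(dx, dy)
            error = dist - target           # positive means the points are too far apart
            ux, uy = dx / dist, dy / dist   # unit vector from p toward q
            q[0] -= error * ux              # pull q along that direction
            q[1] -= error * uy
        return tuple(q)

    print(satisfy_distance((0.0, 0.0), (3.0, 4.0), target=10.0))
    # roughly (6.0, 8.0): same direction as before, but now 10 units from the origin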

The next two talks were by computer scientists working in industry. Both gave wonderful glimpses of how scientists work today and how computer scientists help them -- and are helping them redefine their disciplines.

First was Amar Kumar of Eli Lilly, who drew on his work in bioinformatics with biologists and chemists in drug discovery. He views his primary responsibility as helping scientists interpret their data.

The business model traditionally used by Lilly and other big pharma companies is unsustainable. If Lilly creates 10,000 new compounds, roughly 1,000 will show promise, 100 will be worth testing, and 1 will make it to market. The failure rate, the time required by the process, the cost of development -- all result in an unsustainable model.

By this, Kumar means that Lilly and its scientists must undergo a transformation in how they think about finding candidates and discovering effects. He gave two examples. Biologists must move from "How does this particular gene respond to this particular drug?" to "How do all human genes respond to this panel of 35,000 drugs?" There are roughly 30,000 human genes, which means that the new question produces about 1 billion data points. Similarly, drug researchers must move from "What does Alzheimer's do to the levels of amyloid protein in the brain?" to "When I compare a healthy patient with an Alzheimer's patient, what is the difference in the level of every brain-specific protein over time?" Again, the new question produces a massive number of data points.

Drug companies must ask new kinds of questions -- and design new ways to find answers. The new paradigm shifts the power from pure lab biologists to bioinformatics and statistics. This is a major shift in culture at a place like Lilly research labs. In terms of a table of gene/drug interactions, the first adjustment is from cell thinking (1 gene/1 drug) to column thinking (1 drug/all genes). Ultimately, Kumar believes, the next step -- to grid thinking (m drugs/n genes) and finding patterns throughout -- is necessary, too.

What the bioinformatician can do is to help convert information into knowledge. Kumar said that a friend used to ask him what he would do with infinite computational power. He thinks the challenge these days is not to create more computational power. We already have more data in our possession than we know what to do with. More than more raw power, we need new ways to understand the data that we gather. For example, we need to use clustering techniques more effectively to find patterns in the data, to help scientists see the ideas. Scientists do this "in the small", by hand, but programs can do so much better. Kumar showed an example, a huge spreadsheet with a table of genes crossed with metabolites. Rather than look at the data in the small, he converted the numbers to a heat map so that the scientist could focus on critical areas of relationship. That is a more fruitful way to identify possible experiments than to work through the rows of the table by hand.
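
A small sketch of that kind of transformation, using made-up numbers rather than Kumar's data and assuming the usual numpy and matplotlib libraries:

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    table = rng.normal(size=(200, 40))   # stand-in for a genes-by-metabolites table
    table[50:70, 10:15] += 3.0           # a block of elevated values worth finding

    plt.imshow(table, aspect='auto', cmap='hot')
    plt.xlabel('metabolite')
    plt.ylabel('gene')
    plt.colorbar(label='measured level')
    plt.show()

The hot block jumps out of the image in a way it never would from scrolling through the spreadsheet by hand.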

Kumar suggests that future scientists require some essential computational skills:

  • data integration (across data sets)
  • data visualization
  • clustering
  • translation of problems from one space to another
  • databases
  • software development lifecycle

Do CS students learn about clustering as undergrads? Biologists need to. On the last two items, other scientists usually know remarkably little. Knowing a bit about the software lifecycle will help them work better with computer scientists. Knowing a bit about databases will help them understand the technology decisions the CS folks make. If all you know is a flat text file or maybe a spreadsheet, then you may not understand why it is better to put the data in a database -- and how much better that will support your work.

The second speaker was Bob Zigon from Beckman Coulter, a company that works in the area of flow cytometry. Most of us in the room didn't know that flow cytometry studies the properties of cells as they flow through a liquid. Zigon is a software tech lead for the development of flow cytometry tools. He emphasized that to do his job, he has to act like the lab scientists. He has to learn their vocabulary, how to run their equipment, how to build their instruments, and how to perform experiments. The software folks at Beckman Coulter spend a lot of time observing scientists.

... and students chuckle at me when I tell them psychology, anthropology, and sociology make great minors or double majors for CS students! My experience came in the world of knowledge-based systems, which require a deep understanding of the practice and implicit knowledge of domain experts. Back in the early 1990s, I remember AI researcher John McDermott, of R1/XCON fame, describing how his expert systems team had evolved toward cultural anthropology as the natural next step in their work. I think that all software folks must be able to develop a deep cultural understanding of the domains they work in, if they want to do their jobs well. As software development becomes more and more interdisciplinary, this ability only becomes more important. Whether they learn these skills in the trenches or with some formal training is up to them.

Enjoying this sort of work helps a software developer, too. Zigon clearly does. He and his team implement computations and build interfaces to support the scientists who use flow cytometry to study blood cancer and other health conditions. He gave a great two-minute description of one of the basic processes that I can't do justice to here. First they put blood into a tube that narrows down to the thickness of a hair. The cells line up, one by one. Then the scientists run the blood across a laser beam, which causes the cells to fluoresce. Hardware measures the fluorescent energy, and software digitizes it for analysis. The equipment processes 10k cells/second, resulting in 18 data points for each of anywhere between 1 and 20 million cells.

What do scientists working in this area need? Data management across the full continuum: acquisition, organization, querying, and visualization. Eight years of research data amount to about 15 gigabytes. Eight years of pharmaceutical data reaches 185 GB. And eight years of clinical data is 3 terabytes. Data is king.

Zigon's team moves all the data into relational databases, converting the data into fifth normal form to eliminate as much redundancy as possible. Their software makes the data available to the scientists for online transactional processing and online analytical processing. Even with large data sets and expensive computations, the scientists need query times in the range of 7-10 seconds.

With so much data, the need for ways to visualize data sets and patterns is paramount. In real time, they process 750 MB data sets at 20 frames per second. The biologists would still use histograms and scatter plots as their only graphical representations if the software guys couldn't do better. Zigon and his team build tools for n-dimensional manipulation and review of the data. They also work on data reduction, so that the scientists can focus on subsets when appropriate.

Finally, to help find patterns, they create and implement clustering algorithms. Many of the scientists tend to fall back on k-means clustering, but in highly multidimensional spaces that technique imposes a false structure on the data. They need something better, but the alternatives are O(n²) -- which is, of course, intractable on such large sets. So Zigon needs better algorithms and, contrary to Kumar's case, more computational power! At the top of his wish list are algorithms whose complexity scales to studying 15 million cells at a time and ways to parallelize these algorithms in cost- and resource-effective ways. Cluster computing is attractive -- but expensive, loud, hot, .... They need something better.
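
For readers who haven't seen it, here is a bare-bones k-means in Python -- a sketch of the general technique, not Zigon's code. Note that each pass touches every point once; it is the better-behaved alternatives that blow up to O(n²).

    import random

    def kmeans(points, k, passes=20):
        centers = random.sample(points, k)
        for _ in range(passes):
            clusters = [[] for _ in range(k)]
            for p in points:                        # assign each point to its nearest center
                i = min(range(k),
                        key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
                clusters[i].append(p)
            for i, cluster in enumerate(clusters):  # move each center to its cluster's mean
                if cluster:
                    centers[i] = tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
        return centers

    points = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(500)] + \
             [(random.gauss(5, 1), random.gauss(5, 1)) for _ in range(500)]
    print(kmeans(points, k=2))                      # two centers, near (0, 0) and (5, 5)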

What else do scientists need? The requirements are steep. The ability to integrate cellular, proteomic, and genomic data. Usable HCI. On a more pedestrian tack, they need to replace paper lab notebooks with electronic notebooks. That sounds easy but laws on data privacy and process accountability make that a challenging problem, too. Zigon's team draws on work in the areas of electronic signatures, data security on a network, and the like.

From these two talks, it seems clear that domain scientists and computer scientists of the future will need to know more about the other discipline than may have been needed in the past. Computing is redefining the questions that domain scientists must ask and redefining the tasks performed by the CS folks. The domain scientists need to know enough about computer science, especially databases and visualization, to know what is possible. Computer scientists need to study algorithms, parallelism, and HCI. They also need to take more seriously the soft skills of communication and teamwork that we have been encouraging for many years now.

The Q-n-A session that followed pointed out an interesting twist on the need for communication. It seems that clustering algorithms are being reinvented across many disciplines. As each discipline encounters the need, the scientists and mathematicians -- and even computer scientists -- working in that area sometimes solve their problems from scratch without reference to the well-developed results on clustering and pattern recognition from CS and math. This seems like a valuable opportunity to initiate dialogue across the sciences, especially at institutions looking to increase their interdisciplinary focus.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 16, 2007 10:52 AM

Workshop 2: Exception Gnomes, Garbage Collection Fairies, and Problems

[A transcript of the SECANT 2007 workshop: Table of Contents]

Thursday afternoon at the NSF workshop saw a hodgepodge of sessions around the intersection of science ed and computing.

The first session was on computing courses for science majors. Owen Astrachan described some of the courses being taught at Duke, including his course using genomics as the source of problems to learn programming. He entertained us all with pictures of scientists and programmers, in part to demonstrate how many of the people who matter in the domains where real problems live are not computer geeks. The problems that matter in the world are not the ones that tend to excite CS profs...

Unbelievable but true! Not everyone knows about the Towers of Hanoi.

... or cares.

John Zelle described curriculum initiatives at Wartburg College to bring computing to science students. Wartburg has taken several small steps along the way:

  • a more friendly CS1 course
  • an introductory course in computational science
  • integration of computing into the physics curriculum
  • a CS senior project course that collaborates with the sciences
  • (coming) a revision of the calculus sequence

At first, Zelle said a "CS1 course friendlier to scientists", but then he backed up to the more general statement. The idea of needing a friendlier intro course even for our majors is something many of us in CS have been talking about for a while, and something I wrote about a while back. I was also interested in hearing about Wartburg's senior projects. More recently, I wrote about project-based computer science education. Senior project courses are a great idea, and one that CS faculty can buy into at many schools. That makes it a good first step to perhaps changing the whole CS program, if a faculty were so inclined. The success of such project-centered courses is just about the only way to convince some faculty that a project focus is a good idea in most, if not all, CS courses.

Wartburg's computational science course covers many of the traditional topics, including modeling, differential equations, numerical methods, and data visualization. It also covers the basics of parallel programming, which is less common in such a course. Zelle argued that every computational scientist should know a bit about parallel programming, given the pragmatics of computing over massive data sets.
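
A tiny example of the kind of parallelism he has in mind -- a generic sketch using Python's multiprocessing module, not anything from the Wartburg course: the same per-sample computation, simply spread across the machine's cores.

    from multiprocessing import Pool

    def analyze(sample):
        # stand-in for an expensive model evaluation on one sample
        return sum(x * x for x in sample)

    if __name__ == '__main__':
        data = [[float(i + j) for j in range(1000)] for i in range(10000)]
        with Pool() as pool:                   # one worker process per core by default
            results = pool.map(analyze, data)  # same answers as map(analyze, data)
        print(len(results), max(results))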

The second session of the afternoon dealt with issues of programming "in the small" versus "in the large". It seemed like a bit of a hodgepodge itself. The most entertaining of these talks was by Dennis Brylow of Marquette, called "Object Dis-Oriented". He said that his charge was to "take a principled stand that will generate controversy". Some in the room found this to be a novelty, but for Owen and me, and anyone in the SIGCSE crowd, it was another in a long line of anti-"objects first" screeds: Students can't learn how to decompose into methods until they know what goes into a method; students can't learn to program top-down, because then "it's all magic boxes".

I reported on a similarly entertaining panel at SIGCSE a couple of years ago. Brylow did give us something new, a phrase for the annals of anti-OO snake oil: students who learn OOP first see their programs as

... full of exception gnomes and garbage collection fairies.

Owen asked the natural question, reductio ad absurdum: Why not teach gates then? The answer from the choir around the room was, good point, we have to choose a level, but that level is below objects -- focused on "the fundamentals". Sigh. Abacus-early, anyone?

Brylow also offered a list of what he thinks we should teach first, which contains some important ideas: a small language, a heavy focus on data representation, functional decomposition, and the fundamental capabilities of machine computation.

This list tied well into the round-table discussion that followed, on what computational concepts science students should learn. I didn't get a coherent picture from this discussion, but one part stood out to me. Bruce Sherwood said that many scientists view analytical solutions as privileged over simulation, because they are exact. He then pointed out that in some domains the situation is being turned on its head: a faithful discrete simulation is a more real depiction of the world than the closed-form analytical solution -- which is, in fact, only an approximation created at a time when our tools were more limited. The best quote of this session came from John Zelle: Continuity is a hack!

The day closed with another hodgepodge session on the role of data visualization. Ruth Chabay spoke about visualizing models, which in physics are as important as -- more important than!? -- data. Michael Coen gave a "great quote"-laden presentation on the question of whether computer science is the servant of science or the queen of the sciences, à la Gauss on mathematics before him. Chris Hoffman gave a concise but convincing motivational talk:

  • There is great power in visualizing data.
  • With power comes risk, the risk of misleading.
  • Visualization can be tremendously effective.
  • Techniques for visual data analysis must account for the coming data deluge. (He gave some great examples...)
  • The challenges of massive data are coming in all of the sciences.

When Coen's and Hoffman's slides become available on-line, I will point to them. They would be worth a glance.

Hodgepodge sessions and hodgepodge days are okay. Sometimes we don't know where the best connections will come from...


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 15, 2007 8:34 PM

Workshop 1: Creating a Dialogue Between Science and CS

[A transcript of the SECANT 2007 workshop: Table of Contents]

The main session this morning was on creating a dialogue between science and CS. There seems to be a consensus among scientists and computer scientists alike that the typical introductory computer science course is not what other science students need, but what do they need? (Then again, many of us in CS think that the typical introductory computer science course is not what our computer science students need!)

Bruce Sherwood, a physics professor at North Carolina State, addressed the question, "Why computation in physics?" He said that this was one prong in an effort to admit to physics students "that the 20th century happened". Apparently, this is not common enough in physics. (Nor in computer science!) To be authentic to modern practice, even intro physics must show theory, experiment, and computation. Physicists have to build computational models, because many of their theories are too complex or have no analytical solution, at least not a complete one.

What excited me most is that Sherwood sees computation as a means for communicating the fundamental principles of physics, even its reductionist nature. He gave as an example the time evolution of Newtonian synthesis. The closed form solution shows students the idea only at a global level. With a computer simulation, students can see change happen over time. Even more, it can be used to demonstrate that the theory supports open-ended prediction of future behavior. Students never really see this when playing with analytical equations. In Sherwood's words, without computation, you lose the core of Newtonian mechanics!
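
Here is the sort of contrast I think he means, in a few lines of Python of my own (not Sherwood's code): the closed-form solution hands students the answer all at once, while the simulation steps Newton's laws forward in time, so they can watch the state evolve and carry the prediction as far into the future as they like.

    g, dt = 9.8, 0.001
    y, vy, t = 100.0, 0.0, 0.0      # drop a ball from a height of 100 m

    while y > 0:
        vy -= g * dt                # the force determines the change in velocity
        y += vy * dt                # the velocity determines the change in position
        t += dt

    print("simulation:  hits the ground after %.2f s" % t)
    print("closed form: t = sqrt(2*100/9.8) = %.2f s" % ((2 * 100 / 9.8) ** 0.5))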

He even argued that physics students should learn to program. Why?

  • So there are "no black boxes". He wants his students to program all the physics in their simulations.
  • So that they see common principles, in the form of recurring computations.
  • So that they can learn the relationship between the different representations they use: equations, code, animation, ...

More on science students and programming in a separate entry.

Useful links from his talk include comPADRE, a part of the National Science Digital Library for educational resources in physics and astronomy, and VPython, dubbed by supporters as "3D Programming for Ordinary Mortals". I must admit that the few demos and programs I saw today were way impressive.

The second speaker was Noah Diffenbaugh, a professor in earth and atmospheric sciences at Purdue. He views himself as a modeler dependent on computing. In the last year or so, he has collected 55 terabytes of data as a part of his work. All of his experiments are numerical simulations. He cannot control the conditions of the system he studies, so he models the system and runs experiments on the model. He has no alternative.

Diffenbaugh claims that anyone who wants a career in his discipline must be able to do computing -- as a consumer of tools, builder of models. He goes farther, calling himself a black sheep in his discipline for thinking that learning computing is critical to the intellectual development of scientists and non-scientists alike.

When most scientists talk of computation, they talk about a tool -- their tool -- and why it should be learned. They do not talk about principles of computing or the intellectual process one practices when writing a program. This concerns Diffenbaugh, who thinks that scientists must understand the principles of computing on which they depend, and that non-scientists must understand them, too, in order to understand the work of scientists. Of course, scientists are not the only ones who fixate on their computational tools to the detriment of discussing ideas. CS faculty do it, too, when they discuss CS1 in terms of the languages we teach. What's worse, though, is that some of us in CS do talk about principles of computing and intellectual process -- but only as the sheep's clothing that sneaks our favorite languages and tools and programming ideas into the course.

The session did include some computer scientists. Kevin Wayne of Princeton described an interdisciplinary "first course in computer science for the next generation of scientists and engineers". In his view, both computer science students and students of science and engineering are shortchanged when they do not study the other discipline. One of his colleagues (Sedgewick?) argues that there should be a common core in math, science, and computation for all science and engineering students, including CS.

What do scientists want in such a course? Wayne and his colleagues asked and found that they wanted the course to cover simulation, data analysis, scientific method, and transferrable programming skills (C, Perl, Matlab). That list isn't too surprising, even the fourth item. That is a demand that CS folks hear from other CS faculty and from industry all the time.

The course they have built covers the scientific method and a modern programming model built on top of Java. It is infused with scientific examples throughout. These include not only examples from the hard sciences, such as sequence alignment, but also cool examples from CS, such as Google's page rank scheme. In the course, they use real data and so experience the sensitivity to initial conditions in the models they build. He showed examples from financial engineering and political science, including the construction of a red/blue map of the US by percentage of the vote won by each candidate in each state. Data of this sort is available at National Atlas of the US, a data source I've already added to my list.
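
As an aside, the page rank example fits in a surprisingly small program. Here is my own sketch of the power-iteration idea on a made-up four-page web, not the Princeton course's code:

    links = {            # hypothetical graph: page -> pages it links to
        'A': ['B', 'C'],
        'B': ['C'],
        'C': ['A'],
        'D': ['C'],
    }
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    d = 0.85             # the usual damping factor

    for _ in range(50):
        new_rank = {p: (1 - d) / len(pages) for p in pages}
        for p in pages:
            share = d * rank[p] / len(links[p])
            for q in links[p]:
                new_rank[q] += share      # each page passes rank to the pages it cites
        rank = new_rank

    for p in sorted(rank, key=rank.get, reverse=True):
        print(p, round(rank[p], 3))       # 'C' collects the most rank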

The fourth talk of the session was on the developing emphasis on modeling at Oberlin College, across all the sciences. I did not take as many notes on this talk, but I did grab one last link from the morning, to the Oberlin Center for Computation and Modeling. Occam -- a great acronym.

My main takeaway points from this session came from the talks by the scientists, perhaps because I know relatively less about what scientists think about and want from computer science. I found the examples they offered fascinating and their perspectives on computing to be surprisingly supportive. If these folks are at all representative, the dialogue between science and CS is ripe for development.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 15, 2007 6:59 PM

Workshop Intro: Teaching Science and Computing

[A transcript of the SECANT 2007 workshop: Table of Contents]

I am spending today and tomorrow at an NSF Workshop on Science Education in Computational Thinking put on by SECANT, a group at Purdue University funded by a grant from NSF's Pathways to Revitalized Undergraduate Computing Education (CPATH) program. SECANT's goals are to build a community that is asking and answering questions such as these:

  • What should science majors know about computing?
  • How can computer science be used to teach science?
  • Can we integrate computer science effectively into other majors?
  • What will the implications of answers to these questions be for how we teach computer science and engineering themselves?

The goal of this workshop is to begin building a community, to share ideas and to make connections. I'll share in my next few entries some of the ideas I encounter here, as well as some of the thoughts I have along the way. This entry is mostly about the background of the workshop and a few miscellaneous impressions.

First, I am impressed with the wide range of attendees. Folks come from big state schools such as Ohio State, Purdue, and Iowa, from private research schools such as Princeton and Notre Dame, and from small liberal arts schools such as Wartburg and Kalamazoo.

We started with introductions from the workshop organizers at Purdue and the NSF itself. Joseph Urban from NSF spoke a bit about the challenges addressed by the CPATH program. I think its most interesting goal is to move "beyond curriculum revision" to "institution transformation models" -- avoiding the curse of incremental change. This reminded me of something that Guy Kawasaki said in his talk The Art of Innovation: Revolution, then evolution. To completely change how we teach sciences and intro computer science -- revolution first, or evolution? Given the deep strain of academic conservatism that dominates most colleges and universities, this raises an interesting question about which approach will work best. From what I've seen here today, different schools are trying each, with various levels of success.

The introductory remarks by Jeff Vitter, dean of the College of Sciences -- and a computer scientist by training -- included a comment that is a theme underlying this workshop and driving the scientists who are here to explore computer science more deeply: Computing is now a fundamental component in the cycle of science, which traditionally consisted of theory followed by experimentation. For many scientists, building models is the next step after experiment, or even a hand-in-hand partner to experiment. For many scientists, visualizing the results of experiments is essential -- we cannot understand them otherwise.

The workshop made a few personal connections for me. Also in attendance are neighbors of mine, Alberto Segre from Iowa and John Zelle from Wartburg College. But there are connections to my past, too. Another attendee is an old grad school colleague of mine, Pat Flynn, who is now at Notre Dame. Finally, from Urban's NSF presentation I learned that one of the big CPATH awards was made to a team at Michigan State -- including my old advisor. I'm not too surprised that his professional interests have evolved in this direction, though he might be.

Some here expressed surprise that so many folks are already doing interesting work in this arena. I wasn't, because there's been a lot of buzz in the last couple of years, but I was interested to see the diversity of new courses and new programs already in place. That is, of course, one of the great benefits of attending workshops such as this one.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

November 10, 2007 4:21 PM

Programming Challenges

Gerald Weinberg's recent blog post Setting (and Character): A Goldilocks Exercise describes a writing exercise that I've mentioned here as a programming exercise, a pedagogical pattern many call Three Bears. This is yet another occurrence of a remarkably diverse pattern.

Weinberg often describes exercises that writers can use to free their minds and words. It doesn't surprise me that "freeing" exercises are built on constraints. In one post, Weinberg describes The Missing Letter, in which the writer writes (or rewrites) a passage without using a randomly chosen letter. The most famous example of this form, known as a lipogram, is La disparition, a novel written by Georges Perec without using the letter 'e' -- except to spell the author's name on the cover.

When I read that post months ago, I immediately thought of creating programming exercises of a similar sort. As I quoted someone in a post on a book about Open Court Publishing, "Teaching any skill requires repetition, and few great works of literature concentrate on long 'e'." We can design a set of exercises in which the programmer surrenders one of her standard tools. For instance, we could ask her to write a program to solve a given problem, but...

  • with no if statements. This is exactly the idea embodied in the Polymorphism Challenge that my friend Joe Bergin and I used to teach a workshop at SIGCSE a couple of years ago and which I often find useful in helping programmers new to OOP see what is possible. (A small sketch of this idea appears after the list.)

  • with no for statements. I took a big step in understanding how objects worked when I realized how the internal iterators in Smalltalk's collection classes let me solve repetition tasks with a single message -- and a description of the action I wanted to take. It was only many years later that I learned the term "internal iterator" or even just "iterator", but by then the idea was deeply ingrained in how I programmed.

    Recursion is the usual path for students to learn how to repeat actions without a for statement, but I don't think most students get recursion the way most folks teach it. Learning it with a rich data type makes a lot more sense.

  • with no assignment statements. This exercise is a double-whammy. Without assignment statements, there is no call for explicit sequences of statements, either. This is, of course, what pure functional programming asks of us. Writing a big app in Java or C using a pure functional style is wonderful practice.

  • with no base types. I nearly wrote about this sort of exercise a couple of years ago when discussing the OOP anti-pattern Primitive Obsession. If you can't use base types, all you have left are instances of classes. What objects can do the job for you? In most practical applications, this exercise ultimately bottoms out in a domain-specific class that wraps the base types required to make most programming languages run. But it is a worthwhile practice exercise to see how long one can put off referring to a base type and still make sense. The overkill can be enlightening.

    Of course, one can start with a language that provides only the most meager set of base types, thus forcing one to build up nearly all the abstractions demanded by a problem. Scheme feels like that to most students, but only a few of mine seem to grok how much they can learn about programming by working in such a language. (And it's of no comfort to them that Church built everything out of functions in his lambda calculus!)
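
Here is the small sketch promised above for the "no if statements" item -- my own toy example, not one from the Polymorphism Challenge workshop: instead of branching on a type tag, let each object answer the question itself.

    import math

    class Circle:
        def __init__(self, radius):
            self.radius = radius
        def area(self):
            return math.pi * self.radius ** 2

    class Rectangle:
        def __init__(self, width, height):
            self.width, self.height = width, height
        def area(self):
            return self.width * self.height

    # No 'if the shape is a circle...' anywhere: the objects do the dispatching.
    shapes = [Circle(1.0), Rectangle(2.0, 3.0), Circle(0.5)]
    print(sum(shape.area() for shape in shapes))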

This list operates at the level of programming construct. It is just the beginning of the possibilities. Another approach would be to forbid the use of a data structure, a particularly useful class, or an individual base type. One could even turn this idea into a naming challenge by hewing close to Weinberg's exercise and forbidding the use of a selected letter in identifiers. As an instructor, I can design an exercise targeted at the needs of a particular student or class. As a programmer, I can design an exercise targeted at my own blocks and weaknesses. Sometimes, it's worth dusting off an old challenge and doing it for its own sake, just to stay sharp or shake off some rust.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

November 07, 2007 8:02 PM

Magic Books and Connections to Software

Lest you all think I have strayed so far from software and computer science with my last note that I've fallen off the appropriate path for this blog, let me reassure you. I have not. But there is even a connection between my last post and the world of software, though it is sideways.

Richard Bach, the writer whom I quoted last time, is best known for his bestselling Jonathan Livingston Seagull. I read a lot of his stuff back in high school and college. It is breezy pop philosophy wrapped around thin plots, which offers some deep truths that one finds in Hinduism and other Eastern philosophies. I enjoyed his books, including his more straightforward books on flying small planes.

But one of Richard Bach's sons is James, a software tester whose work I came into contact with via Brian Marick. James is a good writer, and I enjoy both his blog and his other writings about software, development methods, and testing. Another of Richard Bach's sons, Jon, is also a software guy, though I don't know about his work. I think that James and Jon have published together.

Illusions offers a book nested inside another book -- a magic book, no less. All we see of it are the snippets that our protagonist needs to read each moment he opens it. One of the snippets from this book-within-a-book might be saying something important about an ongoing theme here:

There is no such thing as a problem without a gift for you in its hands. You seek problems because you need their gifts.

Here is the magic page that grabbed me most as I thumbed through the book again this morning:

Live never to be ashamed if anything you do or say is published around the world -- even if what is published is not true.

Now that is detachment.

How about one last connection? This one is not to software. It is an unexpected connection I discovered between Bach's work and my life after I moved to Iowa. The title character in Bach's most famous book was named for a real person, John Livingston, a world-famous pilot circa 1930. He was born in Cedar Falls, Iowa, my adopted hometown, and once even taught flying at my university, back when it was a teachers' college. The terminal of the local airport, which he once managed, is named for Livingston. I have spent many minutes waiting to catch a plane there, browsing pictures and memorabilia from his career.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 07, 2007 7:45 AM

Magic Books

Last Saturday morning, I opened a book at random, just to fill some time, and ended up writing a blog entry on electronic communities. It was as if the book were magic... I opened to a page, read a couple of sentences, and was launched on what seemed like the perfect path for that morning. That experience echoed one of the things Vonnegut himself has often said: there is something special about books.

This is one reason that I don't worry about getting dumber by reading books, because for me books have always served up magic.

I remember reading just that back in high school, in Richard Bach's Illusions:

I noticed something strange about the book. "The pages don't have numbers on them, Don."

"No," he said. "You just open it and whatever you need most is there."

"A magic book!"

These days, I often have just this experience on the web, as I read blogs and follow links off to unexpected places. An academic book or conference proceedings can do the same. Bach would have said, "But of course."

"No you can do it with any book. You can do it with an old newspaper, if you read carefully enough. Haven't you done that, hold some problem in your mind, then open any book handy and see what it tells you?"

I do that sometimes, but I'm just as likely to catch a little magic when my mind is fallow, and I grab a paper off one of my many stacks for a lunch jaunt. Holding a particular problem in my mind sometimes puts too much pressure on whatever might happen.

Indeed, this comes back to the theme of the article I wrote on Saturday morning. On one hand there are traditional media and traditional communities, and on the other are newfangled electronic media and electronic communities. The traditional experiences often seem to hold some special magic for us. But the magic is not in any particular technology; it is in the intersection between ideas out there and our inner lives.

When I feel something special in the asynchronicity of a book's magic, and think that the predetermination of an RSS feed makes it less spontaneous, that just reflects my experience, maybe my lack of imagination. If I look back honestly, I know that I have stumbled across old papers and old blog posts and old web pages that served up magic to me in much the same way that books have done. And, like electronic communities, the digital world of information creates new possibilities for us. A book can be magic for me only if I have a copy handy. On the web, every article is just a click away. That's a powerful new sort of magic.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

November 03, 2007 4:47 PM

Electronic Communities and Dancing Animals

I volunteered to help with a local 5K/10K race this morning. When I arrived at my spot along the course, I had half an hour to fill before the race began, and 45 minutes or so before the first runners would reach me. At first I considered taking a short nap but feared I'd sleep too long. Not much help to the runners in that! So I picked up Kurt Vonnegut's A Man Without a Country, which was in my front seat on its way back to the library. (I wrote a recent article motivated by something else I read in this last book of Vonnegut's.)

I opened the book to Page 61, and my eyes fell immediately to:

Electronic communities build nothing. You end up with nothing. We are dancing animals.

This passage follows a wonderful story about how Kurt mails his manuscripts, daily coming into contact with international voices and a flamboyant postal employee on whom he has a crush. I've heard this sentiment before, in many different contexts and from many different people, but fundamentally I disagree with the claim. Let me tell you about two stories of this sort that stick in my mind, and my reactions at the time.

A decade or so ago, the famed philosopher and AI critic Hubert Dreyfus came to our campus to deliver a lecture as part of an endowed lecture series in the humanities. Had I been blogging at that time, I surely would have written a long review of this talk! Instead, all I have is a notebook on my bookshelf full of pages and pages of notes. (Perhaps one of these days...) Dreyfus claimed that the Internet was leading to a disintegration of society by creating barriers to people connecting in the real world. Electronic communication was supplanting face-to-face communication but giving us only an illusion of a real connection; in fact, we were isolating ourselves from one another.

In the question-and-answer session that followed, I offered a counterargument. Back in the mid-1980s I became quite involved in several Usenet newsgroups, both for research and entertainment. In the basketball and football newsgroups, I found intelligent, informed, well-rounded people with whom to discuss sports at a deeper level than I could with anyone in my local physical world. These groups became an important part of my day. But as the number of people with Internet access exploded, especially on college campuses, the signal-to-noise ratio in the newsgroups fell precipitously. Eventually, a core group of the old posters moved much of the discussion off-group to a private mailing list, and ultimately I was invited to join them.

This mailing list continues to this day, taking on and losing members as lives change and opportunities arise. We still discuss sports and politics, pop culture and world affairs. It is a community as real to me as most others, and I consider some of the folks there to be good friends whom I'm lucky to have come to know. Members of the basketball group get together in person annually for the first two rounds of the NCAA tournament, and wherever we travel for business or pleasure we are likely to be in the neighborhood of a friend we can join for a meal and a little face-to-face communication. Like any real community, there are folks in the group whom I like a lot and others with whom I've made little or no personal connection. On-line we have good moments and disagreements and occasional hurt feelings, like any other community of people.

The second story I remember most is from Vonnegut himself, when he, too, visited my campus back when. At one of the sessions I attended, someone asked him about the fate of books in the Digital Age. Vonnegut was certain that books would continue on in much their current form, because there was something special about the feel of a book in one's hands, the touch of paper on the skin, the smell of the print and binding. Even then I recall disagreeing with this -- not because I don't also feel that something special in the feel of a book in my hands or the touch of the paper on my skin. A book is an artifact of history, an invention of technology. Technology changes, and no matter how personally we experience a particular technology's outward appearance, it is more likely to be different in a few years than to be the same.

My Usenet newsgroup story seems to contradict Dreyfus's thesis, but he held that, because we took it upon ourselves to meet in person, my story actually supported it. To me that seemed a too convenient way for him to dismiss the key point: our sports list is essentially an electronic community, one whose primary existence is virtual. Were the Internet to disappear tomorrow, some of the personal connections we've made would live on, but the community would die.

And keep in mind that I am an old guy... Today's youth grow up in a very different world of technology than we did. One of the specific sessions I regret missing by missing OOPSLA was the keynote by Jim Purbrick and Mark Lentczner on Second Life, a new sort of virtual world that may well revolutionize the idea of electronic community not only for personal interaction but for professional, corporate, and economic interaction as well. As an example, OOPSLA itself had an island in Second Life as a way to promote interaction among attendees before and during the conference.

The trend in the world these days is toward more electronic interaction, not less, and new kinds that support wider channels of communication and richer texture in the interchange. There are risks in this trend, to be sure. Who among us hasn't heard the already classic joke about the guy who needs a first life before he can have a Second Life? But I think that this trend is just another step in the evolution of human community. We'll find ways to minimize the risks while maximizing the benefits. The next generation will be better prepared for this task than old fogies like me.

All that said, I am sympathetic to the sentiment that Vonnegut expressed in the passage quoted above, because I think underlying the sentiment is the core of a truth about being human. He expresses his take on that truth in the book, too, for as I turned the page of the book I read:

We are dancing animals. How beautiful it is to get up and go out and do something. We are here on Earth to fart around. Don't let anybody tell you any different.

I know this beauty, and I'm sure you do. We are physical beings. The ability and desire to make and share ideas distinguish us from the rest of the world, but still we are dancing animals. There seems in us an innate need to do, not just think, to move and see and touch and smell and hear. Perhaps this innate trait is why I love to run.

But I am also aware that some folks can't run, or for whatever reason cannot sense our physical world in the same way. Yet many who can't still try to go out and do. At my marathon last weekend, I saw men who had lost use of their legs -- or lost their legs altogether -- making their way over 26.2 tough miles in wheelchairs. The long uphill stretches at the beginning of the course made their success seem impossible, because every time they released their wheels to grab for the next pull forward they lost a little ground. Yet they persevered. These runners' desire to achieve in the face of challenge made my own difficulties seem small.

I suspect that these runners' desire to complete the marathon had as much to do with a sense of loss as with their innate nature as physical beings. And I think that this accounts for Vonnegut's and others' sentiment about the insufficiency of electronic communities: a sense of loss as they watch the world around evolve quickly into something very different from the world in which they grew.

Living in the physical world is clearly an important part of being human. But it seems to be neither necessary nor sufficient as a condition.

Like Vonnegut, I grew up in a world of books. To me, there is still something special about the feel of a book in my hands, the touch of paper on my skin, the smell of the print and binding of a new book the first time I open it. But these are not necessary parts of the world; they are artifacts of history. The sensual feel of a book will change, and humanity will survive, perhaps none the worse for it.

I can't say that face-to-face communities are merely an artifact of history, soon to pass, but I see no reason to believe that the electronic communities we build now -- we do build them, and they do seem to last, at least on the short time scale we have for judging them -- cannot augment our face-to-face communities in valuable ways. I think that they will allow us to create forms of community that were not available to us before, and thus enrich human experience, not diminish it. While we are indeed dancing animals, as Vonnegut describes us, we are also playing animals and creative animals and thinking animals. And, at our core, we are connection-making animals, between ideas and between people. Anything that helps us to make more, different, and better connections has a good chance of surviving in some form as we move into the future. Whether dinosaurs like Vonnegut or I can survive there, I don't know!


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

October 24, 2007 4:12 PM

Missing OOPSLA

OOPSLA 2007

I am sad this week to be missing OOPSLA 2007. You might guess from my recent positive words that missing the conference would make me really sad, and you'd be right. But demands of work and family made traveling to Montreal infeasible. After attending eleven straight OOPSLAs, my fall schedule has a hole in it. My blog might, too; OOPSLA has been the source of many, many writing inspirations in the three years since I began blogging.

One piece of good news for me -- and for you, too -- is that we are podcasting all of OOPSLA's keynote talks this year. That would be a bonus for any conference, but with OOPSLA it is an almost sinfully delicious treat. I was reading someone else's blog recently, and the writer -- a functional programming guy, as I recall -- went down the roster of OOPSLA keynote speakers:

  • Peter Turchi
  • Kathy Sierra
  • Jim Purbrick & Mark Lentczner
  • Guy Steele & Richard Gabriel
  • Fred Brooks
  • John McCarthy
  • David Parnas
  • Gregor Kiczales
  • Pattie Maes

... and wondered if this was perhaps the strongest lineup of speakers ever assembled for a CS conference. It is an impressive list!

If you are interested in listening in on what these deep thinkers and contributors to computing are telling the OOPSLA crowd this week, check out the conference podcast page. We have all of Tuesday's keynotes (through Steele & Gabriel in the above list) available now, and we hope to post today's and tomorrow's audio in the next day or so. Enjoy!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 18, 2007 8:40 AM

Project-Based Computer Science Education

[Update: I found the link to Michael Mitzenmacher's blog post on programming in theory courses and added it below.]

A couple of days ago, a student in my compilers course was in my office discussing his team's parser project. He was clearly engaged in the challenges that they had faced, and he explained a few of the design decisions they had made, some of which he was proud of and some of which he was less thrilled with. In the course of conversation, he said that he prefers project courses because he learns best when he gets into the trenches and builds a system. He contrasted this to his algorithms course, which he enjoyed but which left him with a nagging sense of incompleteness -- because they never wrote programs.

(One can, of course, ask students to program in an algorithms course. In my undergraduate algorithms, usually half of the homework involves programming. My recent favorite has been a Bloom filters project. Michael Mitzenmacher has written about programming in algorithms courses in his blog My Biased Coin.)
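
For the curious, a Bloom filter fits in a page of Python. This is a minimal sketch of the data structure itself, not my assignment's specification: k hash functions set k bits per item, so membership tests can return false positives but never false negatives.

    import hashlib

    class BloomFilter:
        def __init__(self, num_bits=1024, num_hashes=3):
            self.num_bits = num_bits
            self.num_hashes = num_hashes
            self.bits = [False] * num_bits

        def _positions(self, item):
            # derive num_hashes different bit positions from one cryptographic hash
            for i in range(self.num_hashes):
                digest = hashlib.sha256(("%d:%s" % (i, item)).encode()).hexdigest()
                yield int(digest, 16) % self.num_bits

        def add(self, item):
            for pos in self._positions(item):
                self.bits[pos] = True

        def __contains__(self, item):
            return all(self.bits[pos] for pos in self._positions(item))

    words = BloomFilter()
    for w in ["compiler", "parser", "scanner"]:
        words.add(w)
    print("parser" in words)    # True
    print("linker" in words)    # False, with high probability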

I have long been a fan of "big project" courses, and have taught several different ones, among them intelligent systems, agile software development, and recently compilers. So it was with great interest I read (finally) Philip Greenspun's notes on improving undergraduate computer science education. Greenspun has always espoused a pragmatic and irreverent view of university education, and this piece is no different. With an eye to the fact that a great many (most?) of our students get jobs as professional software developers, he sums up one of traditional CS education's biggest weaknesses in a single thought: We tend to give our students

a tiny piece of a big problem, not a small problem to solve by themselves.

This is one of the things that makes most courses on compilers -- one of the classic courses in the CS curriculum -- so wonderful. We usually give students a whole problem, albeit a small problem, not part of a big one. Whatever order we present the phases of the compiler, we present them all, and students build them all. And they build them as part of a single system, capable of compiling a complete language. We simplify the problem by choosing (or designing) a smallish source language, and sometimes by selecting a smallish target machine. But if we choose the right source and target languages, students must still grapple with ambiguity in the grammar. They must still grapple with design choices for which there is no clear answer. And they have to produce a system that satisfies a need.

Greenspun makes several claims with which I largely agree. One is this:

Engineers learn by doing progressively larger projects, not by doing what they're told in one-week homework assignments or doing small pieces of a big project.

Assigning lots of well-defined homework problems is a good way to graduate students who are really good at solving well-defined homework problems. The ones who can't learn this skill change majors -- even if they would make good software developers.

Here is another Greenspun claim. I think that it is likely even more controversial among CS faculty.

Everything that is part of a bachelor's in CS can be taught as part of a project that has all phases of the engineering cycle, e.g., teach physics and calculus by assigning students to build a flight simulator.

Many will disagree. I agree with Greenspun, but to act on this idea would, as Greenspun knows, require a massive change in how most university CS departments -- and faculty -- operate.

This idea of building all courses around projects is similar to an idea I have written about many times here, the value of teaching CS in the context of problems that matter, both to students and to the world. One could teach in the context of a problem domain that requires computing without designing either the course or the entire curriculum around a sequence of increasingly larger projects. But I think the two ideas have more in common than they differ, and problem-based instruction will probably benefit from considering projects as the centerpiece of its courses. I look forward to following the progress of Owen Astrachan's Problem Based Learning in Computer Science initiative to see what role projects will play in its results. Owen is a pragmatic guy, so I expect that some valuable pragmatic ideas will come out of it.

Finally, I think students and employers alike will agree with Greenspun's final conclusion:

A student who graduates with a portfolio of 10 systems, each of which he or she understands completely and can show off the documentation as well as the function (on the Web or on the student's laptop), is a student who will get a software development job.

Again, a curriculum that requires students to build such a portfolio will look quite different from most CS curricula these days. It will also require big changes in how CS faculty teach almost every topic.

Do read Greenspun's article. He is thought-provoking, as always. And read the comments; they contain some interesting claims, too, including the suggestion that we redesign CS education as a professional graduate degree program along the lines of medicine.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

October 16, 2007 6:45 AM

Some Thoughts on How to Increase CS Enrollments

The latest issue of Communications of the ACM (Volume 50, Number 10, pages 67-71) contains an article by Asli Yagmur Akbulut and Clayton Arlen Looney called Inspiring Students to Pursue Computing Degrees. With that title, how could I not jump to it and read it immediately?

Most of the paper describes a survey of the sort that business professors love to do but which I find quite dull. Still, both the ideas that motivate the survey and the recommendations the authors make at the end are worth thinking about.

First, Akbulut and Looney base their research on a model derived from social cognitive theory called the Major Choice Goals Model. In this model, a person's choice goal (such as the choice to major in computing) is influenced by her interest in the area, the rewards she expects to receive as a result of the choice, and her belief that she can succeed in the area, which is termed self-efficacy. Interest itself is influenced by expected rewards and self-efficacy, and expected rewards are influenced by self-efficacy.

Major Choice Goals Model

Their survey of business majors in an introductory business computing course found that choice goals were determined primarily by interest, and that the other links in the model also correlated significantly. If their findings generalize, then...

  • The key to increasing the number of computing majors lies in increasing their interest in the discipline.
  • Talking to students about the financial and other rewards of majoring in computing influences their choice to major in the discipline only indirectly, through increased interest.
  • Self-efficacy -- a student's judgment of her capability to perform effectively in the major -- strongly affects both interest and outcome expectations.

I don't suppose that these results are all that earth-shaking, but they do give us clues on where we might best focus our efforts to recruit more majors.

First, we need to foster "a robust sense of self-efficacy" in potential students. This is most effective when we work with people who have little or no direct experience. We should strive to help these folks have successful, positive first experiences. When we encounter people who have had bad past experiences with computing, we need to work extra hard to overcome these with positive exposure.

Second, we need to enhance students' outcome expectations in a broader set of outcomes than just the lure of high salaries and plentiful jobs. Most of us have been looking for opportunities to share salary and employment data with students. But outcome expectations seem to affect a student's choice to major in computing mostly through increased interest in the discipline, and financial reward is only one, rather narrow, avenue to interest. We should communicate as many different kinds of rewards as possible, via as many different routes as possible, including through the different kinds of people who have reaped these benefits, such as peer groups, alumni, and IT professionals.

Third, we can seek to increase interest more directly. Again, this is something that most people in CS and IT have already been doing. I think the value Akbulut and Looney add here is in looking to the learning literature for influences on interest. These include the effective use of "novelty, complexity, conflict, and uncertainty". They remind us that "As technologies continue to rapidly evolve, it is important to deliver course content that is fresh, current, and aligned with students' interests". Our students are looking for ideas that they can apply to their own experiences and to open problems in the world.

The authors also make a suggestion that is controversial with many CS faculty but basic knowledge to others: In order to build self-efficacy and interest in students, we need to be sure that

... the most appropriate faculty are assigned to introductory computing courses. Instructors who are personable, fair, innovative, engaging, and can serve as role models would be more likely to attract larger pools of students.

This isn't pandering; it is simply how the world works. As someone who now has a role in assigning faculty to teach courses, I know that this can be a difficult task, both in making the choices and in working with faculty who would prefer different assignments.

When I first dug into the paper, I had some reservations. I'm not a big fan of this kind of research, because it seems too contingent on too many external factors to be convincing on its own. This particular study looked at business students and a very soft sort of computing course (Introduction to Information Systems) that all business students have to take at many universities. Do the findings apply to CS students more generally, or students who might be interested in a more technical sense of computing? In the end, though, this paper gave me a different spin on a couple of issues with which we have been grappling, in particular on students' sense that they can succeed in computing and on the indirect relationship between expected rewards and choice of major. This perspective gives me something useful to work with.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

October 13, 2007 6:01 PM

More on Forth and a New Compilers Course

Remember this as a reader--whatever you are reading
is only a snapshot of an ongoing process of learning
on the part of the author.
-- Kent Beck

Sometimes, learning opportunities on a particular topic seem to come in bunches. I wrote recently about revisiting Forth and then this week ran across an article on Lambda the Ultimate called Minimal FORTH compiler and tutorial. The referenced compiler and tutorial are an unusually nice resource: a "literate code" file that teaches you as it builds. But then you also get the discussion that follows, which points out what makes Forth special, some implementation tricks, and links to several other implementations and articles that will likely keep me busy for a while.

Perhaps because I am teaching a compilers course right now, the idea that most grabbed my attention came in a discussion near the end of the thread (as of yesterday) on teaching language. Dave Herman wrote:

This also makes me think of how compilers are traditionally taught: lexing → parsing → type checking → intermediate language → register allocation → codegen. I've known some teachers to have success in going the other way around: start at the back end and move your way forward. This avoids giving the students the impression that software engineering involves building a product, waterfall-style, that you hope will actually do something at the very, *very* end -- and in my experience, most courses don't even get there by the end of the semester.

I have similar concerns. My students will be submitting their parsers on Monday, and we are just past the midpoint of our semester. Fortunately, type checking won't take long, and we'll be on to the run-time system and target code soon. I think students do feel satisfaction at having accomplished something along the way at the end of the early phases. The parser even gives two points of satisfaction: when it can recognize a legal program (and reject an illegal one), and then when it produces an abstract syntax tree for a legal program. But those aren't the feeling of having compiled a program from end to end.

The last time I debriefed teaching this course, I toyed with the idea of making several end-to-end passes through the compilation process, inspired by a paper on 15 compilers in 15 weeks. I've been a little reluctant to mess with the traditional structure of this course, which has so much great history. While I don't want to be one of those professors who teaches a course "the way it's always been done" just for the sake of tradition, I also would like to have a strong sense that my different approach will give students a full experience. Teaching compilers only every third semester makes each course offering a scarce and thus valuable opportunity.

I suppose that there are lots of options... With a solid framework and reference implementation, we could cover the phases of the compiler in any order we like, working on each module independently and plugging them into the system as we go. But I think that there needs to be some unifying theme to the course's approach to the overall system, and I also think that students learn something valuable about the next phase in the pipeline when we build them in sequence. For instance, seeing the limitations of the scanner helps to motivate a different approach to the parser, and learning how to construct the abstract syntax tree sets students up well for type checking and conversion to an intermediate rep. I imagine that similar benefits might accrue when going backwards.

I think I'll ask Dave for some pointers and see what a backwards compiler course might look like. And I'd still like to play more with the agile idea of growing a working compiler via several short iterations. (That sounds like an excellent research project for a student.)
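
To make the idea of short end-to-end iterations concrete, here is a rough sketch in Python of what a first iteration might produce: a scanner, parser, code generator, and "target machine" for integer arithmetic, all runnable in one sitting. This is only my illustration of the shape of such an iteration, not anything from my course or from the 15-compilers paper.

    import operator
    import re

    # A deliberately tiny source language: infix arithmetic on integers,
    # compiled to instructions for a little stack machine and then executed.

    def scan(source):
        # Scanner: integers, the four operators, and parentheses.
        return re.findall(r"\d+|[-+*/()]", source)

    def parse(tokens):
        # Recursive-descent parser producing a nested-tuple abstract syntax tree.
        pos = 0

        def peek():
            return tokens[pos] if pos < len(tokens) else None

        def expr():                          # expr := term (('+'|'-') term)*
            nonlocal pos
            node = term()
            while peek() in ("+", "-"):
                op = tokens[pos]; pos += 1
                node = (op, node, term())
            return node

        def term():                          # term := factor (('*'|'/') factor)*
            nonlocal pos
            node = factor()
            while peek() in ("*", "/"):
                op = tokens[pos]; pos += 1
                node = (op, node, factor())
            return node

        def factor():                        # factor := INT | '(' expr ')'
            nonlocal pos
            token = tokens[pos]; pos += 1
            if token == "(":
                node = expr()
                pos += 1                     # consume ')'
                return node
            return ("int", int(token))

        return expr()

    def generate(node):
        # Code generator: emit instructions for a simple stack machine.
        if node[0] == "int":
            return [("push", node[1])]
        op, left, right = node
        return generate(left) + generate(right) + [("binop", op)]

    def run(code):
        # The "target machine": an interpreter for the generated instructions.
        ops = {"+": operator.add, "-": operator.sub,
               "*": operator.mul, "/": operator.floordiv}
        stack = []
        for instruction, argument in code:
            if instruction == "push":
                stack.append(argument)
            else:
                right, left = stack.pop(), stack.pop()
                stack.append(ops[argument](left, right))
        return stack.pop()

    program = "2 * (3 + 4) - 5"
    print(run(generate(parse(scan(program)))))    # prints 9

Later iterations could grow the language (variables, statements, functions) and swap the interpreter for real code generation, which is the appeal of the approach: students have a working compiler from week one.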

Oh, and the quote at the top of this entry is from Kent's addendum to his forthcoming book, Implementation Patterns. I expect that this book will be part of my ongoing process of learning, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

October 08, 2007 8:05 PM

Go Forth and M*

It is easy to forget how diverse the ecosphere of programming languages is. Even most of the new languages we see these days look and feel like the same old thing. But not all languages look and feel the same. If you haven't read about the Forth programming language, you should. It will remind you just how different a language can be. Forth is a stack-based language that uses postfix notation and the most unusual operator set this side of APL. I've been fond of stack-based languages since spending a few months playing with the functional language Joy and writing an interpreter for it while on sabbatical many years ago. Forth is more of a systems language than Joy, but the programming style is the same.
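
If you have never seen postfix at work, here is a tiny sketch in Python of the evaluation model at the heart of a Forth-like language: a stack plus a dictionary of words. The particular word set is mine, chosen only to illustrate the idea; real Forth does far more, starting with letting you define new words.

    def forth_eval(source, stack=None):
        """Evaluate a whitespace-separated string of Forth-like words.

        Numbers are pushed onto the stack; anything else is looked up in a
        small dictionary of words, each of which pops its arguments and
        pushes its result.
        """
        stack = [] if stack is None else stack
        words = {
            "+":    lambda s: s.append(s.pop() + s.pop()),
            "*":    lambda s: s.append(s.pop() * s.pop()),
            "dup":  lambda s: s.append(s[-1]),
            "swap": lambda s: s.extend([s.pop(), s.pop()]),
            ".":    lambda s: print(s.pop()),
        }
        for token in source.split():
            if token in words:
                words[token](stack)
            else:
                stack.append(int(token))
        return stack

    forth_eval("2 3 + 4 * .")    # prints 20
    forth_eval("5 dup * .")      # prints 25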

I recently ran across a link to creator Chuck Moore's Forth -- The Early Years, and it offered a great opportunity to reacquaint myself with the language. This paper is an early version of the paper that became the HOPL-2 paper on Forth, but it reads more like the notes of a talk -- an informal voice, plenty of sentence fragments, and short paragraphs that give the impression of stream of consciousness.

This "autobiography of Forth" is a great example of how a program evolves into a language. Forth started as a program to compute a least-squares fitting of satellite tracking data to determine orbits, and it grew into an interpreter as Moore bumped up against the limitations of the programming environment on the IBM mainframes of the day. Over time, it grew into a full-fledged language as Moore took it with him from job to job, porting it to new environments and extending it to meet the demands of new projects. He did not settle for the limitations of the tools available to him; instead, he thought "There must be a better way" -- and made it so.

As someone teaching a compilers course right now, I smiled at the many ways that Forth exemplified the kind of thinking we try to encourage in students learning to write a compiler. Moore ported Forth to Fortran and back. He implemented cross-assemblers and code generators. When speed mattered, he wrote native implementations. All the while, he kept the core of the language small, adding new features primarily as definitions of new "words" to be processed within the core language architecture.

My favorite quotes from the paper appear at the beginning and the end. To open, Moore reports that he experienced what is in many ways the Holy Grail for a programmer. As an undergraduate, he took a part-time job with the Smithsonian Astrophysical Observatory at MIT, and...

My first program, Ephemeris 4, eliminated my job.

To close, Moore summarizes the birth and growth of Forth as having "the making of a morality play":

Persistent young programmer struggles against indifference to discover Truth and save his suffering comrades.

This is a fine goal for any computer programmer, who should be open to the opportunity to become a language designer when the moment comes. Not everyone will create a language that lasts for 50 years, like Forth, but that's not the point.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 05, 2007 4:45 PM

Fear and Loathing in the Computer Lab

I occasionally write about how students these days don't want to program. Not only don't they want to do it for a living, they don't even want to learn how. I have seen this manifested in a virtual disappearance of non-majors from our intro courses, and I have heard it expressed by many prospective CS majors, especially students interested in our networking and system administration majors.

First of all, let me clarify something. When I talk about students not wanting to program, one of my colleagues chafes, because he thinks I mean that this is an unchangeable condition of the universe. I don't. I think that the world could change in a way that kids grow up wanting to program again, the way some kids in my generation did. Furthermore, I think that we in computer science can and should help try to create this change. But the simple fact is that nearly all the students who come to the university these days do not want to write programs, or learn how to do so.

If you are interested in this issue, you should definitely read Mark Guzdial's blog. Actually, you should read it in any case -- it's quite good. But he has written passionately about this particular phenomenon on several occasions. I first read his ideas on this topic in last year's entry Students find programming distasteful, which described experiences with non-majors working in disciplines where computational modeling is essential to future advances.

This isn't about not liking programming as a job choice -- this isn't about avoiding facing a cubicle engaged in long, asocial hours hacking. This is about using programming as a useful tool in a non-CS course. It's unlikely that most of the students in the Physics class have even had any programming, and yet they're willing to drop a required course to avoid it.

In two recent posts [ 1 | 2 ], Mark speculates that the part of the problem involving CS majors may derive from our emphasis on software engineering principles, even early in the curriculum. One result is an impression that computer science is "serious":

We lead students to being able to create well-engineered code, not necessarily particularly interesting code.

One result of that result is that students speak of becoming a programmer as if this noble profession has its own chamber in one of the middle circles in Dante's hell.

I understand the need for treating software development seriously. We want the software we use and depend upon every day to work. We want much of it to work all the time. That sounds serious. Companies will hire our graduates, and they want the software that our graduates write to work -- all the time, or at least better than the software of their competitors. That sounds serious, too.

Mark points out that, while this requirement on our majors calls for students to master engineering practice, it does "not necessarily mesh with good science practice".

In general, code that is about great ideas is not typically neat and clean. Instead, the code for the great programs and for solving scientific problems is brilliant.

And -- here is the key -- our students want to be creative, not mundane.

Don't get me wrong here. I recently wrote on the software engineering metaphor as mythology, and now I am taking a position that could be viewed as blaming software engineering for the decline of computer science. I'm not. I do understand the realities of the world our grads will live in, and I do understand the need for serious software developers. I have supported our software engineering faculty and their curriculum proposals, including a new program in software testing. I even went to the wall for an unsuccessful curriculum proposal that created some bad political feelings with a sister institution.

I just don't want us to close the door to our students' desire to be brilliant. I don't want to close the door on what excites me about programming. And I don't want to present a face of computing that turns off students -- whether they might want to be computer scientists, or whether they will be the future scientists, economists, and humanists who use our medium to change the world in the ways of those disciplines.

Thinking cool ideas -- ideas that are cool to the thinker -- and making them happen is intellectually rewarding. Computer programming is a new medium that empowers people to realize their ideas in a way that has never been available to humankind before.

As Mark notes in his most recent article on this topic, realizing one's own designs also motivates students to want to learn, and to work to do it. We can use the power of our own discipline to motivate people to sample it, either taking what they need with them to other pastures or staying and helping us advance the discipline. But in so many ways we shoot ourselves in the foot:

Spending more time on comments, assertions, preconditions, and postconditions than on the code itself is an embarrassment to our field.

Amen, Brother Mark.

I need to do more to advance this vision. I'm moving slowly, but I'm on the track. And I'm following good leaders.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

October 04, 2007 6:40 PM

OOPSLA Evolving

I have liked to write programs since I first learned BASIC in high school. When I discovered OOPSLA back in 1996, I felt as if I had found a home. I had been programming in Smalltalk for nearly a decade. At the time, OOP was just reaching down into the university curriculum, and the OOPSLA Educators' Symposium introduced me to a lot of smart, interesting people who were wrestling with some of the questions we were wrestling with here.

But the conference was about more than objects. It had patterns, and agile software development, and aspect-oriented programming, and language processing, and software design more generally. It was about programs. The people at OOPSLA liked to write programs. They liked to look at programs, discuss them, and explore new ways of writing them. I was hooked.

When objects were in the ascendancy, OOPSLA was the perfect connection between academia and industry. That was useful. But now that OOP has become so mainstream as to lose its sense of urgency, the value of having "OO" in the conference name has declined. Now, the "OO" part of the name is more misleading than helpful. In some ways, it was an accident of history that this community grew up around object-oriented programming. Its real raison d'etre is programming.

The conference cognoscenti have been bandying about the idea of changing the name of the conference for a few years now, to communicate better why someone should come to the conference. This is a risky proposition, as the OOPSLA name is a brand that has value in its own right.

You can see one small step toward the possibility of a new name in how we have been "branding" the conference this year. On the 2007 web site, instead of saying "OOPSLA" we have been saying ooPSLA. There are a couple of graphical meanings one can impose on this spelling, but it is a change that signals the possibility of more.

It has been fun hearing the discussions of a possible name change. You can see glimpses of the "OOPSLA as programming" theme, and some of the interesting ideas driving thoughts of change, in this year's conference program. General chair Richard Gabriel writes:

I used to go to OOPSLA for the objects -- back in the 1980s when there was lots to find/figure out about objects and how that approach -- oop -- related to programming in general. Nowadays objects are mainstream and I go for the programming. I love programs and programming. I laugh when people try to compare programming to something else, such as: "programming is like building a bridge" or "programming is like following a recipe to bake a soufflé." I laugh because programming is the more fundamental activity -- people should be comparing other things to it: "writing a poem is like programming an algorithm" or "painting a mural is like patching an OS while it's running." I write programs for fun the way some people play sudoku or masyu, and so I love to hear and learn about programs and programming.

Programming is the more fundamental activity... Very few people in the world realize this -- including a great many computer scientists. We need to communicate this better to everyone, lest we fail to excite the great minds of the future to help us build this body of knowledge.

OOPSLA has an Essays track that distinguishes it from other academic conferences. An OOPSLA essay enables an author to reflect ...

... upon technology, its relation to human endeavors, or its philosophical, sociological, psychological, historical, or anthropological underpinnings. An essay can be an exploration of technology, its impacts, or the circumstances of its creation; it can present a personal view of what is, explore a terrain, or lead the reader in an act of discovery; it can be a philosophical digression or a deep analysis. At its best, an essay is a clear and compelling piece of writing that enacts or reveals the process of understanding or exploring a topic important to the OOPSLA community. It shows a keen mind coming to grips with a tough or intriguing problem and leaves the reader with a feeling that the journey was worthwhile.

As 2007 Essays chair Guy Steele writes in his welcome,

Perhaps we may fairly say that while Research Papers focus on 'what' and 'how' (aided and abetted by 'who' and 'when' and 'where'), Essays take the time to contemplate 'why' (and Onward! papers perhaps brashly cry 'why not?').

This ain't your typical research paper, folks. Writers are encouraged to think big thoughts about programs and programming, and then share those thoughts with an audience that cares.

Steele refers to Onward!, and if you've never been to OOPSLA you may not know what he means. In many ways, Onward! is the archetypal example of how OOPSLA is about programs and all the issues related to them. A few years ago, many conference folks were frustrated that the technical track at OOPSLA made no allowance for papers that really push the bounds of our understanding, because they didn't fit neatly into the mold of conventional programming languages research. Rather than just bemoan the fact, these folks -- led by Gabriel -- created the conference-within-a-conference that is Onward!. Crista Lopes's Onward! welcome leaves no doubt that the program is the primary focus of Onward! and, more generally, of the conference:

Objects have grown up, software is everywhere, and we are now facing a consequence of this success: the perception that we know what programming is all about and that building software systems is, therefore, just a simple matter of programming ... with better or worse languages, tools, and processes. But we know better. Programming technology may have matured, programming languages, tools, and processes may have proliferated, but fundamental issues pertaining to computer Programming, Systems, Languages, and Applications are still as untamed, as new, and as exciting as they ever were.

Lopes also wrote a marvelous message on the conference mailing list last October that elaborates on these ideas. She argued that we should rename OOPSLA simply the ACM Conference on Programming. I'll quote only this portion:

Over the past couple of decades, the words "programming" and "programmer" fell out of favor, and were replaced by several other expressions such as "software engineer(ing)", "software design(er)", "software architect(ure)", "software practice", etc. A "programmer" is seen in some circles as an inferior worker to a "software engineer" or, pardon the comparison!, a "software architect". There are now endless, more fashionable terms that try to hide, or subsume, the fact that, when the rubber hits the road, this is all about developing systems whose basic elements are computer programs, and the processes and tools that surround their creation and composition.

...

While I have nothing against the new, more fashionable terms, and even understand their need and specificity, I think it's a big mistake that the CS research community follows the trend of forgetting what this is all about. The word "programming" is absolutely right on the mark!, and CS needs a research community focusing on it.

On this view, we need to rename OOPSLA not for OOPSLA's sake, but for the discipline's. Lopes's "Conference on Programming" sounds too bland to those with a marketing bent and too pedestrian to those with academic pretensions. But I'm not sure that it isn't the most accurate name.

What are the options? For many, the default is to drop the "oo" altogether, but that leaves PSLA -- which breaks whatever rule there is against creating acronyms that sound unappealing when said out loud. So I guess the ooPSLA crowd should just keep looking.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 30, 2007 11:16 AM

Unexpected Fun Cleaning out My Closet

The last week or so I've been trying to steal a few minutes each day to clean up the closet in my home work area. One of the big jobs has been to get rid of several years of journals and proceedings that built up from 1998 to 2002, when it seems I had time only to skim my incoming periodicals.

I seem genetically unable to simply throw these into a recycling bin; instead, I sit on the floor and thumb through each, looking at least at the table of contents to see if there is anything I still want to read. Most of the day-to-day concerns of 2000 are of no particular interest now. But I do like to look at the letters to the editor in Communications of the ACM, IEEE Computer, and IEEE Spectrum, and some of the standing columns in SIGPLAN Notices, especially on Forth and on parsing. Out of every ten periodicals or so, I would guess I have saved a single paper or article for later reading.

One of the unexpected joys has been stumbling upon all of the IEEE Spectrum issues. It's one of the few general engineering journals I've ever received, and besides, it has the bimonthly Reflections column by Robert Lucky, which I rediscovered accidentally earlier this month. I had forgotten that in the off-months of Reflections, Spectrum runs a column called Technically Speaking, which I also enjoy quite a bit. According to its by-line, this column is "a commentary on technical culture and the use and misuse of technical language". I love words and learning about their origin and evolution, and this column used to feed my habit.

Most months, Technically Speaking includes a sidebar called "Worth repeating", which presents a quote of interest. Here are a couple that struck me as I've gone through my old stash.

From April 2000:

Engineering, like poetry, is an attempt to approach perfection. And engineers, like poets, are seldom completely satisfied with their creations.... However, while poets can go back to a particular poem hundreds of times between its first publication and its final version in their collected works, engineers can seldom make major revision in a completed structure. But an engineer can certainly learn from his mistakes.

This is from Henry Petroski, in To Engineer is Human. The process of discovery in which an engineer creates a new something is similar to the poet's process of discovery. Both lead to a first version by way of tinkering and revision. As Petroski notes, though, when engineers who build bridges and other singular structures publish their first version, it is their last version. But I think that smaller products which are mass produced often can be improved over time, in new versions. And software is different... Not only can we grow a product through a conscious process of refactoring, revision, and rewriting from scratch, but after we publish Version 1.0 we can continue to evolve the product behind its interface -- even while it is alive, servicing users. Software is a new sort of medium, whose malleability makes cleaving too closely to the engineering mindset misleading. (Of course, software developers should still learn from their mistakes!)

From June 2000:

You cannot have good science without having good science fans. Today science fans are people who are only interested in the results of science. They are not interested in a good play in science as a football fan is interested in a good play in football. We are not going to be able to have an excellent scientific effort unless the man in the street appreciates science.

This is reminiscent of an ongoing theme in this blog and in the larger computer science community. It continues to be a theme in all of science as well. How do we reform -- re-form -- our education system so that most kids at least appreciate what science is and means? Setting our goal as high as creating fans who are as into science as they are into football or NASCAR would be ambitious indeed!

Oh, and don't think that this ongoing theme in the computer science and general scientific world is a new one. The quote above is from Edward Teller, taken off the dust jacket of a book named Rays: Visible and Invisible, published in 1958. The more things change, the more they stay the same. Perhaps it should comfort us that the problem we face is at least half a century old. We shouldn't feel guilty that we cannot solve it overnight.

And finally, from August 2000:

To the outsider, science often seems to be a frightful jumble of facts with very little that looks human and inspiring about it. To the working scientist, it is so full of interest and so fascinating that he can only pity the layman.

I think the key here is to make more people insiders. This is what Alan Kay urges us to do -- he's been saying this for thirty years. The best way to share the thrill is to help people to do what we do, not (just) tell them stories.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

September 26, 2007 6:42 PM

Updates, Courtesy of My Readers

I love to hear from readers who have enjoyed an article. Often, folks have links or passages to share from their own study of the same issue. Sometimes, I feel a need to share those links with everyone. Here are three, in blog-like reverse chronological order:

On Devil's Advocate for Types

Geoff Wozniak pointed me in the direction of Gilad Bracha's work on pluggable type systems. I had heard of this idea but not read much about it. Bracha argues that a type system should be a wrapper around the dynamically typed core of a program. This makes it possible to expose different views of a program to different readers, based on their needs and preferences. More thinking to do...

On Robert Lucky and Math Blues

Chris Johnson, a former student of mine, is also a fan of Bob Lucky's. As a graduate student in CS at Tennessee, though, he qualifies for a relatively inexpensive IEEE student membership and so can get his fix of Lucky each month in Spectrum. Chris took pity on his old prof and sent me a link to Lucky's Reflections on-line. Thank you, thank you! More reading to do...

On Would I Lie to You?

Seth Godin's thesis is that all good marketers "lie" because they tell a story tailored to their audience -- not "the truth, the whole truth, and nothing but the truth". I applied his thesis to CS professors and found it fitting.

As old OOPSLA friend and colleague Michael Berman reminds us, this is not a new idea:

Another noteworthy characteristic of this manual is that it doesn't always tell the truth.... The author feels that this technique of deliberate lying will actually make it easier for you to learn the ideas.

That passage was written by Donald Knuth in the preface to The TeXbook, pages vi-vii. Pretty good company to be in, I'd say, even if he is an admitted liar.

Keep those suggestions coming, folks!


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

September 24, 2007 4:47 PM

Devil's Advocate for Types

A couple of my recent entries (here and here) have questioned the value of data types, at least in relation to a corresponding set of unit tests. While running this weekend, I played devil's advocate with myself a bit, thinking, "But what would a person who prefers programming with manifest types say?"

One thing that types provide that tests do not is a sense of universality. I can write a test for one case, or two, or ten, but at the end of the day I will have only a finite number of test cases. A type checker can make a more general statement, of a more limited scope. The type checker can say, "I don't know the values of all possible inputs, but I do know that all of the values are integers." That information can be quite useful. A compiler can use it to generate more efficient target code. A programmer can use it to generalize more confidently from a small set of tests to the much larger set of all possible tests.

In unit testing terms, types give us a remarkable level of test coverage for a particular kind of test case.
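
Here is a small Python sketch of the contrast; the function and the checker mentioned in the comments are my choices, picked purely for illustration.

    def mean(values: list[int]) -> float:
        """Average a non-empty list of integers."""
        return sum(values) / len(values)

    # A test makes a claim about specific inputs:
    assert mean([2, 4, 6]) == 4.0

    # The type annotations, checked by a tool such as mypy, make a universal
    # but narrower claim: every call passes a list of ints and gets back a
    # float -- for all inputs, not just the ones we thought to test.
    # mean("abc")   # rejected by the type checker before the program ever runs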

This is a really useful feature of types. I'd like to take advantage of it, even if I don't want my language to get in my way very much while I'm writing code. One way I can have both is to use type inference -- to let my compiler, or more generally my development environment, glean type information from my code and use that in ways that help me.

There is another sense in which we object-oriented programmers use types without thinking about them: we create objects! When I write a class, I define a set of objects as an abstraction. Such an object is specified in terms of its behavioral interface, which is public, and its internal state, which is private. This creates a kind of encapsulation that is just like what a data type provides. In fact, we often do think of classes as abstract data types, but with the twist that we focus on an object's behavioral responsibility, rather than manipulating its state.

That said, newcomers to my code benefit from manifest types because the types point them to the public interface expected of the objects that appear in the various contexts of my program.

I think this gets to the heart of the issue. Type information is incredibly useful, and helps the reader of a program in ways that a set of tests does not. When I write programs with a data-centric view of my abstractions, specifying types up front seems not only reasonable but almost a no-brainer. But when I move away from a data-centric view to a behavioral perspective, tests seem to offer a more flexible, comfortable way to indicate the requirements I place on my objects.

This is largely perception, of course, as a Java-style interface allows me to walk a middle road. Why not just define an interface for everything and have the benefits of both worlds? When I am programming bottom-up, as I often do, I think the answer comes down to the fact that I don't know what the interfaces should look like until I am done, and fiddling with manifest types along the way slows me down at best and distracts me from what is important at worst. By the time I know what my types should look like, they are of little use to me as a programmer; I'm on to the next discovery task.

I didn't realize that my mind would turn to type inference when I started this line of questioning. (Thinking and writing can be like that!) But now I am wondering how we can use type inference to figure out and display type information for readers of code when it will be useful to them.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 20, 2007 6:55 AM

Hype, or Disseminating Results?

The software world always seems to have a bandwagon du jour, which people are either riding or rebelling against. When extreme programming became the rage a while back, all I seemed to hear from some folks was that "agile" was a buzzword, a fad, all hype and no horse. Object-oriented programming went through its bandwagon phase, and Java had its turn. Lately it seems Ruby is the target of knowing whispers, that its popularity is only the result of good marketing, and it's not really all that different.

But what's the alternative? Let's see what Turing Award winner Niklaus Wirth has to say:

Why, then, did Pascal capture all the attention, and Modula and Oberon got so little? Again I quote Franz: "This was, of course, partially of Wirth's own making". He continues: "He refrained from ... names such as Pascal-2, Pascal+, Pascal 2000, but instead opted for Modula and Oberon". Again Franz is right. To my defense I can plead that Pascal-2 and Pascal+ had already been taken by others for their own extensions of Pascal, and that I felt that these names would have been misleading for languages that were, although similar, syntactically distinct from Pascal. I emphasized progress rather than continuity, evidently a poor marketing strategy.

But of course the naming is by far not the whole story. For one thing, we were not sufficiently active -- today we would say aggressive -- in making our developments widely known.

Good names and aggressive dissemination of ideas. (Today, many would call that "marketing".)

Wirth views Pascal, Modula, and Oberon as an ongoing development of 25 years that resulted in a mature, elegant, and powerful language, a language he couldn't even imagine back in 1969. Yet for many software folks, Modula was a blip on the scene, or maybe just a footnote, and Oberon was, well, most people just say "Huh?" And that's a shame, because even if we choose not to program in Oberon, we lose something by not understanding what it accomplished as a language capable of supporting teams and structured design across the full array of system programming.

I never faulted Kent Beck for aggressively spreading XP and the ideas it embodied. Whatever hype machine grew up around XP was mostly a natural result of people becoming excited by something that could so improve their professional practice. Yes, I know that some people unscrupulously played off the hype, but the alternative to risking hype is anonymity. That's no way to change the world.

I also applaud Kent for growing as he watched the results of XP out in the wild and for folding that growth back into his vision of XP. I wonder, though, if the original version of XP will be Pascal to XP2e's Modula.

By the way, the Wirth quote above comes from his 2002 paper Pascal and Its Successors. I enjoy hearing scientists and engineers tell the stories of their developments, and Wirth does a nice job conveying the context in which he developed Pascal, which had a great many effects in industry but more so in the academic world, and its progeny. As I read, I reacted to several of his remarks:

  • On Structured Programming:

    Its foundations reached far deeper than simply "programming without go to statements" as some people believed. It is more closely related to the top-down approach to problem solving.

    Yes, and in this sense we can more clearly see the different mindset between the Structured Programming crowd and the bottom-up Lisp and Smalltalk crowd.

  • On static type checking:

    Data typing introduces redundancy, and this redundancy can be used to detect inconsistencies, that is, errors. If the type of all objects can be determined by merely reading the program text, that is, without executing the program, then the type is called static, and checking can be performed by the compiler. Surely errors detected by the compiler are harmless and cheap compared to those detected during program execution in the field, by the customer.

    Well, yeah, but what if I write tests that let me detect the errors in house -- and tell more about my program and intentions than manifest types can?

  • On loopholes in a language:

    The goal of making the language powerful enough to describe entire systems was achieved by introducing certain low-level features.... Such facilities ... are inherently contrary to the notion of abstraction by high-level language, and should be avoided. They were called loopholes, because they allow to break the rules imposed by the abstraction. But sometimes these rules appear as too rigid, and use of a loophole becomes unavoidable. The dilemma was resolved through the module facility which would allow to confine the use of such "naughty" tricks to specific, low-level server modules. It turned out that this was a naive view of the nature of programmers. The lesson: If you introduce a feature that can be abused, then it will be abused, and frequently so!

    This is, I think, a fundamental paradox. Some rules, especially in small, elegant languages, don't just appear too rigid; they are. So we add accommodations to give the programmer a way to breach the limitation. But then programmers use these features in ways that circumvent the elegant theory. So we take them out. But then...

    The absence of loopholes is the acid test for the quality of a language. After all, a language constitutes an abstraction, a formal system, determined by a set of consistent rules and axioms. Loopholes serve to break these rules and can be understood only in terms of another, underlying system, an implementation.

    Programming languages are not (just) formal systems. They are tools used by people. An occasional leak in the abstraction is a small price to pay for making programmers' lives better. As Spolsky says, "So the abstractions save us time working, but they don't save us time learning."

    A strong culture is a better way to ensure that programmers don't abuse a feature to their detriment than a crippled language.

All that said, we owe a lot to Wirth's work on Pascal, Modula, and Oberon. It's worth learning.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 05, 2007 7:48 PM

Math Blues

I used to be a member of the IEEE Computer Society. A few years ago, a combination of factors (including rising dues, a desire to cut back on paper journal subscriptions, and a lack of time to read all the journals I was receiving) led me to drop my membership. In some ways, I miss receiving the institute's flagship publication, Spectrum. It was aimed more at "real" engineers than software types, but it was a great source of general information across a variety of engineering disciplines. My favorite column in Spectrum was Robert Lucky's "Reflections". It is written in a blog-like fashion, covering whatever neat ideas he has been thinking about lately in a conversational tone.

For some reason, this week I received a complimentary issue of Spectrum, and I immediately turned to "Reflections", which graced the last page. In this installment, Lucky writes about how math is disappearing from the practice of engineering, and this makes him sad. In the old days engineers did more of their own math, while these days they tend to focus on creating and using software to do those computations for them. But he misses the math, both doing it and thinking about it. Once he came to appreciate the beauty in the mathematics that underlies his corner of engineering, and now it is "as if my profession had slipped away while I wasn't looking". Thus his title, "Math Blues".

I appreciate how he must feel, because a lot of what used to be "fundamental" in computer science now seems almost quaint. I especially feel for folks who seem more attached to the old fundamentals, because today's world must seem utterly foreign to them. Of course, we can always try to keep our curriculum focused on those fundamentals, though students sometimes realize that we are living in the past.

I felt some math blues as I read Lucky's column, too, but of a different sort. Here is the passage that made me saddest:

I remember well the day in high school algebra class when I was first introduced to imaginary numbers. The teacher said that because the square root of a negative number didn't actually exist, it was called imaginary. That bothered me a lot. I asked, If it didn't exist, why give it a name and study it? Unfortunately, the teacher had no answers for these questions.

What great questions young Bob asked, and what an opportunity for his teacher to open the student to a world of possibilities. But it was a missed opportunity. Maybe the teacher just did not know how to explain such an advanced idea in a way that his young student could grasp. But I think it is likely that the teacher didn't understand, appreciate, or perhaps even know about that world.

Maybe you don't have to be a mathematician, engineer, or scientist to be able to teach math well. But one thing is for certain: not knowing mathematics at the level those professionals need creates a potential for shallowness that is hard to overcome. How much more attractive would a college major in science, math, or engineering look to high school students if they encountered deep and beautiful ideas in their courses -- even ideas that matter when we try to solve real problems?

Outreach from the university into our school systems can help. Many teachers want to do more and just need more time and other resources to make it happen. I think, though, that a systemic change in how we run our schools and in what we expect of our teaching candidates would go even farther.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

August 31, 2007 5:39 PM

Good Writing, Good Programming

This week I ran across a link to an old essay called Good Writing, by Marc Raibert. By now I shouldn't be so happy to be reminded how much good programming practice is similar to good writing in general. But I usually am. The details of Raibert's essay are less important to me than some of his big themes.

Raibert starts with something that people often forget. To write well, one must ordinarily want to write well and believe that one can. Anyone who teaches introductory computer science knows how critical motivation and confidence are. Many innovations in CS1 instruction over the years have been aimed at helping students to develop confidence in the face of what appear to be daunting obstacles, such as syntax, rigor, and formalism. Much wailing and gnashing of teeth has followed the slowly dawning realization that students these days are much less motivated to write computer programs than they have been over most of the last thirty years. Again, many innovations and proposals in recent years have aimed at motivating students -- more engaging problems, media computation, context in the sciences and social sciences, and so on. These efforts to increase motivation and confidence are corporate efforts, but Raibert reminds us that, ultimately, the individual who would be a writer must hold these beliefs.

After stating these preconditions, Raibert offers several pieces of advice that apply directly to computing. Not surprisingly, my favorite was his first: Good writing is bad writing that was rewritten. This fits nicely in the agile developer's playbook. I think that few developers or CS teachers are willing to say that it's okay to write bad code and then rewrite. Usually, when folks speak in terms of do the simplest thing that will work and refactor mercilessly, they do not usually mean to imply that the initial code was bad, only that it doesn't worry inordinately about the future. But one of the primary triggers for refactoring is the sort of duplication that occurs when we do the simplest thing that will work without regard for the big picture of the program. Most will agree that most such duplication is a bad thing. In these cases, refactoring takes a worse program and creates a better one.
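
As a toy illustration of the kind of duplication that "the simplest thing that will work" produces, and of the rewrite that removes it -- the example is entirely mine, not Raibert's:

    # First draft: the simplest thing that works, with obvious duplication.
    def report_students(students):
        print("Students")
        print("--------")
        for name in students:
            print(name)

    def report_courses(courses):
        print("Courses")
        print("-------")
        for title in courses:
            print(title)

    # The rewrite: the duplication is factored out, and the program reads better.
    def report(heading, items):
        print(heading)
        print("-" * len(heading))
        for item in items:
            print(item)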

Allowing ourselves to write bad code empowers us, just as it empowers writers of text. We need not worry about writing the perfect program, which frees us to write code that just works. Then, after it does, we can worry about making the program better, both structurally and stylistically. But we can do so with the confidence that comes from knowing that the substance of our program is on track.

Of course, starting out with the freedom to write bad code obligates us to re-write, to refactor, just as it obligates writers of text to re-write. Take the time! That's how we produce good code reliably: write and re-write.

I wish more of my freshmen would heed this advice:

The first implication is that when you start a new [program], there is nothing wrong with using bad writing. Your goal when you start is to get your ideas down on paper in any form you can.

For novice programmers, I do not recommend writing ungrammatical or "stream of consciousness" code, but I do encourage them to take the ideas they have after thinking about the problem and express them in code. The best way to find out if an idea is a good one is to see it run in code.

Raibert's other advice also applies. When I read Spill the beans fast, I think of making my code transparent. Don't hide its essence in subterfuge that makes me seem clever; push its essence out where all can see it and understand the code. Many of the programmers whose code I respect most, such as Ward Cunningham, write code that is clear, concise, and not at all clever. That's part of what makes it beautiful.

Don't get attached to your prose is essential when writing prose, and I think it applies to code as well. Just because you wrote a great little method or class yesterday doesn't mean that it should survive in your program of tomorrow. While programming, you discover more about your problem and solution than you knew yesterday. I love Raibert's idea of a PRIZE_WINNING_STUFF.TXT file. I have a directory labeled playground/ where I place all the little snippets I've built as simple learning experiments, and now I think I need to create a winners/ directory right next to it!

Raibert closes with the advice to get feedback and then to trust your readers. A couple of months back I had a couple of entries on learning from critics, with different perspectives from Jerry Weinberg and Alistair Cockburn. That discussion was about text, not code (at least on the surface). But one thing that we in computer science need to do is to develop a stronger culture of peer review of our code. The lack of one is one of the things that most differentiates us from other engineering disciplines, to which many in computing look for inspiration. I myself look more to the creative disciplines for inspiration than to engineering, but on this the creative and engineering disciplines agree: getting feedback, using it to improve the work, and using it to improve the person who made the work are essential. I think that finding ways to work the evaluation of code into computer science courses, from intro courses to senior project courses, is a path to improving CS education.

PLoP 2007

This last bit of advice from Raibert is also timely, if bittersweet for me... In just a few days, PLoP 2007 begins. I am quite bummed that I won't be able to attend this year, due to work and teaching obligations. PLoP is still one of my favorite conferences. Patterns and writers' workshops are both close to my heart and my professional mind. If you have never written a software pattern or been to PLoP, you really should try both. You'll grow, if only when you learn how a culture of review and trust can change how you think about writing.

OOPSLA 2007

The good news for me is that OOPSLA 2007, which I will be attending, features a Mini-PLoP. This track consists of a pattern writing bootcamp, a writers' workshop for papers in software development, and a follow-up poster. I hope to attend the writers' workshop session on Monday, even if only as a non-author who pledges to read papers and provide feedback to authors. It's no substitute for PLoP, but it's one way to get the feeling.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

August 28, 2007 9:44 PM

Refactoring, Beyond Software

Today I "refactored" the web page for a class session, along with files that support it. I used scare quotes there, but my process really was affected by the refactoring that we do to our code. I can probably describe it in terms of code refactorings.

Here is what I started with: a web page that loads thirteen images from a subdirectory named session02/, and a web page that loads two images from a subdirectory named session03/.

Due to changes in the timing of presentation, I needed to move a big chunk of HTML text from Session 3 to Session 2, including the text that loads the images from session03/.

In the old days, I would have cut the text from Session 3, pasted it into Session 2, renamed the images in session03/ so that they did not clash with files in session02/, moved them to session02/, and deleted session03/.

Along the way, there are a number of mistakes I could make, from inadvertently overwriting a file to losing text in transit by bungling a few keystrokes in emacs. I have done that before, and ended up spending precious time trying to recover the text and files I had lost.

I did something different this time. I didn't move and delete files or text. First, I copied the text from one page to the other, allowing Session 2 to load images from the existing session03/ directory. I tested this change by loading the page to see that all the images still loaded in the right places. Only then did I delete the text from Session 3. Next, I copied the images from session03/ to session02/, using new names, and modified the web page to load the new images. I tested this change by re-loading the web page to verify the lecture. Only then did I delete session03/ and the images it contained.

Everything went smoothly. I felt so good that I even made a subdirectory named sample-compiler/ in the session02/ directory and moved the images in session02/ associated with the sample compiler -- one of the original session02/ images plus the two originally in session03/ -- down into the new subdirectory. I made this change in a similarly deliberate and safe way, making copies and running tests before removing any existing functionality.

When I got done, I felt as if I had applied two common code refactorings: Move Method and Extract Subclass. The steps were remarkably similar.

My description may sound as if this set of changes took me a long time to effect, but it didn't. Perhaps it took a few seconds longer than if I had executed a more direct path without error, but... I moved much more confidently in this approach, and I did not make any errors. The trade-off of deliberate action as insurance against the cost of recovering from errors was a net gain.

I refactored my document -- really, a complex of HTML files, subdirectories, and images -- using steps like those we learn in Fowler and Kerievsky: small, seemingly too small in places, but guaranteed to work while "passing tests" along the way. My test, reloading the web page and examining the result after each small change, would be better if automated, but frankly the task here is simple enough that the "inspect the output" method works just fine.

The ideas we discover in developing software often apply outside the world of software. I'm not sure this is an example of what people call computational thinking, except in the broadest sense, but it is an example of how an idea we use in designing and implementing programs applies to the design and implementation of other artifacts. We really should think about how to communicate these ideas to the other folks who can use them.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

August 24, 2007 12:20 PM

You Want Security?

Here is security, which comes along with my new favorite error message:

Your password must be at least 18770 characters and cannot repeat any of your previous 30689 passwords. Please type a different password. Type a password that meets these requirements in both text boxes.

Oh, yeah -- be sure to type it twice.

I leave it to the TheoryCS guys at Ernie's 3D Pancakes, Computational Complexity, and The Geomblog to give us the mathematical picture on how secure an 18,770-character password is, and what the implications are for installing SP1, before which you could get by with a passcode of a mere 17,145 characters.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

August 22, 2007 8:08 PM

Seeing and Getting the Power

After six months of increased blogging output, if not blog quality, August has hit me head on. Preparing for vacation. Vacation. Digging out of backlog from vacation. Getting ready for fall semester. The start of the semester.

At least now I have some time and reason to spend energy on my compilers course!

The algorithm geeks among you might enjoy this frequently-referenced and very cool demo of a graphics technique called Content-Aware Image Sizing, presented by Ariel Shamir at SIGGRAPH earlier this month (in San Diego just before I arrived for the above-mentioned vacation). I think that this will make a neat example for my algorithms course the next time I teach it. (It's on tap for next spring.)

Algorithmically, the idea is cool but seems straightforward enough. To me, what is most impressive is having the idea in the first place. One of the things I love about following research in other areas of computer science -- and even in areas far beyond CS -- is seeing how thinkers who understand a domain deeply create and apply new ideas. This particular idea will be so useful on the web and in media of the future.
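As I understand it from the demo, the heart of the technique is a dynamic programming pass: assign each pixel an "energy", find the connected vertical seam of least total energy, and remove it, one pixel per row. Here is a rough sketch of that idea -- my own toy rendering, not the authors' algorithm or code:

    # Remove one low-energy vertical seam from a grayscale image (2-D numpy array).
    import numpy as np

    def remove_one_seam(gray):
        gy, gx = np.gradient(gray.astype(float))
        energy = np.abs(gy) + np.abs(gx)             # simple gradient-magnitude energy
        cost = energy.copy()
        rows, cols = gray.shape
        for i in range(1, rows):                     # cost[i, j] = cheapest seam ending at (i, j)
            for j in range(cols):
                lo, hi = max(j - 1, 0), min(j + 2, cols)
                cost[i, j] += cost[i - 1, lo:hi].min()
        seam = [int(np.argmin(cost[-1]))]            # backtrack from the cheapest bottom cell
        for i in range(rows - 2, -1, -1):
            j = seam[-1]
            lo, hi = max(j - 1, 0), min(j + 2, cols)
            seam.append(lo + int(np.argmin(cost[i, lo:hi])))
        seam.reverse()
        return np.array([np.delete(row, j) for row, j in zip(gray, seam)])

Repeat the process to shrink the image by as many columns as you like, which is what makes the resizing "content-aware": the boring, low-energy parts of the picture disappear first.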

The guy who sent the link said it well:

SIGGRAPH has a way of reminding me that in many ways I'm the dumb kid at the back of the classroom.

I feel the same way sometimes, but I usually don't mind. There is much to learn.

Still, I hope that by the end of this semester my compilers students don't feel this way about compilers. I'd like them to appreciate that a compiler is "just another program" -- that they can learn techniques which make building a compiler straightforward and even fun. What looks like magic becomes understanding, without losing its magical aura.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

August 08, 2007 3:40 PM

Let's Kill Dick and Jane

No, I've not become homicidal. That is the title of a recent book about the Open Court Publishing Company, which according to its subtitle "fought the culture of American education" by trying to change how our schools teach reading and mathematics. Blouke Carus, the creator of Open Court's reading program, sought to achieve an enviable goal -- engagement with and success in the world of great ideas for all students -- in a way that was beholden to neither the traditionalist "back to basics" agenda nor the progressivist "child-centered" agenda. Since then, the reading series has been sold to McGraw-Hill.

Thanks to the creator of the TeachScheme! project, Matthias Felleisen, I can add this book to my list of millions. He calls Let's Kill Dick and Jane "TeachScheme! writ large". Certainly there are differences between the K-8 education culture and the university computer science culture, but they share enough commonalities to make reform efforts similarly difficult to execute. As I have noted before, universities and their faculty are a remarkably conservative lot. TeachScheme! is a principled, comprehensive redefinition of introductory programming education. In arguing for his formulation, Felleisen goes so far as to explain why Structure and Interpretation of Computer Programs -- touted by many, including me, as the best CS book ever written -- is not suitable for CS 1. (As much as I like SICP, Matthias is right.)

But TeachScheme! has not succeeded in the grand way its implementors might have hoped, for many of the reasons that Open Court's efforts have come up short of its founders' goals. Some of the reasons are cultural, some are historical, and some are probably strategic.

The story of Open Court is of immediate interest to me because of our state's effort to change K-12 math and science education in a fundamental way, a reform effort in which my university has a leading role and in which my department and I have a direct interest. We believe in the need for more and better computer scientists and software developers, but university CS enrollments remain sluggish. Students who are turned off to science, math, and intellectual ideas in grade school aren't likely to select CS as a major in college... Besides, like Carus, I have a great interest in raising the level of science and math understanding across the whole population.

This book taught me a lot about what I had understood only incompletely as an observer of our education system. And I appreciated that it avoided the typical -- and wrong -- conservative/liberal dichotomy between the traditional and progressive approaches. America's education culture is a beast all its own, essentially anti-intellectual and exhibiting an inertia borne out of expectations, habit, and a lack of will and time to change. Changing the system will take a much more sophisticated and patient approach than most people usually contemplate.

Though I have never developed a complete curriculum for CS 1 as Felleisen has, I have long aspired to teaching intro CS in a more holistic way, integrating the learning of essential tools with higher-level design skills, built on the concept of a pattern language. So Open Court's goals, methods, and results all intrigue me. Here are some of the ideas that caught my attention as I read the story of Open Court:

  • On variety in approach:

    A teacher must dare to be different! She must pull away from monotonous repetition of, 'Today we are going to write a story.' Most children view that announcement with exactly what it deserves, and there are few teachers who are not aware of what the reactions are.

    s/story/program/* and s/children/students/* to get a truth for CS instructors such as me.

  • A student is motivated to learn when her activities scratch her own itch.

  • Teaching techniques that may well transfer to my programming classroom: sentence lifting, writing for a real audience, proofreading and revising programs, and reading good literature from the beginning.

  • On a curriculum as a system of instruction:

    The quality of the Open Court program was a substantive strength and a marketing weakness. It required teachers to be conversant with a variety of methods. And the program worked best when used as a system... Teachers accustomed to trying a little of this and a little of that were likely to be put off by an approach that did not lend itself to tinkering.

    I guess I'm not the only person who has trouble sticking to the textbook. To be fair to us tinkerers, systematic integrated instructional design is so rare as to make tinkering a reasonable default stance.

  • On the real problem with new ideas for instruction:

    Thus Open Court's usual problem was not that it contradicted teachers' ideology, but that it violated their routine.

    Old habits die hard, if at all.

  • In the study of how children learn to comprehend text, schema theory embodied the "idea that people understand what they read by fitting it into patterns they already know". I believe that this is largely true for learning to understand programs, too.

  • Exercising existing skills does not constitute teaching.

  • Quoting a paper by reading scholars Bereiter and Scardamalia, the best alternative model for a school is

    ... the learned professions ...[in which] adaptation to the expectations of one's peers requires continual growth in knowledge and competence.

    In the professions, we focus not only on level of knowledge but also on the process of continuously getting better.

  • When we despair of revolutionary change:

    There are circumstances in which it is best to package our revolutionary aspirations in harmless-looking exteriors.... We should swallow our pride and let [teachers] think we are bland and acceptable. We should fool them and play to their complacent pieties. But we should never for a moment fool ourselves.

    Be careful what you pretend to be.

  • Too often, radical new techniques are converted into "subject matter" to be added to the curriculum. But in that form they usually become overgeneralized rules that fail too often to be compelling. More insidious is that this "makes it possible to incorporate new elements into a curriculum without changing the basic goals and strategies of the curriculum". Voilà -- change with no change.

    Then we obsess about covering all the material, made harder by the inclusion of all this new material.

    I've seen this happen to software patterns in CS classrooms. It was sad. I usually think that we elementary patterns folks haven't done a good enough job yet, but Open Court's and TeachScheme!'s experiences do not encourage me to think that it will be easy to do well enough.

  • On the value of a broad education: This book tells of how the people at Open Court who were focused exclusively on curriculum found themselves falling prey to invalid plans from marketers.

    In an academic metaphor, [Open Court's] people had been the liberal-arts students looking down on the business school. But now that they needed business-school expertise, they were unable to judge it critically.

    Maybe a so-called "liberal education" isn't broad enough if it leaves the recipient unable to think deeply in an essential new context. In today's world, both business and technology are essential components of a broadly applicable education.

  • On the need for designing custom literature suitable for instruction of complete novices:

    Teaching any skill requires repetition, and few great works of literature concentrate on long "e".

    My greatest inspiration in this vein is the Suzuki literature developed for teaching violin and later piano and other instruments. I've experienced the piano literature first hand and watched my daughters work deep into the violin literature. At the outset, both use existing works (folk tunes, primarily) whenever appropriate, but they also include carefully designed pieces that echo the great literature we aspire to for our students. As the student develops technical skill, the teaching literature moves wholly into the realm of works with connection to the broader culture. My daughters now play real violin compositions from real composers.

  • But designing a literature and curriculum for students is not enough. Ordinary teachers can implement more aggressive and challenging curricula only with help: help learning the new ideas, help implementing the ideas in classroom settings, and help breaking the habits and structures of standing practice. Let's Kill Dick and Jane is full of calls from teachers for more help in understanding Open Court's new ideas and in implementing them well in the face of teachers' lack of experience and understanding. Despite its best attempts, Open Court never quite did this well enough.

    TeachScheme! is a worthy model in this regard. It has worked hard to train teachers in its approach, to provide tangible support, and to build a community of teachers.

    In my own work on elementary patterns, my colleague and friend Robert Duvall continually reminds us all of the need for providing practical support to instructors who might adopt our ideas -- if only they had the instructional tools to make a big change in how they teach introductory programming.

  • ... which leads me to a closing observation. One of the big lessons I take from the book is that to effect change, one must understand and confront issues that exist in the world, not the issues that exist only in our idealized images of the world or in what are basically the political tracts of activists.

    In an odd way, this reminds me of Drawing on the Right Side of the Brain, with its precept that we should draw what we see, not what we think we see.

Let's Kill Dick and Jane is a slim volume, a mere 155 pages, and easy to read. It's not perfect, neither in its writing nor in its analysis, but it tells an important story well. I recommend it to anyone with aspirations of changing how we teach computer science to students in the university or high school. I also recommend it to anyone who is at all interested in the quality of our educational system. In a democracy such as the U.S., that should be everyone.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

July 23, 2007 1:59 PM

Intelligent Game Playing in the News

Two current events have me thinking about AI, one good and one sad.

First, after reporting last week that checkers has been solved by Jonathan Schaeffer's team at the University of Alberta, this week I can look forward to the Man vs. Machine Poker Challenge at AAAI'07. The computer protagonist, Polaris, also hails from Alberta and Schaeffer's poker group. In this event, which gets under way shortly in Vancouver, Polaris will play a duplicate match against two elite human pros, Phil Laak and Ali Eslami. Laak and Eslami will play opposite sides of the same deal against Polaris, in an attempt to eliminate the luck of the draw from the result.

I don't know much about computer card-playing. Back when I was teaching AI in the mid-1990s, I used Matthew Ginsberg's text, and from his research learned a bit about programs that play bridge. Of course, bridge players tend to view their game as a more intellectual task than poker (and as more complex than, say, chess), whereas poker introduces the human element of bluffing. It will be fun seeing how a "purely rational" being like Polaris bluffs and responds to bluffs in this match. If poker is anything at all like chess, I figure that the program's dispassionate stance will help it respond to bluffs in a powerful way. Making bluffs seems a different animal altogether.

I wish I could be in Vancouver to see the matches. Back in 1996 I was fortunate to be at AAAI'96 in Philadelphia for the first Kasparov-Deep Blue match. The human champ won a close match that year before losing to Deep Blue the next. We could tell from Kasparov's demeanor and behavior during this match, as well as from his public statements, that he was concerned that humans retain their superiority over machines. Emotion and mental intimidation were always a part of his chess.

On the contrary, former World Series of Poker champion Laak seems unconcerned at the prospect that Polaris might beat him in this match, or soon; indeed, he seems to enjoy the challenge and understand the computational disadvantage that we humans face in these endeavors. That's a healthier attitude, both long term and for playing his match this week. But I appreciated Kasparov's energy during that 1996 match, as it gave us demonstrative cues about his state of mind. I'll never forget the time he made a winning move and sat back smugly to put his wristwatch back on. Whenever Garry put his watch back on, we knew that he thought he was done with the hard work of winning the game.

The second story is sadder. Donald Michie, a pioneer in machine learning, has died. Unlike with many of the other founders of AI, my first love in computing, I never had any particular connection to Michie or his work, though I knew his name well from the series of volumes on machine learning that he compiled and edited, as they are staples of most university libraries. But then I read in his linked Times On-Line article:

In 1960 he built Menace, the Matchbox Educable Noughts and Crosses Engine, a game-playing machine consisting of 300 matchboxes and a collection of glass beads of different colours.

We Americans know Noughts and Crosses as tic-tac-toe. It turns out that Michie's game-playing machine -- one that needed a human CPU and peripherals in order to run -- was the inspiration for an article by Martin Gardner, which I read as a sophomore or junior in high school. This article was one of my first introductions to machine learning and fueled the initial flame of my love for AI. I even built Gardner's variant on Michie's machine, a set of matchboxes to play Hexapawn and watched it learn to play a perfect game. It was no Chinook or Deep Blue, but it made this teenager's mind marvel at the possibilities of machine intelligence.
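For readers who have never seen the trick, here is a minimal sketch of the matchbox idea in code -- my own toy rendering, not Michie's or Gardner's exact reinforcement rules, and with the game logic itself (legal moves, detecting wins) left out:

    # One "matchbox" of beads per position, one bead color per legal move.
    import random
    from collections import defaultdict

    boxes = defaultdict(dict)                        # position -> {move: bead count}

    def choose(position, legal_moves):
        box = boxes[position]
        for m in legal_moves:
            box.setdefault(m, 2)                     # seed each move with a couple of beads
        beads = [m for m, n in box.items() for _ in range(n)]
        if not beads:                                # an empty box meant "resign" in MENACE
            return random.choice(legal_moves)
        return random.choice(beads)                  # draw a bead at random to pick the move

    def reinforce(history, won):                     # history: (position, move) pairs played
        for position, move in history:
            boxes[position][move] = max(boxes[position][move] + (1 if won else -1), 0)

Play enough games against it, punishing losses and rewarding wins, and the bad moves literally disappear from the boxes.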

So, I did have a more direct connection to Michie, and had simply forgotten! RIP, Dr. Michie.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

July 19, 2007 3:03 PM

Checkers -- Solved!

I told the story of Jonathan Schaeffer's SIGCSE talk on the history of Chinook back in March. In that talk, he said that his team was a few months away from solving checkers. They have done it, as this Scientific American article reports:

Jonathan Schaeffer's quest for the perfect game of checkers has ended. ... after putting dozens of computers to work night and day for 18 years -- jump, jump, jump -- he says he has solved the game -- king me!. "The starting position, assuming no side makes a mistake, is a draw," he says.

The proof is on-line, but the best proof is Chinook, Schaeffer's checker-playing program that is now the one player in the world that will never lose a game of checkers. You can still play Chinook on-line, if you got game.

This is easily the most complex game to be solved by computation, and the result depends on several different areas of computer science: AI, distributed computing, parallel programming, and databases most prominent among them. Chinook's endgame database now contains approximately 39 trillion positions and is the practical keystone of its play. Chinook searches deep, like many masters, but now it can relatively quickly terminate its analysis not with a heuristic static evaluation function but with a database look-up that guarantees correctness. So even analytical mistakes early in the game can be corrected as soon as the program reaches a solved position.
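The shape of that search is easy to sketch, even if Chinook's real machinery is vastly more sophisticated. In the toy version below, the helper functions passed in are placeholders of my own, not anything from Chinook; the point is simply that a database hit ends the analysis with a proven value rather than a heuristic guess:

    # A negamax-style search that prefers exact endgame-database values to guesses.
    def search(position, depth, db, moves_from, evaluate):
        if position in db:
            return db[position]                      # exact, proven value: stop analyzing
        successors = moves_from(position)
        if depth == 0 or not successors:
            return evaluate(position)                # otherwise fall back on a heuristic guess
        return max(-search(p, depth - 1, db, moves_from, evaluate)
                   for p in successors)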

I am a game player, primarily chess, and I know some folks who will call this database an unfair advantage. But the best game players I know have always capitalized on their better memories and better computational skills; I don't know why Chinook or any other program should be held to a different standard. But what a memory that is!

I am finally ready to believe that, if Chinook were to play Marion Tinsley -- may he rest in peace [*] -- in another match, it would not lose, and most likely would win. Even the great Tinsley made an occasional error.

And if you have not yet read Schaeffer's book One Jump Ahead on my earlier recommendation, well, shame on you. Do so now.

But is checkers really dead, as the popular press is now saying? Not at all. It is still a fun game for people to play, and a great mental battlefield. It's just that now we have an objective standard against which to measure ourselves.

----

[*] Or should that be "rest in piece"?


Posted by Eugene Wallingford | Permalink | Categories: Computing

July 18, 2007 4:20 PM

Mathematics as "Social Construct"

Many folks like to make analogies between mathematics and art, or computer science and creative disciplines. But there are important ways in which these analogies come up short. Last time, I wrote about Reuben Hersh's view of how to teach math right. The larger message of the article, though, was Hersh's view of mathematics as a social construct, a creation of our culture much as law, religion, and money are.

One of the neat things about Edge is that it not only gives interviews with thinkers but also asks thinkers from other disciplines to comment on the interviews. In the same issue as Hersh's interview is a response by Stanislas Dehaene, a mathematician-turned-neuroscientist who has studied the cognition of reading and number. He agrees that the view of number as a Platonic ideal is untenable but then draws on his knowledge of cognitive science to remind us that math is not like art and religion as social constructs in two crucial ways: universality and effectiveness. First, there are some mathematical universals to which all cultures have converged, and for which we can construct arguments sufficient to convince any person:

If the Pope is invited to give a lecture in Tokyo and attempts to convert the locals to the Christian concept of God as Trinity, I doubt that he'll convince the audience -- Trinity just can't be "proven" from first principles. But as a mathematician you can go to any place in the world and, given enough time, you can convince anyone that 3 is a prime number, or that the 3rd decimal of Pi is a 1, or that Fermat's last theorem is true.

I suspect that some cynics might argue that this is true precisely because we define mathematics as an internally consistent set of definitions and rules -- as a constructed system. Yet I myself am sympathetic to claims of the universality of mathematics beyond social construction.

Second, mathematics seems particularly effective as the language of science. Dehaene quotes Einstein, "How is it possible that mathematics, a product of human thought that is independent of experience, fits so excellently the objects of physical reality?" Again, a cynic might claim that much of mathematics has been defined for the express purpose of describing our empirical observations. But that really begs the question. What are the patterns common to math and science that make this convergence convenient, even possible?

Dehaene's explanation for universality and effectiveness rests in evolutionary biology -- and patterns:

... mathematical objects are universal and effective, first, because our biological brains have evolved to progressively internalize universal regularities of the external world ..., and second, because our cultural mathematical constructions have also evolved to fit the physical world. If mathematicians throughout the world converge on the same set of mathematical truths, it is because they all have a similar cerebral organization that (1) lets them categorize the world into similar objects ..., and (2) forces them to find over and over again the same solutions to the same problems ....

The world and our brains together drive us to recognize the patterns that exist in the world. I am reminded of a principle that I think I first learned from Patrick Henry Winston in his text Artificial Intelligence, called The Principle of Convergent Intelligence:

The world manifests constraints and regularities. If an agent is to exhibit intelligence, then it must exploit these constraints and regularities, no matter the nature of its physical make-up.

The close compatibility of math and science marveled at by Einstein and Dehaene reminds me of another of Winston's principles, Winston's Principle of Parallel Evolution:

The longer two situations have been evolving in the same way, the more likely they are to continue to evolve in the same way.

(If you never had the pleasure of studying AI from Winston's text, now in its third edition, then you missed the joy of his many idiosyncratic principles. They are idiosyncratic in that you'll read them nowhere else, certainly not under the names he gives them. But they express truths he wants you to learn. They must be somewhat effective, if I remember some from my 1986 grad course and from teaching out of his text in the early- to mid-1990s. I am sure that most experts consider the text outdated -- the third edition came out in 1992 -- but it still has a lot to offer the AI dreamer.)

So, math is more than "just" a mental construct because it expresses regularities and constraints that exist in the real world. I suppose that this leaves us with another question: do (or can) law and religion do the same, or do they necessarily lie outside the physical world? I know that some software patterns folks will point us to Christopher Alexander's arguments on the objectivity of art; perhaps our art expresses regularities and constraints that exist in the real world, too, only farther from immediate human experience.

These are fun questions to ponder, but they may not tell us much about how to do better mathematics or how to make software better. Those of us who make analogies between math (or computer science) and the arts are probably wise to remember that math and science reflect patterns in our world, patterns tied more directly to our immediate experience than those of some of our other pursuits.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

July 17, 2007 8:17 PM

Mathematics, Problems, and Teaching

I'm surprised by how often I encounter the same topic in two different locations on the same day. The sources may be from disparate times, but they show up on my radar nearly coincident. Coincidence, perhaps.

Yesterday I ran across a link to an old article at Edge called What Kind of a Thing is a Number?. This is an interview with Reuben Hersh, a mathematician with a "cultural" view of mathematics. More on that notion later, but what caught my eye was Hersh's idea of how to teach math right:

A good math teacher starts with examples. He first asks the question and then gives the answer, instead of giving the answer without mentioning what the question was. He is alert to the body language and eye movements of the class. If they start rolling their eyes or leaning back, he will stop his proof or his calculation and force them somehow to respond, even to say "I don't get it." No math class is totally bad if the students are speaking up. And no math lecture is really good, no matter how beautiful, if it lets the audience become simply passive. Some of this applies to any kind of teaching, but math unfortunately is conducive to bad teaching.

Computer science isn't mathematics, but it, too, seems conducive to a style of teaching in which students are disengaged from the material. Telling students how to write a program is usually not all that enlightening for students; they need to do it to understand. Showing students how to write a program may be a step forward, because at least then they see a process in action and may have the gumption to ask "what?" and "why?" at the moments they don't understand. But I find that students often tune out when I demonstrate for too long how to do something. It's too easy for me to run ahead of what they know and can do, and besides, how I do something may not click in the right way with how they think and do.

The key for Hersh is "interaction, communication". But this creates a new sort of demand on instructors: they have to be willing to shut up, maybe for a while. This is uncomfortable for most faculty, who learned in classrooms where professors spoke and students took notes. Hersh tells a story in which he had to wait and wait, and then sit down and wait some more.

It turned out to be a very good class. The key was that I was willing to shut up. The easy thing, which I had done hundreds of times, would have been to say, "Okay, I'll show it to you." That's perhaps the biggest difficulty for most, nearly all, teachers--not to talk so much. Be quiet. Don't think the world's coming to an end if there's silence for two or three minutes.

This strategy presumes that students have something to say, and just need encouragement and time to say it. In a course like Hersh's, on problem solving for teachers, every student has a strategy for solving problems, and if the instructor's goal is to bring different strategies out into the open in order to talk about them, that works great. But what about, say, my compilers course? This isn't touchy-feely; it has well-defined techniques to learn and implement. Students have to understand regular expressions and finite-state machines and context-free grammars and automata and register allocation algorithms... Do I have time to explore the students' different approaches, or even care what they are?

I agree with Hersh: If I want my students actually to learn how to write a compiler, then yes, I probably want to know how they are thinking, so that I can help them learn what they need to know. How I engage them may be different than sending them to the board to offer their home-brew approach to a problem, but engagement in the problems they face and with the techniques I'd like them to learn is essential.

This sort of teaching also places a new demand on students. They have to engage the material before they come to class. They have to read the assigned material and do their homework. Then, they have to come to class prepared to be involved, not just lean against a wall with a Big Gulp in their hands and their eyes on the clock. Fortunately, I have found that most of our students are willing to get involved in their courses and do their part. It may be a cultural shift for them, but they can make it with a little help. And that's part of the instructor's job, yes -- to help students move in the right direction?

That was one article. Later the same afternoon, I received ACM's regular mailing on news items and found a link to this article, on an NSF award received by my friend Owen Astrachan to design a new curriculum for CS based on... problems. Owen's proposal echoes Hersh's emphasis on problem-before-solution:

Instead of teaching students a lot of facts and then giving them a problem to solve, this method starts out by giving them a problem.... Then they have to go figure out what facts they need to learn to solve it.

This approach allows students to engage a real problem and learn what they need to know in a context that matters to them, to solve something. In the article, Owen even echoes the new demand made of instructors, being quiet:

With problem-based learning, the faculty person often stands back while you try to figure it out, though the instructor may give you a nudge if you're going the wrong way.

... and the new demand made of students, to actively engage the material:

And [the student] might spend a couple of weeks on a problem outside of class.... So you have to do more work as a student. It's kind of a different way of learning.

The burden on Astrachan and his colleagues on this grant is to find, refine, and present problems that engage students. There are lots of cool problems that might excite us as instructors -- from the sciences, from business, from the social sciences, from gaming and social networking and media, and so on -- but finding something that works for a large body of students over a long term is not easy. I think Owen understands this; this is something he has been thinking about for a long time. He and I have discussed it a few times over the years, and his interest in redesigning how we teach undergraduate CS is one of the reasons I asked him to lead a panel at the OOPSLA 2004 Educators' Symposium.

Frank Oppenheimer's Exploratorium

This is also a topic I've been writing about for at least that long, including entries here on how Problems Are The Thing and before that on Alan Kay's talks ... at OOPSLA 2004! I think that ultimately Kay has the right idea in invoking Frank Oppenheimer's Exploratorium as inspiration: a wide-ranging set of problems that appeal to the wide-ranging interests of our students while at the same time bringing them "face to face with the first simple idea of science: The world is not always as it seems." This is a tall challenge, one better suited to a community working together (if one by one) than to a single researcher or small group alone. The ChiliPLoP project is one place where my colleagues and I have been chipping away at it slowly on the fringes. I am looking forward to the pedagogical infrastructure and ideas that come from Owen's NSF project. If anyone can lay a proper foundation for problems as the centerpiece of undergrad CS, he and his crew can.

Good coincidences. Challenging coincidences.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 10, 2007 5:53 PM

Thinking Ahead to OOPSLA

OOPSLA 2007 logo

I haven't written much in anticipation of OOPSLA 2007, but not because I haven't been thinking about it. In years when I have had a role in content, such as the 2004 and 2005 Educators' Symposia or even the 2006 tutorials track, I have been excited to be deep in the ideas of a particular part of OOPSLA. This year I have blogged just once, about the December planning meeting. (I did write once from the spring planning meeting, but about a movie.) My work this year for the conference has been in an administrative role, as communications chair, which has focused on sessions and schedules and some web content. To be honest, I haven't done a very good job so far, but that is a subject for another post. For now, let's just say that I have not been a very good Mediator nor a good Facade.

I am excited about some of the new things we are doing this year to get the word out about the conference. At the top of this list is a podcast. Podcasts have been around for a while now, but they are just now becoming a part of the promotional engine for many organizations. We figured that hearing about some of the cool stuff that will happen at OOPSLA this year would complement what you can read on the web. So we arranged to have two outfits, Software Engineering Radio and DimSumThinking, co-produce a series of episodes on some of the hot topics covered at this year's conference.

Our first episode, on a workshop titled No Silver Bullet: A Retrospective on the Essence and Accidents of Software Engineering, organized by Dennis Mancl, Steven Fraser, and Bill Opdyke, is on-line at the OOPSLA 2007 Podcast page. Stop by, give it a listen, and subscribe to the podcast's feed so that you don't miss any of the upcoming episodes. (We are available in iTunes, too.) We plan to roll new interviews out every 7-10 days for the next few months. Next up is a discussion of a Scala tutorial with Martin Odersky, due out on July 16.

If you would like to read a bit more about the conference, check out conference chair Richard Gabriel's The (Unofficial) How To Get Around OOPSLA Guide, and especially his ooPSLA Impressions. As I've written a few times, there really isn't another conference like OOPSLA. Richard's impressions page does a good job of communicating just how true that is, mostly in the words of people who've been to OOPSLA and seen it for themselves.

While putting together some of his podcast episodes, Daniel Steinberg of DimSumThinking ran into something different than usual: fun.

I've done three interviews for the oopsla podcast -- every interviewee has used the same word to describe OOPSLA: fun. I just thought that was notable -- I do a lot of this sort of thing and that's not generally a word that comes up to describe conferences.

And that fun comes on top of the ideas and the people you will encounter, which will stretch you. We can't offer a Turing Award winner every year, but you may not notice amid all the intellectual ferment. (And this year, we can offer John McCarthy as an invited speaker...)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

July 09, 2007 7:28 PM

Preparing for Fall Compilers Course (Almost)

Summer is more than half over. I had planned by now to be deep in planning for my fall compilers course, but the other work has kept me busy. I have to admit also to suffering from a bout of intellectual hooky. Summer is a good time for a little of that.

Compilers is a great course, in so many ways. It is one of the few courses of an undergraduate's curriculum in which students live long enough with code that is big enough to come face-to-face with technical debt. Design matters, implementation matters, efficiency matters. Refactoring matters. The course brings together all of the strands of the curriculum into a real project that requires knowledge from the metal up to the abstraction of language.

In the last few weeks I've run across several comments from professional developers extolling the virtues of taking a compilers course, and often lamenting that too many schools no longer require compilers for graduation. We are one such school; compilers is a project option competing with several others. Most of the others are perceived to be easier, and they probably are. But few of the others offer anything close to the sort of capstone experience that compilers does.

In a comment on this post titled Three Things I Learned About Software in College, Robert Blum writes:

Building OSes and building compilers are the two ends of the spectrum of applied CS. Learn about both, and you'll be able to solve most problems coming your way.

I agree, but a compilers course can also illuminate theoretical CS in ways that other courses don't. Many of the neat ideas that undergrads learn in an intro theory course show up in the first half of compilers, where we examine grammars and build scanners and parsers.
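To make that connection concrete: the scanner students build in the first weeks is, at heart, just a handful of regular expressions applied in order. Here is a minimal sketch; the token names and the toy expression are mine, purely for illustration:

    import re

    TOKENS = [("NUMBER", r"\d+"), ("ID", r"[A-Za-z_]\w*"),
              ("OP", r"[+\-*/=]"), ("SKIP", r"\s+")]
    SCANNER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKENS))

    def tokens(source):
        for match in SCANNER.finditer(source):
            if match.lastgroup != "SKIP":            # throw away whitespace
                yield match.lastgroup, match.group()

    print(list(tokens("x = 40 + 2")))
    # [('ID', 'x'), ('OP', '='), ('NUMBER', '40'), ('OP', '+'), ('NUMBER', '2')]

The theory course explains why this works: each pattern describes a regular language, and their union can be recognized by a single finite-state machine.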

My favorite recent piece on compilers is ultra-cool Steve Yegge's Rich Programmer Food. You have to read this one -- promise me! -- but I will tease you with Yegge's own precis:

Gentle, yet insistent executive summary: If you don't know how compilers work, then you don't know how computers work. If you're not 100% sure whether you know how compilers work, then you don't know how they work.

Yegge's article is long but well worth the read.

As for my particular course, I face many of the same issues I faced the last time I taught it: choosing a good textbook, choosing a good source language, and deciding whether to use a parser generator for the main project are three big ones. If you have any suggestions, I'd love to hear from you. I'd like to build a larger, more complete compiler for my students to have as a reading example, and writing one would be the most fun I could have getting ready for the course.

I do think that I'll pay more explicit attention in class to refactoring and other practical ideas for writing a big program this semester. The extreme-agile idea of 15 compilers in 15 days, or something similar, still holds me in fascination, but at this point I'm in love more with the idea than with the execution, because I'm not sure I'm ready to do it well. And if I can't do it well, I don't want to do it at all. This course is too valuable -- and too much fun -- to risk on an experiment in whose outcome I don't have enough confidence.

I'm also as excited about teaching the course as the last time I taught it. On a real project of this depth and breadth, students have a chance to take what they have learned to a new level:

How lasts about five years, but why is forever.

(I first saw that line in Jeff Atwood's article Why Is Forever. I'm not sure I believe that understanding why is a right-brain attribute, but I do believe in the spirit of this assertion.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

July 07, 2007 7:10 AM

Quick Hits, Saturday Edition

Don't believe me about computational processes occurring in nature? Check out Clocking In And Out Of Gene Expression, via The Geomblog. Some genes turn other genes on and off. To mark time, they maintain a clock by adding ubiquitin molecules to a chain; when the chain reaches a length of five, the protein is destroyed. That sounds a lot like a Turing machine using a separate tape as a counter...
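Just to amuse myself, here is the analogy rendered as a toy program -- my own speculative caricature of the mechanism described in the article, not the actual biology:

    # Treat the growing ubiquitin chain as a unary counter; at length five, halt.
    def regulate(signals):
        chain = 0
        for _ in signals:
            chain += 1                               # "write" one more molecule to the chain
            if chain == 5:
                return "protein destroyed"           # the counter has reached its limit
        return "protein intact"

    print(regulate(range(3)))                        # protein intact
    print(regulate(range(7)))                        # protein destroyed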

Becky Hirta learned something that should make all of us feel either better or worse: basic math skills are weak everywhere. We can feel better because it's not just our students, or we can feel worse because almost no one can do basic math. One need not be able to solve linear equations to learn how to write most software, but an inability to learn how to solve linear equations doesn't bode well.

Hey, I made the latest Carnival of the Agilists. The Carnival dubs itself "the bi-weekly blogroll that takes a sideways slice through the agile blogosphere". It's a nice way for me to find links to articles on agile software development that I might otherwise have missed.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

July 06, 2007 12:00 PM

Independence Day Reading

While watching a little Wimbledon on television the other day, I read a couple of items from my daunting pile of papers to read. The first was the on-line excerpt of Scott Rosenberg's recent book Dreaming in Code, about the struggles of the team developing the open-source "interpersonal information manager" Chandler. In the introduction, Rosenberg says about software:

Never in history have we depended so completely on a product that so few know how to make well.

In context, the emphasis is on "know how to make well". He was speaking of the software engineering problem, the knowledge of how to make software. But my thoughts turned immediately to what is perhaps a more important problem: "so few". The world depends so much on computer software, yet we can't seem to attract students to study computer science or (for those who think that is unnecessary) to want to learn how to make software. Many young people think that the idea of making software is beyond them -- too hard. But most don't think much about programming at all. Software is mundane, too ordinary. Where is the excitement?

Later I was reading Space for improvement, on "re-engaging the public with the greatest adventure of our time": space travel. Now space travel still seems pretty cool to me, one of the great scientific exercises of my lifetime, but polls show that most Americans, while liking the idea in the abstract, don't care all that much about space travel when it comes down to small matters of paying the bills.

The focus of the article is on the shortcomings of how NASA and others communicate the value and excitement of space travel to the public. It identifies three problems. The first is an "unrelenting positiveness" in PR, which may keep risk-averse legislators happy but gives the public the impression that space travel is routine. The second is a lack of essential information from Mission Control during launches and flights, information that would allow the PR folks to tell a more grounded story. But author Bob Mahoney thinks that the third and most important obstacle has been a presumption that has run through NASA PR for many years:

The presumption? That the public can't understand or won't appreciate the deeper technical issues of spaceflight. By assuming a disinterested and unintelligent public, PAO [NASA's Public Affairs Office] and the mainstream media have missed out completely on letting the public share in the true drama inherent in space exploration.

If you presume a disinterested and unintelligent public, then you won't -- can't -- tell an authentic story. And in the case of space travel, the authentic story, replete with scientific details and human drama, might well snag the attention of the voting public.

I can't claim that software development is "the greatest adventure of our time", but I think we in computing can learn a couple of things from reading this article. First, tell people the straight story. Trust them to understand their world and to care about things that matter. If the public needs to know more math and science to understand, teach them more. Second, I think that we should tell this story not just to adults, but to our children. The only way we can expect students to want to learn how to make software or to learn computer science is if they understand why these things matter and if they believe that they can contribute. Children are in some ways a tougher audience. They still have big imaginations and so are looking for dreams that can match their imagination, and they are pretty savvy when it comes to recognizing counterfeit stories.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

July 04, 2007 9:19 PM

Recursion, Natural Language, and Culture

M.C. Escher, 'Hands'

It's not often that one can be reading a popular magazine, even one aimed at an educated audience, and run across a serious discussion of recursion. Thanks to my friend Joe Bergin for pointing me to The Interpreter, a recent article in The New Yorker by Reporter at Large John Colapinto. The article tells the story of the Pirahã, a native tribe in Brazil with a most peculiar culture and a correspondingly unusual language. You see, while we often observe recursion in nature, one of the places we expect to see it is in natural language -- in the embedding of sentence-like structures within other sentences. But the Pirahã don't use recursion in their language, because their world view makes abstract structure meaningless.

Though recursion plays a critical role in Colapinto's article, it is not really about recursion; it is about a possible crack in Chomsky's universal grammar hypothesis about language, and some of the personalities and technical issues involved. Dan Everett is a linguist who has been working with the Pirahã since the 1970s. He wrote his doctoral dissertation on how the Pirahã language fit into the Chomskyan framework, but upon further study and a new insight now "believes that Pirahã undermines Noam Chomsky's idea of a universal grammar." As you might imagine, Chomsky and his disciples disagree.

What little I learned about the Pirahã language makes me wonder at what it must be like to learn it -- or try to. On the one hand, it's a small language, with only eight consonants and three vowels. But that's just the beginning of its simplicity:

The Pirahã, Everett wrote, have no numbers, no fixed color terms, no perfect tense, no deep memory, no tradition of art or drawing, and no words for 'all', 'each', 'every', 'most', or 'few' -- terms of quantification believed by some linguists to be among the common building blocks of human cognition. Everett's most explosive claim, however, was that Pirahã displays no evidence of recursion, a linguistic operation that consists of inserting one phrase inside another of the same type..."

This language makes Scheme look like Ada! Of course, Scheme is built on recursion, and Everett's claim that the Pirahã don't use it -- can't, culturally -- is what rankles many linguists the most. Chomsky has built the most widely accepted model of language understanding on the premise that "To come to know a human language would be an extraordinary intellectual achievement for a creature not specifically designed to accomplish this task." And at the center of this model is "the capacity to generate unlimited meaning by placing one thought inside another", what Chomsky calls "the infinite use of finite means", after the nineteenth-century German linguist Wilhelm von Humboldt.

According to Everett, however, the Pirahã do not use recursion to insert phrases one inside another. Instead, they state thoughts in discrete units. When I asked Everett if the Pirahã could say, in their language, "I saw the dog that was down by the river get bitten by a snake", he said, "No. They would have to say, 'I saw the dog. The dog was at the beach. A snake bit the dog.'" Everett explained that because the Pirahã accept as real only that which they observe, their speech consists only of direct assertions ("The dog was at the beach."), and he maintains that embedded clauses ("that was down by the river") are not assertions but supporting, quantifying, or qualifying information -- in other words, abstractions.

The notion of recursion as abstraction is natural to us programmers, because inductive definitions are by their nature abstractions over the sets they describe. But I had never before thought of recursion as a form of qualification. When presented in the form of an English sentence such as "I saw the dog that was down by the river get bitten by a snake", it makes perfect sense. I'll need to think about whether it makes sense in a useful way for my programs.
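For the programmers in the audience, the contrast is easy to render in code. This is just a toy of my own devising, not anything from linguistics, but it shows the structural difference Everett describes:

    # The English sentence embeds a clause inside a noun phrase inside a sentence;
    # the Pirahã rendering is a flat sequence of separate assertions.
    embedded = ("I saw",
                ("the dog",
                 ("that was down by the river",),    # clause nested inside the noun phrase
                 "get bitten by a snake"))

    flat = ["I saw the dog.",
            "The dog was at the beach.",
            "A snake bit the dog."]

    def depth(phrase):                               # fittingly, measuring nesting is recursive
        if not isinstance(phrase, tuple):
            return 0
        return 1 + max(depth(part) for part in phrase)

    print(depth(embedded), depth(flat[0]))           # 3 0: one nests, the other does not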

Here is one more extended passage from the article, which discusses an idea from Herb Simon, which appears in the latest edition of the Simon book I mentioned in my last entry:

In his article, Everett argued that recursion is primarily a cognitive, not a linguistic, trait. He cited an influential 1962 article, "The Architecture of Complexity," by Herbert Simon, a Nobel Prize-winning economist, cognitive psychologist, and computer scientist, who asserted that embedding entities within like entities (in a recursive tree structure of the type central to Chomskyan linguistics) is simply how people naturally organize information. ... "Simon argues that this is essential to the way humans organize information and is found in all human intelligence systems. If Simon is correct, there doesn't need to be any specific linguistic principle for this because it's just general cognition." Or, as Everett sometimes likes to put it: "The ability to put thoughts inside other thoughts is just the way humans are, because we're smarter than other species." Everett says that the Pirahã have this cognitive trait but that it is absent from their syntax because of cultural constraints.

This seems to be a crux in Everett's disagreement with the Chomsky school: Is it sufficient -- even possible -- for the Pirahã to have recursion as a cognitive trait but not as a linguistic trait? For many armchair linguists, the idea that language and thought go hand in hand is almost an axiom. I can certainly think recursively even when my programming language doesn't let me speak recursively. Maybe the Pirahã have an ability to organize their understanding of the world using nested structures (as Simon says they must) without having the syntactic tools for conceiving such structures linguistically (as Everett says they cannot).

I found this to be a neat article for more reasons than just its references to recursion. Here are few other ideas that occurred as I read.

Science and Faith Experience

At UNICAMP (State Univ. of Campinas in Brazil), in the fall of 1978, Everett discovered Chomsky's theories. "For me, it was another conversion experience," he said.

Everett's first conversion experience happened when he became a Christian in the later 1960s, after meeting his wife-to-be. It was this first conversion that led him to learn linguistics in the first place and work with the Pirahã under the auspices of the Summer Institute of Linguistics, an evangelical organization. He eventually fell away from his faith but remained a linguist.

Some scientists might balk at Everett likening his discovery of Chomsky to a religious conversion, but I think he is right on the mark. I know what it's like as a scholar to come upon a new model for viewing the world and feeling as if I am seeing a new world entirely. In grad school, for me it was the generic task theory of Chandrasekaran, which changed how I viewed knowledge systems and foreshadowed my later move into the area of software patterns.

It was interesting to read, even briefly, the perspective of someone who had undergone both a religious conversion and a scientific conversion -- and fallen out of both, as his personal experiences created doubts for which his faiths had no answers.

Science as Objective

Obvious, right? No. Everett has reinterpreted data from his doctoral dissertation now that he has shaken the hold of his Chomskyan conversion. Defenders of Chomsky's theory say that Everett's current conclusions are in error, but he now says that

Chomsky's theory necessarily colored his [original] data-gathering and analysis. "'Descriptive work' apart from theory does not exist. We ask the questions that our theories tell us to ask."

Yes. When you want to build generic task models of intelligent behavior, you see the outlines of generic tasks wherever you look. You can tell yourself to remain skeptical, and to use an objective eye, but the mind has its own eye.

Science is a descriptive exercise, and how we think shapes what we see and how we describe. Do you see objects or higher-order procedures when you look at a problem to describe or when you conceive a solution? Our brains are remarkable pattern machines and can fall into the spell of a pattern easily. This is true even in a benign or helpful sense, such as what I experienced after reading an article by Bruce Schneier and seeing his ideas in so many places for a week or so. My first post in that thread is here, and the theme spread throughout this blog for at least two weeks thereafter.

Intellectually Intimidating Characters

Everett occupied an office next to Chomsky's; he found the famed professor brilliant but withering. "Whenever you try out a theory on someone, there's always some question that you hope they won't ask," Everett said. "That was always the first thing Chomsky would ask."

That is not a fun feeling, and not the best way for a great mind to help other minds grow -- unless used sparingly and skillfully. I've been lucky that most of the intensely bright people I've met have had more respect and politeness -- and skill -- to help me come along on the journey, rather than to torch me with their brilliance at every opportunity.

Culture Driving Language

One of the key lessons we see from the Pirahã is that culture is a powerful force, especially a culture so long isolated from the world and now so closely held. But you can see this phenomenon even in relatively short-term educational and professional habits such as programming styles. I see it when I teach OO to imperative programmers, and when I teach functional programming to imperative OO programmers. (In a functional programming course, the procedural and OO programmers realize just how similar their imperative roots are!) Their culture has trained them not to use the muscles in their minds that rely on the new concepts. But those muscles are there; we just need to exercise them, and build them up so they are as strong as the well-practiced muscles.

What Is Really Universal?

Hollywood blockbusters, apparently:

That evening, Everett invited the Pirahã to come to his home to watch a movie: Peter Jackson's remake of "King Kong". (Everett had discovered that the tribe loves movies that feature animals.) After nightfall, to the grinding sound of the generator, a crowd of thirty or so Pirahã assembled on benches and on the wooden floor of Everett's [house]. Everett had made popcorn, which he distributed in a large bowl. Then he started the movie, clicking ahead to the scene in which Naomi Watts, reprising Fay Wray's role, is offered as a sacrifice by the tribal people of an unspecified South Seas island. The Pirahã shouted with delight, fear, laughter, and surprise -- and when Kong himself arrived, smashing through the palm trees, pandemonium ensued. Small children, who had been sitting close to the screen, jumped up and scurried into their mothers' laps; the adults laughed and yelled at the screen.

The Pirahã enjoy movies even when the technological setting is outside their direct experience -- and for them, what is outside their direct experience seems outside their imagination. The story reaches home. From their comments, the Pirahã seemed to understand King Kong in much the way we did, and they picked up on cultural clues that did fit into their experience. A good story can do that.

Eugene sez: The Interpreter is worth a read.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

July 03, 2007 8:02 AM

Computational Processes in Nature

Back in February I wrote a speculative piece on computer science as science, in which I considered ways in which CS is a scientific discipline. As a graduate student, I grew up on Herb Simon's book The Sciences of the Artificial, so the notion of studying phenomena that are contingent on the intentions of their designer has long seemed a worthwhile goal. But "real" scientists, those who study the natural world, have never been persuaded by Simon's arguments. For them, science deals with natural phenomena; programs and more comprehensive computer systems are man-made; and so computing as a science of the artificial is not a real science.

As the natural sciences develop, though, we have begun to learn something that computer scientists have sensed for a long time: computational processes occur in the natural world. Peter Denning has taken dead aim at this observation in his new essay Computing is a Natural Science, published in this month's Communications of the ACM. He opens with his take-home claim:

Computing is now a natural science. Computation and information processes have been discovered in the deep structures of many fields. Computation was present long before computers were invented, but the remarkable shift to this realization occurred only in the last decade. We have lived for so long in the belief that computing is a science of the artificial, it may be difficult to accept that many scientists now see information processes abundantly in nature.

Denning supports his claim with examples from biology and physics, in which natural computations now form the basis of much of the science, in the form of DNA and quantum electrodynamics, respectively. In many ways, the realization that computation lies in the deep structures of many natural systems is a vindication of Norbert Wiener, who in the 1940s and 1950s wrote of information as a fundamental element of systems that communicate and interact, whether man-made or living.

The article continues with a discussion of some of the principles discovered and explored by computer scientists, all of which seem to have correlates in natural phenomena. The table in his paper, available on his web site as a PDF file, lists a few key ones, such as intractability, compression, locality, bottlenecks, and hierarchical aggregation. That these principles help us to understand man-made systems better and to design better systems should not distract from their role in helping us to understand computations in physical, chemical, and biological systems.

There is some talk on my campus of forming a "school of technology" into which the Department of Computer Science might move. From my department's perspective, this idea offers some potential benefits and some potential costs. One of the potential costs that concerns me is that being in a school of technology might stigmatize the discipline as merely a department of applications. This might well limit people's perception of the department and its mission, and that could limit the opportunities available to us. At a time when we are working so hard to help folks understand the scientific core of computing, I'm not keen on making a move that seems to undermine our case.

Explicating the science of computing has been Denning's professional interest for many years now. You can read more about his work on his Great Principles of Computing web site. There is also an interview with Denning that discusses some of his reasons for pursuing this line of inquiry in the latest issue of ACM's Ubiquity magazine. As Denning points out there, having computing as a common language for talking about the phenomena we observe in natural systems is an important step in helping the sciences that study those systems advance. That we can use the same language to describe designed systems -- as well as large interactive systems that haven't been designed in much detail, such as economies -- only makes computer science all the more worthy of our efforts.


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 27, 2007 2:50 PM

Hobnobbing with Legislators

As a department head, I am occasionally invited to attend an event as a "university leader". This morning I had the chance to attend a breakfast reception thrown by the university for our six local state legislators. They had all been part of a strong funding year for state universities, and this meeting was a chance for us to say "thank you" and to tell them all some of the things we are doing. This may not sound like all that much fun to some of you; it's certainly unlike a morning spent cutting code. But I find this sort of meeting to be a good way to put a face on our programs to the people who hold our purse strings, and I admit to enjoying the experience of being an "insider".

I found our delegation to consist of good people who had done their homework and who have good intentions regarding higher education. Two or three of them seem to be well-connected in the legislature and so able to exercise some leadership. One in particular has the look, bearing, speaking ability, and mind that bode well should he decide to seek higher elected office.

I can always tell when I am in the presence of folks who have to market the university or themselves, as nearly every person in the room must. I hear sound bites about "windows of opportunity" and "dynamic personalities in the leadership". My favorite sound bite of the morning bears directly on a computer science department: "The jobs of the future haven't been invented yet."

This post involves computing in an even more immediate way. Upon seeing my name tag, two legislators volunteered that the toughest course they took in college was their computer programming class, and the course in which they received their lowest grades (a B in Cobol and a C in Pascal, for what it's worth). These admissions came in separate conversations, completely independent from one another. The way they spoke of their experiences let me know that the feeling is still visceral for them. I'm not sure that this is the sort of impression we want to make on the folks who pay our bills! Fortunately, they both spoke in good nature and let us know that they understand how important strong CS programs are for the economic development of our region and state. So I left the meeting with a good feeling.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

June 20, 2007 1:20 PM

More Dissatisfaction with Math and Science Education

Another coincidence in time... The day after I post a note on Alan Kay's thoughts on teaching math and science to kids, I run across (via physics blogger and fellow basketball crazy Chad Orzel) Sean Carroll's lament about a particularly striking example of what Kay wants to avoid.

Carroll's article points one step further to his source, Eli Lansey's The sad state of science education, which describes a physics club's visit to a local elementary school to do cool demos. The fifth graders loved the demos and were curious and engaged; the sixth graders were uninterested and going through the motions of school. From this one data point, Carroll and Lansey hypothesize that there might be a connection between this bit flip and what passed for science instruction at the school. Be sure to visit Lansey's article if only to see the pictures of the posters these kids made showing their "scientific procedure" on a particular project. It's really sad, and it goes on in schools everywhere. I've seen similar examples in our local schools, and I've also noticed this odd change in stance toward science -- and loss in curiosity -- that seems to happen to students around fifth or sixth grade. Especially among the girls in my daughters' classes. (My older daughter seemed to go through a similar transition about that time but also seems to have rediscovered her interest in the last year as an eighth grader. My hope abounds...)

Let's hope that the students' loss of interest isn't the result of some unavoidable developmental process and does follow primarily from non-science or anti-science educational practices. If it's the latter, then the sort of things that Alan Kay's group are doing can help.

I haven't written about it here yet, but Iowa's public universities have been charged by the state Board of Regents with making a fundamental change in how we teach science and math in the K-12 school system. My university, which is the home of the state's primary education college, is leading the charge, in collaboration with our bigger R-1 sisters. I'll write more later as the project develops, but for now I can point you to a web page that outlines the initiative. Education reform is often sought, often started, and rarely consummated to anyone's satisfaction. We hope that this can be different. I'd feel a lot more confident if these folks would take work like Kay's as their starting point. I fear that too much business-as-usual will doom this exercise.

As I type this, I realize that I will have to get more involved if I want what computer scientists are doing to have any chance of being in the conversation. More to do, but a good use of time and energy.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

June 19, 2007 3:46 PM

Teaching Science and Math to Young Children

After commenting on Alan Kay's thesis, I decided to read a more recent paper by Alan that was already in my stack, Thoughts About Teaching Science and Mathematics To Young Children. This paper is pretty informal, written in a conversational voice and marked by occasional typos. In some ways, it felt like a long blog entry, in which Kay could speak to a larger audience about some of the ideas that motivate his current work at the Viewpoints Research Institute. It's short -- barely more than four pages -- so you should read it yourself, but I'll share a few thoughts that came to mind as I read this morning in between bouts of advising incoming CS freshmen.

Kay describes one of the key challenges to teaching children to become scientists: we must help students to distinguish between empiricism and modeling on one hand and belief-based acceptance of dogma on the other. This is difficult for at least three reasons:

  • Our schools don't usually do this well right now, even when they use "inquiry"-based methods.

  • Our culture is "an admixture of many different approaches to the world and how it works and can be manipulated", with scientific thinking in the background -- even for most of our teachers.

  • Children think differently from adults, being capable of understanding rather complex concepts when they are encountered in a way that matches their world model.

The last of these is a problem because most of us don't understand very well how children think, and most of us are prone to organize instruction in a way that conforms with how we think. As a parent who has watched one daughter pass through middle school and who has another just entering, I have seen children grok some ideas much better than older students when the children have an opportunity to engage the concepts in a fortuitous way. I wish that I had gleaned from my experience some ideas that would enable me to create just the right opportunities for children to learn, but I'm still in the hit-or-miss phase.

This brings out a second-order effect of understanding how children think, which Kay points out: "the younger the children, the more adept need to be their mentors (and the opposite is more often the case)". To help someone learn to think and act like a scientist, it is at least valuable and more likely essential for the teacher (to be able) to think and act like a scientist. Sadly, this is all too rare among elementary-school and even middle-school teachers.

I also see this issue operating at the level of university CS education. Being a good CS1 teacher requires both knowing a lot about how students' minds work and being an active computer scientist (or software developer). Whatever drawbacks you may find in a university system that emphasizes research even for teaching faculty, I think that this phenomenon speaks to the value of the teacher-scholar. And by "scholar", I mean someone who is actively engaged in doing the discipline, not the fluffy smokescreen that the term sometimes signifies for faculty who have decided to "focus on their teaching".

For Kay, it is essential that children encounter "real science". He uses the phrase "above the threshold" to emphasize that what students do must be authentic, and not circumscribed in a way that cripples asking questions and building testable models. At the end of this paper, he singles out for criticism Interactive Physics and SimCity:

Both of these packages have won many "educational awards" from the pop culture, but in many ways they are anti-real-education because they miss what modern knowledge and thinking and epistemology are all about. This is why being "above threshold" and really understanding what this means is the deep key to making modern curricula and computer environments that will really help children lift themselves.

I found particularly useful Kay's summary of Papert's seminal contribution to this enterprise and of his own contribution. Papert combined an understanding of science and math "with important insights of Piaget to realize that children could learn certain kinds of powerful math quite readily, whereas other forms of mathematics would be quite difficult." In particular, Papert showed that children could understand in a powerful way the differential geometry of vectors and that the computer could play an essential role in abetting this understanding by doing the integral calculus that is beyond them -- a computation that is not necessary for a first-order understanding of the science. Kay claims to have made only two small contributions himself:

  • that multiple, independent, programmable objects can serve as a suitable foundation for children to build scientific models, and
  • that the modeling environment and programming language children use are a user interface that must be designed carefully in order to implement Papert's insight.

What must the design of these tools be like? It must hide gratuitous complexity while exposing essential complexity, doing "the best job possible to make all difficulties be important ones whose overcoming is the whole point of the educational process". Learning involves overcoming difficulties, but we want learners to overcome difficulties that matter, not defects in the tools or pedagogy that we design for them. This is a common theme in the never-ending discussion of which language to use to teach CS majors to write programs -- if, say, C introduces too many unnecessary or inconsistent difficulties, should we use it to teach people to program? Certainly not children, would say Kay, and he says the same thing about most of the languages we use in our universities. Unfortunately, the set of languages that are usually part of the CS1 discussion don't really differ in ways that matter... we are discussing something that matters a lot but not in a way that matters at all.

Getting the environment and language right does matter, because students who encounter unnecessary difficulties will usually blame themselves for their failure, and even when they don't, they are turned off to the discipline. Kay says it this way:

In programming language design in a UI, especially for beginners, this is especially crucial.... Many users will interpret [failure] as "I am stupid and can't do this" rather than the more correct "The UI and language designers are stupid and they can't do this".

This echoes a piece of advice by Paul Graham from an entirely different context, described here recently: "So when you get rejected by investors, don't think 'we suck,' but instead ask 'do we suck?' Rejection is a question, not an answer." Questions, not answers.

Kay spends some time talking about how language design can provide the right sort of scaffolding for learning. As students learn, we need to be able to open up the black boxes that are primitive processes and primitive language constructs in their learning, to expose a new level of learning that is continuous with the previous one. As Kay once wrote elsewhere, one of the beautiful things about how children learn natural language is that the language learned by two-year-olds and elementary school students is fundamentally the same language used by our great authors. The language we use to teach children science and math, and the language they use to build their models, should have the same feature.

But designing these languages is a challenge, because we have to strike a balance between matching how learners think and providing avenues to greater expressiveness:

Finding the balance between these is critical, because it governs how much brain is left to the learner to think about content rather than form. And for most learners, it is the initial experiences that make the difference for whether they want to dive in or try to avoid future encounters.

Kay is writing about children, but he could just as well be describing the problem we face at the university level.

Of course, we may well have been handicapped by an education system that has already lost most students to the sciences by teaching math and science as rules and routine and dogma not to be questioned. That is ultimately what drives Kay and his team to discover something better.

If you enjoy this paper -- and there is more there than I've discussed here, including a neat paragraph on how children understand variables and parameters -- check out some more of his team's recent work on VPRI's Writings page.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 16, 2007 3:34 PM

Two Things, Computing and Otherwise

My recent article on Alan Kay's thesis incidentally intersected with one of those blog themes (er, memes) that make the rounds. Kay brought out two essential concepts of computing: syntax and abstraction. Abstraction and the distinction between syntax and semantics are certainly two of the most important concepts in computing.

Charles Miller takes a shot at identifying The Two Things about Computer Programming:

  1. Every problem can be solved by breaking it up into a series of smaller problems.
  2. The computer will always do exactly what you tell it to.

That first one, decomposition, is closely related to abstraction.

When I followed the link to the source of The Two Things phenomenon, I found that my favorites were not about computers or science but from the humanities, history to be precise. These are attributed to Jonathan Dresner. Here are Dresner's The Two Things about History:

  1. Everything has antecedents.
  2. Sources lie, but they're all we have.

Excellent! Of course, these apply to the empirical side of science, too, and even to the empirical side of understanding large software systems. Consider #1. That Big Ball of Mud we are stuck with has antecedents, and understanding the forces that lead to such systems is important both if we want to understand the architectures of real systems and if we seek a better way to design. All patterns we notice have their antecedents, and we need to understand them. As for #2, if we changed 'sources' to 'source', most programmers would nod knowingly. Source code often lies -- hides its true intentions, masks the program's larger structure, misleads us with unnecessary complexity or embellishment. Even when we do our best to make it speak truth, code can sometimes lie.
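
Here is a contrived sketch of my own (the names are invented, in Python) of what "lying" source code can look like in the small: the name promises a harmless question, while the body quietly takes on a second, hidden responsibility.

    # A contrived sketch (mine): the name and signature promise a pure
    # predicate, but the function also changes state the name never admits to.

    failed_attempts = 0

    def is_valid_password(password):
        """Looks like a harmless question... but it isn't."""
        global failed_attempts
        ok = len(password) >= 8 and any(c.isdigit() for c in password)
        if not ok:
            failed_attempts += 1   # hidden side effect: the code lies by omission
        return ok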

As a CS instructor, I also liked his The Two Things about Teaching History:

  1. A good story is all they'll remember, not the half hour of analysis on either side of it.
  2. They think it's about answers, but it's really about questions.

This pair really nails what it's like to teach in any academic discipline. I've already written about the first in All About Stories. As to the second, helping students make the transition from answers to questions -- not turning away from seeking answers, but turning one's focus to asking questions -- is one of the goals of education. By the time students reach the university these days, the challenge seems to have grown, because they have grown up in a system that focuses on answers, implicitly even when not explicitly.

I'm not sure any of the entries on computing at The Two Things site nail our discipline as well as the two things about history above. It seems like a fun little exercise to keep thinking about what I'd say if asked the question...


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

June 13, 2007 4:05 PM

BASIC and COBOL Cross My Path

I don't run into Basic and Cobol all that often these days, but lately they seem to pop up all over. Once recently I even ran into them together in an article by Tim Bray on trends in programming language publishing:

Are there any here that might go away? The only one that feels threatened at all is VB, wounded perhaps fatally in the ungraceful transition to .NET. I suppose it's unlikely that many people would pick VB for significant new applications. Perhaps it's the closest to being this millennium's COBOL; still being used a whole lot, but not creatively.

Those are harsh words, but I suppose it's true that Cobol is no longer used "creatively". But we still hear a strong call for Cobol instruction from industry, both from companies that typically recruit our students and from companies in the larger region -- Minneapolis, Kansas City, etc. -- that have learned that we have a Cobol course on the books. Even with industry involvement, there is effectively no student demand for the course. Whether VB is traveling the same path, I don't know. Right now, there is still decent demand for VB from students and industry.

Yesterday, I ran into both languages again, in a cool way... A reader and former student pointed out that I had "hit the big leagues" when my recent post on Alan Kay started scoring points at programming.reddit.com. When I went there for a vanity stroke, I ran into something even better, a Sudoku solver written in Cobol! Programmers are a rare and wonderful breed. Thanks to Bill Price for sharing it with us. [1]

While looking for a Cobol compiler for my Intel Mac [2], I ran instead into Chipmunk Basic, "an old-fashioned Basic interpreter" for Mac OS. This brings back great memories, especially in light of my upcoming 25th high school reunion. (I learned Basic as a junior, in the fall of 1980.) Chipmunk Basic doesn't seem to handle my old graphics-enabled programs, but it runs most of the programs my students wrote back in the early 1990s. Nice.

I've been considering a Basic-like language as a possible source language for my compiler students this fall. I first began having such thoughts when I read a special section on lightweight languages in a 2005 issue of Dr. Dobb's Journal and found Tom Pitman's article The Return of Tiny Basic. Basic has certain limitations for teaching compilers, but it would be simple enough to tackle in full within a semester. It might also be nice for historical reasons, to expose today's students to something that opened the door to so many CS students for so many years.
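
To give a feel for why a Basic-like language is tractable in a single semester, here is a sketch of my own -- in Python, and covering only the arithmetic-expression corner of a Tiny-Basic-style grammar, not the language I would actually assign -- of the kind of recursive-descent code students end up writing:

    # A sketch (mine) of a recursive-descent evaluator for the expression
    # subset of a Tiny-Basic-like grammar:
    #
    #   expr   -> term (('+' | '-') term)*
    #   term   -> factor (('*' | '/') factor)*
    #   factor -> NUMBER | '(' expr ')'

    import re

    def tokenize(src):
        return re.findall(r"\d+|[()+\-*/]", src)

    def parse_expr(tokens):
        value = parse_term(tokens)
        while tokens and tokens[0] in "+-":
            op = tokens.pop(0)
            rhs = parse_term(tokens)
            value = value + rhs if op == "+" else value - rhs
        return value

    def parse_term(tokens):
        value = parse_factor(tokens)
        while tokens and tokens[0] in "*/":
            op = tokens.pop(0)
            rhs = parse_factor(tokens)
            value = value * rhs if op == "*" else value / rhs
        return value

    def parse_factor(tokens):
        tok = tokens.pop(0)
        if tok == "(":
            value = parse_expr(tokens)
            tokens.pop(0)            # discard the matching ')'
            return value
        return int(tok)

    assert parse_expr(tokenize("2 + 3 * (4 - 1)")) == 11

One function per grammar rule, a few tokens, no surprises -- small enough that students can hold the whole front end in their heads.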

----

[1] I spent a few minutes poking around Mr. Price's website. In some sort of cosmic coincidence, it seems that Mr. Price took his undergraduate degree at the university where I teach (he's an Iowa native) and is an avid chessplayer -- not to mention a computer programmer! That's a lot of intersection with my life.

[2] I couldn't find a binary for a Mac OS X Cobol, only sources for OpenCOBOL. Building this requires building some extension packages that don't compile without a bunch of tinkering, and I ran out of time. If anyone knows of a decent binary package somewhere, please drop me a line.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Software Development

June 13, 2007 7:52 AM

Computing's Great Ideas Are Everywhere

Philip Windley recently wrote about how he observed a queue in action at a local sandwich shop. Then he steps back to note:

The world is full of these kinds of patterns. There's a great write-up of why Starbucks doesn't use a two-phase commit. The fact that these kinds of process issues occur in everyday life would lead the cynic to say that there is nothing new in Computer Science -- people have always known these things.

But there's a big difference between someone figuring out to put a queue in between their order taking station and their sandwich making station and understanding why, when, and how it works in enough detail that the technique can be analyzed and applied generally.

These observations tell us that Windley is a real computer scientist. They also lead me to think that he is probably an effective teacher of computer science, observing algorithms and representation in the world and relating them to concepts in the discipline.

To say that because "these kinds of process issues occur in everyday life" there is nothing new in Computer Science would be like saying that because mass and light and energy are everywhere there is nothing new in Physics. It is the purpose of our discipline to recognize these patterns and tell their story -- and to put them into our service in systems we build.
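
As a toy sketch of my own (not Windley's example, and with invented names), here is the sandwich-shop pattern in miniature: a queue sits between the two stations, so the order taker and the sandwich maker can each work at their own pace.

    # A toy sketch (mine) of decoupling two stations with a queue.

    from collections import deque

    orders = deque()                 # the queue between the two stations

    def take_order(order):
        orders.append(order)         # the order taker never waits on the maker

    def make_next_sandwich():
        if orders:
            return "sandwich: " + orders.popleft()
        return None                  # the maker idles when the queue is empty

    for o in ["turkey", "veggie", "ham"]:
        take_order(o)                # a burst of customers arrives...

    while orders:
        print(make_next_sandwich())  # ...and is worked off one at a time

Understanding why, when, and how this works in general -- not just at the counter -- is the computer science.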

Windley's comment on big ideas from computing showing up in the world came to mind when I was thinking about Alan Kay's thesis, in particular his relating syntax and abstraction back to pre-literate man's recognition of fruitful patterns in his use of grunts to make a point. These big ideas -- the distinction between form and meaning; abstraction; the interplay between data and process, ... -- these are not "big ideas in computing". They are big ideas. These ideas are central to how we understand the universe, how it is and how it works. This is why we need computing for more than the construction of the next enterprise architecture-cum-web framework. It's why computer science is an essential discipline.


Posted by Eugene Wallingford | Permalink | Categories: Computing

June 11, 2007 4:31 PM

Alan Kay's "The Reactive Engine"

Last week I somehow came across a pointer to Matthias Müller-Prove's excerpt of "The Reactive Engine", Alan Kay's 1969 Ph.D. thesis at the University of Utah. What a neat find! The page lists the full table of contents and then gives the abstract and a few short passages from the first section on FLEX, his hardware-interpreted interactive language that foreshadows Smalltalk.

Those of you who have read here for a while know that I am a big fan of Kay's work and often cite his ideas about programming as a medium for expressing and creating thought. One of my most popular entries is a summary of his Turing Award talks at the 2004 OOPSLA Educators' Symposium. It is neat to see the roots of his recent work in a thesis he wrote nearly forty years ago, a work whose ambition, breadth, and depth seem shocking in a day when advances in computing tend toward the narrow and the technical. Even then, though, he observed this phenomenon: "Machines which do one thing only are boring, yet exert a terrible fascination." His goal was to construct a system that would serve as a medium for expression, not just be a special-purpose calculator of sorts.

The excerpt that jarred me most when I read it was this statement of the basic principles of his thesis:

Probably the two greatest discoveries to be made [by pre-literate man] were the importance of position in a series of grunts, and that one grunt could abbreviate a series of grunts. Those two principles, called syntax (or form) and abstraction, are the essence of this work.

In this passage Kay ties the essential nature of computing back to its source in man's discovery of language.
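
A tiny illustration of my own, in Python rather than FLEX, of those two principles in modern dress:

    # A sketch (mine) of Kay's two principles.
    # Position in a series of "grunts" matters -- that is syntax:
    assert 10 - 3 != 3 - 10        # same symbols, different order, different meaning

    # And one "grunt" can abbreviate a whole series of grunts -- that is abstraction:
    def greet(name):
        message = "Hello, " + name
        print(message)
        print("-" * len(message))

    greet("FLEX")                  # one name standing in for the series above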

In these short excerpts, one sees the Alan Kay whose manner of talking about computing is delightfully his own. For example, on the need for structure in language:

The initial delights of the strongly interactive languages JOSS and CAL have hidden edges to them. Any problem not involving reasonably simple arithmetic calculations quickly developed unlimited amounts of "hair".

I think we all know just what he means by this colorful phrase! Or consider his comments on LISP and TRAC. These languages were notable exceptions to the sort of interactive language in existence at the time, which left the user fundamentally outside of the models they expressed. LISP and TRAC were "'homoiconic', in that their internal and external representations are essentially the same". (*) Yet these languages were not sufficient for Kay's goal of making programming acceptable to any person interested in entering a dialog with his system:

Their only great drawback is that programs written in them look like King Burniburiach's letter to the Sumerians done in Babylonian cun[e]iform!

Near the end of his introduction to FLEX Kay describes the goals for the work documented in the thesis (bolded text is my emphasis):

The summer of 1969 sees another FLEX (and another machine). The goals have not changed. The desire is still to design an interactive tool which can aid in the visualization and realization of provocative notions. It must be simple enough so that one does not have to become a systems programmer (one who understands the arcane rites) to use it. ... It must do more than just be able to realize computable functions; it has to be able to form the abstractions in which the user deals.

The "visualization and realization of provocative notions"... not just by those of us who have been admitted to the guild of programmers, but everyone. That is the ultimate promise -- and responsibility -- of computing.

Kay reported then that "These goals have not been reached." Sadly, forty years later, we still haven't reached them, though he and his team continue to work in this vein. His lament back in 2004 was that too few of us had joined in the search, settling instead to focus on what will in a few decades -- or maybe even five years -- be forgotten as minutiae. Even folks who thought they were on the path had succumbed to locking in to a vision of Smalltalk that is now over twenty-five years old, and which Kay himself knows to be just a stepping stone early on the journey.

In some ways, this web page is only a tease. I really should obtain a copy of his full thesis and read it. But Matthias has done a nice job pulling out some of the highlights of the thesis and giving us a glimpse of what Alan Kay was thinking back before our computers could implement even a small bit of his vision. Reading the excerpt was at once a history lesson and a motivating experience.

----

(*) Ah, that's where I ran across the link to this thesis, in a mailing-list message that used the term "homoiconic" and linked to the excerpt.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 05, 2007 8:11 AM

Miscellaneous Thoughts for Programmers

As summer begins for me, I get to think more about programming. For me, that will be Ruby and my compilers over the next few months.

Too Hard

From Ruby vs. Java Myth #3:

In what serious discipline is "It's too hard" a legitimate excuse? I have never seen a bank that eschews multiplication: "We use repeated addition here--multiplication was too hard for our junior staffers." And I would be uncomfortable if my surgeon said, "I refuse to perform procedures developed in the last 10 years--it is just too hard for me to learn new techniques."

Priceless. This retort applies to many of our great high-level languages, such as Scheme or Haskell, as anyone who has taught these languages will attest.

The problem we in software have is this conundrum: The level of hardness -- usually, abstraction -- we find in some programming languages narrows our target population much more than the level of hardness that we find in multiplication. At the same time, our demand for software developers far outstrips our demand for surgeons. Finding ways to counteract these competing forces is a major challenge for the software industry and for computing programs.

For what it's worth, I strongly second Stuart's comments in Ruby vs. Java Myth #1, on big and small projects. This is a case where conventional wisdom gets things backwards, at a great cost to many teams.

A Programmer's Programmer

I recently ran across a link to this interview with Don Knuth from last year. It's worth a read. You gotta love Knuth as much as you respect his work:

In retirement, he still writes several programs a week.

Programmers love to program and just have to do it. But even with 40+ years of experience, Knuth admits a weakness:

"If I had been good at making estimates of how long something was going to take, I never would have started."

If you've studied AI or search algorithms, you know from A* that underestimates are better than overestimates, for almost exactly the reason that they helped Knuth. There are computational reasons this is true for A*, but with people it is mostly a matter of psychology -- humans are more likely to begin a big job if they start with a cocky underestimate. "Sure, no problem!"
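
Here is a sketch of my own of that claim in code: a bare-bones A* on a unit-cost grid, using Manhattan distance as the heuristic. Manhattan distance never overestimates the remaining cost, which is exactly what guarantees a shortest path. (The grid and walls below are invented for the example.)

    # A sketch (mine) of A* with an admissible heuristic on a unit-cost grid.

    import heapq

    def manhattan(a, b):
        return abs(a[0] - b[0]) + abs(a[1] - b[1])   # never overestimates

    def astar(start, goal, walls, width, height):
        frontier = [(manhattan(start, goal), 0, start)]   # (estimate, cost, cell)
        best_cost = {start: 0}
        while frontier:
            _, cost, cell = heapq.heappop(frontier)
            if cell == goal:
                return cost                          # optimal, thanks to the underestimate
            x, y = cell
            for nxt in [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]:
                if 0 <= nxt[0] < width and 0 <= nxt[1] < height and nxt not in walls:
                    new_cost = cost + 1
                    if new_cost < best_cost.get(nxt, float("inf")):
                        best_cost[nxt] = new_cost
                        heapq.heappush(frontier,
                                       (new_cost + manhattan(nxt, goal), new_cost, nxt))
        return None

    # A 5x5 grid with a short wall; the shortest path still has length 8.
    walls = {(2, 1), (2, 2), (2, 3)}
    assert astar((0, 0), (4, 4), walls, 5, 5) == 8

Swap in a heuristic that overestimates and the search may return a longer path; the optimistic guess is what keeps A* honest.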

If you are an agile developer, Knuth's admission should help you feel free not to be perfect with your estimates; even the best programmers are often wrong. But do stay agile and work on small, frequent releases... The agile approach requires short-term estimates, which can be only so far off and which allow you to learn about your current project more frequently. I do not recommend underestimates as drastic as the ones Knuth made on his typesetting project (which ballooned to ten years) or his The Art of Computer Programming series (at nearly forty years and counting!). A great one like Knuth may be creating value all the long while, but I don't trust myself to be correspondingly productive for my clients.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 01, 2007 3:16 PM

More on Structured Text

My entry on formatting text for readability elicited some interesting responses. A couple of folks pointed to Sun's language in development, Fortress, which is something of an example going in the other direction: it is a programming language that will be presentable in multiple forms, including a display closer to ordinary mathematical notation. Indeed, Fortress code uses a notation that mathematicians will find familiar.

I especially enjoyed a message from Zach Beane, who recently read William Manchester's biography of Winston Churchill. Churchill wrote the notes for his speeches using a non-standard, structured form. While he may not have used syntactic structure as his primary mechanism, he did use structure as part of making his text easier to scan during delivery. Zach offered a few examples from the Library of Congress's on-line exhibit Churchill and the Great Republic, including Churchill's Speech to the Virginia General Assembly, March 8, 1946. My favorite example is this page of speaking notes for Churchill's radio broadcast to the United States, on October 16, 1938:

speaking notes for Churchill's broadcast to the United States, October 16, 1938

Thanks to Zach and all who responded with pointers!


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

May 24, 2007 7:48 AM

Formatting Text for Readability

Technology Changes, Humans Don't

Gaping Void ran the cartoon at the right last weekend, which is interesting, given that several of my recent entries have dealt with a similar theme. Technology may change, but humans -- at least our hard-wiring -- don't. We should take into account how humans operate when we work with them, whether in security, software development, or teaching.

In another coincidence, I recently came across a very cool paper, Visual-Syntactic Text Formatting: A New Method to Enhance Online Reading. We programmers spend an awful lot of time talking about indenting source code: how to do it, why to do it, tools for doing it, and so on. Languages such as Python require a particular sort of indentation. Languages such as Scheme and Common Lisp depend greatly on indentation; the programming community has developed standards that nearly everyone follows and, by doing so, programmers can understand code whose preponderance of parentheses would otherwise blind them.

But the Walker paper is the first time I have ever read about applying this idea to text. Here is an example. This:

When in the Course of human events, it becomes necessary
for one people to dissolve the political bands which have
connected them with another, and to assume among the powers
of the earth, the separate and equal station to which the
Laws of Nature...

might become:

When in the Course
        of human events,
    it becomes necessary
        for one people
          to dissolve the political bands
            which have
              connected them with another,
          and to assume
              among the powers
                of the earth,
            the separate and equal station
              to which
                the Laws of Nature
...

Cognitively, this may make great sense, if our minds can process and understand text better when it is presented structurally. The way we present text today isn't much different in format than when we first started to write thousands of years ago, and perhaps it's time for a change. We shouldn't feel compelled to stay with a technology for purely historical reasons when the world and our understanding of it have advanced. (Like the world of accounting has with double-entry bookkeeping.)
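
For flavor, here is a crude toy of my own -- not the algorithm from the paper, which relies on real linguistic analysis -- that breaks a sentence at a few clause-opening words and indents each new fragment a bit more:

    # A crude toy (mine, not the paper's method): split at a few hand-picked
    # clause-opening words and indent each new fragment one level deeper.

    CLAUSE_OPENERS = {"which", "that", "to", "for", "and", "of"}

    def structure(text):
        words = text.split()
        lines, current, depth = [], [], 0
        for word in words:
            if word.lower().strip(",") in CLAUSE_OPENERS and current:
                lines.append("  " * depth + " ".join(current))
                current, depth = [], depth + 1
            current.append(word)
        lines.append("  " * depth + " ".join(current))
        return "\n".join(lines)

    print(structure("When in the Course of human events, it becomes necessary "
                    "for one people to dissolve the political bands which have "
                    "connected them with another"))

Even this blunt instrument produces something close to the example above, which suggests how mechanizable the real, linguistically informed version could be.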

For those of you who are still leery of such a change, whether for historical reasons, aesthetic reasons, or other personal reasons... First of all, you are in good company. I was once at a small session with Kurt Vonnegut, and he spoke eloquently of how the book as we know it now would never disappear, because there was nothing like the feel of paper on your fingertips, the smell of a book when you open its fresh pages for the first time. I did not believe him then, and I don't think even he believed that deep in his heart; it is nothing more than projecting our own experiences and preferences onto a future that will surely change. But I know just how he felt, and I see my daughters' generation already experiencing the world in a much richer, technology-mediated way than Vonnegut or I have.

Second, don't worry. Even if what Walker and his colleagues describe becomes a standard, I expect a structured presentation to simply be one view on the document out of many possible views. As an old fogey, I might prefer to read my text in the old paragraph-structured way, but I can imagine that having a syntactically-structured view would make it much easier to scan a document and find something more easily. Once I find the passage of interest, I could toggle back to a paragraph-structured view and read to my heart's content. And who knows; I might prefer reading text that is structured differently, if only I have the chance.

Such toggling between views is possible because of... computer science! The same theory and techniques that make it possible to do this at all make it possible to do it however you like. Indeed, I'll be teaching many of the necessary techniques this fall, as a part of building the front end of a compiler. The beauty of this science is that we are no longer limited by someone else's preferences, or by someone else's technology. As I often mention here, this is one of the great joys of being a computer scientist: you can create your own tools.

We can now see this technology making it out to the general public. I can see the MySpace generation switching to new ways of reading text immediately. If it makes us better readers, and more prolific readers, then we will have a new lens on an old medium. Computer science is a medium-maker.

Of course, this particular project is just a proposal and in the early stages of research. Whether it is worth pursuing in its current form, or at all, depends on further study. But I'm glad someone is studying this. The idea questions assumptions and historical accident, and it uses what we have learned from cognitive science and medical science to suggest a new way to do something fundamental. As I said, very cool.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

May 17, 2007 11:08 AM

Quick Hits

Over the last couple of months, I've been collecting some good lines and links to the articles that contain them. Some of these may show up someday in something I write, but it seems a shame to have them lie fallow in a text file until then. Besides, my blog often serves as my commonplace book these days. All of these pieces are worth reading for more than the quote.

If the code cannot express itself, then a comment might be acceptable. If the code does not express itself, the code should be fixed.
-- Tim Ottinger, Comments Again

In a concurrent world, imperative is the wrong default!
-- Tim Sweeney of Epic Games, The Next Mainstream Programming Language: A Game Developer's Perspective, an invited talk at ACM POPL'06 (full slides in PDF)

When you are tempted to encode data structure in a variable name (e.g. Hungarian notation), you need to create an object that hides that structure and exposes behavior.
-- Uncle Bob Martin, The Hungarian Abhorrence Principle

Lisp... if you don't like the syntax, write your own.
-- Gordon Weakliem, Hashed Thoughts, on simple syntax for complex data structures

Pairing is a practice that has (IIRC) at least five different benefits. If you can't pair, then you need to find somewhere else in the process to put those benefits.
-- John Roth, on the XP mailing list

Fumbling with the gear is the telltale sign that I'm out of practice with my craft. ... And day by day, the enjoyment of the craft is replaced by the tedium of work.
-- Mike Clark, Practice

So when you get rejected by investors, don't think "we suck," but instead ask "do we suck?" Rejection is a question, not an answer.
-- Paul Graham, The Hacker's Guide to Investors

Practice. Question rejection.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Software Development

May 11, 2007 9:29 AM

Fish is Fish

Yesterday's post on the time value of study reminded me a bit of Aesop's fable The Ant and the Grasshopper. So perhaps you'll not be surprised to read a post here about a children's book.

cover image of Fish is Fish

While at a workshop a couple of weeks ago, I had the pleasure of visiting my Duke friends and colleagues Robert Duvall and Owen Astrachan. In Owen's office was a copy of the book Fish Is Fish, by well-known children's book author Leo Lionni. Owen and Robert recommended the simple message of this picture book, and you know me, so... When I got back to town, I checked it out.

The book's web site summarizes the book as:

A tadpole and a minnow are underwater friends, but the tadpole grows legs and explores the world beyond the pond and then returns to tell his fish friend about the new creatures he sees. The fish imagines these creatures as bird-fish and people-fish and cow-fish and is eager to join them.

The story probably reaches young children in many ways, but the first impression it left on me was, "You can't imagine what you can't experience." Then I realized that this was both an overstatement of the story and probably wrong, so I now phrase my impression as, "How we imagine the rest of the world is strongly limited by who we are and the world in which we live." And this theme matters to grown-ups as much as children.

Consider the programmer who knows C or Java really well, but only those languages. He is then asked to learn functional programming in, say, Scheme. His instructor describes higher-order procedures and currying, accumulator passing and tail-recursion elimination, continuations and call/cc. The programmer sees all these ideas in his mind's eye as C or Java constructs, strange beasts with legs and fins.
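
For readers who, like the fish, are picturing those ideas as strange C-shaped beasts, here is a small sketch of my own -- in Python rather than Scheme -- of two of them:

    # A sketch (mine) of two of the ideas named above.

    # Accumulator passing: carry the partial result along in a parameter
    # instead of building it up on the way back out of the recursion.
    def length(items, acc=0):
        if not items:
            return acc
        return length(items[1:], acc + 1)

    # A higher-order procedure: one that builds and returns a specialized
    # procedure -- the idea behind currying and partial application.
    def adder(n):
        return lambda x: x + n

    add5 = adder(5)
    assert length(["a", "b", "c"]) == 3
    assert add5(10) == 15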

Or consider the developer steeped in the virtues and practices of traditional software engineering. She is then asked to "go agile", to use test-first development and refactoring browsers, pair programming and continuous integration, the planning game and YAGNI. The developer is aghast, seeing these practices in her mind's eye from the perspective of traditional software development, horrific beasts with index cards and lack of discipline.

When we encounter ideas that are really new to us, they seem so... foreign. We imagine them in our own jargon, our own programming language, our own development style. They look funny, and feel unnecessary or clunky or uncomfortable or wrong.

But they're just different.

Unlike the little fish in Lionni's story, we can climb out of the water that is our world and on to the beach of a new world. We can step outside of our experiences with C or procedural programming or traditional software engineering and learn to live in a different world. Smalltalk. Scheme. Ruby. Or Erlang, which seems to have a lot of buzz these days. If we are willing to do the work necessary to learn something new, we don't flounder in a foreign land; we make our universe bigger.

Computing fish don't have to be (just) fish.

----

(Ten years ago, I would have used OOP and Java as the strange new world. OO is mainstream now, but -- so sadly -- I'm not sure that real OO isn't still a strange new world to most developers.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

April 28, 2007 12:55 PM

Open Mind, Closed Mind

I observed an interesting phenomenon working in a group this morning.

Another professor, an undergrad student, and I are at Duke University this weekend for a workshop on peer-led team-learning in CS courses. This is an idea borrowed from chemistry educators that aims to improve recruitment and retention, especially among underrepresented populations. One of our first activities was to break off into groups of 5-8 faculty and do a sample PLTL class session, led by an experienced undergrad peer leader from one of the participating institutions. My group's leader was an impressive young woman from Duke who is headed to Stanford this fall for graduate work in biomedical informatics.

One of the exercises our group did involved Sudoku. First, we worked on a puzzle individually, and then we came back together to work as a group. I finished within a few minutes, before the leader called time, while no one else had filled in much of the grid yet.

Our leader asked us to describe bits about how we had solved the puzzle, with an eye toward group-writing an algorithm. Most folks described elements of the relatively naive combinatorial approach of immediately examining constraints on individual squares. When my turn came, I described my usual approach, which starts with a preprocessing of sorts that "cherry picks" obvious slots according to groups of rows and columns. Only later do I move on to constraints on individual squares and groups of squares.

I was surprised, because no one seemed to care. They seemed happy enough with the naive approach, despite the fact that it hadn't served them all that well while solving the puzzle earlier. Maybe they dismissed my finishing quickly as an outlier, perhaps the product of a Sudoku savant. But I'm no Sudoku savant; I simply have had a lot of practice and have developed one reasonably efficient approach.

The group didn't seem interested in a more efficient approach, because they already knew how to solve the problem. My approach didn't match their own experiences, or their theoretical understanding of the problem. They were comfortable with their own understanding.

(To be honest, I think that most of them figured they just needed to "go faster" in order to get done faster. If you know your algorithms, you know that going faster doesn't help at all with many, many algorithms! We still wouldn't get done.)
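
For the record, here is a sketch of my own of the naive approach the group settled on: the candidates for one empty square are just the digits not already used in its row, its column, or its 3x3 box. (The "cherry-picking" pass scans rows and columns in groups for forced placements instead; this is the version everyone already knew.)

    # A sketch (mine) of the naive, square-by-square constraint check.

    def candidates(grid, row, col):
        """grid is a 9x9 list of lists, with 0 marking an empty square."""
        used = set(grid[row])                          # digits in the row
        used |= {grid[r][col] for r in range(9)}       # digits in the column
        r0, c0 = 3 * (row // 3), 3 * (col // 3)        # digits in the 3x3 box
        used |= {grid[r][c] for r in range(r0, r0 + 3)
                            for c in range(c0, c0 + 3)}
        return {d for d in range(1, 10) if d not in used}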

Dr. Phil -- How's that workin' for ya?

After making this observation, I also had a realization. In other situations, I behave just like this. Sometimes, I have an idea in mind, one I like and am comfortable with, and when confronted with something that might be better, I am likely to dismiss it. Hey, I just need to tweak what I already know. Right. I imagine Dr. Phil asking in his Texas drawl, "How's that workin' for ya?" Not so well, but with a little more time...

When I want to learn, entering a situation with a closed mind is counterproductive. This is, of course, true when I walk into the room saying, "I don't want to learn anything new." But it is just as important, and far more dangerous, when I think I want to learn but am holding tightly to my preconceptions and idiosyncratic experiences. In that case, I expect that I will learn, but really all I can do is rearrange what I already know. And I may end up disappointed when I don't make a big leap in knowledge or performance.

One of the PLTL undergrad leaders working with us gets it. He says that one of the greatest benefits of being a peer leader is interacting with the students in his groups. He has learned different ways to approach many specific problems and different high-level approaches to solving problems more generally. And he is a group leader.

Later we had fun with a problem on text compression, using Huffman coding as our ultimate solution. I came up with an encoding targeted to a particular string, which used 53 bits instead of the 128 bits of a standard ASCII encoding. No way a Huffman code can beat that. Part way through my work on my Huffman tree, I was even surer. The end result? 52 bits. It seems my problem-solving ego can be bigger than warranted, too. Open mind, open mind.
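
Here is a sketch of my own of the second half of that exercise: build a Huffman code for a string and count the bits it needs, to compare against an ad hoc encoding. (The string below is just a stand-in, not the one from the workshop.)

    # A sketch (mine): build a Huffman code bottom-up and count the total bits.

    import heapq
    from collections import Counter

    def huffman_code(text):
        counts = Counter(text)
        # Each heap entry: (frequency, tie-breaker, {symbol: code-so-far}).
        heap = [(freq, i, {ch: ""}) for i, (ch, freq) in enumerate(counts.items())]
        heapq.heapify(heap)
        tie = len(heap)
        while len(heap) > 1:
            f1, _, left = heapq.heappop(heap)       # merge the two rarest subtrees
            f2, _, right = heapq.heappop(heap)
            merged = {ch: "0" + code for ch, code in left.items()}
            merged.update({ch: "1" + code for ch, code in right.items()})
            heapq.heappush(heap, (f1 + f2, tie, merged))
            tie += 1
        return heap[0][2]

    text = "an ad hoc string"
    code = huffman_code(text)
    bits = sum(len(code[ch]) for ch in text)
    print(bits, "bits versus", 8 * len(text), "for 8-bit ASCII")

The greedy merge of the two rarest subtrees is what makes the result provably optimal among prefix codes -- which is why my clever hand-built encoding never stood a chance.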


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

April 26, 2007 7:05 PM

Don Norman on Cantankerous Cars

Yesterday afternoon I played professional hooky and rode with another professor and a few students to Ames, Iowa, to attend the fourth HCI Forum, Designing Interaction 2007, sponsored by Iowa State's Virtual Reality Applications Center. This year, the forum kicks off a three-day Emerging Technologies Conference that features several big-name speakers and a lot of the HCI research at ISU.

Donald Norman

I took an afternoon "off" to hear the keynote by Donald Norman, titled "Cautious Cars and Cantankerous Kitchens". It continues the story Norman began telling years ago, from his must-read The Design of Everyday Things to his in-progress The Design of Future Things.

"Let me start with a story." The story is about how a time when he is driving, feeling fine, and his wife feels unsafe. He tries to explain to her why everything is okay.

"New story." Same set-up, but now it's not his wife reacting to an unsafe feeling, but his car itself. He pays attention.

Why does he trust his car more than he trusts his wife? He thinks it's because, with his wife, conversation is possible. So he wants to talk. Besides, he feels in control. When conversation is not possible, and the power lies elsewhere, he acquiesces. But does he "trust"? In Norman's mind, I think the answer is 'no'.

Control is important, and not always in the way we think. Who has the most power in a negotiation? Often (Norman said always), it is the person with the least power. Don't send your CEO; send a line worker. Why? No matter how convincing the other sides' arguments are, the weakest participant may well have to say, "Sorry, I have my orders." Or at least "I'll have to check with my boss".

It's common these days to speak of smart artifacts -- smart cars, houses, and so on. But the intelligence does not reside in the artifact. It resides in the head of the designer.

And when you use the artifact, the designer is not there with you. The designer would be able to handle unexpected events, even by tweaking the artifact, but the artifact itself can't.

"There are two things about unexpected events... They are unexpected. And they always happen."

Throughout his talk, Norman compared driving a car to riding a horse, driving a horse and carriage, and then to riding a bike. The key to how well these analogies work or not lies in the three different levels of engagement that a human has: visceral, behavioral, and reflective. Visceral is biological, hard-coded in our brains, and so largely common to all people. It recognizes safe and dangerous situations. Behavioral refers to skills and "compiled" knowledge, knowledge that feels like instinct because it is so ingrained. Reflective is just that, our ability to step outside of a situation and consider it rationally. There are times for reflective engagement, but hurtling around a mountain curve at breakneck speed is not one of them.

Norman suggested that a good way to think of designing intelligent systems is to think of a new kind of entity: (human + machine). The (car + driver) system provides all three levels of engagement, with the car providing the visceral intelligence and the human providing the behavioral and reflective intelligences. Cars can usually measure most of what makes our situations safe or dangerous better than we can, because our visceral intelligence evolved under very different circumstances than the ones we now live in. But the car cannot provide the other levels of intelligence, which we have evolved as much more general mechanisms.

Norman described several advances in automobile technology that are in the labs or even available on the road: cars with adaptive cruise control; a Lexus that brakes when its on-board camera senses that the driver isn't paying attention; a car that follows lanes automatically; a car that parks automatically, both parallel and head-in. Some of these sound like good ideas, but...

In Norman's old model of users and tasks, he spoke of the gulfs of evaluation and execution. In his thinking these days, he speaks of the knowledge gap between human & machine, especially as we think of machines more and more as intelligent.

The problem, in Norman's view, is that machines automate the easy parts of a task, and they fail us when things get hard and we most need them. He illustrated his idea with a slide titled "Good Morning, Silicon Valley" that read, in part, "... at the very moment you enter a high-speed crisis, when a little help might come in handy, the system says, 'Here, you take it.'"

Those of us who used to work on expert systems and later knowledge-based systems recognize this as the brittleness problem. Expert systems were expert in their narrow niche only. When a system reached the boundary of its knowledge, its performance went from expert to horrible immediately. This differed from human experts and even humans who were not experts, whose performances tended to degrade more gracefully.

My mind wandered during the next bit of the talk... Discussion included ad hoc networks of cars on the road, flocking behavior, cooperative behavior, and swarms of cars cooperatively drafting. Then he discussed a few examples of automation failures. The first few were real, but the last two were fiction -- but things he thinks may be coming, in one form or another:

  • I swipe my credit card to make a purchase at the store. The machine responds, "Transaction Refused. You Have Enough Shoes."
  • A news headline: "Motorist Trapped in Roundabout for 14 Hours". If you drive a car that follows lanes and overrules your attempts to change... (April Fool's!)

Norman then came to another topic familiar to anyone who has done AI research or thought about AI for very long. The real problem here is shared assumptions, what we sometimes now call "common ground". Common ground in human-to-human communication is remarkably good, at least when the people come from cultures that share something in common. Common ground in machine-to-machine is also good, sometimes great, because it is designed. Much of what we design follows a well-defined protocol that makes explicit the channel of communication. Some protocols even admit a certain amount of fuzziness and negotiation, again with some prescribed bounds.

But there is little common ground in communication between human and machine. Human knowledge is so much richer, deeper, and interconnected than what we are yet able to provide our computer programs. So humans who wish to communicate with machines must follow rigid conventions, made explicit in language grammars, menu structures, and the like. And we aren't very good at following those kinds of rules.

Norman believes that the problem lies in the "middle ground". We design systems in which machines do most or a significant part of a task and in which humans handle the tough cases. This creates expectation and capability gaps. His solution: let the machine do all of a task -- or nothing. Anti-lock brakes were one of his examples. But what counts as a complete task? It seems to me that this solution is hard to implement in practice, because it's hard to draw a boundary around what is a "whole task".

Norman told a short story about visiting Delft, a city of many bicycles. As he and his guide were coming to the city square, which is filled with bicycles, many moving fast, his guide advised him, "Don't try to help them." By this, he meant not to slow down or speed up to avoid a bike, not to guess the cyclist's intention or ability. Just cross the street.

Isn't this dangerous? Not as dangerous as the alternative! The cyclist has already seen you and planned how to get through without injuring you or himself. If you do something unexpected, you are likely to cause an accident! Act in the standard way so that the cyclist can solve the problem. He will.

This story led into Norman's finale, in which he argued that automation should be:

  • predictable
  • self-explaining
  • optional
  • assistive

The Delft story illustrated that the less flexible, less powerful party should be the more predictable party in an interaction. Machines are still less flexible than humans and so should be as predictable as possible. The computer should act in the standard way so that the human user can solve the problem. She will.

Norman illustrated self-explaining with a personal performance of the beeping back-up alarm that most trucks have these days. Ever have anyone explain what the frequency of the beeps means? Ever read the manual? I don't think so.

The last item on the list -- assistive -- comes back to what Norman has been preaching forever and what many folks who see AI as impossible (or at least not far enough along) have also been saying for decades: Machines should be designed to assist humans in doing their jobs, not to do the job for them. If you believe that AI is possible, then someone has to do the research to bring it along. Norman probably disagrees that this will ever work, but he would at least say not to turn immature technology into commercial products and standards now. Wait until they are ready.

All's I know is... I could really have used a car that was smarter than its driver on Tuesday morning, when I forgot to raise my still-down garage door before putting the car into reverse! (Even 20+ years of habit sometimes fails, even under predictable conditions.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

April 17, 2007 7:46 AM

Less May Be More

Over at Lambda the Ultimate, I ran into a minimal Lisp system named PicoLisp. Actually, I ran into a paper that describes PicoLisp as a "radical approach to application development", and from this paper I found my way to the software system itself.

PicoLisp is radical in eschewing conventional wisdom about programming languages and application environments. The Common Lisp community has accepted much of this conventional wisdom, it would seem, in reaction to criticism of some of Lisp's original tenets: the need for a compiler to achieve acceptable speed, static typing within an abundant set of specific types, and the rejection of pervasive, shallow dynamic binding. PicoLisp rejects these ideas, and takes Lisp's most hallowed feature, the program as s-expression, to its extreme: a tree of executable nodes, each of which...

... is typically written in optimized C or assembly, so the task of the interpreter is simply to pass control from one node to the other. Because many of those built-in Lisp functions are very powerful and do a lot of processing, most of the time is spent in the nodes. The tree itself functions as a kind of glue.

In this way an "interpreter" that walks the tree can produce rather efficient behavior, at least relative to what many people think an interpreter can do.
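
To make the "tree of executable nodes" idea concrete, here is a tiny sketch in C -- my own illustration, not PicoLisp's actual machinery. Each node is either a number or a pointer to a built-in function written in C; the "interpreter" does nothing but hand control from node to node:

     #include <stdio.h>

     /* A node is either a plain number or a built-in function applied
        to child nodes. */
     typedef struct Node Node;
     typedef long (*Builtin)(Node **args, int nargs);

     struct Node {
         Builtin  fn;      /* NULL for a plain number           */
         long     num;     /* the value, when fn is NULL        */
         Node   **args;    /* the children, when fn is non-NULL */
         int      nargs;
     };

     /* The whole "interpreter": walk the tree, passing control to the
        nodes, which do the real work. */
     long eval(Node *node)
     {
         return node->fn ? node->fn(node->args, node->nargs) : node->num;
     }

     /* Two built-ins, written in C. */
     long prim_add(Node **args, int nargs)
     {
         long sum = 0;
         for (int i = 0; i < nargs; i++)
             sum += eval(args[i]);
         return sum;
     }

     long prim_mul(Node **args, int nargs)
     {
         long product = 1;
         for (int i = 0; i < nargs; i++)
             product *= eval(args[i]);
         return product;
     }

     int main(void)
     {
         /* the expression (+ 1 (* 2 3)) as a tree of nodes */
         Node one   = { NULL, 1 };
         Node two   = { NULL, 2 };
         Node three = { NULL, 3 };
         Node *mul_args[] = { &two, &three };
         Node mul   = { prim_mul, 0, mul_args, 2 };
         Node *add_args[] = { &one, &mul };
         Node add   = { prim_add, 0, add_args, 2 };

         printf("%ld\n", eval(&add));   /* prints 7 */
         return 0;
     }

All of the real work happens inside prim_add and prim_mul; eval is just the glue, which is exactly the point.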

As a developer, the thing I notice most in writing PicoLisp code is its paucity of built-in data types. It supports but three: numbers, symbols, and lists. No floats; no strings or vectors. This simplifies the interpreter in several ways, as it now needs to make run-time checks on fewer different types. The price is paid by the programmer in two ways. First, at programming time, the developer must create the higher-order data types as ADTs -- but just once. This is a price that any user of a small language must pay and was one of the main trade-offs that Guy Steele discussed in his well-known OOPSLA talk Growing a Language. Second, at run time, the program will use more space and time than if those types were primitive in the compiler. But space is nearly free these days, and the run-time disadvantage turns out to be smaller than one might imagine. The authors of PicoLisp point out that the freedom their system gives them saves them a much more expensive sort of time -- developer time in an iterative process that they liken to XP.
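
And here is a toy version of that first price, again in C rather than PicoLisp and purely my own sketch: with only numbers and lists to work with, a "string" becomes an ADT you build once -- a list of character codes -- and then use everywhere:

     #include <stdio.h>
     #include <stdlib.h>

     /* The only compound type: a cell holding a number and the rest of
        the list.  (A sketch: the cells are never freed.) */
     typedef struct Cell {
         long         num;
         struct Cell *rest;
     } Cell;

     Cell *cons(long num, Cell *rest)
     {
         Cell *cell = malloc(sizeof *cell);
         cell->num  = num;
         cell->rest = rest;
         return cell;
     }

     /* The one-time ADT: a "string" is just a list of character codes. */
     Cell *string_from(const char *s)
     {
         return *s ? cons(*s, string_from(s + 1)) : NULL;
     }

     void string_print(Cell *str)
     {
         for (; str != NULL; str = str->rest)
             putchar((int)str->num);
         putchar('\n');
     }

     int main(void)
     {
         string_print(string_from("built from numbers and lists"));
         return 0;
     }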

Can this approach work at all in the modern world? PicoLisp's creators say yes. They have implemented in PicoLisp a full application development environment that provides a database engine, a GUI, and the generation of Java applets. Do they have the sort of competitive advantage that Paul Graham writes about having had at the dawn of ViaWeb? Maybe so.

As a fan of languages and language processors, I always enjoy reading about how someone can be productive working in an environment that stands against conventional wisdom. Less may be more, but not just because it is less (say, fewer types and no compiler). It is usually more because it is also different (s-expressions as powerful nodes glued together with simple control).


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

April 14, 2007 3:38 PM

If Only We Had More Time...

Unlike some phrases I find myself saying in class, I don't mind saying this one.

Used for the wrong reasons, it would signal a problem. "If we had more time, I would teach you this important concept, but..." ... I've left it out because I didn't plan the course properly. ... I've left it out because preparing to teach it well would take too much time. ... I'm running behind; I wasted too much time speaking off-topic. There are lots of ways that not covering something important is wrong.

But there is a very good reason why it's not possible to cover every topic that comes up. There is so much more! There are more interesting ideas in this world -- in programming languages, in object-oriented programming, in algorithms -- than we can cover in a 3-credit, 15-week course. The ideas of computing are bigger than any one course, and some of the cool things we do in class are only the beginning. This is a good thing. Our discipline is deep, and it rewards the curious with unexpected treasures.

More practically, "If only we had more time..." is a cue to students who do have more time -- graduate students looking for research projects, undergrad honors students looking for thesis topics, and undergrads who might be thinking of grad school down the line.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

April 04, 2007 5:57 PM

Science Superstars for an Unscientific Audience

Somewhere, something incredible is waiting to be known.
-- Carl Sagan

Some time ago I remember reading John Scalzi's On Carl Sagan, a nostalgic piece of hero worship for perhaps the most famous popularizer of science in the second half of the 20th century. Having written a piece or two of my own of hero worship, I empathize with Scalzi's nostalgia. He reminisces about what it was like to be an 11-year-old astronomer wanna-be, watching Sagan on TV, "talk[ing] with celebrity fluidity about what was going on in the universe. He was the people's scientist."

Scalzi isn't a scientist, but he has insight into the importance of someone like Sagan to science:

... Getting science in front of people in a way they can understand -- without speaking down to them -- is the way to get people to support science, and to understand that science is neither beyond their comprehension nor hostile to their beliefs. There need to be scientists and popularizers of good science who are of good will, who have patience and humor, and who are willing to sit with those who are skeptical or unknowing of science and show how science is already speaking their language. Sagan knew how to do this; he was uncommonly good at it.

We should be excited to talk about our work, and to seek ways to help others understand the beauty in what we do, and the value of what we do to humanity. But patience and humor are too often in short supply.

I thought of Scalzi's piece when I ran across a link to an entry in Lance Fortnow's recently retired blog about Yisroel Brumer's Newsweek My Turn column Let's Not Crowd Me, I'm Only a Scientist. Seconding Brumer's comments, Fortnow laments that the theoretical computer scientist seems at a disadvantage in trying to be Sagan-like:

Much as I get excited about the P versus NP problem and its great importance to all science and society, trying to express these ideas to uninterested laypeople always seems to end up with "Should I buy an Apple or a Windows machine?"

(Ooh, ooh! Mr. Kotter, Mr. Kotter! I know the answer to that one!)

I wonder if Carl Sagan ever felt like that. Somehow I doubt it. Maybe it's an unfair envy, but astronomy and physics seem more visceral, more romantic to the general public. We in computing certainly have our romantic sub-disciplines. When I did AI research, I could always find an interested audience! People were fascinated by the prospect of AI, or disturbed by it, and both groups wanted to talk about it. But as I began to do work in more inward-looking areas, such as object-oriented programming or agile software development, I felt more like Brumer felt as a scientist:

Just a few years ago, I was a graduate student in chemical physics, working on obscure problems involving terms like quantum mechanics, supercooled liquids and statistical thermodynamics. The work I was doing was fascinating, and I could have explained the basic concepts with ease. Sure, people would sometimes ask about my work in the same way they say "How are you?" when you pass them in the hall, but no one, other than the occasional fellow scientist, would actually want to know. No one wanted to hear about a boring old scientist doing boring old science.

So I know the feeling reported by Brumer and echoed by Fortnow. My casual conversation occurs not at cocktail parties (they aren't my style) but at 8th-grade girls' basketball games, and in the hall outside dance and choir practices. Many university colleagues don't ask about what I do at all, at least once they know I'm in CS. Most assume that computers are abstract and hard and beyond them. When conversation does turn to computers, it usually turns to e-mail clients or ISPs. If I can't diagnose some Windows machine's seemingly random travails, I am regarded quizzically. I can't tell if they think I am a fraud or an idiot. Isn't that what computer scientists know, what they do?

I really can't blame them. We in computing don't tell our story all that well. (I have a distinct sense of deja vu right now, as I have blogged this topic several times before.) The non-CS public doesn't know what we in CS do because the public story of computing is mostly non-existent. Their impressions are formed by bad experiences using computers and learning how to program.

I take on some personal responsibility as well. When my students don't get something, I have to examine what I am doing to see whether the problem is with how I am teaching. In this case, maybe I just need to be more interesting! At least I should be better prepared to talk about computing with a non-technical audience.

(By the way, I do know how to fix that Windows computer.)

But I think that Brumer and Fortnow are talking about something bigger. Most people aren't all that interested in science these days. They are interested in the end-user technology -- just ask them to show you the cool features on their new cell phones -- but not so much in the science that underlies the technology. Were folks in prior times more curious about science? Has our "audience" changed?

Again, we should think about where else responsibility for such change may lie. Certainly our science has changed over time. It is often more abstract than it was in the past, farther removed from the user's experience. When you drop too many layers of abstraction between the science and the human experience, the natural response of the non-scientist is to view the science as magic, impenetrable by the ordinary person. Or maybe it's just that the tools folks use are so commonplace that they pay the tools no mind. Do us old geezers think much about the technology that underlies pencils and the making of paper?

The other side of this issue is that Brumer found, after leaving his scientific post for a public policy position, that he is now something of a star among his friends and acquaintances. They want to know what he thinks about policy questions, about the future. Ironic, huh? Scientists and technologists create the future, but people want to talk to wonks about it. They must figure that a non-scientist has a better chance of communicating clearly with them. Either they don't fear that something will be lost in the translation via the wonk, or they decide that the risk is worth taking, whatever the cost of that.

This is the bigger issue: understanding and appreciation of science by the non-scientist, the curiosity that the ordinary person brings to the conversation. When I taught my university's capstone course, aimed at all students as their culminating liberal-arts core "experience", I was dismayed by the lack of interest among students about the technological issues that face them and their nation. But it seems sometimes that even CS students don't want to go deeper than the surface of their tools. This is consistent with a general lack of interest in how the world works, and the role that science and engineering play in defining today's world. Many, many people are talking and writing about this, because a scientifically "illiterate" person cannot make informed decisions in the public arena. And we all live with the results.

I guess we need our Carl Sagan. I don't think it's in me, at least not by default. People like Bernard Chazelle and Jeannette Wing are making an effort to step out and engage the broader community on its own terms. I wish them luck in reaching Sagan's level and will continue to do my part on a local scale.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

March 30, 2007 6:51 PM

A Hint for Idealess Web Entrepreneurs

I'm still catching up on blog reading and just ran across this from Marc Hedlund:

One of my favorite business model suggestions for entrepreneurs is, find an old UNIX command that hasn't yet been implemented on the web, and fix that. talk and finger became ICQ, LISTSERV became Yahoo! Groups, ls became (the original) Yahoo!, find and grep became Google, rn became Bloglines, pine became Gmail, mount is becoming S3, and bash is becoming Yahoo! Pipes. I didn't get until tonight that Twitter is wall for the web.

Show of hands -- how many of you have used every one of those Unix commands? The rise of Linux means that my students don't necessarily think of me as a dinosaur for having used all of them!

I wonder when rsync will be the next big thing on the web. Or has someone already done that one, too?

Then Noel Welsh points out a common thread:

The real lesson, I think, is that the basics of human nature are pretty constant. A lot of the examples above are about giving people a way to talk. It's not a novel idea, it's just the manifestation that changes.

Alan Kay is right -- perhaps the greatest impact of computing will ultimately be as a new medium of communication, not as computation per se. Just this week an old friend from OOPSLA and SIGCSE dropped me a line after stumbling upon Knowing and Doing via a Google search for an Alan Kay quote. He wrote, "Your blog illustrates the unique and personal nature of the medium..." And I'm a pretty pedestrian blogger way out on the long tail of the blogosphere.

This isn't to say that computation qua computation isn't exceedingly important. I have a colleague who continually reminds us young whippersnappers about the value added by scientific applications of computing, and he's quite right. But it's interesting to watch the evolution of the web as a communication channel; as our discipline lays the foundation for a new way to speak, we make possible the sort of paradigm shift that Kay foretells. And this paradigm shift will put the lie to the software world's idea that moving from C to C++ is a "paradigm shift". To reach Kay's goal, though, we need to make the leap from social software to everyman-as-programmer, even if that sort of programming may look nothing like what we call programming today.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

March 30, 2007 12:06 PM

One of Those Weeks

I honestly feel like my best work is still ahead of me.
I'm just not sure I can catch up to it.

I owe this gem, which pretty much sums up how I have felt all week, to comedian Drew Hastings, courtesy of the often bawdy but, to my tastes, always funny Bob and Tom Show. Hastings is nearly a decade older than I, but I think we all have this sense sooner or later. Let's hope it passes!

I owe you some computing content, so here is an interview with Fran Allen, who recently received the 2006 Turing Award. She challenges us to recruit women more effectively ("Could the problem be us?") and to help our programming languages and compilers catch up with advances in supercomputing ("Only the bold should apply!").


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal

March 26, 2007 8:14 PM

The End of a Good Blog

Between travels and work and home life, I've fallen way behind in reading my favorite blogs. I fired up NetNewsWire Lite this afternoon in a stray moment just before heading home and checked my Academic CS channel. When I saw the blog title "The End", I figured that complexity theorist Lance Fortnow had written about the passing of John Backus. Sadly, though, he has called an end to his blog, Computational Complexity. Lance is one of the best CS bloggers around, and he has taught me a lot about the theory of computation and the world of theoretical computer scientists. Theory was one of my favorite areas to study in graduate school, but I don't have time to keep up on its conferences, issues, and researchers full time. These days I rely on the several good blogs in this space to keep me abreast. With Computational Complexity's demise, I'll have one less source to turn to. Besides, like the best bloggers, Lance was a writer worth reading regardless of his topic.

I know how he must feel, though... His blog is 4.5 years and 958 entries old, while mine is not yet 3 years old and still shy of 500 posts. There are days and weeks where time is so scarce that not writing becomes easy. Not writing becomes a habit, and pretty soon I almost have to force myself to write. So far, whenever I get back to writing regularly, the urge to write re-exerts itself and all is well with Knowing and Doing again. Fortunately, I still feel the need to write as I learn. But I can imagine a day when the light will remain dim, and writing out of obligation will not seem right.

Fortunately, we all still have good academic CS blogs to read, among my favorites being the theory blogs Ernie's 3D Pancakes and The Geomblog. But I'll miss reading Lance's stuff.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

March 21, 2007 4:35 PM

END DO

John Backus

What prompted me to finally write about Frances Allen winning the Turing Award was a bit of sad news. One of the pioneers of computing, John Backus, has died. Like Allen, Backus also worked in the area of programming languages. He is most famous as the creator of Fortran, as reported in the Times piece:

Fortran changed the terms of communication between humans and computers, moving up a level to a language that was more comprehensible by humans. So Fortran, in computing vernacular, is considered the first successful higher-level language.

I most often think of Backus's contribution in terms of the compiler for Fortran. His motivation to write the compiler and design the language was that shared by many computer scientists through history: laziness. Here is my favorite quote from the CNN piece:

"Much of my work has come from being lazy," Backus told Think, the IBM employee magazine, in 1979. "I didn't like writing programs, and so, when I was working on the IBM 701 (an early computer), writing programs for computing missile trajectories, I started work on a programming system to make it easier to write programs."

Work on a programming system to make it easier to write programs... This is the beginning of computer science as we know it!

Backus's work laid the foundation for Fran Allen's work; in fact, her last big project was called PTRAN, an homage to Fortran that stands for Parallel TRANslation.

One of my favorite Backus papers is his Turing Award essay, Can Programming Be Liberated from the von Neumann Style? (subtitled: A Functional Style and its Algebra of Programs). After all his years working on languages for programmers and translators for machines, he had reached a conclusion that the mainstream computing world is still catching up to, that a functional programming style may serve us best. Every computer scientist should read it.

This isn't the first time I've written of the passing of a Turing Award winner. A couple of years ago, I commented on Kenneth Iverson, also closely associated with his own programming language, APL. Ironically, APL offers a most extreme form of liberation from the von Neumann machine. Thinking of Iverson and Backus together at this time seems especially fitting.

The Fortran programmers among us know what the title means. RIP.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 21, 2007 2:03 PM

Turing Award History: Fran Allen

Frances Allen

For the last month or so, I have been meaning to write about a historic announcement in the computing world: Frances Allen has received the Turing Award. This is the closest thing to a Nobel Prize that computing has. The reason that this announcement is historic? Allen is the first woman to win a Turing. Women have long made seminal contributions to computer science, something that most computer scientists know and appreciate. But it's a milestone when these contributions receive prestigious and public acknowledgment. You can also listen to a short interview with John Hennessy, the well-known computer scientist who is now president of Stanford, in which he puts Allen's award into context.

In one of her interviews since receiving the award, Allen reminds us that women these days are underrepresented in computing compared to when she began her career in the 1950s. Women were better represented in computing than in most other science and technology disciplines up through 1985 or 1990, at which time their numbers began to plummet -- precipitously. This is a trend that most of us in computing would like to reverse, and one that Allen has been working to correct since retiring from IBM five years ago.

Let's not allow the social implication of Allen's award to steal attention from the quality and importance of the technical work she did. The chair of the Turing Award committee said, "Fran Allen's work has led to remarkable advances in compiler design and machine architecture that are at the foundation of modern high-performance computing." Her primary contributions came in the areas of compiler design and program optimization, techniques for making compilers that produce better executable code. Her early work is the theoretical foundation for modern optimizers that work independent of particular source languages and target machines. Her later work focused on optimization of programs for parallel computers, which contributed to high-performance computing for weather simulation and bioinformatics. I found one of her seminal papers in the ACM digital library: Control Flow Analysis; check it out.
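
For a small taste of what such optimization looks like, here is a textbook example of loop-invariant code motion -- my own illustration, not one from Allen's paper -- a transformation a compiler can perform only after its control-flow and data-flow analysis proves that the hoisted expression cannot change inside the loop:

     #include <stdio.h>

     /* Before: y * y is recomputed on every trip through the loop. */
     void fill_naive(int *a, int n, int y)
     {
         for (int i = 0; i < n; i++)
             a[i] = y * y + i;
     }

     /* After: the analysis shows that y * y cannot change inside the
        loop, so the optimizer hoists it and computes it just once. */
     void fill_hoisted(int *a, int n, int y)
     {
         int y_squared = y * y;
         for (int i = 0; i < n; i++)
             a[i] = y_squared + i;
     }

     int main(void)
     {
         int a[5], b[5];
         fill_naive(a, 5, 3);
         fill_hoisted(b, 5, 3);
         for (int i = 0; i < 5; i++)
             printf("%d %d\n", a[i], b[i]);   /* the two columns match */
         return 0;
     }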

Thinking about this award helps us to remember the "applied" value that derives from basic scientific research in an area as theoretical as compiler optimization can be. By making it possible to write really good compilers, we make it possible to create higher-level programming languages. This makes programmers more productive and also widens the potential population of programmers. By advancing parallel high-speed computing, we make it possible to study much larger problems and thus address important social and scientific questions. This latter point is an important one in the context of trying to make computing more attractive to women, who seem to be more interested in careers that "advance the public good" in obvious ways. Allen herself has stated her hope that high-performance computing's role in medical and scientific research will attract women back into our profession.


Posted by Eugene Wallingford | Permalink | Categories: Computing

March 13, 2007 1:55 PM

Yannis's Law on Programmer Productivity

I'm communications chair for OOPSLA 2007 and was this morning updating the CFP for the research papers track, adding URLs to each member of the program committee. Chair David Bacon has assembled quite a diverse committee, in terms of affiliation, continent, gender, and research background. While verifying the URLs by hand, I visited Yannis Smaragdakis's home page and ran across his self-proclaimed Yannis's Law. This law complements Moore's Law in the world of software:

Programmer productivity doubles every 6 years.

I have long been skeptical of claims that there is a "software crisis", that as hardware advances give us incredibly more computing power our ability to create software grows slowly, or even stagnates. When I look at the tools that programmers have today, and at what students graduate college knowing, I can't take seriously the notion that programmers today are less productive than those who worked twenty or more years ago.

We have made steady advances in the tools available to mainstream programmers over the last thirty years, from frameworks and libraries, to patterns, to integrated environments such as .NET and Eclipse, down to the languages we use, like Perl, Ruby, and Python. All help us to produce more and better code in less time than we could manage even back when I graduated college in the mid-1980s.

Certainly, we produced far fewer CS graduates and employed far fewer programmers thirty years ago, and so we should not be surprised that that cohort was -- on average -- perhaps stronger than the group we produce today. But we have widened the channel of students who study CS in recent decades, and these kids do all right in industry. When you take this democratization of the pool of programmers into account, I think we have done all right in terms of increasing productivity of programmers.

I agree with Smaragdakis's claim that a decent programmer working with standard tools of the day should be able to produce Parnas's KWIC index system in a couple of hours. I suspect that a decent undergrad could do so as well.
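
A rough, back-of-the-envelope check of my own: Parnas's 1972 paper estimated that a good programmer could build the KWIC system within a week or two. The 35 years from 1972 to 2007 give just under six doublings under Yannis's Law, a factor of 2^(35/6), or roughly 57. Divide an 80-hour fortnight by 57 and you land at about an hour and a half -- right in the neighborhood of "a couple of hours".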

Building large software systems is clearly a difficult task, one usually dominated by human issues and communication issues. But we do our industry a disservice when we fail to see how far we have come.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 12, 2007 12:34 PM

SIGCSE Day 3: Jonathan Schaeffer and the Chinook Story

The last session at SIGCSE was the luncheon keynote address by Jonathan Schaeffer, who is a computer scientist at the University of Alberta. Schaeffer is best known as the creator of Chinook, a computer program that in the mid 1990s became the second best checker player in the history of the universe and that, within three to five months, will solve the game completely. If you visit Chinook's page, you can even play a game!

I'm not going to write the sort of entry that I wrote about Grady Booch's talk or Jeannette Wing's talk, because frankly I couldn't do it justice. Schaeffer told us the story of creating Chinook, from the time he decided to make an assault on checkers, through Chinook's rise and monumental battles with world champion Marion Tinsley, up to today. What you need to do is to go read One Jump Ahead, Schaeffer's book that tells the story of Chinook up to 1997 in much more detail. Don't worry if you aren't a computer scientist; the book is aimed at a non-technical audience, and in this I think Schaeffer succeeds. In his talk, he said that writing the book was the hardest thing he has ever done -- harder than creating Chinook itself! -- because of the burden of trying to make his story interesting and understandable to the general reader.

If you are a CS professor or student, you'll still learn a lot from the book. Even though it is non-technical, Schaeffer does a pretty good job introducing the technical challenges that faced his team, from writing software to play the game in parallel across as many processors as he could muster, to building databases of the endgame positions so that the program could play endings perfectly. (A couple of these databases are large even by today's standards. Just try to recall how large a billion-billion-entry table must have seemed in 1994!) He also helps us feel what he must have felt when non-intellectual problems arose, such as a power failure in the lab that had been computing a database for weeks, or a mix-up at the hotel where Chinook was playing its world championship match that resulted in oven-like temperatures in the playing room. This snafu may account for one of Chinook's losses in that match.

As a computer scientist, what I found most compelling about the talk was hearing about the dreams, goals, and daily routine of a regular computer scientist. Schaeffer is clearly a bright and talented guy, but he tells his story as one of an Everyman -- a guy with a big ego who obsessively pursued a research goal, whose goal came to have as much of a human element as a technical one. He has added to our body of knowledge, as well as our lore. I think that non-technical readers can appreciate the human intrigue in the Chinook-versus-Tinsley story as well. It's a thriller of a sort, with no violence in its path.

I knew a bit about checkers before I read the book. Back in college, I was trying to get my roommate to join me in a campus-wide chess tournament that would play out over several weeks. I was a chessplayer, but he was only a casual player, so he decided one way to add a bit of spice was for both of us to enter the checkers part of the same tournament. Neither of us knew much about checkers other than how to move the pieces. The dutiful students that we were, we went to Bracken Library and checked out several books on checkers strategy and studied them before the tournament. That's where I learned that checkers has a much narrower search space than chess, and that many of its critical variations are incredibly narrow and also incredibly deep. This helped me to appreciate how Tinsley, the human champion, once computed a variation over 40 moves long at the table while playing Chinook. (Schaeffer did a wonderful job explaining the fear this struck in him and his team: How can we beat this guy? He's more of a machine than our program!)

That said, knowing how to play checkers will help as you read the book, but it's not essential. If you do know, dig out a checkers board and play along with some of the game scores as you read. To me, that added to the fun.

Reading the book is worth the effort if only to learn about Chinook's nemesis, Marion Tinsley (Chinook page | Wikipedia page), the 20th-century checkers player (and math Ph.D. from Ohio State) who until the time of his death was the best checkers player in the world, almost certainly the best checkers player in history, and in many ways unparalleled by any competitor in any other game or sport I know of. Until his first match against Chinook, Tinsley lost only 3 games in 42 years. He stayed retired through the 1960s because he was so much better than his competition that competition was no fun. The appearance of Chinook on the scene, rather than bothering or worrying him (as it did most in the checkers establishment, and as the appearance of master-level chess programs did at first in the chess world), reinvigorated Tinsley, as it now gave him an opponent that played at his level and, even better, had no fear of him. By Tinsley's standard, guys like Michael Jordan, Tiger Woods, and even Lance Armstrong are just part of the pack in their respective sports. Armstrong's prolonged dominance of the Tour de France is close, but Tinsley won every match he played and nearly every game, not just in the single premiere event each year.

The book is good, but the keynote talk was compelling in its own way. Schaeffer isn't the sort of electric speaker that holds his audience by force of personality. He really seemed like a regular guy, but one telling the story of his own passions, in a way that gripped even someone who knew the ending all the way to the end. (His t-shirt with pivotal game positions on both front and back was a nice bit of showmanship!) And one story that I don't remember from the book was even better in person: He talked about how he used to lie in bed next to his wife and fantasize... about Marion Tinsley, and beating him, and how hard that would be. One night his wife looked over and asked, "Are you thinking about him again?"

Seeing this talk reminded me of why I love AI and loved doing AI, and why I love being a computer scientist. There is great passion in being a scientist and programmer, tackling a seemingly insurmountable problem and doggedly fighting it to the end, through little triumphs and little setbacks along the way. Two thumbs up to the SIGCSE committee for its choice. This was a great way to end SIGCSE 2007, which I think was one of the better SIGCSEs in recent years.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 10, 2007 12:39 PM

SIGCSE This and That

Here are some miscellaneous ideas from throughout the conference...

Breadth in Year 1

On Friday, Jeff Forbes and Dan Garcia presented the results of their survey of introductory computer science curricula across the country. The survey was methodologically flawed in many ways, which makes it not so useful for drawing any formal conclusions. But I did notice a couple of interesting ways that schools arrange their first-year courses. For instance, Cal Tech teaches seven different languages in their intro sequence. Students must take three -- two required by the department, and a third chosen from a menu of five options. Talk about having students with different experiences in the upper-division courses! I wonder how well their students learn all of these languages (some are small, like Scheme and Haskell), and I wonder how well this would work at other schools.

Many Open Doors

In the same session, I learned that Georgia Tech offers three different versions of CS1: the standard course for CS majors, a robotics-themed course for engineers, and the media computation course that I adopted last fall for my intro course. Even better, they let CS majors take any of the CS1s to satisfy their degree requirement.

This is the sort of first-year curriculum that we are moving to at UNI. For a variety of reasons, we have had a hard time arriving at a common CS1/CS2 sequence that satisfies all of our faculty. We've had parallel tracks in Java and C/C++ for the last few years, and we've decided to make this idea of alternate routes into the major a feature of our program, rather than a curiosity (or a bug!). Next year, we will offer three different CS1/CS2 sequences. Our idea is that with "many open doors", more different kinds of students may find what appeals to them about CS and pursue a major or minor. Recruitment, retention, faculty engagement -- I have high hopes that this broadening of our CS1 options will help our department get better.

No Buzz

Last year, the buzz at SIGCSE was "Python rising". That seemed a natural follow-up to SIGCSE 2005, where the buzz seemed to be "Java falling". But this year, neither of these trends seems to have gained steam. Python is out there seeing some adoptions, but Java remains strong, and it doesn't seem to be going anywhere fast.

I don't feel a buzz at SIGCSE this year. The conference has been useful to me in many ways, and I've enjoyed many sessions. But there doesn't seem to be energy building behind any particular something that points to a different sort of future.

That said, I do notice the return of games to the world of curriculum. Paper sessions on games. Special sessions on games. Game development books at every publisher's booth. (Where did they come from? Books take a long time to write!) Even still, I don't feel a buzz.

The idea that causes the most buzz for me personally is the talk of computational thinking, and what that means for computer science as a major and as a discipline for all.

RetroChic CS

I am sitting in a session on nifty assignments just now. The assignments have ranged from nifty to less so, but the last takes us all back to the '70s and '80s. It is Dave Reed's explorations with ASCII art, modernized as ASCIImations. He started his talk with a quote that seems a fitting close to this entry:

Boring is the new nifty.
-- Stuart Reges, 2006

Up next: the closing luncheon and keynote talk by Jonathan Schaeffer, of Chinook fame.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 09, 2007 9:11 PM

SIGCSE Day 2: Read'n', Writ'n', 'Rithmetic ... and Cod'n'

Grady Booch, free radical

Back in 2005, Grady Booch gave a masterful invited talk to close OOPSLA 2005, on his project to preserve software architectures. Since then, he has undergone open-heart surgery to repair a congenital defect, and it was good to see him back in hale condition. He's still working on his software architecture project, but he came to SIGCSE to speak to us as CS educators. If you read Grady's blog, you'll know that he blogged about speaking to us back on March 5. (Does his blog have permalinks to sessions that I am missing?) There, he said:

My next event is SIGCSE in Kentucky, where I'll be giving a keynote titled Read'n, Writ'n, 'Rithmetic...and Code'n. The gap between the technological haves and have-nots is growing and the gap between academia and the industries that create these software-intensive systems continues to be much lamented. The ACM has pioneered recommendations for curricula, and while there is much to praise about these recommendations, that academia/industry gap remains. I'll be offering some observations from industry why that is so (and what might be done about it).

And that's what he did.

A whole generation of kids has grown up not knowing a time without the Internet. Between the net, the web, iPods, cell phones, video games with AI characters... they think they know computing. But there is so much more!

Grady recently spent some time working with grade-school students on computing. He did a lot of the usual things, such as robots, but he also took a step that "walked the walk" from his OOPSLA talk -- he showed his students the code for Quake. Their eyes got big. There is real depth to this video game thing!

Grady is a voracious reader, "especially in spaces outside our discipline". He is a deep believer in broadening the mind by reading. This advice doesn't end with classic literature and seminal scientific works; it extends to the importance of reading code. Software is everywhere. Why shouldn't we read it to learn to understand our discipline? Our CS students will spend far more of their lives reading code than writing it, so why don't we ask them to read it? It is a great way to learn from the masters.

According to a back-of-the-envelope calculation by Grady and Richard Gabriel, 38 billion lines of new and modified code are created each year. Code is everywhere.

While there may be part of computer science that is science, that is not what Booch sees on a daily basis. In industry, computing is an engineering problem, the resolution of forces in constructing an artifact. Some of the forces are static, but most are dynamic.

What sort of curriculum might we develop to assist with this industrial engineering problem? Booch referred to the IEEE Computer Society's Software Engineering Body of Knowledge (SWEBOK) project in light of his own software architecture effort. He termed SWEBOK "noble but failed", because the software engineering community was unable to reach a consensus on the essential topics -- even on the glossary of terms! If we cannot identify the essential knowledge, we cannot create a curriculum to help our students learn it.

He then moved on to curriculum design. As a classical guy, he turned to the historical record, the ACM 1968 curriculum recommendations. Where were we then?

Very different from now. A primary emphasis in the '68 curriculum was on mathematical and physical scientific computing -- applications. We hadn't laid much of the foundation of computer science at that time, and the limitations of both the theoretical foundations and physical hardware shaped the needs of the discipline and thus the curriculum. Today, Grady asserts that the real problems of our discipline are more about people than physical limits. Hardware is cheap. Every programmer can buy all the computing power she needs. The programmer's time, on the other hand, is still quite expensive.

What about ACM 2005? As an outsider, Grady says, good work! He likes the way the problem has been decomposed into categories, and the systematic way it covers the space. But he also asks about the reality of university curricula; are we really teaching this material in this way?

But he sees room for improvement and so offered some constructive suggestions for different ways to look at the problem. For example, the topical categories seem limited. The real world of computing is much more diverse than our curriculum. Consider...

Grady has worked with SkyTV. Most of their software, built in a web-centric world, is less than 90 days old. Their software is disposable! Most of their people are young, perhaps averaging 28 years old or so.

He has also worked with people at the London Underground. Their software is old, and their programmers are old (er, older). They face a legacy problem like no other, both in their software and in their physical systems. I am reminded of my interactions with colleagues from Lucent, who work with massive, old physical switching systems driven by massive, old programs that no one person can understand.

What common theme do SkyTV and London Underground folks share? Building software is a team sport.

Finally, Grady looked at the ACM K-12 curriculum guidelines. He was so glad to see it, so glad to see that we are teaching the ubiquitous presence of computing in contemporary life to our young! But we are showing them only the fringes of the discipline -- the applications and the details of the OS du jour. Where do we teach them our deep ideas, the beauty and nobility of our discipline?

As he shifted into the home stretch of the talk, Grady pointed us all to a blog posting he'd come across called The Missing Curriculum for Programmers and High Tech Workers, written by a relatively recent Canadian CS grad working in software. He liked this developer's list and highlighted for us many of the points that caught his fancy as potential modifications to our curricula, such as:

  • Sometimes, working harder or longer won't get the job done.
  • Learn a scripting language!
  • Documentation is essential, but it must be tied to code.
  • Learn the patterns of organization behavior.
  • Learn about many other distinctly human elements of the profession, like meetings (how to stay awake, how to avoid them), hygiene (friend or foe?), and planning for the future.

One last suggestion for our consideration involved his Handbook of Software Architecture. There, he has created categories of architectures that he considers the genres of our discipline. Are these genres that our students should know about? Here is a challenging thought experiment: what if these genres were the categories of our curriculum guidelines? I think this is a fascinating idea, even if it would ultimately fail. How would a CS curriculum change if it were organized exclusively around the types of systems we build, rather than mostly around the abstractions of our discipline? Perhaps that would misrepresent CS as science, but what would it mean for those programs that are really about software development, the sort of engineering that dominates industry?

Grady said that he learned a few lessons from his excursion into the land of computing curriculum about what (else) we need to teach. Nearly all of his lessons are the sort of things non-educators seem always to say to us: Teach "essential skills" like abstraction and teamwork, and teach "metaskills" like the ability to learn. I don't mean to diminish these remarks as not valuable, but I don't think these folks realize that we do actually try to teach these things; they are hard to learn, especially in the typical school setting, and so they are hard to teach. We can address the need to teach a scripting language by, well, adding a scripting language to the curriculum in place of something less relevant these days. But newbies in industry don't abstract well because they haven't gotten it yet, not because we aren't trying.

The one metaskill on his list that we really shouldn't forget, but sometimes do, is "passion, beauty, joy, and awe". This is what I love about Grady -- he considers these metaskills, not squishy non-technical effluvium. I do, too.

During his talk, Grady frequently used the phrase "right and noble" to describe the efforts he sees folks in our industry making, including in CS education. You might think that this would grow tiresome, but it didn't. It was uplifting.

It is both a privilege and a responsibility, says Grady, to be a software developer. It is a privilege because we are able to change the world in so many, so pervasive, so fundamental ways. It is a responsibility for exactly the same reason. We should keep this mind, and be sure that our students know this, too.

At the end of his talk, he made one final plug that I must relay. He says that patterns are the coolest, most important thing that has happened in the software world over the last decade. You should be teaching them. (I do!)

And I can't help passing on one last comment of my own. Just as he did at OOPSLA 2005, before giving his talk he passed by my table and said hello. When he saw my iBook, he said almost exactly the same thing he said back then: "Hey, another Apple guy! Cool."


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

March 09, 2007 4:02 PM

SIGCSE Day 1: Computational Thinking

I have written before about Chazelle's efforts to pitch computer science as an algorithmic science that matters to everyone, but I have been remiss in not yet writing about Jeannette Wing's Communications of the ACM paper on Computational Thinking. (Her slides are also available on-line.) Fortunately, there was a panel session on computational thinking today at which Wing and several like-minded folks shared their ideas and courses.

Jeannette Wing

Wing's dream is that, like the three Rs, every child learns to think like a computer scientist. As Alan Kay keeps telling us, our discipline fundamentally changes how people think and write -- or should. Wing is trying to find ways to communicate computational thinking and its importance to everyone. In this post, I summarize some of her presentation.

Two "A"s of computational thinking distinguish it from other sciences:

  • abstraction: Computational thinkers work at multiple levels of abstraction. Other scientists do, too, but their bread-and-butter is at a single layer. Computational thinkers work at multiple levels at the same time.

  • automation: Computational thinkers mechanize their abstractions. This gives us two more "A"s -- the ability and audacity to scale up rapidly, to build large systems.

Wing gave a long, long list of examples of computational thinking. A few will sound familiar to readers of this and similar blogs: We ask and answer these questions: How difficult is it to solve this problem? How can we best solve it? We reformulate problems in terms of other problems that we already understand. We choose ways to represent data and process. We think of data as code, code as data. We look for opportunities to solve problems by decomposing them and by applying abstractions.

If only we look, we can see evidence of the effects of computational thinking in other disciplines. There are plenty of surface examples, but look deeper. Machine learning has revolutionized statistics; math and statistics departments are hiring specialists in neural nets and data mining into their statistics positions. Biology is beginning to move beyond the easy computational problems, such as data mining, to the modeling of biological processes with algorithms. Game theory is a central mechanism in economics.

Computational thinking is bigger than what most people think of computer science. It is about conceptualizing, not programming; about ideas, not artifacts. In this regard, Wing has two messages for the general public:

  • Intellectually challenging and engaging scientific problems in computer science remain to be solved. Our only limits are curiosity and creativity.

  • One can major in computer science and do anything -- just like English, political science, and mathematics.

You can find more of Wing's ideas at her Computational Thinking web site.

Wing was only one participant on this panel. The other folks offered some interesting ideas as well, but her energy carried the session. The one other presentation that made my list of ideas to try was Tom Cortina's description of his cool Principles of Computation course at Carnegie Mellon. What stuck with me were some of his non-programming, non-computing examples of important computing concepts, such as:

  • booking flights to give four talks over break, and then thirty-nine (the Traveling Salesman Problem and computational complexity) -- rough numbers for this and the laundry example appear just after this list

  • washing and drying five loads of laundry (pipelining), which even the least mathematical students can understand: you can dry one load while washing another!

  • four-way stops in the road (races and deadlocks), including a great story of the difference between New York drivers (race) and Pittsburgh drivers (deadlock)
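
A couple of rough numbers behind the first two examples (my own figures, not Cortina's): four talks means only 4! = 24 possible orderings to compare, while thirty-nine talks means 39!, on the order of 10^46 -- far too many to try them all. And if a wash takes 30 minutes and a dry takes 60, then five loads done strictly one after the other take 5 x 90 = 450 minutes, while overlapping the washer and the dryer takes only 30 + 5 x 60 = 330.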

My department is beginning to implement courses aimed at bringing computational thinking to the broader university community, including an experimental Computational Modeling and Simulation course next fall. Perhaps we can incorporate some of these ideas into our course, and see how they play in Cedar Falls.


Posted by Eugene Wallingford | Permalink | Categories: Computing

March 08, 2007 8:03 PM

SIGCSE Day 1: Media Computation BoF

A BoF is a "birds of a feather" session. At many conferences, BoFs are a way for people with similar interests to get together during down time in the conference to share ideas, promote an idea, or gather information as part of a bigger project.

Tonight I attended a BoF on the media computation approach I used in CS I last semester. The developers of this approach, Mark Guzdial and Barbara Ericson, organized the session, called "A Media Computation Art Gallery and Discussion", to showcase work done by students in the many college and high school courses that use their approach. You can access the movies, sounds, and video shown at this on-line gallery.

Keller McBride's color spray artwork

The picture to the right was an entry I submitted, an image generated by a program written by my CS I student Keller McBride. This picture demonstrates the sort of creativity that our students have, just waiting to get out when given a platform. I don't know how novel the assignment itself was, but here's the idea. Throughout the course, students do almost all of their work using a set of classes designed for them, which hide many of the details of Java image and sound. In one lab exercise, students played with standard Java graphics programming using java.awt.Graphics objects. That API gives programmers some power, but it has always seemed more complicated than is truly necessary. My 8-year-old daughter ought to be able to draw pictures, too! So, while we were studying files and text processing, I decided to try an assignment that blended files, text, and images. I asked my students to write an interpreter for a simple straight-line language with commands like this:

     line 10 20 300 400
     circle 100 200 50

The former draws a line from (10, 20) to (300, 400), and the latter a circle whose center point is (100, 200) and whose radius is 50.
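
To give a feel for how small such an interpreter can be, here is a sketch of its dispatch loop. My students worked in Java with the course's picture classes; this version is in C with the drawing calls stubbed out, so it is only an illustration of the shape of the program:

     #include <stdio.h>
     #include <string.h>

     /* Stubs standing in for real drawing operations on an image. */
     void draw_line(int x1, int y1, int x2, int y2)
     {
         printf("line from (%d, %d) to (%d, %d)\n", x1, y1, x2, y2);
     }

     void draw_circle(int cx, int cy, int r)
     {
         printf("circle centered at (%d, %d) with radius %d\n", cx, cy, r);
     }

     /* Read commands one at a time and dispatch on the command name. */
     int main(void)
     {
         char command[32];
         int  a, b, c, d;

         while (scanf("%31s", command) == 1) {
             if (strcmp(command, "line") == 0
                     && scanf("%d %d %d %d", &a, &b, &c, &d) == 4)
                 draw_line(a, b, c, d);
             else if (strcmp(command, "circle") == 0
                     && scanf("%d %d %d", &a, &b, &c) == 3)
                 draw_circle(a, b, c);
             else
                 fprintf(stderr, "bad command: %s\n", command);
         }
         return 0;
     }

Feed it the two commands above and it reports the line and circle it would draw; the real assignment replaces the stubs with calls that set pixels in the image.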

This is the sort of assignment that lies right in my sweet spot for encouraging students to think about programming languages and the programs that process them. Even a CS I student can begin to appreciate this central theme of computing!

Students were obligated to implement the line and circle commands, and then to create and implement a command of their own choosing. I expected squares and ovals, which I received, and text, which I did not. Keller implemented something I never expected: a colorSpray command that takes a density argument and then produces four sprays, one from each corner. I describe the effect as shaking a paint brush full of color and watching the color disperse in an arc from the brush, becoming less dense as the paint moves away from the shaker.

This was a CS1 course, so I wasn't expecting anything too fancy. Keller even discounted the complexity of his code in a program comment:

* My Color Spray method can only be modified by how many arcs it creates, not really fancy, but I did write it from scratch, and I think it's cool.

I do, too. The code uses nested loops, one determinate and one indeterminate, and does some neat arithmetic to generate the spray effect. This is real programming, in the sense that it requires discovering equations to build a model of something the programmer understands at a particular level. It required experimentation and thought. If all my students could get so excited by an assignment or two each semester, our CS program would be a much better place.

At the BoF, one attendee asked how he should respond to colleagues at his university who ask "Why teach this? Why not just teach Photoshop?" Mark's answer was right on the mark for one audience. Great artists understand their tools and their media at a deep level. This approach helps future Photoshop users really understand the tool they will be using. And, as another attendee pointed out, many artists bump up against the edges of Photoshop and want to learn a little programming precisely so they can create filters and effects that Adobe didn't ship in their software. The answer to this question for a CS or technical audience ought to be obvious -- students can encounter so many of the great ideas of computing in this approach; it motivates many students so much more than "convert Celsius to Fahrenheit"; and it is relevant to students' everyday experiences with media, computation, and data.

The CS education community owes Mark and Barb a hand for their work developing this wonderful idea through to complete, flexible, and usable instructional materials -- in two different languages, no less! I'm looking forward to their next step, a data structures course that builds on top of the CS 1 approach. We may have a chance to try it out in Fall 2008, if our faculty approve a third offering of media computation CS I next spring.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

February 23, 2007 7:58 PM

**p++^=q++=*r---s

I thought about calling this piece "The Case Against C" but figured that this legal expression in the language makes a reasonably good case on its own. Besides, that is the name of an old paper by P.J. Moylan from which I grabbed the expression. ( PDF | HTML, minus footers with quotes) When I first ran across a reference to this paper, I thought it might make a good post for my attempt at the Week of Science, but by the author's own admission this is more diatribe than science.

C is a medium-level language combining
the power of assembly language with
the readability of assembly language.

I finally finished reading the paper on my way home from ChiliPLoP, and it presented a well-reasoned argument that computer scientists and programmers should consider using languages that incorporate the advances made in programming languages since the creation of C. He doesn't diss C, and even professes an admiration for it; rather, he speaks to specific features about which we know much more now than we did in the early 1970s. I ended up collecting a few good quotes, like the one above, and a few good facts, trivia, and guidelines. I'll share some of the more notable ones with you.

Facts, Trivia, and Guidelines

  • One of the features of C that is behind the times is its weak support for modular code. C supports separate compilation of modules, but Moylan reminds us that modularity is really about information hiding and abstraction. In this regard, C's system is lacking. Moylan gives a very nice description of ten practices that one can adopt in order to build effective modular programs in C, ranging from technical advice such as "Exactly one header file per module.", "Every module must import its own header file, as a consistency check.", and "The compiler warning 'function call without prototype' should be enabled, and any warning should be treated as an error." to team practices such as "Ideally, programmers working in a team should not have access to one another's source files. They should share only object modules and header files." He is not optimistic about the consistent use of these rules, though:

    Now, the obvious difficulty with these rules is that few people will stick to them, because the compiler does not enforce them. ... And, what is worse, it takes only one programmer in a team to break the modularity of a project, and to force the rest of the team to waste time with grep and with mysterious errors caused by unexpected side-effects.

  • Moylan gives a simple example that shows as concisely as possible how C's #include directive can lead to a program that is in an inconsistent state because some modules which should have been re-compiled were not. The remedy of always recompiling everything is obviously unattractive to anyone working on a large system.

  • Conventional wisdom says that C compilers produce faster code than compilers for other languages. Moylan objects on several grounds, including the lack of any substantial recent evidence for the conventional wisdom. He closes with my favorite piece of trivia from the paper:

    It is true that C compilers produced better code, in many cases, than the Fortran compilers of the early 1970s. This was because of the very close relationship between the C language and PDP-11 assembly language. (Constructs like *p++ in C have the same justification as the three-way IF of Fortran II: they exploit a special feature of the instruction set architecture of one particular processor.) If your processor is not a PDP-11, this advantage is lost.

    I learned my assembly language and JCL on an IBM mainframe and so never had the pleasure of writing assembly for a PDP-11. (I did learn Lisp on a PDP-8, though...) Now I want to go learn about the PDP-11's assembly language so that I can use this example at greater depth in my compilers course next semester.

    Favorite Quotes

    You've already seen one above. My other favorite is:

    Much of the power of C comes from having a powerful preprocessor. The preprocessor is called a programmer.

    There were other good ones, but they lack the technical cleanness of the best because they could well be said of other languages. Examples from this category include "By analysis of usenet source, the hardest part of C to use is the comment." and "Real programmers can write C in any language." (My younger readers may not know what Moylan means by "usenet", which makes me feel old. But they can learn more about it here.)

    ----

    As readers here probably know from earlier posts such as this one, I'm as tempted by a guilty language pleasure as anyone, so I enjoyed Moylan's article. But even if we discount the paper's value for its unabashed advocacy on language matters, we can also learn from his motivation:

    I am not so naive as to expect that diatribes such as this will cause the language to die out. Loyalty to a language is very largely an emotional issue which is not subject to rational debate. I would hope, however, that I can convince at least some people to re-think their positions.

    I recognise, too, that factors other than the inherent quality of a language can be important. Compiler availability is one such factor. Re-use of existing software is another; it can dictate the continued use of a language even when it is clearly not the best choice on other grounds. (Indeed, I continue to use the language myself for some projects, mainly for this reason.) What we need to guard against, however, is making inappropriate choices through simple inertia.

    Simply keeping in mind that we have a choice of language each time we start a new project is a worthwhile lesson to learn.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

    February 19, 2007 11:39 PM

    How to Really Succeed in Research...

    ... when you work at a primarily undergraduate school.

    My Dean sent me a link to Guerrilla Puzzling: a Model for Research (subscription required), a Chronicle Observer piece by Marc Zimmer. The article describes nature as "full of intriguing puzzles for researchers to solve". Unlike the puzzles we buy at the store, the picture isn't known ahead of time, and the shape and number of pieces aren't even known. Scientists have "to find the pieces before trying to place them in the puzzle."

    From this analogy, Zimmer argues that researchers at schools devoted to undergraduates can make essential and valuable contributions to science despite lacking the time, manpower, and financial resources available to scientists at our great research institutions. While some at the elite liberal arts colleges do so by mimicking the big-school model on a smaller scale, most do so by complementing -- not competing with -- the efforts of major research programs.

    The biggest disadvantage of doing research at an undergraduate institution is the different time scale. An undergraduate is in school for only four years or so, and the typical undergrad is capable of contributing to a project for much less time, perhaps a year or two. In computer science, where even empirical science often requires writing code, the window of opportunity can be especially small. My hardest adjustment in going from graduate school researcher to faculty researcher was the speed with which students, even master's students, moved from entering the lab to collecting their sheepskin. Just as we felt comfortable and productive together, the student was gone.

    Zimmer points out one positive in this pace: the implicit license to take bigger chances. When one's grad students depend on successful projects to land and hold their future jobs, an advisor feels some moral duty to select projects with reasonable chances for success. An undergrad, on the other hand, benefits just from participating in research. Thus the advisor can take a chance on a project with a higher reward/risk profile.

    How is the researcher at the primarily undergraduate institution to succeed? Via what he calls "guerrilla" puzzling:

    • Start working in a new area, while the big guys are still writing the grants they need to get started, and pick off the easiest problems.

      This strategy requires a pretty keen sense of one's field. But it can also be helped along by listening to deep thinkers and making connections.

      One of our faculty members is collaborating on a grid computing project with a local data center, and they are attacking a particular set of questions that bigger schools and more prominent industrial concerns haven't yet been able to pin down. Sometimes, it's easier to be agile when you're small.

    • Start working in an area whose solutions seem to generate new problems.

      It seems to me that certain parts of the theoretical CS world work this way. My former student's work applying the "theory of networks" to Linux package relationships fits into an unfolding body of work where every new application of network ideas creates a set of questions that sustain the next round of study. Another faculty member here has built a productive career by following a trail of generative questions in medical informatics and database organization and search.

    • Start working on distinctive, attractive problems.

      I'm not sure I get this one. Why aren't the big guys working on these? Presumably, they can see them as well as the researcher at the undergrad school, and they have the resources they need to move in and dominate the problem.

    Sometimes, small-school researchers can create a niche in their problem space by attacking a well-defined, focused problem, solving it, and then moving on to another. In some ways, computer science is more amenable to this straightforward approach. Unless you work in a "big iron" area of computing like supercomputing, the lab equipment that one needs is pretty simple and not beyond the financial means of anyone. And these days one can even do very interesting projects in parallel and distributed computing using clusters of commodity processors. So a CS researcher at an undergrad institution can compete on a focused problem nearly as well as someone at a larger school. The primary advantage of the large-school program is in numbers -- an army of grad students can deforest a problem area pretty quickly, and it can be agile, too.

    One thing is for certain: A scientist at a primarily undergrad school has to think consciously about how to build a research program, and mimicking one's Research I advisor isn't likely the most reliable path to success.


    Posted by Eugene Wallingford | Permalink | Categories: Computing

    February 19, 2007 12:03 AM

    Being Remembered

    Charisma in a teacher is not a mystery or a nimbus of personality, but radiant exemplification to which the student contributes a corresponding radiant hunger to becoming.

    -- William Arrowsmith

    On a day when I met with potential future students and their parents, several former students of mine sent me links to this xkcd comic:

    xkcd -- my god, it's full of cars

    I laughed out loud twice while reading it, first at the line "My god, it's full of cars." and then at the final panel's nod at a postmodern god. The 'cars' line was a common subject line in the messages I received.

    As these messages rolled in, I also felt a great sense of honor. Students whom I last saw two or seven or ten years ago thought enough of their time studying here to remember me upon reading this comic and then send me an e-mail message. All studied Programming Languages and Paradigms with me, and all were affected in some lasting way by their contact with Scheme. One often hears about how a teacher's effect on the world is hard to measure, like a ripple on the water sent out into the future. I am humbled that some really neat people out in this world recall a course they took with me.

    Of course, I know that a big part of this effect comes from the beauty of the ideas that I am fortunate enough to teach. Scheme and the ideas it embodies have the power to change students who approach it with a "radiant hunger to becoming". I am merely its channel. That is the great privilege of scholars and teachers, to formulate and share great ideas, and to encourage others to engage the ideas and make them grow.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    February 12, 2007 8:15 PM

    3 Out Of 5 Ain't Bad

    ... with apologies to Meat Loaf.

    Last week, I accepted the Week of Science Challenge. How did I do? On sheer numbers, not so well. I did manage to blog on CS topics on three of the five business days. That's not a full week of posts, but it is more than I have written recently, especially straight CS content.

    On content, I'm also not sure I did so well. The first post, on programming patterns across paradigms, is a good introduction to an issue of longstanding interest to me, and it caused me to think about Felleisen's important paper. The second post, on the beautiful unity of program and data, was not what I had hoped it would be. I think I overreached, trying to bring too many different ideas into a single short piece. It either needs to be a lot longer, or it needs a sharper focus. At least I can say that I learned something while writing the essay, so it was valuable to me. But it needs another draft, or three, before it's in a final state. Finally, my last piece of the week turned out okay for a blog entry. Computer Science -- at least a big part of the varied discipline that falls under this umbrella -- is science. We in the field need to do a better job helping people to know this.

    Accepting the challenge served me well by forcing me to write. That sort of constraint can be good. Having to write something, even when not borne away by a flash of inspiration, forced me to write about topics that required extra effort at that moment, and to write quickly enough to get 'er done that day. These constraints, too, can boost creativity, and help build the habit of writing where it has fallen soft in the face of too many other claims on time. In some ways, writing those essays felt like writing essay exams in college!

    I think that I would probably have wanted to write about all of these ideas at some point later, but without the outside drive to write now I would probably have deferred them until a better time, until I was "ready". But would I ever? It's easy for me to wait so long that the idea is no longer fresh enough for me to write about. An interesting Writing Down The Bones-like exercise might be for me to grab an old ideas file and use a random-number generator to pick out one topic every day for a week or so -- and then just write it.

    As for the pieces produced this week, I can imagine writing more complete versions of the last two some day, when time, an inspiration, or a need hits me.

    As I forced myself to look for ideas every day, I noticed my senses were heightened. For example, one evening last week I listened to an Opening Move podcast with Scott Rosenberg, author of Dreaming in Code. This book is about Mitch Kapor's years-long project to build the ultimate PIM, Chandler. During the interview, Rosenberg comments that software is "thought stuff", not subject to the laws of physics. As we implement new ideas, users -- including ourselves -- keep asking for more. My first thought was, I should update my piece on CS as science. CS helps to ask and to answer fundamental questions about what we could reasonably ask for, how much is too much, and what we will have to give up to get certain features or functionality. What are the limits to what we can ask for? Algorithms, theory, and experiment all play a role.

    Maybe in my next draft.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    February 08, 2007 6:23 PM

    Computer Science as Science

    If you're in computer science, you've heard the derision from others on campus. "Any discipline with 'science' in the name isn't." Or maybe you've heard, "What we really need on campus is a department of toaster oven science." These comments reflect a deep misunderstanding of what computing is, or what computer scientists do. We haven't always done a very good job telling our story. And a big part of what happens in the name of CS really isn't science -- it's engineering, or (sadly) technical training. But does that mean that no part of computer science is 'science'?

    Not at all. Computer science is grounded in a deep sense of empiricism, where the scientific method plays an essential role in the process of doing computer science. It's just that the entities or systems that we study don't always spring from the "natural world" without man's intervention. But they are complex systems that we don't understand thoroughly yet -- systems created by technological processes and by social processes.

    I mentioned an example of this sort of research from my own backyard back in December 2004. A master's student of mine, Nate Labelle, studied relationships among open-source software packages. He was interested in seeing to what extent the network of 'imports' relationships bore any resemblance to the social and physical networks that had been studied deeply by folks such as Barabasi and Watts. Nate conducted empirical research: he mapped the network of relationships in several different flavors of Linux as well as a few smaller software packages, and then determined the mathematical properties of the networks. He presented this work in a few places, including a paper at the 6th International Conference on Complex Systems. He concluded that:

    ... despite diversity in development groups and computer system architecture, resource coupling at the inter-package level creates small-world and scale-free networks with a giant component containing recurring motifs; which makes package networks similar to other natural and engineered systems.

    This scientific result has implications for how we structure package repositories, how we increase software robustness and security, and perhaps how we guide the software engineering process.

    Studying such large systems is of great value. Humans have begun to build remarkably complex systems that we currently understand only surface deep, if at all. That is often a surprise to non-computer scientists and non-technology people: The ability to build a (useful) system does not imply that we understand the system. Indeed, it is relatively easy for me to build systems whose gross-level behavior I don't yet understand well (using a neural network or genetic algorithm), or to build systems that perform a task so much better than I do that they seem to operate on a completely different level. A lot of chess and other game-playing programs can easily beat the people who wrote them!

    We can also apply this scientific method to study processes that occur naturally in the world using a computer. Computational science is, at its base, a blend of computer science and another scientific domain. The best computational scientists are able to "think computationally" in a way that qualitatively changes their domain science.

    As Bertrand Russell wrote a century ago, science is about description, not logic or causation or any other naive notion we have about necessity. Scientists describe things. This being the case, computer science is in many ways the ultimate scientific endeavor -- or at least a foundational one -- because computer science is the science of description. In computing we learn how to describe thing and process better, more accurately and more usefully. Some of our findings have been surprising, like the unity of data and program. We've learned that process descriptions whose behavior we can observe will teach us more than static descriptions of the same processes left to the limited interpretative powers of the human mind. The study of how to write descriptions -- programs -- has taught us more about language and expressiveness and complexity than our previous mathematics could ever have taught us. And we've only begun to scratch the surface.

    For those of you who are still skeptical, I can recommend a couple of books. Actually, I recommend them to any computer scientist who would like to reach a deeper understanding of this idea. The first is Herb Simon's seminal book The Sciences of the Artificial, which explains why the term "science of the artificial" isn't an oxymoron -- and why thinking it is is a misconception about how science works. The second is Paul Cohen's methodological text Empirical Methods for Computer Science, which teaches computer scientists -- especially AI students -- how to do empirical research in computer science. Along the way, he demonstrates the use of these techniques on real CS problems. I seem to recall examples from machine learning, which is particularly empirical in its approach.


    Posted by Eugene Wallingford | Permalink | Categories: Computing

    February 06, 2007 10:31 PM

    Basic Concepts: The Unity of Data and Program

    I remember vividly a particular moment of understanding that I experienced in graduate school. As I mentioned last time, I was studying knowledge-based systems, and one of the classic papers we read was William Clancey's Heuristic Classification. This paper described an abstract decomposition for classification programs, the basis of diagnostic systems, that was what we today would call a pattern. It gave us the prototype against which we could pattern our own analysis of problem-solving types.

    In this paper, Clancey discussed how configuration and planning are two points of view on the same problem, design. A configuration system produces as output an artifact capable of producing state changes in some context; a planning system takes such an artifact as input. A configuration system takes as input a sequence of desired state changes, to be produced by the configured system; a planning system produces a sequence of operations that produces desired state changes in the given artifact. Thus, the same kind of system could produce a thing, an artifact, or a process that creates an artifact. In a certain sense, things and processes were the same kind of entity.

    Wow. Design and planning systems could be characterized by similar software patterns. I felt like I'd been shown a new world.

    Later I learned that this interplay between thing and process ran much deeper. Consider this Lisp (or Scheme) "thing", a data value known as a list:

    (a 1 2)

    If I replace the symbol "a" with the symbol "+", I also have a Lisp list of size 3:

    (+ 1 2)

    But this Lisp list is also the Lisp program for computing the sum of 1 and 2! If I give this program to a Lisp interpreter, I will see the result:

    > (+ 1 2)
    3

    In Lisp, there is no distinction between data and program. Indeed, this is true for C, Java, or any other programming language. But the syntax of Lisp (and especially Scheme) is so simple and uniform that the unity of data and program stands out starkly. It also makes Scheme a natural language to use in a course on the principles of programming languages. The syntax and semantics of Lisp programs are so uniform that one can write a Lisp interpreter in about a page of Lisp code. (If you'd like, take a look at my implementation of John McCarthy's Lisp-in-Lisp, in Scheme, based on Paul Graham's essay The Roots of Lisp. If you haven't read that paper, please do soon.)

    There is no distinction between data and program. This is one of the truly beautiful ideas in computer science. It runs through everything that we do, from von Neumann's stored-program computer itself to the implementation of a virtual machine for Java to run inside a web browser.

    A related idea is the notion that programs can exist at arbitrary levels of abstraction. For each level at which a program is data to another program, there is yet another program whose behavior is to produce that data.

    • An assembler produces machine language from assembly language.

    • A compiler produces assembly language from a high-level program.

    • The program-processing programs themselves can be produced. From a language grammar, lex and yacc produce components of the compiler.

    • One computer can pretend to be another. Windows can emulate DOS programs. Mac OS X can run old-style OS 9 programs in its Classic mode. Intel-based Macs can run programs compiled for a PowerPC-based Mac in its Rosetta emulation mode.

    One of the lessons of computer science is that "machine" is an abstract idea. Everything can be interpreted by someone -- or something -- else.

    I don't know enough of the history of mathematics or natural philosophy to say to what extent these ideas are computer science's contributions to our body of knowledge. On the one hand, I'm sure that deep thinkers throughout history at least had reason and resource to make some of the connections between thing and process, between design and planning. On the other, I imagine that before we had digital computers at our disposal, we probably didn't have sufficient vocabulary or the circumstances needed to explore issues of natural language to the level of program versus data, or of languages being processed from abstract to concrete, down to the details of a particular piece of hardware. Church. Turing. Chomsky. McCarthy. These are the men who discovered the fundamental truths of language, data, and program, and who laid the foundations of our discipline.

    At first, I wondered why I hadn't learned this small set of ideas as an undergraduate student. In retrospect, I'm not surprised. My alma mater's CS program was aimed at applications programming, taught a standard survey-style programming languages course, and didn't offer a compilers course. Whatever our students here learn about the practical skills of building software, I hope that they also have the chance to learn about some of the beautiful ideas that make computer science an essential discipline in the science of the 21st century.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

    February 05, 2007 8:46 PM

    Programming Patterns and "The Conciseness Conjecture"

    For most of my research career, I have been studying patterns in software. In the beginning I didn't think of it in these terms. I was a graduate student in AI doing work in knowledge-based systems, and our lab worked on so-called generic tasks, little molecules of task-specific design that composed into systems with expert behavior. My first uses of the term "pattern" were home-grown, motivated by an interest in helping novice programmers recognize and implement the basic structures that made up their Pascal, Basic, and C programs. In the mid-1990s I came into contact with the work of Christopher Alexander and the software patterns community, and I began to write up patterns of both sorts, including Sponsor-Selector, loops, Structured Matcher, and even elementary patterns of Scheme programming for recursion. My interest turned to the patterns at the interface between different programming styles, such as object-oriented and functional programming. It seemed to me that many object-oriented design patterns implemented constructs that were available more immediately in functional languages, and I wondered whether it were true that patterns in any one style would reflect the linguistic shortcomings of the style, implementing ideas available directly in another style.

    My work in this area has always been more phenomenological than mathematical, despite occasional short side excursions into promising mathematics like group and category theory. I recall a 6- to 9-month period five or six years ago when a graduate student and I looked at group theory and symmetries as a possible theoretical foundation for characterizing relationships among patterns. I think that this work still holds promise, but I have not had a chance to take it any further.

    Only recently did I finally read Matthias Felleisen's 1991 paper On the Expressive Power of Programming Languages. I should have read it sooner! This paper develops a framework for comparing languages on the basis of their expressive power and then applies it to many of the issues relevant to language research of the day. One section in particular speaks to my interest in programming patterns and shows how Felleisen's framework can help us to discern the relative merits of more expressive languages and the patterns that they embody. This section is called "The Conciseness Conjecture". Here is the conjecture itself:

    Programs in more expressive programming languages that use the additional features in a sensible manner contain fewer programming patterns than equivalent programs in less expressive languages.

    Felleisen gives a couple of examples to illustrate his conjecture, including one in which Scheme with assignment statements realizes the implementation of a stateful object more concisely, more clearly, and with less convention than a purely functional subset of Scheme. This is just the sort of example that led me to wonder whether functional programming's patterns, like OOP's patterns, embodied ideas that were directly expressible in another style's languages -- a Scheme extended with a simple object-oriented facility would make implementation of Felleisen's transaction manager even clearer than the stateful lambda expression that switches on transaction types.

    Stated as starkly as it is, I am not certain I believe the conjecture. Well, that's not quite true, because in one sense it is obviously true. A more expressive language allows us to write more concise code, and less code almost always means fewer patterns. This is true, of course, because the patterns reside in the code. I say "almost always" because there is an alternative to fewer patterns in smaller code: the same number of patterns, or more, in denser code!

    If we qualify "fewer programming patterns" as "fewer lower-level programming patterns", then I most certainly believe Felleisen's conjecture. I think that this paper makes an important contribution to the study of software patterns by giving us a vocabulary and mechanism for talking about languages in terms of the trade-off between expressiveness and patterns. I doubt that Felleisen intended this, because his section on the Conciseness Conjecture confirms his uneasiness with pattern-driven programming. "The most disturbing consequence," he writes, "of programming patterns is that they are an obstacle to understanding of programs for both human readers and program-processing programs." For him, an important result of his paper is to formalize "how the use of expressive languages seems to be the ability to abstract from programming patterns with simple statements and to state the purpose of a program in the concisest possible manner."

    This brings me back to the notions of "concise" and "dense". I appreciate the goal of using the most abstract language possible to write programs, in order to state as unambiguously and with as little text as possible the purpose and behavior of a program. I love to show my students how, after learning the basics of recursive programming, they can implement a higher-order operation such as fold to eliminate the explicit recursion from their programs entirely. What power! All because they are using a language expressive enough to allow higher-order procedures. Once you understand the abstraction of folding, you can write much more concise code.

    Where is the down side? Increasing concision ultimately leads to a trade-off with understandability. Felleisen points to the danger that dispersion poses for code readers: in the worst case, it requires a global understanding of the program to understand how the program embodies a pattern. But at the other end of the spectrum is the danger posed by concision: density. In the worst case, the code is so dense as to overwhelm the reader's senses. If density were an unadulterated good, we would all be programming in a descendant of APL! The density of abstraction is often the reason "practical" programmers cite for not embracing functional programming. It is an arrogance for us to imply that those who do not embrace Scheme and Haskell are simply not bright enough to be programmers. Our first responsibility is to develop means for teaching programmers these skills better, a challenge that Felleisen and his Teach Scheme! brigade have accepted with great energy. The second is to consider the trade-off between concision and dispersion in a program's understandability.

    Until we reach the purest sense of declarative programming, all programs will have patterns. These patterns are the recurring structures that programmers build within their chosen style and language to implement behaviors not directly supported by the language. The patterns literature describes what to build, in a given set of circumstances, along with some idea of how to build the what in a way that makes the most of the circumstances.

    I will be studying "On the Expressive Power of Programming Languages" again in the next few months. I think it has a lot more to teach me.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

    February 02, 2007 6:13 PM

    Recursing into the Weekend

    Between meetings today, I was able to sneak in some reading. A number of science bloggers have begun to write a series of "basic concepts" entries, and one of the math bloggers wrote a piece on the basics of recursion and induction. This is, of course, a topic much on my mind this semester as I teach functional programming in my Programming Languages course. Indeed, I just gave my first lecture on data-driven recursion in class last Thursday, after having given an introduction to induction on Tuesday. I don't spend a lot of time on the mathematical sort of recursion in this course because it's not all that relevant to the processing of computer programs. (Besides, it's not nearly as fun!)

    This would probably make a great "basic concepts in CS" post sometime, but I don't have time to write it today. But if you are interested, you can browse my lecture notes from the first day of recursive programming techniques in class.

    (And, yes, Schemers among you, I know that my placement of parentheses in some of my procedures is non-standard. I do that in this first session or so, so that students can see the if-expression that mimics our data type stand out. I promise not to warp their Scheme style with this convention much longer.)


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    February 02, 2007 5:57 PM

    Week of Science Challenge, Computer Science-Style

    I don't consider myself a "science blogger", because to me computer science is much more than just science. But I do think of Knowing and Doing as a place for me to write about the intellectual elements of computer science and software development, and there is a wonderful, intriguing, important scientific discipline at the core of all that we do. So when I ran across the Week of Science Challenge in a variety of places, I considered playing along. Doing so would mean postponing a couple of topics I've been thinking about writing about, and it would mean not writing about the teaching side of what I do for the next few days. Sometimes, when I come out of class, that is all I want to do! But taking the challenge might also serve as a good way to motivate myself to write on some "real" computer science issues for a while again. And that would force me to read and think about this part of my world a bit more. Given the hectic schedule I face on and off campus in the next four weeks, that would be a refreshing intellectual digression -- and a challenge to my time management skills.

    I have decided to pick up the gauntlet and take the challenge. I don't think I can promise a post every day during February 5-11, or that my posts will be considered "science blogs" by all of the natural scientists who consider computer science and software development as technology or engineering disciplines. But let's see where the challenge leads. Watch for some real CS here in the next week...


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    January 23, 2007 8:09 AM

    Class Personality and New Ideas

    People you've got the power over what we do
    You can sit there and wait
    Or you can pull us through
    Come along, sing the song
    You know you can't go wrong

    -- Jackson Browne, "The Load-Out"

    Every group of students is unique. This is evident every time I teach a course. I think I notice these differences most in Programming Languages, the course I'm teaching this semester.

    In this course, our students learn to program in a functional style, using Scheme, and then use their new skills to build small language interpreters that help them to understand the principles of programming languages. Because we ask students to learn a new programming style and a language very different from any they know when they enter the course, this course depends more than most on the class's willingness to change their minds. This willingness is really an attribute of each of the individuals in the class, but groups tend to develop a personality that grows out of the individual personalities which make it up.

    The last time I taught this course, I had a group that was eager to try new ideas. These folks were game from Day 1 to try something like functional programming. Many of them had studied a language called Mumps in another course, which had shown them the power that can be had in a small language that does one or two things well. Many of them were Linux hackers who appreciated the power of Unix and its many little languages. Scheme immediately appealed to these students, and they dove right in. Not all of them ended up mastering the language or style, but all made a good faith effort. The result was an uplifting experience both for them and for me. Each class session seemed to have a positive energy that drove us all forward.

    But I recall a semester that went much differently. That class of students was very pragmatic, focused on immediate skills and professional goals. While that's not always my orientation (I love many ideas for their own sake, and look for ways to improve myself by studying them), I no longer fault students who feel this way. They usually have good reasons for having developed such a mindset. But that mindset usually doesn't make for a very interesting semester studying techniques for recursive programming, higher-order procedures, syntactic abstractions, and the like. Sure, these ideas show up -- increasingly often -- in the languages that they will use professionally, and we can make all sorts of connections between the ideas they learn and the skills they will need writing code in the future. It's just that without a playful orientation toward new ideas, a course that reaches beyond the here-and-now feels irrelevant enough to many students to seem an unpleasant burden.

    That semester, almost every day was a chore for me. I could feel the energy drain from my body as I entered the room each Tuesday and Thursday and encountered students who were ready to leave before we started. Eventually we got through the course, and the students may even have learned some things that they have since found useful. But at the time the course was like a twice-weekly visit to the dentist to have a tooth pulled.

    In neither of these classes was there only the one kind of student. The personality of the class was an amalgam, driven by the more talkative members or by the natural leaders among the students. In one class, I would walk into the room and find a few of them discussing some cool new thing they had tried since the last time we met; in the other, they would be discussing the pain of the current assignment, or a topic from some other course they were enjoying. These attitudes pervaded the rest of the students and, at least to some extent, me. As the instructor, I do have some influence over the class's emotional state of mind. If I walk into the room with excitement and energy, my students will feel that. But the students can have the same effect. The result is a symbiotic process that requires a boost of energy from both sides every class period.

    We are now two weeks into the new semester, and I am trying to get a feel for my current class. The vocal element of the class has been skeptical, asking lots of "why?" questions about functional programming and Scheme alike. So far, it hasn't been the negative sort of skepticism that leads to a negative class, and some of the discussion has had the potential to provoke their curiosity. As we get deeper into the meat of the course, and students have a chance to write code and see its magic, we could harness their skepticism into a healthy desire to learn more.

    Over the years, I've learned how better to respond to the sort of questions students ask at the outset of the semester in this course. My goal is to lead the discussion in a way that is at the same time intellectually challenging and pragmatic. I learned long ago that appealing only to the students' innate desire to learn abstract ideas such as continuations doesn't work for the students in my courses. In most practical ways, the course is about what they need to learn, not about what I want to blather on about. And as much as we academics like papers such as Why Functional Programming Matters -- and I do like this paper a lot! -- it is only persuasive to programmers who are open to being persuaded.

    But I've also found that pandering to students by telling them that the skills they are learning can have an immediate effect on their professional goals does not work in this sort of course. Students are smart enough to see that even if Paul Graham got rich writing ViaWeb in Lisp, most of them aren't going to be starting their own companies, and they are not likely to get a job where Scheme or functional programming will matter in any direct way. I could walk into class each day with a different example of a company that has done something "in the real world" with Scheme or Haskell, and at the end of the term most students would have perceived only thirty isolated and largely irrelevant examples.

    This sort of course requires balancing these two sets of forces. Students want practical ideas, ideas that can change how they do their work. But we sell students short when we act as if they want to learn only practical job skills. By and large they do want ideas, ideas that can change how they think. I'm better at balancing these forces with examples, stories, and subtle direction of classroom discussion than I was ten or fifteen years ago, but I don't pretend to be able to predict where we'll all end up.

    Today we begin a week studying Scheme procedures and some of the features that make functional programming different from what they are used to, such as first-class procedures, variable arity, higher-order procedures, and currying. These are ideas with the potential to capture the students' imagination -- or to make them think to themselves, "Huh?" I'm hopeful that we'll start to build a positive energy that pulls us forward into a semester of discovery. I don't imagine that they'll be just like my last class; I do hope that we can be a class which wants to come together a couple of times every week until May, exploring something new and wonderful.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    January 19, 2007 3:45 PM

    Three Notes for a Friday

    Three thoughts from recent reading. From a professional standpoint, one is serious, one is light, and one is visionary. You'll have to decide.

    Understanding the New

    In their New Yorker article Manifold Destiny, on the social drama that surrounded Grigory Perelman's proof of the Poincaré conjecture, Sylvia Nasar and David Gruber write:

    On September 10, 2004, more than a year after Perelman returned to St. Petersburg, he received a long e-mail from Tian, who said that he had just attended a two-week workshop at Princeton devoted to Perelman's proof. "I think that we have understood the whole paper," Tian wrote. "It is all right."

    Mathematics, the theoretical parts of computer science, and theoretical physics are different from most other fields of human knowledge in a way that many folks don't realize. The seminal advances in these areas create a whole new world. Even the smartest, most accomplished people in the field may well not understand this new world until they have invested a lot of time and effort studying the work.

    This should encourage those of us who don't understand a new idea immediately the first time through.

    Dressing the Part

    In Why I Don't Wear a Suit and Can't Figure Out Why Anyone Does!, America's most X-generation sports owner, Mark Cuban, writes:

    Someone had once told me that you wear to work what your customers wear to work. That seemed to make sense to me, so I followed it, and expected those who worked for me to follow it as well.

    This is good news for college professors. If you believe the conventional wisdom these days, our customers are our students, and their dress is the ultimate in casual and comfortable. I can even wear shorts to class with little concern that my students will care.

    But what about all of our other customers -- parents, the companies that hire our students, the state government, the taxpayers? They generally expect something more, but even still I think that academics are unique among professionals these days in that almost everyone cuts us slack on how we dress. Or maybe no one thinks of us as professionals...

    Now that I am a department head, I have made adjustments in how I dress, because my audience really is more than just my students. I meet with other faculty, higher-level administrators, and members of the community on a regular basis, and where they have expectations I try to meet or exceed them. Cuban is basically right, but you have to think of "customer" in a broader sense. It is "whoever is buying my immediate product right now", and your product may change from audience to audience. The dean and other department heads are consuming a different product than the students!

    Controlling the Present and Future

    Courtesy of James Duncan Davidson, another quote from Alan Kay that is worth thinking about:

    People who are really serious about software should make their own hardware.

    Alan Kay has always encouraged computer scientists to take ownership of the tools they use. Why should we settle for the word processor that we are given by the market? Or the other basic tools of daily computer use, or the operating system, or the programming languages that we have been handed? We have the power to create the tools we use. In Kay's mind, we have the obligation to make and use something better -- and then to empower the rest of the users, by giving them better tools and by making it possible for them to create their own.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    December 18, 2006 4:00 PM

    Another Way Our Students Are Different From Us

    On my CS 1 final exam, I asked the students to suggest a way that we could hide a string in a sound, where a Sound object is encoded as an array of sound samples in the range [-32768..32767]. Here is the last part of the question:

    Under your scheme, how many characters could you hide in a song that is 3:05 long?

    My caricature of the typical answer is...

    It depends.

    Actually, most of the students were a bit more specific:

    It depends on the sampling rate.

    Of course it does. I was looking for an answer of this sort:

    (185 * sampling rate) / 100

    ... for a scheme that stored one character every 100 sound samples. It never occurred to me that most students would get as far as "It depends on the sampling rate." and just stop. When they realized that they couldn't write down an answer such as "42", they must have figured they were done thinking about the problem. I've been doing computer science so long, and enjoying math for much longer, that I naturally wanted to write a formula to express the result. I guess I assumed the students would want to do the same! This is yet another example of how our students are different from us.
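
    To make the formula concrete, here is a minimal sketch in Java of one workable scheme -- my own illustration, not any student's answer -- and it assumes the sound is available as a plain int[] of samples rather than the textbook's Sound class. It hides one ASCII character every 100 samples by overwriting a sample's low-order byte, which gives exactly (185 * sampling rate) / 100 characters for a 3:05 song.

         // One hiding scheme: one character per 100 samples, stored in the
         // low byte of a sample. Samples stay within [-32768..32767] because
         // only the low 8 of the 16 significant bits change.
         public class HiddenText {
             static final int STRIDE = 100;

             public static void hide(int[] samples, String message) {
                 if (message.length() > samples.length / STRIDE)
                     throw new IllegalArgumentException("message too long");
                 for (int i = 0; i < message.length(); i++) {
                     int pos = i * STRIDE;
                     samples[pos] = (samples[pos] & ~0xFF) | (message.charAt(i) & 0xFF);
                 }
             }

             public static String recover(int[] samples, int length) {
                 StringBuilder text = new StringBuilder();
                 for (int i = 0; i < length; i++)
                     text.append((char) (samples[i * STRIDE] & 0xFF));
                 return text.toString();
             }
         }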

    Fortunately, nearly all of them came up with a scheme for hiding a text that would work. Some of their schemes would degrade the sound we hear considerably, but that wasn't the point of the question. My goal was to see whether they could think about our representations at that level. In this, they succeeded well enough.

    Well, I guess that depends.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    November 22, 2006 6:30 PM

    An Old Book Day

    After over a year in the Big Office Downstairs and now the Office with a View, I am finally unpacking all the boxes from my moves and hanging pictures on the walls. Last year it didn't seem to make much sense to unpack, with the impending move to our new home, and since then I've always seemed to have more pressing things to do. But my wife and daughters finally tired of the bare walls and boxes on the floor, so I decided to use this day-off-while-still-on-duty to make a big pass at the task. It has gone well but would have gone faster if only I could go through boxes of old books without looking at them.

    Old books ought to be easy to toss without a thought, right? I mean, who needs a manual for a Forth-79 interpreter that ran on a 1980 Apple ][ ? A 1971 compiler textbook by David Gries (not even mine -- a colleague's hand-me-down)? Who wants to thumb through Sowa's Conceptual Structures, Raphael's The Thinking Computer, Winograd and Flores's Understanding Computers and Cognition? Guilty as charged.

    And while I may not be an active AI researcher anymore, I still love the field and its ideas and classic texts. I spent most of my time today browsing Raphael, Minsky, Schank's Dynamic Memory, Weld and de Kleer's Readings in Qualitative Reasoning about Physical Systems, the notes from a workshop on functional reasoning at AAAI 1994 (one of the last AI conferences I attended). These books brought back memories, of research group meetings on Wednesday afternoons where I cut my teeth as a participant in the academic discussion, of dissertation research, and of becoming a scholar in my own right. There were also programming books to be unpacked -- too many Lisp books to mention, including first and second editions of The Little LISPer, and a cool old book on computer chess from 1982 that is so out of date now as to be hardly worth a thumbing through. But I was rapt. These books brought back memories of implementing the software foundation for my advisor's newly established research lab -- and reimplementing it again and again as we moved onto ever better platforms for our needs. (Does anyone remember PCL?) Eventually, we moved away from Lisp altogether and into a strange language that no one seemed to know much about... Smalltalk. And so I came to learn OOP many years before it came into vogue via C++ and Java.

    Some of these books are classics, books I could never toss out. Haugeland's Mind Design, Schank, Minsky, Raphael, The Little LISPer. Others hold value only in memory of time and place, and how they were with me when I learned AI and computer science.

    I tossed a few books (among them the Gries compiler book) and kept a few more. I told my daughter I was being ruthless, but in reality I was far softer than I could have been. That's okay... I have shelf space to spare yet, at least in this office, and the prospect of my next move is far enough off that I am willing to hold onto that old computer chess book just in case I ever want to play through games by Belle and an early version of Cray Blitz, or steal a little code for a program of my own.

    I wonder what our grandchildren and great-grandchildren will think of our quaint fetish for books. For me, as I close up shop for a long weekend devoted to the American holiday of Thanksgiving, I know that I am thankful for all the many books I've had the pleasure to hold and read and fall asleep with, and thankful for all the wonderful people who took the time to write them.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    November 17, 2006 4:36 PM

    More Serendipity

    When my CS1 class and I explored compression recently, it occurred to one student that we might be able to compress our images, too. Pictures are composed of Pixels, which encode RGB values for colors. We had spent many weeks accessing the color components using accessors getRed(), getGreen(), and getBlue(), retrieving integer values. But the color component values lie in the range [0..255], which would fit in byte variables.

    So I spent a few minutes in class letting students implement compression and decompression of Pixels using one int to hold the three bytes of data. It gave us a good set of in-class exercises to practice on, and let students think about compression some more.

    Then we took a peek at how Pixels are encoded -- and found, much to the surprise of some, that the code for our text already uses our compressed encoding! We had reinvented the existing implementation.

    I didn't mind this at all; it was a nice experience. First, it helped students to see very clearly that there does not have to be a one-to-one relationship between accessors and instance variables. getRed(), getGreen(), and getBlue() do not retrieve the values of separate variables, but rather the corresponding bytes in a single integer. This point, that IVs != accessors, is one I like to stress when we begin to talk about class design. Indeed, unlike many CS1 instructors, I make a habit of creating accessors only when they are necessary to meet the requirements of a task. Objects are about behavior, not state, and I fear that accessors-by-default gives a wrong impression to students. I wonder if this is an unnecessary abstraction that I introduce too early in my courses, but if so then it is just one of my faults. If not, then this was a great way to experience the idea that objects provide services and encapsulate data representation.

    Second, this gave my students a chance to do a little bit arithmetic, figuring out how to use multiplication to move values into higher-order bytes of an integer. Then, when we looked inside the Pixel class, we got to see the use of Java's shift operators to accomplish the same goal. This was a convenient way to see a little extra Java without having to make a big fuss about motivating it. Our context provided all the motivation we needed.
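
    For readers who want to see the arithmetic, here is a minimal sketch of the exercise -- my own version, working on plain int color components in [0..255] rather than the textbook's Pixel class.

         // Packing three color components into one int, two ways.
         public class PackedColor {
             // The arithmetic the students worked out: multiplication moves
             // each component into a higher-order byte of the int.
             public static int packWithArithmetic(int red, int green, int blue) {
                 return red * 256 * 256 + green * 256 + blue;
             }

             // The same idea using Java's shift operators, as in the Pixel class.
             public static int packWithShifts(int red, int green, int blue) {
                 return (red << 16) | (green << 8) | blue;
             }

             // Accessors that pull a byte back out of the packed int --
             // three accessors backed by a single value, not three variables.
             public static int getRed(int rgb)   { return (rgb >> 16) & 0xFF; }
             public static int getGreen(int rgb) { return (rgb >> 8) & 0xFF; }
             public static int getBlue(int rgb)  { return rgb & 0xFF; }
         }

    Calling getRed(packWithShifts(200, 100, 50)) returns 200, with no red instance variable anywhere in sight -- which is exactly the IVs != accessors lesson.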

    I hope the students enjoyed this as much as I did. I'll have to ask as we wrap up the course. (I should practice what I preach about feedback!)


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    November 14, 2006 4:32 PM

    When Opportunity Knocks

    When teaching a class, sometimes a moment comes along that is pregnant with possibilities. I usually like to seize these moments, if only to add a little spice to my own enjoyment of the course. When the new avenue leads to a new understanding for the students, all the better!

    A couple of weeks ago, I was getting ready to finish off the unit on manipulating sound in CS 1. The last chapter of that unit in the textbook didn't excite me as much as I had hoped, and I had also just been sensitized by a student's comment. The day before, at a group advising session, one of my students had commented that the media computation was cool and all, but he wanted "data to crunch", and he was hoping to do more of that in class. My first reaction was, isn't processing a 1-megapixel image crunching enough data? Sure, he might say, but the processing we were doing at each pixel, or at each sample in a sound, was relatively simple.

    With that in the back of my mind somewhere, I was reading the part of the chapter that discussed different encodings for sound, such as MP3 and MIDI. My stream of consciousness was working independently of my reading, or so it seemed. "MP3 is a way to compress sound ... compression algorithms range from simple to complex operations ... we can benefit greatly by compressing sound ... our Sound uses a representation that is rather inefficient ...". Suddenly I knew what I wanted to do in class that day: teach my students how to compress sound!

    Time was short, but I dove into my new idea. I hadn't looked very deeply at how the textbook was encoding sounds, and I'd never written a sound compression algorithm before. The result was a lot of fun for me. I had to come up with a reasonable encoding to compress our sounds, one that allowed me to talk about lossy and lossless compression; I had to make the code work. Then I had to figure out how to tell the story so that students could reach my intended destination. The story had to include code that the students would write, so that they could grow into the idea and feel some need for what I asked them to do.

    I ended up creating a DiffSound that encoded sounds as differences between sound samples, rather than as samples. The differences between samples tend to be smaller than the sample values themselves, which gives us some hope of creating a smaller file that loses little or no sound fidelity.
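
    The heart of the encoding is simple enough to sketch. The code below is a simplified illustration in the spirit of what we wrote, not the actual DiffSound class, and the class and method names are mine.

        // Difference encoding for an array of sound samples -- a simplified
        // sketch, not the DiffSound class from the course.
        public class SampleDiffs {
            // Keep the first sample, then store only the change at each step.
            public static int[] encode(int[] samples) {
                int[] diffs = new int[samples.length];
                if (samples.length == 0) return diffs;
                diffs[0] = samples[0];
                for (int i = 1; i < samples.length; i++)
                    diffs[i] = samples[i] - samples[i - 1];
                return diffs;
            }

            // A running sum rebuilds the original samples exactly, so this
            // much of the scheme is lossless.
            public static int[] decode(int[] diffs) {
                int[] samples = new int[diffs.length];
                if (diffs.length == 0) return samples;
                samples[0] = diffs[0];
                for (int i = 1; i < diffs.length; i++)
                    samples[i] = samples[i - 1] + diffs[i];
                return samples;
            }
        }

    The savings come later, when we store the small differences in fewer bits than the raw samples require; deciding how many bits to keep is where the lossy-versus-lossless discussion enters.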

    This opportunity had another unexpected benefit. The next chapter of the text introduced students to classes and objects. We had been using objects of the Picture, Pixel, Sound, and SoundSample classes in a "bottom-up" fashion, but we had never read a full class definition. And we certainly hadn't written one. The textbook used what was for me an uninspiring first example, a Student class that knows its grades. What was worse than not exciting me was that the class was not motivated by any need the students could feel from their own programming. But after writing simple code to convert a sound from an array of sound samples into an array of sample differences, we had a great reason to create a new class -- to encapsulate our new representation and to create a natural home for the methods that manipulate it. When I first embarked on the compression odyssey, I had no idea that I would be able to segue so nicely into the next chapter. Serendipity.

    After many years of teaching, bumping into such opportunities, and occasionally converting them into improvements to my course, I've learned a few lessons. The first is that not all opportunities are worth seizing. Sometimes, the opportunity is solely to my benefit, letting me play with some new idea. If it produces a zero sum for my students, then it may be worth trying. But too often an opportunity creates a distraction for students, or adds overhead to what they do, and as a result interferes with their learning. Some design patterns work this way for OOP instructors. When you first learn Chain of Responsibility, it may seem really cool, but that doesn't mean that it fits in your course or adds to what your students will learn. Such opportunities are mirages, and I have to be careful not to let them pull me off course.

    But many opportunities make my course better, by helping my students learn something new, or something old in a new way. These are the ideas worth pursuing. The second lesson I've learned is that such an idea usually creates more work for me. It's almost always easier to stay on script, to do what I've done before, what I know well. The extra work is fun, though, because I'm learning something new, too, and getting a chance to write the code and figure out how to teach the idea well. A few years ago, I had great fun creating a small unit on Bloom filters for my algorithms course, after reading a paper on the plane back from a conference. The result was a lot of work -- but also a lot of fun, and an enrichment to what my students learned about the trade-offs between data and algorithm and between efficiency and correctness. That was an opportunity well-seized. But I needed time to turn the possibility into a reality.
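
    For readers who haven't met Bloom filters, here is a minimal sketch in Java -- not the unit from my course, just enough to make the trade-off concrete: a small bit array and a few hash probes buy space and speed at the cost of occasional false positives.

        import java.util.BitSet;

        // A minimal Bloom filter sketch, for illustration only.
        public class BloomFilter {
            private final BitSet bits;
            private final int size;
            private final int numHashes;

            public BloomFilter(int size, int numHashes) {
                this.bits = new BitSet(size);
                this.size = size;
                this.numHashes = numHashes;
            }

            // Derive several bit positions from the item's hash code.
            private int index(Object item, int i) {
                int h = item.hashCode() * 31 + i * 0x9E3779B9;
                return Math.abs(h % size);
            }

            public void add(Object item) {
                for (int i = 0; i < numHashes; i++)
                    bits.set(index(item, i));
            }

            // May say "yes" for an item never added (a false positive),
            // but never says "no" for an item that was added.
            public boolean mightContain(Object item) {
                for (int i = 0; i < numHashes; i++)
                    if (!bits.get(index(item, i)))
                        return false;
                return true;
            }
        }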

    The third lesson I've learned is that using real data and real problems greatly increases the chances that I will see an unexpected opportunity. Images and sounds are rich objects, and manipulating them raises a lot of interesting questions. Were I teaching with toy problems -- converting Fahrenheit to Celsius, or averaging grades in an ad hoc student array -- then the number of questions that might raise my interest or my students' interest would be much smaller. Compression only matters if you are working with big data files.

    Finally, I've learned to be open to the possibility of something good. I have to take care not to fall into the rut of simply doing what's in my notes for the day. Eternal vigilance is the price of freedom, but it is also the price we must pay if we want to be ready to be excited and to excite our students with neat ideas lying underneath the surface of what we are learning.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    October 28, 2006 8:05 PM

    OOPSLA This and That

    In addition to the several OOPSLA sessions I've already blogged about, there were a number of other fun or educational moments at the conference. Here are a few...

  • Elisa Baniassad presented an intriguing Onward! talk called The Geography of Programming. She suggested that we might learn something about programming language design by considering the differences between Western and Eastern thought. Her motivation came from Richard Nisbett's The Geography of Thought: How Asians and Westerners Think Differently--And Why. One more book added to my must-read list...

  • Partly in honor of OOPSLA stalwart John Vlissides, who passed away since OOPSLA'05, and partly in honor of Vlissides et al.'s seminal book Design Patterns, there was a GoF retrospective panel. I learned two bits of trivia... John's favorite patterns were flyweight (which made it into the book) and solitaire (which didn't). The oldest instance of a GoF pattern they found in a real system? Observer -- in Ivan Sutherland's SketchPad! Is anyone surprised that this pattern has been around that long, or that Sutherland discovered its use over 40 years ago? I'm not.

  • On the last morning of the conference, there was scheduled a panel on the marriage of XP and Scrum in industry. Apparently, though, before I arrived on the scene it had morphed into something more generally agile. While discussing agile practices, "Object Dave" Thomas admitted he believes that, contrary to what many agilists seem to imply, comments in code are useful. After all, "not all code can be read, being encrypted in Java or C++ as it is". But he then absolved his sin a bit by noting that the comment should be "structurally attached" to the code with which it belongs; that is a tool issue.

  • Then, on the last afternoon of the conference, I listened in on the Young Guns panel, in which nearly a dozen computer scientists under the age of 0x0020 commented on the past, present, and future of objects and computing. One of these young researchers commented that scientists tend to make their great discoveries while still very young, because they don't yet know what's impossible. To popularize this wisdom, gadfly and moderator Brian Foote suggested a new motto for our community: "Embrace ignorance."

  • During this session, it occurred to me that I am no longer a "young gun" myself, spending the last six days of my 0x0029th year at OOPSLA. This is part of how I try to stay "busy being born", and I look forward to it every year. I certainly don't feel like an old fogie, at least not often.

  • Finally, as we were wrapping up the conference in the committee room after the annual ice cream social, I told Dick Gabriel that I would walk across the street to hear Guy Steele read a restaurant menu aloud. Maybe there is a little bit of hero worship going on here, but I always seem to learn something when Steele shares his thoughts on computing.

    ----

    Another fine OOPSLA is in the books. The 2007 conference committee is already at work putting together next year's event, to be held in Montreal during the same week. Wish us wisdom and good fortune!


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    October 28, 2006 7:36 PM

    OOPSLA Day 3: Philip Wadler on Faith, Evolution, and Programming Languages

    Philip Wadler

    "Of course, the design should be object-oriented." Um. "Of course not. The design should be functional". Day 3's invited speaker, Philip Wadler, is a functional programming guy who might well hear the first statement here at OOPSLA, or almost anywhere out in a world where Java and OO have seeped through the software development culture, and almost as often he reacts with the second statement (if only in his mind). He has come to recognize that these aren't scientific statements or reactions; they are matters of faith. But faith and science as talking about different things. Shouldn't we make language and design choices on the basis of science? Perhaps, but we have a problem: we have not yet put our discipline on a solid enough scientific footing.

    The conflict between faith and science in modern culture, such as on the issue of evolution, reminds Wadler of what he sees in the computing world. Programming language design over the last fifty years has been on an evolutionary roller coaster, with occasional advances injected into the path of languages growing out of the dominant languages of the previous generation. He came to OOPSLA in a spirit of multiculturalism, to be a member of a "broad church", hoping to help us see the source of his faith and to realize that we often have alternatives available when we face language and design decisions.

    The prototypical faith choice that faces every programmer is static typing versus dynamic typing. In the current ecosystem, typing seems to have won out over no typing, and static typing has usually had a strong upper hand over dynamic typing. Wadler reminded us that this choice goes back to the origins of our discipline, to the difference between the untyped lambda calculus of Alonzo Church and the typed calculus of Haskell Curry. (Church probably did not know he had a choice; I wonder how he would have decided if he had?)

    Wadler then walked us through his evolution as a programming languages researcher, and taught us a little history on the way.

    Church: The Origins of Faith

    Symbolic logic was largely the product of the 19th century mathematician Gottlob Frege. But Wadler traces the source of his programming faith to the German logician Gerhard Gentzen (1909-1945). Gentzen followed in the footsteps of Frege, both as a philosopher of symbolic logic and as an anti-Semite. Wadler must look past Gentzen's personal shortcomings to appreciate his intellectual contribution. Gentzen developed the idea of natural deduction and proof rules.

    (Wadler showed us a page of inference rules using the notation of mathematical logic, and then asked for a show of hands to see if we understood the ideas and the notation on his slides. On his second question, enough of the audience indicated uncertainty that he slowed down to explain more. He said that he didn't mind the diversion: "It's a lovely story.")

    Next he showed the basics of simplifying proofs -- "great stuff", he told us, "at least as important as the calculus", something man had sought for thousands of years. Wadler's love for his faith was evident in the words he chose and the conviction with which he said them.

    Next came Alonzo Church, who did his work after Gentzen but still in the first half of the 20th century. Church gave us the lambda calculus, from which the typed lambda calculus was "completely obvious" -- in hindsight. The typed lambda calculus was all we needed to make the connection between logic and programming: a program is a proof, and a type is the proposition it proves. This equivalence is demonstrated in the Curry-Howard isomorphism, named for the logician and the computer scientist, respectively, who made the connection explicit. In Wadler's view, this isomorphism predicts that logicians and computer scientists will develop many of the same ideas independently, discovered from opposite sides of the divide.

    This idea, that logic and programming are equivalent, is universal. In the movie Independence Day, the good guys defeat the alien invaders by injecting a virus written in C into their computer system. The aliens might not have known the C programming language, and thus not have been vulnerable on that front, but they would have to have known the lambda calculus!

    Haskell: Type Classes

    The Hindley-Milner algorithm is named for another logician/computer scientist pair that made the next advance in this domain. They showed that even without type annotations an automated system can deduce the most general data types that make the program execute. The algorithm is both correct and complete. Wadler wanted to show us that this idea is so important, so beautiful, that he took a step to the side of his podium and jumped up and down!

    Much of Wadler's renown in the programming languages community derives from his seminal contributions to Haskell, a pure functional language based on the Curry-Howard isomorphism and the Hindley-Milner algorithm. Haskell implements these ideas in the form of type classes. Wadler introduced this idea into the development of Haskell, but he humbly credited others for doing the hard work to make things work.

    Java: Adding Generics

    Java 1.4 was in many ways too simple. We had to use a List of some sort for almost everything, in order to have polymorphic structures. Trying to add C-style templates threatened to make things only worse. What the Java team needed was... the lambda calculus!

    (At this moment, Wadler stopped his talk Superman-style and took off his business suit to reveal his Lambda-man outfit. The crowd responded with hearty applause!)

    Philip Wadler as 'Lambda Man'

    Java generics have the same syntax as C++ templates, but different semantics. (He added parenthetically that Java generics "have semantics".) The generic types are merely syntactic sugar, rewritten into older Java in a technique called "erasure". The rewrite produces the same byte codes that a programmer's own pre-generics Java might have produced. Much was written about this language addition both before and after it was made to the Java specification, and I don't want to get into that discussion. But, as Wadler notes, this approach supports the evolution of the language in a smooth way, consistent with existing Java practice. Java generics also bear more than a passing resemblance to type classes, which means that they could evolve into something different -- and more.
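
    A small example of my own, not Wadler's, makes the erasure point concrete: the generic version below adds compile-time checking, but after erasure it compiles to essentially the byte codes a programmer would have written by hand in Java 1.4.

        import java.util.ArrayList;
        import java.util.List;

        // My own illustration of erasure, not an example from the talk.
        public class ErasureDemo {
            public static void main(String[] args) {
                // Java 1.4 style: the List holds Objects, so we cast on the way out.
                List oldStyle = new ArrayList();
                oldStyle.add("hello");
                String s1 = (String) oldStyle.get(0);

                // With generics, the compiler checks the element type for us...
                List<String> newStyle = new ArrayList<String>();
                newStyle.add("hello");
                String s2 = newStyle.get(0);

                // ...but erasure rewrites the second version into roughly the
                // first, inserting the cast on our behalf.
                System.out.println(s1 + " " + s2);
            }
        }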

    Links: Reconciliation

    Web applications typically consist of three tiers: the browser, the server, and the database. All are typically programmed in different languages, such as HTML, CSS, JavaScript, Perl, and SQL. Wadler's newest language, Links, is intended to be one language used for all three tiers. It compiles to SQL for running directly against a database. Links is similar to the similarly named LINQ, a query language developed at Microsoft. The similarity is no coincidence -- one of its architects, Erik Meijer, came from the Haskell community. Again, type classes figure prominently in Links. Programmers in the OO community can think of them in an OO way with no loss of understanding. But they may want to broaden their faith to include something more.

    Wadler closed his talk by returning to the themes with which he began: faith, evolution, and multiculturalism. He viewed the OOPSLA conference committee's inviting him to speak as a strong ecumenical step. "Faith is well and good", but he would like for computer science to make inroads helping us to make better decisions about language design and use. Languages like Links, implemented with different features and used in experiments, might help.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

    October 26, 2006 4:28 PM

    OOPSLA Day 2: Jim Waldo "On System Design"

    Last year, OOPSLA introduced another new track called Essays. This track shares a motivation with another recent introduction, Onward!, in providing an avenue for people to write and present important ideas that cannot find a home in the research-oriented technical program of the usual academic conference. But not all advances come in the form of novelty, of results from narrowly-defined scientific experiments. Some empirical results are the fruit of experience, reflection, and writing. The Essays track offers an avenue for sharing this sort of learning. The author writes an essay in the spirit of Montaigne and Bacon, using the writing to work out an understanding of his experience. He then presents the essay to the OOPSLA audience, followed by a thoughtful response from a knowledgeable person who has read the essay and thought about the ideas. (Essays crossed my path in a different arena last week, when I blogged a bit on the idea of blog as essay.)

    Jim Waldo, a distinguished engineer at Sun, presented the first essay of OOPSLA 2006, titled "On System Design". He reflected on his many years as a software developer and lead, trying to get a handle on what he now believes about the design of software.

    What is a "system"? To Waldo, it is not "just a program" in the sense he thinks meant by the postmodern programming crowd, but a collection of programs. It exists at many levels of scale that must be created and related; hence the need for us to define and manage abstractions.

    Software design is a craft, in the classical sense. Many of Waldo's engineer friends are appalled at what software engineers think of as engineering (essentially the application of patterns to roll out product in a reliable, replicable way) because what engineers really do involves a lot of intuition and craft.

    Software design is about technique, not facts. Learning technique takes time, because it requires trial and error and criticism. It requires patience -- and faith.

    Waldo paraphrased Grady Booch as having said that the best benefit of the Rational toolset was that it gave developers "a way to look like they were doing something while they had time to think".

    The traditional way to learn system design is via apprenticeship. Waldo usually asks designers he respects who they apprenticed with. At first he feared that at least a few would look at him oddly, not understanding the question. But he was surprised to find that every person answered without batting an eye. They all not only understood the question but had an immediate answer. He was also surprised to hear the same few names over and over. This may reflect the fact that Waldo moves in particular circles, or it may mean that there is only a small set of master software developers out there!

    In recent years, Waldo has despaired of the lack of time, patience, and faith shown in industry for developing developers. Is all lost? No. In reflecting on this topic and discussing it with readers of his early drafts, Waldo sees hope in two parts of the software world: open source and extreme programming.

    Consider open source. It has a built-in meritocracy, with masters at the top of the pyramid, controlling the growth and design of their systems. New developers learn from example -- the full source of the system being built. Developers face real criticism and have the time and opportunity to learn and improve.

    Consider extreme programming. Waldo is not a fan of the agile approaches and doesn't think that the features on which they are sold are where they offer the most value. It isn't the illusion of short-term cycles or of the incrementalism that grows a big ball of mud that gives him hope. In reality, the agile approaches are based in a communal process that builds systems over time, giving people the time to think and share, mentor and learn. Criticism is built into the process. The system is an open, growing example.

    Waldo concludes that we can't teach system design in a class. As an adjunct professor, he believes that system design skills aren't a curriculum component but a curricular outcome. Brian Marick, the discussant on Waldo's essay, took a cynical turn: No one should be allowed to teach students system design if they haven't been a committer to a large open-source project. (Presumably, having had experience building "real" big systems in another context would suffice.) More seriously, Marick suggested that it is only for recent historical reasons that we would turn to academia to solve the problem of producing software designers.

    I've long been a proponent of apprenticeship as a way of learning to program, but Waldo is right that doing this as a part of the typical university structure is hard, if not impossible. We heard about a short-lived attempt to do this at last year's OOPSLA, but a lot of work remains. Perhaps if more people like Waldo, not just the more provocative folks at OOPSLA, start talking openly about this we might be able to make some progress.

    Bonus reading reference: Waldo is trained more broadly as a philosopher, and made perhaps a surprising recommendation for a great document on the act of designing a new sort of large system: The Federalist Papers. This recommendation is a beautiful example of the value in a broad education. The Federalist Papers are often taught in American political science courses, but from a different perspective. A computer scientist or other thinker about the design of systems can open a whole new vista on this sort of document. Here's a neat idea: a system design course team taught by a computer scientist and a political scientist, with The Federalist Papers as a major reading!

    Now, how to make system design as a skill an inextricable element of all our courses, so that an outcome of our major is that students know the technique? ("Know how", not "know that".)


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

    October 26, 2006 3:46 PM

    OOPSLA Day 2: Guy Steele on Fortress

    The first two sentences of Guy Steele's OOPSLA 2006 keynote this morning were eerily reminiscent of his justly famous OOPSLA 1998 invited talk Growing a Language. I'm pretty sure that, for those few seconds, he used only one-syllable words!

    Guy Steele

    This talk, A Growable Language, was related to that talk, but in a more practical sense. It applied the spirit and ideas expressed in that talk to a new language. His team at Sun is designing Fortress, a "growable language" with the motto "To Do for Fortran What Java Did for C".

    The aim of Fortress is to support high-performance scientific and engineering programming without carrying forward the historical accidents of Fortran. Among the additions to Fortran will be extensive libraries, including for networking, a security model, type safety, dynamic compilation (to enable the optimization of a running program), multithreading, and platform independence. The project is being funded by DARPA with the goal of improving programmer productivity for writing scientific and engineering applications -- to reduce the time between when a programmer receives a problem and when the programmer delivers the answer, rather than focus solely on the speed of the compiler or executable. Those are important, too, but we are shortsighted in thinking that they are the only forms of speed that matter. (Other DARPA projects in this vein are the X10 language from IBM and the Chapel language from Cray.)

    Given that even desktop computers are moving toward multicore chips, this sort of project offers potential value beyond the scientific programming community.

    The key ideas behind the Fortress project are threefold:

    • Don't build a language; grow it piecemeal.
    • Create a programming notation that is more like the mathematical notation that this programmer community uses.
    • Make parallelism the default way of thinking.

    Steele reminded his audience of the motivation for growing a language from his first talk: If you plan for a language, then design it, and then build it, you will probably miss the optimal window of opportunity for the language. One of the guiding questions of the Fortress project is, Will designing for the growth of a language and its user community change the technical decisions the team makes or, more importantly, the way it makes them?

    One of the first technical questions the team faced was what set of primitive data types to build into the language. Integer and float -- but what sizes? Data aggregates? Quaternions, octonions? Physical units such as meter and kilogram? He "might say 'yes' to all of them, but he must say no to some of them." Which ones -- and why?

    The strategy of the team is to, wherever possible, add a desired feature via a library -- and to give library designers substantial control over both the semantics and the syntax of the library. The result is a two-level language design: a set of features to support library designers and a set of features to support application programmers. The former has turned out to be quite object-oriented, while the latter is not object-oriented at all -- something of a surprise to the team.

    At this time, the language defines some very cool types in libraries: lists, vectors, sets, maps (with better, more math-like notation), matrices and multidimensional vectors, and units of measurement. The language also offers as a feature mathematical typography, using a wiki-style mark-up to denote Unicode characters beyond what's available on the ASCII keyboard.

    In the old model for designing a language, the designers

    • study applications
    • add language features to support application developers

    In the new model, though, designers

    • study applications
    • add language features to support library designers in creating the desired features
    • let library designers create a library that supports application developers

    At a strategic level, the Fortress team wants to avoid creating a monolithic "standard library", even when taking into account the user-defined libraries created by a single team or by many. Their idea is instead to treat libraries as replaceable components, perhaps with different versions. Steele says that Fortress effectively has make and svn built into its toolset!

    I can just hear some of my old-school colleagues decrying this "obvious bloat", which must surely degrade the language's performance. Steele and his colleagues have worked hard to make abstraction efficient in a way that surpasses many of today's languages, via aggressive static and dynamic optimization. We OO programmers have come to accept that our environments can offer decent efficiency while still having features that make us more productive. The challenge facing Fortress is to sell this mindset to C and Fortran programmers with habits built on 10, 20, even 40 years of thinking that you have to avoid procedure calls and abstract data types in order to ensure optimal performance.

    The first category of features in Fortress is intended to support library developers. The list is an impressive integration of ideas from many corners of the programming language community, including first-class types, traits and trait descriptors (where, comprise, exclude), multiple inheritance of code but not fields, and type contracts. In this part of the language definition, knowledge that used to be buried in the compiler is brought out explicitly into code where it can be examined by programmers, reasoned over by the type inference system, and used by library designers. But these features are not intended for use by application programmers.

    In order to support application developers, Steele and his team watched (scientific) programmers scribble on their whiteboards and then tried to convert as much of what they saw as possible into their language. For example, Fortress takes advantage of subtle whitespace cues, as in phrases such as

    { |x| | x ← S, 3 | x }

    Note the four different uses of the vertical bar, disambiguated in part by the whitespace in the expression.

    The wired-in syntax of Fortress consists of some standard notation from programming and math:

    • () for grouping
    • , for separating values in a tuple
    • ; for separating statements on a line
    • . for selecting fields and methods
    • conservative, traditional rules of precedence

    Any other operator can be defined as infix, prefix, or postfix. For example, ! is defined as a postfix operator for factorial. Similarly, juxtaposition is a binary operator, one which can be defined by the library designer for her own types. Even nicer for the scientific programmer, the compiler knows that the juxtaposition of functions is itself a function (composition!).

    The syntax of Fortress is rich and consistent with how scientific programmers think. But they don't think much about "data types", and Fortress supports that, too. The goal is for library designers to think about types a lot, but application programmers should be able to do their thing with type inference filling in most of the type information.

    Finally, scientists, engineers, and mathematicians use particular textual conventions -- fonts, characters, layout -- to communicate. Fortress allows programmers to post-process their code into a beautiful mathematical presentation. Of course, this idea and even its implementation are not new, but the question for the Fortress team was what it would be like if a language were designed with this downstream presentation as the primary mode of presentation.

    The last section of Steele's talk looked a bit more like a POPL or ICFP paper, as he explained the theoretical foundations underlying Fortress's biggest challenge: mediating the language's abstractions down to efficient executable code for parallel scientific computation. Steele asserted that parallel programming is not a feature but rather a pragmatic compromise. Programmers do not think naturally in parallel and so need language support. Fortress is an experiment in making parallelism the default mode of computation.

    Steele's example focused on the loop, which in most languages conflates two ideas: "do this statement (or block) multiple times" and "do things in this order". In Fortress, the loop itself means only the former; by default its iterations can be parallelized. In order to force sequencing, the programmer modifies the interval of the loop (the range of values that "counts" the loop) with the seq operator. So, rather than annotate code to get parallel compilation, we must annotate to get sequential compilation.

    Fortress uses the idea of generators and reducers -- functions that produce and manipulate, respectively, data structures like sequences and trees -- as the basis of the program transformations from Fortress source code down to executable code. There are many implementations for these generators and reducers, some that are sequential and some that are not.

    From here Steele made a "deep dive" into how generators and reducers are used to implement parallelism efficiently. That discussion is way beyond what I can write here. Besides, I will have to study the transformations more closely before I can explain them well.

    As Steele wrapped up, he reiterated The Big Idea that guides the Fortress team: to expose algorithm and design decisions in libraries rather than bury them in the compiler -- but to bury them in the libraries rather than expose them in application code. It's an experiment that many of us are eager to see run.

    One question from the crowd zeroed in on the danger of dialect. When library designers are able to create such powerful extensions, with different syntactic notations, isn't there a danger that different libraries will implement similar ideas (or different ideas) incompatibly? Yes, Steele acknowledged, that is a real danger. He hopes that the Fortress community will grow with library designers thinking of themselves as language designers, and so exercise restraint in the extensions they make and work together to create community standards.

    I also learned about a new idea that I need to read about... the BOOM hierarchy. My memory is vague, but the discussion involved considering whether a particular operation -- which, I can't remember -- is associative, commutative, and idempotent. There are, of course, eight possible combinations of these features, four of which are meaningful (tree, list, bag/multiset, and set). One corresponds to an idea that Steele termed a "mobile", and the rest are, in his terms, "weird". I gotta read more!


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

    October 25, 2006 2:13 PM

    OOPSLA Day 1: Gabriel and Goldman on Conscientious Software

    The first episode of Onward! was a tag team presentation by Ron Goldman and Dick Gabriel of their paper Conscientious Software. The paper is, of course, in the proceedings and so in the ACM Digital Library, but the presentation itself was performance art. Ron gave his part of the talk in a traditional academic style, dressed as casually as most of the folks in the audience. Dick presented an outrageous bit of theater, backed by a suitable rock soundtrack, and so dressed outrageously (for him) in a suit.

    In the "serious" part of the talk, Ron spoke about complexity and scale, particularly in biology. His discussion of protein, DNA, and replication soon turned to how biological systems suffer damage daily to their DNA as a natural part of their lifecycle. They also manage to repair themselves, to return to equilibrium. These systems, Ron said, "pay their dues" for self-repair. Later he discussed chemotaxis in E.Coli, wherein "organisms direct their movements according to certain chemicals in their environment". This sort of coordinated but decentralized action occurs not among a bacterium's flagella but also among termites as they build a system, and elephants as they vocalize across miles of open space, and among chimpanzees as they pass emotional state via facial expression.

    We have a lot to learn from living things as we write our programs.

    What about the un-"serious" part of the talk? Dick acted out a set of vignettes under title slides such as "Thirteen on Visibility", "Continuous (Re)Design", "Requisite Variety", and "What We Can Build".

    His first act constructed a syllogism, drawing on the theme that the impossible is famously difficult. Perfect understanding requires abstraction, which is the ability to see the simple truth. Abstraction ignores the irrelevant. Hence abstraction requires ignorance. Therefore, perfect understanding requires ignorance.

    In later acts, Dick ended up sitting for a while, first listening to his own address as part of the recorded soundtrack and then carrying on a dialogue with his alter ego, which spoke in counterpoint in a somewhat ominous, Darth Vader-ized voice. The alter ego espoused a standard technological view of design and modularity, of reusable components with function and interface. This left Gabriel himself to embody a view centered on organic growth and complexity -- of constant repair and construction.

    Dick's talk considered Levittown and the intention of "designed unpredictability", even if inconvenient, such as the placement of the bathroom far from the master bedroom. In Levittown, the 'formal cause' (see my notes on Brenda Laurel's keynote from the preceding session) lay far outside the "users", in Levitt's vision of what suburban life should be. But today Levittown is much different from what was designed; it is lived-in, a multi-class, multi-ethnic community that bears "the complexity of age".

    On requisite variety, Dick started with a list of ideas (including CLOS, Smalltalk, Self, Oaklisp, ...) disappearing one by one to the background music of Fort Minor's Where'd You Go.

    The centerpiece of Gabriel's part of the talk followed a slide that read

    Ghazal
    (google it)
    on
    "Unconventional Design"

    He described an experiment in the artificial evolution of electronic circuits. The results were inefficient to our eyes, but they were correct and carried certain advantages we might not expect from a human-designed solution. The result was labeled "... magical ... unexpected ...". This sort of system building is reminiscent of what neural networks promise in the world of AI, an ability to create (intelligent) systems without having to understand the solution at all scales of abstraction. For his parallel, Dick didn't refer to neural nets but to cities -- they are sometimes designed; their growth is usually governed; and they are built (and grow) from modules: a network of streets, subways, sewers.

    Dick closed his talk with a vignette whose opening slide read "Can We Get There From Here" (with a graphical flourish I can't replicate here). The coming together of Ron's and Dick's threads suggests one way to try: find inspiration in biology.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

    October 24, 2006 11:31 PM

    OOPSLA Day 1: Brenda Laurel on Designed Animism

    Brenda Laurel

    Brenda Laurel is well known in computing, especially in the computer-human interaction community, for her books The Art of Human-Computer Interface Design and the iconic Computers as Theatre. I have felt a personal connection to her for a few years, since an OOPSLA a few years ago when I bought Brenda's Utopian Entrepreneur, which describes her part in starting Purple Moon, a software company created to produce empowering computer games for young girls. That sense of connection grew this morning as I prepared this article, when I learned that Brenda's mom is from Middletown, Indiana, less than an hour from my birthplace, and just off the road I drove so many times between my last Hoosier hometown and my university.

    Laurel opened this year's OOPSLA as the Onward! keynote speaker, with a talk titled "designed animism: poetics for a new world". Like many OOPSLA keynotes, this one covered a lot of ground that was new to me, and I can remember only a bit -- plus what I wrote down in real time.

    These days, Laurel's interests lie in pervasive, ambient computing. (She recently gave a talk much like this one at UbiComp 2006.) Unlike most folks in that community, her goal is not ubiquitous computing as primarily utilitarian, with its issues of centralized control, privacy, and trust. Her interest is in pleasure. She self-effacingly attributed this move to the design tactic of "finding the void", the less populated portion of the design space, but she need not apologize; creating artifacts and spaces for human enjoyment is a noble goal -- a necessary part of our charter -- in its own right. In particular, Brenda is interested in the design of games in which real people are characters at play.

    Dutch Windmill Near Amsterdam, Owen Merton, 1919

    (Aside: One of Brenda's earliest slides showed this painting, "Dutch Windmill Near Amsterdam" by Owen Merton (1919). In finding the image I learned that Merton was the father of Thomas Merton, the Buddhist-inspired Catholic monk whom I have quoted here before. Small world!)

    Laurel has long considered how we might extend Aristotle's poetics to understand and create interactive form. In the Poetics, Aristotle "set down ... an understanding of narrative forms, based upon notions of the nature and intricate relations of various elements of structure and causation. Drama relied upon performance to represent action." Interactive systems complicate matters relative to Greek drama, and ubiquitous computing "for pleasure" is yet another matter altogether.

    To start, drawing on Aristotle, I think, Brenda listed the four kinds of cause of a created thing (at this point, we were thinking drama):

    • the end cause -- its intended purpose
    • the formal cause -- the platonic ideal in the mind of the creator that shaped the creation
    • the efficient cause -- the designer herself
    • the material cause -- the stuff out of which it is made, which constrains and defines the thing

    In an important sense, the material and formal causes work in opposite directions with respect to dramatic design. The effects of the material cause work bottom-up from material to pattern on up to the abstract sense of the thing, while the effects of the formal cause work top-down from the ideal to components on to the materials we use.

    Next, Brenda talked about plot structure and the "shape of experience". The typical shape is a triangle, a sequence of complications that build tension followed by a sequence of resolutions that return us to our balance point. But if we look at the plots of most interesting stories at a finer resolution, we see local structures and local subplots, other little triangles of complication and resolution.

    (This part of the talk reminded me of a talk I saw Kurt Vonnegut give at UNI a decade or so ago, in which he talked about some work he had done as a master's student in sociology at the University of Chicago, playfully documenting the small number of patterns that account for almost all of the stories we tell. I don't recall Vonnegut speaking of Aristotle, but I do recall the humor in his own story. Laurel's presentation blended bits of humor with two disparate elements: an academic's analysis and attention to detail, and a child's excitement at something that clearly still lights up her days.)

    One of the big lessons that Laurel ultimately reaches is this: There is pleasure in the pattern of action. Finding these parts is essential to telling stories that give pleasure. Another was that by using natural materials (the material causality in our creation), we get pleasing patterns for free, because these patterns grow organically in the world.

    I learned something from one of her examples, Johannes Kepler's Harmonices Mundi, an attempt to "explain the harmony of the world" by finding rules common to music and planetary motion within the solar system. As Kepler wrote, he hoped "to erect the magnificent edifice of the harmonic system of the musical scale ... as God, the Creator Himself, has expressed it in harmonizing the heavenly motions." In more recent times, composers such as Stravinsky, Debussy, and Ravel have tried to capture patterns from the visual world in their music, seeking more universal patterns of pleasure.

    This led to another of Laurel's key lessons, that throughout history artists have often captured patterns in the world on the basis of purely phenomenological evidence, patterns that were later reified by science. Impressionism was one example; the discovery of fractal patterns in Jackson Pollock's drip trajectories was another.

    The last part of Laurel's talk moved on to current research with sensors in the ubiquitous computing community, the idea of distributed sensor networks that help us to do a new sort of science. As this science exposes new kinds of patterns in the world, Laurel hopes for us to capitalize on the flip side of the art examples before: to be able to move from science, to math, and then on to intuition. She would like to use what we learn to inform the creation of new dramatic structures, of interactive drama and computer games that improve the human condition -- and give us pleasure.

    The question-and-answer session offered a couple of fun moments. Guy Steele asked Brenda to react to Marvin Minsky's claim that happiness is bad for you, because once you experience it you don't want to work any more. Brenda laughed and said, "Marvin is a performance artist." She said that he was posing with this claim, and told some stories of her own experiences with Marvin and Timothy Leary (!). She even declared a verdict in my old discipline of AI: Rod Brooks and his subsumption architecture are right, and Minsky and the rest of symbolic AI are wrong. Given her views and interests in computing, I was not surprised by her verdict.

    Another question asked whether she had seen the tape of Christopher Alexander's OOPSLA keynote in San Jose. She hadn't, but she expressed a kinship in his mission and message. She, too, is a utopian and admitted to trying to affect our values with her talk. She said that her research through the 1980s especially had taught her how she could sell cosmetics right into the insecurities of teenage girls -- but instead she chose to create an "emotional rehearsal space" for them to grow and overcome those insecurities. That is what Purple Moon was all about!

    As usual, the opening keynote was well worth our time and energy. As a big vision for the future, as a reminder of our moral center, it hit the spot. I'm still left to think how these ideas might affect my daily work as teacher and department leader.

    (I'm also left to track down Ted Nelson's Computer Lib/Dream Machines, a visionary, perhaps revolutionary book-pair that Laurel mentioned. I may need the librarian's help for this one.)


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

    October 04, 2006 5:45 PM

    Hope with Thin Envelopes

    I am working on a longer entry about a fine book I listened to this weekend, but various work duties -- including preparing for the rapidly approaching OOPSLA conference -- have kept me busier than I planned. I did run across a bit of news today that will perhaps raise the spirits of high school and college students everywhere who did not get into their dream schools. This from a wide-ranging bits-of-news column by Leah Garchik at the San Francisco Chronicle:

    P.S.: A bit of information in Tuesday's story about Andrew Fire of Stanford University, winner of a Nobel Prize for medicine, seems deserving of underlining: Stanford turned down Fire when he applied for undergraduate study there. This revelation is a gift to every high school senior who ever received a thin envelope instead of a fat one.

    (Of course, Fire did have the good fortune to study at Berkeley and MIT...)

    Worth noting, too, is that Fire studied mathematics as an undergrad, and that his quantitative background probably played an important role in the thinking that led to his Nobel-winning work. Whenever I encounter high school or college students who are interested in other sciences these days, I tell them that studying computer science or math too will almost certainly make them better scientists than only studying a science content area.

    I also tell them that computer science is a pretty good content area in its own right!


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    September 18, 2006 6:24 PM

    A New Entry for my Vocabulary

    In my last entry I quoted a passage from Paul Graham's essay Copy What You Like. Graham encourages students of all sorts to develop the taste to recognize good stuff in their domain, and then to learn by copying it.

    My friend Richard Gabriel is both computer scientist and poet. He has often talked of how writers copy from one another -- not just style, but sentences and phrases and even the words that catch our ears, that sound just right when used in the right place.

    In that spirit I will steal, er, copy a word from Graham.

    Most of the words that our discipline has added to the broader lexicon are hideous abominations, jargon used to replace already useful words. The next time I hear someone use "interface" as a verb in place of the older, humbler, and perfectly fine "interact", well, I don't know what I'll do. But it won't be pleasant. In this vein, I recently heard an unusual word choice from a graduate student who had just moved here from Russia. Instead of "interaction", she used "intercourse". It sounded charming and had a subtly different connotation, but these days in the U.S. I suspect that most folks would look askance at you for this word choice.

    But in "Copy...", Graham put a CS jargon word to good use in ordinary conversation:

    It was so clearly a choice of doing good work xor being an insider that I was forced to see the distinction.

    Standard English doesn't have a good word with the meaning of "xor"; "or" admits the same confusion in regular conversation that it does in logic. But sometimes we really want to express an 'exclusive or', and "xor" is perfect.

    Now I'm on the look-out for an opportunity to drop this word into a conversation!


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    September 18, 2006 6:03 PM

    Professors Who Code

    Steve Yegge's entertaining fantasy on developing programmers contains a respectful but pointed critique of the relevance of a Ph.D. in computer science to contemporary software development, including:

    You hire a Ph.D., it's hit-or-miss. Some of them are brilliant. But then some subset of virtually every educated group is brilliant. The problem is that the notion of a Ph.D. has gradually been watered down for the last century. It used to mean something to be a Doctor of Philosophy: it meant you had materially advanced your discipline for everyone. Von Neumann, Nash, Turing -- people like that, with world-changing dissertations, ....

    ... These kids have to work hard for their Ph.D., and a lot of them never quite finish. But too often they finish without having written more than a few hundred lines of code in the past five years. Or they've over-specialized to the point where they now think Big-O is a tire company; they have no idea how computers or computation actually work anymore. They can tell you just about everything there is to know about SVM kernels and neural-net back propagation, or about photorealistic radiosity algorithms that make your apartment look fake by comparison. But if you want a website thrown together, or a scalable service written, or for that matter a graphics or machine-learning system, you're usually better off hiring a high-school kid, because the kid might actually know how to program. Some Ph.D.s can, but how many of them is it, really? From an industry perspective, an alarming number of them are no-ops.

    Ouch. I'm sure Steve is exaggerating for dramatic effect, but there is a kernel of truth in there. People who study for a Ph.D. in computer science are often optimizing on skills that are not central to the industrial experience of building software. Even those who work on software tend to focus on a narrow slice of some problem, which means not studying broadly in all of the elements of modern software development.

    So, if you want to apprentice with one person in an effort to learn the software industry, you can often find a better "master" than by selecting randomly among the run-of-the-mill CS professors at your university. But then, where will you connect with this person, and how will you convince him or her to carry you while you slog through learning the basics? Maybe when Wizard Schools are up and running everywhere, and the Ward Cunninghams and Ralph Johnsons of the world are their faculty, you'll have a shot. Until then, a CS education is still the most widely available and trustworthy path to mastery of software development available.

    You will, of course, have to take your destiny into your own hands by seeking opportunities to learn and master as many different skills as you can along the way. Steve Yegge reminds his readers of this all the time.

    In this regard, I was fortunate in my graduate studies to work on AI, in particular intelligent systems. You might not think highly of the work done by the many, many AI students of the 1980s. Paul Graham had this to say in a recent essay:

    In grad school I was still wasting time imitating the wrong things. There was then a fashionable type of program called an expert system, at the core of which was something called an inference engine. I looked at what these things did and thought "I could write that in a thousand lines of code." And yet eminent professors were writing books about them, and startups were selling them for a year's salary a copy. What an opportunity, I thought; these impressive things seem easy to me; I must be pretty sharp. Wrong. It was simply a fad.

    But whatever else you say, you have to admit that most of us AI weenies produced a lot of code. The AI and KBS research groups at most schools I knew sported the longest average time-to-graduate of all the CS areas, in large part because we had to write huge systems, including a lot of infrastructure that was needed in order to do the highly-specialized whatever we were doing. And many of us wrote our code in one of those "super-succinct 'folding languages'" developed by academics, like Lisp. I had the great good fortune of schlocking a non-trivial amount of code in both Lisp and Smalltalk. I should send my advisor a thank-you note, but at the time we felt the burden of all the code we had to produce to get to the place where we could test our cool ideas.

    I do agree with Yegge that progressive CS departments need to work on how better to prepare CS graduates and other students to participate in the development of software. But we also have to wrestle with the differences between computer science and software development, because we need to educate students in both areas. It's good to know that at least a few of the CS professors know how to build software. Not very many of us know when to set the fourth preference on the third tab of the latest .NET development wizard, but we do have some idea about what it's like to build a large software system and how students might develop the right set of skills to do the same.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    September 14, 2006 3:03 PM

    Department as Student Recruiter

    Our second session today is on student recruitment. We've been thinking about this a lot, but my thoughts during this session are still at the level of a few key points:

    • Universities have staff that recruit students under a corporate banner, but the faculty have a passion for science, math, and technology. Given the state of national enrollments in these disciplines, we really have to make recruitment one of our central missions.

    • The same story from fundraising: increasingly, the responsibility for recruitment is devolving from the university to the departments and faculty.

    • Trying to grow an academic department when the university has a posture of decreasing enrollment, or at least a public image of doing so, is difficult. When the potential audience thinks you are trying to get smaller, they don't take your recruitment efforts seriously.

    • Faculty should be involved in developing relationships with high school students -- and parents. Personal touch, especially from the people who will be in the classroom with students, is invaluable. When the target parent is an alum, the effect is multiplied.

    • How can a medium-sized, public, undergrad-oriented university such as mine recruit effectively out of state? Why should an Illinois high-schooler come here? Almost every state has a school -- or ten -- like ours. What makes us attractive?

    • Offering an especially strong program in a particular area can draw some students, but only those who already know what they want to study.

    • An idea offered by someone else... Maybe we should brand ourselves as what we are: "proudly educating Iowans". Selling yourself as doing a well-defined task well is sometimes attractive to people outside the target, because they appreciate excellence.

    • Another idea from someone else... Focus on in-state students, which our bigger, R-I sister institutions don't do. This is different than the previous idea, which is about branding; here the university would willingly cede out-of-state students.

    • More and more students these days come to the university undecided about their majors. (Our advising staff has stopped calling these folks "undecided majors"; they are now called "deciding".) How do we take care of these students and maximize the chance that these folks give CS and the sciences a fair chance? The community colleges cater directly to this audience and are pulling enrollment away from our universities. How can we get them to come to the university? UNI has an advantage over some 4-year institutions. We have a common liberal arts core that all students take, which gives students a couple of years to decide -- while still making tangible progress toward a degree.

    I have more questions than answers at this point.


    Posted by Eugene Wallingford | Permalink | Categories: Computing

    August 31, 2006 5:12 PM

    Names and Jargon in CS1

    As I've mentioned a few times in recent months, I am teaching our intro course this semester for the first time in a decade or so. After only two weeks, I have realized this: Teaching CS1 every so often is a very good idea for a computer science professor!

    With students who have no programming background, I cannot take anything for granted: values, types, variables, ... expressions, statements, functions/methods ... reference, binding ... so many fundamental concepts to build! In one sense, we have to build them from scratch, because students have no programming knowledge to which we can connect. But they do have a lot of knowledge of the world, and they know something about computers and files as users, and we can sometimes get leverage from this understanding. So far, I remain of the view that we can build the concepts in many different orders. The context in which students learn guides the ordering of concepts, by making some concepts relatively more or less "fundamental". The media computation approach of Guzdial and Ericson has been a refreshing change for me -- the idea of a method "happened" naturally quite early, as a group of operations that we have applied repeatedly from Dr. Java's interactions pane. Growing ideas and programs this way lets students learn bottom-up but see useful ideas as soon as they become useful.

    I've spent a lot of time so far talking about the many different kinds of names that we use and define when thinking computationally. So much of what we do in computing is combining parts, abstracting from them an idea, and then giving the idea a name. We name values (constant), arbitrary or changing values (variable), kinds of values (type, class), processes (function, method)... Then we have arguments and parameters, which are special kinds of arbitrary values, and even files -- whose names are, in an important way, outside of our programs. I hope that my students are appreciating this Big Idea already.

    And then there is all of the jargon that we computer folks use. I have to assume that my students don't know what any of that jargon means, which means that (1) I can't use much, for fear of making the class sound like a sea of Babel, and (2) I have to define what I use. Today, for example, I found myself wanting to say "hard-coded", as in a constant hard-coded into a method. I caught myself and tried to relate it to what we were doing, so that students would know what I meant, both now and later.

    I often speak with friends and colleagues who teach a lot of CS as trainers in industry. I wonder if they ever get a chance to teach a CS1 course or something like it. The experience is quite different for me from teaching even a new programming style to sophomores and juniors. There, I can take so much for granted, and focus on differences. But for my intro student the difference isn't between two somethings, but between something and nothing.

    However, I also think that we have overglamorized how difficult it is to learn to program. I am not saying that learning to program is easy; it is tough, with ideas and abstractions that go beyond what many students encounter. But I think that we sometimes lure ourselves into something of a Zeno's paradox: "This concept is so difficult to learn; let's break it down into parts..." Well, then that part is so difficult to learn that we break it down into parts. Do this recursively, ad infinitum, and soon we have made things more difficult than they really are -- and worse, we've made them incredibly boring and devoid of context. If we just work from a simple context, such as media computation, we can use the environment to guide us a bit, and when we reach a short leap, we make it, and trust our students to follow. Answer questions and provide support, but don't shy away from the idea.

    That's what I'm thinking this afternoon at least. Then again, it's only the end of our second week of classes!


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    August 24, 2006 5:34 PM

    Playing the Big Points Well

    This week I've had the opportunity to meet with several colleagues from around my college, as part of some committee work I had volunteered for. This sort of extra work is often just that -- extra work -- but sometimes it is an opportunity to be reminded that I have some impressive colleagues. Sometimes, I even learn something new or am prompted to think about an idea that hadn't occurred to me before.

    One of my colleagues spoke of how important it is to get the science faculty working more closely with one another at this time. He couched his ideas in historical terms. Science in the early 20th century was quite interdisciplinary, but as the disciplines matured within the dominant paradigms they became more and more specialized. The second half of the century was marked by great specialization, even within the various disciplines themselves. Computer science grew out of mathematics, largely, as a specialty, and over the course of the century it became quite specialized itself. Even artificial intelligence, the area in which I did my research, became almost balkanized as a set of communities that didn't communicate much. But the sciences seem to have come back to an era of interdisciplinary work, and CS is participating in that, too. Bioinformatics, computational chemistry, physics students rediscovering computer programming for their own research -- all are indications that we have entered the new era, and CS is a fundamental player in helping scientists redefine what they do and how they do it.

    Another colleague spoke eloquently of why we need to work hard to convince young people to enter the sciences at the university level. He said something to the effect that "Society does not need a lot of scientists, but the ones it does need, it needs very much -- and it needs them to be very good!" That really stuck with me. In an era when university funding may become tied to business performance, we have to be ready to argue the importance of departments with small numbers of majors, even if they aren't compensating with massive gen-ed credit hours.

    Finally, a third colleague spoke of the "rhythm" of an administrator's professional life. Administrators often seek out their new positions because they have a set of skills well-suited to lead, or even a vision of where they want to help their colleagues go. But minutiae often dominate the daily life of the administrator. Opportunities to lead, to exercise vision, to think "big" come along as fleeting moments in the day. What a joy they are -- but you have to be alert, watching for them to arise, and then act with some intensity to make them fruitful.

    For some reason, this reminded me of how sports and other competitive activities work. In particular, I recall a comment Justine Henin-Hardenne made at Wimbledon this year, after her semifinal win, I think. She spoke of how tennis is a long string of mostly ordinary points, with an occasional moment of opportunity to change the direction of the match. She had won that day, she thought, because she had recognized and played those big points better than her opponent. I remember that feeling from playing tennis as a youth (usually on the losing end!) and from playing chess, where my results were sometimes better. And now, after a year as an administrator, I know what my colleague meant. But I'm not sure I had quite thought of it in these terms before.

    Sometimes, you can learn something interesting when doing routine committee work. I guess I just have to be alert, watching for them to arise, and then act with some intensity to make them fruitful.

    (And of course I'm not only an administrator... I'm having fun with my first week of CS1 and will write more as the week winds down and I have a chance to digest what I'm thinking.)


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Managing and Leading

    August 12, 2006 2:21 PM

    Grades and Verticality

    The next book on my nightstand is Talking About Leaving: Why Undergraduates Leave the Sciences, by Elaine Seymour and Nancy Hewitt. The dean and department heads in my college have had an ongoing discussion of national trends in university enrollment in mathematics and the sciences, because all of our departments with the exception of biology have seen their number of majors drop in recent years. If we can understand the issue better, perhaps we can address it. Are students losing interest in the sciences in college? In high school? In grade school? Why? One of the common themes on this blog for the last year or so has been the apparent decline in interest in CS among students both at the university and in the K-12 system. At this point, all I know is that this is a complex problem with a lot of different components. Figuring out which components play the largest role, and which ones we as university faculty and as citizens can affect, is the challenge.

    My net acquaintance Chad Orzel, a physicist, recently commented on why they're leaving, drawing his inspiration from an Inside Higher Ed piece of the same name. That article offers four explanations for why students leave the physical sciences and engineering disciplines: lower GPAs, weed-out courses, large and impersonal sections, and "vertical curricula". The last of these refers to the fact that students in our disciplines often have to "slog" through a bunch of introductory skills courses before they are ready to do the interesting work of science.

    While Chad teaches at a small private college, my experience here at a mid-size public university seems to match his. We don't have many large sections in computer science at UNI. For a couple of years, my department experimented with 100-person CS1 sections as a way to leverage faculty expertise in a particular language against our large enrollments. (Irony!) But for the most part our sections have always been 35 or fewer. I once taught a 53-person section of our third course (at that time, object-oriented programming), but that was an aberration. Our students generally have small sections with plenty of chance to work closely with the tenure-track faculty who teach them.

    We've never had a weed-out course, at least not intentionally. Many of our students take Calculus and may view that as a weeder, but my impression from talking to our students is that this is nothing like the brutal weed-out courses used in many programs to get enrollments down to a manageable size of sufficient quality. These days, the closest thing we have to a weed-out course is our current third course, Data Structures. It certainly acts as a gatekeeper, but it's mostly a matter of programming practice; students who apply themselves to the expectations of the instructor ought to be able to succeed.

    The other two issues are problems for us. The average GPA of a computer science student is almost surely well below the university average. I haven't seen a list of average GPAs sorted by department in many years, but the last few times I did CS and economics seemed to be jostling for the bottom. These are not disciplines that attract lots and lots of weak students, so grading practices in the departments must play a big role. As the Inside Higher Ed article points out, "This culture of grading is common in the natural sciences and the more quantitative social sciences at most universities." I don't doubt that many students are dissuaded from pursuing a CS major by even a B in an intro course. Heck, they get As in their other courses, so maybe they are better suited for those majors? And even the ones who realize that this is an illogical deduction may figure that their lives will simply be easier with a different major.

    I won't speak much of the other problem area for us, because I've written about it a lot recently. I've never used the word "vertical" to describe our problem of boring intro courses that hide or kill the joy of doing computing before students ever get to see it, but I've certainly written about the issue. Any student who graduates high school with the ability to read is ready for a major in history or communication; the same student probably needs to learn a programming language, learn how to write code, and figure out a lot of new terminology before being ready to "go deep" in CS. I think we can do better, but figuring out how is a challenge.

    I must point out, though, that the sciences are not alone in the problem of a vertical curriculum. As an undergraduate, I double-majored in CS and accounting. When I switched from architecture to CS, I knew that CS was what I wanted to do, but my parents encouraged me to take a "practical" second major as insurance. I actually liked accounting just fine, but only because I saw past all of the bookkeeping. It wasn't until I got to managerial accounting as a junior and auditing as a senior that I got to the intellectually interesting part of accounting, how one models an organization in terms of its financial system in order to understand how to make it stronger. Before that came two years of what was, to me, rather dull bookkeeping -- learning the "basics" so that we could get to the professional activities. I often tell students today that accounting is more interesting than it probably seems for the first one, two, or three years.

    Computer science may not have moved much faster back then. I took a one-quarter CS 1 to learn how to program (in Fortran), a one-quarter data structures course, and a couple of courses in assembly language, job control language, and systems programming, but within three or four quarters I was taking courses in upper-division content such as databases, operating systems, and programming languages -- all of which seemed like the Real Thing.

    One final note. I actually read the articles mentioned at the beginning of this essay after following a link from another piece by Chad, called Science Is Not a Path to Riches. In it, Chad says:

    A career in research science is not a path to riches, or even stable employment. Anyone who thinks so is sadly deluded, and if sure promotion and a fat paycheck are your primary goal (and you're good at math), you should become an actuary or an accountant or something in that vein. A career in research science can be very rewarding, but the rewards are not necessarily financial (though I hasten to add, I'm not making a bad living, either).

    This is one place where we differ from physicists and chemists. By and large, CS graduates do get good jobs. Even in times of economic downturn, most of our grads do pretty well finding and keeping jobs that pay above average for where they live. Our department is willing to advertise this when we can. We don't want interested kids to turn away because they think they can't get a job, because all the good jobs are going to India.

    Even so, I am reluctant to over-emphasize the prospect of financial reward. For one thing, as the mutual fund companies all have to tell us, "past performance is no guarantee of future results". But more importantly, intrinsic interest matters a lot, too, perhaps more so than extrinsic financial reward, when it comes to finding a major and career path that works. I'd also like to attract kids because CS is fun, exciting, and worth doing. That's where the real problem of 'verticality' comes in. We don't want kids who might be interested to turn away because the discipline looks like a boring grind.

    I hope to learn more about this systemic problem from the empirical data presented in Talking About Leaving, and use that to figure out how we can do better.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    July 14, 2006 5:29 PM

    Growing a Tech Industry Instead of Corn

    One of the enjoyable outreach activities I've been involved with as department head this year has been the state of Iowa's Information Technology Council. A few years back, the Iowa Department of Economic Development commissioned the Battelle corporation to study the prospects for growing the state's economy in the 21st century. They focused on three areas: bioscience, advanced manufacturing, and information technology. The first probably sounds reasonable to most people, given Iowa's reputation as an agriculture state, but what of the other two? It turns out that Iowa is much more of a manufacturing state than many people realize. Part of this relates back to agriculture. While John Deere is headquartered in Moline, Illinois, most of its factories are in Iowa. We also have manufacturers such as Rockwell Collins and Maytag (though that company has been purchased by Whirlpool and will close most or all of its Iowa locations soon).

    But information technology? Des Moines is home to several major financial services companies or their regional centers, such as Principal Financial Group and Wells Fargo. Cedar Rapids has a few such firms, too, as well as other companies with a computing focus, such as NCR Pearson and ACT.

    IDED created the IT Council to guide the state in implementing the Information Technology Strategic Roadmap developed by Battelle as a result of its studies. (You can see the report at this IDED web page.) The council consists of representatives from most of Iowa's big IT firms and many of the medium-sized and small IT firms that have grown up throughout the state. Each of the three state universities has a representative on the council, as does the community college system and the consortium of Iowa's many, many private colleges and universities. I am UNI's representative.

    The council has been meeting for only one year, and we have spent most of our time really understanding the report and mapping out some ideas to act on in the coming year. One of the big issues is, of course, how Iowa can encourage IT professionals to make the state their home, to work at existing companies and to create innovative start-ups that will fuel economic growth in the sector. Another part of the challenge is to encourage Iowa students to study computer science, math, and other science and engineering disciplines -- and then to stay in Iowa, rather than taking attractive job offers from the Twin Cities, Chicago, Kansas City, and many other places with already-burgeoning IT sectors.

    To hear Paul Graham tell it, we are running a fool's errand. Iowa doesn't seem to be a place where nerds and the exceedingly rich want to live. Indeed, Iowa is one of those red states that he dismisses out of hand:

    Conversely, a town that gets praised for being "solid" or representing "traditional values" may be a fine place to live, but it's never going to succeed as a startup hub. The 2004 presidential election ... conveniently supplied us with a county-by-county map of such places. [6]

    Actually, as I look at this map, Iowa is much more purple than red, so maybe we have a chance! I do think that a resourceful people that is willing to look forward can guide its destiny. And the homes of our three state universities -- Iowa City, Ames, and Cedar Falls -- bear the hallmark of most university towns: attracting and accepting more odd ideas than the surrounding environment does. But Iowans are definitely stolid Midwestern US stock, and it's not a state with grand variation in geography or history or culture. We have to bank on solidity as a strength and hope that some nerds might like to raise their families in a place with nice bike trails and parks, a place where you can let your kids play in the neighborhood without fearing the worst.

    We also don't have a truly great university, certainly not of the caliber Graham expects. Iowa and Iowa State are solid universities, with very strong programs in some areas. UNI is routinely praised for its efficiency and for its ability to deliver a solid education to its students. (Solid -- there's that word again!) But none of the schools has a top-ten CS program, and UNI has not historically been a center of research.

    I've sometimes wondered why Urbana-Champaign in Illinois hasn't developed a higher-profile tech center. UIUC has a top-notch CS program and produces a lot of Ph.D., M.S., and B.S. graduates every year. Eric Sink has blogged for a few years about the joys of starting an independent software company amid the farmland of eastern Illinois. But then there is that solid, traditional-values, boring reputation to overcome. Chicago is only a few hours' drive away, but Chicago just isn't a place nerds want to be near.

    So Iowa is fighting an uphill battle, at least by most people's reckoning. I think that's okay, because I think the battle is still winnable -- perhaps not on the level of the original Silicon Valley but at least on the scale needed to invigorate Iowa's economy. And while reputation can be an obstacle, it also means that competitors may not be paying enough attention. The first step is to produce more tech-savvy graduates, especially ones with an entrepreneurial bent, and then convince them to stay home. Those are steps we can take.

    One thing that has surprised me about my work with the IT Council is that Iowa is much better off on another of Graham's measures than I ever realized, or than most people in this state know. We have a fair amount of venture capital and angel funding waiting for the right projects to fund. This is a mixture of old money derived from stodgy old companies like Deere and new money from the 1990s. We need to find a way to connect this money to entrepreneurs who are ready to launch start-ups, and to educate folks with commercializable ideas on how to make their ideas attractive to the folks with the money.

    Here at UNI, we are blessed to have an existence proof that it is possible to grow a tech start-up right here in my own backyard: TEAM Technologies, which among its many endeavors operates the premier data center in the middle part of the USA. A boring, solid location with few people, little crime, and no coastal weather turns out to be a good thing when you want to store and serve data safely! TEAM is headed up by a UNI alumnus -- another great strength for our department as we look for ways to expand our role in the economic development of the state.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Managing and Leading, Software Development

    July 08, 2006 3:58 PM

    Driving Students Away

    Back in January, I wrote about making things worse in the introductory course, which had been triggered by an article at Uncertain Principles. The world is circling back on itself, because I find myself eager to write about how we drive students away from our intro courses, again triggered by another article at Uncertain Principles. This time, Chad's article was itself triggered by another physicist's article on driving students away. Perhaps the length of the chain grows by one each time it comes around...

    According to these physicists, one of the problems with the intro physics course is that it is too much like high school physics, which bores the better students to the point that they lose interest. We don't face that issue in CS much these days, at least in my neck of the woods, because so few of our freshmen enter the program with any formal CS or programming education. I'm not a fan of the approaches suggested to keep the attention of these well-prepared students (a byzantine homework policy, lots of quizzes) because I think that repeating material isn't the real problem. And these approaches make the real problem worse.

    The real problem they describe is one with which we are familiar: students "lose sight of the fun and sense of wonder that are at the heart of the most successful scientific careers". The intro physics course...

    ... covers 100's of years of physics in one year. We rarely spend more than a lecture on a single topic; there is little time for fun. And if we want to make room for something like that we usually have to squeeze out some other topic. Whoosh!

    Chad says that this problem also affects better students disproportionately, because they "have the preparation to be able to handle something more interesting, if we could hold their attention".

    I think most students can handle something more interesting. They all deserve something more interesting than we usually give them, too.

    And I don't think that the answer involves "more content". Whenever I talk to scientists about the challenges of teaching, the conversation always seems to turn to how much content we have to deliver. This attitude seems wrongheaded to me when taken very far. It's especially dangerous in an introductory course, where novices can easily drown in syntax and detail -- and lose sight of what it is like to be a scientist, or an engineer. Pouring on more content, even when the audience is honors students, almost always results in suboptimal learning, because the course tends to become focused on data rather than ideas.

    In closing, I did enjoy seeing that academic physicists are now experimenting with courses about something more than the formulas of physics. One of the commenters on the Uncertain Principles article notes that he is tweaking a new course design around the question, "How old is the universe?" He also mentions one of the obstacles to making this kind of change: students actually expect a memorization-driven course, because that's what they've learned from their past experiences. This is a problem that really does affect better students differently, because they have mastered the old way of doing things! As a result, some of them will resent a new kind of course. My experience, though, is that you just have to stick to your approach through some rough patches early; nearly all of these students will eventually come around and appreciate the idea- and practice-driven approach even more once they adapt to the change. Remember, adaptation to change takes time, even for those eager to change...


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    July 03, 2006 4:55 PM

    Humility and Revolution

    Three quotes for what amounts in most American workplaces to the middle of a long holiday weekend. The first two remind me to approach my teaching and administrative duties with humility. The third reminds me of why we in America celebrate this holiday weekend at all.

    ... from Gerald Weinberg:

    When I write a book or essay, or teach a course, I have one fundamental measure of failure, which I call Weinberg's Target:

    After exposure to my work, does the audience care less about the subject than they did before?

    If the answer is Yes, I've failed. If the answer is No, I've succeeded, and I'm happy for it. Perhaps you consider my goal too modest. Perhaps you aspire to something greater, like making the student learn something, or even love the subject. Oh, I'm not dismayed by such fine outcomes, but I don't think it's a reasonable goal to expect them.

    We can do much worse than communicate some information without dampening our audience's natural enthusiasm.

    ... from Steve Yegge:

    If you don't know whether you're a bad manager, then you're a bad manager. It's the default state, the start-state, for managers everywhere. So just assume you're bad, and start working to get better at it. ... Look for things you're doing wrong. Look for ways to improve. If you're not looking, you're probably not going to find them.

    Steve's essay doesn't have much in the way of concrete suggestions for how to be a good manager, but this advice is enough to keep most of us busy for a while.

    ... finally, from the dome of the Jefferson Memorial, via Uncertain Principles:

    I have sworn upon the altar of God eternal hostility against every form of tyranny over the mind of man.

    A mixture of humility and boldness befitting revolution, of thought and government.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Managing and Leading

    June 29, 2006 11:47 AM

    Buried Treasures

    I finally got around to reading Glenn Vanderburg's Buried Treasure. The theme of his article is "our present is our past, and there's more past in our future". Among his conclusions, Glenn offers some wisdom about programming languages and software education that we all should keep in mind:

    What I've concluded is that you can't keep a weak team out of trouble by limiting the power of their tools. The way forward is not figuring out how to achieve acceptable results with weak teams; rather, it's understanding how to build strong teams and how to train programmers to be part of such teams.

    Let's stop telling ourselves that potential software developers can't learn to use powerful tools and instead work to figure out how to help them learn. Besides, there is a lot more fun in using powerful tools.

    Glenn closes his article with a nascent thought of his, an example of how knowing the breadth of our discipline and its history might help a programmer solve a thorny problem. His current thorny problem involves database migration in Rails, and how that interacts with version control. We usually think of version control as tracking static snapshots of a system, but a database migration subsystem is itself a tracking of snapshots of an evolving database schema -- so your version control system ends up tracking snapshots of what is in effect a little version control system! Glenn figures that maybe he can learn something about solving this problem from Smalltalkers, who deal with this sort of thing all the time -- because their programs are themselves persistent objects in an evolving image. If he didn't know anything about Smalltalk or the history of programming languages, he might have missed a useful connection.

    Speaking of Smalltalk, veteran Smalltalker Blaine Buxton wrote recently on a theme you've seen here: better examples. All I can say is what Blaine himself might say, Rock on, Blaine! I think I've found a textbook for my CS 1 course this fall that will help my students see lots of more interesting examples than "Hello, world!" and Fibonacci numbers.

    That said, my Inner Geek thoroughly enjoyed a little Scheme programming episode motivated by one of the comments on this article, which taught me about a cool feature of Fibonacci numbers:

    Fib(2k) = Fib(k) * (Fib(k+1) + Fib(k-1))

    This property lends itself to computing Fib very efficiently using binary decomposition and memoizing (caching previously computed values). Great fun to watch an interminably slow function become a brisk sprinter!
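
    Here is a minimal sketch of the idea in Scheme -- not the code from that little episode, just one way the pieces might fit together. It uses the identity above for even arguments, assumes the companion identity Fib(2k+1) = Fib(k)^2 + Fib(k+1)^2 for odd ones, and uses a plain association list as the cache:

    ;; A sketch: Fib(n) by binary decomposition, with previously
    ;; computed values cached in an association list seeded with
    ;; the base cases.
    (define memo '((0 . 0) (1 . 1) (2 . 1)))

    (define (fib n)
      (let ((cached (assv n memo)))
        (if cached
            (cdr cached)
            (let* ((k (quotient n 2))
                   (result
                    (if (even? n)
                        ;; Fib(2k) = Fib(k) * (Fib(k+1) + Fib(k-1))
                        (* (fib k) (+ (fib (+ k 1)) (fib (- k 1))))
                        ;; Fib(2k+1) = Fib(k)^2 + Fib(k+1)^2
                        (+ (* (fib k) (fib k))
                           (* (fib (+ k 1)) (fib (+ k 1)))))))
              (set! memo (cons (cons n result) memo))
              result))))

    (fib 100)   ; => 354224848179261915075, almost instantly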

    As the commenter writes, simple problems often hide gems of this sort. The example is still artificial, but it gives us a cool way to learn some neat ideas. When used tactically and sparingly, toy examples open interesting doors.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    June 22, 2006 3:08 PM

    The H Number

    I ran across the concept of an "h number" again over at Computational Complexity. In case you've never heard of this number, an author has an h number of h if h of her Np papers have ≥ h citations each, and the rest of her (Np - h) papers have ≤ h citations each.
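
    If it helps to see that definition as a procedure, here is a small sketch of my own (not the calculator mentioned below): given a list of citation counts, one count per paper, it finds the largest h for which at least h papers have h or more citations.

    ;; How many papers have at least threshold citations?
    (define (count-at-least threshold counts)
      (cond ((null? counts) 0)
            ((>= (car counts) threshold)
             (+ 1 (count-at-least threshold (cdr counts))))
            (else (count-at-least threshold (cdr counts)))))

    ;; The h number: the largest h such that at least h papers
    ;; have h or more citations each.
    (define (h-number counts)
      (let loop ((h 0))
        (if (>= (count-at-least (+ h 1) counts) (+ h 1))
            (loop (+ h 1))
            h)))

    ;; A hypothetical record, consistent with my own numbers below:
    (h-number '(22 14 9 6 5 3 1 0))   ; => 5

    In that hypothetical record, five papers have at least five citations each and the rest have no more than five, so h is 5.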

    It's a fun little idea with a serious idea behind it: Simply counting publications or the maximum number of citations to an author's paper can give a misleading picture of a scientist's contribution. The h number aims to give a better indication of an author's cumulative effect and relevance.

    Of course, as Lance points out, the h number can mislead, too. This number is dependent on the research community, as some communities tend to publish more or less, and cite more or less frequently, than others. It can reward a "clique" of authors who generously cite each other's work. Older authors have written more papers and so will tend to be cited more often than younger authors. Still, it does give us different information than raw counts, and it has the enjoyability of a good baseball statistic.

    Now someone has written an h number calculator that uses Google Scholar to track down papers for a specific researcher and then compute the researcher's index. (Of course, this introduces yet another sort of problem... How accurate is Scholar? And do self-citations count?)

    I love a good statistic and am prone to vanity surf, so I had to go compute my h number:

    The h-number of Eugene Wallingford is 5 (max citations = 22)

    You can put that into perspective by checking out some folks with much larger numbers. (Seventy?) I'm just surprised that I have a paper with 22 citations.

    I also liked one of the comments to Lance's post. It suggests another potentially useful index -- (h * maxC)/1000, where maxC is the number of citations to the author's most cited paper -- which seems to combine breadth of contribution with depth. For the baseball fans among you, this number reminds me of OPS, which adds on-base percentage to slugging percentage. The analogy even feels right. h, like on-base percentage, reflects how the performer contributes broadly to the community (team); maxC, like slugging percentage, reflects the raw "power" of the author (batter).
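
    (Plugging in my own numbers from above: (5 * 22)/1000 = 0.11.)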

    The commenter then considers a philosophical question:

    Lastly, it is not so clear that a person who has published a thousand little theorems is truly a worse scientist than one who has tackled two large conjectures. You don't agree? Paul Erdos was accused of this for most of his life, yet for the last two decades of his life it became very clear that many of those "little theorems" were gateways to entire areas of research.

    Alan Kay doesn't publish a huge number of papers, but his work has certainly had a great effect on computing over the last forty years.

    Baseball has lots of different statistics for comparing the performance of players and teams. Having a large set of tools can both be fun and give a more complete picture of the world.

    I suppose that I should get back to work beefing up my h number, or at least doing something administrative...


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    June 18, 2006 10:20 AM

    Programming as Program Transformation

    I see that Ralph Johnson is giving the Friday keynote talk at ECOOP 2006 this year. His talk is called "The Closing of the Frontier", and the abstract shows that it will relate to an idea that Ralph has blogged about before: software development is program transformation. This is a powerful idea that has emerged in our industry over the last decade or so, and I think that there are a lot of computer scientists who have yet to learn it. I have CS colleagues who argue that most programs are developed essentially from scratch, or at least that the skills our students most need to learn most closely relate to the ability to develop from scratch.

    I'm a big believer in learning "basic" programming skills (most recently discussed here), but I'd like for my students to learn many different ways to think about problems and solutions. It's essential they learn that, in a great many contexts, "Although user requirements are important, version N+1 depends more on version N than it does on the latest requests from the users."

    Seeing Ralph's abstract brought to mind a paper I read and blogged about a few months back, Rich Pattis's "A Philosophy and Example of CS-1 Programming Projects". That paper suggested that we teach students to reduce a program spec to a minimum and then evolve successive versions that converge on a program satisfying all of the requirements. Agile programming for CS1 back in 1990 -- and a great implementation of the notion that software development is program transformation.

    I hope to make this idea a cornerstone of my CS1 course this fall, with as little jargon and philosophizing as possible. If I can help students to develop good habits of programming, then their thoughts and minds will follow. And this mindset helps prepare students for a host of powerful ideas that they will encounter in later courses, including programming languages, compilers, theory, and software verification and validation.

    I also wish that I could attend ECOOP this year!


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    June 14, 2006 3:49 PM

    Picking a Textbook for Fall

    I've come to realize something while preparing for my fall CS1 course.

    I don't like textbooks.

    That's what some people call a "sweeping generalization", but the exceptions are so few that I'm happy to make one.

    For one thing, textbooks these days are expensive. I sympathize with the plight of authors, most of whom put in many more hours than book sales will ever pay them for. I even sympathize with the publishers and bookstores, who find themselves living in a world with an increasingly frictionless used-book market, low-cost Internet-based dealers, and overseas sellers such as Amazon India. But none of this sympathy changes the fact that $100 or more for a one-semester textbook -- one that was written specifically not to serve as a useful reference book for later -- is a lot. Textbook prices probably have not risen any faster than the rate of tuition and room and board, but still.

    Price isn't my real problem. My real problem is that I do not like the books themselves. I want to teach my course, and more and more the books just seem to get in the way. I don't like the style of the code shown to students. I don't like many of the design ideas they show students. I don't like all the extra words.

    I suppose that some may say these complaints say more about me than about the books, and that would be partly true. I have some specific ideas about how students should learn to program and think like a computer scientist, and it's not surprising that there aren't many books that fit my idiosyncrasy. Sticking to the textbook may have its value, but it is hard to do when I am unhappy at the thought of turning another page.

    But this is not just me. By and large, these books aren't about anything. They are about Java or C++ or Ada. Sure, they may be about how to develop software, too, but that's an inward-looking something. It's only interesting if you are already interested in the technical trivia of our discipline.

    This issue seems more acute for CS 1, for a couple of reasons. First, one of the goals of that course is to teach students how to program so that they can use that skill in later courses, and so the books tend toward teaching a language. More important is the demand side of the equation, where the stakes are so high. I can usually live with one of the standard algorithms books or compilers books, if it gives students a reasonable point of view and me the freedom to do my own thing. In those cases, the book is almost a bonus for the students. (Of course, then the price of the book becomes more distasteful to students!)

    Why use a text at all? For some courses, I reach a point of not requiring a book. Over the last decade or more, I have evolved a way of teaching Programming Languages that no longer requires the textbook with which I started. (The textbook also evolved away from our course.) Now, I require only The Little Schemer, which makes a fun, small, relatively inexpensive contribution to how my students learn functional programming. After a few times teaching Algorithms, I am getting close to not needing a textbook in that course, either.

    I haven't taught CS 1 in a decade, so the support of a strong text would be useful. Besides, I think that most beginning students find comfort at least occasionally in a text, as something to read when today's lecture just didn't click, something to define vocabulary and give examples.

    Introduction to Computing ... A Multimedia Approach

    So, what was the verdict? After repressing my true desires for a few months in the putative interest of political harmony within the department, yesterday I finally threw off my shackles and chose Guzdial and Ericson's Introduction to Computing and Programming with Java: A Multimedia Approach. It is relatively small and straightforward, though still a bit expensive -- ~$90. But I think it will "stay out of my way" in the best sense, teaching programming and computing through concrete tasks that give students a chance to see and learn abstractions. Perhaps most important, it is about something, a something that students may actually care about. Students may even want to program. This book passes what I call the Mark Jacobson Test, after a colleague who is a big believer in motivation and fun in learning: a student's roommate might look over her shoulder one night while she's doing some programming and say, "Hey, that looks cool. Whatcha doing?"

    Let's see how it goes.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    June 11, 2006 2:05 PM

    Pleasantly Surprising Interconnections

    The most recent issue of the Ballast Quarterly Review, on which I've commented before, came out a month or so ago. I had set it aside for the right time to read and only came back around to it yesterday. Once again, I am pleasantly surprised by the interconnectedness of the world.

    In this issue, editor Roy Behrens reviews John Willats's book Making Sense Of Children's Drawings. (The review is available on-line at Leonardo On-Line.) Some researchers have claimed that children draw what they know and that adults draw what they see, and that what we adults think we see interferes with our ability to create authentic art. Willats presents evidence that young children draw what they see, too, but that at that stage of neural development they see in an object-centered manner, not a viewer-centered manner. It is this subjectivity of perspective that accounts for the freedom children have in creating, not their bypassing of vision.

    The surprising connection for me came in the form of David Marr. A vision researcher at MIT, Marr had proposed the notion that we "see by processing phenomena in two very distinct ways", which he termed viewer-centered and object-centered. Our visual system gathers data in a viewer-centered way and then computes from that data more objective descriptions from which we can reason.

    Where's the connection to computer science and my experience? Marr also wrote one of the seminal papers in my development as an artificial intelligence researcher, his "Artificial Intelligence: A Personal View". You can find this paper as Chapter 4 in John Haugeland's well-known collection Mind Design and on-line as a PDF at Elsevier.

    In this paper, Marr suggested that the human brain may permit "no general theories except ones so unspecific as to have only descriptive and not predictive powers". This is, of course, not a pleasant prospect for a scientist who wishes to understand the mind, as it limits the advance of science as a method. To the extent that the human mind is our best existence proof of intelligence, such a limitation would also impinge on the field of artificial intelligence.

    I was greatly influenced by Marr's response to this possibility. He argued strongly that we should not settle for incomplete theories at the implementation level of intelligence, such as neural network theory, and should instead strive to develop theories that operate at the computational and algorithmic levels. A theory at the computational level captures the insight into the nature of the information processing problem being addressed, and a theory at the algorithmic level captures insight into the different forms that solutions to this information processing problem can take. Marr's argument served as an inspiration for the work of the knowledge-based systems lab in which I did my graduate work, founded on the earlier work on the generic task model of Chandrasekaran.

    Though I don't do research in that area any more, Marr's ideas still guide how I think about problems, solutions, and implementations. What a refreshing reminder of Marr to encounter in light reading over the weekend.

    Behrens was likely motivated to review Willats's book for the potential effect that his theories might have on the "day-to-day practice of teaching art". As you might guess, I am now left to wonder what the implications might be for teaching children and adults to write programs. Direct visual perception has less to do with the programs an adult writes, given the cultural context and levels of abstraction that our minds impose on problems, but children may be able to connect more closely with the programs they write if we place them in environments that get out of the way of their object-centered view of the world.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

    June 10, 2006 3:29 PM

    Students, Faculty, and the Internet Age

    I've been meaning to write all week, but it turned out to be busy. First, my wife and daughters returned from Italy, which meant plenty of opportunity for family time. Then, I spent much of my office week writing content for our new department website. We were due for a change after many years of the same look, and we'd like to use the web as a part of attracting new students and faculty. The new site is very much an early first release, in the agile development sense, because I still have a lot of work to do. But it fills some of our needs well enough now, and I can use bits and pieces of time this summer to augment the site. My blogging urge was most satisfied this week by the material I assembled and wrote for the prospective students section of the site. (Thanks to the Lord of the Webs for his design efforts on the system.)

    Werner Vogels

    I did get a chance to thumb through the May issue of ACM Queue magazine, where I read with some interest the interview with Werner Vogels, CTO of Amazon. Only recently I had been discussing Vogels as a potential speaker for OOPSLA, this year or some year soon. I've read enough of Vogels's blog to know that he has interesting things to say.

    At the end of the interview, Vogels comments on recruiting students and more generally on the relationship of today's frontier IT firm to academia. First, on what kind of person Amazon seeks:

    The Amazon development environment requires engineers and architects to be very independent creative thinkers. We are building things that nobody else has done before, so you need to be able to think outside the box. You need to have a strong sense of ownership, because in the small teams in which you will work at Amazon, your colleagues will count on you to pull your weight -- especially when it comes to operating the service that you have built. Can you take responsibility for making this the best it can be?

    Many students these days hear so much about teamwork and "people" skills that they sometimes forget that every team member has to be able to contribute. No one wants a teammate who can't produce. Vogels stresses this upfront. To be able to contribute effectively, each of us needs to develop a set of skills that we can use right now, as well as the ability to pick up new skills with some facility.

    I'd apply the same advice to another part of Vogels's answer. In order to "think outside the box", you have to start with a box.

    Vogels then goes on to emphasize how important it is for candidates to "think the right way about customers and technology. Technology is useless if not used for the greater good of serving the customer." Sometimes, I think that cutting edge companies have an easier time cultivating this mindset than more mundane IT companies. A company selling a new kind of technology or lifestyle has to develop its customer base, and so thinks a lot about customers. It will be interesting to see how companies like Yahoo!, Amazon, and Google change as they make the transition into the established, mainstream companies of 2020.

    On the relationship between academia and industry, Vogels says that faculty and Ph.D. students need to get out into industry in order to come into contact with "the very exciting decentralized computing work that has rocked the operating systems and distributed systems world in the past few years". Academics have always sought access to data sets large enough for them to test their theories. This era of open source and open APIs has created a lot of new opportunities for research, but open data would do even more. Of course, the data is the real asset that the big internet companies hold, so it won't be open in the same way for a while. Internships and sabbaticals are the best avenue open for academics interested in this kind of research these days.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    April 22, 2006 7:07 PM

    A Day with Camouflage Scholars

    Camouflage Conference poster

    I am spending this day uncharacteristically, at the international conference Camouflage: art, science & popular culture. As I've written before, UNI is home to an internationally renowned scholar of camouflage, Roy Behrens, who has organized this surprising event: a one-day conference that has attracted presenters and attendees from Europe, Australia, and all over the US, from art, architecture, photography, literature, theater, dance, music, and graphic arts, as well as a local mathematician and a local computer scientist, all to discuss "the art of hiding in plain sight". I am an outlier in every session, but I'm having fun.

    Marvin Bell, the Flannery O'Connor Professor of Letters at the University of Iowa and the state's first Poet Laureate, opened the day with some remarks and a first reading of a new poem in his Dead Man Walking series. First, he chuckled over his favorite titles from the conference program: "The Case of the Disappearing Student" and "Photographic Prevarications" among them. My talk's title, "NUMB3RS Meets The Da Vinci Code: Information Masquerading as Art", wasn't on his list, but that may be because my talk was opposite his favorite title. It is worthy of enshrinement with the all-time great MLA titles: "Art and the Blind: An Unorthodox Phallic Cultural Find". His remarks considered the ways in which poetry is like camouflage, how it uses a common vocabulary but requires a second look in order to see what is there.

    Thankfully, my talk came almost first thing in the day, as one of three talks kicking off the day's three parallel sessions. As you might guess, this was an unusual audience for me to speak to, a melange of artistic folks, with not a techie among them. What interests them about steganography in digital images?

    The talk went well. I had prepared much more than I could use in thirty minutes, but that gave me the freedom to let the talk grow naturally from plentiful raw material. I may write more about the content of my presentation later, but what is most on my mind right now are the reactions of the audience, especially in response to this comment I made near the end: "But of what interest can this be to an artist?" As I was packing up, a Los Angeles architect and artist asked about 3D steganography -- how one might hide one building inside another, either digitally or in real space. A writer asked me about hypertext and the ability to reveal different messages to different readers depending on their context. Later, another artist/architect told me that what excited her most about my talk was simply knowing that something like this exists -- the idea had sparked her thoughts on military camouflage. Finally, on the way to lunch, two artists stopped me to say "Great talk! We could have listened to you for another 30 minutes, or even an hour." What a stroke to my ego.

    For me, perhaps the best part of this is to know that I am not such an oddball, that the arts are populated by kindred spirits who see inspiration in computer science much as I see it in the arts. This has been my first "public outreach" sort of talk in a long time, but the experience encourages me that we can share the thrill with everyone -- and then watch for the sparks of inspiration to create the fires of new ideas in other disciplines.

    I've done my best today to attend presentations from as many different media as possible: so far, poetry, architecture, literary translition (yes, that's an 'i'), photography, dance, painting, and neurobiology; coming up, language, music, and math. The talks have been of mixed interest and value to me, but I suppose that's not much different from most computer science conferences.

    Some thoughts that stood out or occurred to me:

    • Natural camouflage is not intentional, but rather the result of a statistical process -- natural selection.

    • Children develop a resentful attitude toward most poetry in school. They distrust it, because the meaning is hidden. Common words don't mean what they say. Such poetry makes them -- us -- feel stupid.

      Do computer science courses do this to students?

    • One poetry presentation was really a discussion of a couple of poems, including So This is Nebraska, by Ted Kooser. One member of the audience was a Chicago native who recently had moved to rural Iowa. The move had clearly devastated her; she felt lost, disoriented. The depth of her emotion poured out as she described how she did not "get" Iowa. "But Nebraska (Iowa) is so hard to see!" Over time she has begun to learn to see her new world. For her, the title of the poem could be "So This is Nebraska!". She has had to learn not to resent Iowa(!)

      I think that introductory computer science courses disorient our students in a similar way. They are drowned in new language, new practices, and too often 'just programming'. How can we help them to see, "So This is Computer Science!"?

    • In a single talk this afternoon, I learned about abstract Islamic art, cell trees, and -- my personal favorite so far -- the SlingKing (tm). Two thumbs up, sez Dave.

    • "Translition" is a creativity technique described by poet Steve Kowit in which one playfully translates a poem written in a language one doesn't know. Like other creative writing techniques, the ideas is to write down whatever crosses the mind, whatever sounds right at the moment. The translitor plays off sounds and patterns, making up cognate words or any other vocabulary he wants. He is bound to preserve the structure of the original poem, its layout and typography. We did some translition during the session, and some brave audience members (not me, even being a recently self-published poet) read stanzas of their work with Norwegian and African poems.

      Okay, so I'm crazy, but how could I turn this into a programming etude?

    This is an indulgent day for me, frankly. I have a list of 500 things to do for my job -- literally -- plus a hefty list for home. My daughters had soccer games, piano lessons, and babysitting today, so my wife spent a bunch of time running shuttle service solo. It's a privilege to spend an entire day, 8:00 AM-8:30 PM, on an interdisciplinary topic with little or no direct relationship to computer science. It's a good thing I don't have to worry about getting tenure.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    April 20, 2006 6:54 PM

    Artistic Diversions from Artistic Work

    This
    blog
    speaks in
    rhythms of
    numbers in patterns.
    Words in patterns entice the mind.

    That bit of doggerel is my first Fib -- a poem!

    Each poem is six lines, twenty syllables. You should see a familiar pattern.

    The syllables of a Fib follow the Fibonacci sequence. And, for my own amusement, the last three paragraphs extend the pattern.
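
    (For the record, the line lengths are 1, 1, 2, 3, 5, and 8 syllables -- which is how six lines add up to twenty.)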

    Don't I have something better to do, like prepare slides for my talk on steganography at this weekend's Camouflage Conference at UNI? Of course! But sometimes the mind chooses its own direction. (Count the syllables!)


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    April 09, 2006 10:31 AM

    A Guilty Pleasure, Language-Style

    While doing a little reading to end what has been a long week at the office, I ran across a pointer to Steve Yegge's old piece, Tour de Babel, which has recently been touched up. This is the third time now that Yegge's writing has come recommended to me, and I've enjoyed the recommended article each time. That means I need to add his blog to my newsreader.

    This was my favorite quote from the article, a perfect thought with which to end the week:

    Familiarity breeds contempt in most cases, but not with computer languages. You have to become an expert with a better language before you can start to have contempt for the one you are most familiar with.

    So if you don't like what I am saying about C++, go become an expert at a better language (I recommend Lisp), and then you'll be armed to disagree with me. You won't, though. I'll have tricked you. You won't like C++ anymore...

    I know that this is the sort of inflammatory, holier-than-thou pronouncement that smug Lisp weenies make all the time, and that it doesn't do anything to move a language discussion forward. But from all I've read by Steve, he isn't a language bigot at all but someone who seems to like lots of languages for different virtues. He even speaks kindly of C++ and Java when they are discussed in certain contexts.

    Even though I know I shouldn't like these sorts of statements, or give them the bully pulpit of my ever-so-popular blog, I give in to the urge. They make me smile.

    I myself am not a smug Lisp weenie. However, if you replaced "Lisp" with "Smalltalk" or "Scheme" in the quoted paragraph, I would be smiling even wider. (And if you don't know why replacing "Lisp" with "Scheme" in that sentence would be a huge deal to a large number of Lisp devotees, well, then you just don't understand anything at all about smug Lisp weenies!)

    Pardon me this guilty pleasure.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

    April 07, 2006 11:33 PM

    Back to the Basics. Accelerated

    The SIGCSE mailing list has been alive this week with a thread that started with pseudocode, moved to flowcharts, and eventually saddened a lot of readers. At the center of the thread is the age-old tension among CS educators that conflates several debates: bottom-up versus top-down, low-level versus high-level, machine versus abstraction, "the fundamentals" versus "the latest trends". I don't mean to rehash the whole thread here, but I do want to share my favorite line in the discussion.

    Suffice it to say: Someone announced an interest in introducing programming via a few weeks of working in pseudocode, which would allow students to focus on algorithms without the distraction of compilers. He asked for advice on tools and resources. A group of folks reported having had success with a similar idea, only using flowchart tools. Others reported the advantages of lightweight assembly-language style simulators. The discussion became a lovefest for the lowest-level details in CS1.

    My friend and colleague Joe Bergin, occasionally quoted here, saw where this was going. He eventually sent an impassioned and respectful message to the SIGCSE list, imploring folks to look forward and not backwards. In a message sent to a few of us who are preparing for next week's ChiliPLoP 2006 conference, he wrote what became the closing salvo in his response.

    The pseudocode thread on the SIGCSE list is incredibly depressing. ... Why not plugboards early? Why not electromechanical relays early? Why not abacus early?

    An "abacus-early" curriculum. Now, there's the fundamentals of computing! Who needs "objects first", "objects early", "procedures early", "structured programming", ...? Assignment statements and for-loops are johnny-come-latelys to the game. Code? Pshaw. Let's get back to the real basics.

    Joe, you are my hero.

    (Of course, I am being facetious. We all know that computing reached its zenith when C sprang forth, fully formed, from Bell Labs.)

    Am I too young to be an old fogey tired of the same old discussions? Am I too young to be a guy who likes to learn new things and help students do the same?

    I can say that I was happy to see that Joe's message pulled a couple of folks out of the shadows to say what really matters: that we need to share with students the awesome beauty and power of computing, to help them master this new way of thinking that is changing the world we live in. All the rest is details.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    March 31, 2006 12:19 PM

    Getting My Groove Back

    To soothe my bruised ego, yesterday evening I did a little light blog reading. Among the articles that caught my attention was Philip Greenspun's Why I love teaching flying more than software engineering. People learning to fly want to get better; I'm guessing that most of them want to become as good as they possibly can. (Just like this guy wanted to make as good a product as possible.) Philip enjoys teaching these folks, more so than teaching students in the area of his greatest expertise, computing, because the math and computing students don't seem to care if they learn or not.

    I see students who will work day and night until they become really good software developers or really good computer scientists, and the common thread through their stories is an internal curiosity that we can't teach. But maybe we can expose them to enough cool problems and questions that one will kick their curiosity into overdrive. The ones who still have that gear will do the rest. Philip worries that most students these days "are satisfied with mediocrity, a warm cubicle, and a steady salary." I worry about this, too, but sometimes wonder if I am just imagining some idyllic world that never has existed. But computer science is so much fun for me that I'm sad that more of our students don't feel the same joy.

    While reading, I listened to Olin Shivers's talk at Startup School 2005, "A Random Walk Through Startup Space" (mp3). It had been in my audio folder for a while, and I'm glad I finally cued it up. Olin gives lots of pithy advice to the start-up students. Three quotes stood out for me yesterday:

    • At one point, Olin was talking about how you have to be courageous to start a company. He quoted Paul Dirac, who did "physics so beautiful it will bring tears to your eyes", as saying

      Scientific progress advances in units of courage, not intelligence.

    • Throughout his talk, Olin spoke about how failure is unavoidable for those who ultimately succeed.
      ... to start a business, you've got to have a high tolerance for feeling like a moron all the time.

      And how should you greet failure when you're staring it in the face?

      Thank you for what you have taught me.

    Next, I plan to listen to Steve Wozniak's much-blogged talk (mp3). If I enjoy that one as much as I enjoyed Shivers's, I may check out the rest of them.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    March 10, 2006 5:37 PM

    Students Paying for Content

    Education is an admirable thing,
    but it is well to remember from time to time
    that nothing worth knowing can be taught.
    -- Oscar Wilde

    Recently I have been having an ongoing conversation with one of my colleagues, a senior faculty member, about teaching methods. This conversation is part of a larger discussion of the quality of our programs and the attractiveness of our majors to students.

    In one episode, we were discussing how frequently one should give quizzes and exams. Some of our math professors quiz almost daily, and even some CS professors give substantial quizzes every week. My colleague thinks this is a waste of valuable class time and a disservice to students. I tend to agree, at least for most of our CS courses. When we assess students so frequently for the purposes of grading, the students become focused on assessment and not on the course material. They also tend not to think much about the fun they could be having writing programs and understanding new ideas. They are too worried about the next quiz.

    My colleague made a much stronger statement:

    Students are paying for content.

    In an interesting coincidence, when he said this I was preparing a class session in which my students would do several exercises that would culminate in a table-driven parser for a small language. We had studied the essential content over the previous two weeks: first and follow sets, LL(1) grammars, semantic actions, and so on.
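
    To make that class session concrete for readers who haven't built one, here is a toy table-driven parser. The grammar, the table, and the class name TinyLLParser are my own illustration, not the exercise from the course, and it assumes single-character tokens:

        import java.util.HashMap;
        import java.util.Map;
        import java.util.Stack;

        // A toy table-driven LL(1) parser for the grammar  S -> ( S ) S | epsilon .
        // FIRST(S) = { (, epsilon }   FOLLOW(S) = { ), $ }
        class TinyLLParser {
            // The parse table, reduced here to the single row for S:
            // lookahead -> right-hand side to push ("" means epsilon).
            static final Map<Character, String> ROW_S = new HashMap<Character, String>();
            static {
                ROW_S.put('(', "(S)S");   // expand S -> ( S ) S when the lookahead is '('
                ROW_S.put(')', "");       // erase S (epsilon) when the lookahead is in FOLLOW(S)
                ROW_S.put('$', "");
            }

            static boolean parse(String input) {
                String tokens = input + "$";              // '$' marks end of input
                Stack<Character> stack = new Stack<Character>();
                stack.push('$');
                stack.push('S');                          // start symbol
                int pos = 0;

                while (!stack.isEmpty()) {
                    char top = stack.pop();
                    char lookahead = tokens.charAt(pos);
                    if (top == lookahead) {               // terminal (or '$'): match and advance
                        pos++;
                    } else if (top == 'S') {              // nonterminal: consult the table
                        String rhs = ROW_S.get(lookahead);
                        if (rhs == null) return false;    // no table entry: syntax error
                        for (int i = rhs.length() - 1; i >= 0; i--)
                            stack.push(rhs.charAt(i));    // push the right-hand side, rightmost first
                    } else {
                        return false;                     // terminal mismatch
                    }
                }
                return pos == tokens.length();
            }

            public static void main(String[] args) {
                System.out.println(parse("(()())"));      // prints true
                System.out.println(parse("(()"));         // prints false
            }
        }

    The driver loop is the whole point of the exercise: the parsing logic never changes; only the table does.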

    I don't think I was wasting my students' time or tuition money. I do owe them content about compilers and how to build them. But they have to learn how to build a compiler, and they can't learn that by listening to me drone on about it at the front of the classroom; they have to do it.

    My colleague agrees with me on this point, though I don't think he ever teaches in the way I do. He prefers to use programming projects as the only avenue for practice. Where I diverge is in trying to help students gain experience by doing, in a tightly controlled environment where I can give almost immediate feedback. My hope is that this sort of scaffolded experience will help them learn and internalize technique more readily.

    (And don't worry that my students lack for practical project experience. Just ask my compiler students, who had to submit a full parser for a variant of Wirth's Oberon-0 language at 4 PM today.)

    I think that our students are paying for more than just content. If all they need is "dead" content, I can give them a book. Lecture made a lot of sense as the primary mode of instruction back when books were rare or unavailable. But we can do better now. We can give students access to data in a lot of forms, but as expert practitioners we can also help them learn how to do things by working with them as they do.

    I am sympathetic to my colleague's claims, though. Many folks these days spend far more time worrying about teaching methodology than about the course material. The content of the course is paramount; how we teach it is in service of helping students learn the material. But we can't fall into the trap of thinking that we can lecture content and magically put it into our students' heads, or that they can magically put it there by doing homework.

    This conversation reminded me of a post on learning styles at Tall, Dark, and Mysterious. Here is an excerpt she quotes from a cognitive scientist:

    What cognitive science has taught us is that children do differ in their abilities with different modalities, but teaching the child in his best modality doesn't affect his educational achievement. What does matter is whether the child is taught in the content's best modality. All students learn more when content drives the choice of modality.

    The issue isn't that teaching a subject, say, kinesthetically, doesn't help a kinesthetic learner understand the material better; the issue is that teaching material kinesthetically may compromise the content.

    Knowledge of how to do something sometimes requires an approach different from lecture. Studio work, apprenticeship, and other forms of coached exercise may be the best way to teach some material.

    Finally, that post quotes someone who sees the key point:

    Perhaps it's more important for a student to know their learning style than for a teacher to teach to it. Then the student can make whatever adjustments are needed in their classroom and study habits (as well as out of classroom time with the instructor).

    In any case, a scientist or a programmer needs to possess both a lot of declarative knowledge and a lot of procedural knowledge. We should use teaching methods that best help them learn.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    March 07, 2006 6:44 PM

    A Blog Entry From Before I Had a Blog

    Back when I started this blog, I commented that "the hard drive of every computer I've ever had is littered with little snippets and observations that I would have liked to make available to someone, anyone, at the time." At the time, I thought that I might draw on some of my old writings for my blog. That hasn't happened, because I've usually either had something new to write about that interested me or had no time to blog at all.

    Apparently, not all of my old writings ended up in dead files somewhere. While I was getting ready for my compiler class today, I looked over the lecture notes from the corresponding session the last time I taught this course. The date was November 4, 2003, and I had just returned from OOPSLA in Anaheim. I was excited about the conference, I'm sure, as I always am, and used part of my first class back to share some of what I had learned at the conference. (I wouldn't be surprised if part of my motivation in sharing was to avoid diving head-first back into semantic analysis right away... :-)

    My lecture notes for that day looked a lot like a set of blog entries! So it seems that my students were an unwitting, though not unwilling, audience for my blogging before this blog began.

    I enjoyed re-reading one of these OOPSLA'03 reports enough that I thought I'd polish it up for a wider audience. Here is an excerpt from my compiler lecture notes of 11/04/03, on David Ungar's keynote address, titled "Seven Paradoxes of Object-Oriented Programming Languages". I hope you enjoy it, too.

    OOPSLA 1: David Ungar's Keynote Address

    David Ungar

    David Ungar of Sun Microsystems gave an invited talk at OOPSLA last week, entitled "Seven Paradoxes of Object-Oriented Programming Languages". (local mirror)

    His abstract lists seven paradoxes about language that he thinks make designing a good programming language so difficult. He titled and focused his talk on object-oriented languages, given his particular audience, but everything he said applies to languages more generally, and indeed to most design for human use. His points are about language design, not compiler design, but language design has a profound effect on the implementation of a compiler. We've seen this at a relatively low level when considering LL and LR grammars, and it applies at higher levels, too.

    He only addressed three of his paradoxes in the talk, and then in a more accessible form.

    1. Making a language better for computers makes it worse for people, and vice versa.

    He opened with a story about his interview at MIT when he was looking for his first academic position. One interviewer asked him to define 'virtual machine'. He answered, "a way to give programmers good error messages, so that when there weren't any error messages the program would run successfully". The interviewer said, "If you believe that, then we have nothing more to talk about." And they didn't. (He got the offer, but turned it down.)

    A programming language has to be mechanistic and humanistic, but in practice we let the mechanistic dominate. Consider: write a Java class that defines a simple stack of integers. You'll have to say int five times -- but only if you want 32-bit integers; if you want more or less, you need to say something else.
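
    A quick check of that claim, using a bare-bones class of my own devising (not code from the talk):

        // A minimal, fixed-capacity stack of integers; the keyword int appears exactly five times.
        class IntStack {
            private final int[] items = new int[100];        // int #1, int #2
            private int top = 0;                             // int #3

            void push(int value) { items[top++] = value; }   // int #4
            int pop() { return items[--top]; }               // int #5

            public static void main(String[] args) {
                IntStack s = new IntStack();
                s.push(42);
                System.out.println(s.pop());                 // prints 42
            }
        }

    And, as Ungar observed, every one of those five would have to change if you wanted integers of a different size.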

    The machines have won! How can we begin to design languages for people? We first must understand how they think. Some ideas that he shared from non-CS research:

    • Abstractions are in your head, not in the world. A class of objects is an abstraction. An abstract data type is an abstraction. An integer is an abstraction. The idea of identity is an abstraction.

      Our programming languages must reflect a consciousness of abstraction for even the simplest ideas; in C++ and Java, these include const, final, and ==. (A final field in Java cannot be modified, but its value can change if it is a mutable object... See the small example after this list.)

    • From the moment of perception, our bodies begin to guess at abstractions. But these abstractions are much 'softer' than their equivalents in a program.

    • Classical categories -- classes of objects -- are defined in terms of shared features. This implies a symmetry to similarity and representativeness. But that is not how the human mind seems to work. Example: birds, robins, and ostriches. For humans, classes are not defined objectively in our minds but subjectively, with fuzzy notions of membership. One interesting empirical observation: People treat the middle of a classification hierarchy as the most basic unit, not the root or leaves. Examples: cat, tree, linked list.

    • Why is state so important to us in our programs? Because the 'container' metaphor seems deeply ingrained in how we think about the world. "The variable x 'holds' a 4", like a box. The same metaphor affects how we think about block structure and single inheritance. But the metaphor isn't true in the world; it's just a conceptual framework that we construct to help us understand. And it can affect how we think about programs negatively (especially for beginners!?)
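
    The parenthetical note about final fields above is easy to see in code. This small example is mine, not Ungar's:

        import java.util.ArrayList;
        import java.util.List;

        // 'final' fixes the reference, not the object the reference points to.
        class Roster {
            private final List<String> names = new ArrayList<String>();

            void add(String name) { names.add(name); }   // legal: the list object is mutable
            // void reset() { names = new ArrayList<String>(); }   // would not compile: final field

            public static void main(String[] args) {
                Roster r = new Roster();
                r.add("Ungar");
                System.out.println(r.names);   // prints [Ungar] -- the final field's value has changed
            }
        }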

    If class-based languages pose such problems, what is the alternative? How about an instance-based language? One example is Self, a language designed by Ungar and his colleagues at Sun. But instance-based languages pose their own problems...

    2. The more concepts we add to a language, the less general the code we write.

    He opened with a story about his first big programming assignment in college, to write an assembler. He knew APL best -- a simple language with few concepts but powerful operators. He wrote a solution in approximately 20 lines of code. How? By reading in the source as a string and then interpreting it.

    Some of his PL/1-using classmates didn't get done. Some did, but they wrote hundreds and hundreds of lines of code. Why? They could have done the same thing he did in APL -- but they didn't think of it!

    But why? Because in languages like Fortran, PL/1, C, Java, etc., programs and data are different sorts of things. In languages like APL, Lisp, Smalltalk, etc., there is no such distinction.

    Adding more concepts to a language -- such as distinguishing programs from data -- impoverishes discourse because it blinds us, creating a mindset that is focused on concepts, not problems.

    Most OO languages adopt a classes-and-bits view of objects, which encourages peeking at implementation (getters/setters, public/private, ...). Could we create a better language that doesn't distinguish between data and behavior? Self also experiments with this idea, as do Smalltalk and Eiffel.

    Adding a feature to a language solves a specific problem but degrades learnability, readability, debuggability -- choosing what to say. (Why then do simpler languages like Scheme not catch on? Maybe it's not a technical issue but a social one.)

    What is an alternative? Build languages with layering -- e.g., Smalltalk control structures are built on top of blocks and messages.
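
    As a rough sketch of the layering idea -- mine, not Ungar's, with Java anonymous classes standing in for Smalltalk blocks -- a control structure can be packaged as ordinary library code:

        // Control flow built from "blocks"; anonymous classes stand in for Smalltalk blocks.
        class Layered {
            interface Block     { void run(); }
            interface Condition { boolean holds(); }

            // whileTrue: a loop packaged as an ordinary method that takes two blocks.
            static void whileTrue(Condition test, Block body) {
                while (test.holds()) body.run();
            }

            public static void main(String[] args) {
                final int[] counter = { 0 };
                whileTrue(new Condition() { public boolean holds() { return counter[0] < 3; } },
                          new Block()     { public void run()      { System.out.println(counter[0]++); } });
                // prints 0, 1, 2 -- the "control structure" is just a method over blocks
            }
        }

    Underneath, of course, Java's own while statement does the work; in Smalltalk the layering goes all the way down to message sends on blocks and booleans.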

    3. Improving the type system of a language makes the type system worse.

    This part of Ungar's talk was more controversial, but it's a natural application of his other ideas. His claim: less static type checking implies more frequent program change, implies more refactoring, implies better designs.

    [-----]

    So what? We can use these ideas to understand our languages and the ways we program in them. Consider Java and some of the advice in Joshua Bloch's excellent book Effective Java.

    • "Use a factory method, not a constructor."

      What went wrong? Lack of consciousness of abstraction. Richer is poorer (a constructor is distinct from an ordinary message send).

    • "Override hashCode() when you override equals()."

      What went wrong? Better types are worse. Why doesn't the type system check this?
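
    A small illustration of that second point -- my own example, not one from the talk or from Bloch's book:

        import java.util.HashSet;
        import java.util.Set;

        // equals() overridden without hashCode(): HashSet quietly misbehaves.
        class Point {
            final int x, y;
            Point(int x, int y) { this.x = x; this.y = y; }

            public boolean equals(Object o) {
                return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
            }
            // No hashCode() override, so equal Points usually hash to different buckets.

            public static void main(String[] args) {
                Set<Point> points = new HashSet<Point>();
                points.add(new Point(1, 2));
                // Almost certainly prints false, even though the two Points are equals()-equal.
                System.out.println(points.contains(new Point(1, 2)));
            }
        }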

    Ambiguity communicates powerfully. We just have to find a way to make our machines handle ambiguity effectively.

    [end of excerpt]

    As with any of my conference reports, the ideas presented belong to the presenter unless stated otherwise, but any mistakes are mine.

    The rest of this lecture was a lot like my other conference-themed blog entries. Last week, I blogged about the Python buzz at SIGCSE; back in late 2003 I commented on the buzz surrounding the relatively new IDE named Eclipse. And, like many of my conference visits, I came back with a long list of books to read, including "Women, Fire, and Dangerous Things", by George Lakoff, "The Innovator's Dilemma", by Clayton Christensen, and "S, M, L, XL", by Rem Koolhaas and Bruce Mau.

    Some things never change...


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    March 05, 2006 10:23 AM

    SIGCSE Wrap-Up: This and That

    I was able to stay at SIGCSE only through the plenary talk Friday morning, and so there won't be a SIGCSE Day 2 entry. But I can summarize a few ideas that came up over the rest of Day 1 and the bit of Day 2 I saw. Consider this "SIGCSE Days 1 and 1.15".

    Broadening the Reach of CS

    I went to a birds-of-a-feather session called "Innovative Approaches to Broadening Computer Science", organized by Owen Astrachan and Jeffrey Forbes of Duke. I was surprised by the number and variety of schools that are creating new courses aimed at bringing computing to non-computer scientists. Many schools are just beginning the process, but we heard about computing courses designed specially for arts majors, psychology majors, and the more expected science and engineering majors. Bioinformatics is a popular specialty course. We have an undergraduate program in bioinformatics, but the courses students take at the beginning of this program are currently traditional programming courses. My department recently began discussions of how to diversify the entry paths into computing here, with both majors and non-majors in mind. It's encouraging to see so many other schools generating ideas along these lines, too. We'll all be able to learn from one another.

    Extreme Grading

    Not quite, but close. Henry Walker and David Levine presented a paper titled "XP Practices Applied to Grading". David characterized this paper as an essay in the sense resurrected by the essays track at OOPSLA. I enjoy SIGCSE talks that reflect on practices. While there wasn't much new for me in this talk, it reminded Jim Caristi and me of a 1998 OOPSLA workshop that Robert Biddle, Rick Mercer, and I organized called Evaluating Object-Oriented Design. That session was one of those wonderful workshops where things seem to click all day. Every participant contributed something of value, and the contributions seemed to build on one another to make something more valuable. I presented one of my favorite patterns-related teaching pieces, Using a Pattern Language to Evaluate Design. What Jim remembered most vividly from the workshop was the importance in the classroom of short cycles and continuous feedback. It was good to see CS educators like Henry and Dave presenting XP practices in the classroom to the broader SIGCSE community.

    ACM Java Task Force

    Over the last two-plus years, the ACM Java Task Force has put in a lot of time and hard work designing a set of packages for use in teaching Java in CS1. I wonder what the ultimate effect will be. Some folks are concerned about the graphics model that the task force adopted. ("Back to Java 1.0!" one person grumbled.) But I'm thinking more of the fact that Java may not last as the dominant CS1 language much longer. At last year's SIGCSE one could sense a ripple of unease with Java, and this year the mood seemed much more "when...", not "if...". Rich Pattis mentioned in his keynote lecture that he had begun teaching a new CS1 language every five years or so, and Java's time should soon be up. He didn't offer a successor, but my read of the buzz at SIGCSE is that Python is on the rise.

    Computer Science in K-12 Education

    The second day plenary address was given by a couple of folks at the Computer Science Teachers Association, a group affiliated with the ACM that "supports and promotes the teaching of computer science and other computing disciplines" in the K-12 school system. I don't have much to say about their talk other than to note that there are a couple of different issues at play. One is the teaching of computer science, especially AP CS, at the pre-university level. Do we need it? If so, how do we convince schools and taxpayers to do it right? The second is more general, the creation of an environment in which students want to study math, science, and technology. Those are the students who are in the "pipeline" of potential CS majors when they get to college. At first glance, these may seem like two separate issues, but they interconnect in complicated ways when you step into the modern-day high school. I'm glad that someone is working on these issues full-time, but no one should expect easy answers.

    ...

    In the end, Rich Pattis's talk was the unchallenged highlight of the conference for me. For all its activity and relatively large attendance (1100 or so folks), the rest of the conference seemed a bit low on energy. What's up? Is the discipline in the doldrums, waiting for something new to invigorate everyone? Or was it just I who felt that way?


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    March 05, 2006 10:06 AM

    SIGCSE Buzz: Python Rising?

    If the buzz is accurate, Python may be a successor. The conference program included a panel on experiences teaching Python to novices, including John Zelle and Mark Guzdial, who have written the two texts currently available. My publisher friend Jim Leisy of Franklin, Beedle was very high on Zelle's book and the adoptions they've picked up in the last few months. The schools now using Python in at least one track of their intro courses range from small private schools all the way to MIT.

    Heck, I got home from SIGCSE and "PYTHON" even showed up as a solution in the local newspaper's Saturday Jumble!

    All in all, I think I prefer Ruby, both for me and for beginners, but it is behind Python in the CS education life cycle. In particular, there is only one introductory level book available right now, Chris Pine's Learn To Program. Andy Hunt from the Pragmatic Bookshelf sent me a review copy, and it looks good. It's not from the traditional CS1 textbook mold, though, and will face an uphill battle earning broad adoption in CS1 courses.

    In any case, I welcome the return to a small, simple language for teaching beginners to program. Whether Ruby or Python, we would be using an interpreted language that is of practical value as a scripting language. This has great value for three audiences of students: non-majors can learn a little about computing while learning scripting skills that they can take to their major discipline; folks who intend to major in CS but change their minds can also leave the course with useful skills; and even majors will develop skills that are valuable in upper-division courses. (You gotta figure that they'll want to try out Ruby on Rails at some point in the next few years.)

    Scripting languages pose their own problems, both in terms of language and curriculum. In particular, you need to introduce at least one and maybe two systems languages such as Java, C, or C++ in later courses, before they are needed for large projects. But I think the trade-off will be a favorable one. Students can learn to program in an engaging and relatively undistracting context before moving on to bigger issues. Then again, I've always favored languages like Smalltalk and Scheme, so I may not be the best bellwether of this trend.

    Anyway, I left SIGCSE surprised to have encountered Python at so many turns. Maybe Java will hang on as the dominant CS1 language for a few more years. Maybe Python will supplant it. Or maybe Python will just be a rebound language that fills the void until the real successor comes along. But for now the buzz is notable.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    March 02, 2006 6:36 PM

    SIGCSE Day 1: Keynote Talk by Rich Pattis

    I've been lucky in my years as a computer science professor to get to know some great teachers. Some folks embody the idea of the teacher-scholar. They are able computer scientists -- in some cases, much more than that -- but they have devoted their careers to computer science education. They care deeply about students and how they learn. My good fortune to have met many of these folks and learned from them is dwarfed by the greater fortune to have become their acquaintances and, in some cases, their friends.

    Rich Pattis

    One of my favorite computer science educators, Rich Pattis, is receiving the SIGCSE Award for Outstanding Contribution to CS Education at the 2006 conference -- this morning, as I type. I can't think of many people who deserve this award as much as Rich. His seminal contribution to CS education is Karel The Robot, a microworld for learning basic programming concepts, and the book Karel the Robot: A Gentle Introduction to The Art of Programming, which has been updated over the years and extended to C++ and Java. But Rich's contributions have continued steadily since then, and he has been in the vanguard of all the major developments in CS education since. Check him out in the ACM Digital Library.

    Rich titled his acceptance talk "Can't Sing, Can't Act, Can Dance a Little (On Choosing the Right Dance Partners)", a nod to the famous dismissal of Fred Astaire's screen test and to his long dance partnership with Ginger Rogers. The talk was a trace of Rich's computing history through the people who helped him become the teacher he is, from high school until today. Rich found photos of most of these folks, famous and obscure, and told a series of anecdotes to illustrate his journey.

    Those who know Rich wouldn't be surprised that he began his talk with a list of books that have influenced him. Rich reads voraciously on all topics that even tangentially relate to computing and technology in the world, and he loves to share the book ideas with other folks.

    Rich said, "I get all of my best ideas these days from Owen Astrachan", and one particular idea is Owen's practice of giving book awards to students in his courses. Owen doesn't select winners based on grades but on who has contributed the most to the class. Rich decided to reinstate this idea here at SIGCSE by arranging for all conference attendees to receive his all-time favorite book, Programmers at Work by Susan Lammers. I've not read this book but have long wanted to read it, but it's been out of print. (Thanks, Rich, and Susan, and Jane Prey and Microsoft, for the giveaway!)

    He also recommended Out of their Minds, by Dennis Shasha, which has long been on my to-read list. It just received a promotion.

    Throughout the talk, Rich came back to Karel repeatedly, just as his career has crossed paths with it many times. Rich wrote Karel instead of writing his dissertation, under John Hennessy, now president of Stanford. Karel was the third book published in TeX. (We all know the first.) Karel has been used a lot at Stanford over the years, and Rich demoed a couple of projects written by students that had won course programming contests there, including a "17 Queens" puzzle and a robot that launched fireworks. Finally, Rich showed a photo of a t-shirt given him by Eric Roberts, which had a cartoon picture with the caption "Two wrongs don't make a right, but three lefts do." If you know Karel, you'll get the joke.

    The anecdotes flew fast in the talk, so I wasn't able to record them all. A few stuck with me.

    Rich told about one of the lessons he remembers from high school. He went to class to take a test, but his mechanical pencil broke early in the period. He asked Mr. Lill, his teacher, if he could borrow a pencil. Mr. Lill said 'no'. May I borrow a pencil from another student? Again, 'no'. "Mr. Pattis, you need to come to class prepared." This reminded me of Dr. Brown, my assembly language and software engineering prof in college, who did not accept late work. Period. The computer lab lost power, so you couldn't run your card deck? The university's only mainframe printer broke? "Not my problem," he'd say. The world goes on, and you need to work with the assumption that computers will occasionally fail you. I never hated Dr. Brown, even for a short while, as Rich said he hated Mr. Lill. But I fully understood Rich when he finished this anecdote with the adage, "Learning not accompanied by pain will be forgotten."

    Rich praised Computer Science, Logo Style, a three-book series by Brian Harvey as the best introduction to programming ever written. Wow.

    Not surprisingly, some of Rich's best anecdotes related to students. He likes to ask an exam question on the fact that there are many different infinities. (An infinite number?) Once, he asked students, "Why are there more mathematical functions than computer programs?" One student answered, "Because mathematicians work harder than computer scientists." (Get to work, folks...)

    My favorite of Rich's student anecdotes was a comment a student wrote on a course evaluation form. The comment was labeled A Relevant Allegory:

    In order to teach someone to boil water, he would first spend a day giving the history of pots, including size, shape, and what metals work best. The next day he'd lecture on the chemical properties of water. On day three, he'd talk about boiled water through the ages. That night, he'd tell people to go home and use boiled water to make spaghetti carbonara. But never once would he tell you to put the water in the pot and heat it.

    That's what his programming classes are like -- completely irrelevant to the task at hand.

    Rather than summarize Rich's comments, I'll quote him, too, from a course syllabus in which he quoted the student:

    I like this comment because it is witty, well-written, and true -- although maybe not in the extreme that the author states. Teaching students to boil water is great for a high school class, but in college we are trying to achieve something deeper...

    I acknowledge that learning from first principles is tougher than memorization, and that sometimes students feel that the material covered is not "applied".

    Eventually, Rich's dance-partner history reached the "SIGCSE years". He showed two slides of pictures. The first showed a first generation of folks from SIGCSE who have become a long-term cadre that shares ideas about computer science, teaching, books, and life. The second showed later influences on Rich from among the regulars at SIGCSE. I was a bit surprised and highly honored to see my own picture up on Slide 1! I recall first meeting Rich at SIGCSE back in 1994 or so, when we began a long dialogue on teaching OOP and C++ in CS1. I was pretty honored even then that he engaged me in this serious conversation, and impressed by the breadth of the ideas he had collected and cultivated.

    Rich ended his talk with a tribute to one of his favorite films, The Paper Chase. Long-time readers of Knowing and Doing may recall that I wrote a blog entry that played off my own love for this movie (and Dr. Brown!). Rich said that this movie has "more truths per scene" about teaching than any other movie he knows. As much as he loves "The Paper Chase", Rich admitted to feeling like a split personality, torn between the top-down intellectual tour de force of Professor Kingsfield and the romantic, passionate, bottom-up, "beat" style of Dead Poets Society's John Keating.

    Dead Poets Society

    Kingsfield and Keating occupy opposite ends of the teaching spectrum, yet both inspire a tremendous commitment to learning in their students. Like many of us who teach, Rich resonates with both of these personalities. Also like many of us, he knows that it's hard to be both. He likes to watch "Dead Poets Society" each year as a way to remind him of how his students must feel as they move on to the wide open wonder of the university. Yes, we know that "Dead Poets Society" is about high school. But, hey, "The Paper Chase" is about law school. You should watch both, if you haven't already.

    Rich closed with a video clip from "The Paper Chase", a famous scene in which Professor Kingsfield introduces his class to the Socratic method. (The clip is 10.4 Mb, and even still not of the highest quality.)

    This was an inspirational close to an inspirational talk, from a CS educator's CS educator, a guy who has been asking questions and trying to get better as a teacher for over twenty years -- learning from his dance partners and sharing what he has created. A great way to start a SIGCSE.

    Congratulations, Rich.

    (UPDATE March 4: I have posted a link to the video clip from "The Paper Chase" that Rich showed. He has said that he will post his presentation slides to the web; I'll watch for them, too.)


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    February 28, 2006 11:19 PM

    DNA, Ideas, and the CS Curriculum

    Today is the anniversary of Watson and Crick's piecing together the three-dimensional structure of DNA, the famed double helix. As with many great discoveries, Watson and Crick were not working in isolation. Many folks were working on this problem, racing to be the first to "unlock the secret of life". And Watson and Crick's discovery itself depended crucially on data collected by chemist Rosalind Franklin, who died before the importance of her contribution became widely known outside of the inner circle of scientists working on the problem.

    Ironically, this is also the birthday of one of the men against whom Watson and Crick were racing: Linus Pauling. Pauling is the author of one of my favorite quotes:

    The best way to have a good idea is to have a lot of ideas...

    Perhaps the great geniuses have only good ideas, but most of us have to work harder. If we can free ourselves to Think Different, we may actually come across the good ideas we need to make progress.

    Of course, you can't stop at generating ideas. You then have to examine your candidates critically, exposing them to the light of theory and known facts. Whereas one's inner critic is an enemy during idea generation, it is an essential player during the sifting phase.

    Pauling knew this, too. The oft-forgotten second half of Pauling's quote is:

    ... and throw the bad ones away.

    Artists work this way, and so do scientists.

    This isn't a "round" anniversary of Watson and Crick's discovery; they found the double helix in 1953. It's not even a round anniversary of Pauling's birth, as he would be 105 today. (Well, that's sort of round.) But I heard about the anniversaries on the radio this morning, and the story grabbed my attention. Coincidentally, DNA has been on my mind for a couple of reasons lately. First, my last blog entry talked about a paper by Bernard Chazelle that uses DNA as an example of duality, one of the big ideas that computer science has helped us to understand. Then, on the plane today, I read a paper by a group of folks at Duke University, including my friend Owen Astrachan, on an attempt to broaden interest in computing, especially among women.

    Most research shows that women become interested in computing when they see how it can be used to solve real problems in the world. The Duke folks are exploring how to use the science of networks as a thematic motivation for computing, but another possible domain of application is bioinformatics. Women who major in science and technology are far more likely to major in biology than in any other discipline. Showing the fundamental role that computing plays in the modern biosciences might be a way to give women students a chance to get excited about our discipline, before we misdirect them into thinking that computerScience.equals( programming ).

    My department launched a new undergraduate major in bioinformatics last fall. So we have a mechanism for using the connection between biology and computing to demonstrate computing's utility. Unfortunately, we have made a mistake so far in the structure of our program: all students start by taking two semesters of traditional programming courses before they see any bioinformatics! I think we need to do some work on our first courses. Perhaps Astrachan and his crew can teach us something.

    I'm in Houston for SIGCSE this week, and the Duke paper will be presented here on Saturday. Sadly, I have to leave town on Friday... If I want to learn more about the initiative than I can learn just from the paper, I will need to take advantage of my own social network to make a connection.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    February 24, 2006 2:30 PM

    iPods and Big Ideas

    Last summer I blogged on a CACM article called "The Thrill Is Gone?", by Sanjeev Arora and Bernard Chazelle, which suggested that "we in computing have done ourselves and the world a disservice in failing to communicate effectively the thrill of computer science -- not technology -- to the general public." Apparently, Chazelle is carrying the flag for effective communication into battle by trying to spread the word beyond traditional CS audiences. The theoryCS guys -- Suresh, Lance, and Ernie among them -- have all commented on his latest public works: an interview he gave prior to giving a talk called "Why Computer Science Theory Matters?" at the recent AAAS annual meeting, the talk itself, and a nice little article to appear in an upcoming issue of Math Horizons. Be sure to read the Horizons piece; the buzz is well-deserved.

    Chazelle tries to convince his audience that computing is the nexus of three Big Ideas:

    • universality - the idea that any digital computer can, in principle, do what any other does. Your iPod can grow up to be anything any other computer can be.

    • duality - the idea that data and program are in principle interchangeable, that perspective determines whether something is data or program.

    • self-reference - the idea that a program can refer to itself, or data that looks just like it. This, along with the related concept of self-replication, ties the intellectual ground of computing to that of biology.

    ... and the source of two new, exceedingly powerful ideas that cement the importance of computing to the residents of the 21st century: tractability and algorithm.

    Chazelle has a nice way of explaining tractability to a non-technical audience, in terms of the time it takes to answer questions. We have identified classes of questions characterized by their "time signatures", or more generally, their consumption of any resource we care about. This is a Big Idea, too:

    Just as modern physics shattered the platonic view of a reality amenable to noninvasive observation, tractability clobbers classical notions of knowledge, trust, persuasion, and belief. No less.

    Chazelle's examples, including e-commerce and nuclear non-proliferation policy, are accessible to any educated person.

    The algorithm is the "human side" of the program, an abstract description of a process. The whole world is defined by processes, which means that in the largest sense computer science gives us tools for studying just about everything that interests us. Some take the extreme view that all science is computer science now. That may be extreme, but in one way it isn't extreme enough! Computer science doesn't revolutionize how we study only science, but also the social sciences and literature and art. I think that the greatest untapped reservoir of CS's influence lies in the realm of economics and political science.

    Chazelle makes his case that CS ultimately will supplant mathematics as the primary vehicle for writing down our science:

    Physics, astronomy, and chemistry are all sciences of formulae. Chaos theory moved the algorithmic zinger to center stage. The quantitative sciences of the 21st century (e.g., genomics, neurobiology) will complete the dethronement of the formula by placing the algorithm at the core of their modus operandi.

    This view is why I started my life as a computer scientist by studying AI: it offered me the widest vista on the idea of modeling the world in programs.

    I will be teaching CS1 this fall for the first time in ten years or so. I am always excited at the prospect of a new course and a new kind of audience, but I'm especially excited at the prospect of working with freshmen who are beginning our major -- or who might, or who might not but will take a little bit of computing with them off to their other majors. Learning to program (perhaps in Ruby or Python?) is still essential to that course, but I also want my students to see the beauty and importance of CS. If my students can leave CS1 next December with an appreciation of the ideas that Chazelle describes, and the role computing plays in understanding them and bringing them to the rest of the world, then I will have succeeded in some small measure.

    Of course, that isn't enough. We need to take these ideas to the rest of our students, especially those in the sciences -- and to their faculty!


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    February 08, 2006 2:23 PM

    Functional Programming Moments

    I've been having a few Functional Programming Moments lately. In my Translation of Programming Languages course, over half of the students have chosen to write their compiler programs in Scheme. This brought back fond memories of a previous course in which one group chose to build a content management system in Scheme, rather than one of the languages they study and use more in their other courses. I've also been buoyed by reports from professors in courses such as Operating Systems that some students are opting to do their assignments in Scheme. These students seem to have really latched onto the simplicity of a powerful language.

    I've also run across a couple of web articles worth noting. Shannon Behrens wrote the provocatively titled Everything Your Professor Failed to Tell You About Functional Programming. I plead guilty on only one of the two charges. This paper starts off talking about the seemingly inscrutable concept of monads, but ultimately turns to the question of why anyone should bother learning such unusual ideas and, by extension, functional programming itself. I'm guilty on the count of not teaching monads well, because I've never taught them at all. But I do attempt to make a reasonable case for the value of learning functional programming.

    His discussion of monads is quite nice, using an analogy that folks in his reading audience can appreciate:

    Somewhere, somebody is going to hate me for saying this, but if I were to try to explain monads to a Java programmer unfamiliar with functional programming, I would say: "Monad is a design pattern that is useful in purely functional languages such as Haskell.

    I'm sure that some folks in the functional programming community will object to this characterization, in ways that Behrens anticipates. To some, "design patterns" are a lame crutch for object-oriented programmers who use weak languages; functional programming doesn't need them. I like Behrens's response to such a charge (emphasis added):

    I've occasionally heard Lisp programmers such as Paul Graham bash the concept of design patterns. To such readers I'd like to suggest that the concept of designing a domain-specific language to solve a problem and then solving that problem in that domain-specific language is itself a design pattern that makes a lot of sense in languages such as Lisp. Just because design patterns that make sense in Java don't often make sense in Lisp doesn't detract from the utility of giving certain patterns names and documenting them for the benefit of ... less experienced programmers.

    His discussion of why anyone should bother to do the sometimes hard work needed to learn functional programming is pretty good, too. My favorite part addressed the common question of why someone should willingly take on the constraints of programming without side effects when the freedom to compute both ways seems preferable. I have written on this topic before, in an entry titled Patterns as a Source of Freedom. Behrens gives some examples of self-imposed constraints, such as encapsulation, and how breaking the rules ultimately makes your life harder. You soon realize:

    What seemed like freedom is really slavery.

    Throw off the shackles of deceptive freedom! Use Scheme.

    The second article turns the seductiveness angle upside down. Lisp is Sin, by Sriram Krishnan, tells a tale of being drawn to Lisp the siren, only to have his boat dashed on the rocks of complexity and non-standard libraries again and again. But in all, he speaks favorably of ideas from functional programming and how they enter his own professional work.

    I certainly second his praise of Peter Norvig's classic text Paradigms of AI Programming.

    I took advantage of a long weekend to curl up with a book which has been called the best book on programming ever -- Peter Norvig's Paradigms of Artificial Intelligence Programming: Case Studies in Common Lisp. I have read SICP but the 300 or so pages I've read of Norvig's book have left a greater impression on me than SICP. Norvig's book is definitely one of those 'stay awake all night thinking about it' books.

    I have never heard anyone call Norvig's book the best of all programming books, but I have heard many folks say that about SICP -- Structure and Interpretation of Computer Programs, by Abelson and Sussman. I myself have praised Norvig's book as "one of my favorite books on programming", and it teaches a whole lot more than just AI programming or just Lisp programming. If you haven't studied it, put it at or near the top of your reading list, and do so soon. You'll be glad you did.

    In speaking of his growth as a Lisp programmer, Krishnan repeats an old saw about the progression of a Lisp programmer that captures some of the magic of functional programming:

    ... the newbie realizes that the difference between code and data is trivial. The expert realizes that all code is data. And the true master realizes that all data is code.

    I'm always heartened when a student takes that last step, or shows that they've already been there. One example comes to mind immediately: The last time I taught compilers, students built the parsing tables for the compiler by hand. One student looked at the table, thought about the effort involved in translating the table into C, and chose instead to write a program that could interpret the table directly. Very nice.

    Krishnan's article closes with some discussion of how Lisp doesn't -- can't? -- appeal to all programmers. I found his take interesting enough, especially the Microsoft-y characterization of programmers as one of "Mort, Elvis, and Einstein". I am still undecided just where I stand on claims of the sort that Lisp and its ilk are too difficult for "average programmers" and thus will never be adoptable by a large population. Clearly, not every person on this planet is bright enough to do everything that everyone else does. I've learned that about myself many, many times over the years! But I am left wondering how much of this is a matter of ability and how much is a matter of needing different and better ways to teach? The monad article I discuss above is a great example. Monads have been busting the chops of programmers for a long time now, but I'm betting that Behrens has explained them in a way that "the average Java programmer" can understand and maybe even have a chance of mastering Haskell. I've long been told by colleagues that Scheme was too abstract, too different, to become a staple for our students, but some are now choosing to use it in their courses.

    Dick Gabriel once said that talent does not determine how good you can get, only how fast you get there. Maybe when it comes to functional programming, most of us just take too long to get there. Then again, maybe we teachers of FP can find ways to help accelerate the students who want to get good.

    Finally, Krishnan closes with a cute but "politically incorrect analogy" that plays off his title:

    Lisp is like the villainesses present in the Bond movies. It seduces you with its sheer beauty and its allure is irresistible. A fleeting encounter plays on your mind for a long, long time. However, it may not be the best choice if you're looking for a long term commitment. But in the short term, it sure is fun! In that way, Lisp is...sin.

    Forego the demon temptations of Scheme! Use Perl.

    Not.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

    February 02, 2006 3:47 PM

    Java Trivia: Unary Operators in String Concatenation

    Via another episode of College Kids Say the Darnedest Things, here is a fun little Java puzzler, suitable for your compiler students or, if you're brave, even for your CS1 students.

    What is the output of this snippet of code?

            int price = 75;
            int discount = -25;
            System.out.println( "Price is " + price + discount + " dollars" );
            System.out.println( "Price is " + price - discount + " dollars" );
            System.out.println( "Price is " + price + - + discount + " dollars" );
            System.out.println( "Price is " + price + - - discount + " dollars" );
            System.out.println( "Price is " + price+-+discount + " dollars" );
            System.out.println( "Price is " + price+--discount + " dollars" );
    

    Okay, we all know that the second println causes a compile-time error, so comment that line out before going on. After you've made your guesses, check out the answers.

    Many students, even upper-division ones, are sometimes surprised that all of the rest are legal Java. The unary operators + and - are applied to the value of discount before it is appended to the string.

    Even some who knew that the code would compile got caught by the fact that the output of the last two printlns is not identical to the output of the middle two. These operators are self-delimiting, so the scanner does not require that they be surrounded by white space. But in the last line, the scanner is able to match -- as a single token (the pre-decrement operator) rather than as two unary - operators, and so it does. This is an example of how most compilers match the longest possible token whenever they have the choice.
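
    For the record, and assuming I have traced the scanner and the unary operators correctly, the five printlns that compile produce:

        Price is 75-25 dollars
        Price is 7525 dollars
        Price is 75-25 dollars
        Price is 7525 dollars
        Price is 75-26 dollars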

    So whitespace does matter -- sometimes!

    This turned into a good exercise for my compiler students today, as we just last time finished talking about lexical analysis and were set to talk about syntax analysis today. Coupled with the fact that they are in the midst of writing a scanner for their compiler, we were able to discuss some issues they need to keep in mind.

    For me, this wasn't another example of Why Didn't I Know This Already?, but in the process of looking up "official" answers about Java I did learn something new -- and, like that lesson, it involved implicit type conversions of integral types. On page 27, the Java Language Reference says:

    The unary plus operator (+) ... does no explicit computation .... However, the unary + operator may perform a type conversion on its operand. ... If the type of the operand is byte, short, or char, the unary + operator produces an int value; otherwise the operator produces a value of the same type as its operand.

    The unary - operator works similarly, with the type conversion made before the value is negated. So, again, an operand is promoted to int in order to do arithmetic. I assume that this is done for the same reason that the binary operators promote bytes, shorts, and chars.
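
    A tiny demonstration of the promotion -- my own snippet, not an example from the reference:

        class UnaryPromotion {
            public static void main(String[] args) {
                byte b = 5;
                int promoted = +b;       // fine: unary + produces an int value
                // byte narrowed = +b;   // does not compile: possible loss of precision (int to byte)
                System.out.println(promoted - -b);   // prints 10; b is promoted again for the arithmetic
            }
        }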

    Like many CS professors and students, I enjoy this sort of language trivia. I don't imagine that all our students do. If you'd like to see more Java trivia, check out Random Java Trivia at the Fishbowl. (You gotta love the fact that you can change the value of a string constant!) I've also enjoyed thumbing through Joshua Bloch's and Neal Gafter's Java Puzzlers. I am glad that someone knows all these details, but I'm also glad not to have encountered most of them in my own programming experience.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    January 30, 2006 5:51 PM

    Making Things Worse in the Introductory Course

    Reading Physics Face over at Uncertain Principles reminded me of a short essay I wrote a few months ago. Computer scientists get "the face", but for different reasons. Fewer people take computer science courses, in high school or college, so we don't usually get the face because of bad experiences in one of our courses. We get the face because people have had bad experiences using technology.

    (On the flip side, at least physicists don't have to listen to complaints like "Gravity didn't work for me yesterday, but it seems to be working okay right now. Why can't you guys make things work all the time?" or "Why are there so many different ways I can exert force on an object? That's so confusing!")

    But the main thrust of Chad's article struck a chord with me. A physics education group at the University of Maryland, College Park, found that introductory physics courses cause student expectations about physics -- about the nature of physics as an intellectual activity -- to deteriorate rather than improve! In every group, students left their intro physics courses thinking less like a physicist, not more.

    I know of no such study of introductory CS courses (if you do, please let me know), but I suspect that many of our courses do the same thing. For students who leave CS 1 or CS 2 unhappy or with an inaccurate view of the discipline, their primary image of computing is an overemphasis on programming drudgery. I've written several times here in the last year or so about how we might make our intro courses more engaging -- make them about something, more than "just programming" -- via the use of engaging problems from the "real world", maybe even with a domain-specific set of applications. I notice that Owen Astrachan and his colleagues at Duke are presenting a paper at SIGCSE in early March on using the science of networks as a motivating centerpiece for CS 1. Whatever the focus, we need to help students see that computing is about concepts bigger than a for-loop. In the 1980s, we saw a "breadth-first" curriculum movement that aimed to give students a more accurate view of the discipline in their first year or two, but it mostly died out from lack of interest -- and the logistical problem that students need to master programming before they can go very far in most CS programs.

    I don't have any new answers, but seeing that physics has documented this problem with their introductory courses makes me wonder even more about the state of science education at the university.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    January 25, 2006 6:07 PM

    Camouflage in Computer Science

    Camouflage Conference poster

    A couple of months ago, I mentioned that I was submitting a proposal to speak at a local conference on camouflage. The conference is called "CAMOUFLAGE: Art, Science and Popular Culture" and will be held at my university on April 22. It is being organized by Roy Behrens, a graphic design professor here about whom I wrote a bit in that old entry. My proposal was accepted. You can see a list of many of the conference speakers on the promotional poster shown here. (Click on the image to see it full size.)

    Behrens has attracted speakers from all over the world, despite having no financial support for anyone. He recently announced that Marvin Bell, the first Poet Laureate of Iowa, will open the conference by reading a new poem about camouflage, written especially for the event and titled "Dead Man". I've enjoyed hearing poets at CS conferences before, most recently Robert Hass at OOPSLA, but usually they've been invited to speak on creativity or some other "right-brain" topic. I've never been at a conference with a world-premiere poetry reading... It should be interesting!

    A conference on camouflage run out of a graphic arts program might seem an odd place for a computer science professor to speak, but I thought of proposing a talk almost as soon as I heard about the conference. Computer scientists use camouflage, too, but with a twist -- as a way to transmit a message without anyone but the intended recipient being aware that a message exists. This stands in contrast to encryption, a technique for concealing the meaning of a message even as the message may be publicly known. I've studied encryption a bit in the last couple of years while preparing for and teaching an undergraduate course in algorithms, but I've not read as much on this sort of "computational camouflage", known more formally as steganography.

    This is not an area of research for me, at least yet, but it has long been an idea that intrigues me. This audience isn't looking for cutting-edge research in computer science anyway; they are more interested in the idea of hiding things via patterns in their surroundings. This conference affords me a great opportunity to learn more about steganography and other forms of data hiding -- and teach a non-technical audience about it at the same time. If you have ever taught something to beginners, you know that committing to teach a topic forces you to understand it at a deeper level than you might otherwise be able to get away with. For me, this project will be one part studying computer science, one part educating the public, and one part learning about an idea bigger than computer science -- and where CS fits into the picture.

    I have titled my talk NUMB3RS Meets The DaVinci Code: Information Masquerading as Art. (I'm proud of that title; I hope it's not too kitschy...) I figure I'll show plenty of examples, in text and images and maybe even music, and then relate steganography to the idea of camouflage more generally.

    I also figure that I will have a lot of fun writing code!
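
    As a taste of what such code might look like -- a toy sketch of my own, not anything planned for the talk -- here is the classic least-significant-bit trick for hiding one byte of a message in eight pixels of an image:

        import java.awt.image.BufferedImage;

        public class LsbSketch {
            // Hide one byte in the low-order bit of eight consecutive pixels.
            static void hideByte(BufferedImage img, int x, int y, byte message) {
                for (int i = 0; i < 8; i++) {
                    int rgb = img.getRGB(x + i, y);
                    int bit = (message >> i) & 1;
                    img.setRGB(x + i, y, (rgb & ~1) | bit);   // tweak only the lowest blue bit
                }
            }

            // Recover the byte by reading those same low-order bits back.
            static byte readByte(BufferedImage img, int x, int y) {
                int value = 0;
                for (int i = 0; i < 8; i++) {
                    value |= (img.getRGB(x + i, y) & 1) << i;
                }
                return (byte) value;
            }

            public static void main(String[] args) {
                BufferedImage img = new BufferedImage(8, 1, BufferedImage.TYPE_INT_RGB);
                hideByte(img, 0, 0, (byte) 'Q');
                System.out.println((char) readByte(img, 0, 0));   // prints: Q
            }
        }

    In a real photograph, flipping the lowest bit of a few color channels is invisible to the eye -- which is the whole point.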


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    January 21, 2006 2:50 PM

    Golden Rules, and a Favorite Textbook

    What of Tanenbaum's talk? Reading through his slides reminded me a bit of his talk. He titled his address "Ten Golden Rules for Teaching Computer Science", and they all deserve attention:

    1. Think long-term.
    2. Emphasize principles, not facts.
    3. Expect paradigm shifts.
    4. Explain how things work inside.
    5. Show students how to master complexity.
    6. Computer science is not science.
    7. Think in terms of systems.
    8. Keep theory under control.
    9. Ignore hype.
    10. Don't forget the past.

    Items 1-3 and 9-10 aim to keep us professors focused on ideas, not on the accidents and implementations of the day. Those accidents and implementations are the examples we can use to illustrate ideas at work, but they will pass. Good advice.

    Items 4-5 and 7 remind us that we and our students need to understand systems inside and out, and that complexity is an essential feature of the problems we solve. Interfaces, implementations, and interactions are the "fundamentals".

    Items 6 and 8 reflect a particular view of computing that Tanenbaum and many others espouse: computer science is about building things. I agree that CS is about building things, but I don't want us to discount the fact that, as we build things and study the results, we are in fact laying a scientific foundation for an engineering discipline. We are doing both. (If you have not yet read Herb Simon's The Sciences of the Artificial, hie thee to the library!) That said, I concur with Tanenbaum's reminder to make sure that we apply theory in a way that helps us build systems better.

    I especially liked one slide, which related a story from his own research. One of his students implemented the mkfs program for MINIX using a complex block caching mechanism. The mechanism was so complex that they spent six months making the implementation work correctly. But Tanenbaum estimates that this program "normally runs for about 30 sec[onds] a year". How's that for unnecessary optimization!

    The other part of the talk I liked most was his pairwise comparison of a few old textbooks, to show that some authors had captured and taught timeless principles, while others had mostly taught the implementations of the day. He held up as positive examples Per Brinch Hansen's operating systems text and John Hayes's architecture text. I immediately thought of my favorite data structures book ever, Thomas Standish's Data Structure Techniques.

    I am perhaps biased, as this was the textbook from which I learned data structures as a sophomore in college. In the years that have followed, many, many people have written data structures books, including Standish himself. The trend has been to make these books language-specific ("... in C", "... using Java") or to have them teach other content at the same time, such as object-oriented programming or software engineering principles. Some of these books are fine, but none seem to get to the heart of data structures as well as Standish did in 1980. And this book made me feel like I was studying a serious discipline. Its dark blue cover with spare lettering; its tight, concise text; its small, dense font; its mathematical notation; its unadorned figures... all communicated that I was studying something real, a topic that mattered. I loved writing programs to make its trees bloom and its hash tables avoid collisions.

    A valuable result of the textbook expressing all algorithms using pseudocode is that we had to learn how to write code for ourselves. We thought about the algorithms as algorithms, and then we figured out how to make PL/I programs implement them. (Yes, PL/I. Am I old yet?) We finished the course with both a solid understanding of data structures and a fair amount of experience turning ideas into code.

    Reading through someone's presentation slides can be worth the time even if they can't recreate the talk they shadow.

    Postscript: Who, to my surprise, has a CS2-related paper in the most recent issue of inroads: The SIGCSE Bulletin? Thomas Standish. It discusses a fast sorting algorithm that works in O(n) for certain kinds of data sets. I haven't studied the paper yet, but I do notice that the algorithm is given in... Java. Oh, well.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    January 21, 2006 2:38 PM

    On Presentations, Slides, and Talks

    Someone on the SIGCSE mailing list recently requested a reference for a presentation at a past conference that had suggested we teach concepts that had "staying power". He had looked through past proceedings for the paper with no success.

    It turns out there wasn't a paper, because the presentation had been the 1997 keynote talk by Andrew Tanenbaum, who that year received the SIGCSE award for Outstanding Contributions to Computer Science Education. Fortunately, Prof. Tanenbaum has posted the slides of his talk on his web site.

    Of course, reading Tanenbaum's presentation slides is not the same experience at all as hearing his talk as a live performance. Whenever I come across a conference proceedings, I run through the table of contents to see what all happened at the conference. The titles of the keynote addresses and invited talks always look so inviting, and the speakers are usually distinguished, so I turn to the listed page for a paper on the topic of the presentation... only to find at most a one-page abstract of the talk. Sometimes there is no page number at all, because the proceedings carry no other record of the talk.

    This has made me appreciate very much those invited speakers who write a paper to accompany their talks. Of course, reading a paper is not the same experience at all as hearing a talk live, either. But written text can say so much more than the cute graphics and bullet points that constitute most speakers' presentation slides. And for a talk that is done right -- such as Alan Kay's lectures at OOPSLA 2004 -- the presentation materials are so dynamic that the slides convey even less of the talk's real value. (The best way Alan could share his talk materials would be to make the Squeak image he used available for download!)

    I think that this is why I like to write such complete notes for the talks I attend, to capture as best I can the experience and thoughts I have in real time. Having a blog motivates me, too, as it becomes a distribution outlet that justifies doing the job even better.

    This is also why I like to write detailed lecture notes, à la book chapters, for my courses. I write them as much for me as for my students, though the students give me an immediate reason to write and receive what I hope is a substantial benefit.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    January 16, 2006 12:46 PM

    Chairing Tutorials for OOPSLA 2006

    OOPSLA 2006 Long Logo

    After chairing the OOPSLA Educators' Symposium in 2004 and 2005, I've been entrusted with chairing the tutorials track at OOPSLA 2006. While this may not seem to have the intellectual cachet of the Educators' Symposium, it carries responsibility for a major financial effect on the conference. If I had screwed up an educators' event, I would have made a few dozen people unhappy. If I screw up the tutorials track, I could cost the conference tens of thousands of dollars!

    The call for tutorial proposals is out, with a deadline of March 18. My committee and I will also be soliciting a few tutorials on topics we really want to see covered and from folks we especially want to present. We'd like to put together a tutorial track that does a great job of helping software practitioners and academics get a handle on the most important topics in software development these days, with an emphasis on OO and related technologies. In marketing terms, I think of it as exciting the folks who already know that OOPSLA is a must-attend conference and attracting new folks who should be attending.

    I'd love to hear what you think we should be thinking about. What are the hottest topics out there, the ones we should all be learning about? Is there something on the horizon that everyone will be talking about in October? I'm thinking not only of the buzzwords that define the industry these days, but also of topics that developers really need to take their work to another level.

    Who are the best presenters out there, the ones we should be inviting to present? I'm thinking not only of the Big Names but also of those folks who simply do an outstanding job teaching technical content to professionals. We've all probably attended tutorials where we left the room thinking, "Wow, that was good. Who was that guy?"

    One idea we are considering this year is to offer tutorials that help people prepare for certifications in areas such as Java and the MCSD. Do you think that folks could benefit from tutorials of this sort, or is it an example of trying to do too much?

    Trying to keep a great conference fresh and exciting requires a mix of old ideas and new. It's a lot like revising a good course... only in front of many more eyes!


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    January 14, 2006 3:29 PM

    Just a Course in Compilers

    Soon after I arrived at UNI, my colleague Mahmoud Pegah and I were given the lead in redesigning the department's CS curriculum. We were freshly-minted Ph.D.s with big dreams. We designed a curriculum from scratch based on the ACM's Computing Curricula 1991 guidelines. In a move of perhaps subconscious rebellion, we called one of the upper-level electives "Translation of Programming Languages". This course sat in the position usually occupied by the traditional course in compilers, but at the time we thought that name too limiting. After all, we were Smalltalkers, and our programming environment used a blend of compilation and interpretation, a powerful VM, and all sorts of language processing elements. We were also Unix guys, and the Unix world encourages stepwise transformation of data through pipes made up of simple processors.

    Ever since, we have had to explain the name "Translation of Programming Languages", because no one knows what it means. When we say, "Oh, that's like a course in compilers", everyone nods in satisfaction. Most then say, "Does anyone still write compilers any more?"

    But I still think that our name choice better reflects what that course should be about, and why it is still important as more than just a great programming experience. The rise of refactoring as a standard programming practice over the last 5-7 years has caused a corresponding need for refactoring tools, and these tools rely on well-known techniques from programming languages and compilers. I'm certainly glad that the folks behind IntelliJ IDEA and Eclipse learned how to write language-processing tools somewhere.

    This semester, I am teaching the "compiler course", Translation of Programming Languages. We finally have this course back in the regular rotation of courses we offer, so I get to teach it every so often. (We last offered it in Fall 2003, but we plan to offer it every third semester for the foreseeable future.) I am probably more excited than my students!

    Wirth's Compiler Construction text

    I thought long and hard choosing a textbook. To be honest, I would really have liked to use Niklaus Wirth's Compiler Construction. I love small books with big lessons, and Wirth didn't waste any words or space writing this book. In 173 pages, he teaches us how to build a compiler from beginning to end -- and gives us the full source of his compiler, describes a RISC architecture of his own design for which his compiler generates code, and gives us full source code for a simulator of the architecture. Boom, boom, boom. Of course, he doesn't have a lot of time for theory, so he covers many ideas only at a high level and moves quickly to practical issues.

    Why not choose this book? Well, I had two misgivings. First, I would have to supplement his book with a lot of outside material, both my own lecture notes and other papers. That's not a major problem, as I tend to write up extensive lecture notes, and I rarely follow big textbooks all that closely anyway. But the real killer was when I went to Amazon to check out the book's availability and saw:

    4 used & new available from $274.70

    We may be able to get by on four copies, but... $274.70?

    The standard text is, of course, the Dragon book. There are plenty of copies available at Amazon, and they run a relatively svelte $95.18. (The price of textbooks these days is a subject for another blog entry, another day.) But I have always felt that the Dragon book is a bit too much for juniors and many seniors, who are my primary audience in this course. I do not think that I am contributing to the dumbing down of our curriculum by not using this classic text; indeed, I will draw much of my lecture material from my experiences with this book. But a 15-week course requires some focus, and I don't think that most of our undergraduates will get as much from the course as they might if they get lost in the deep swirls of some of Aho, Sethi, and Ullman's chapters.

    I finally settled on Louden's Compiler Construction: Principles and Practice, with which I have had no prior experience. It seems a better choice for my students, one they might be able to read and learn something from. We'll see.

    I learned a few lessons teaching this course in Fall 2003. One is: less content. If I try to cover even a significant fraction of what we know about scanning, parsing, static analysis, code generation, and optimization, my students won't get a chance to experience building a compiler from beginning to end. This robs them not only of understanding the compiler material at a deeper level but also of the occasional pain and ultimate triumph of building a non-trivial program that works. A 15-week course requires focus.

    A 15-week course in translating programming languages also requires a relatively small source language. In my previous offering, the language the students compiled was simply too big, but in the wrong ways. You don't learn much new from making your scanner and parser recognize a second or third repetition construct; you mostly find yourself just doing more grunt work. I'd rather save that time to get deeper into a later stage of the compiler, or to discuss the notion of syntactic abstractions and how to preprocess a second repetition construct away.
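
    Here is the kind of thing I have in mind, as a rough sketch (the AST classes are invented for illustration, not taken from our course): a repeat-until statement can be rewritten in terms of while before the rest of the compiler ever sees it.

        import java.util.Arrays;
        import java.util.List;

        // A toy AST, just enough to show the idea.
        interface Stmt {}
        interface Expr {}

        class Block implements Stmt { final List<Stmt> stmts; Block(List<Stmt> s) { stmts = s; } }
        class While implements Stmt { final Expr cond; final Stmt body; While(Expr c, Stmt b) { cond = c; body = b; } }
        class Repeat implements Stmt { final Stmt body; final Expr until; Repeat(Stmt b, Expr u) { body = b; until = u; } }
        class Not implements Expr { final Expr e; Not(Expr e) { this.e = e; } }

        class Desugar {
            // repeat S until C   becomes   S; while (!C) S
            static Stmt desugar(Stmt s) {
                if (s instanceof Repeat) {
                    Repeat r = (Repeat) s;
                    return new Block(Arrays.asList(r.body, new While(new Not(r.until), r.body)));
                }
                return s;
            }
        }

    After this rewrite, the later stages of the compiler need to know about only one looping construct.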

    That said, I do want students to do some of the grunt work. I want them to build their own scanners and parsers. Sure, we use parser generators to write these components of most compilers these days, but I want my students to really understand how some of these techniques work and to see that they can implement them and make them fly. I remember the satisfaction I felt when I wrote my first LL parser and watched it recognize a simple sorting program's worth of tokens.
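
    For students who haven't had that experience yet, here is the flavor of it in miniature: a hand-written recursive-descent (LL) parser for simple arithmetic expressions. This is a toy grammar of my own devising, not the one we will use in class.

        // Grammar:  expr   -> term (('+'|'-') term)*
        //           term   -> factor (('*'|'/') factor)*
        //           factor -> DIGIT+ | '(' expr ')'
        public class TinyParser {
            private final String input;
            private int pos = 0;

            public TinyParser(String input) { this.input = input; }

            private char peek() { return pos < input.length() ? input.charAt(pos) : '\0'; }

            private void expect(char c) {
                if (peek() != c) throw new RuntimeException("expected '" + c + "' at " + pos);
                pos++;
            }

            private void factor() {
                if (peek() == '(') { expect('('); expr(); expect(')'); }
                else if (Character.isDigit(peek())) { while (Character.isDigit(peek())) pos++; }
                else throw new RuntimeException("unexpected '" + peek() + "' at " + pos);
            }

            private void term() { factor(); while (peek() == '*' || peek() == '/') { pos++; factor(); } }

            private void expr() { term(); while (peek() == '+' || peek() == '-') { pos++; term(); } }

            public void parse() {
                expr();
                if (pos != input.length()) throw new RuntimeException("trailing input at " + pos);
            }

            public static void main(String[] args) {
                new TinyParser("(1+2)*34-5").parse();
                System.out.println("parsed successfully");
            }
        }

    Each nonterminal in the grammar becomes one method, and the parser decides what to do next by peeking at a single character of lookahead -- the essence of LL parsing.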

    Last time I used a source language I home-brewed from a colleague's course. This time, I am going to have my students process a subset of Oberon, based in part on Wirth's now out-of-print book. It strikes a nice balance between small enough and large enough, and has a nice enough syntax to work with for a semester.

    Now that I have administrative duties, I teach only one course a semester. The result is that I get even more psyched about the course, because it is my best chance to get my hands dirty in real computer science during the term, to think deeply in the discipline. It also gives me a chance to write code. The thing I missed most last semester, my first as department head, was having time to program. This course offers even more: a chance to get back to a recently-dormant project, a refactoring tool for Scheme. So, I'm psyched.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    January 06, 2006 4:02 PM

    ... But You Doesn't Have to Call Me Lefschetz

    In the last two days, I have run across references to John von Neumann twice.

    First, I was reading The Geomblog yesterday and found this:

    It reminds me of a quote attributed to John von Neumann:

    In mathematics you don't understand things. You just get used to them.

    I've had that feeling in computer science... A few months ago I described something similar, but in that case I did come to understand the course material; it only seemed as if I never would. My "just get used to it" experiences came in an area right up Suresh's alley: Computational Complexity. I loved that class, but I always felt like I was swimming in the dark -- even as I did well enough in the course.

    Then today I was cleaning out a folder of miscellaneous notes and found a clipping from some long-forgotten article.

    In Princeton's Fine Hall, someone once posted a "Scale of Obviousness":

    • If Wedderburn says it's obvious, everybody in the room has seen it ten minutes ago.

    • If Bohnenblust says it's obvious, it's obvious.

    • If Bochner says it's obvious, you can figure it out in half an hour.

    • If von Neumann says it's obvious, you can prove it in three months if you are a genius.

    • If Lefschetz says it's obvious, it's wrong.

    I'll venture to say that students at every institution occasionally make such lists and discuss them with their friends, even if they are too polite to post them in public. That's good for the egos of us faculty members. In our fantasies, we are all von Neumanns. In reality, most of us are Bohnenblusts at best and more likely Wedderburns. And we all have our Lefschetz days.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    December 29, 2005 5:35 AM

    You Have to Write the Program

    As we close out 2005, an article in MSU Today reminds us why scientists have to run experiments, not just sit in their easy chairs theorizing. A group of physicists was working on a new mix of quarks and gluons. Their theory predicted that they were creating a plasma with few or no interactions between the particles. When they ran their experiments, they instead "created a new state of matter, a nearly perfect fluid in which the quarks and gluons interact strongly." Gary Westfall, the lead MSU researcher on the project, said,

    What is so incredibly exciting about this discovery is that what we found turned out to be totally different from what we thought we would find. ... But it shows that you cannot just rely on making theories alone. In the end, you have to build the machine and do the experiment. What you learn is often more beautiful than the most vivid imagination.

    Folks who don't write computer programs often say that computers only do what we tell them to do, implying somehow that their mechanistic nature makes them uninteresting. Anyone who writes programs, though, has had the sort of experience that Gary Westfall describes. What we learn by writing a program and watching it is often more beautiful than anything we could ever have imagined.

    I love this game.

    Best wishes to you all for a 2006 full of vivid imagination and beautiful programs.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    December 22, 2005 12:18 PM

    Joining the Present

    Yesterday I received e-mail from the chair of a CS education conference. For some reason, this snippet caught my eye:

    We encourage you to visit the conference website
    [URL]

    on a regular basis for the latest information about [the conference].

    My first thought was, "You need an RSS feed!" I am not very good about remembering to check a web site on a regular basis, at least in part because there are so many web sites in which I am interested. The result: I tend to miss out on the latest information. But with a subscription feed, my newsreader reminds me to check the sites that have new content.

    And this e-mail was for a conference in computer science education. A conference of techies, right? Why haven't they joined the 21st century?

    My second thought was, "Physician, heal thyself!" I do the same thing to my students. Here is a snippet from the home page for my fall course:

    Welcome to your portal into the world of 810:154 Programming Languages and Paradigms. These pages will complement what you find in class. You will want to check the "What's New" section often -- even when I don't mention changes in class -- to see what is available.

    Can I really expect students to check the site on their own? At least I put up lecture notes (with code) twice a week and homework once a week to create some 'pull'. But students are pulled in many different directions, and maybe a little push would help.

    This raises a question: How many of my students use a newsreader or RSS-enabled web browser these days? Offering a news feed will only improve the situation if these folks take advantage of the feed. So I will have pushed the problem from one required habit to another, but at least it's a habit that consolidates multiple problems into one, and a habit that is growing in its reach. But beginning next semester, I'll ask my students if and how they use news feeds, and encourage them to give it a try.

    And I will offer a feed for my course web site. Perhaps that will help a few students stay on top of the game and not miss out on the latest information.

    You would think that we computer scientists would not be so behind the technological curve. Shouldn't we be living just a bit in the future more often? You know what they say about the cobbler's children...


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    December 21, 2005 5:07 PM

    Experiments in Art and Software

    Double Planet, by Pyracantha

    Electron Blue recently wrote about some of her experiments in art. As an amateur student of physics, she knows that these experiments are different from the experiments that scientists most often perform. She doesn't always start with a "hypothesis", and when she gets done it can be difficult to tell if the experiment was a "success" or not. Her experiments are opportunities to try ideas, to see whether a new technique works out. Sometimes, that's easy to see, as when the paint of a base image dries with a grainy texture that doesn't fit the image or her next stage. Other times, it comes down to her judgment about balance or harmony.

    This is quite unlike many science experiments, but I think it has more in common with science than may at first appear. And I think it is very much like what programmers and software developers do all the time.

    Many scientific advances have resulted from what amounts to "trying things out", even without a fixed goal in mind. On my office wall, I have a wonderful little news brief called "Don't leave research to chance", taken from some Michigan State publication in the early 1990s. The article is about some work by Robert Root-Bernstein, an MSU science professor who in the 1980s spent time as a MacArthur Prize fellow studying creativity in the sciences. In particular, it lists ten ways to increase one's chances of serendipitously encountering valuable new ideas. Many of these are strict matters of technique, such as removing background "noise" that everyone else accepts or varying experimental conditions or control groups more widely than usual. But others fit the art experiment mold, such as running a reaction backward, amplifying a side reaction, or doing something else "unthinkable" just to see what happens. The world of science isn't always as neat as it appears from the outside.

    And certainly we software developers explore and play in a way that an artist would recognize -- at least we do when we have the time and freedom to do so. When I am learning a new technique or language or framework, I frequently invoke the Three Bears Pattern that I first learned from Kent Beck via one of the earliest pedagogical patterns workshops. One of the instantiations of this pattern is to use the new idea everywhere, as often and as much as you can. By ignoring boundaries, conventional wisdom, and pat textbook descriptions of when the technique is useful, the developer really learns the technique's strengths and weaknesses.

    I have a directory called software/playground/ that I visit when I just want to try something out. This folder is a living museum of some of the experiments I've tried. Some are as mundane as learning some hidden feature of Java interfaces, while others are more ambitious attempts to see just how far I can take the Model-View-Controller pattern before the resulting pain exceeds the benefits. Just opportunities to try an idea, to see how a new technique works out.

    My own experience is filled with many other examples. A grad student and I learned pair programming by giving it a whirl for a while to see how it felt. And just a couple of weeks ago, on the plane to Portland for the OOPSLA 2006 fall planning meeting, I whipped up a native Ook! interpreter in Scheme -- just because. (There is still a bug in it somewhere... )

    Finally, I suspect that web designers experiment in much the way that artists do when they have ideas about layout, design, and usability. The best way to evaluate the idea is often to implement it and see what real users think! This even fits Electron Blue's ultimate test of her experiments: How do people react to the work? Do they like it enough to buy it? Software developers know all about this, or should.

    One of the things I love most about programming is that I have the power to write the code -- to make my ideas come alive, to watch them in animated bits on the screen, to watch them interacting with other people's data and ideas.

    As different as artists and scientists and software developers are, we all have some things in common, and playful experimentation is one.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

    November 23, 2005 1:46 PM

    This and That, from the Home Front

    The daily grind of the department office has descended upon me the last couple of weeks, which with the exception of two enjoyable talks (described here and here) have left me with little time to think in the way that academics are sometimes privileged. Now comes a short break full of time at home with family.

    Here are a few things that have crossed my path of late:

    • Belated "Happy Birthday" to GIMP, which turned 10 on Monday, November 21, 2005. There is a lot of great open-source software out there, much of which is older than GIMP, but there's something special to me about this open-source program for image manipulation. Most of the pros use Photoshop, but GIMP is an outstanding program for a non-trivial task that shows how far an open-source community can take us. Check out the original GIMP announcement over at Google Groups.

    • Then there is this recently renamed oldie but goodie on grounded proofs. My daughters are at the ages where they can appreciate the beauty of math, but their grade-school courses can do only so much. Teaching them bits and pieces of math and science at home, on top of their regular work, is fun but challenging.

      The great thing about explaining something to a non-expert is that you have to actually understand the topic.

      Content and method both matter. Don't let either the education college folks or the "cover all the material" lecturers from the disciplines tell you otherwise.

    • Very cool: an on-line version of John von Neumann's Theory of Self-Reproducing Automata.

    • Finally, something my students can appreciate as well as I:

      If schedule is more important than accuracy, then I can always be on time.

      Courtesy of Uncle Bob, though I disagree with his assumption that double-entry bookkeeping is an essential practice of modern accounting. (I do not disagree with the point he makes about test-driven development!) Then again, most accountants hold double-entry bookkeeping in nearly religious esteem, and I've had to disagree with them, too. But one of my closest advisors as a graduate student, Bill McCarthy, is an accountant with whom I can agree on this issue!


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development, Teaching and Learning

    November 23, 2005 1:30 PM

    More on "We're Doomed"

    Ernie's 3D Pancakes picked up on the We're Doomed theme from our OOPSLA panel. There are many good blogs written by theoretical computer scientists; I especially enjoy 3D Pancakes, Computational Complexity, and The Geomblog. Recently, the CS theory community has been discussing how to communicate the value and importance of theory and algorithms to the broader CS community, the research funding agencies, and the general public, and they've struck on some of the same ideas some of my colleagues and I have been batting around concerning CS more generally. Check those blogs out for some thought-provoking discussions.

    Anyway, I found the comments on Ernie's entry that quotes me quoting Owen to be worth reading. It's good to be reminded occasionally how diverse the set of CS programs is. Michael Stiber's comment (sixth in the list) points out that CS departments have themselves to blame for many of these problems. One of my department colleagues was just at my door talking about missed opportunities to serve the university community with computing courses that matter to them. Pretty soon, we see courses like this filling a very mainstream corner of the market, and people in other departments hungering for courses in the newly-developed markets that Owen points out.

    "How may of us really need to rewrite BLAS, LAPACK, etc., routines?"

    None. But how many students are taught to write them anyway?

    And this quote speaks to the much simpler issue of how to revise our curriculum for majors. How much tougher it is for us to re-imagine what we should be doing for non-computer scientists and then to figure out how to do it.

    I just realized that by "simple" I mean that we computer scientists at least have some control over our own domain. In many ways, the task of reforming the major curriculum is tougher due to the tight cultural constraints of our community. I imagine that CS is no different than any discipline in this regard. We are a young discipline and used to the rapid change of technology -- perhaps we can find a way to become more nimble. Certainly, having the conversation is a first step.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    November 18, 2005 9:37 PM

    Teaching as Subversive Inactivity

    Two enjoyable talks in one week -- a treat!

    On Tuesday, I went to the 2005 CHFA Faculty Excellence Award lecture. (CHFA is UNI's College of Humanities and Fine Arts.) The winner of the award was Roy Behrens, whose name long-time readers of this blog may recognize from past entries on non-software patterns of design and 13 Books. Roy is a graphic arts professor at my university, and his talk reflected both his expertise in design and the style that contributed to his winning an award for faculty excellence. He didn't use PowerPoint in that stultifying bullet-point way that has afflicted science and technology talks for the last decade or more... His slides used high-resolution images of creative works and audio to create a show that amplified his words. They also demonstrated a wonderful sense of "visual wit".

    The title of the talk was teaching as a SUBVERSIVE INACTIVITY: a miscellany, in homage to Neil Postman's famous book. When he was asked to give this talk, he wondered what he should talk about -- how to teach for 34 years without burning out? He decided to share how his teaching style has evolved away from being the center of attention in the classroom toward giving students the chance to learn.

    The talk opened with pivotal selections from works that contributed to his view on teaching. My favorite from the bunch came from "The Cult of the Fact" by Liam Hudson, a British psychologist: The goal of the teacher is to

    ... transmit an intellectual tradition of gusto, and instill loyalty to it, ...

    Behrens characterized his approach to teaching in terms of Csikszentmihalyi's model of Flow: creativity and productivity happen when the students' skills are within just the right range of the challenges given to them.

    (After seeing this talk, I'm almost afraid to use my homely line art in a discussion of it. Roy's images were so much better!)

    He called this his "Goldilocks Model", the attempt to create an environment for students that maximizes their chance to get into flow.

    What followed was a collage of images and ideas. I enjoyed them all. Here are three key points about teaching and learning from the talk.

    Aesthetic and Anesthetic

    What Csikszentmihalyi calls flow is roughly comparable to what we have historically called "aesthetic". And in its etymological roots, the antonym of 'aesthetic' is anesthetic. What an interesting juxtaposition in our modern language!

    In what ways can the atmosphere of our classrooms be anesthetic?

    extreme similarity ... HUMDRUM ... monotony
    extreme difference ... HODGEPODGE ... mayhem

    We often think of boredom as a teaching anesthetic, but it's useful to trace this back to the possibility that the boredom results from a lack of challenge. Even more important is to remember that too much challenge, too much activity, what amounts to too much distraction also serves as an anesthetic. People tend to tune out when they are overstimulated, as a coping mechanism. I am guessing that when I bore students the most, it's more likely to be from a mayhem of ideas than a monotony. ("Let's sneak in one more idea...")

    Behrens is a scholar of patterns, and he has found it useful to teach students patterns -- linguistic and visual -- suitable to their level of development, and then turn them loose in the world. Knowing the patterns changes our vision; we see the world in a new way, as an interplay of patterns.

    Through patterns, students see style and begin to develop their own. 'Style' is often maligned these days as superficial, but the idea of style is essential to understanding designs and thinking about creating. That said, style doesn't determine quality. One can find quality in every genre of music, of painting. There is something deeper than style. Teaching our principles of programming languages course this semester as I am, I hope that my students are coming to understand this. We can appreciate beautiful object-oriented programs, beautiful functional programs, beautiful logic programs, and beautiful procedural programs.

    Creativity as Postmodern

    Behrens didn't use "postmodern", but that's my shorthand description of his idea, in reference to ideas like the scrapheap challenge.

    During the talk, Behrens several times quoted Arthur Koestler's The Act of Creation. Here's one:

    The creative process is an "unlikely marriage of cabbages and kings -- of previously unrelated frames of reference or universes of discourse -- whose union will solve the previously insoluble problem." -- Koestler

    Koestler's "cabbages and kings" is an allusion to a nonsense poem in Alice in Wonderland. (Remember what Alan Perlis said about "Alice": The best book on programming for the layman ...; but that's because it's the best book on anything for the layman.") Koestler uses the phrase because Carroll's nonsense poem is just the sort of collage of mismatched ideas that can, in his view, give rise to creativity.

    Humans don't create anything new. They assemble ideas from different contexts to make something different. Creativity is a bisociation, a "sort crossing", as opposed to analytic intelligence, which is an association, a "sort-matching".

    We have to give students the raw material they need to mix and match, to explore new combinations. That is why computer science students should learn lots of different programming languages -- the more different, the better! They should study lots of different ideas, even in courses that are not their primary interest: database, operating systems, compilers, theory, AI, ... That's how we create the rich sea of possibilities from which new ideas are born.

    Problems, Not Solutions

    If we train them to respond to problems, what happens when the problem giver goes away? Students need to learn to find and create problems!

    In his early years, Behrens feared giving students examples of what he wanted, at the risk of "limiting" their creativity to what they had seen. But examples are critical, because they, too, give students the raw material they need to create.

    His approach now is to give students interesting and open problems, themes on which to work. Critique their products and ideas, frequently and openly. But don't sit by their sides while they do things. Let them explore. Sometimes we in CS tend to hold students' hands too much, and the result is often to turn what is fun and creative into tedious drudgery.

    I'm beginning to think that one of the insidious ingredients in students' flagging interest in CS and programming is that we have taken the intellectual challenge out of learning to program and replaced it with lots of explanation, lots of text talking about technical details. Maybe our reasons for doing so seemed on the mark at the time -- I mean, C++ and Java are pretty complex -- but the unintended side effects have been disastrous.

    ----

    I greatly enjoyed this talk. One other good thing came out of the evening: after 13 years working on the same campus, I finally met Roy, and we had a nice chat about some ideas at the intersection of our interests. This won't be the last time we cross paths this year; I hope to present a paper at his Camouflage: Art, Science and Popular Culture conference, on the topic of steganography.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

    November 08, 2005 3:00 PM

    An Index to the OOPSLA Diaries

    I have now published the last of my entries intended to describe the goings-on at OOPSLA 2005. As you can see from both the number and the length of entries I wrote, the conference provided a lot of worthwhile events and stimulated a fair amount of thinking. Given the number of entries I wrote, and the fact that I wrote about single days over several different entries and perhaps several weeks, I thought that some readers might appreciate a better-organized index into my notes. Here it is.

    Of course, many other folks have blogged on the OOPSLA'05 experience, and my own notes are necessarily limited by my ability to be in only one place at a time and my own limited insight. I suggest that you read far and wide to get a more complete picture. First stop is the OOPSLA 2005 wiki. Follow the link to "Blogs following OOPSLA" and the conference broadsheet, the Post-Obvious Double Dispatch. In particular, be sure to check out Brian Foote's excellent color commentary, especially his insightful take on the software devolution in evidence at this year's conference.

    Now, for the index:

    Day 1

    Day 2

    Day 3

    Day 4

    Day 5

    This and That

    I hope that this helps folks navigate my various meanderings on what was a very satisfying OOPSLA.

    Finally, thanks to all of you who have sent me notes to comment on these postings. I appreciate the details you provide and the questions you ask...

    Now, get ready for OOPSLA 2006.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Patterns, Software Development, Teaching and Learning

    November 04, 2005 5:16 PM

    Simplicity and Humility in Start-Ups

    Two of Paul Graham's latest essays, Ideas for Startups and What I Did This Summer, echo ideas about simplicity, exploration, and conventional wisdom that Ward Cunningham talked about in his Educators' Symposium keynote address. Of course, Graham speaks in the context of launching a start-up company, but I don't think he sees all that much difference between that and other forms of exploration and creation.

    Ideas for Startups focuses first on a problem that many people face when trying to come up with a Big Idea: they try to come up with a Big Idea. Instead, Graham suggests asking a question...

    Treating a startup idea as a question changes what you're looking for. If an idea is a blueprint, it has to be right. But if it's a question, it can be wrong, so long as it's wrong in a way that leads to more ideas.

    Humility. Simplicity. Learn something along the way. This is just the message that Ward shared.

    Later, he speaks of how to "do" simplicity...

    Simplicity takes effort -- genius, even. ... It seems that, for the average engineer, more options just means more rope to hang yourself.

    In this regard, Ward offers more hope to the rest of us. Generating a truly great something may require genius; maybe not. But in any case, ordinary people can act in a way that biases their own choices toward simplicity and humility, and in doing so learn a lot and create good programs (or whatever). That's what things like CRC cards, patterns, and XP are all about. I know a lot of people think that XP requires Superhero programmers to succeed, but I know lots of ordinary folks who benefit from the agile mindset. Take small steps, make simple choices where possible, and get better as you go.

    I am not sure how pessimistic Graham really is here about achieving simplicity, but his writing often leaves us thinking he is. But I prefer to read his stuff as "Yeah, I can do that. How can I do that?" and take good lessons from what he writes.

    Then, in "What I Did This Summer", Graham relates his first experience with the Summer Founders Program, bankrolling a bunch of bright, high-energy, ambitious your developers. Some of the lessons his proteges learned this summer are examples of what Ward told us. For example, on learning by doing:

    Another group was worried when they realized they had to rewrite their software from scratch. I told them it would be a bad sign if they didn't. The main function of your initial version is to be rewritten.

    This is an old saw, one I'm surprised that the SFP needed to learn. Then again, we often know something intellectually but, until we experience it, it's not our own yet.

    But I really liked the paragraphs that came next:

    That's why we advise groups to ignore issues like scalability, internationalization, and heavy-duty security at first. I can imagine an advocate of "best practices" saying these ought to be considered from the start. And he'd be right, except that they interfere with the primary function of software in a startup: to be a vehicle for experimenting with its own design. Having to retrofit internationalization or scalability is a pain, certainly. The only bigger pain is not needing to, because your initial version was too big and rigid to evolve into something users wanted.

    I suspect this is another reason startups beat big companies. Startups can be irresponsible and release version 1s that are light enough to evolve. In big companies, all the pressure is in the direction of over-engineering.

    Ward spoke with great feeling about being willing to settle for the incomplete, about ignoring some things you probably shouldn't ignore, about disobeying conventional wisdom -- all in the service of keeping things simple and being able to see patterns and ideas that are obscured by the rules. We are conditioned to think of these behaviors as irresponsible, but they may in fact point us in the most likely direction of success.

    ----

    I also found one really neat idea to think about from reading these papers, one that is independent of Ward Cunningham's talk. Graham was talking about doodling and what its intellectual equivalent is, because doodling is such a productive way for visual artists to let their minds wander. Technical innovators need to let their minds wander so that they can stumble upon newly synthesized ideas, where a common frame of reference is applied to some inappropriate data. This can be the source of analogies that help us to see anew.

    Out of this discussion, his programmer's mind created a programming analogy:

    That's what a metaphor is: a function applied to an argument of the wrong type.

    What a neat idea, both as a general characterization of metaphor and also as a potential source of ideas for programming languages. What might a program gain from the ability to reason about a function applied to invalid arguments? On its face, that's an almost meaningless question, but that's part of Graham's -- and Ward's -- point. What may come from such a thought? I want to think more.

    I don't know that I have any more time than Graham to spend on this particular daydream...


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development

    October 31, 2005 7:19 PM

    "Mechanistic"

    In my recent post on Gerry Sussman's talk at OOPSLA, I quoted Gerry Sussman quoting concert pianist James Boyk, and then commented:

    A work of art is a machine with an aesthetic purpose.

    (I am uncomfortable with the impression these quotes give, that artistic expression is mechanistic, though I believe that artistic work depends deeply on craft skills and unromantic practice.)

    Thanks to the wonders of the web, James came across my post and responded to my parenthetical:

    You may be amused to learn that fear of such comments is the reason I never said this to anyone except my wife, until I said it to Gerry! Nevertheless, my remark is true. It's just that word "machine" that rings dissonant bells for many people.

    I was amused... I mean, I am a computer scientist and an old AI researcher. The idea of a program, a machine, being beautiful or even creating beauty has been one of the key ideas running through my entire professional life. Yet even for me the word "machine" conjured up a sense that devalued art. This was only my initial reaction to Sussman's sentiment, though. I also felt an almost immediate need to mince my discomfort with a disclaimer about the less romantic side of creation, in craft and repetition. I must be conflicted.

    James then explained the intention underlying his use of the mechanistic reference in way that struck close to home for me:

    I find the "machine" idea useful because it leads the musician to look for, and expect to find, understandable structures and processes in works of music. This is productive in itself, and at the same time, it highlights the existence and importance of those elements of the music that are beyond this kind of understanding.

    This is an excellent point, and it sheds light on other domains of creation, including software development. Knowing and applying programming patterns helps programmers both to seek and recognize understandable structures in large programs and to recognize the presence and importance of the code that lies outside of the patterns. This is true even -- especially!? -- for novice programmers, who are just beginning to understand programs and their structure, and the process of reading and writing them. Much of the motivation for work on the use of elementary patterns in instruction comes from trying to help novices learn to comprehend masses of code that at first glance may seem but a jumble but which in fact bear a lot of structure within them. Recognizing code that is and isn't part of recurring structure, and understanding the role both play, is an essential skill for the novice programmer to learn.

    Folks like Gerry Sussman and Dick Gabriel do us all a service by helping us to overcome our discomfort when thinking of machines and beauty. We can learn something about science and about art.

    Thanks to James for following up on my post with his underlying insight!


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

    October 27, 2005 7:56 PM

    OOPSLA Day 5: Grady Booch on Software Architecture Preservation

    Grady Booch, free radical

    Grady Booch refers to himself as an "IBM fellow and free radical". I don't know if 'free radical' is part of his job description or only a self-appellation, but it certainly fits his roving personality. He is a guy with many deep interests and a passion for exploring new lands.

    His latest passion is the Handbook of Software Architecture, a project that many folks think is among the most important strategic efforts for the history and future of software development.

    Booch opened his invited talk at OOPSLA by reminding everyone that "classical science advances via the dance between quantitative observation and theoretical construction." The former is deliberate and intentional; the latter is creative and testable. Computer science is full of empirical observation and the construction of theories, but in the world of software we often spend all of our time building artifacts and not enough time doing science. We have our share of theories, about process and tools, but much of that work is based on anecdote and personal experience, not the hard, dispassionate data that reflects good empirical work.

    Booch reminisced about a discussion he had with Ralph Johnson at the Computer Museum a few years ago. They did a back-of-envelope calculation that estimated the software industry had produced approximately 1 trillion lines of code in high-level languages since the 1950s -- yet little systematic empirical study had been done of this work. What might we learn from digging through all that code? One thing I feel pretty confident of: we'd find surprises.

    In discussing the legacy of OOPSLA, Booch mentioned one element of the software world launched at OOPSLA that has taken seriously the attempt to understand real systems: the software patterns community, of which Booch was a founding father. He hailed patterns as "the most important contribution of the last 10-12 years" in the software world, and I imagine that his fond evaluation rests largely on the patterns community's empirical contribution -- a fundamental concern for the structure of real software in the face of real constraints, not the cleaned up structures and constraints of traditional computer science.

    We have done relatively little introspection into the architecture of large software systems. We have no common language for describing architectures, no discipline that studies software in a historical sense. Occasionally, people publish papers that advance the area -- one that comes to mind immediately is Butler Lampson's Hints for Computer System Design -- but these are individual efforts, or ad hoc group efforts.

    The other thing this brought to my mind was my time in an undergraduate architecture program. After the first-year course, every archie took courses in the History of Housing, in which students learned about existing architecture, both as a historical matter and to inform current practice. My friends became immersed in what had been done, and that certainly gave them a context in which to develop their own outlook on design. (As I look at the program's current curriculum, I see that the courses have been renamed History of Architecture, which to me trades the rich flavor of houses for the more generic 'architecture', even if it more accurately reflects the breadth of the courses.)

    Booch spent the next part of his talk comparing software architecture to civil architecture. I can't do justice to this part of his talk; you should read the growing volume of content on his web site. One of his primary distinctions, though, involved the different levels of understanding we have about the materials we use. The transformation from vision to execution in civil systems is not different in principle from that in software, but we understand more about the physical materials that a civil architect uses than we do about a software developer's raw material. Hence the need to study existing software systems more deeply.

    Civil architecture has made tremendous progress over the years in its understanding of materials, even if the scale of its creations has not grown commensurately beyond what the ancients built. But the discipline has a legacy of studying the works of the masters.

    Finally he listed a number of books that document patterns in physical and civil systems, including The Elements of Style -- not the Strunk and White version -- and the books of Christopher Alexander, the godfather of the software patterns movement.

    Booch's goal is for the software community to document software architectures in as much detail as civil architecture has documented its buildings, both for history's sake and for the patterns that will help us create more and more beautiful systems. His project is one man's beginning, and an inspirational one at that. In addition to documenting the architecture of classic systems such as MacPaint, he aims to preserve the classic software itself as well. That will enable us to study and appreciate it in new ways as our understanding of computing and software grows.

    He closed his talk with inspiration but also a note of warning... He told the story of contacting Edsger Dijkstra to tell him about the handbook project and seek his aid in the form of code and papers and other materials from Dijkstra's personal collection. Dijkstra supported the project enthusiastically and pledged materials from his personal library -- only to die before the details had been formalized. Now, Booch must work through the Dijkstra estate in hopes of collecting any of the material pledged.

    We are a young discipline, relatively speaking, but time is not on our side.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

    October 24, 2005 7:36 PM

    OOPSLA Day 3: Sussman on Expressing Poorly-Understood Ideas in Programs

    Gerald Sussman is renowned as one of the great teachers of computing. He co-authored the seminal text Structure and Interpretation of Computer Programs, which many folks -- me included -- think is the best book ever written about computer science. Along with Guy Steele, he wrote an amazing series of papers, collectively called the "Lambda papers", that taught me as much as any other source about programming and machines. It also documented the process that created Scheme, one of my favorite languages.

    Richard Gabriel introduced Sussman before his talk with an unambiguous statement of his own respect for the presenter, saying that when Sussman speaks, "... record it at 78 and play it back at 33." In inimitable Gabriel fashion, he summarized Sussman's career as, "He makes things. He thinks things up. He teaches things."

    Sussman opened by asserting that programming is, at its foundation, a linguistic phenomenon. It is a way in which we express ourselves. As a result, computer programs can be both prose and poetry.

    In programs, we can express different kinds of "information":

    • knowledge of the world as we know it
    • models of possible worlds
    • structures of beauty
    • emotional content

    The practical value that we express in programs sometimes leads to the construction of intellectual ideas, which ultimately makes us all smarter.

    Sussman didn't say anything particular about why we should seek to express beauty and emotional content in programs, but I can offer a couple of suggestions. We are more likely to work harder and deeper when we work on ideas that compel us emotionally. This is an essential piece of advice for graduate students in search of thesis topics, and even undergrads beginning research. More importantly, I think that great truths possess a deep beauty. When we work on ideas that we think are beautiful, we are working on ideas that may ultimately pay off with deeper intellectual content.

    Sussman then showed a series of small programs, working his way up the continuum from the prosaic to the beautiful. His first example was a program he called "useful only", written in the "ugliest language I have ever used, C". He claimed that C is ugly because it is not expressive enough (there are ideas we want to express that we cannot express easily or cleanly) and because "any error that can be made can be made in C".

    His next example was in a "slightly nicer language", Fortran. Why is Fortran less prosaic than C? It doesn't pretend to be anything more than it is. (Sussman's repeated characterization of C as inelegant and inexpressive got a rise out of at least one audience member, who pointed out afterwards that many intelligent folks like C and find it both expressive and elegant. Sussman agreed that many intelligent folks do, and acknowledged that there is room for taste in such matters. But I suspect that he believes that folks who believe such things are misguided or in need of enlightenment. :-)

    Finally Sussman showed a program in a language we all knew was coming, Scheme. This Scheme program was beautiful because it allows us to express the truth of the domain: that states and differential states are computed by functions that can be abstracted away, that states are added and subtracted just like numbers. So, the operators + and - must be generic across whatever value set we wish to compute over at the time.

    In Scheme, there is nothing special about + or -. We can define them to mean what they mean in the domain where we work. Some people don't like this, because they fear that in redefining fundamental operations we will make errors. And they are right! But addition can be a fundamental operation in many domains with many different meanings; why limit ourselves? Remember what John Gribble and Robert Hass told us: you have to take risks to create something beautiful.
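
    To make the idea concrete, here is a minimal sketch of it in Python rather than Scheme -- my own illustration, not Sussman's code -- in which states add and subtract componentwise, so that a numerical stepper written only in terms of + and a generic scaling helper neither knows nor cares whether it is handed a plain number or a whole system state:

        from dataclasses import dataclass

        @dataclass
        class State:
            values: tuple                      # e.g., (position, velocity)

            def __add__(self, other):
                return State(tuple(a + b for a, b in zip(self.values, other.values)))

            def __sub__(self, other):
                return State(tuple(a - b for a, b in zip(self.values, other.values)))

        def scale(x, k):
            # generic scaling: works for plain numbers and for States alike
            if isinstance(x, State):
                return State(tuple(k * a for a in x.values))
            return k * x

        def euler_step(state, derivative, dt):
            # written only in terms of + and scale; works for floats or States
            return state + scale(derivative(state), dt)

        # a falling body: d(position)/dt = velocity, d(velocity)/dt = -g
        def falling(state):
            position, velocity = state.values
            return State((velocity, -9.8))

        s = State((100.0, 0.0))
        for _ in range(10):
            s = euler_step(s, falling, 0.1)
        print(s)                               # roughly State(values=(95.59, -9.8))

    The same euler_step works unchanged if handed a bare float and a numeric derivative function, which is the point: the operators, not the types, carry the meaning.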

    This point expresses what seemed to be a fulcrum point in Sussman's argument: Mathematics is a language, not a set of tools. It is useful to us to the extent that we can express the ideas that matter to us.

    Then Sussman showed what many folks consider to be among the most beautiful pieces of code ever written, if not the most beautiful: Lisp's eval procedure written in Lisp. This may be as close as computer science comes to Maxwell's equations.
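
    For readers who have never seen it, the shape of the idea is easy to sketch. What follows is not the metacircular evaluator itself, just a toy Scheme-like subset evaluated in Python (my own sketch, offered only to show the shape of the thing); the real beauty of Lisp's eval is that it is roughly this short while being written in the very language it interprets:

        import operator

        def evaluate(expr, env):
            """Evaluate a tiny Scheme-like expression given as nested lists."""
            if isinstance(expr, str):                 # variable reference
                return env[expr]
            if not isinstance(expr, list):            # literal, e.g. a number
                return expr
            op, *args = expr
            if op == 'quote':                         # (quote e) -> e, unevaluated
                return args[0]
            if op == 'if':                            # (if test then else)
                test, then, alt = args
                return evaluate(then if evaluate(test, env) else alt, env)
            if op == 'lambda':                        # (lambda (params) body)
                params, body = args
                return lambda *vals: evaluate(body, {**env, **dict(zip(params, vals))})
            fn = evaluate(op, env)                    # application: evaluate and apply
            return fn(*[evaluate(a, env) for a in args])

        global_env = {'+': operator.add, '-': operator.sub, '*': operator.mul}

        # ((lambda (x) (* x x)) (+ 2 3))  =>  25
        print(evaluate([['lambda', ['x'], ['*', 'x', 'x']], ['+', 2, 3]], global_env))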

    This is where Sussman got to the key insight of his talk, the insight that has underlain much of his intellectual contribution to our world:

    There are some things we could not express until we invented programming.

    Here Sussman distinguished two kinds of knowledge about the world, declarative knowledge and imperative knowledge. Imperative knowledge is difficult to express clearly in an ambiguous language, which all natural languages are. A programming language lets us express such knowledge in a fundamentally new way. In particular, computer programs improve our ability to teach students about procedural knowledge. Most every computer science student has had the experience of getting some tough idea only after successfully programming it.

    Sussman went further to state baldly, "Research that doesn't connect with students is a waste." To the extent that we seek new knowledge to improve the world around us, we must teach it to others, so I suppose that Sussman is correct.

    Then Sussman clarified his key insight, distinguishing computer programs from traditional mathematics. "Programming forces one to be precise and formal, without being excessively rigorous." I was glad that he then said more specifically what he means here by 'formal' and 'rigorous'. Formality refers to lack of ambiguity, while rigor refers to what a particular expression entails. When we write a program, we must be unambiguous, but we do not yet have to understand the full implication of what we have written.

    When we teach students a programming language, we are able to have a conversation with them of the sort we couldn't have before -- about any topic in which procedural knowledge plays a central role. Instead of trying to teach students to abstract general principles from the behavior of the teacher, a form of induction, we can now give them a discursive text that expresses the knowledge directly.

    In order to participate in such a conversation, we need only know a few computational ideas. One is the lambda calculus. All that matters is that you have a uniform system for naming things. "As anyone who has studied spirituality knows, if you give a name to a spirit, you have power over it." So perhaps the most powerful tool we can offer in computing is the ability to construct languages quickly. (Use that to evaluate your favorite programming language...)

    Sussman liked the Hass lecture, too. "Mr. Hass thinks very clearly. One thing I've learned is that all people, if they are good at what they do, whatever their area -- they all think alike." I suspect that this accounts for why many of the OOPSLA crowd enjoyed the Hass lecture, even if they do not think of themselves as literary or poetic; Hass was speaking truths about creativity and beauty that computer scientists know and live.

    Sussman quoted two artists whose comments echoed his own sentiment. First, Edgar Allan Poe from his 1846 The Philosophy of Composition:

    ... it will not be regarded as a breach of decorum on my part to show the modus operandi by which some one of my own works was put together. I select "The Raven" as most generally known. It is my design to render it manifest that no one point in its composition is referable either to accident or intuition -- that the work proceeded step by step, to its completion with the precision and rigid consequence of a mathematical problem.

    And then concert pianist James Boyk:

    A work of art is a machine with an aesthetic purpose.

    (I am uncomfortable with the impression these quotes give, that artistic expression is mechanistic, though I believe that artistic work depends deeply on craft skills and unromantic practice.)

    Sussman considers himself an engineer, not a scientist. Science believes in a "talent theory" of knowledge, in part because the sciences grew out of the upper classes, which passed on a hereditary view of the world. On the other hand, engineering favors a "skill theory" of knowledge; knowledge and skill can be taught. Engineering derived from craftsmen, who had to teach their apprentices in order to construct major artifacts like cathedrals; if the product won't be done in your lifetime, you need to pass on the skills needed for others to complete the job!

    The talk went on for a while thereafter, with Sussman giving more examples of using programs as linguistic expressions in electricity and mechanics and mathematics, showing how a programming language enables us -- forces us -- to express a truth more formally and more precisely than our old mathematical and natural languages did.

    Just as most programmers have experienced the a-ha! moment of understanding after having written a program in an area we were struggling to learn, nearly every teacher has had an experience with a student who has unwittingly bumped into the wall at which computer programming forces us to express an idea more precisely than our muddled brain allows. Just today, one of my harder-working students wrote me in an e-mail message, "I'm pretty sure I understand what I want to do, but I can't quite translate it into a program." I empathize with his struggles, but the answer is: You probably don't understand, or you would be able to write the program. In this case, examination of the student's code revealed the lack of understanding that manifests itself in a program far more complex than the idea itself.

    This was a good talk, one which went a long way toward helping folks see just how important computer programming is as an intellectual discipline, not just as a technology. I think that one of the people who made a comment after the talk said it well. Though the title of this talk was "Why Programming is a Good Medium for Expressing Poorly Understood and Sloppily Formulated Ideas", the point of this talk is that, in expressing poorly-understood and sloppily-formulated ideas in a computer program, we come to understand them better. In expressing them, we must eliminate our sloppiness and really understand what we are doing. The heart of computing lies in the computer program, and it embodies a new form of epistemology.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    October 19, 2005 8:17 PM

    More on Safety and Freedom in the Extreme

    giving up liberty for safety

    In my entry on Robert Hass's keynote address, I discussed the juxtaposition of 'conservative' and 'creative', the tension between the desire to be safe and the desire to be free, between memory and change. Hass warned us against the danger inherent in seeking safety, in preserving memory, to an extreme: blindness to current reality. But he never addressed the danger inherent in seeking freedom and change to the exclusion of all else. I wrote:

    There is a danger in safety, as it can blind us to the value of change, can make us fear change. This was one of the moments in which Hass surrendered to a cheap political point, but I began to think about the dangers inherent in the other side of the equation, freedom. What sort of blindness does freedom lead us to?

    giving up safety for liberty

    During a conversation about the talk with Ryan Dixon, it hit me. The danger inherent in seeking freedom and change to an extreme is untethered idealism. Instead of "Ah, the good old days!", we have, "The world would be great if only...". When we don't show proper respect to memory and safety, we become blind in a different way -- to the fact that the world can't be the way it is in our dreams, that reality somehow precludes our vision.

    That doesn't sound so bad, but people sometimes forget not to include other people in their ideal view. We sometimes become so convinced by our own idealism that we feel a need to project it onto others, regardless of their own desires. This sort of blindness begins to look in practice an awful lot like the blindness of overemphasizing safety and memory.

    Of course, when discussing creative habits, we need to be careful not to censor ourselves prematurely. As we discussed at Extravagaria, most people tend toward one extreme. They need encouragement to overcome their fears of failure and inadequacy. But that doesn't mean that we can divorce ourselves from reality, from human nature, from the limits of the world. Creativity, as Hass himself told us, thrives when it bumps into boundaries.

    Being creative means balancing our desire for safety and freedom. Straying too far in either direction may work in the short term, but after too long in either land we lose something essential to the creative process.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

    October 19, 2005 6:14 PM

    OOPSLA Day 4: Mary Beth Rosson on the End of Users

    As you've read here in the past, I am one of a growing number of CS folks who believe that we must expand the purview of computer science education far beyond the education of computer scientists and software developers. Indeed, our most important task may well lie in the education of the rest of the educated world -- the biologists and sociologists, the economists and physicists, the artists and chemists and political scientists whose mode of work has been fundamentally altered by the aggregation of huge stores of data and the creation of tools for exploring data and building models. The future of greatest interest belongs not to software development shops but to the folks doing real work in real domains.

    Mary Beth Rosson

    So you won't be surprised to know how excited I was to come to Mary Beth Rosson's Onward! keynote address called "The End of Users". Mary Beth has been an influential researcher across a broad spectrum of problems in OOP, HCI, software design, and end-user programming, all of which have had prominent places at OOPSLA over the years. The common theme to her work is how people relate to technology, and her methodology has always had a strong empirical flavor -- watching "users" of various stripes and learning from their practice how to better support them.

    In today's talk, Mary Beth argued that the relationship between software developers and software users is changing. In the old days, we talked about "end-user programming", those programming-like activities done by those without formal training in programming. In this paradigm, end users identify requirements on programs and then developers produce software to meet the need. This cycle occurs at a relatively large granularity, over a relatively long time line.

    But the world is changing. We now find users operating in increasingly complex contexts. In the course of doing their work, they frequently run into ad hoc problems for their software to solve. They want to integrate pieces of solution across multiple tools, customize their applications for specific scenarios, and appropriate data and techniques from other tools. In this new world, developers must produce components that can be used in an ad hoc fashion, integrated across apps. Software developers must create knowledge bases and construction kits that support an interconnected body of problems. (Maybe the developers even strive to respond on demand...)

    These are not users in the traditional sense. We might call them "power users", but that phrase is already shop-worn. Mary Beth is trying out a new label: use developers. She isn't sure whether this label is the long-term solution, but at least this name recognizes that users are playing an increasingly sophisticated role that looks more and more like programming.

    What sorts of non-trivial tasks do use developers do?

    An example scenario: a Word user redefining the 'normal' document style. This is a powerful tool with big potential costs if done wrong.

    Another scenario: an Excel user creates a large spreadsheet that embodies -- hides! -- a massive amount of computation. (Mary Beth's specific example was a grades spreadsheet. She is a CS professor after all!)

    Yet another: an SPSS user defines new variables and toggles between textual programming and graphical programming.

    And yet another: a FrontPage user does visual programming of a web page, with full access to an Access database -- and designs a database!

    Mary Beth summarized the characteristics of use developers as:

    • comfortable with a diverse array of software apps and data sources
    • work with multiple apps in parallel and so want to pick and choose among functionality at any time, hooking components up and configuring custom solutions on demand.
    • working collaboratively, with group behaviors emerging
    • see the computer as a tool, not an end; it should not get in their way

    Creating use developers has potential economic benefits (more and quicker cycles getting work done) and personal benefits (more power, more versatility, higher degree of satisfaction).

    But is the idea of a use developer good?

    Mary Beth quoted an IEEE Software editor who was quite dismissive of end users. He warned that they do not systematically test their work, that they don't know to think about data security and maintainability, and -- when they do know to think about these issues -- they don't know *how* to think about them. Mary Beth thinks these concerns are representative of what folks in the software world think, and that we need to be attentive to them.

    (Personally, I think that, while we certainly should be concerned about the quality of the software produced by end users, we also must keep in mind that software engineers have a vested interest in protecting the notion that only Software Engineers, properly trained and using Methods Anointed From On High, are capable of delivering software of value. We all know of complaints from the traditional software engineering community about agile software development methods, even when the folks implementing and using agile methods are trained in computing and are, presumably, qualified to make important decisions about the environment in which we make software.)

    Mary Beth gave an example to illustrate the potential cost inherent in the lack of dependability -- a Fannie Mae spreadsheet that contained a $1.2B error.

    As the base of potential use developers grows, so do the potential problems. Consider just the spreadsheet and database markets... By 2012, the US Department of Labor estimates that there will be 55M end users. 90% of all spreadsheets contain errors. (Yes, but is that worse or better than in programs written by professional software developers?) The potential costs are not just monetary; they can be related to the quality of life we all experience. Such problems can be annoying and ubiquitous: web input forms with browser incompatibilities; overactive spam filters that lose our mail; Word styles that break the other formatting in user documents; and policy decisions based on research findings that themselves are based on faulty analysis due to errors in spreadsheets and small databases.

    Who is responsible for addressing these issues? Both users and the software world! Certainly, end users must take on the responsibility of developing new habits and learning the skills they need to use their tools effectively and safely. But we in the software world need to recognize our responsibilities:

    • to build better tools, to build the scaffolding users need to be effective and safe users. The tools we build should offer the right amount of help to users who are in the moment of doing their jobs.
    • to promote a "quality assurance" culture among users. We need to develop and implement new standards for computing literacy courses.

    How do we build better tools?

    Mary Beth called them smarter tools and pointed to a couple of the challenges we must address. First, much of the computation being done in tools is invisible, that is, hidden by the user interface. Second, people do not want to be interrupted while doing their work! (We programmers don't want that; why should our users have to put up with it?)

    Two approaches that offer promise are interactive visualization of data and minimalism. By minimalism, she means not expanding the set of issues that the user has to concern herself with -- by, say, integrating testing and debugging into the standard usage model.

    The NSF is supporting a five-school consortium called EUSES, End Users Shaping Effective Software, which is trying these ideas out in tools and experiments. Some examples of their work:

    • CLICKS is a drag-and-drop, design-oriented web development environment.

    • Whyline is a help system integrated directly into Alice's user environment. The help system monitors the state of the user's program and maintains a dynamic menu of problems they may run into.

    • WYSIWYT is a JUnit-style interface for debugging spreadsheets, in which the system keeps an eye on what cells have and have not been verified with tests.

    How can we promote a culture of quality assurance? What is the cost-benefit trade-off involved for the users? For society?

    Mary Beth indicated three broad themes we can build on:

    • K-12 education: making quality a part of schoolchildren's culture of computer use
    • universal access: creating tools aimed at specific populations of users
    • communities of practice: evolving reflective practices within the social networks of users

    Some specific examples:

    • Youngsters who learn by debugging in Alice. This is ongoing work by Mary Beth's group. Children play in 3D worlds that are broken, and as they play they are invited to fix the system. You may recognize this as the Fixer Upper pedagogical pattern, but in a "non-programming" programming context.

    • Debugging tools that appeal to women. Research shows that women take debugging seriously, but they tend to use strategies in their heads more than the tools available in the typical spreadsheet and word processing systems. How do we invite women with lower self-confidence to avail themselves of system tools? One experimental tool does this by letting users indicate "not sure" when evaluating correctness of a spreadsheet cell formula.

    • Pair programming community simulations. One group has a Sim City-like world in which a senior citizen "pair programs" with a child. Leaving the users unconstrained led to degeneration, but casting the elders as object designers and the children as builders led to coherent creations.

    • Sharing and reuse in a teacher community. The Teacher Bridge project has created a collaborative software construction tool to support an existing teacher community. The tool has been used by several groups, including the one that created PandapasPond.org. This tool combines a wiki model for its dynamic "web editor" and a more traditional model for its static design tool (the "folder editor"). Underneath the service, the system can track user activity in a variety of ways, which allows us to explore the social connections that develop within the user community over time.

    The talk closed with a reminder that we are just beginning the transition from thinking of "end users" to thinking of "use developers", and one of our explicit goals should be to try to maximize the upside, and minimize the downside, of the world that will result.

    For the first time in a long time, I got up to ask a question after one of the big talks. Getting up to stand in line at an aisle mic in a large lecture hall, to ask a question in front of several hundred folks, seems a rather presumptuous act. But my interest in this issue is growing rapidly, and Mary Beth has struck on several issues close to my current thinking.

    My question was this: What should university educators be thinking about with regard to this transition? Mary Beth's answer went in a way I didn't anticipate: We should be thinking about how to help users develop the metacognitive skills that software developers learn within our culture of practice. We should extend cultural literacy curricula to focus on the sort of reflective habits and skills that users need to have when building models. "Do I know what's going on? What could be going wrong? What kinds of errors should I be watching for? How can I squeeze errors out of my program?"

    After the talk, I spent a few minutes discussing curricular issues more specifically. I told her about our interest in reaching out to new populations of students, with the particular example of a testing certificate that folks in my department are beginning to work on. This certificate will target non-CS students, the idea being that many non-CS students end up working as testers in software development for their domain, yet they don't understand software or testing or much of anything about computing very deeply. This certificate is still aimed at traditional software development houses, though I think it will bear the seeds of teaching non-programmers to think about testing and software quality. If these folks ever end up making a spreadsheet or customizing Word, the skills they learn here will transfer directly.

    Ultimately, I see some CS departments expanding their computer literacy courses, and general education courses, to aim at use developers. Our courses should treat them with the same care and respect as we treat Programmers and Computer Scientists. The tasks users do are important, and these folks deserve tools of comparable quality.

    Three major talks, three home runs. OOPSLA 2005 is hot.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    October 18, 2005 4:04 PM

    OOPSLA Day 3: Robert Hass on Creativity

    Robert Hass, former poet laureate of the US

    With Dick Gabriel and Ralph Johnson leading OOPSLA this year, none of us were surprised that the themes of the conference were creativity and discovery. This theme presented itself immediately in the conference's opening keynote speaker, former poet laureate Robert Hass. He gave a marvelous talk on creativity.

    Hass began his presentation by reading a poem (whose name I missed) from Dick's new chapbook, Drive On. Bob was one of Dick's early teachers, and he clearly reveled in the lyricism, the rhythm of the poem. Teachers often form close bonds with their students, however long or short the teaching relationship. I know the feeling from both sides of the phenomenon.

    He then described his initial panic at the thought of introducing the topic of creativity to a thousand people who develop software -- who create, but in a domain so far from his expertise. But a scholar can find ways to understand and transmit ideas of value wherever they live, and Hass is not only a poet but a first-rate scholar.

    Charles Dickens burst onto the scene with the publication of The Pickwick Papers. With this novel, Dickens essentially invented the genre of the magazine-serialized novel. When asked how he created a new genre of literature, he said simply, "I thought of Pickwick."

    I was immediately reminded of something John Gribble said in his talk at Extravagaria on Sunday: Inspiration comes to those already involved in the work.

    Creativity seems to happen almost without cause. Hass consulted with friends who have created interesting results. One solved a math problem thought unsolvable by reading the literature and "seeing" the answer. Another claimed to have resolved the two toughest puzzles in his professional career by going to sleep and waking up with the answer.

    So Hass offered his first suggestion for how to be creative: Go to sleep.

    Human beings were the first animals to trade instinct for learning. The first major product of our learning was our tools. We made tools that reflected what we learned about solving immediate problems we faced in life. These tools embodied the patterns we observed in our universe.

    We then moved on to broader forms of patterns: story, song, and dance. These were, according to Hass, the original forms of information storage and retrieval, the first memory technologies. Eventually, though, we created a new tool, the printing press, that made these fundamentally less essential -- less important!? And now the folks in this room contribute to the ultimate tool, the computer, that in many ways obsoletes human memory technology. As a result, advances in human memory tech have slowed, nearly ceased.

    The bulk of Hass's presentation explored the interplay between the conservative in us (the desire to preserve in memory) and the creative in us (the desire to create anew). This juxtaposition of 'conservative' and 'creative' begets a temptation for cheap political shots, to which even Hass himself surrendered at least twice. But the juxtaposition is essential, and Hass's presentation repeatedly showed the value and human imperative for both.

    Evolutionary creativity depends on the presence of a constant part and a variable part, for example, the mix of same and different in an animal's body, in the environment. The simultaneous presence of constant and variable is the basis of man's physical life. It is also the basis of our psychic life. We all want security and freedom, in an unending cycle. Indeed, I believe that most of us want both all the time, at the same time. Conservative and creative, constant and variable -- we want and need both.

    Humans have a fundamental desire for individuation, even while still feeling a oneness with our mothers, our mentors, the sources of our lives. Inspiration, in a way, is how a person comes to be herself -- is in part a process of separation.

    "Once upon a time" is linguistic symbol, the first step of the our separation from the immediate action of reading into a created world.

    At the same time, we want to be safe and close, free and far away. Union and individuation. Remembering and changing.

    Most of us think that most everyone else is more creative than we are. This is a form of the fear John Gribble spoke about on Sunday, one of the blocks we must learn to eliminate from our minds -- or at least fool ourselves into ignoring. (I am reminded of John Nash choosing to ignore the people his mind fabricates around him in A Beautiful Mind.)

    Hass then told a story about the siren song from The Odyssey. It turns out that most of the stories in Homer's epics are based in "bear stories" much older than Homer. Anyway, Odysseus's encounter with the sirens is part of a story of innovation and return, freedom on the journey followed by a return to restore safety at home. Odysseus exhibits the creativity of an epic hero: he ties himself to the mast so that he can hear the sirens' song without having to take ship so close to the rocks.

    According to Hass, in some versions of the siren story, the sirens couldn't sing -- the song was only a sailors' legend. But the sailors still desired to hear the beautiful song, if it existed. Odysseus took a path that allowed him both safety and freedom, without giving up his desire.

    In preparing for this talk, Hass asked himself, "Why should I talk to you about creativity? Why think about it at all?" He identified at least four very good reasons, the desire to answer these questions:

    • How can we cultivate creativity in ourselves?
    • How can we cultivate creativity in our children?
    • How can we identify creative people?
    • How can we create environments that foster creativity?

    So he went off to study what we know about creativity. A scholar does research.

    Creativity research in the US began when academic psychologists began trying to measure mental characteristics. Much of this work was done at the request of the military. As time went by, the number of characteristics grew, perhaps in correlation with the research grants awarded by the government. Creativity is, perhaps, correlated with salesmanship. :-) Eventually, we had found several important results, including that there is little or no correlation between IQ and creativity. Creativity is not the province of the intellectually gifted.

    Hass cited the research of Howard Gardner and Mihaly Csikszentmihalyi (remember him?), both of whom worked to identify key features of the moment of a creative change, say, when Dickens thought to publish a novel in serial form. The key seems to be immersion in a domain, a fascination with the domain and its problems and possibilities. The creative person learns the language of the domain and sees something new. Creative people are not problem solvers but problem finders.

    I am not surprised to find language at the center of creativity! I am also not surprised to know that creative people find problems. I think we can say something even stronger: that creative people often create their own problems to solve. This is one of the characteristics that biases me away from creativity: I am a solver more than a finder. But thinking explicitly about this may enable me to seek ways to find and create problems.

    That is, as Hass pointed out earlier, one of the reasons for thinking about creativity: to find ways to make ourselves more creative. But we can use the same ideas to help our children learn the creative habit, and to help create institutions that foster the creative act. He mentioned OOPSLA as a social construct in the domain of software that excels at fostering creativity. It's why we all keep coming back. How can we repeat the process?

    Hass spoke more about important features of domains. For instance, it seems to matter how clear the rules of the domain are at the point that a person enters it. Darwin is a great example. He embarked on his studies at a time when the rules of his domain had just become fuzzy again. Geology had recently expanded European science's understanding of the timeline of the earth; Linnaeus had recently invented his taxonomy of organisms. So, some of the knowledge Darwin needed was in place, but other parts of the domain were wide open.

    The technology of memory is a technology of safety. What are the technologies of freedom?

    Hass read us a funny poem on story telling. The story teller was relating a myth of his people. When his listener questioned an inconsistency in his story, the story teller says, "You know, when I was a child, I used to wonder that..." Later, the listener asked the same question again, and again, and each time the story teller says, "You know, when I was a child, I used to wonder that..." When he was a child, he questioned the stories, but as he grew older -- and presumably wiser -- he came to accept the stories as they were, to retell them without question.

    We continue to tell our stories for their comfort. They make us feel safe.

    There is a danger in safety, as it can blind us to the value of change, can make us fear change. This was one of the moments in which Hass surrendered to a cheap political point, but I began to think about the dangers inherent in the other side of the equation, freedom. What sort of blindness does freedom lead us to?

    Software people and poets have something in common, in the realm of creativity: We both fall in love with patterns, with the interplay between the constant and the variable, with infinite permutation. In computing, we have the variable and the value, the function and the parameter, the framework and the plug-in. We extend and refactor, exposing the constant and the variable in our problem domains.

    Hass repeated an old joke, "Spit straight up and learn something." We laugh, a mockery of people stuck in the same old patterns. This hit me right where I live. Yesterday at the closing panel of the Educators' Symposium, Joe Bergin said something that I wrote about a while back: CS educators are an extremely conservative lot. I have something to say about that panel, soon...

    First safety, then freedom -- and with it the power to innovate.

    Of course, extreme danger, pressure, insecurity can also be the necessity that leads to the creative act. As is often the case, opposites turn out to be true. As Thomas Mann said,

    A great truth is a truth whose opposite is also a great truth.

    Hass reminds us that there is agony in creativity -- a pain at stuckness, found in engagement with the world. Pain is unlike pleasure, which is homeostatic ("a beer and a ballgame"). Agony is dynamic, a ceasing to cling to a safe position. There is always an element of anxiety, a consciousness heightened at the moment of insight, a gestalt in the face of an incomplete pattern.

    The audience asked a couple of questions:

    • Did he consult only men in his study of creativity? Yes, all but his wife, who is also a poet. She said, "Tell them to have their own strangest thoughts." What a great line.

    • Is creativity unlimited? Limitation is essential to creativity. If our work never hits an obstacle, then we don't know when it's over. (Sounds like test-driven development.) Creativity is always bouncing up against a limit.

    I'll close my report with how Hass closed the main part of his talk. He reached "the whole point of his talk" -- a sonnet by Michelangelo -- and he didn't have it in his notes!! So Hass told us the story in paraphrase:

    The pain is unbearable, paint dripping in my face, I climb down to look at it, and it's horrible, I hate it, I am no painter...

    It was the ceiling of the Sistine Chapel.

    ~~~~~

    UPDATE (10/20/05): Thanks again to Google, I have tracked down the sonnet that Hass wanted to read. I especially love the ending:

    Defend my labor's cause,
    good Giovanni, from all strictures:
    I live in hell and paint its pictures.

    -- Michelangelo Buonarroti

    I have felt this way about a program before. Many times.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

    October 17, 2005 9:54 PM

    OOPSLA Day 2: Morning at The Educators' Symposium

    This was my second consecutive year to chair the OOPSLA Educators' Symposium, and my goal was something more down to earth yet similar in flavor: encouraging educators to consider Big Change. Most of our discussions in CS education are about how to do the Same Old Thing better, but I think that we have run the course with incremental improvements to our traditional approaches.

    We opened the day with a demonstration called Agile Apprenticeship in Academia, wherein two professors and several students used a theatrical performance to illustrate a week in a curriculum built almost entirely on software apprenticeship. Dave West and Pam Rostal wanted to have a program for developing software developers, and they didn't think that the traditional CS curriculum could do the job. So they made a Big Change: they tossed the old curriculum and created a four-year studio program in which students, mentors, and faculty work together to create software and, in the process, students learn how to create software.

    West and Rostal defined a set of 360 competencies that students could satisfy at five different levels. Students qualify to graduate from the program by satisfying each competency at at least the third level (the ability to apply the concept in a novel situation) and some number at higher levels. Students also have to complete the standard general education curriculum of the university.

    Thinking back to yesterday's morning session at Extravagaria, where we talked about the role of fear and pressure in creativity: West and Rostal put any fear behind them and acted on their dream. Whatever difficulties they face in making this idea work over the long run in a broader setting -- and I believe that the approach faces serious challenges -- at least they have taken a big step forward that could make something work. Those of us who don't take any big steps forward are doomed to remain close to where we are.

    I don't have much to say about the paper sessions of the day except that I noticed a recurring theme: New ideas are hard on instructors. I agree, but I do not think that they are hard in the NP-hard sense but rather in the "we've never done it that way before" sense. Unfamiliarity makes things seem hard at first. For example, I think that the biggest adjustment most professors need to make in order to move to the sort of studio approach advocated by West and Rostal is from highly-scripted lectures and controlled instructional episodes to extemporaneous lecturing in response to student needs in real time. The real hardness in this is that faculty must have a deep, deep understanding of the material they teach -- which requires a level of hands-on experience that many faculty don't yet have.

    This idea of professors as practitioners, as professionals practiced in the art and science we teach, will return in later entries from this conference...

    Like yesterday's entry, I'll have more to say about today's Educators' Symposium in upcoming entries. I need some time to collect my thoughts and to write. In particular, I'd like to tell you about Ward Cunningham's keynote address and our closing panel on the future of CS education. The panel was especially energizing but troubling at the same time, and I hope to share a sense of both my optimism and my misgivings.

    But with the symposium over, I can now take the rest of the evening to relax, then sleep, have a nice longer run, and return to the first day of OOPSLA proper free to engage ideas with no outside encumbrances.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    October 16, 2005 9:52 PM

    OOPSLA Day 1: The Morning of Extravagaria

    OOPSLA 2005 logo

    OOPSLA has arrived, or perhaps I have arrived at OOPSLA. I almost blew today off, for rest and a run and work in my room. Some wouldn't have blamed me after yesterday, which began at 4:42 AM with a call from Northwest Airlines that my 7:05 AM flight had been cancelled, included my airline pilot missing the runway on his first pass at the San Diego airport, and ended standing in line for two hours to register at my hotel. But I dragged myself out of my room -- in part out of a sense of obligation to having been invited to participate, and in part out of a schoolboy sense of propriety that I really ought to go to the events at my conferences and make good use of my travels.

    My event for the day was an all-day workshop called Extravagaria III: Hunting Creativity. As its title reveals, this workshop was the third in a series of workshops initiated by Richard Gabriel a few years ago. Richard is motivated by the belief that computer science is in the doldrums, that what we are doing now is mostly routine and boring, and that we need a jolt of creativity to take the next Big Step. We need to learn how to write "very large-scale programs", but the way we train computer scientists, especially Ph.D. students and faculty, enforces a remarkable conservatism in problem selection and approach. The Extravagaria workshops aim to explore creativity in the arts and sciences, in an effort to understand better what we mean by creativity and perhaps better "do it" in computer science.

    The workshop started with introductions, as so many do, but I liked the twist that Richard tossed in: each of us was to tell what was the first program we ever wrote out of passion. This would reveal something about each of us to one another, and also perhaps recall the same passion within each storyteller.

    My first thought was of a program I wrote as a high school junior, in a BASIC programming course that was my first exposure to computers and programs. We wrote all the standard introductory programs of the day, but I was enraptured with the idea of writing a program to compute ratings for chessplayers following the Elo system. This was much more complex than the toy problems I solved in class, requiring input in the form of player ratings and a crosstable showing results of games among the players, and output in the form of updated ratings for each player. It also introduced new sorts of issues, such as using text files to save state between runs and -- even more interesting to me -- the generation of an initial set of ratings through a mechanism of successive approximations, a process that may never quite converge unless we specified an epsilon larger than 0. I ultimately wrote a program of several hundred lines, a couple of orders of magnitude larger than anything I had written before. And I cared deeply about my program, the problem it solved, and its usefulness to real people.
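
    For the curious, here is a hedged sketch in Python of the arithmetic such a program needs -- my own reconstruction from the standard Elo formulas, since nothing of the original BASIC survives: the expected-score and update rules for rated players, plus one naive successive-approximation pass for seeding an unrated pool (which, as noted above, may drift rather than settle unless you stop at some epsilon):

        def expected_score(rating_a, rating_b):
            """Expected score for A against B under the standard Elo formula."""
            return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

        def update_rating(rating, opponent_rating, score, k=32):
            """New rating after one game; score is 1 for a win, 0.5 draw, 0 loss."""
            return rating + k * (score - expected_score(rating, opponent_rating))

        def performance_pass(crosstable, ratings):
            """One successive-approximation pass: re-estimate each player's rating
            with the common linear performance estimate against current opponents."""
            new = {}
            for player, games in crosstable.items():
                n = len(games)
                opp_avg = sum(ratings[opp] for opp, _ in games) / n
                points = sum(score for _, score in games)
                new[player] = opp_avg + 400.0 * (2 * points - n) / n
            return new

        # a tiny crosstable: A beat B and drew C; B beat C
        crosstable = {
            'A': [('B', 1.0), ('C', 0.5)],
            'B': [('A', 0.0), ('C', 1.0)],
            'C': [('A', 0.5), ('B', 0.0)],
        }
        ratings = {p: 1500.0 for p in crosstable}
        for _ in range(20):
            ratings = performance_pass(crosstable, ratings)
        print(ratings)                                 # settles near A ~1633, B 1500, C ~1367

        print(round(update_rating(1500, 1700, 1.0)))   # a 1500 beats a 1700: ~1524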

    I enjoyed everyone else's stories, too. They reminded us all about the varied sources of passion, and how solving a problem can open our eyes to a new world for us to explore. I was pleased by the diversity of our lot, which included workshop co-organizer John Gribble, a poet friend of Richard's who has never written a program; Rebecca Rikner, the graphic artist who designed the wonderful motif for Richard's book Writers' Workshops and the Work of Making Things; and Guy Steele, one of the best computer scientists around. The rest of us were computer science and software types, including one of my favorite bloggers, Nat Pryce. Richard's first passionate program was perhaps a program to generate "made-up words" from some simple rules, to use in naming his rock-and-roll band. Guy offered three representative, if not first, programs: a Lisp interpreter written in assembly language, a free verse generator written in APL, and a flow chart generator written in RPG. This wasn't the last mention of APL today, which is often the sign of a good day.

    Our morning was built around an essay written by John Gribble for the occasion, called "Permission, Pressure, and the Creative Process". John read his essay, while occasionally allowing us in the audience to comment on his remarks or previous comments. John offered as axioms two beliefs that I share with him:

    • that all people are creative, that is, possess the potential to act creatively, and
    • that there is no difference of kind between creativity in the arts and creativity in the sciences.

    What the arts perhaps offer scientists is the history and culture of examining the creative process. We scientists and other analytical folks tend to focus on products, often to the detriment of how well we understand how we create them.

    John quoted Stephen King from his book On Writing, that the creator's job is not to find good ideas but to recognize them when they come along. For me, this idea foreshadows Ward Cunningham's keynote address at tomorrow's Educators' Symposium. Ward will speak on "nurturing the feeble simplicity", on recognizing the seeds of great ideas despite their humility and nurturing them into greatness. As Brian Foote pointed out later in the morning, this sort of connection is what makes conferences like OOPSLA so valuable and fun -- plunk yourself down into an idea-rich environment, soak in good ideas from good minds, and your own mind has the raw material it needs to make connections. That's a big part of creativity!

    John went on to assert that creativity isn't rare, but rather so common that we are oblivious to it. What is rare is for people to act on their inspirations. Why do we not act? We have so low an opinion of ourselves that we figure the inspiration isn't good enough or that we can't do it justice in our execution. Another reason: We fear to fail, or to look bad in front of our friends and colleagues. We are self-conscious, and the self gets in the way of the creative act.

    Most people, John believes, need permission to act creatively. Most of us need external permission and approval to act, from friends or colleagues, peers or mentors. This struck an immediate chord with me in three different relationships: student and teacher, child and parent, and spouse and spouse. The discussion in our workshop focused on the need to receive permission, but my immediate thought was of my role as potential giver of permission. My students are creative, but most of them need me to give them permission to create. They are afraid of bad grades and of disappointing me as their instructor; they are self-conscious, as going through adolescence and our school systems tend to make them. My young daughters began life unself-conscious, but so much of their lives are about bumping into boundaries and being told "Don't do that." I suspect that children grow up most creative in an environment where they have permission to create. (Note that this is orthogonal to the issue of discipline or structure; more on that later.) Finally, just as I find myself needing my wife's permission to do and act -- not in the henpecked husband caricature, but in the sense of really caring about what she thinks -- she almost certainly feels the need for *my* permission. I don't know why this sense that I need to be a better giver of permission grew up so strong so quickly today, but it seemed like a revelation. Perhaps I can change my own behavior to help those around me feel like they can create what they want and need to create. I suspect that, in loosing the restrictions I project onto others, I will probably free myself to create, too.

    When author Donald Ritchie is asked how to start writing, he says, "First, pick up your pencil..." He's not being facetious. If you wait for inspiration to begin, then you'll never begin. Inspiration comes to those already involved in the work.

    Creativity can be shaped by constraints. I wrote about this idea six months or so ago in an entry named Patterns as a Source of Freedom. Rebecca suggested that for her at least constraints are essential to creativity, that this is why she opted to be a graphic designer instead of a "fine artist". The framework we operate in can change, across projects or even within a project, but the framework can free us to create. Brian recalled a song by the '80s punk-pop band Devo called Freedom Of Choice:

    freedom of choice is what you got
    then if you got it you don't want it
    seems to be the rule of thumb
    don't be tricked by what you see
    you got two ways to go
    freedom from choice is what you want

    Richard then gave a couple of examples of how some artists don't exercise their choice at the level of creating a product but rather at the level of selecting from lots of products generated less self-consciously. In one, a photographer for National Geographic put together a pictorial article containing 22 pictures selected from 40,000 photos he snapped. In another, Francis Ford Coppola shot 250 hours of film in order to create the 2-1/2 hour film Apocalypse Now.

    John then told a wonderful little story about an etymological expedition he took along the trail of ideas from the word "chutzpah", which he adores, to "effrontery", "presumptuous", and finally "presumption" -- to act as if something were true. This is a great way to free oneself to create -- to presume that one can, that one will, that one should. Chutzpah.

    Author William Stafford had a particular angle he took on this idea, what he termed the "path of stealth". He refused to believe in writer's block. He simply lowered his standards. This freed him to write something and, besides, there's always tomorrow to write something better. But as I noted earlier, inspiration comes to those already involved in the work, so writing anything is better than writing nothing.

    As editor John Gould once told Stephen King, "Write with the door closed. Revise with the door open." Write for yourself, with no one looking over your shoulder. Revise for readers, with their understanding in mind.

    Just as permission is crucial to creativity, so is time. We have to "make time", to "find time". But sometimes the work is on its own time, and will come when and at the rate it wants. Creativity demands that we allow enough time for that to happen! (That's true even for the perhaps relatively uncreative act of writing programs for a CS course... You need time, for understanding to happen and take form in code.)

    Just as permission and time are crucial to creativity, John said, so is pressure. I think we all have experienced times when a deadline hanging over our heads seemed to give us the power to create something we would otherwise have procrastinated away. Maybe we need pressure to provide the energy to drive the creative act. This pressure can be external, in the form of a client, boss, or teacher, or internal.

    This is one of the reasons I do not accept late work for a grade in my courses; I believe that most students benefit from that external impetus to act, to stop "thinking about it" and commit to code. Some students wait too long and reach a crossover point: the pressure grows quite high, but time is too short. Life is a series of balancing acts. The play between pressure and time is, I think, fundamental. We need pressure to produce, but we need time to revise. The first draft of a paper, talk, or lecture is rarely as good as it can be. Either I need to give myself time to create more and better drafts, or -- which works better for me -- I need to find many opportunities to deliver the work, to create multiple opportunities to create in the small through revision, variation, and natural selection. This is, I think, one of the deep and beautiful truths embedded in extreme programming's cycle "write a test, write code, and refactor".

    Ultimately, a professional learns to rely more on internal pressure, pressure applied by the self for the self, to create. I'm not talking about the censoriousness of self-consciousness, discussed earlier, which tells us that what we produce isn't good enough -- that we should not act, at least not on the current product. I'm talking about internal demands that we act, in a particular way or time. Accepting the constraints of a form -- say, the line and syllable restrictions of haiku, or the "no side effects" convention of functional programming style -- puts pressure on us to act in a particular way, for better or worse. John gave us two other kinds of internal pressure, ones he applies to himself: the need to produce work to share at his two weekly writers' workshops, and the self-discipline of submitting work for publication every month. These pressures involve outside agents, but they are self-imposed, and they require us to do something we might otherwise not.

    John closed with a short inspiration. Pay attention to your wishes and dreams. They are your mind's way of telling you to do something.

    We spent the rest of the morning chatting as a group about whatever we were thinking after John's talk. Several folks related an experience well-known to any teacher: someone comes to us asking for help with a problem and, in the act of explaining the problem to us, they discover the answer for themselves. Students do this with me often. Is the listener essential to this experience, or would it work just as well to act as if we were speaking to someone? I suspect that another person is essential for this to work for the learner, both because having a real person to talk to makes us explain things (pressure!) and because the listener can force us to explain the problem more clearly ("I don't understand this yet...").

    A recurring theme of the morning was the essential interactivity of creativity, even when the creator works fundamentally alone. Poets need readers. Programmers need other folks to bounce ideas off of. Learners need someone to talk to, if only to figure things out for themselves. People can be sources of ideas. They can also be reflectors, bouncing our own ideas back at us, perhaps in a different form or perhaps the same, but with permission to act on them. Creativity usually comes in the novel combination of old ideas, not truly novel ideas.

    This morning session was quite rewarding. My notes on the whole workshop are, fittingly, only about half covered here, but this article has already gotten quite long. So I think I'll save the afternoon sessions for entries to come. These sessions were quite different from the morning, as we created things together and then examined our processes and experiences. They will make fine stand-alone articles that I can write later -- after I break away for a bite at the IBM Eclipse Technology Exchange reception and for some time to create a bit on my own for what should take over my mind for a few hours: tomorrow's Educators' Symposium, which is only twelve hours and one eight- to ten-mile San Diego run away!


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

    October 14, 2005 6:11 PM

    Rescued by Google

    Okay, so I know some people don't like Google. They are getting bigger and more ambitious. Some folks even have Orwellian nightmares about Google. (If that link fails, try this one.) But, boy, can Google be helpful.

    Take today, for instance. I was scp'ing some files from my desktop machine to the department server, into my web space. Through one part sloppiness and one part not understanding how scp handles sub-directories, I managed to overwrite my home page with a different index.html.

    What to do now? I don't keep a current back-up of that web space, because the college backs it up regularly. But recovering back-up files is slow, it's Friday morning, I'm leaving for OOPSLA at sunrise tomorrow, and I don't have time for this.

    What to do?

    I google myself. Following the first hit doesn't help, because it goes to the live page. But clicking on the Cached link takes me to Google's cached copy of my index. The only difference between it and the Real Thing is that they have bolded the search terms Eugene and Wallingford. Within seconds, my web site is as good as new.

    Maybe I should be concerned that Google has such an extensive body of data. We as a society need to be vigilant when it comes to privacy in this age of aggregation and big search tools and indexes of God, the universe, and everything. We need to be especially vigilant about civil rights in an age when our governments could conceivably gain access to such data. But the web and Google have changed how we think about data storage and retrieval, search and research. These tools open doors to collective goods we could hardly imagine before. Let's be vigilant, but let's look for paths forward, not paths backward.

    Another use of Google data that I am enjoying of late is gVisit, a web-based tool for tracking visitors to web sites. I use a bare-bones blogging client, NanoBlogger, which doesn't come with fancy features like comments and hit counters. (At least the version I use didn't; there are more recent releases.) But gVisit lets me get a sense of at least where people have been reading my blog. Whip up a little JavaScript, and I can see the last N unique cities from which people have read Knowing and Doing, where I choose N. I love seeing that someone from Indonesia or Kazakhstan or Finland has read my blog. I also love seeing the names of all the US cities in which readers live. Maybe it's voyeurism, but it reminds me that people really do read.

    No, I haven't tried Google Reader yet. I'm still pretty happy with NetNewsWire Lite, and then there's always the latest version of Safari...


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    September 29, 2005 1:49 PM

    Mathematics Coincidence

    An interesting coincidence... Soon after I posted my spiel on preparing to study computer science, especially the role played by mathematics courses, the folks on the SIGCSE mailing list started a busy thread on the place of required math courses in the CS curriculum. The thread began with a discussion of differential equations, which some schools apparently still require for a CS degree. The folks defending such math requirements have relied on two kinds of argument.

    One is to assert that math courses teach discipline and problem-solving skills, which all CS students need. I discussed this idea in my previous article. I don't think there is much evidence that taking math courses teaches students problem-solving skills or discipline, at least not as most courses are taught. They do tend to select for problem-solving skills and discipline, though, which makes them handy as a filter -- if that's what you want. But they are not always helpful as learning experiences for students.

    The other is to argue that students may find themselves working on scientific or engineering projects that require solving differential equations, so the course is valuable for its content. My favorite rebuttal to this argument came from a poster who listed a dozen or so projects that he had worked on in industry over the years. Each required specific skills from a domain outside computing. Should we then require one or more courses from each of those domains, on the chance that our students work on projects in them? Could we?

    Of course we couldn't. Computing is a universal tool, so it can and usually will be applied everywhere. It is something of a chameleon, quickly adaptable to the information-processing needs of a new discipline. We cannot anticipate all the possible applications of computing that our students might encounter any more than we can anticipate all the possible applications of mathematics they might encounter.

    The key is to return to the idea that underlies the first defense of math courses, that they teach skills for solving problems. Our students do need to develop such skills. But even if students could develop such skills in math courses, why shouldn't we teach them in computing courses? Our discipline requires a particular blend of analysis and synthesis and offers a particular medium for expressing and experimenting with ideas. Computer science is all about describing what can be systematically described and how to do so in the face of competing forces. The whole point of an education in computing should be to help people learn how to use the medium effectively.

    Finally, Lynn Andrea Stein pointed out an important consideration in deciding what courses to require. Most of my discussion and the discussion on the SIGCSE mailing list has focused on the benefits of requiring, say, a differential equations course. But we need also to consider the cost of such a requirement. We have already encountered one: an opportunity cost in the form of people. Certain courses filter out students who are unable to succeed in that course, and we need to be sure that we are not missing out on students who would make good computer science students. For example, I do not think that a student's inability to succeed in differential equations means that the student cannot succeed in computer science. A second opportunity cost comes in the form of instructional time. Our programs can require only so many courses, so many hours of instruction. Could we better spend a course's worth of time in computing on a topic other than differential equations? I think so.

    I remember learning about opportunity cost in an economics course I took as an undergrad. Taking a broad set of courses outside of computing really can be useful.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    September 27, 2005 7:29 PM

    Learning by Dint of Experience

    While writing my last article, I encountered one of those strange cognitive moments. I was in the process of writing the trite phrase "through sheer dint of repetition" when I had a sudden urge to use 'dent' in place of 'dint' -- even though I know deep inside that 'dint' is correct.

    What to do? I used what for many folks is now the standard Spell Checker for Tough Cases: Google. Googling dent of repetition found 4 matches; googling dint of repetition found 470. This is certainly not conclusive evidence; maybe everyone else is as clueless as I. But it was enough evidence to help me go with my instinct in the face of a temporary brain cramp.

    Of course, our growing experience with the World Wide Web and other forms of collaborative technologies is that the group is often smarter than the individual. The wisdom of crowds and all that. It's probably no accident that I link "wisdom of crowds" to Amazon.com, either.

    To further confirm my decision to stick with 'dint', I spent some time at Merriam-Webster On-Line, where I learned that 'dint' and 'dent' share a common etymology. It's funny what I can learn when I sit down to write.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    September 27, 2005 7:10 PM

    Preparing to Study Computer Science

    Yesterday, our department hosted a "preview day" for high school seniors who are considering majoring in computer science here at UNI. During the question-and-answer portion of one session, a student asked, "What courses should we take in our senior year to best prepare to study CS?" That's a good question, and one that resulted in a discussion among the CS faculty present.

    For most computer science faculty, the almost reflexive answer to this question is math and science. Mathematics courses encourage abstract thinking, attention to detail, and precision. Science courses help students think like empiricists: formulating hypotheses, designing experiments, making and recording observations, and drawing inferences. A computer science student will use all of these skills throughout her career.

    I began my answer with math and science, but the other faculty in the room reacted in a way that let me know they had something to say, too. So I let them take the reins. All three downplayed the popular notion that math, at least advanced math, is an essential element of the CS student's background.

    One faculty member pointed out that students with backgrounds in music often do very well in CS. This fits with the commonly-held view that music helps children to develop skill at spatial and symbolic reasoning tasks. Much of computing deals not with arithmetic reasoning but with symbolic reasoning. As an old AI guy, I know this all too well. In much the same way that music might help CS students, studying language may help students to develop facility manipulating ideas and symbolic representations, skills that are invaluable to the software developer and the computing researcher alike.

    We ended up closing our answer to the group by saying that studying whatever interests you deeply -- and really learning that discipline -- will help you prepare to study computer science more than following any blanket prescription to study a particular discipline.

    (In retrospect, I wish I had thought to tack on one suggestion to our conclusion: There can be great value in choosing to study something that challenges you, that doesn't interest you as much as everything else, precisely because it forces you to grow. And besides, you may find that you come to understand it well enough to appreciate it, maybe even like it!)

    I certainly can't quibble with the direction our answer went. I have long enjoyed learning from writers, and I believe that my study of language and literature, however narrow, has made me a better computer scientist. I have had many CS students with strong backgrounds in art and music, including one I wrote about last year. Studying disciplines other than math and science can lay a suitable foundation for studying computer science.

    I was surprised by the strength of the other faculty's reaction to the notion that studying math is among the best ways to prepare for CS. One of these folks was once a high school math teacher, and he has always expressed dissatisfaction with mathematics pedagogy in the US at both the K-12 and university levels. To him, math teaching is mostly memorize-and-drill, with little or no explicit effort put into developing higher-order thinking skills for doing math. Students develop these skills implicitly, if at all, through sheer dint of repetition. In his mind, the best that math courses can do for CS is to filter out folks who have not yet developed higher-order thinking skills; it won't help students develop them.

    That may well be true, though I know that many math teachers and math education researchers are trying to do more. But, while students may not need advanced math courses to succeed in CS -- at least not in many areas of software development -- they do need to master some basic arithmetical skills. I keep thinking back to a relatively straightforward programming assignment I've given in my CS II course, to implement Nick Parlante's NameSurfer nifty assignment in Java. A typical NameSurfer display looks like this image, from Nick's web page:

    As the user resizes the window, the program should grow or shrink its graph accordingly. To draw this image, the student must do some basic arithmetic to lay out the decade lines and to place the points on the lines and the names in the background. To scale the image, the student must do this arithmetic relative to window size, not with fixed values.
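
    Here is a minimal sketch of the kind of arithmetic involved -- my own illustration in Scheme rather than the Java students write for the assignment, and the margin and decade count below are made-up constants, not values from Nick's handout:

              ;; x-coordinate of decade line i, scaled to the current window width
              (define (decade-x i window-width)
                (let ((margin 20)          ; illustrative constant
                      (num-decades 11))    ; illustrative constant
                  (+ margin
                     (* i (/ (- window-width (* 2 margin)) num-decades)))))

              ;; (decade-x 0 550) => 20, (decade-x 11 550) => 530:
              ;; resize the window, and the lines move with it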

    Easy, right? When I assigned this program, many students reacted as if I had cut off one of their fingers. Others seemed incapable of constructing the equations needed to do scaling correctly. (And you should have seen the reaction students had when, many years ago, I asked them to write a graphical Java version of Mike Clancy's delicious Cat And Mouse nifty assignment. Horror of horrors -- polar coordinates!)

    This isn't advanced math. This is algebra. All students in our program were required to pass second-year algebra before being admitted to our university. But passing a course does not require mastery, and students find themselves with a course on their transcript but not the skills that the course entails.

    Clearly, mastery of basic arithmetic skills is essential to most of computer science, even if more advanced math -- even calculus -- is not. Especially when I think of algebraic reasoning more abstractly, I have a hard time imagining how students can go very far in CS without mastering it. Whatever its other strengths or weaknesses, the How to Design Programs approach to teaching programming does one thing well: it makes an explicit connection between algebraic reasoning and programs. The result is something in the spirit of Polya's How to Solve It.
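
    As a minimal illustration of that connection (my own example, not one taken from the book): start from the algebraic formula for the area of a ring, area = pi * (outer^2 - inner^2), and transcribe it directly into a function.

              (define pi 3.141592653589793)

              ;; area of a ring: pi * (outer^2 - inner^2),
              ;; transcribed straight from the algebra
              (define (ring-area outer inner)
                (* pi (- (* outer outer) (* inner inner))))

              ;; (ring-area 5 3) computes 16 * pi, about 50.265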

    This brings us back to what is the weakest part of the "math and science" answer to our brave high school student's question. So much of computing is not theory or analysis but design -- the act of working out the form of a program, interface, or system. While we may talk about the "design" of a proof or scientific experiment, we mean something more complex when we talk about the design of software. As a result, math and science do relatively little to help students develop the design skills which will be so essential to succeeding in the software side of CS.

    Studying other disciplines can help, though. Art, music, and writing all involve the students in creating things, making them think about how to make things. And courses in those disciplines are more likely to talk explicitly about structure, form, and design than are math and science.

    So, we have quite defensible reasons to tell students to study disciplines other than science and math. I would still temper my advice by suggesting that students study both math and science and literature, music, art, and other creative disciplines. While this may not be what our student was looking for, perhaps the best answer is all of the above.

    Then again, maybe success in computing is controlled more by aptitude than by learning. If that is the case, then many of us, students and faculty alike, are wasting a lot of time. But I don't think that's true for most folks. Like Alan Kay, I think we just need to understand better this new medium that is computing so that we can find the right ways to empower as many people as possible to create new kinds of artifact.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    September 23, 2005 7:26 PM

    Ruby Friday

    I have written about Scheme my last two times out, so I figured I should give some love to another of my favorite languages.

    Like many folks these days, I am a big fan of Ruby. I took a Ruby tutorial at OOPSLA several years ago from Dave Thomas and Andy Hunt, authors of the already-classic Programming Ruby. At the time, the only Ruby interpreter I could find for Mac OS 9 was a port written by a Japanese programmer, almost all of whose documentation was written in, you guessed it, Japanese. But I made it run and learned how to be functional in Ruby within a few hours. That told me something about the language.

    (Recalling this tutorial reminds me of two things. One, Dave and Andy give a great tutorial. If you get the chance, learn from them in person. The same can be said for many OOPSLA tutorials. Two, thank you, Apple, for OS X as a Unix -- and for shipping it with such a nice collection of programming tools.)

  • If you want to augment the Pragmatic Programmers' guide to Ruby, check out Why's (Poignant) Guide to Ruby. You can learn Ruby there, plus quite a bit on programming more generally. You could have some fun, too.

  • Unlike many dynamic language fans, I like Java just fine. I can enjoy programming in Java, but there is no question that it gets in my way more than a language like Scheme or Ruby.

    Still, I feel compelled to share this opportunity to improve your geekware collection:

    Java Rehabilitation Clinic

    Thanks to the magic of CafePress.com, you can buy a variety of merchandise in the Java Rehab line. But why?

    Java coding need not be a life-long debilitation. With the proper treatment, and a copy of Programming Ruby, you can return to a life of happy, productive software development.

    So, give yourself over to a higher power! Learn Ruby...

    Just imagine how much more fun Java would be if it gave itself over to the higher power of higher-order procedures...

  • Finally, a little Ruby on Ruby. Check out Sam Ruby's talk, The Case for Dynamic Languages. Sam uses Ruby to illustrate his argument that the distinction between system languages and scripting languages is slowly shrinking, as the languages we use everyday become more dynamic. Along the way, he shows the power of several ideas that have entered mainstream programming parlance only in the last decade, among them closures and higher-order procedures in the form of blocks.

    But my favorite part of Sam's talk is his sub-title:

    Reinventing Smalltalk, one decade at a time

    Paul Graham says that we are reinventing Lisp, and he has a strong case. With either language as a target, the lives of programmers can only get better. The real question is whether objects as abstractions ultimately displace functions as the starting level of abstraction for the program. Another question, perhaps more important, is whether the language-as-environment can ever displace the minimalism of the scripting language as the programmer's preferred habitat. I have a hunch that the answer to both questions will be the same.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

    September 22, 2005 8:09 PM

    4 to the Millionth Power

    First I risk veering off into a political discussion in my review of a Thomas Friedman talk, and now I will write about a topic that arose during a contentious conversation among the science faculty here about Intelligent Design. Oh, my.

    However, my post today isn't about ID, but rather numbers. And Scheme.

    Here's the set-up: The local chapter of a scientific research society has invited a proponent of ID to give a talk here next week. (It has also invited a strong opponent of ID to speak next month.) Most of the biology professors here have argued quite strenuously that the society should not sponsor the talk, on the grounds that to do so lends legitimacy to ID. A vigorous discussion broke out on the science faculty's mailing list a couple of days ago about the value of fostering open debate on this issue.

    That is all well and good, but I'm not much interested in commenting on the debate itself.

    However, in the course of the discussion, one writer attempted to characterize the size of the search space inherent in evolving DNA. He used as his example a string of size one million, though I think he was going more for effect than 100% accuracy on the size. Now, each element in a strand of DNA is one of four bases: adenine (A), thymine (T), cytosine (C), and guanine (G). So, for a string one million bases long, there are 4^1,000,000 unique possible sequences. In order to show the size of this number, he went off and wrote a C program using the longest floating-point values available.

    4 to the millionth power overflowed the available space. So he tried 4^100,000.

    That overflowed the available space, too, so he tried 4^10,000. Same result.

    Finally, 4^1,000 gave him an answer he could see in exponential notation, a number on the order of 10^602. That's big.

    The next day, my colleague and I were discussing the combinatorics when I began to wonder how Dr. Scheme might handle his problem. So I turned to my keyboard and entered a Scheme expression at the interpreter prompt:

              > (expt 4 1000)
              114813069527425452423283320117768198402231770208869520047764273682576626139237031385665948631650626991844596463898746277344711896086305533142593135616665318539129989145312280000688779148240044871428926990063486244781615463646388363947317026040466353970904996558162398808944629605623311649536164221970332681344168908984458505602379484807914058900934776500429002716706625830522008132236281291761267883317206598995396418127021779858404042159853183251540889433902091920554957783589672039160081957216630582755380425583726015528348786419432054508915275783882625175435528800822842770817965453762184851149029376
    

    That answer popped out in no time flat.

    Emboldened, I tried 4^10,000. Again, an answer arrived as fast as Dr. Scheme could print it.

    My colleague, a C programmer who doesn't use Scheme, was impressed. "Try 4^100,000," he said. This one took a few seconds -- less than five -- before printing the answer. Almost all of the delay was for I/O; the computation time still registered 0 milliseconds.

    The look on his face revealed admiration. But we still hadn't produced the number we really wanted, 4^1,000,000. So we tried.

    Dr. Scheme sat quietly, chugging away. Within a few seconds it had its answer, but it took a while longer to produce its output -- a few minutes, in fact. But there it was, the 602,060-digit number that is 4^1,000,000.

    Very nice indeed! I was again impressed with how well Scheme works with big numbers. You can imagine how impressed my colleague was. C compilers produce fast, tight code, but you need something beyond the base language to compute numbers this large. Scheme was more than happy to do the job out of the box.

    Of course, if all my colleague wanted to know was the order of magnitude of 4^1,000,000, we could have had our answer much quicker, using a little piece of math we all learned in high school:

              > (* 1000000 (/ (log 4) (log 10)))
              602059.9913279623
    

    602,060 digits. That sounds about right!
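
    And a direct check agrees. This little sketch is my own addition, not part of the original exchange; it converts the bignum to a string and measures its length, with the string conversion again being the slow part:

              > (string-length (number->string (expt 4 1000000)))
              602060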


    Posted by Eugene Wallingford | Permalink | Categories: Computing

    September 21, 2005 8:22 PM

    Two Snippets, Unrelated?

    First... A student related this to me today:

    But after your lecture on mutual recursion yesterday, [another student] commented to me, "Is it wrong to think code is beautiful? Because that's beautiful."

    It certainly isn't wrong to think code is beautiful. Code can be beautiful. Read McCarthy's original Lisp interpreter, written in Lisp itself. Study Knuth's TeX program, or Wirth's Pascal compiler. Live inside a Smalltalk image for a while.

    I love to discover beautiful code. It can be professional code or amateur, open source or closed. I've even seen many beautiful programs written by students, including my own. Sometimes a strong student delivers something beautiful as expected. Sometimes, a student surprises me by writing a beautiful program seemingly beyond his or her means.

    The best programmers strive to write beautiful code. Don't settle for less.

    (What is mutual recursion, you ask? It is a technique used to process mutually-inductive data types. See my paper Roundabout if you'd like to read more.)
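
    Here is a small sketch of the idea, using a toy example of my own rather than the one from class or from the paper. An s-list is either empty or a list whose elements are symbols or s-lists; counting the symbols in one naturally takes two procedures that call each other:

              ;; count-symbols walks the list; count-in-element handles a
              ;; single element, which may itself be an s-list
              (define (count-symbols slist)
                (if (null? slist)
                    0
                    (+ (count-in-element (car slist))
                       (count-symbols (cdr slist)))))

              (define (count-in-element element)
                (if (symbol? element)
                    1
                    (count-symbols element)))

              > (count-symbols '(a (b (c)) d))
              4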

    The student who told me the quote above followed with:

    That says something about the kind of students I'm associating with.

    ... and about the kind of students I have in class. Working as an academic has its advantages.

    Second... While catching up on some blog reading this afternoon, I spent some time at Pragmatic Andy's blog. One of his essays was called What happens when t approaches 0?, where t is the time it takes to write a new application. Andy claims that this is the inevitable trend of our discipline and wonders how it will change the craft of writing software.

    I immediately thought of one answer, one of those unforgettable Kent Beck one-liners. On a panel at OOPSLA 1997 in Atlanta, Kent said:

    As speed of development approaches infinity, reusability becomes irrelevant.

    If you can create a new application in no time flat, you would never worry about reusing yesterday's code!

    ----

    Is there a connection between these two snippets? Because I am teaching a programming languages course this semester, and particularly a unit on functional programming right now, these snippets both call to mind the beauty in Scheme.

    You may not be able to write networking software or graphical user interfaces using standard Scheme "out of the box", but you can capture some elegant patterns in only a few lines of Scheme code. And, because you can express rather complex computations in only a few lines of code, the speed of development in Scheme or any similarly powerful language approaches infinity much faster than does development in Java or C or Ada.
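
    One tiny example of the kind of pattern I mean -- my illustration here, not something from the course: function composition is a three-line idea in Scheme, and it hands you new functions for free. (Some Schemes already provide a compose; defining our own is the point.)

              ;; combine two one-argument functions into a new function
              (define (compose f g)
                (lambda (x) (f (g x))))

              > ((compose car cdr) '(1 2 3))
              2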

    I do enjoy being able to surround myself with the possibility of beauty and infinity each day.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

    September 15, 2005 8:10 PM

    Technology and People in a Flat World

    Technology based on the digital computer and networking has radically changed the world. In fact, it has changed what is possible in such a way that how we do business and entertain ourselves in the future may bear little resemblance to what we do today.

    This will surely come as no surprise to those of you reading this blog. Blogging itself is one manifestation of this radical change, and many bloggers devote much of their blogging to discussing how blogging has changed the world (ad nauseam, it sometimes seems). But even without blogs, we all know that computing has redefined the parameters within which information is created and shared, and defined a new medium of expression that we and the computer-using world have only begun to understand.

    Thomas Friedman

    Last night, I had the opportunity to hear Thomas Friedman, Pulitzer Prize-winning international affairs columnist for the New York Times, speak on the material in his bestseller, The World Is Flat: A Brief History of the Twenty-First Century. Friedman's book tells a popular tale of how computers and networks have made physical distance increasingly irrelevant in today's world.

    Two caveats up front. The first is simple enough: I have not read the book The World Is Flat yet, so my comments here will refer only to the talk Friedman delivered here last night. I am excited by the ideas and would like to think and write about them while they are fresh in my mind.

    The second caveat is a bit touchier. I know that Friedman is a political writer and, as such, carries with him the baggage that comes from at least occasionally advocating a political position in his writing. I have friends who are big fans of his work, and I have friends who are not fans at all. To be honest, I don't know much about his political stance beyond my friends' gross characterizations of him. I do know that he has engendered strong feelings on both sides of the political spectrum. (At least one of his detractors has taken the time to create the Anti-Thomas Friedman Page -- more on that later.) I have steadfastly avoided discussing political issues in this blog, preferring to focus on technical issues, with occasional drift into the cultural effects of technology. This entry will not be an exception. Here, I will limit my comments to the story behind the writing of the book and to the technological arguments made by Friedman.

    On a personal note, I learned that, like me, Friedman is from the American Midwest. He was born in Minneapolis, married a girl from Marshalltown, Iowa, and wrote his first op-ed piece for the Des Moines Register.

    The idea to write The World Is Flat came about as a side effect of research Friedman was doing for another project, a documentary on off-shoring. He was interviewing Narayana Murthy, chairman of the board at Infosys, "the Microsoft of India", when Murthy said, "The global economic playing field is being leveled -- and you Americans are not ready for it." Friedman felt as if he had been sideswiped, because he considers himself well-studied in modern economics and politics, and he didn't know what Murthy meant by "the global economic playing field is being leveled" or how we Americans were so glaringly unprepared.

    As writers often do, Friedman set out to write a book on the topic in order to learn it. He studied Bangalore, the renowned center of the off-shored American computing industry. Then he studied Dalian, China, the Bangalore of Japan. Until last night, I didn't even know such a place existed. Dalian plays the familiar role. It is a city of over a million people, many of whom speak Japanese and whose children are now required to learn Japanese in school. They operate call centers, manage supply chains, and write software for Japanese companies -- all jobs that used to be done in Japan by Japanese.

    Clearly the phenomenon of off-shoring is not US-centric. Other economies are vulnerable. What is the dynamic at play?

    Friedman argues that we are in a third era of globalization. The first, which he kitschily calls Globalization 1.0, ran from roughly 1492, when Europe began its imperial expansion across the globe, to the early 1800s. In this era, the agent of globalization was the country. Countries expanded their land holdings and grew their economies by reaching overseas. The second era ran from the early 1800s until roughly 2000. (Friedman chose this number as a literary device, I think... 1989 or 1995 would have made better symbolic endpoints.) In this era, the corporation was the primary agent of globalization. Companies such as the British East India Company reached around the globe to do commerce, carrying with them culture and politics and customs.

    We are now in Friedman's third era, Globalization 3.0. Now, the agent of change is the individual. Technology has empowered individual persons to reach across national and continental boundaries, to interact with people of other nationalities, cultures, and faiths, and to perform commercial, cultural, and creative transactions independent of their employers or nations.

    Blogging is, again, a great example of this phenomenon. My blog offers me a way to create a "brand identity" independent of any organization. (Hosting it at www.eugenewallingford.com would sharpen the distinction!) I am able to engage in intellectual and even commercial discourse with folks around the world in much the same way I do with my colleagues here at the university. In the last hour, my blog has been accessed by readers in Europe, Australia, South America, Canada, and in all corners of the United States. Writers have always had this opportunity, but at glacial rates of exchange. Now, anyone with a public library card can blog to the world.

    Technology -- specifically networking and the digital computer -- has made Globalization 3.0 possible. Friedman breaks our current era into a sequence of phases characterized by particular advances or realizations. The specifics of his history of technology are sometimes arbitrary, but at the coarsest level he is mostly on the mark:

    1. 11/09/89 - The Berlin Wall fell, symbolically opening the door for East and West to communicate. Within a few months, Windows 3.0 shipped, and the new accessibility of the personal computer made it possible for all of us to be "digital authors".

    2. 08/09/95 - Netscape went public. The investment craze of its IPO presaged the dot-com boom, and the resultant investment in network technology companies supplied the capital that wired the world, connecting everyone to everyone else.

    3. mid 1990s - The technology world began to move toward standards for data interchange and software connectivity. This standards movement resulted in what Friedman calls a "collaboration platform", on which new ways of working together can be built.

    These three phases have been followed in rapid succession by a number of faster-moving realizations on top of the collaboration platform:

    1. outsourcing tasks from one company to another

    2. offshoring tasks from one country to another

    3. uploading of digital content by individuals

    4. supply chaining to maximize the value of offshoring and outsourcing by carefully managing the flow of goods and services at the unit level

    5. insourcing of other companies into the public interface of a company's commercial transactions

    6. informing oneself via search of global networks

    7. mobility (my term) of data and means of communication

    Uploading is the phase where blogs entered the picture. But there is so much more. Before blogs came open source software, in which individual programmers can change their software platform -- and share their advances with others by uploading code into a common repository. And before open source became popular, we had the web itself. If Mark Rupert objects to what he considers Thomas Friedman's "repeated ridicule" of those opposed to globalization, then he can create a web page to make his case. Early on, academics had an edge in creating web content, but the advance of computing hardware and now software has made it possible for anyone to publish content. The blogging culture has made it even easier to participate in wider debate (though, as discussions of the "long tail" have shown, that effect may be dying off as the space of blogs grows beyond what is manageable by a single person).

    Friedman's description of insourcing sounded a lot like outsourcing to me, so I may need to read his book to fully get it. He used UPS and FedEx as examples of companies that do outsourced work for other corporations, but whose reach extends deeper into the core functions of the outsourcing company, intermingling in a way that sometimes makes the two companies' identities indistinguishable to the outside viewer.

    The quintessential example of informing is, of course, Google, which has made more information more immediately accessible to more people than any entity in history. It seems inevitable that, with time, more and more content will become available on-line. The interesting technical question is how to search effectively in databases that are so large and heterogeneous. Friedman explains well to his mostly non-technical audience that we are at just the beginning of our understanding of search. Google isn't the only player in this field, obviously, as Yahoo!, Microsoft, and a host of other research groups are exploring this problem space. I hold out hope that techniques from artificial intelligence will play an increasing role in this domain. If you are interested in Internet search, I suggest that you read Jeremy Zawodny's blog.

    Friedman did not have a good name for the most recent realization atop his collaboration platform, referring to it as all of the above "on steroids". To me, we are in the stage of realizing the mobility and pervasiveness of digital data and devices. Cell phones are everywhere, and usually in use by the holder. Do university students ever hang up? (What a quaint anachronism that is...) Add to this numerous other technologies such as wireless networks, voice over internet, bluetooth devices, ... and you have a time in which people are never without access to their data or their collaborators. Cyberspace isn't "out there" any more. It is wherever you are.

    These seven stages of collaboration have, in Friedman's view, engendered a global communication convergence, at the nexus of which commerce, economics, education, and governance have been revolutionized. This convergence is really an ongoing conversion of an old paradigm into a new one. Equally important are two other convergences in process. One he calls "horizontaling ourselves", in which individuals stop thinking in terms of what they create and start thinking in terms of who they collaborate with, of what ideas they connect to. The other is the one that ought to scare us Westerners who have grown comfortable in our economic hegemony: the opening of India, China, and the former Soviet Union, and 3 billion new players walking onto a level economic playing field.

    Even if we adapt to all of the changes wrought by our own technologies and become well-suited to compete in the new marketplace, the sheer number of our competitors will increase so significantly that the market will be a much starker place.

    Friedman told a little story later in the evening that illustrates this point quite nicely. I think he attributed the anecdote to Bill Gates. Thirty years ago, would you prefer to have been born a B student in Poughkeepsie, or a genius in Beijing or Bangalore? Easy: a B student in Poughkeepsie. Your opportunities were immensely wider and more promising. Today? Forget it. The soft B student from Poughkeepsie will be eaten alive by a bright and industrious Indian or Chinese entrepreneur.

    Or, in other words from Friedman, remember: In China, if you are "1 in a million", then there are 1300 people just like you.

    All of these changes will take time, as we build the physical and human infrastructure we need to capitalize fully on new opportunities. The same thing happened when we discovered electricity. The same thing happened when Gutenberg invented the printing press. But change will happen faster now, in large part due to the power of the very technology we are harnessing, computing.

    Gutenberg and the printing press. Compared to the computing revolution. Where have we heard this before? Alan Kay has been saying much the same thing, though mostly to a technical audience, for over 30 years! I was saddened to think that nearly everyone in the audience last night thinks that Friedman is the first person to tell this story, but gladdened that maybe now more people will understand the momentous weight of the change that the world is undergoing as we live. Intellectual middlemen such as Friedman still have a valuable role to play in this world.

    As Carly Fiorina (who was recently Alan's boss at Hewlett-Packard before both were let go in a mind-numbing purge) said, "The 'IT revolution' was only a warm-up act." Who was it that said, "The computer revolution hasn't happened yet."?

    The question-and-answer session that followed Friedman's talk produced a couple of good stories, most of which strayed into policy and politics. One dealt with a topic close to this blog's purpose, teaching and learning. As you might imagine, Friedman strongly suggests education as an essential part of preparing to compete in a flat world, in particular the ability to "learn how to learn". He told us of a recent speaking engagement at which an ambitious 9th grader asked him, "Okay, great. What class do I take to learn how to learn?" His answer may be incomplete, but it was very good advice indeed: Ask all your friends who the best teachers are, and then take their courses -- whatever they teach. The content of the course really doesn't matter; what matters is to work with teachers who love their material, who love to teach, who themselves love to learn.

    As a teacher, I think one of the highest forms of praise I can get from a student is to be told that they want to take whatever course I am teaching the next semester. It may not be in their area of concentration, or in the hot topic du jour, but they want to learn with me. When a student tells me this -- however rare that may be -- I know that I have communicated something of my love for knowledge and learning and mastery to at least one student. And I know that the student will gain just as much in my course as they would have in Buzzword 401.

    We in science, engineering, and technology may benefit from Friedman's book reaching such a wide audience. He encourages a focus not merely on education but specifically on education in engineering and the sciences. Any American who has done a Ph.D. in computer science knows that CS graduate students in this country are largely from India and the Far East. These folks are bright, industrious, interesting people, many of whom are now choosing to return to their home countries upon completion of their degrees. They become part of the technical cadre that helps to develop competitors in the flat world.

    As I listened last night, Chad Fowler's new book My Job Went to India came to mind. This is another book I haven't read yet, but I've read a lot about it on the web. My impression is that Chad looks at off-shoring not as a reason to whine about bad fortune but as an opportunity to recognize our need to improve our skills for participating in today's marketplace. We need to sharpen our technical skills but also develop our communication skills, the soft skills that enable and facilitate collaboration at a level higher than uploading a patch to our favorite open source project. Friedman, too, looks at the other side of off-shoring, to the folks in Bangalore who are working hard to become valuable contributors in a world redefined by technology. It may be easy to blame American CEOs for greed, but that ignores the fact that the world is changing right before us. It also does nothing to solve the problem.

    All in all, I found Friedman to be an engaging speaker who gave a well-crafted talk full of entertaining stories but with substance throughout. I can't recommend his book yet, but I can recommend that you go to hear him speak if you have the opportunity.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    September 01, 2005 10:44 PM

    Back to Scheme in the Classroom

    In addition to my department head duties, I am also teaching one of my favorite courses this semester, Programming Languages and Paradigms. The first part of the course introduces students to functional programming in Scheme.

    I've noticed an interesting shift in student mentality about programming languages since I first taught this course eight or ten years ago. There are always students whose interest is in learning only those languages that will be of immediate use to them in their professional careers. For them, Scheme seems at best a distraction and at worst a waste of time. I feel sorry for such folks, because they miss out on a lot of beauty with their eyes so narrowly focused on careers. Even worse, they miss out on a chance to learn ideas that may well show up in the languages they will find themselves using ... such as Python and Ruby.

    But increasingly I encounter students who are much more receptive to what Scheme can teach them. It seems that this shift has paralleled the rise of scripting languages. As students have come to use Perl, Python, and PHP for web pages and for hacking around with Linux, they have come to see the power of what they call "sloppy" languages -- languages that don't have a lot of overhead, especially as regards static typing, that let them make crazy mistakes, but that also empower them to say and do a lot in only a few lines of code. After their experiences in our introductory courses, where students learn Java and Ada, they feel as if they have been freed when using Perl, Python, and PHP -- and, yes, sometimes even Scheme. Now, Scheme is hardly a "sloppy" language in the true sense of the word, and its strange parenthetical syntax is unforgiving. But it gives them power to say things that were inconvenient or darn near impossible in Java or Ada, in very little code. Students also come to appreciate Mumps, which they learn in our Information Storage and Retrieval course, for much the same reason.

    I'm looking forward to the rest of this course, both for its Scheme and for its programming languages content. What great material for a computer science student to learn. With any luck, they come to know that languages and compilers aren't magic, though sometimes they seem to do magic. But we computer scientists know the incantations that make them dance.

    Speaking of parentheses... Teaching Scheme again reminds me of just how compact Scheme's list notation can be. In class today, we discussed lists, quotation, and the common form of programs and data. Quoted lists of symbols carry all of the structural information we need to reason about many kinds of data. Contrast that with the verbosity of, oh, say, XML. (Do you remember this post from the past?) Just last month Brian Marick spoke a Lisp lover's lament on an agile testing mailing list. Someone had said:

    Note to Brian: Explain that XML is not the be-all and end-all of manual data formats. Then explain it's the easiest place to start, modulo your tool quality.

    To which Brian replied:

    I'll use XML instead of YAML as my concession to reality and penance for not talking about Perl. It will be hard not to launch into my old Lisp programmer's rant about how all the people who thought writing programs in tree structures with parentheses was unreadable now think writing data in tree structures with angle brackets and keyword arguments and quotes is somehow just the cat's pajamas.

    (YAML is a lightweight, easy-to-read mark-up language for use with scripting languages. Brian might also bow to reality and not use LAML, a Lispish mark-up language for the web implemented in Scheme.)

    Ceci n'est pas une pipe

    Speaking of XML ... Here is my favorite bit of Angle Bracket Speak from the last month or so, courtesy of James Tauber:

    <pipe>Ceci n'est pas une pipe</pipe>
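
    For comparison, here is my rendering of the same datum as a quoted Scheme list:

              > '(pipe "Ceci n'est pas une pipe")
              (pipe "Ceci n'est pas une pipe")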

    So that's how I'm feeling about my course right now. Any more, when someone asks me how my class went today, I feel like the guy answering in this cartoon:

    Was it good for you? Read my blog.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    August 10, 2005 3:17 PM

    IAWTP -- More on Sharing the Thrill

    Not too long ago, I wrote about an opinion piece by Sanjeev Arora and Bernard Chazelle that, in part, decried the lack of good story-telling by computer scientists. Not the stories we tell each other, because there are plenty of those and many are wonderful. What's missing are stories we tell the rest of the world about just how thrilling our discipline is and can be.

    A recent post at Ernie's 3D Pancakes returns to this theme of story-telling. The article begins as a discussion on Bill Gates' much-discussed interview with Maria Klawe. Toward the end, though, Ernie gets to what for me was his killer point. First, he catalogs the most common stories that non-CS folks hear and tell about computing:

    • Dammit! This Thing Never Works!
    • Be Afraid! Be Very Afraid!
    • Revenge of the Nerds
    • How to Succeed in Business Without Really Trying

    When people do hear or tell the "Oh Wow! This Is So Cool!" story, he writes, it's usually just a cover for one of the other story lines.

    And here is the paragraph that we computer scientists should wake up and recite each day:

    This lack of stories is an endless source of frustration for those of us who say "Oh Wow!" every day. We see power and beauty in computer science, even while we rage against the limitations of the technology that grows out of it. We see our field not (just) as a way to make boxes that beep, but as a fundamentally new way of thinking about the world. We are craftsmen, taking great satisfaction in the structures we build. We drag abstractions kicking and screaming from Plato's cave, and we make them real. We are explorers, proud of our hard-won discoveries but humbled by the depth of our ignorance. We have changed the world, utterly and irreversibly. Our influence on your daily life may be less immediate than the influence of doctors, lawyers, politicians, bankers, and soldiers, but it is no less profound. And we are just barely getting started.

    Oooh. Drama.

    More importantly, we need to find entertaining and compelling ways to tell this story to our friends and the rest of the world -- and, yes, to our sons and our daughters.


    Posted by Eugene Wallingford | Permalink | Categories: Computing

    August 09, 2005 4:00 PM

    The Academic Future of Agile Methods

    After a week impersonating one of the Two Men and a Truck guys, I'm finally back to reading a bit. Brian Marick wrote several articles in the last week that caught my attention, and I'd like to comment on two now.

    The first talked about why Brian thinks the agile movement in software development is akin to the British cybernetics movement that began in the late 1940's. He points out three key similarities in the two:

    • a preference for performance (e.g., writing software) over representation (e.g., writing requirements documents)
    • constant adaptation to the environment as a means for achieving whole function
    • a fondness for surprises that arise when we build and play with complex artifacts

    I don't know much about British cybernetics, but I'm intrigued by the connections that Brian has drawn, especially when he says, "What I wanted to talk about is the fact that cybernetics fizzled. If we share its approaches, might we also share its fatal flaws?"

    My interest in this seemingly abstract connection would not surprise my Ph.D. advisor or any of the folks who knew me back in my grad school days -- especially my favorite philosophy professor. My research was in the area of knowledge-based systems, which naturally took me into the related areas of cognitive psychology and epistemology. My work with Dr. Hall led me to the American pragmatists -- primarily C. S. Peirce, William James, and John Dewey. I argued that the epistemology of the pragmatists, driven as it was by the instrumental value of knowledge for changing how we behave in particular contexts, was the most appropriate model for AI scientists to build upon, rather than the mathematical logic that dominates most of AI. My doctoral work on reasoning about legal arguments drew heavily on the pragmatic logic of Stephen Toulmin (whose book The Uses of Argument I strongly recommend, by the way).

    My interest in the connection between AI and pragmatic epistemology grew from a class paper into a proposed chapter in my dissertation. For a variety of reasons the chapter never made it into my dissertation, but my interest remains strong. While going through files as a part of my move last week, I came across my folder of drafts and notes on this. I would love to make time to write this up in a more complete form...

    Brian's second article gave up -- only temporarily, I hope -- on discussing how flaws in the agile movement threaten its advancement, but he did offer two suggestions for how agile folks might better ensure the long-term survival and effect of their work: produce a seminal undergraduate-level textbook and "take over a computer science department". Just how would these accomplish the goal?

    It's hard to overestimate the value of a great textbook, especially the one that reshapes how folks think about an area. I've written often about the ebbs and flows of the first course in CS and, while much of the history of CS1 can be told by tracing the changes in programming language used, perhaps more can be told by tracing the textbooks that changed CS1. I can think of several off-hand, most notably Dan McCracken's Fortran IV text and Nell Dale's Pascal text. The C++ hegemony in CS 1 didn't last long, and that may be due to the fact that no C++-based book ever caught fire with everyone. I think Rick Mercer's Computing Fundamentals with C++ made it possible for a lot of instructors and schools to teach a "soft" object-oriented form of OOP in C++. Personally, I don't think we have seen the great Java-in-CS1 book yet, though I'm sure that the small army of authors who have written Java-in-CS1 books may think differently.

    Even for languages and approaches that will never dominate CS1, a great textbook can be a defining landmark. As far as OOP in CS1 goes, I think that Conner, Niguidula, and van Dam's Object-Oriented Programming in Pascal is still the benchmark. More recently, Felleisen et al.'s How to Design Programs stakes a major claim for how to teach introductory programming in a new way. Its approach is very different from traditional CS1 pedagogy, though, and it hasn't had a galvanizing effect on the world yet.

    An agile software engineering text could allow us agile folks to teach software engineering in a new and provocative way. Many of us are teaching such courses already when we can, often in the face of opposition from the "traditional" software engineers in our departments. (When I taught my course last fall, the software engineering faculty argued strongly that the course should not count as a software engineering course at all!) I know of only one agile software engineering text out there -- Steinberg and Palmer's Extreme Software Engineering -- but it is not positioned as the SE-complete text that Brian envisions.

    Closer to my own world, of course, is the need for a great patterns-oriented CS1 book of the sort some of us have been working on for a while. Such a text would almost certainly be more agile than the traditional CS1 text and so could provide a nice entry point for students to experience the benefits of an agile approach. We just haven't been able to put our money where our mouths are -- yet.

    On Brian's three notes:

    1. Using Refactoring and Test-Driven Development and various other readings can work well enough for an agile development course, but the need for a single text is still evident. First, having scattered materials is too much work for the more casual instructor charged with teaching "the agile course". Second, even together they do not provide the holistic view of software engineering required if a single text is to convince CS faculty that it suffices for an introductory SE course.

    2. Yes and yes. Alternative forms of education such as apprenticeship may well change how we do some of our undergraduate curriculum, but no one should bet the survival of agile methods on the broad adoption of radically different teaching methods or curricula in the university. We are, as a whole, a conservative lot.

      That doesn't mean that some of us aren't trying. I'm chairing OOPSLA's Educators' Symposium again this year, and we are leading off our day with Dave West and Pam Rostal's Apprenticeship Agility in Academia, which promises a firestorm of thinking about how to teach CS -- and software development and agility and ... -- differently.

    3. I have used Bill Wake's Refactoring Workbook as a source of lab exercises for my students. It is a great resource, as is Bill's website. But it isn't a software engineering textbook.

    Why "take over a computer science department"? To create a critical mass of agile-leaning faculty who can support one another in restructuring curricula, developing courses, writing textbook, experimenting with teaching methods, and thinking Big Thoughts. Being one among nine or 15 or 25 on a faculty means a lot of hard work selling a new idea and a lot of time isolated from the daily conversations that help new ideas to form and grow. OOPSLA and Agile 200x and SIGCSE only come once a year, after all. And Cedar Falls, Iowa, is far from everywhere when I need to have a conversation on agile software development right now. So is Raleigh, North Carolina, for that matter, when Laurie Williams could really use the sort of interaction that the MIT AI Lab has been offering AI scientists for 40 years.

    Accomplishing this takeover is an exercise left to the reader. It is a slow process, if even possible. But it can be done, when strong leaders of departments and colleges set their minds and resources to doing the job. It also requires a dose of luck.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    July 26, 2005 10:38 AM

    Computer Science and Liberal Education

    University-level instructors of mathematics and science have become increasingly concerned about the level of preparation for their disciplines that students receive in elementary and high school. For example, Tall, Dark, and Mysterious and Learning Curves both describe life in the trenches teaching calculus and other entry-level math courses at the university. We in computer science see this problem manifest in two forms. We, too, encounter students who are academically unprepared for the rigors of computing courses. However, unlike general education math and science, CS courses are usually not a required part of the students' curriculum. As a result, we begin to lose students as they find computing too difficult, or at least difficult enough that it's not as much fun as they had hoped. I believe that this is one of the ingredients in the decline of enrollment in computing majors and minors being experienced by so many universities.

    Yesterday, via Uncertain Principles, I encountered Matthew Crawford's essay, Science Education and Liberal Education. Crawford writes beautifully on the essential role of science in the liberal education, the essential role of liberal education in democracy, and the vital importance of teaching science -- and presumably mathematics -- as worthy of study in their own right, independent of their utilitarian value to society in the form of technology. A significant portion of the essay catalogs the shortcomings of the typical high school physics textbook, with its pandering to application and cultural relevance. I shall not attempt to repeat the essay's content here. Go read it for yourself, and enjoy the prose of a person who both understands science and knows how to write with force.

    Notwithstanding Crawford's apparent assertion that the pure sciences are noble and that computing is somehow base technology, I think that what he says about physics is equally true of computing. Perhaps our problem is that we in computer science too often allow our discipline to be presented merely as technology and not as an intellectual discipline of depth and beauty. As I've written before, we need to share the thrill that computing brings us -- not the minutiae of programming or the features of the latest operating system or chip set. Surely, these things are important to us, but they are important to us because we already love computing for its depth and beauty.

    Just yesterday, I commented dejectedly in e-mail to a colleague that only twelve incoming freshmen had declared majors in computing during this summer's orientation sessions at our university. He wrote back:

    I think we need to move away from presenting the CS major as a path to being a software drone in competition with India. We need to present it as leading edge, discovery based -- that is -- a science. I think too many students now see it as a software engineering nightmare -- a 40 year career of carefully punctuated cookbook code. Too boring for words.

    Perhaps it is time for us to shift our mode of thought in computer science education toward the model of liberal education à la the sciences. We might find that the number of students interested in the discipline will rise if we proudly teach computing as an intellectual discipline. In any case, I suspect that the level of interest and commitment of the students who do study computing will rise when we challenge them and address their need to do something intellectually worthwhile with their time and energy.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    July 23, 2005 4:45 PM

    Dog Days of Summer

    Busy, busy, busy.

    As I mentioned in my anniversary post, the more interesting thoughts I have, the more I tend to blog. The last few weeks have been more about the clutter of meetings and preparing for some new tasks than about interesting thoughts. That's a little sad, but true. I did manage to spend a little more time at home with my wife this week, while my daughters were away at camp. That isn't sad at all.

    OOPSLA 2005 logo

    I have been working a bit on the Educators' Symposium for OOPSLA 2005. My program committee and I are working on a panel session to close the symposium, one that we hope will spark the minds of attendees as they head out into the conference proper. The rough theme draws on what some of us see as a sea change in computing. Without a corresponding change in CS education, we may doom ourselves to a future in which biologists, economists, chemists, political scientists, and most everyone else teach courses that involve computers and modeling and simulation -- and we will teach only theory and perceived esoterica to a small but hardy minority of students. Maybe that is where CS education should go, but if so I'd rather go there because we intend to, not because we all behave as if we were Chance the Gardener from Being There. During our discussion of this panel, members of my program committee directed me to two classics I had not read in a while, Edsger Dijkstra's On the Cruelty of Really Teaching Computer Science and Tony Hoare's 1980 Turing Award lecture, The Emperor's Old Clothes. Reading these gems again will likely get my mind moving.

    An unusual note regarding the Educators' Symposium... For many years now, OOPSLA's primary sponsor -- ACM's Special Interest Group on Programming Languages -- has offered scholarships for educators to attend the conference and the Educators' Symposium. A few years ago, when OOP was especially hot, the symposium offered in the neighborhood of fifty scholarships, and the number of applicants was larger. This year, we have received only nineteen applications for scholarships. Is OOP now so mainstream that educators don't feel they need to learn any more about it or how to teach it? Did I not advertise the availability of scholarships widely enough? As an OOP educator with some experience, I can honestly say that I have a lot yet to learn about OOP and how to teach it effectively. I think we are only scratching the surface of what is possible. I wonder why more educators haven't taken advantage of the opportunity to apply for a great deal to come to OOPSLA. If nothing else, a few days in San Diego is worth the time of applying!

    I have had opportunities to encounter some interesting CS thoughts the last few weeks, through meetings with grad students. But I've had a couple of weeks off from those as well. Maybe that's just as well... my mind may have been wandering a bit. Perhaps that would explain why one of my M.S. students sent me this comic:

    Ph.D. Comics, 05/28/05 -- Meeting of the Minds


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    July 15, 2005 4:41 PM

    Think Big!

    A couple of quotes about thinking big -- one serious, and one just perfect for the end of a Friday:

    Think Global, Act Local

    Grady Booch writes about the much-talked-about 125 hard questions that science faces, posed by the American Association for the Advancement of Science on the occasion of its 125th anniversary:

    Science advances daily through the cumulative efforts of millions of individuals who investigate the edges of existing knowledge, but these sort of questions help direct the global trajectory of all that local work.

    At the end of his entry, Grady asks a great question: What are the grand challenges facing us in computer science and software development, the equivalent of Hilbert's problems in mathematics?

    When Is Enough Enough?

    Elizabeth Keogh writes about reality for many readers:

    If you think you're going to finish reading all those books you bought, you need more books.

    I'm a book borrower more than a book buyer, but if I substitute "borrowed" for "bought", this quote fits me. Every time I read a book, I seem to check out three more from the library. I recently finished Glenway Wescott's "The Pilgrim Hawk", and I'm about to finish James Surowiecki's "The Wisdom of Crowds". Next up: Leonard Koren's "13 Books". (I'm a slow reader and can't keep up with all the recommendations I receive from friends and colleagues and blogs and ...!)


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    July 08, 2005 2:35 PM

    Breaking in a New iBook

    Daring Fireball recently ran a piece on several issues in the Apple world, including the recent streamlining of the iPod line:

    This emphasis on a simplified product lineup has been a hallmark of the Jobs 2.0 Administration. For the most part, given a budget and a use case, it's pretty easy to decide which Mac or which iPod to buy. (The hardest call to make, in my opinion, is between the iBooks and 12" PowerBook.)

    I agree with John's assessment of the last tough choice -- I recently agonized over the iBook versus PowerBook choice, focusing for budget reasons on the crossover point from iBook to PowerBook. In the end, I think it was more pride than anything else keeping me on the fence. I love my old G3 clamshell PowerBook and like the look of the titanium PowerBooks. But given my real needs and budget, the iBook was the right choice. One potential benefit of going with the simpler, lower-cost alternative for now is that it postpones a more substantial purchase until the shift to Intel-based processors is complete. My next PowerBook can be one from The Next Generation.

    My new iBook arrived last week. I've been having great fun playing with OS X 10.4... My old PowerBook is still running Jaguar, so I have been missing out on the many cool things available only on Panther. I'm only just now scratching the surface of Dashboard and Exposé and the like, but they feel great. My only minor regret at this point is going with the 30GB drive. I'm already down to 13 gig free, and I haven't done much more than basic set up and my basic set of apps. In reality, that's still plenty of space for me. I don't store lots of video and music on my laptop yet, and my data will fit comfortably in 10 gig. If I do need more space, I can just pick up an external drive.

    While setting up this machine, it really struck me how much of my Mac experience now is bundled up with Unix. In the old days, I set up Macs by dragging StuffIt archives around and creating folders; I spent a few minutes with control panels, but not all that much. Setting up OS X, I spend almost all of my time in a terminal shell, with occasional forays out to System Preferences. This machine switch may be more Unix-heavy than usual, because I've decided to follow OS X from tcsh to bash. Rewriting config files and hacking scripts is fun but time consuming.

    Of course, this change pales next to the switch I made when I went to grad school. As an undergrad, I became a rather accomplished VMS hacker on an old cluster of DEC Vaxes. When I got to my graduate program, there wasn't a Vax to be seen. Windows machines were everywhere, but the main currency was Unix, so I set out to master it.

    Another thing that struck me this week is how much of my on-line identity is bundled up in my Unix username. "I am wallingf." That has been my username since my first Unix account in the fall of 1986, and I've kept it on all subsequent Unix machines and whenever possible elsewhere. At least I know I'm not the only one who feels this way. Last year as we prepared for the Extravagria workshop at OOPSLA 2004, Dick Gabriel wrote that rpg is the

    Login name for RichardGabriel. I have used this login since 1973 and resent ISPs and organizations that don't allow me to use it.

    Anyway, my iBook now knows me as wallingf. I guess I should give her a name, too.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    June 29, 2005 2:12 PM

    Open File Formats Only, Please

    After being informed repeatedly that I will need either a Windows box or Office for the Mac in order to be a department head, due to the sheer volume of Word and Excel documents that flow through the university hierarchy, I found hope in this piece of news. If Norway can do it, why not the State of Iowa or my university?

    I once read a letter to the editor of our local paper calling for the state to adopt an open-formats-only policy, as a matter of fairness to competing businesses and to citizens. Alas, for most people the immediate benefits of standardizing on the dominant formats outweigh the long-term economic effects. Philosophical issues rarely enter the discussion at all. I suspect that it's easier for this sort of edict to originate at the top of a heap, because then groups lower in the hierarchy face a consideration that trumps standardization.

    In any case, I think that I will get by just fine, at least for now. I could use OpenOffice, if I feel like running X-windows. But my preferred tool these days is NeoOffice/J, which is getting more solid with every release. (It's finally out of beta.) While it's not a native OS X app -- it looks and feels like a Windows port -- it does all I need right now for handling files in Office formats.

    I have spent many, many years gently encouraging department heads not to send me Word files, offering alternatives that were as effortless as possible. I'm not sure if I'm ready to take on higher levels of university administration. But I do feel some obligation to lead by teaching on the issue of open standards in computing.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Managing and Leading

    June 29, 2005 8:40 AM

    Learning from the Masters

    This spring I was asked to participate on a panel at XP2005, which recently wrapped up in the UK. This panel was on agile practices in education, and as you may guess I would have enjoyed sharing some of my ideas and learning from the other panelists and from the audience. Besides, I've not yet been to any of the agile software development conferences, and this seemed like a great opportunity. Unfortunately, work and family duties kept me home for what is turning out to be a mostly at-home summer.

    In lieu of attending XP2005, I've enjoyed reading blog reports of the goings-on. One of the highlights seems to have been Laurent Bossavit's Coding Dojo workshop. I can't say that I'm surprised. I've been reading Laurent's blog, Incipient(thoughts), for a while and exchanging occasional e-mail messages with him about software development, teaching, and learning. He has some neat ideas about learning how to develop software through communal practice and reflection, and he is putting those ideas into practice with his dojo.

    The Coding Dojo workshop inspired Uncle Bob to write about the notion of repeated practice of simple exercises. Practice has long been a theme of my blog, going back to one of my earliest posts. In particular, I have written several times about relatively small exercises that Joe Bergin and I call etudes, after the compositions that musicians practice for their own sake, to develop technical skills. The same idea shows up in an even more obviously physical metaphor in Pragmatic Dave's programming katas.

    The kata metaphor reminds us of the importance of repetition. As Dave wrote in another essay, students of the martial arts repeat basic sequences of moves for hours on end. After mastering these artificial sequences, the students move on to "kumite", or organized sparring under the supervision of a master. Kumite gives the student an opportunity to assemble sequences of basic moves into sequences that are meaningful in combat.

    Repeating programming etudes can offer a similar experience to the student programmer. My re-reading of Dave's article has me thinking about the value of creating programming etudes at two levels, one that exercises "basic moves" and one that gives the student an opportunity to assemble sequences of basic moves in the context of a more open-ended problem.

    But the pearl in my post-XP2005 reading hasn't been so much the katas or etudes themselves, but one of the ideas embedded in their practice: the act of emulating a master. The martial arts student imitates a master in the kata sequences; the piano student imitates a master in playing Chopin's etudes. The practice of emulating a master as a means to developing technical proficiency is ubiquitous in the art world. Renaissance painters learned their skills by emulating the masters to whom they were apprenticed. Writers often advise novices to imitate the voice or style of a writer they admire as a way to internalize what it means to have a voice or follow a style. Rather than creating a mindless copycat, this practice allows the student to develop her own voice, to find or develop a style that suits her unique talents. Emulating the master constrains the student, which frees her to focus on the elements of the craft without the burden of speaking in her own voice or being labeled as "derivative".

    Uncle Bob writes of how this idea means just as much in the abstract world of software design:

    Michael Feathers has long pondered the concept of "Design Sense". Good designers have a "sense" for design. They can convert a set of requirements into a design with little or no effort. It's as though their minds were wired to translate requirements to design. They can "feel" when a design is good or bad. They somehow intrinsically know which design alternatives to take at which point.

    Perhaps the best way to acquire "Design Sense" is to find someone who has it, put your fingers on top of theirs, put your eyeballs right behind theirs, and follow along as they design something. Learning a kata may be one way of accomplishing this.

    Watching someone solve a kata in a workshop can give you this sense. Participating in a workshop with a master, perhaps as programming partner, perhaps as supervisor, can, too.

    The idea isn't limited to software design. Emulating a master is a great way to learn a new programming language. About a month ago, someone on the domain-driven design mailing list asked about learning a new language:

    So assuming someone did want to learn to think differently, what would you go with? Ruby, Python, Smalltalk?

    Ralph Johnson's answer echoed the importance of working with a master:

    I prefer Smalltalk. But it doesn't matter what I prefer. You should choose a language based on who is around you. Do you know somebody who is a fan of one of these languages? Could you talk regularly with this person? Better yet, could you do a project with this person?

    By far the best way to learn a language is to work with an expert in it. You should pick a language based on people who you know. One expert is all it takes, but you need one.

    The best situation is where you work regularly with the expert on a project using the language, even if it is only every Thursday night. It would be almost as good if you would work on the project on your own but bring code samples to the expert when you have lunch twice a week.

    It is possible to learn a language on your own, but it takes a long time to learn the spirit of a language unless you interact with experts.

    Smalltalk or Scheme may be the best in some objective (or entirely subjective!) sense, but unless you can work with an expert... it may not be the right language for you, at least right now.

    As a student programmer -- and aren't we all? -- find a person to whom you can "apprentice" yourself. Work on projects with your "master", and emulate his style. Imitate not only high-level design style but also those little habits that seem idiosyncratic and unimportant: name your files and variables in the same way; start your programming sessions with the same rituals. You don't have to retain all of these habits forever, and you almost certainly won't. But in emulating the master you will learn and internalize patterns of practice, patterns of thinking, and, yes, patterns of design and programming. You'll internalize them through repetition in the context of real problems and real programs, which give the patterns the richness and connectedness that make them valuable.

    After lots of practice, you can begin to reflect on what you've learned and to create your own style and habits. In emulating a master first, though, you will have a chance to see deeper into the mind and actions of someone who understands, and to use what you see to begin to understand yourself better, without the pressure of needing to have a style of your own yet.

    If you are a computer scientist rather than a programmer, you can do much the same thing. Grad students have been doing this as long as there have been grad students. But in these days of the open-source software revolution, any programmer with a desire to learn has ample opportunity to go beyond the Powerbook on his desk. Join an open-source project and interact with a community of experts and learners -- and their code.

    And we still have open to us a more traditional avenue, in even greater abundance: literature. Seek out a writer whose books and articles can serve in an expert's stead. Knuth, Floyd, Beck, Fowler... the list goes on and on. All can teach you through their prose and their code.

    Knowing and doing go hand in hand. Emulating the masters is an essential part of the path.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

    June 22, 2005 10:25 AM

    Sharing the Thrill

    The August 2005 issue of Communications of the ACM will contain an opinion piece entitled "The Thrill Is Gone?" (pdf) by Sanjeev Arora and Bernard Chazelle, who are computer scientists at Princeton University. This article suggests that we in computing have done ourselves and the world a disservice in failing to communicate effectively the thrill of computer science -- not technology -- to the general public. On their view, this failure probably plays a role in our declining undergraduate enrollments, declining research funding, and a general lack of appreciation for the importance of computer science among the public. People, especially young people, should be excited by computing, but instead they are unmoved.

    As I read Arora and Chazelle's piece, I recalled some of the recent themes in my writing here, including the irony that this is the most exciting time to study CS ever, the love of computing planted by Gödel, Escher, Bach, and the need for a CS administrator to be a teacher to a broader audience. But this article makes a larger point.

    Arora and Chazelle do a very nice job of pointing out that the story of how computer science has shaped the technologies we all use can and should be told to the outside world. Fundamental ideas such as P-versus-NP, cryptography, and quantum computing are accessible to folks with a high school education or less when described in a way that strips away unnecessary complexity. The effect of computing on other disciplines such as physics, biology, and economics relies more on computer science ideas than on advanced engineering of hardware, but most people have no clue. How could they? We've never told them.

    Reading this piece now has heightened my resolve to do something this fall that I've been saying I would do for at least two years: put together a short course on computing for the middle-school students at my daughters' school. I am thinking of using the Computer Science Unplugged materials created by a group of New Zealand computer scientists who evidently believe what I am preaching here -- but who have done me one better in writing instructional material that is accessible to elementary school students.

    Another step I will take this fall is to look for opportunities to write an op-ed piece or two for the local paper. A few weeks back, our paper ran this story on software bugs in automobiles. At the time, I thought that this was a great opportunity for someone in my department to write a piece that might help the paper's readers better understand the issues involved, and maybe even garner a little good publicity for our academic programs. But I was busy, so... I considered asking a colleague to write the piece instead, but shied away. From now on, though, I plan to make the time or make the request. It has always been a responsibility of my academic position, but now perhaps more so.

    Regular readers of this blog can probably guess my favorite line from the Arora and Chazelle piece. It's a theme that appears here often, usually with a link back to Alan Kay:

    Computer science is a new way of thinking.

    But I also like the final line of the article, enough to quote it here:

    We think it is high time that the computer science community should reveal to the public our best kept secret: our work is exciting science--and indispensable to the nation.

    (Thanks to Suresh and Ernie for their pointers to the pre-release of this article. Both of these guys are theoretical CS bloggers whom I enjoy reading.)


    Posted by Eugene Wallingford | Permalink | Categories: Computing

    June 12, 2005 5:06 PM

    On Making Things Up

    As I think I've mentioned here before, I am a big fan of Electron Blue, a blog written by a professional artist who has taken up the study of physics in middle age. Her essays offer insights into the mind of an artist as it comes into contact with the abstractions of physics and math. Sometimes, the two mindsets are quite different, and at other times they are surprisingly in tune.

    I just loved this line, taken from an essay on some of the similarities between a creator's mind and a theoretical scientist's:

    When I read stuff about dark energy and string theory and other theoretical explorations, I sometimes have to laugh, and then I say, "And you scientists think that we artists make things up!"

    Anyone who has done graduate research in the sciences knows how much of theory making is really story telling. We in computer science, despite working with ethereal computation as our reality, are perhaps not quite so make-believe as our friends in physics, whose efforts to explain the workings of the physical world long ago escaped the range of our senses.

    Then again, I'm sure some people look at XP and think, "It's better to program in pairs? It's better to write code with duplication and then 'refactor'? Ri-i-i-i-ght." What a story!


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    June 10, 2005 3:10 PM

    Another Advertisement for Smalltalk

    Last week, I posted a note on a cool consequence of Smalltalk being written in Smalltalk, the ability to change the compiler to handle new kinds of numeric literals. Here is another neat little feature of Smalltalk: You can quit the interpreter in the middle of executing code, and when you start back up, the interpreter will finish executing the code!

    Consider this snippet of code:

    Smalltalk snapshot: true andQuit: true.
    "... code to execute at next start-up, such as:"
    PWS serveOnPort: 80 loggingTo: 'log.txt'

    If you execute this piece of code, whether as a method body or as stand-alone code in a workspace, it will execute the first statement -- which saves the image and quits. But in saving the image, you save the state of the execution, which is precisely this point in the code being executed. When you next start up the system, it will resume right where it left off, in this case with the message to start up a web server.

    How I wish my Java environments did this...

    This is another one of those things that you want Java, Perl, Python, and Ruby programmers to know about Smalltalk: it isn't just a language; it is a living system of objects. When you program in Smalltalk, you don't think of building a new program from scratch; you think of molding and growing the initial system to meet your needs.

    This example is from Squeak, the open-source Smalltalk I use when I have the chance to use Smalltalk. I ran across the idea at Blaine Buxton's blog, and he found the idea in a Squeak Swiki entry for running a Squeak image headless. (A "headless image" is one that doesn't come up interactively with a user. That is especially useful for running the system in the background to drive some application, say a web server.)


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

    June 09, 2005 6:19 PM

    Department Head as Teacher

    Some folks have expressed concern or even dismay that my becoming department head will pull me away from teaching. Administration, with its paperwork and meetings and bureaucracy, can't be as much fun as the buzz of teaching. And there's no doubt that teaching one course instead of three will create a different focus to my days and weeks.

    But the more I prepare for my move to the Big Office Downstairs, the more I realize that -- done well -- a head's job involves a fair amount of teaching, too, only in a different form and to broader audiences.

    To address the problem of declining enrollments in our majors, we as a department need to educate students, parents, and high school counselors that this is the best time ever to major in computing. To ensure that the department has access to the resources it needs to do its job effectively, we as a department must educate deans, provosts, presidents, and state legislatures about the nature of the discipline and its needs. And that's just the beginning. We need to help high schools know how to better prepare students to study computer science at the university. We need to educate the general public on issues where computing intersects the public interest, such as privacy, computer security, and intellectual property.

    These opportunities to teach are all about what computing is, does, and can be. They aren't one of those narrow and somewhat artificial slices of the discipline that we carve off for our courses, such as "algorithms" or "operating systems". They are about computing itself.

    The "we"s in the paragraph above refer to the department as a whole, which ultimately means the faculty. But I think that an important part of the department head's job is to be the "royal we", to lead the department's efforts to educate the many constituencies that come into contact with the department's mission -- suppliers, consumers, and everyone in between.

    So, I'm learning more about the mindset of my new appointment, and seeing that there will be a fair bit of education involved after all. I'm under no illusion that it will be all A-ha! moments, but approaching the job with an educator's mind should prepare me to be a more effective leader for my department. The chance to educate a broader audience about computer science and its magic should be a lot of fun. And, like teaching anything else, the teaching itself should help me to learn a lot -- in this case, about my discipline and its role in the world. Whether I seek to remain in administration or not, in the long run that should make me a better computer scientist.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Managing and Leading, Teaching and Learning

    June 06, 2005 5:04 PM

    A Personal Goodbye to AAAI

    the AAAI logo

    I recently made a bittersweet decision: I am not going to renew my membership in AAAI. The AAAI is the American Association for Artificial Intelligence, and I have been a member since 1987, when I joined as a graduate student.

    Like many computer scientists who grew up in the '70s and '80s, I was lured to computing by the siren of AI. Programs that could play chess, speak and understand English sentences, create a piece of music; programs that could learn from experience... so many tantalizing ideas that all lay in the sphere of AI. I wanted to understand how the mind works, and how I could make one, if only a pretend one in the silicon of the rather inelegant machines of the day.

    I remember when I first discovered Gödel, Escher, Bach and was enchanted even further by the idea of self-reference, by the intertwining worlds of music, art, mathematics, and computers that bespoke a truth much deeper than I had ever understood before. The book took me a whole summer to read, because every few pages set my mind whirling with possibilities that I had to think through before moving on.

    I did my doctoral work in AI, at the intersection of knowledge-based systems and memory-based systems, and reveled in my years as a graduate student, during which the Turing Test and Herb Simon's sciences of the artificial and cognitive science were constant topics of discussion and debate. These ideas and their implications for the world mattered so much to us. Even more, AI led me to study psychology and philosophy, where I encountered worlds of new and challenging ideas that made me a better and more well-rounded thinker.

    My AI research continued in my early years as an assistant professor, but soon my interests and the needs of my institution pulled me in other directions. These days, I think more about programming support tools and programming languages than I do AI. I still love the AI enterprise but find myself on the outside looking in more often than not. I still love the idea of machine learning, but the details of modern machine learning research no longer enthrall me. Maybe the field matured, or I changed. The AI that most interests me now is whatever technique I need to build a better tool to support programmers in their task. Still, a good game-playing program draws my attention, at least for a little while...

    In any case, the idea of paying $95 a year to receive another set of printed magazines that I don't have time to study in depth seems wasteful of paper and money both. I read some AI stuff on the web when I need or want, and I keep up with what my students are doing with AI. But I have to admit that I'm not an AI scientist anymore.

    For some reason, that is easier to be than to say.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

    June 03, 2005 2:34 PM

    Changed Your Language Today?

    I have had this link and quote in my "to blog" folder for a long time:

    ... the one thing that a Ruby (or Python) programmer should know about Smalltalk is, it's all written in Smalltalk.

    But I wanted to have a good reason to write about it. Why does it matter to a Smalltalker that his language and environment are implemented in Smalltalk itself?

    Today, I ran across a dynamite example that brings the point home. David Buck

    ... was working for a company once that did a lot of work with large numbers. It's hard, though, to write 45 billion as 45000000000. It's very hard to read. Let's change the compiler to accept the syntax 45b as 45 billion.

    Sisyphus pushing a stone up a mountain

    And he did it -- by adding six lines of code to his standard working environment and saving the changes. This is the sort of openness that makes working in Java or most any other ordinary language feel, by comparison, like pushing rocks up a mountain.

    Lisp and Scheme read macros give you a similar sort of power, and you can use regular macros to create many kinds of new syntax. But for me, Smalltalk stands above the crowd in its pliability. If you want to make the language you want to use, start with Smalltalk as your raw material.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

    June 02, 2005 6:51 PM

    Who Says Open Source Doesn't Pay?

    Google's Summer of Code flyer

    Leave it to the guys from Google to offer the Summer of Code program for students. Complete an open-source project through one of Google's collaborators, and Google will give you a $4500 award. The collaborators range from relatively large groups such as Apache and FreeBSD, through medium-sized projects such as Subversion and Mono, down to specific software tools such as Jabber and Blender. Of course, the Perl Foundation, the Python Software Foundation, and Google itself are supporting projects. You can even work on open-source projects in Lisp for LispNYC, a Lisp advocacy group!

    The program bears a strong resemblance to the Paul Graham-led Summer Founders Program. But the Summer of Code is much less ambitious -- you don't need to launch a tech start-up; you only have to hack some code -- and so is likely to have a broader and more immediate effect on the tech world. Of course, if one of the SFP start-ups takes off like Google or even ViaWeb, then the effects of the SFP could be much deeper and longer lasting.

    This is another slick move from the Google image machine. A bunch of $4500 awards are pocket change to Google, and in exchange they generate great PR and establish hefty goodwill with the open-source organizations participating.

    From my perspective, the best part of the Summer of Code is stated right on its web page: "This Summer, don't let your programming skills lie fallow...". I give this advice to students all the time, though they don't often appreciate its importance until the fall semester starts and they feel the rust in their programming joints. "Use it, or lose it" is trite but true, especially for nascent skills that are not yet fully developed or internalized. Practice, practice, practice.

    The Summer of Code is a great chance for ambitious and relatively advanced students to use this summer for their own good, by digging deep into a real project and becoming better programmers. If you feel up to it, give it a try. But even if you don't, find some project to work on, even if it's just one for your amusement. Perhaps I should say especially if it's just one for your amusement -- most of the great software in this world was originally written by people who wanted the end result for themselves. Choose a project that will stretch your skills a bit, one that will force you to improve in the process. Don't worry about getting stuck... This isn't for class credit, so you can take the time you need to solve problems. And, if you really get stuck, you can always e-mail your favorite professor with a question. :-)

    Oh, if you do want to take Google up on its offer, you will want to hurry. Applications are due on June 14.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    May 05, 2005 11:16 AM

    Software Case Studies

    (I wrote a complete essay on this topic over the course of several hours on Monday and then, in a few swift seconds of haste, I did the unthinkable: I rmed it. Sigh. I don't know if I can reproduce that wondrous work of art, but I still have something I want to say, so here goes.)

    Last time, I wrote about how having a standard literary form can ease the task faced by writers and readers, including reviewers. I found this to be true of submissions to the OOPSLA Educators Symposium, and my colleague Robert Biddle found it to be true of submissions to the conference's practitioner reports track. From this starting point, Robert and I moved onto an ambitious idea.

    I closed my last entry with a cliffhanger: The unsupported claim that "the lack of a standard form is especially acute in the area of practitioner reports, which have the potential to be one of the most important contributions OOPSLA makes in the software world." This time, I'd like to talk about both parts of this claim.

    First, why is the lack of a standard literary form especially acute in the area of practitioner reports?

    Keep in mind that the authors of practitioner reports are software practitioners, folks in the trenches solving real problems. Most of these folks are not in the habit of writing expository prose to teach others or to share experiences -- their jobs are primarily to create software. Unlike most submitters to the Educators Symposium, the authors of practitioner reports are usually not academics, who typically have experience with expository writing and can at least fall back on the literary forms they learned in publishing academic papers. (Sadly, those forms tend not to work all that well for sharing pedagogical experience.)

    Not having a standard literary form makes writing and reading practitioner reports that much harder. How can authors best communicate what they learned while building a software project? What ideas do readers expect to find in the paper? What's worse, because many software developers don't have much experience writing for a broad audience, not knowing how to go about writing a paper can create a considerable amount of fear -- and the result is that many practitioners won't even try to write a paper in the first place. It turns out that a standard literary form has another benefit: it provides comfort to potential authors, lowering the entry barrier to new writers.

    A great example of this effect is the software patterns community. Its popularization of a common and powerful literary form made it both possible and attractive to many practicing software developers to record and disseminate what they had learned writing programs. The software community as a whole owes an extraordinary debt for this contribution.

    So, I contend that the practitioner's track would benefit even more than the educators' track from the creation and widespread adoption of a standard literary form. (But I still hope that we educators will take steps to improve our lot in this regard.)

    Second, how do practitioner reports have the potential to be one of the most important contributions that OOPSLA or any other conference makes in the software world?

    Keep in mind that the authors of practitioner reports are software practitioners, folks in the trenches solving real problems. Researchers and methodologists can propose ideas that sound great in theory, but practitioners find out how well they work in the real world, where academic abstractions take a back seat to the messiness of real businesses and real people and real hardware. Even when researchers and methodologists make a good-faith effort to vet their ideas outside of their labs, it is difficult to recreate all the complexities that can arise when new adopters try to implement the ideas in their organizations.

    The result is, we don't really know how well ideas work until they have been tried in the trenches. And practitioner reports can tell us.

    Sharing knowledge of a practical sort used to lie outside the domain of computer science and even software engineering, but the software patterns movement showed us that it could be done, that it should be done, and gave us some hints for how to do it. The potential benefit to practitioners is immeasurable. Before trying out XP, or migrating from Visual Basic to VB.Net, or integrating automated acceptance tests into the build cycle, a practitioner can read about what happened when other folks in the trenches tried it out. Usually, we expect that the ideas worked out okay in practice, but a good report can point out potential pitfalls in implementation and describe opportunities to streamline the process.

    Of course, academics can benefit from good practitioner reports, too, because they close the loop between theory and practice and point out new questions to be answered and new opportunities to exploit.

    Robert and I didn't talk much about what I've written so far in this entry, because we rather quickly moved on to Robert's bigger vision of what practitioner reports can be, one that presupposes the untapped value buried in this resource: the software case study.

    As Robert pitched it, consider NCSA Mosaic. Here is a program that changed the world. How was it built? What technical and non-technical problems arose while it was being written, and how did the development team solve them? Did serendipity ever strike, and how did they take advantage of it? We can find the answers to all of these questions and more -- the creators are still around to tell the story!

    Case studies are a standard part of many disciplines. In business schools, students learn how companies work by reading case studies. I remember well a management course I took as an undergraduate in which we studied the development of particular companies and industries through case studies. (Most of what I know about the soft drink industry came from that case book!) Of course, the law itself is structured around cases and webs of facts and relationships.

    We are fortunate to have some very nice case studies in computing. Knuth has written widely about his own programs such as TeX and his literate programming tools. Because Smalltalk is as much system as language, Alan Kay's The Early History of Smalltalk from History of Programming Languages II qualifies as a software case study. Other papers in the HOPL volumes probably do, too. I have read some good papers on Unix that qualify as case studies as well.

    Two of my favorite textbooks, Peter Norvig's Paradigms of AI Programming and Clancy and Linn's Designing Pascal Solutions, are built around sets of case studies. In the latter, the cases were constructed for the book; in the former, Norvig analyzes programs from the history of AI and reimplements them in Common Lisp. (You really *must* study this book.) I even used a case study book as a CS undergrad: Case Studies in Business Data Bases, by James Bradley. It's still on my bookshelf. More recently, the College Board's Marine Biology Case Study has received considerable attention in the AP and CS1 communities.

    So, if we believe that software case studies have merit, we find ourselves back in the trenches... How do we write them? Using an excellent case study as an exemplar would be a start. I suggest that any case study of value probably must tell us at least three things:

    1. In what context did we operate?
    2. What did we do?
    3. What did we learn?
    These elements apply to case studies of whole systems as well as to case studies of incremental changes to systems or process improvements. They are almost certainly part of the better practitioner reports presented at OOPSLA.

    Robert and I will likely work on this idea further. If you have any ideas, please share them. In particular, I am interested in hearing about existing and potential software case studies. Which of the case studies you've read do you think are the best exemplars of the genre? Which programs would you like to see written up in case studies?


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

    May 01, 2005 7:27 PM

    The Value of Standard Form in Evaluating Technical Papers

    Over lunch today, Robert Biddle and I discussed some thoughts we had after reviewing submissions to the OOPSLA Educators Symposium and practitioner reports track this year. Both of these tracks suffer from a problem that had never really occurred to me before this year: not having an accepted form for telling stories.

    If you read the proceedings of most computer science conferences, you will recognize that all the papers have a similar look and feel. This is the way that scientists in the domain communicate with one another. One function of a common form is to ensure an efficient exchange of information; having a common style means that readers immediately feel at home when they come to a paper. But there is a subtle secondary function, too. When you see a paper, you can tell whether the author is a part of the community or not.

    One of the things that makes reviewing papers for the Educators Symposium tough is the variety of papers. With no standard, authors are left to mimic other papers or invent new forms that fit the story they want to tell. But every new form makes the program committee's job harder, too. How do we evaluate the contribution of a paper that looks and sounds different than any we've seen? What role should experimental validation of a teaching technique play? What role should an explanation of lessons learned play, or a discussion of how to implement the technique at a different institution?

    Most frustrating for me as program chair were situations in which two reviewers evaluated the same paper in essentially complementary ways. This isn't a fault of the reviewers, because the community as a whole has not reached a consensus for what papers should be like. I suppose that one of my jobs as program chair is to guide this process closely, working with the program committee to give authors and reviewers alike a better sense of what we are looking for in submissions. The nifty assignment track that I introduced to the symposium last year was an attempt in this direction. I borrowed from a successful form used at recent SIGCSEs to encourage OO educators to tell the stories of their coolest and most engaging programming assignments. Having an expectation of what a nifty assignment should look like has led, I think, to a more satisfying evaluation process for these submissions.

    Maybe we as a community of educators need to work together to develop a standard way to tell teaching stories. Perhaps the greatest contribution of the software patterns community was to standardize the way practitioners and academics discuss the elements of program design and construction at a level above algorithms and data structures. As a linguistic form, it enables communication in a way that was heretofore impossible.

    Notice here that a common literary form is more than a literary format. The patterns community is a good example of this. Even after a decade there are a number of accepted formats for writing patterns, some favored by one group of writers, some viewed as especially effective for patterns of a particular sort. The key to the pattern community's contribution is that it establishes expectations for the content of software patterns. I can comfortably read a pattern in almost any format, but if I can't find the context, the problem, the forces, and the solution, then I know the pattern has problems. The PLoP conferences play the role of enculturating pattern writers by helping them to learn the standard form and how to use it.

    The lack of a standard form is especially acute in the area of practitioner reports, which have the potential to be one of the most important contributions OOPSLA makes in the software world. I'll have more to say about this tomorrow.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

    April 26, 2005 5:40 PM

    Importing Language Features

    While doing some link surfing yesterday, I ran across an old blog entry that has a neat programming languages idea in it: Wouldn't it be nice if we could encapsulate language features?

    The essay considers the difference between two kinds of complexity we encounter in programs. One is complexity in the language features themselves. First-class closures or call-with-current-continuation are examples. Just having them in a language seems to complicate matters, because then we feel a need to teach people to use them. Even if we don't, some programmer may stumble across them, try to use them, and shoot himself in the foot. Such abstractions are sometimes more than the so-called ordinary programmer needs.
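
    For anyone who hasn't met these features, here is a minimal sketch of a first-class closure. It uses Java's lambda syntax, which came to the language much later than the debates this entry describes, and the makeCounter name is mine, just for illustration. The function returned by makeCounter quietly carries its own private state -- exactly the kind of power that can delight one programmer and trip up another:

    import java.util.function.Supplier;

    public class ClosureDemo {
        // Returns a function that remembers how many times it has been called.
        static Supplier<Integer> makeCounter() {
            int[] count = { 0 };              // state captured by the closure
            return () -> ++count[0];
        }

        public static void main(String[] args) {
            Supplier<Integer> next = makeCounter();
            System.out.println(next.get());   // 1
            System.out.println(next.get());   // 2 -- the closure remembers
        }
    }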

    Another kind of complexity comes from the code we write. We build a library of functions, a package of classes, or a framework. These abstractions can also be difficult to use, or it may be difficult to understand their inner workings. Yet far fewer people complain about a language having too many libraries. (*)

    Why? Because we can hide details in libraries, in two ways. First, in order to use Java's HashMap class, I must import java.util.HashMap explicitly. Second, once I have imported the class, I don't really need to know anything about the inner workings of the class or its package. The class exposes only a set of public methods for my use. I can write some pretty sophisticated code before I need to delve into the details of the class.
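
    A tiny Java sketch makes the point concrete (the class and variable names here are mine, just for illustration). I import HashMap explicitly, and then I program against its public methods without ever peeking at the hashing and resizing machinery inside:

    import java.util.HashMap;

    public class WordCount {
        public static void main(String[] args) {
            // Only the public interface matters here: put, get, and friends.
            HashMap<String, Integer> counts = new HashMap<String, Integer>();
            for (String word : new String[] { "to", "be", "or", "not", "to", "be" }) {
                Integer old = counts.get(word);
                counts.put(word, (old == null) ? 1 : old + 1);
            }
            System.out.println(counts.get("to"));   // prints 2
        }
    }

    The implementation details never intrude; only the class's public face does.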

    Alexander asks the natural question: Why can't we encapsulate language features in a similar way?

    Following his example, suppose that Sun adds operator overloading to Java but doesn't want every programmer to have to deal with it. I could write a package that uses it and then add a new sort of directive at the top of my source file:

    exposeFeature operatorOverloading;

    Then, if other programmers wanted to use my package, they would have to import that feature into their programs:

    importFeature operatorOverloading;

    Such an information-hiding mechanism might make adding more powerful features to a language less onerous on everyday programmers, and thus more attractive to language designers. We might even see languages grow in different and more interesting ways.

    Allowing us to reveal complex language features incrementally would also change the way we teach and write about programming. I am reminded of the concept of "language levels" found in DrScheme (and now in DrJava). But the idea of the programmer controlling the exposure of his code to individual language features seems to add a new dimension of power -- and fun -- to the mix.

    More grist for my Programming Languages and Compilers courses next year...

    ----

    (*) Well, unless we want to use the language to teach introductory CS courses, of course.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    April 24, 2005 3:13 PM

    Programming as Literacy

    Every once in a while I am jolted from my own parochial concerns by a reminder that we in computing have an opportunity to do so much more than just train tomorrow's Java or C++ programmers. On Friday, Darren Hobbs reminded me with his blog entry Programming as Literacy. Darren begins:

    In what's typically called 'the western world', before literacy became essentially ubiquitous it was limited to a select few. Monks originally, then scribes. These individuals possessed a truly remarkable skill. They could make marks on paper that could remember things. They could capture information and retrieve it later. Associated with this skill was the ability to make marks that could perform complex computations and produce an answer. People with these skills were hired by nobles and merchants to increase business value.

    I have mentioned this idea before, looking to computer scientists such as Alan Kay and Ward Cunningham, for whom computing is more than just a technical skill reserved to a special few. Ward's wiki enables a new kind of conversation. In Kay's vision, computing is

    a new kind of reading and writing medium that allows some of the most important powerful ideas to be discussed and played with and learned

    in ways more profound than any book. (See his essay Background on How Children Learn for more.)

    Alan often speaks passionately that we hold in our hands the most powerful tool for changing the world since the printing press five hundred years ago. I do not think that the passion in his words is out of place. Like Alan, I believe that we are creating the medium that people will use to create the next Renaissance.

    When we think about computing and computing education in such terms, the problems we encounter teaching CS majors to program seem small. But these problems also point out just how much work we have to do if we want to effect the sort of change that our opportunity affords. I suspect that the sort of things we need to do to make programming a new form of general literacy available to all would only make teaching CS majors easier.

    Others are thinking about this problem, too. Darren Hobbs' blog was a nice reminder of that. And John Mitchell responds to my blog on accountability through conversation by suggesting that our programming languages should be more conversational. I think that natural language will always have an advantage over programming languages in this regard, but John's idea that we can create little languages that are structured, predictable, comprehensible, and concise enough to support more natural interaction is worth some thought. The work of Alan Kay's group on eToys and Active Essays, described at Squeakland, is certainly aimed in this direction.

    I am glad summer is almost here. I need some time to back away from day-to-day classroom issues to think about loftier goals, and concrete plans to move toward them.


    Posted by Eugene Wallingford | Permalink | Categories: Computing

    April 20, 2005 11:16 AM

    At the End of an Empty Office Hour

    A while back on the mailing list for the textbook Artificial Intelligence: A Modern Approach, co-author Peter Norvig wrote:

    A professor who shall remain anonymous once thanked me for including him in the bibliography of AIMA. I told him that his work was seminal to the field, so of course I included it. "Yeah, yeah, I don't care about that," he said, "what I care about is that students look in the back of the book to see if my name is there, and if it is they think I'm important and they don't bother me in office hours as much."

    I am not cited in AIMA, but whatever I am doing to scare them off must be working.

    By the way, Norvig also wrote one of my favorite books on programming, Paradigms of AI Programming. Nominally, this book teaches AI programming in Lisp, but really it teaches the reader how to program, period. Norvig re-implements many of the classic AI programs, such as Eliza, GPS, and Danny Bobrow's Student, showing design and implementation decisions along the way. His use of case studies provides a lot of context in which to learn about building a program.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    April 01, 2005 9:07 AM

    Reading to Write

    Yesterday morning, one of my students told me that he was thinking of changing his major. It turns out that he was an English major before switching to CS, and he is thinking about switching back. We got to talking about the similarities and differences between the majors and how much fun it would be to major in English or literature.

    Some salesman I am! We need more CS majors, so I should probably have tried to convince him to stay with us. Discussing the relative value of the two majors was beyond the scope of our short discussion, though, and that's not really what I want to write about.

    The student mentioned that he knew of other folks who have bounced between CS and English in school, or who have studied in one field and ended up working in the other. I wasn't too surprised, as I know of several strong students in both disciplines who have performed well in the other and, more importantly, have deep interests in both. Writing and programming have a lot more in common than most people realize, and people who love to communicate in written form may well enjoy programming.

    I myself love to read books by artists about their crafts. The "recommended reading list" that I give to students who ask includes two books on writing: William Zinsser's On Writing Well and Joseph Williams's Style. But I've enjoyed many wonderful books on writing over the years, often on recommendation from other software developers...

    • several of Zinsser's other books
    • Natalie Goldberg's Writing Down the Bones
    • Richard Hugo's The Triggering Town
    • Scott Russell Sanders's Writing From The Center
    • Anne Lamott's Bird by Bird
    • Annie Dillard's The Writing Life
    • E.L. Doctorow's Reporting the Universe

    Each has taught me something about how to write. They have helped me write better technical papers and better instructional material. But I have to admit that I don't usually read these books for such practical reasons. I just like to feel what it's like to be a writer: the need to have a voice, the passion for craft. These books keep me motivated as a computer scientist, and they have indirectly helped me to write better programs.

    Writers aren't the only artists whose introspective writing I like to read. The next book to read on my nightstand is Twyla Tharp's The Creative Habit. Dance is much different than fiction, but it, too, has something to teach us software folks. (I seem to recall a dinner at PLoP many years ago at which Brad Appleton suggested dance as a metaphor for software development.)

    When I started writing this essay, I thought that it would be about my recommended reading list. That's not how it turned out. Writing is like that. So is software development sometimes.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

    March 14, 2005 2:47 PM

    Problems Are The Thing

    In my last blog, I discussed the search for fresh new examples for use in teaching an OO CS1 course. My LazyWeb request elicited many interesting suggestions, some of which have real promise. Thanks to all who have sent suggestions, and thanks for any others that might be on the way!

    While simple graphical ball worlds make for fun toy examples, they don't seem as engaging to me as they did a few years back. As first demos, they are fine. But students these days often find them too simplistic. The graphics that they can reasonably create in the first course pale when compared to the programs they have seen and used. To use ball world, we need to provide infrastructure to support more realistic examples with less boilerplate code.

    Strategy and memory games engage some students quite nicely. Among others, I have used Nim, Simon, and Mastermind. Mastermind has always been my favorite; it has so many wonderful variations and offers so many opportunities to capitalize on good OO design.

    Games of all sorts seem to engage male students more than female students, which is a common point of conversation when discussing student retention and the diversity of computer science students. We need to use a broader and more interesting set of examples if we hope to attract and retain a more diverse group of students. (More on whether this should be an explicit goal later.)

    I think that this divide is about something much more important than just the interests of men and women. The real problem with games as our primary examples is that these examples are really about nothing of consequence. Wanting to work on such a problem depends almost solely on one's interest in programming for programming's sake. If students want to do something that matters, then these examples won't engage their interest.

    This idea is at the core of our desire to create a new sort of CS 1 book. As one of my readers pointed out in an e-mail, most introductory programming instruction isn't about anything, either. It "simply marches through the features of whatever language is being used" at the time. The examples used are contrived to make the language feature usable -- not always even desirable, just usable.

    Being about nothing worked for Seinfeld, but it's not the best way to help students learn -- at least not if that's all we offer them. It also limits the audience that we can hope to attract to computing.

    So much of computer science instruction is about solutions and how to make them, but the solutions aren't to anything in particular. That appeals to folks who are already interested in the geeky side of programming. What about all those other folks who would make good computer scientists working on problems in other domains? Trying to create interesting problems around programming techniques and language features is a good idea, but it's backward.

    Interesting problems come from real domains. I learned the same lesson when studying AI as a graduate student in the 1980s. We could build all the knowledge-based systems we wanted as toys to demonstrate our techniques, but no one cared. And besides, where was the knowledge to come from? For toy problems, we had to make it up, or act as our own "domain experts". But you know, building KBS was tougher when we had to work with real domain experts: tax accountants and chemical engineers, plant biologists and practicing farmers. The real problems we worked on with real domain experts exercised our techniques in ways we did not anticipate, helping us to build richer theories at the same time we were building programs that people really used. I thank my advisor for encouraging this mindset in our laboratory from its inception.

    Real domain problems are more likely to motivate students and teachers. They offer rich interconnections with related problems and related domains. Some of these problems can be quite simple, which is a good thing for teaching beginning students. But many have the messy nature that makes interesting ideas matter.

    At OOPSLA last year, Owen Astrachan was touting the new science of networks as a source of interesting problems for introductory CS instruction. The books Linked and Six Degrees provide a popular introduction to this area of current interest throughout academia and the world of the Web. Even the idea of the power law can draw students into this area. I recently asked students to write a simple program that included calculating the logarithm of word counts in a document, without saying anything about why. Several students stopped by, intrigued, and asked for more. When I told them a little about the power law and its role in analyzing documents and explaining phenomena in economics and the Web, they all expressed interest in digging deeper.
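    For the curious, here is a minimal sketch along the lines of that exercise -- not the actual assignment, and assuming nothing fancier than whitespace tokenization. It pairs each word in a file with the logarithm of its count, the raw material for spotting a power law.

        import java.io.File;
        import java.util.HashMap;
        import java.util.Map;
        import java.util.Scanner;

        // Count word frequencies in a text file and print the log of each count.
        public class LogCounts {
            public static void main(String[] args) throws Exception {
                Map<String, Integer> counts = new HashMap<String, Integer>();
                Scanner in = new Scanner(new File(args[0]));
                while (in.hasNext()) {
                    String word = in.next().toLowerCase();
                    Integer old = counts.get(word);
                    counts.put(word, old == null ? 1 : old + 1);
                }
                for (Map.Entry<String, Integer> entry : counts.entrySet()) {
                    System.out.println(entry.getKey() + " " + Math.log(entry.getValue()));
                }
            }
        }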

    Other problems of this sort are becoming popular. We have started an undergraduate program in bioinformatics and have begun to explore ways to build early CS instruction around examples from that domain. The availability of large databases and APIs opens new doors, too. Google and Amazon have opened their databases to outside programmers. At last year's ITiCSE conference the invited talks all focused on the future of CS instruction by going back to a time in which computing research focused on applied problems.

    If you've been reading here for a while, then you have read about the importance of context in learning before. A big part of Alan Kay's talks there focused on how students can learn about the beauty of computing through doing real science. His eToys project has students build simulations of physical phenomena that they can observe, and in doing so learn a lot about mathematics and computation. But this is a much bigger idea in Kay's work. If you haven't yet, read his Draper Prize talk, The Power Of The Context. It speaks with eloquence about how people have great ideas when they are immersed in an environment that stimulates thought and connections. Kay's article says a lot about how a lab's working conditions foster such an environment, and, perhaps most importantly, the people in it. In an instructional setting, the teacher and fellow students define most of this part of the environment. Other people can be a part of the environment, too, through their ideas and creations -- see my article on Al Cullum and his catchphrase "a touch of greatness".

    But the problems that we work on are also an indispensable element in the context that motivates people to do great work. In his Draper Prize talk, Kay speaks about how his Xerox PARC lab worked with educators, artists, and engineers on problems that mattered to them, along with all the messy distractions that those problems entail. Do you think that the PARC folks would have created as many interesting tools and ideas if they had been working on "Hello, World" and other toy problems of their own invention? I don't.

    Alan's vision in his OOPSLA talks was to create the computing equivalent of Frank Oppenheimer's Exploratorium -- 500 or more exciting examples with which young people could learn about math, science, computation, reading, and writing, all in context. With that many problems at hand, we wouldn't have to worry as much about finding a small handful of examples for use each semester, as every student would likely find something that attracted him or her, something to engage their minds deeply enough that they would learn math and science and computing just to be able to work on the problem that had grabbed a hold of them. The real problem engages the students, and its rich context makes learning something new worthwhile. Whenever the context of the problem is so messy that distractions inhibit learning, we as instructors have to create boundaries that keep the problem real but let the students focus on what matters.

    Having students work on real problems offers advantages beyond motivation. Remember the problem of attracting and retaining a wider population of students? Real problems may help us there, too. We geeks may like programming for its own sake, but not everyone who could enrich our discipline does. Whatever the natural interests and abilities of students are, different kinds of people seem to have different values that affect their choice of academic disciplines and jobs. This may explain some of the difficulty that computer science has attracting and retaining women. A study at the University of Michigan found that women tend to value working with and for people more than men, and that this value accounted at least in part for women tending to choose math and CS careers less frequently: they perceive that math and CS are less directly about people than some other disciplines, even in the sciences. Even if women do not arrive with this perception, many of our CS1 and CS2 courses would give it to them right away. But working on real problems from real domains might send a different signal: computing provides an opportunity to work with other people in more ways than just about any other discipline!

    I know that this is one of the reasons I so loved working in knowledge-based systems. I had a chance to work with interesting people from all over the spectrum of ideas: lawyers, accountants, scientists, practitioners, .... And in the meantime I had to study each new discipline in order to understand it well enough to help the people with whom I worked. It was a constant stream of new and interesting ideas!

    I don't want you to think that no one is using real problems in their courses. Certainly a number of people are. For example, check out Mark Guzdial's media computation approach to introductory computing courses. Media computation -- programs that manipulate images, sounds, and video -- seems like a natural way to go for students of this generation. I think an example of this sort would make a great starting point for my group's work at ChiliPLoP next week. But Mark's project is one of only a few big projects aimed in this direction. If we are to reach an Exploratorium-like 650 great CS 1 examples, then we all need to pitch in.

    One downside for instructors is that working with real problems requires a lot of work up front. If I want to use the science of networks or genomics as my theme for a course, then I need to study the area myself well in advance of teaching the course. I have to design all new classroom examples -- and programming assignments, and exam questions. I will probably have to build support software to shield my students from gratuitous complexity in the domain and in the programming tools they use.

    Another potential downside for someone at a small school is that an applied theme in your course may appeal to some students but not others, and your school can only teach one or a few sections of the course at any one time. This is where Alan Kay's idea of having a large laboratory of possibilities becomes so appealing.

    This approach requires work, but my ChiliPLoP colleagues seem willing to take the plunge. I'll keep you posted on our efforts and results as they progress.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    March 13, 2005 11:35 AM

    Looking for Engaging Examples

    Some of my favorite colleagues and I will be getting together in a week or so for ChiliPLoP 2005, where we will continue to work on our longstanding project, to produce a different sort of textbook for the CS 1 course. We'd like to create a set of instructional units built around engaging and instructive applications. In a first course, these applications will have to be rather small, but we believe students can learn more about how to program, and learn it better, when their code has a context.

    In many ways, Mike Clancy's and Marcia Linn's classic Designing Pascal Solutions serves as my goal. Clancy and Linn did for structured programming and Pascal what I'd like for our work to do for object-oriented programming and Java (or whatever succeeds it). Their case studies focus on simple yet complete programs such as a calendar generator, a Roman numeral calculator, and a text processor, using them as the context in which students learn the basics of computation and Pascal syntax. Along the way, they also learn something about how to write programs, which is, in many ways, the central point of the course. This book, and its data structures follow-up, implement a wonderful teaching idea well. I think we can do this for OOP, and can use what we've learned about patterns in the intervening 15 years to do it in an especially effective way.

    Such an approach requires that we identify and work out in some detail several such examples. The old examples worked great in a text-oriented world in which students' experience with computers was rather limited, but we can surely do better. Our students come to the university with broad experience interacting with computer systems. Cell phones, iPods, and TiVo are a part of the fabric of their lives. Besides, objects and languages like Java bring graphical apps, client-server apps, web apps, and other more sophisticated programs within reach.

    A canonical first example for a graphical OOP introduction to programming is the ball world, a framework for simple programs in which balls and other graphical elements move about some simulated world, interacting in increasingly sophisticated ways. The folks at Brown and Duke have been developing this line of examples for a decade or more, and Tim Budd wrote a successful book aimed at students in a second or third course in which this is the first "real" Java program students see.

    But ball world is tired, and besides it doesn't make much of a connection to the real world of computing these days. It might work fine to introduce students to the basics of Java graphics, but it needs serious work as an example that can both be used early in CS 1 and engage many different students.

    This is drifting dangerously toward a longer and different essay that I've been meaning to write for a while now, but I don't have the time this morning. That essay will have to wait until tomorrow. In the meantime, I'll leave you with a catchphrase that Owen Astrachan was touting at OOPSLA last fall: Problems are the thing. Owen's right.

    Returning to the more immediate issue of ChiliPLoP 2005 and our Hot Topic, if you have an idea for a motivating example that might engage beginning programmers long enough for us to help them learn a bit about programming and objects, please pass it on. Maybe we can make it one of our working examples!


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    February 26, 2005 5:39 PM

    Resolved: "Object Early" Has Failed

    The focal event of Day 3 at SIGCSE is the 8:30 AM special session, a panel titled Resolved: "Object Early" Has Failed. It captures a sense of unrest rippling through the SIGCSE community, unrest at the use of Java in the first year, unrest at the use of OOP in the first year, unrest at changes that have occurred in CS education over the last decade or more. Much of this unrest confounds unhappiness with Java and unhappiness with OOP, but these are real feelings felt by good instructors. But there are almost as many people uneasy that, after five or more years, so few intro courses seem to really do OOP, or at least do it well. These folks are beginning to wonder whether objects will ever take a place at the center of CS education, or if we are destined to teach Pascal programming forever in the language du jour. This debate gives people a chance to hear some bright folks on both sides of the issue share their thoughts and answer questions from one another and the audience.

    The panel is structured as a light debate among four distinguished CS educators:

    • Stuart "17 years after writing my Pascal book, I'm teaching the same stuff in Java... Now that's fun" Reges
    • Eliot "'Objects first' failed? Not according to Addison Wesley" Koffman
    • Michael Kölling, of BlueJ fame
    • Kim Bruce, whom you've read about here earlier

    Our fearless leader is Owen "Moderator with a Law" Astrachan. The fact that the best humor in the introductions was aimed at the pro-guys may reveal Owen's bias, or maybe Stuart and Eliot are just easier targets...

    Eliot opened by defining "objects early" as a course that focuses on using and writing classes before writing control structures and developing algorithms. He made the analogy to new math: students learn "concepts" but can't do basic operations well. Most of Eliot's comments focused on faculty's inability or unwillingness to change and on the unsuitability of the object-oriented approach for the programs students write at the CS1 level. The remark about faculty reminded me of a comment Owen made several years ago in a workshop, perhaps in an Educators Symposium at OOPSLA, that our problem here is no longer legacy software but legacy faculty.

    Michael followed quite humorously with a send-up of this debate as really many debates: objects early, polymorphism early, interfaces early, GUIs early, events early, concurrency early... If we teach all of these in Week 1, then teaching CS 1 will be quite nice; Week 2 is the course wrap-up lecture! The question really is, "What comes last?" Michael tells us that objects haven't failed us; we have failed objects. Most of us aren't doing objects yet! And we should start. Michael closed with a cute little Powerpoint demo showing a procedural-oriented instructor teaching objects not by moving toward objects, but by reaching for objects. When you reach too far without moving, you fall down. No surprise there!

    Stuart returned the debate to the pro side. He sounded like someone who had broken free of a cult. He said that, once, he was a true believer. He drank the kool-aid and developed a CS 1 course in which students discussed objects on Day 2. He even presented a popular paper on the topic at SIGCSE 2000, Conservatively Radical Java in CS1. But over time he found that, while his good students succeeded in his new course, the middle tier of students struggled with the "object concept". He is willing to entertain the idea that the problem isn't strictly with objects-first but with the overhead of OOP in Java, but pragmatic forces and a herd mentality make Java the language of choice for most schools these days, so we need an approach that works in Java. Stuart lamented that his students weren't getting practice at decomposing problems into parts or implementing complete programs on their own. Students seem to derive great pleasure from writing a complete, if small, program to solve a problem. This works in the procedural style, where a 50- to 100-line program can do something. Stuart asserted that this doesn't work with an OO style, at least in Java. Students have to hook their programs in with a large set of classes, but that necessitates programming to a fixed API. The result just isn't the same kind of practice students used to get when we taught procedural programming in, um, Pascal. Stuart likens this to learning to paint versus learning to paint-by-the-numbers. OOP is, to Stuart, paint-by-the-numbers -- and it is ultimately unsatisfying.

    Stuart's contribution to the panel's humorous content was to claim that the SIGCSE community was grieving the death of "objects early". Michael, Kim, and their brethren are in the first stage of grief, denial. Some objects-early folks are already in the stage of anger, and they direct their anger at teachers of computer science, who obviously haven't worked hard enough to learn OO if they can't succeed at teaching it. Others have moved onto the stage of bargaining: if only we form a Java Task Force or construct the right environment or scaffolding, we can make this work. But Stuart likened such dickering with the devil to cosmetic surgery, the sort gone horribly wrong. When you have to do that much work to make the idea succeed, you aren't doing cosmetic surgery; you are putting your idea on life support. A few years ago, Stuart reached the fourth stage of grief, depression, in which he harbored thoughts that he was alone in his doubts, that perhaps he should just retire from the business. But, hurray!, Stuart finally reached the stage of acceptance. He decided to go back to the future, to return to the halcyon days of the 1980s, of simple examples and simple programming constructs, of control structures and data structures and algorithm design. At last, Stuart is free.

    Kim closed the position statement portion of the panel by admitting that it is hard work for instructors who are new to OO to learn the style, and for others to build graphics and event-driven libraries to support instruction. But the work is worth the effort. And we shouldn't fret about using "non-standard libraries", because that is how OO programming really works. Stuart followed up with a question: Graphics seems to be the killer app of OOP; name two other kinds of examples that we can use. Kim suggested that the key is not graphics themselves but the visualization and immediate feedback they afford, and pointed to BlueJ as an environment that provides these features for most any good object.

    In his closing statement for the con side, Michael offered a quote attributed to Planck:

    A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it.

    Stuart's closing statement for the pro side was more serious. It returned to the structured programming comparison. It was hard to make the switch to structured programming in CS education. Everyone was comfortable with BASIC; now they had to buy a Pascal compiler for their machines; a compiler might not exist for the school's machines; .... But the trajectory of change was different. It worked, in the sense that people got it and felt it was an improvement over the old way -- and it worked faster than the switch to OOP has worked. Maybe the grieving is premature. Perhaps objects-early hasn't failed yet -- but it hasn't succeeded yet, either. According to Stuart, that should worry us.

    The folks on the panel seemed to find common ground in the idea that objects-early has neither succeeded nor failed yet. They also seemed to agree that there are many reasonable ways to teach objects early. And most everyone seemed to agree that instructors should use a language that works best for the style of programming they teach. Maybe Java isn't that language.

    In the Q-n-A session that followed, Michael made an interesting observation: We are now living through the lasting damage of the decision by many schools to adopt C++ in CS 1 over a decade ago. When Java came along, it looked really good as a teaching language only because we were teaching C++ at the time. But now we see that it's not good enough. We need a language suitable for teaching, in this case, teaching OOP to beginners. (Kim reminded us of Michael's own Blue language, which he presented at the OOPSLA'97 workshop on Resources for Early Object Design Education.)

    I think that this comment shows an astute understanding of recent CS education history. Back when I first joined the CS faculty here, I supported the move from Pascal to C++ in CS 1. I remember some folks at SIGCSE arguing against C++ as too complex, too unsettled, to teach to freshmen. I didn't believe that at the time, and experience ultimately showed that it's really hard to teach C++ well in CS 1. But the real damage in everyone doing C++ early wasn't in the doing itself, because some folks succeeded, and the folks who didn't like the results could switch to something else. The real damage was in creating an atmosphere in which even a complex language such as Java looks good as a teaching language, an environment in which we seem compelled by external forces to teach an industrial language early in our curricula. Perhaps we slid down a slippery slope.

    My favorite question from the crowd came from Max Hailperin. He asked, "Which of procedural programming and OOP is more like the thinking we do in computer science when we aren't programming?" The implication is that the answer to this question may give us a reason for preferring one over the other for first-year CS, even make the effort needed to switch approaches a net win over the course of the CS curriculum. I loved this question and think it could be the basis of a fun and interesting panel in its own right. I suspect that, on Max's implicit criterion, the mathematical and theoretical sides of computing may make procedural programming the preferred option. Algorithms and theory don't seem to have objects in them in the same way that objects populate an OO program. But what about databases and graphics and AI and HCI? In these "applications" objects make for a compelling way to think about problems and solutions. I'll have to give this more thought.

    After the panel ended, Robert Duvall commented that the panel had taught him that the so-called killer examples workshops that have taken place at the last few OOPSLAs have failed. Not that the idea -- having instructors share their best examples for teaching various OO concepts -- is a bad one. But the implementation has tended to favor glitzy examples, complicated examples. What we need are simple examples that teach an important idea with minimum overhead and minimum distraction for students. I'm certainly not criticizing these workshops or the folks who organize them, nor do I think Robert is; they are our friends. But the workshops have not yet had the effect that we had all hoped for them.

    Another thing that struck me about this panel was Stuart's relative calmness, almost seriousness. He seems at peace with his "back to the '80s" show, and he no longer treats this debate as much of a joking matter. His demeanor says something about the importance of this issue to him.

    My feeling on all of this was best captured by a cool graphic that Owen posted sometime near the middle of the session:

    The second attempt to fly Langley's Aerodrome on December 8, 1903, also ended up in failure. After this attempt, Langley gave up his attempts to fly a heavier-than-air aircraft.

    (Thanks to the wonders of ubiquitous wireless, I was able in real time to find Owen's source at http://www.centennialofflight.gov/.)

    I don't begrudge folks like Stuart and Eliot finding their comfort point with objects later. They, and more importantly their students, are best served that way. But I hope that the Michael Köllings and Kim Bruces and Joe Bergins of the world continue to seek the magic of object-oriented flight for CS 1. It would be a shame to give up on December 8 with the solution just around the corner.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    February 25, 2005 3:43 PM

    Day 2 at SIGCSE: Another Keynote on Past and Present

    This morning's keynote was given by last year's recipient of SIGCSE's award for outstanding contributions to CS education, Mordechai (Moti) Ben-Ari. Dr. Ben-Ari was scheduled to speak last year but illness prevented him from going to Norfolk. His talk was titled "The Concorde Doesn't Fly Anymore" and touched on a theme related to yesterday's talk, though I'm not certain he realized it.

    Dr. Ben-Ari gave us a little history lesson, taking us back to the time of the US moon landing. He asserted that this was the most impressive achievement in the history of technology and reminded us that the project depended on some impressive computing -- a program that was delivered six months in advance of the mission, which used only 2K of writable memory.

    Then he asked the audience to date some important "firsts" in computing history, such as the year the first e-mail was sent. I guessed 1973, but he gave 1971 as the right answer. (Not too bad a guess, if I do say so myself.) In the end, all of the firsts dated to the period 1970-1975 -- just after that moon landing. So much innovation in such a short time span. Ben-Ari wondered, how much truly new have we done since then? In true SIGCSE fashion, he made good fun of Java, a language cobbled out of ideas discovered and explored in the '60s and '70s, among them Simula, Smalltalk, Pascal, and even C (whose contribution was "cryptic syntax").

    The theme of the talk was "We in computing are not doing revolutionary stuff anymore, but that's okay." Engineering disciplines move beyond glitz as they mature. Valuable and essential engineering disciplines no longer break much new ground, but they have steady, sturdy curricula. He seemed to say that we in computing should accept that we have entered this stage of development and turn CS into a mature educational discipline.

    His chief example was mechanical engineering. He contrasted the volume of mathematics, science, and settled engineering theory and concepts required by the ME program at his university with the same requirements in the typical CS program. Seven math courses instead of four; five science courses instead of two; a dozen courses in engineering foundations instead of three or four in computing. Yet we in CS feel a need to appear "relevant", teaching new material and new concepts and new languages and new APIs. No one, he said, complains that mechanical engineering students learn about pulleys and inclined planes -- 300-year-old technology! -- in their early courses, but try to teach 30-year-old Pascal in a CS program and prepare for cries of outrage from students, parents, and industry.

    In this observation, he's right, of course. We taught Ada in our first-year courses for a few years in the late 1990s and faced a constant barrage of questions from parents and students as to why, and what good would it be to them and their offspring.

    In the larger scheme of things, though, is he right? It seems that Dr. Ben-Ari harkens back to the good old days when we could teach simple little Pascal in our first course. He's not alone in this nostalgia here at SIGCSE... Pascal has a hold on the heartstrings of many CS educators. It was a landmark in the history of CS education, when a single language captured the zeitgeist and technology of computing all at once, in a simple package that appealed to instructors looking for something better and to students who could wrap their heads around the ideas and constructs that made up Pascal in their early courses.

    A large part of Pascal's success as a teaching language lay in how well it supplanted languages such as Fortran (too limited) and PL/I (too big and complex) in the academic niche. I think PL/I is a good language to remember in this context. To me, Java is the PL/I of the late 1990s and early 2000s: a language that aspires to do much, a language well-suited to a large class of programs that need to be written in industry today, and a language that is probably too big and complex to serve all our needs as the first language our students see in computer science.

    But that was just the point of Kim Bruce's talk yesterday. It is our job to build the right layer of abstraction at which to teach introductory CS, and Java makes a reasonable base for doing this. At OOPSLA last year, Alan Kay encouraged us to aspire to more, but I think that he was as disturbed by the nature of CS instruction as with Java itself. If we could build an eToys-like system on top of Java, then Alan would likely be quite happy. (He would probably still drop in a barb by asking, "But why would you want to do that when so many better choices exist?" :-)

    In the Java world, many projects -- BlueJ, Bruce's ObjectDraw, JPT, Karel J. Robot, and many others -- are aimed in this direction. They may or may not succeed, but each offers an approach to focusing on the essential ideas while hiding the details of the industrial-strength language underneath. And Ben-Ari might be happy that Karel J. Robot's pedagogical lineage traces back to 1981 and the Era of Pascal!

    As I was writing the last few paragraphs, Robert Duvall sat down and reminded me that we live in a different world than the one Pascal entered. Many of our incoming students arrive on campus with deep experience playing with and hacking Linux. Many arrive with experience building web sites and writing configuration scripts. Some even come in with experience contributing to open-source software projects. What sort of major should we offer such students? They may not know all that they need to know about computer science to jump to upper-division courses, but surely "Hello, World" in Pascal or C or Java is not what they need -- or want. And as much as we wax poetic about university and ideas, the world is more complicated than that. What students want matters, at least as it determines the desire they have to learn what we have to teach them.

    Ben-Ari addressed this point in a fashion, asserting that we spend a lot of time trying to make CS easy, but that we should be trying to make it harder for some students, so they will be prepared to be good scientists and engineers. Perhaps so, but if we construct our programs in this way we may find that we aren't the ones educating tomorrow's software developers. The computing world really is a complex mixture of ideas and external forces these days.

    I do quibble with one claim made in the talk, in the realm of history. Ben-Ari said, or at least implied, that the computing sciences and the Internet were less of a disruption to the world than the introduction of the telegraph. While I do believe that there is great value in remembering that we are not a one-of-a-kind occurrence in the history of technology -- as the breathless hype of the Internet boom screamed -- I think that we lack proper perspective for judging the impact of computing just yet. Certainly the telegraph changed the time scale of communication by orders of magnitude, a change that the Internet only accelerates. But computing affects so many elements of human life. And, as Alan Kay is fond of pointing out, its potential is far greater as a communication medium than we can even appreciate at this point in our history. That's why Kay reminds us of an even earlier technological revolution: the printing press. That is the level of change to which we in computing should aspire, fundamentally redefining how we talk about and do almost everything we do.

    Ben-Ari's thesis resembles Kay's in its call for simplicity, but it differs in many other ways. Are we yet a mature discipline? Depending on how we answer this question, the future of computer science -- and computer science education -- should take radically different paths.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    February 18, 2005 2:20 PM

    Never Been Compared to a Barrel of Boiling Oil Before

    I hope that my Algorithms course doesn't have this kind of reputation among our students!

    Teaching Algorithms is a challenge different from my other courses, which tend to be oriented toward programming and languages. It requires a mixture of design and analysis, and the abstractness of the design combines with the theory in the analysis to put many students off. I try to counter this tendency by opening most class sessions with a game or puzzle that students can play. I hope that this will relax students, let them ease into the session, and help them make connections from abstract ideas to concrete experiences they had while playing.

    Some of my favorite puzzles and games require analysis that stretches even the best students. For example, in the last few sessions, we've taken a few minutes each day to consider a puzzle that David Ginat has called Election. This is my formulation:

    An election is held. For each office, we are given the number of candidates, c, and a text file containing a list of v votes. Each vote consists of the candidate's number on the ballot, from 1 to c. v and c are so large that we can't use naive, brute-force algorithms to process the input.

    Design an algorithm to determine if there is a candidate with a majority of the votes. Minimize the space consumed by the algorithm, while making as few passes through the list of votes as possible.

    We have whittled complexity down to O(log c) space and two passes through the votes. Next class period, we will see if we can do even better!
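    (For readers who want a head start: one well-known approach that meets those bounds is the majority vote algorithm usually credited to Boyer and Moore. Here is a minimal sketch in Java, using an in-memory array of ballots rather than the text file in the puzzle statement; it is one possibility, not necessarily where the class will end up.)

        // Majority vote in two passes: pass 1 finds the only possible
        // majority candidate, pass 2 verifies that it really has a majority.
        public class Election {
            public static int majority(int[] votes) {
                int candidate = -1;
                int count = 0;
                for (int v : votes) {                      // pass 1: store one candidate + one counter
                    if (count == 0) { candidate = v; count = 1; }
                    else if (v == candidate) { count++; }
                    else { count--; }
                }
                int tally = 0;
                for (int v : votes) {                      // pass 2: verify
                    if (v == candidate) { tally++; }
                }
                return (tally > votes.length / 2) ? candidate : -1;   // -1 means no majority
            }

            public static void main(String[] args) {
                int[] ballots = { 3, 1, 3, 3, 2, 3, 3 };   // made-up sample votes
                System.out.println(majority(ballots));      // prints 3
            }
        }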

    I figure that students should have an opportunity to touch greatness regularly. How else will they become excited by the beauty and possibilities of an area that to many of them looks like "just math"?


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    January 14, 2005 4:50 PM

    A Couple of Nice Excerpts

    Before leaving for the weekend, two quotes from blog reading today. Both are about the life of the mind.

    From Uncertain Principles:

    And, you know, in a certain sense, that's the whole key to being a scientist. It's not so much a matter of training, as a habit of mind. It's a willingness to poke randomly at a selection of buttons on somebody else's camera, in hopes that it might shed some light on the problem.

    It's a habit that's all too easy to fall out of, too. It's easy to start relying on colleagues and technicians and reference works for information, and stop poking at things on your own.

    From Electron Blue:

    Books are patient, books are kind. They stay up with you all night and never complain that you are taking up too much of their time. They are never too busy to help you. They don't complain about being insulted and they rarely (if they are well-written and helpful books, that is) make you feel belittled or stupid. A book will not condescend or laugh at me because I am doing the most basic things over again. And you can work with a book at your own pace. [...] I once said to a Live Physicist that I am learning physics one electron at a time. He replied that this would mean that the time-span of my learning physics would exceed the projected life-span of the entire universe. Well, it may indeed take me that long to get to any kind of advanced physics. But I might as well start where I am, and keep going.


    Posted by Eugene Wallingford | Permalink | Categories: Computing

    January 13, 2005 5:50 AM

    I Knew It!

    My first musical recommendation here has a computer programming twist. Eternal Flame is the first song I've ever heard that talks about both assembly language programming and Lisp programming. Even if you don't like folk music all that much, listen for a little while... It will confirm what you've always known deep in your heart: God programs in Lisp.

    You can find other songs with technology themes at The Virtual Filksing.

    (Via Jefferson Provost. Jefferson, if you ever record a rock version of Eternal Flame, please let me know!)


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    January 06, 2005 3:03 PM

    Looking Under the Hood to Be a Better Programmer

    Brian Marick just blogged on Joel Spolsky's Advice for Computer Science College Students. I had the beginnings of an entry on Joel's stuff sitting in my "to blog" folder, but was planning to put it off until another day. (Hey, this is the last Thursday I'll be able to go home early for sixteen weeks...) But now I feel like saying something.

    Brian's entry points out that Joel's advice reflects a change in the world of programming from back when we old fogies studied computer science in school: C now counts as close enough to the hardware to be where students should learn how to write efficient programs. (Brian has more to say. Read it.)

    I began to think again about Joel's advice when I read the article linked above, but it wasn't the first time I'd thought about it. In fact, I had a strong sense of deja vu. I looked back and found another article by Joel, on leaky abstractions, and then another, called Back to Basics. There is a long-term theme running through Joel's articles, that programmers must understand both the abstractions they deal in and how these abstractions are implemented. In some ways, his repeated references to C are mostly pragmatic; C is the lingua franca at the lowest level of software development, even -- Brian mentions -- for those of us who prefer to traffic in Ruby, Smalltalk, Perl or Python. But C isn't the key point Joel is making. I think that this, from the leaky abstractions article, is [emphasis added]:

    The law of leaky abstractions means that whenever somebody comes up with a wizzy new ... tool that is supposed to make us all ever-so-efficient, you hear a lot of people saying "learn how to do it manually first, then use the wizzy tool to save time." [...] tools which pretend to abstract out something, like all abstractions, leak, and the only way to deal with the leaks competently is to learn about how the abstractions work and what they are abstracting. So the abstractions save us time working, but they don't save us time learning.

    And all this means that paradoxically, even as we have higher and higher level programming tools with better and better abstractions, becoming a proficient programmer is getting harder and harder.

    Programming has always been hard, but it gets harder when we move up a level in abstraction, because now we have to worry about the interactions between the levels. Joel's article argues that it's impossible to create an abstraction that doesn't leak. I'm not sure I'm willing to believe it's impossible just yet, but I do believe that it's close enough for us to act as if it is.
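    A small example of the kind of leak in question -- my own illustration, not one from Joel's article: a two-dimensional Java array presents itself as a uniform grid, but the order in which you traverse it can change the running time noticeably, because the abstraction leaks the memory layout underneath.

        // Both loops touch the same elements, but the row-major loop usually
        // runs noticeably faster on a large array, because it walks memory in order.
        public class LeakyGrid {
            public static void main(String[] args) {
                int n = 2000;
                int[][] grid = new int[n][n];
                long sum = 0;

                long start = System.currentTimeMillis();
                for (int i = 0; i < n; i++)          // row-major: follows the layout
                    for (int j = 0; j < n; j++)
                        sum += grid[i][j];
                System.out.println("row-major:    " + (System.currentTimeMillis() - start) + " ms");

                start = System.currentTimeMillis();
                for (int j = 0; j < n; j++)          // column-major: fights the layout
                    for (int i = 0; i < n; i++)
                        sum += grid[i][j];
                System.out.println("column-major: " + (System.currentTimeMillis() - start) + " ms");

                System.out.println(sum);             // keep the compiler honest
            }
        }

    The array abstraction still saves us work, right up until performance matters and the layer underneath shows through.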

    That said, Brian's history lesson offers some hope that the difficulty of programming isn't growing at an increasing rate, because sometimes what counts as the lowest level rises. C compilers really are good enough these days that we don't have to learn assembly in order to appreciate what's happening at the machine level. Or do we?

    Oh, and I second Brian's recommendation to read Joel's Advice for Computer Science College Students. He may just be one of those old fogies "apt to say goofy, antediluvian things", but I think he's spot on with his seven suggestions (even though I think a few of his reasons are goofy). And that includes the one about microeconomics!


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    December 30, 2004 8:06 AM

    Computation and Art

    For your daily reminder of what computation can do and what the future of art might be like, check out Jared Tarbell's Gallery of Computation. The artwork generated by Tarbell's code is complex yet often quite alluring. So many artists use computation as a medium these days that this site is "nothing new", but I was struck by the beauty of the images produced.

    I was also struck by the fact that all of the code for producing these works is available on-line:

    I believe all code is dead unless executing within the computer. For this reason I distribute the source code of my programs in modifiable form to encourage life and spread love. Opening one's code is a beneficial practice for both the programmer and the community. I appreciate modifications and extensions of these algorithms. Please send me your experiences.

    Eugene sez: Two thumbs up!

    I've had one student do work of this sort. A few years ago, I had a student named David Schmudde. Dave is one of those guys who mixes both technical skills and interests with artistic skills and interests, both music and visual art. For his course project in Intelligent Systems, he created a program called Ardis. This program acts like a set of sophisticated Adobe Photoshop filters. It consists of a set of rules about the features of paintings done in certain artistic styles, such as German Expressionism. Given an image of any sort, it applies the rules of the styles selected by the user to the images in funky ways, as if to say "How would a German expressionist have made this picture?" In the case of German Expressionism, it finds lines that mark objects and exaggerates them. The program uses a bit of randomness in its filtering, which means that you can use Ardis to create a set of images all of a theme.

    As his instructor, I was most impressed that Dave wrote almost all of the code that makes up Ardis. At the time, there wasn't all that much in the way of image processing packages in Java, so he went off and learned what he needed to implement and did it himself. The program isn't perfect or polished, not nearly as much so as Tarbell's work on-line, but it was a great result for a semester's work. Dave went on to do a master's degree in music and technology at Northwestern, which he just completed last spring. I'll have to dig out Ardis and see if I can't package it up for folks to play with and extend.

    [ Update: I found an old pointer to a description of Ardis on line. Check out David's page http://www.davidshino.com/ardis.html for a bit about his program. ]

    Sometimes, a computer scientist can produce a beautiful picture without intending to. One of my current M.S. students, Nathan Labelle, is working on a project involving power laws and open-source software. In the course of displaying a particular relationship among 100 randomly selected Linux packages, he produced the image to the right: a graph that appears to be a wonderful line drawing of a book whose pages are being riffled. I think it's quite beautiful.


    Posted by Eugene Wallingford | Permalink | Categories: Computing

    December 27, 2004 11:19 AM

    Dispatches from the Programmer Liberation Front

    Owen Astrachan pointed me in the direction of Jonathan Edwards's new blog. I first encountered Edwards's work at OOPSLA 2004, where he gave a talk in the Onward! track on his example-driven programming tool.

    In his initial blog entries, Edwards introduces readers to the programming revolution in which he would like to participate, away from more powerful and more elegant languages and toward a programming for the Everyman, where we build tools and models that fit the way the real programmer's mind works. His Onward! talk demoed an initial attempt at a tool of this sort, in which the programmer gives examples of the desired computation and the examples become the program.

    Edwards's current position stands in stark contrast to his earlier views as a more traditional researcher in programming languages. As many of us are coming to see, though, programming is more "about learning to work effectively in the face of overwhelming complexity" than it is about ever more clever programming languages and compiler tricks. When it comes to taming the complexity inherent in large software, "simplicity, flexibility, and usability are more effective than cleverness and elegance."

    The recent trend toward agile development methodologies and programming support tools such as JUnit and Eclipse also draw their inspiration from a desire for simpler and more flexible programs. Most programmers -- and for Edwards, this includes even the brightest -- don't work very well with abstractions. We have to spend a lot of brain power managing the models we have of our software, models that range from execution on a standard von Neumann architecture up to the most abstract designs our languages will allow. Agile methods such as XP aim to keep programmers' minds on the concrete embodiment of a program, with a focus on building supple code that adapts to changes in our understanding of the problem as we code. Edwards even uses one of Kent Beck's old metaphors that is now fundamental to the agile mindset: Listen carefully to what our code is telling us.

    But agile methods don't go quite as far as Edwards seems to encourage. They don't preclude the use of abstract language mechanisms such as closures or higher-order procedures, or the use of a language such as Haskell, with its "excessive mathematical abstraction". I can certainly use agile methods when programming in Lisp or Smalltalk or even Haskell, and in those languages closures and higher-order procedures and type inference would be natural linguistic constructs to use. I don't think that Edwards is saying such things are in and of themselves bad, but only that they are a symptom of a mindset prone to drowning programmers in the sort of abstractions that distract them from what they really need in order to address complexity. Abstraction is a siren to programmers, especially to us academic types, and one that is ultimately ill-suited as a universal tool for tackling complexity. Richard Gabriel told us that years ago in Patterns of Software (pdf).

    I am sympathetic to Edwards's goals and rationale. And, while I may well be the sort of person he could recruit into the revolution, I'm still in the midst of my own evolution from language maven to tool maven. Oliver Steele coined those terms, as near as I can tell, in his article The IDE Divide. Like many academics, I've always been prone to learn yet another cool language rather than "go deep" with a tool like emacs or Eclipse. But then it's been a long time since slogging code was my full-time job, when using a relatively fixed base of language to construct a large body of software was my primary concern. I still love to learn a Scheme or a Haskell or a Ruby or a Groovy (or maybe Steele's own Laszlo) to see what new elegant ideas I can find there. Usually I then look to see how those ideas can inform my programming in the language where I do most of my work, these days Java, or in the courses where I do most of my work.

    I don't know where I'll ultimately end up on the continuum between language and tool mavens, though I think the shift I've been undergoing for the last few years has taken me to an interesting place and I don't think I'm done yet. A year spent in the trenches might have a big effect on me.

    As I read Edwards's stuff, and re-read Steele's, a few other thoughts struck me:

    • In his piece on the future of programming, Edwards says,

      I retain a romantic belief in the potential of scientific revolution ... that there is a "Calculus of Programming" waiting to be discovered, which will ... revolutionize the way we program....

      (The analogy is to the invention of the calculus, which revolutionized the discipline of physics.) I share this romantic view, though my thoughts have been with the idea of a pattern language of programs. This is a different sort of 'language' than Edwards means when he speaks of a calculus of programs, but both types of language would provide a new vocabulary for talking about -- and building -- software.

    • Later in the same piece, Edwards says,

      Copy & paste is ubiquitous, despite universal condemnation. ... I propose to decriminalize copy & paste, and even to elevate it into the central mechanism of programming.

      Contrary to standard pedagogy, I tell my students that it's okay to copy and paste. Indeed, I encourage it -- so long as they take the time after "making it work" to make it right. This means refactoring to eliminate duplication, among other things. Some students find this to be heresy, or nearly so, which speaks to how well some of their previous programming instructors have drilled this wonderful little practice out of them. Others take to the notion quite nicely but, under the time pressures that school creates for them and that their own programming practices exacerbate, have a hard time devoting sufficient energy to the refactoring part of the process. The result is just what makes copy and paste so dangerous: a big ball of mud with all sorts of duplicated code.

      Certainly, copy and paste is a central mechanism of doing the simplest thing that could possibly work. The agile methods generally suggest that we then look for ways to eliminate duplication. Perhaps Edwards would suggest that we look for ways to leave the new code as our next example. A small sketch of this make-it-work-then-refactor cycle appears just after this list.

    • At the end of the same piece, Edwards proposes an idea I've never seen before: the establishment of "something like a teaching hospital" in which to develop this new way of programming. What a marvelous analogy!

      Back when I was a freshman architecture major, I saw more advanced students go out on charrette. This exercise sent the class on site -- say, a road trip to a small town -- to work as a group, under the supervision of their instructors, who were themselves licensed architects, on designing a solution to a problem facing the folks there, such as a new public activity center. Charrette was a way for students to gain experience working on a real problem for real clients, who might then use the solution in lieu of paying a professional firm for a solution that wasn't likely to be a whole lot better.

      Software engineering courses often play a similar role in undergraduate computer science programs. But they usually miss out on a couple of features of a charrette, not the least of which is the context provided by going to the client site and immersing the team in the problem.

      A software institute that worked like a teaching hospital could provide a more authentic experience for students and researchers exploring new ways to build software. Clients would come to the institute, rather than instructors drumming up projects that are often created (or simplified) for students. Clients would pay for the software and use it, meaning that the product would actually have to work and be usable by real people. Students would work with researchers and teachers -- who should be the same people! -- in a model more like apprenticeship than anything our typical courses can offer.

      The Software Engineering Institute at Carnegie Mellon may have some programs that work like this, but it's an idea that is broader than the SEI's view of software engineering, one that could put our CS programs in much closer touch with the world of software than many are right now.
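
    As promised above, here is a minimal Java sketch of the copy-paste-then-refactor cycle I ask students to follow. The class, methods, and data are hypothetical, invented only to illustrate the practice.

      import java.util.List;

      // A hypothetical report class, used only to illustrate the cycle:
      // "make it work" by copying and pasting, then "make it right" by
      // refactoring to eliminate the duplication.
      public class Report {

          record Employee(double salary, double bonus) { }

          // Step 1: make it work. The second method began life as a copy of the first.
          public double totalSalary(List<Employee> employees) {
              double total = 0.0;
              for (Employee e : employees) {
                  total += e.salary();
              }
              return total;
          }

          public double totalBonus(List<Employee> employees) {
              double total = 0.0;
              for (Employee e : employees) {
                  total += e.bonus();        // the only line that differs
              }
              return total;
          }

          // Step 2: make it right. Extract the varying part so the loop exists once.
          // Usage: total(employees, Employee::salary) or total(employees, Employee::bonus)
          interface Amount { double of(Employee e); }

          public double total(List<Employee> employees, Amount amount) {
              double total = 0.0;
              for (Employee e : employees) {
                  total += amount.of(e);
              }
              return total;
          }
      }

    The point is not this particular refactoring -- extracting a small interface is only one option -- but that the duplication gets removed after the copied version works.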

    There seems to be a wealth of revolutionary projects swirling in the computer science world these days: test-driven development, agile software methods, Croquet and eToys .... That's not all that unusual, but perhaps unique to our time is the confluence of so many of these movements in the notion of simplicity, of pulling back from abstraction toward the more concrete expression of computational ideas. This more general trend is perhaps a clue for us all, and especially for educators. One irony of this "back to simplicity" trend is that it is predicated on increasingly complex tools such as Eclipse and Croquet, tools that manage complexity for us so that we can focus our limited powers on the things that matter most to us.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    December 17, 2004 1:51 PM

    Transcript of Alan Kay's Turing Award Lecture

    Earlier, I blogged about Alan Kay's talks at OOPSLA 2004. Now, courtesy of John Maxwell, a historian at the University of British Columbia, I can offer a transcript of Alan's Turing Award lecture, as plain text and rich text. Enjoy!

    Update: And now, thanks to Darius Bacon, we have a lightweight HTML version.


    Posted by Eugene Wallingford | Permalink | Categories: Computing

    December 16, 2004 2:34 PM

    Google Fun and Future

    I added "google" to my OS X spell-checker's dictionary yesterday morning. I'm surprised that it's taken me this long. I'm also reminded of a couple of cool Google services I've been playing with of late.

    • Google Suggest does auto-completion as you type your queries, based on the popularity of possible continuations. (A toy sketch of the idea appears just after this list.) In my ego-driven way, I tried my name and was disappointed by how many characters it took for my name to be the top choice: "eugene walli" -- and by that point I am the only choice. :-)

    • Straight from Google Labs, Google Sets is a very cool idea, though Google has made it available since mid-2002. You type the names of one or more related items, and Google uses its database to offer you other members of the set it induces from yours.

      Their on-line sample shows Sets finding the names of many automobile manufacturers from an initial set of three. Of course, the quality of the sets it can find depends on the existence of web pages containing your terms with rich connections to one another. For example, when I typed in the names of several chess players, including "Bobby Fischer", which gives nearly 800,000 matches, Google couldn't find a set for me.
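
    As mentioned above, here is a toy Java sketch of the general idea behind Suggest: completions ranked by how often each full query has been seen. The query log, counts, and API are all invented for the example; Google's real service surely works differently and at a vastly different scale.

      import java.util.*;
      import java.util.stream.Collectors;

      // A toy "suggest" service: rank known queries that extend a prefix
      // by how often they have been issued. All data here is invented.
      public class Suggest {
          private final Map<String, Integer> queryCounts = new HashMap<>();

          public void record(String query) {
              queryCounts.merge(query, 1, Integer::sum);
          }

          public List<String> suggest(String prefix, int limit) {
              return queryCounts.entrySet().stream()
                      .filter(e -> e.getKey().startsWith(prefix))
                      .sorted((a, b) -> b.getValue() - a.getValue())   // most popular first
                      .limit(limit)
                      .map(Map.Entry::getKey)
                      .collect(Collectors.toList());
          }

          public static void main(String[] args) {
              Suggest s = new Suggest();
              s.record("eugene wallingford");
              s.record("eugene wallingford");
              s.record("eugene onegin");
              System.out.println(s.suggest("eugene", 2));
              // prints: [eugene wallingford, eugene onegin]
          }
      }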

    To be honest, I'm not sure how much I'll use these services after my initial playing phase. I've never been a big fan of auto-completion, except when I request it explicitly through, say, emacs's tab key. I've read that the dynamic HTML implementation beneath the hood of Suggest is a valuable attempt to extend the diversity and quality of web app interfaces, but that's outside my domain of expertise.

    Indeed, I'm not certain that these particular services will be the ultimate wins that arise from the techniques used. For example, Joel Spolsky says this about Google Suggest:

    It's important not for searching, but because it's going to teach web users to expect highly responsive user interfaces.

    My thoughts about the things you find at Google Labs were focused more on Google and its vitality as a leading corporation. The idea is that Google can use its massive databases and computing power to gain leverage beyond traditional web search.

    I'm not much of a visionary when it comes to predicting what emerging goods, services, and technologies will win big in the future. If I were, I could do better than a professor's salary! But services like Suggest and Sets and Scholar are innovative ways for Google to explore the horizon of the services it offers, and ultimately to push the boundaries of its technology -- and the boundaries of what we can do as users.


    Posted by Eugene Wallingford | Permalink | Categories: Computing

    December 10, 2004 3:46 PM

    A Fun Analogy on Garbage Collection

    Recently, jaybaz made an interesting analogy between garbage collection and, well, garbage collection:

    What if Garbage Collection was like Garbage Collection?

    Every Thursday morning at 6:00am, the garbage truck stops in front of your house. A scruffy man in an orange jumpsuit steps down, walks up to your front door, and lets himself in.

    He walks around the house, picking up each item you own, and asks, "Are you still using this?" If you don't say "yes", he carts it away.

    What if we turned the analogy around? Be sure to check out the reader comments to his article.

    When I teach design patterns, I like to make analogies to real-world examples, like television remote controls (iterator) and various kinds of adapter. I think a set of analogies for all the garbage collection techniques we use in programming language implementation would make a fun teaching tool.
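
    In that spirit, here is the analogy turned into code: a toy mark-and-sweep collector over a made-up object graph, written in Java. It is only a sketch of the general technique -- the "scruffy man" walking the heap -- not how any particular language runtime actually does it.

      import java.util.*;

      // A toy mark-and-sweep collector: walk the object graph from the roots,
      // ask "are you still using this?" (mark reachable objects), and cart
      // away everything left unmarked.
      public class ToyHeap {
          static class Obj {
              final String name;
              final List<Obj> references = new ArrayList<>();
              boolean marked = false;
              Obj(String name) { this.name = name; }
          }

          private final List<Obj> allObjects = new ArrayList<>();

          Obj allocate(String name) {
              Obj o = new Obj(name);
              allObjects.add(o);
              return o;
          }

          // Mark phase: everything reachable from the roots is still in use.
          private void mark(Obj o) {
              if (o.marked) return;
              o.marked = true;
              for (Obj ref : o.references) mark(ref);
          }

          // Sweep phase: reclaim the unmarked objects, reset marks for next time.
          List<String> collect(List<Obj> roots) {
              for (Obj root : roots) mark(root);
              List<String> reclaimed = new ArrayList<>();
              Iterator<Obj> it = allObjects.iterator();
              while (it.hasNext()) {
                  Obj o = it.next();
                  if (!o.marked) { reclaimed.add(o.name); it.remove(); }
                  else o.marked = false;
              }
              return reclaimed;
          }

          public static void main(String[] args) {
              ToyHeap heap = new ToyHeap();
              Obj a = heap.allocate("a");
              Obj b = heap.allocate("b");
              heap.allocate("orphan");        // never referenced from a root
              a.references.add(b);
              System.out.println(heap.collect(List.of(a)));   // prints: [orphan]
          }
      }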


    Posted by Eugene Wallingford | Permalink | Categories: Computing

    December 08, 2004 2:14 PM

    The Evolution of Language

    I've been reading about XML a bit lately, and I can't help but be reminded of a wonderful T-shirt I saw Philip Wadler wearing at the 2002 International Conference on Functional Programming. I know that this is an oldie, but it's still a goodie.

    The Evolution of Language

    2x (Descartes)

    λx.2x (Church)

    (LAMBDA (X) (* 2 X)) (McCarthy)

    <?xml version="1.0"?> (W3C)
    <LAMBDA-TERM>
      <VAR-LIST>
        <VAR> X </VAR>
      </VAR-LIST>
      <EXPR>
        <APPLICATION>
          <EXPR>
            <CONST> * </CONST>
          </EXPR>
          <ARGUMENT-LIST>
            <EXPR>
              <CONST> 2 </CONST>
            </EXPR>
            <EXPR>
              <VAR> X </VAR>
            </EXPR>
          </ARGUMENT-LIST>
        </APPLICATION>
      </EXPR>
    </LAMBDA-TERM>

    Philip has a PDF version on line.


    Posted by Eugene Wallingford | Permalink | Categories: Computing

    November 06, 2004 9:03 PM

    Alan Kay's Talks at OOPSLA

    Alan Kay gave two talks at OOPSLA last week, the keynote address at the Educators Symposium and, of course, his Turing Award lecture. The former was longer and included most of the material from the Turing lecture, especially when you consider his generous question-and-answer session afterwards, so I'll just comment on the talks as a single presentation. That works because they argued for a common thesis: introductions to computing should use simple yet powerful computational ideas that empower the learner to explore the larger world of ideas.

    Kay opened by decrying the premature academization of computing. He pointed out that Alan Perlis coined the term "computer science" as a goal to pursue, not as a label for existing practice, and that the term "software engineering" came with a similar motivation. But CS quickly ossified into a discipline with practices and standards that limit intellectual and technical discovery.

    Computing as science is still an infant. Mathematicians work in a well-defined space, creating small proofs about infinite things. Computing doesn't work that way: our proofs are programs, and they are large, sometimes exceedingly so. Kay gave a delightful quote from Donald Knuth, circa 1977:

    Beware of bugs in the above code; I have only proved it correct, not tried it.

    The proof of our programs lies in what they do.

    As engineering, computing is similarly young. Kay contrasted the construction of the Great Pyramid by 200,000 workers over 20 years with the construction of the Empire State Building by fewer than 3,000 people in less than 11 months. He asserts that we are somewhere in the early Middle Ages on that metaphorical timeline. What should we be doing to advance? Well, consider that the builders of the Pyramid used hand tools and relatively weak ideas about building, whereas the engineers who made the Empire State Building used power tools and even more powerful ideas. So we should be creating better tools and ideas. (He then made a thinly veiled reference to VisualStudio.NET 2005, announced at OOPSLA, as just another set of hand tools.)

    So, as a science and professional discipline in its youth, what should computing be to people who learn it? The worst thing we can do is to teach computer science as if it already exists. We can afford to be humble. Teach students that much remains to be done, that they have to help us invent CS, that we expect them to do it!

    Kay reminded us that what students learn first will have a huge effect on what they think, on how they think about computing. He likened the effect to that of duck imprinting, the process in which ducklings latch onto whatever object interacts with them in the first two days of their lives -- even if the object is a human. Teaching university students as we do today, we imprint in them that computing is about arcane syntax, data types, tricky little algorithms, and endless hours spent in front of a text editor and compiler. It's a wonder that anyone wants to learn computing.

    So what can we do instead? I ran into some interesting ideas on this topic at OOPSLA even before the Educators' Symposium and had a few of my own during the week. I'll be blogging on this soon.

    Alan has an idea of the general direction in which we should aim, too. This direction requires new tools and curricula designed specifically to introduce novices to the beauty of computing.

    He held up as a model for us Frank Oppenheimer's Exploratorium, "a collage of over 650 science, art, and human perception exhibits." These "exhibits" aren't the dead sort we find in most museums, where passive patrons merely view an artifact, though. They are stations with projects and activities where children can come face to face with the first simple idea of science: The world is not always as it seems. Why are there so many different exhibits? Kay quoted Oppenheimer to the effect that, if only we can bring each student into contact with that one project or idea that speaks directly to her heart, then we will have succeeded.

    Notice that, with that many projects available, a teacher does not have to "assign" particular projects to students at a particular point in time. Students can choose to do something that motivates them. Kay likened this to reading the classics. He acknowledged that he has read most of the classics now, but he didn't read them in school when they were assigned to him. Back then he read other stuff, if only because he had chosen it for himself. One advantage of students reading what they want is that a classroom will be filled with people who have read different things, which allows you to have an interesting conversation about ideas, rather than about "the book".

    What are the 650 examples or projects that we need to light a fire in every college student's heart? Every high schooler? Every elementary school child?

    Kay went on to say that we should not teach dumbed-down versions of what experts know. That material is unnecessarily abstract, refined, and distant from human experience. Our goal shouldn't be to train a future professional computer scientist (whatever that is!) anyway. Those folks will follow naturally from a population that has a deep literacy in the ideas of science and math, computing and communication.

    Here, he pointed to the typical first courses in the English department at most universities. They do not set out to create professional writers or even professional readers. Instead, they focus on "big ideas" and how we can represent and think about them using language. Alan thinks introductory computer science should be like this, too, about big ideas and how to represent and think about them in language. Instead, our courses are "driver's ed", with no big ideas and no excitement. They are simply a bastardization of academic computer science.

    What are the big ideas we should introduce to students? What should we teach them about language in order that they might represent ideas, think about them, and even have ideas of their own?

    Alan spent quite a few minutes talking about his first big inspiration in the world of computing: Ivan Sutherland's Sketchpad. It was in Sketchpad that Kay first realized that computing was fundamentally a dynamic medium for expressing new ideas. He hailed Sutherland's work as "the greatest Ph.D. dissertation in computer science of all time", and delighted in pointing out Sketchpad's two-handed user interface ("the way all UIs should be"). In an oft-told but deservedly repeated anecdote, Kay related how he once asked Sutherland how he could have created so many new things -- the first raster graphics system, the first object-oriented system, the first drawing program, and more -- in a single year, working alone, in machine language. Sutherland replied, "... because I didn't know it was hard".

    One lesson I take from this example is that we should take care in what we show students while they are learning. If they see examples that are so complex that they can't conceive of building them, then they lose interest -- and we lose a powerful motivator.

    Sutherland's dissertation includes the line, "It is to be hoped that future work will far surpass this effort". Alan says we haven't.

    Eventually, Kay's talk got around to showing off some of the work he and his crew have been doing at Squeakland, a science, math, and computing curriculum project built on top of the modern, open source version of Smalltalk, Squeak. One of the key messages running through all of this work can be found in a story he told about how, in his youth, he used to take apart an old Model T Ford on the weekend so that he could figure out how it worked. By the end of the weekend, he could put it back together in running condition. We should strive to give our students the same experience in the computing environments they use: Even if there's a Ferrari down there, the first thing you see when you open the hood is the Model T version -- first-order theories that, even if they throw some advanced ideas away, expose the central ideas that students need to know.

    Alan demoed a sequence of increasingly sophisticated examples implemented and extended by the 5th- and 6th-grade students in B.J. Conn's charter school classes. The demos in the Educators' Symposium keynote were incredible in their depth. I can't do them justice here. The best you can do is to check out the film Squeakers, and even that has only a small subset of what we saw in Vancouver. We were truly blessed that day!

    The theme running through the demos was how students can explore the world in their own experience, and learn powerful ideas at the "hot spot" where math and science intersect. Students can get the idea behind tensor calculus long before they can appreciate the abstractions we usually think of as tensor calculus. In the course of writing increasingly complex scripts to drive simulations of things they see in the world, students come to understand the ideas of the variable, feedback, choice, repetition, .... They do so by exposing them in action, not in the abstract.

    The key is that students learn because they are having fun exploring questions that matter to them. Sometime along in here, Kay uttered what was for me the Quote of the Day, no, the Quote of OOPSLA 2004:

    If you don't read for fun, you won't be fluent enough to read for purpose.

    I experience this every day when interacting with university students. Substitute 'compute' or 'program' for 'read', and you will have stated a central truth of undergraduate CS education.

    As noted above, Kay has a strong preference for simple, powerful ideas over complex ideas. He devoted a part of his talk to the Hubris of Complexity, which he believes long ago seduced most folks in computing. Software people tend to favor the joy of complexity, yet we should strive for the joy of simplicity.

    Kay gave several examples of small "kernels" that have changed the world, which all people should know and appreciate. Maxwell's equations were one. Perhaps in honor of the upcoming U.S. elections, he spent some time talking about the U.S. Constitution as one such kernel. You can hold it in the palm of your hand, yet it thrives still after more than two centuries. It is an example of great system design -- it's not law-based or case-based, but principle-based. The Founding Fathers created a kernel of ideas that remains not only relevant but also effective.

    I learned a lot hearing Alan tell some of the history of objects and OOP. In the 1960s, an "object" was simply a data structure, especially one containing pointers. This usage predates object-oriented programming. Alan said that his key insight was that an object could act as a miniature software computer -- not just a data structure, not just a procedure -- and that software scales to any level of expression.

    He also reminded us of something he has said repeatedly in recent years: Object-oriented programming is about messages, not the objects. We worry about the objects, but it's the messages that matter.

    How do we make messages the centerpiece of our introductory courses in computing?

    Periodically throughout the talk, Alan dropped in small hints about programming languages and features. He said that programming language design has a large UI component that we technical folks sometimes forget. A particular example he mentioned was inheritance. While inheritance is an essential part of most OOP, Alan said that students should not encounter it very soon in their education, because it "doesn't pay for the complexity it creates".

    As we design languages and environments for beginners, we can apply lessons from Mihaly Csikszentmihalyi's idea of "flow". Our goal should be to widen the path of flow for learners. One way to do that is to add safety to the language so that learners do not become anxious. Another is to create attention centers to push away potential boredom.

    Kay's talks were full of little nuggets that I jotted down and wanted to share:

    • He thanked many people for participating in the creation of his body of work, sharing with them his honors. Most notably, he thanked Dan Ingalls "for creating Smalltalk; I only did the math". I have often heard Alan single out Ingalls for his seminal contributions. That makes a powerful statement about the power of teams.

    • Throughout the talk, Alan acknowledged many books and ideas that affected him. I couldn't get down all of the recommended reading, but I did record a few:
      • Bruce Alberts, "Molecular Biology of the Cell"
      • Lewis Carroll Epstein, "Relativity Visualized"
      • James D. Watson, "Molecular Biology of the Gene"

    • Clifford Shaw's JOSS was the most beautiful programming language ever created. I've heard of JOSS but now have to go learn more.

    • Whatever you do, ask yourself, "What's the science of it?" You can replace 'science' with many other words to create other important questions: art, math, computing, civilization.

    • "In every class of thirty 11-year-olds, there's usually one Galileo kid." What can we do to give him or her the tools needed to redefine the world?

    Alan closed both talks on an inspirational note that capped everything he had already said and shown us. He told us that Xerox PARC was so successful not because the people were smart, but because they had big ideas and had the inclination to pursue them. They pursued their ideas simple-mindedly. Each time they built something new, they asked themselves, "What does the complexity in our system buy us?" If it didn't buy enough, they strove to make the thing simpler.

    People love to quote Alan's most famous line, "The best way to predict the future is to invent it." I leave you today with the lines that follow this adage in Alan's paper "Inventing the Future" (which appears in The AI Business: The Commercial Uses of Artificial Intelligence, edited by Patrick Henry Winston and Karen Prendergast). They tell us what Alan wants us all to remember: that the future is in our hands.

    The future is not laid out on a track. It is something that we can decide, and to the extent that we do not violate any known laws of the universe, we can probably make it work the way that we want to.

    -- Alan Kay, 1984

    Guys like Alan set a high bar for us. But as I noted last time, we have something of a responsibility to set high goals when it comes to computing. We are creating the medium that people of the future -- and today? -- will use to create the next Renaissance.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    November 03, 2004 3:47 PM

    Other Folks Comment on OOPSLA

    Some other folks have blogged on their experiences at OOPSLA last week. These two caught my attention, for different reasons.

    The trade-off between power and complexity in language

    Nat Pryce writes about stopping by the SeaSide BoF. SeaSide is a web app framework written in Smalltalk. Nat marvels at its elegance and points out:

    SeaSide is able to present such a simple API because it takes advantage of "esoteric" features of its implementation language/platform, Smalltalk, such as continuations and closures. Java, in comparison to Smalltalk, is designed on the assumption that programmers using the Java language are not skilled enough to use such language features without making a mess of things, and so the language should only contain simple features to avoid confusing our poor little heads. Paradoxically, the result is that Java APIs are overly complex, and our poor little heads get confused anyway. SeaSide is a good demonstration that powerful, if complex, language features make the job of everyday programming easier, not harder, by letting API designers create elegant abstractions that hide the complexity of the problem domain and technical solution.

    There is indeed great irony in how choosing the wrong kind of simplicity in a language leads to unnecessary complexity in the APIs and systems written in the language. I don't have an opportunity to teach students Smalltalk much these days, but I always hope that they will experience a similar epiphany when programming in Scheme.

    Not surprisingly, Alan Kay has a lot to say on this topic of simplicity, complexity, and thinking computationally, too. I hope to post my take on Alan's two OOPSLA talks later this week.

    Making Software in a Joyous World

    You gotta love a blog posting subtitled with a line from a John Mellencamp song.

    Brian Marick writes about three talks that he heard on the first day at OOPSLA. He concludes wistfully:

    I wish the genial humanists like Ward [Cunningham] and the obsessive visionaries like Alan Kay had more influence [in the computing world]. I worry that the adolescence of computers is almost over, and that we're settling into that stagnant adulthood where you just plod on in the world as others made it, occasionally wistfully remembering the time when you thought endless possibility was all around you.

    In the world where Ward and Alan live, people use computing to make lives better. People don't just consume ideas; they also produce them. Ward's talk described how programmers can change their world and the world of their colleagues by looking for opportunities to learn from experience and creating tools that empower programmers and users. Alan's talk urged us to take that vision out into everyone's world, where computers can serve as a new kind of medium for expressing new kinds of ideas -- for everyone, not just software folks.

    These are high goals to which we can aspire. And if we don't, then who else will?


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

    October 22, 2004 9:45 AM

    Kenneth Iverson, RIP

    Sad news: Kenneth Iverson passed away on October 19. Iverson is best known for creating APL, an extremely powerful and concise programming language. In 1979, Iverson received the Turing Award for his work on programming and notations for describing programs.

    Concise taken too far turns into cryptic, and APL is often derided for its unreadability. But I have a special place in my heart for APL after a three-week unit studying it in my undergraduate programming languages course in the mid-80s. I've always marveled at just how powerful language can be. In APL, I could write a program to shuffle a deck of cards and deal them to several players in four characters. Within its domain of matrix processing, APL was king.

    I met Iverson in the late 1980s while studying at Michigan State. He came to speak about his more recent language development work, which ultimately led to the J programming language. J synthesized ideas from APL with the functional programming ideas found in John Backus's FP. (Backus also won the Turing Award for work on programming languages, in 1977.) I don't remember much about the day of Iverson's talk at MSU, except that he graciously and patiently fielded many aggressive questions from the audience.

    I am on quite a Turing Award ride these days. As I blogged earlier today, I've been thinking of Edsger Dijkstra this week. Iverson and Backus won in the late '70s. And I'm preparing my introduction for Alan Kay, who is giving the keynote address at the OOPSLA 2004 Educators' Symposium on Monday and then giving his 2003 Turing Award lecture the next night. My excitement about Alan Kay's talks is only heightened by Iverson's passing, as it reminds me that great thinkers are mortal, too. We should appreciate them and learn from them while they are with us.


    Posted by Eugene Wallingford | Permalink | Categories: Computing

    October 01, 2004 2:04 PM

    Proofs from THE BOOK

    I just finished reading Proofs from THE BOOK, by Martin Aigner and Günter Ziegler. The title comes from Paul Erdős, who "liked to talk about The Book, in which God maintains the perfect proofs for mathematical theorems". Mathematicians don't have to believe in God, said Erdős, but they have to believe in the book. Proofs from THE BOOK collects some of the proofs Erdős liked best and some others in the same spirit: clever or elegant proofs that reflect some interesting insight into a problem.

    I am a computer scientist, not a mathematician, but many of these proofs made me smile. My favorite sections were on number theory and combinatorics. Some of the theorems on prime and irrational numbers were quite nice.

    My favorite proof from The Book applied the pigeonhole principle in a surprising way. The claim:

    Suppose we are given n integers a1, ..., an, which need not be distinct. Then there is always a set of consecutive numbers aj+1, ..., ak whose sum is a multiple of n.

    This doesn't seem obvious to me at all. But consider the sequences N = {0, a1, a1+a2, a1+a2+a3, ..., a1+a2+...+an} and R = {0, 1, 2, ..., n-1}. Now consider the function f that maps each member s of N to (s mod n) in R. Now, |N| = n+1 and |R| = n, so by the pigeonhole principle we know that there must be two sums from N, a1+...+aj and a1+...+ak (j<k), mapped to the same value in R. a1+...+aj may actually be 0, the first value in N, but that doesn't affect our result.

    Then the difference of these two sums,

        aj+1 + ... + ak  =  (a1 + ... + ak) - (a1 + ... + aj),

    must have a remainder of 0 when divided by n -- and that is exactly the claim! QED. Beautiful.
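
    The proof is constructive enough to run. Here is a small Java sketch of the argument (my own, for illustration): compute the prefix sums mod n, watch for two positions with the same remainder, and the numbers between those positions sum to a multiple of n.

      import java.util.*;

      // Find consecutive numbers a[j+1..k] whose sum is a multiple of n,
      // using the pigeonhole argument: two of the n+1 prefix sums must
      // share a remainder mod n.
      public class Pigeonhole {
          static int[] findRange(int[] a) {
              int n = a.length;
              Map<Integer, Integer> firstIndexOfRemainder = new HashMap<>();
              firstIndexOfRemainder.put(0, 0);        // the empty prefix sum
              int sum = 0;
              for (int k = 1; k <= n; k++) {
                  sum += a[k - 1];
                  int r = ((sum % n) + n) % n;        // keep remainders non-negative
                  Integer j = firstIndexOfRemainder.get(r);
                  if (j != null) return new int[] { j + 1, k };   // 1-based range a[j+1..k]
                  firstIndexOfRemainder.put(r, k);
              }
              throw new AssertionError("unreachable, by the pigeonhole principle");
          }

          public static void main(String[] args) {
              int[] a = { 3, 7, 5, 2 };               // n = 4
              int[] range = findRange(a);
              System.out.println("a[" + range[0] + ".." + range[1] + "] sums to a multiple of " + a.length);
              // here a[2..3] = 7 + 5 = 12, a multiple of 4
          }
      }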

    Eugene sez: Check out Proofs from THE BOOK.

    P.S. It might be fun to create a similar book for proofs related specifically to computer science. Proofs from THE BOOK has some proofs on counting and one proof on information theory, but most of the book focuses on mathematics more broadly.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    September 15, 2004 11:33 AM

    Ward on the Wiki of the Future

    Last time I wrote about Norm Kerth's Saturday session at the recently-ended PLoP 2004. Norm's topic was myth, especially the Hero's Journey myth.

    Ward Cunningham led the second session of the day, beginning with some of his own history. As many of you know, Ward is best known for taking ideas and turning them into programs, or ways of making programs better. He spoke about "wielding the power of programming", to be able to make a computer do what is important to you. If you can think of an idea, you can write a program to bring it about.

    But programmers can also empower other people to do the same. Alan Kay's great vision going back to his grad school days is to empower people with a medium for creating and expressing thoughts. Ward pointed out that the first program to empower a large group of non-programmers was the spreadsheet. The web has opened many new doors. He called it "a faulty system that delivers so much value that we ignore its fault".

    Ward's wiki also empowers people. It is an example of social software, software that doesn't make sense when used by just one person. The value is in the people that use it together. These days, social software dominates our landscape: Amazon, ebay, and a multitude of web-based on-line communities are just a few examples. Wiki works best when people seek common ground; it is perhaps best thought of as a medium for making and refining arguments, carrying on a conversation that progresses toward a shared understanding.

    This dynamic is interesting, because wiki was predicated in part on the notion of taking a hard problem and acting as if it weren't a problem at all. For wiki, that problem is malevolent editing, users who come to a site for the purpose of deleting pages or defacing ideas. Wiki doesn't guard against this problem, yet, surprisingly, for the most part this just isn't a problem. The social processes of a community discourage malevolent behavior, and when someone violates the community's trust we find that the system heals itself through users themselves repairing the damage. A more subtle form of this is in the flip side of wiki as medium for seeking common ground: so-called "edit wars", in which posters take rigid positions and then snipe at one another on wiki pages that grow increasingly long and tedious. Yet community pressure usually stems the war, and volunteers clean up the mess.

    Ward's latest thoughts on wiki focus on two questions, one technical and one social, but both aimed at a common end.

    First, how can we link wikis together in a way that benefits them all? When there was just one wiki, every reference matched a page on the same server, or a new page was created. But now there are dozens (hundreds?) of public wikis on the web, and this leads to an artificial disjunction in the sharing of information. For example, if I make a link to AgileSoftwareDevelopment in a post to one wiki, the only page to which I can refer is one on the same server -- even if someone has posted a valuable page of that name on another wiki. How could we manage automatic links across multiple wikis, multiple servers?

    Second, how can wiki help information to grow around the world? Ward spoke of wiki's role as a storytelling device, with stories spreading, being retold, changed, and improved, across geographic and linguistic boundaries, and maybe coming back around to the originators with a trail of where the story has been and how it has changed. Think of the children's game "telephone", but without the accumulation of accidental changes, only intentional ones. Could my server connect to other servers that have been the source of stories that interested me before, to find out what's new there? Can my wiki gain information while forgetting or ignoring the stuff that isn't so good?

    Some of these ideas exist today at different levels of human and program control. Some wikis have sister sites, implemented now with crosslinking via naming convention. But could such crosslinking be done automatically by the wikis themselves? For instance, I could tell my wiki to check Ward's nightly, looking for cross-referenced names and creating links, perhaps even links to several wikis for the same name.
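
    To make the idea concrete, here is a rough Java sketch of what such a nightly sister-site check might look like. Everything in it -- the sister URL, the page-naming scheme, the notion that a page "exists" when the server answers a HEAD request with 200 -- is an assumption invented for illustration, not how any real wiki works.

      import java.net.URI;
      import java.net.http.*;
      import java.util.*;

      // A hypothetical nightly job: for each WikiWord on my wiki, ask a sister
      // wiki whether it has a page of the same name, and record a cross-link.
      // The URL scheme and sites are invented for this sketch.
      public class SisterSiteLinker {
          private final HttpClient client = HttpClient.newHttpClient();

          boolean pageExists(String sisterBase, String wikiWord) throws Exception {
              HttpRequest request = HttpRequest.newBuilder()
                      .uri(URI.create(sisterBase + wikiWord))
                      .method("HEAD", HttpRequest.BodyPublishers.noBody())
                      .build();
              HttpResponse<Void> response =
                      client.send(request, HttpResponse.BodyHandlers.discarding());
              return response.statusCode() == 200;
          }

          public static void main(String[] args) throws Exception {
              SisterSiteLinker linker = new SisterSiteLinker();
              List<String> myWikiWords = List.of("AgileSoftwareDevelopment", "ExtremeProgramming");
              String sister = "http://sister.example/wiki/";     // hypothetical sister site
              Map<String, String> crossLinks = new HashMap<>();
              for (String word : myWikiWords) {
                  if (linker.pageExists(sister, word)) {
                      crossLinks.put(word, sister + word);       // remember the sister link
                  }
              }
              System.out.println(crossLinks);
          }
      }

    A real system would also have to handle renames, deletions, and trust between sites, which is where the social questions come right back in.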

    In the world of blogging, we have the blogroll. Many bloggers put links to their favorite bloggers on their own blog, which serves as a way to say "I find these sites useful; perhaps you will, too." I've found many of the blogs I like to read by beginning at blogs by Brian Marick and Martin Fowler, and following the blogroll trail. This is an effective human implementation of the spreading of useful servers, and much of the blogging culture itself is predicated on the notion of sharing stories -- linking to an interesting story and then expanding on it.

    Ward's discussion of automating this process brought to mind the idea of "recommender systems", which examine a user's preferences, find a subset of the community whose collective preferences correlate well with the user's, and then use that correlation to recommend content that the user hasn't seen yet. (One of my colleagues, Ben Schafer, does research in this area.) Maybe a collection of wikis could do something similar? The algorithms are simple enough; the real issue seems to be tracking and recording user preferences in a meaningful way. Existing recommender systems generally require the user to do a lot of the work in setting up preferences. But I have heard Ralph Johnson tell about an undergraduate project he directed in which preferences were extracted from Mac users' iTunes playlists.
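
    Here is a minimal Java sketch of that kind of recommender, assuming we could reduce each user's wiki activity to numeric scores over pages. The users, pages, and scores are invented, and cosine similarity is just one common choice of correlation measure.

      import java.util.*;

      // A toy user-based recommender: find the most similar user by cosine
      // similarity over item scores, then suggest items I haven't rated.
      // All names and scores are invented for illustration.
      public class ToyRecommender {

          static double cosine(Map<String, Double> a, Map<String, Double> b) {
              double dot = 0, normA = 0, normB = 0;
              for (Map.Entry<String, Double> e : a.entrySet()) {
                  dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
                  normA += e.getValue() * e.getValue();
              }
              for (double v : b.values()) normB += v * v;
              return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
          }

          static List<String> recommend(Map<String, Double> me,
                                        Map<String, Map<String, Double>> community) {
              Map<String, Double> bestNeighbor = Map.of();
              double bestScore = -1;
              for (Map<String, Double> other : community.values()) {
                  double s = cosine(me, other);
                  if (s > bestScore) { bestScore = s; bestNeighbor = other; }
              }
              List<String> picks = new ArrayList<>();
              for (String item : bestNeighbor.keySet()) {
                  if (!me.containsKey(item)) picks.add(item);   // unseen by me
              }
              return picks;
          }

          public static void main(String[] args) {
              Map<String, Double> me = Map.of("WikiPage:Patterns", 5.0, "WikiPage:Refactoring", 4.0);
              Map<String, Map<String, Double>> community = Map.of(
                  "alice", Map.of("WikiPage:Patterns", 5.0, "WikiPage:Refactoring", 5.0, "WikiPage:Smalltalk", 4.0),
                  "bob",   Map.of("WikiPage:Sports", 5.0));
              System.out.println(recommend(me, community));     // prints: [WikiPage:Smalltalk]
          }
      }

    The hard part, as noted above, is getting meaningful preference data in the first place.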

    I must admit that I was a bit worried when I first heard Ward talk about having a wiki sift through its content and activity to determine who the most valuable contributors are. Who needs even narrower bandwidth for many potential posters to add useful content? But then I thought about all the studies of how power laws accurately model idea exchange in the blogosphere, and I realized that programmed heuristics might actually increase the level of democratization rather than diminish it. Even old AI guys like me sometimes react with a kneejerk when a new application of technology enters our frame of reference.

    The Saturday sessions at PLoP created an avalanche of thoughts in my mind. I don't have enough time to think them through or act on them right now, but I'll keep at it. Did I say before how much I like PLoP?


    Posted by Eugene Wallingford | Permalink | Categories: Computing, General

    September 07, 2004 5:24 PM

    Aptitude as Controlling Factor in Learning to Program

    I'm busily preparing to leave town for PLoP, and I haven't had time to write lately. I also haven't had time to clarify a bunch of loose ideas in my head on the topics of aptitude for computing and the implications for introductory CS courses. So I'll write down my jumble as is.

    A couple of months ago, I read E.L. Doctorow's memoir, Reporting The Universe. As I've noted before, I like to read writers, especially accomplished ones, writing about writing, especially how and why they write. One of Doctorow's comments stuck in my mind. He claimed that the aptitude for math and physics (and presumably computing) is rare, but all people are within reach of writing good narrative. For some reason, I didn't want to believe that. Not the part about writing, because I do believe that most folks can learn to write well. I didn't want to believe that most folks are predisposed away from doing computing. I know that it's technical, and that it seems difficult to many. But to me, it's about communicating, too, and I've always held out hope that most folks can learn to write programs, too.

    Then on a mailing list last weekend, this topic came up in the form of what intro CS courses should be like.

    • One person quoted Donald Knuth from his Selected Papers on Computer Science as saying that people are drawn to disciplines where they find people who think as they do, and that only 1-2% of all people have 'brains wired for algorithmic thinking'. The poster said that it is inevitable and natural that many folks will take and be turned off by the geeky things that excite computer scientists. He wants to create a first course that gives students a glimpse of what CS is really like, so that they can decide early whether they want to study the discipline more deeply.

    • Another person turned this issue of geekiness inside out and said that all courses should be designed to help students distinguish beauty from ugliness in the discipline.

    • A third person lamented that attempts to perfect CS 1 as we teach it now may lead to a Pyrrhic victory in which we do a really good job appealing to a vanishingly small audience. He argued that we should go to a higher level of abstraction, re-inventing early CS courses so that economics, political science, and biology majors can learn and appreciate the beauty and the relevance of CS.

    These guys are among the best, if not the best, intro CS teachers in the country, and they make their claims from deep understanding and broad experience. The ideas aren't mutually exclusive, of course, as they talk some about what our courses should be like and about the aptitude that people may have for the discipline. I'm still trying to tease the two apart in my mind. But the aptitude issue is stickier for me right now.

    I'm reminded of something Ralph Johnson said at a PLoP a few years ago. He offered piano teachers as motivation. Perhaps not everyone has the aptitude to be a concert pianist, but piano teachers have several pedagogies that enable them to teach nearly anyone to play piano serviceably. Perhaps we can aim for the same: not all of our students will become masters, but all can become competent programmers, if they show some interest and put in the work required to get better.

    Perhaps aptitude is more of a controlling factor than I have thought. Certainly, I know of more scientists and mathematicians who enjoy writing (and write well) than humanities folks who enjoy doing calculus or computer programming on the side. But I can't help but think that interest can trump aptitude for all but a few, so long as "mere" competence is the goal.


    Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

    August 20, 2004 2:05 PM

    You Don't Want to Program?

    Yesterday I blogged about programming. Today, I advised several newly-declared computer science majors. Not a single one of them wants to learn to program. Most want to take our new Network and System Administration major. Others are interested in bioinformatics or other applied areas.

    Some of these incoming freshmen know that they have to do a certain amount of programming in order to reach their goals, and they're okay with that. They'll humor us along the way. But others seemed genuinely concerned that we expect them to learn how to program before teaching them how to set up Windows networks and troubleshoot systems -- so much so that they immediately began to explore their alternatives at the university.

    I read occasionally about problems with science and mathematics education in the United States, but I wonder if any discipline is quite like computer science. We go through cycles of having an incredibly popular major followed by having a major that few students want to take. But at no time do very many of our students come with the slightest inkling of what computer science is or the role that programming plays in it. We seem to start with a student body that has no idea what they are in for. Some students must feel sideswiped when the truth hits them.

    This state of affairs helps to explain why intro CS courses have rather high drop rates compared to similar courses in other departments. The university should consider this when comparing numbers.

    What can we do about this? The move toward "breadth-first" curricula a decade ago aimed to address this problem, and I think such an approach has some benefits. (It also has some drawbacks.) But it would perhaps be better if we could address the problem earlier, if we could somehow expose high school students to a more accurate view of computing beyond the applications they see and use in so many contexts. Computing is a fundamental component of the modern world, yet it is still largely a mystery to the public at large.

    This fall, I hope to teach a six-week unit on computing concepts at my daughters' school, to sixth graders. Maybe I can get a feeling for what is possible then. But, here in the trough of our enrollment cycle, encounters like the one I had this morning spook me.


    Posted by Eugene Wallingford | Permalink | Categories: Computing

    July 31, 2004 4:12 PM

    Index-Card Computing

    This story is a wonderful illustration of the idea of associative memory:

    I remember my grandad, who worked in the police, telling me how they used to keep details of criminals on filing cards. On the edge of each card there were a number of punched holes, some punched right the way to the edge. Each punched hole corresponded to some fact about the person -- whether they were a burglar or a mugger or things like that. So, if you wanted to pick out all the burglars, you'd take a metal skewer and slide it in through the appropriate hole and lift it up. All the cards whose hole had been punched right to the edge (non-burglars) would stay behind, whilst all the relevant ones would get lifted up.

    I found this story in an article on DNA-based computing at Andrew Birkett's blog. If you've ever wondered about how biocomputing can work, read this article! It's really neat -- yet another example of how Mother Nature can do intractable search using massive replication and parallelism.
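
    For fun, here is the filing-card scheme rendered as a few lines of Java: each category is a "hole" (a bit), and the skewer is a bitwise test run over every card. The names and categories are invented for the example.

      import java.util.*;

      // Edge-notched cards as bit sets: each category is one "hole" (bit).
      // Running the skewer through the BURGLAR hole selects, in one pass,
      // every card with that bit set. The data is invented for the example.
      public class PunchedCards {
          static final int BURGLAR = 1 << 0;
          static final int MUGGER  = 1 << 1;
          static final int FORGER  = 1 << 2;

          record Card(String name, int holes) { }

          static List<String> skewer(List<Card> cards, int hole) {
              List<String> lifted = new ArrayList<>();
              for (Card card : cards) {
                  if ((card.holes() & hole) != 0) lifted.add(card.name());
              }
              return lifted;
          }

          public static void main(String[] args) {
              List<Card> drawer = List.of(
                  new Card("Bill Sikes", BURGLAR),
                  new Card("Fagin", BURGLAR | FORGER),
                  new Card("Mr. Hyde", MUGGER));
              System.out.println(skewer(drawer, BURGLAR));   // prints: [Bill Sikes, Fagin]
          }
      }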


    Posted by Eugene Wallingford | Permalink | Categories: Computing