October 17, 2014 3:05 PM

Assorted Quotes

... on how the world evolves.

On the evolution of education in the Age of the Web. Tyler Cowen, in Average Is Over, via The Atlantic:

It will become increasingly apparent how much of current education is driven by human weakness, namely the inability of most students to simply sit down and try to learn something on their own.

I'm curious whether we'll ever see a significant change in the number of students who can and do take the reins for themselves.

On the evolution of the Web. Jon Udell, in A Web of Agreements and Disagreements:

The web works as well as it does because we mostly agree on a set of tools and practices. But it evolves when we disagree, try different approaches, and test them against one another in a marketplace of ideas. Citizens of a web-literate planet should appreciate both the agreements and the disagreements.

Some disagreements are easier to appreciate after they fade into history.

On the evolution of software. Nat Pryce on the Twitter, via The Problematic Culture of "Worse is Better":

Eventually a software project becomes a small amount of useful logic hidden among code that copies data between incompatible JSON libraries

Not all citizens of a web-literate planet appreciate disagreements between JSON libraries. Or Ruby gems.

On the evolution of start-ups. Rands, in The Old Guard:

... when [the Old Guard] say, "It feels off..." what they are poorly articulating is, "This process that you're building does not support one (or more) of the key values of the company."

I suspect the presence of incompatible JSON libraries means that our software no longer supports the key values of our company.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Managing and Leading, Software Development, Teaching and Learning

October 16, 2014 3:54 PM

For Programmers, There Is No "Normal Person" Feeling

I see this in the lab every week. One minute, my students sit peering at their monitors, their heads buried in their hands. They can't do anything right. The next minute, I hear shouts of exultation and turn to see them, arms thrust in the air, celebrating their latest victory over the Gods of Programming. Moments later I look up and see their heads again in their hands. They are despondent. "When will this madness end?"

Last week, I ran across a tweet from Christina Cacioppo that expresses nicely a feeling that has been vexing so many of my intro CS students this semester:

I still find programming odd, in part, because I'm either amazed by how brilliant or how idiotic I am. There's no normal-person feeling.

Christina is no beginner, and neither am I. Yet we know this feeling well. Most programmers do, because it's a natural part of tackling problems that challenge us. If we didn't bounce between puzzlement and exultation, we wouldn't be tackling hard-enough problems.

What seems strange to my students, and even to programmers with years of experience, is that there doesn't seem to be a middle ground. It's up or down. The only time we feel like normal people is when we aren't programming at all. (Even then, I don't have many normal-person feelings, but that's probably just me.)

I've always been comfortable with this bipolarity, which is part of why I have always felt comfortable as a programmer. I don't know how much of this comfort is natural inclination -- a personality trait -- and how much of it is learned attitude. I am sure it's a mixture of both. I've always liked solving puzzles, which inspired me to struggle with them, which helped me get better at struggling with them.

Part of the job in teaching beginners to program is to convince them that this is a habit they can learn. Whatever their natural inclination, persistence and practice will help them develop the stamina they need to stick with hard problems and the emotional balance they need to handle the oscillations between exultation and despondency.

I try to help my students see that persistence and practice are the answer to most questions involving missing skills or bad habits. A big part of helping them do this is coaching and cheerleading, not teaching programming language syntax and computational concepts. Coaching and cheerleading are not always tasks that come naturally to computer science PhDs, who are often most comfortable with syntax and abstractions. As a result, many CS profs are uncomfortable performing them, even when that's what our students need most. How do we get better at performing them? Persistence and practice.

The "no normal-person feeling" feature of programming is an instance of a more general feature of doing science. Martin Schwartz, a microbiologist at the University of Virginia, wrote a marvelous one-page article called The importance of stupidity in scientific research that discusses this element of being a scientist. Here's a representative sentence:

One of the beautiful things about science is that it allows us to bumble along, getting it wrong time after time, and feel perfectly fine as long as we learn something each time.

Scientists get used to this feeling. My students can, too. I already see the resilience growing in many of them. After the moment of exultation passes following their latest conquest, they dive into the next task. I see a gleam in their eyes as they realize they have no idea what to do. It's time to bury their heads in their hands and think.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

October 06, 2014 4:02 PM

A New Programming Language Can Inspire Us

In A Fresh Look at Rust, Armin Ronacher tells us some of what inspires him about Rust:

For me programming in Rust is pure joy. Yes I still don't agree with everything the language currently forces me to do but I can't say I have enjoyed programming that much in a long time. It gives me new ideas how to solve problems and I can't wait for the language to get stable.

Rust is inspiring for many reasons. The biggest reason I like it is because it's practical. I tried Haskell, I tried Erlang and neither of those languages spoke "I am a practical language" to me. I know there are many programmers that adore them, but they are not for me. Even if I could love those languages, other programmers would never do and that takes a lot of enjoyment away.

I enjoy reading personal blog entries from people excited by a new language, or newly excited by a language they are visiting again after a while away. I've only read Rust code, not written it, but I know just how Ronacher feels. These two paragraphs touch on several truths about how languages excite us:

  • Programmers are often most inspired when a language shows them new ideas how to solve problems.
  • Even if we love a language, we won't necessarily love every feature of the language.
  • What inspires us is personal. Other people can be inspired by languages that do not excite us.
  • Community matters.

Many programmers make a point of learning a new language periodically. When we do, we are often most struck by a language that teaches us new ways to think about problems and how to solve them. These are usually the languages that have the most to teach us at the moment.

As Kevin Kelly says, progress sometimes demands that we let go of problems. We occasionally have to seek new problems, in order to be excited by new ways to answer them.

All of this is very context-specific, of course. How wonderful it is to live in a time with so many languages available to learn from. Let them all flourish, I say.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 02, 2014 3:46 PM

Skills We Can Learn

In a thread on motivating students on the SIGCSE mailing list, a longtime CS prof and textbook author wrote:

Over the years, I have come to believe that those of us who can become successful programmers have different internal wiring than most in the population. We know you need problem solving, mathematical, and intellectual skills but beyond that you need to be persistent, diligent, patient, and willing to deal with failure and learn from it.

These are necessary skills, indeed. Many of our students come to us without these skills and struggle to learn how to think like a computer scientist. And without persistence, diligence, patience, and a willingness to deal with failure and learn from it, anyone will likely have a difficult time learning to program.

Over time, it's natural to begin to think that these attributes are prerequisites -- things a person must have before he or she can learn to write programs. But I think that's wrong.

As someone else pointed out in the thread, too many people believe that to succeed in certain disciplines, one must be gifted, possessing an inherent talent for doing that kind of thing. Science, math, and computer science fit firmly in that set of disciplines for most people. Carol Dweck has shown that a "fixed" mindset of this sort prevents many people from sticking with these disciplines when they hit challenges, or even from trying to learn them in the first place.

The attitude expressed in the quote above is counterproductive for teachers, whose job it is to help students learn things even when the students don't think they can.

When I talk to my students, I acknowledge that, to succeed in CS, you need to be persistent, diligent, patient, and willing to deal with failure and learn from it. But I approach these attributes from a growth mindset:

Persistence, diligence, patience, and willingness to learn from failure are habits anyone can develop with practice. Students can develop these habits regardless of their natural gifts or their previous education.

Aristotle said that excellence is not an act, but a habit. So are most of the attributes we need to succeed in CS. They are habits, not traits we are born with or actions we take.

Donald Knuth once said that only about 2 per cent of the population "resonates" with programming the way he does. That may be true. But even if most of us will never be part of Knuth's 2%, we can all develop the habits we need to program at a basic level. And a lot more than 2% are capable of building successful careers in the discipline.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

September 25, 2014 4:18 PM

Producers and Consumers

The more you produce and the more needs you meet, the more freedom you earn.

As Seth Godin says, it's fun to be (only) a consumer, but in the long run, smart producers win. Knowing how to produce solutions for yourself and others is the first step to freedom. Actually making things is the second.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

September 24, 2014 3:54 PM

Is It Really That Hard?

This morning, I tweeted:

Pretty sure I could build a git-based curriculum management system in two weeks that would be miles better than anything on the market now.

Yes, I know that it is easy to have ideas, and that carrying an idea through to a product is the real challenge. At least I don't just need a programmer...

My tweet was the result of temporary madness provoked by yet another round of listening to non-CS colleagues talk about one of the pieces of software we use on campus. It is a commercial product purchased for one task only, to help us manage the cycle of updating the university catalog. Alas, in its current state, it can handle only one catalog at a time. This is, of course, inconvenient. There are always at least two catalogs: the one in effect at this moment, and the one in the process of being updated. That doesn't even take into account all of the old catalogs still in effect for the students who entered the university when each was The Catalog.

Yes, we need version control. Either the current software does not provide it, or that feature is turned off.

The madness arises because of the deep internal conflict that occurs within me when I'm drawn into such conversations. Everyone assumes that programs "can't do this", or that the programmers who wrote our product were mean or incompetent. I could try to convince them otherwise by explaining the idea of version control. But their experience with commercial software is so uniformly bad that they have a hard time imagining I'm telling the truth. Either I misunderstand the problem, or I am telling them a white lie.

The alternative is to shake my head, agree with them implicitly, and keep thinking about how to teach my intro students how to design simple programs.

I'm convinced that a suitable web front-end to a git back end could do 98% of what we need, which is about 53% more than either of our last two commercial solutions has done for us.
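Just to make the claim concrete, here is a minimal sketch of such a back end in Ruby, shelling out to git. Everything here is hypothetical -- the CatalogStore name, the one-branch-per-catalog scheme -- but it shows how little machinery the core idea requires:

    require "fileutils"

    # A hypothetical git-backed catalog store. Each catalog lives on its
    # own branch, so several catalogs can be "in effect" at once.
    class CatalogStore
      def initialize(dir)
        @dir = dir
        FileUtils.mkdir_p(dir)
        git "init" unless File.exist?(File.join(dir, ".git"))
      end

      # Start next year's catalog as a branch off the current one.
      def begin_revision(year)
        git "checkout -b catalog-#{year}"
      end

      # Record one round of edits to the catalog being updated.
      def save(file, text, message)
        File.write(File.join(@dir, file), text)
        git "add #{file}"
        git "commit -m \"#{message}\""
      end

      private

      # Run a git command against the store's repository.
      def git(args)
        system("git -C #{@dir} #{args}") or raise "git #{args} failed"
      end
    end

The web front end would be the real work, of course. The point is that versions, branches, and history -- the very features everyone assumes are impossible -- come along for free.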

Maybe it's time for me to take a leave of absence, put together a small team of programmers, and do this. Yes, I would need a team. I know my limitations, and besides, working with a few friends would be a lot more fun. The current tools in this space leave a lot of room for improvement. Built well and marketed well, this product would make enough money from satisfaction-starved universities to reward everyone on the team well enough for all to retire comfortably.

Maybe not. But the idea is free for the taking. All I ask is that if you build it, give me a shout-out on your website. Oh, and cut my university a good deal when we buy your software to replace whatever product we are grumbling about when you reach market.


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development

September 19, 2014 3:34 PM

Ask Yourself, "What is the Pattern?"

I ran across this paragraph in an essay about things you really need to learn in college:

Indeed, you should view the study of mathematics, history, science, and mechanics as the study of archetypes, basic patterns that you will recognize over and over. But this means that, when you study these disciplines, you should be asking, "what is the pattern" (and not merely "what are the facts"). And asking this question will actually make these disciplines easier to learn.

Even in our intro course, I try to help students develop this habit. Rather than spending all of our time looking at syntax and a laundry list of language features, I am introducing them to some of the most basic code patterns, structures they will encounter repeatedly as they solve problems at this level. In week one came Input-Process-Output. Then after learning basic control structures, we encountered guarded actions, range tests, running totals, sentinel loops, and "loop and a half". We encounter these patterns in the process of solving problems.
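To show what I mean, here is one small composition of those patterns, a sentinel loop wrapped around a running total. I've rendered it in Ruby for brevity; the shape is the same in whatever language an intro course uses:

    # A sentinel loop with a running total: read numbers until the
    # sentinel appears, accumulating a sum along the way.
    SENTINEL = -1

    total = 0                    # running total, initialized before the loop
    number = gets.to_i           # priming read
    while number != SENTINEL     # sentinel test
      total += number            # update the running total
      number = gets.to_i         # read the next value
    end
    puts "Total: #{total}"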

While they are quite low-level, they are not merely idioms. They are patterns every bit as much as patterns at the level of the Gang of Four or PoSA. They solve common problems, recur in many forms, and are subject to trade-offs that depend on the specific problem instance.

They compose nicely to create larger programs. One of my goals for next week is to have students solve new problems that allow them to assemble programs from ideas they have already seen. No new syntax or language features, just new problems.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

July 17, 2014 10:00 AM

A New Commandment

... I give unto you:

Our first reaction to any comrade, any other person passionate about and interested in building things with computers, any human crazy and masochistic enough to try to expand the capabilities of these absurd machines, should be empathy and love.

Courtesy of Britt Butler.

I hope to impart such empathy and love to my intro students this fall. Love to program, and be part of a community that loves and learns together.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

July 03, 2014 2:13 PM

Agile Moments: Conspicuous Progress and Partial Value

Dorian Taylor, in Toward a Theory of Design as Computation:

You can scarcely compress the time it takes to do good design. The best you can do is arrange the process so that progress is conspicuous and the partially-completed result has its own intrinsic value.

Taylor's piece is about an idea much bigger than simply software methodology, but this passage leapt off the page at me. It seems to embody two of the highest goals of the various agile approaches to making software: progress that is conspicuous and partial results that have intrinsic value to the user.

If you like ambitious attempts to create a philosophy of design, check out the whole essay. Taylor connects several disparate sources:

  • Edwin Hutchins and Cognition in the Wild,
  • Donald Norman and Things That Make Us Smart, and
  • Douglas Hofstadter and Gödel, Escher, Bach
with the philosophy of Christopher Alexander, in particular Notes on the Synthesis of Form and The Nature of Order. Ambitious it is.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

July 02, 2014 4:31 PM

My Jacket Blurb for "Exercises in Programming Style"

On Monday, my copy of Crista Lopes's new book, Exercises in Programming Style, arrived. After I blogged about the book last year, Crista asked me to review some early chapters. After I did that, the publisher graciously offered me a courtesy copy. I'm glad it did! The book goes well beyond Crista's talk at StrangeLoop last fall, with thirty-three styles grouped loosely into nine categories. Each chapter includes historical notes and a reading list for going deeper. Readers of this blog know that I often like to go deeper.

I haven't had a chance to study any of the chapters deeply yet, so I don't have a detailed review. For now, let me share the blurb I wrote for the back cover. It gives a sense of why I was so excited by the chapters I reviewed last summer and by Crista's talk last fall:

It is difficult to appreciate a programming style until you see it in action. Cristina's book does something amazing: it shows us dozens of styles in action on the same program. The program itself is simple. The result, though, is a deeper understanding of how thinking differently about a problem gives rise to very different programs. This book not only introduced me to several new styles of thinking; it also taught me something new about the styles I already know well and use every day.

The best way to appreciate a style is to use it yourself. I think Crista's book opens the door for many programmers to do just that with many styles most of us don't use very often.

As for the blurb itself: it sounds a little stilted as I read it now, but I stand by the sentiment. It is very cool to see my blurb and name alongside blurbs from James Noble and Grady Booch, two people whose work I respect so much. Very cool. Leave it to James to sum up his thoughts in a sentence!

While you are waiting for your copy of Crista's book to arrive, check out her recent blog entry on the evolution of CS papers in publication over the last 50+ years. It presents a lot of great information, with some nice images of pages from a few classics. It's worth a read.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 20, 2014 1:27 PM

Programming Everywhere, Business Edition

Q: What do you call a company that has staff members with "programmer" or "software developer" in their titles?

A: A company.

Back in 2012, Alex Payne wrote What Is and Is Not A Technology Company to address a variety of issues related to the confounding of companies that sell technology with companies that merely use technology to sell something else. Even then, developing technology in house was a potential source of competitive advantage for many businesses, whether that involved modifying existing software or writing new.

The competitive value in being able to adapt and create software has only grown larger and more significant over the last two years. Not having someone on staff with "programmer" in the title is almost a red flag even for non-tech companies these days.

Those programmers aren't likely to have been CS majors in college, though. We don't produce enough of them. So we need to find a way to convince more non-majors to learn a little programming.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 12, 2014 2:29 PM

Points of Emphasis for Teaching Design

As I mentioned recently, design skills were a limiting factor for some of the students in my May term course on agile software development. I saw similar issues for many in my spring Algorithms course as well. Implementing an algorithm from lecture or reading was straightforward enough, but organizing the code of the larger system in which the algorithm resided often created challenges for students.

I've been thinking about ways to improve how I teach design in the future, both in courses where design is a focus and in courses where it lives in the background of other material. Anything I come up with can also become part of conversations with colleagues as we talk about design in their courses.

I read Kent Beck's initial Responsive Design article when it first came out a few years ago and blogged about it then, because it had so many useful ideas for me and my students. I decided to re-read the article again last week, looking for a booster shot of inspiration.

First off, it was nice to remember how many of the techniques and ideas that Kent mentions already play a big role in my courses. Ones that stood out on this reading included:

  • taking safe steps,
  • isolating changes within modules,
  • recognizing that design is a team sport, fundamentally a social activity, and
  • playing with words and pictures.

My recent experiences in the classroom made two other items in Kent's list stand out as things I'll probably emphasize more, or at least differently, in upcoming courses.

Exploit Symmetries. Divide similar elements into identical parts and different parts.

As I noted in my first blog about this article, many programmers find it counterintuitive to use duplication as a tool in design. My students struggle with this, too. Soon after that blog entry, I described an example of increasing duplication in order to eliminate duplication in a course. A few years later, in a fit of deja vu, I wrote about another example, in which code duplication is a hint to think differently about a problem.

I am going to look for more opportunities to help students see ways in which they can make design better by isolating code into the identical and the different.
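Here is the kind of small example I have in mind, sketched in Ruby with invented names. Two methods are similar; the move is to divide them into an identical part, the traversal and sum, and a different part, the multiplier, which becomes an explicit parameter:

    # An invented domain object, just for the sketch.
    Item = Struct.new(:price, :quantity)

    # Before: two methods that are alike and different in tangled ways.
    def subtotal(items)
      items.sum { |item| item.price * item.quantity }
    end

    def discounted_subtotal(items)
      items.sum { |item| item.price * item.quantity * 0.9 }
    end

    # After: the identical part is isolated in one method, and the
    # difference is named and passed in explicitly.
    def subtotal(items, multiplier = 1.0)
      items.sum { |item| item.price * item.quantity * multiplier }
    end

Now subtotal(items) computes the full price and subtotal(items, 0.9) the discounted one, and the shared traversal lives in exactly one place.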

Inside or Outside. Change the interface or the implementation but not both at the same time.

This is one of the fundamental tenets of design, something students should learn as early as possible. I was surprised to see how normal it was for students in my agile development course not to follow this pattern, even when it quickly got them into trouble. When you try to refactor interface and implementation at the same time, things usually don't go well. That's not a safe step to take...

My students and I discussed writing unit tests before writing code a lot during the course. Only afterward did it occur to me that Inside or Outside is the basic element of test-first programming and TDD. First, we write the test; this is where we design the interface of our system. Then, we write code to pass the test; this is where we implement the system.
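A minimal Ruby sketch of that rhythm, with a made-up FifoQueue class. The test fixes the outside first; only then does the inside get written:

    require "minitest/autorun"

    # Outside: the test pins down the interface we wish we had.
    class TestFifoQueue < Minitest::Test
      def test_items_come_out_in_arrival_order
        q = FifoQueue.new
        q.enqueue(:a)
        q.enqueue(:b)
        assert_equal :a, q.dequeue
      end
    end

    # Inside: an implementation, free to change as long as the
    # interface above keeps passing.
    class FifoQueue
      def initialize
        @items = []
      end

      def enqueue(item)
        @items.push(item)
      end

      def dequeue
        @items.shift
      end
    end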

Again, in upcoming courses, I am going to look for opportunities to help students think more effectively about the distinction between the inside and the outside of their code.

Thus, I have a couple of ideas for the future. Hurray! Even so, I'm not sure how I feel about my blog entry of four years ago. I had the good sense to read Kent's article back then, draw some good ideas from it, and write a blog entry about them. That's good. But here I am four years later, and I still feel like I need to make the same sort of improvements to how I teach.

In the end, I am glad I wrote that blog entry four years ago. Reading it now reminds me of thoughts I forgot long ago, and reminds me to aim higher. My opening reference to getting a booster shot seems like a useful analogy for talking about this situation in my teaching.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

June 07, 2014 10:17 AM

Pascal, Forgiveness, and CS1

Last time, I thought about the role of forgiveness in selecting programming languages for instruction. I mentioned that BASIC had worked well for me as a first programming language, as it had worked for so many others. Yet I would probably never choose it as a language for CS1, at least for more than a few weeks of instruction. It is missing a lot of the features that we want CS majors to learn about early. It's also a bit too free.

In that post, I did say that I still consider Pascal a good standard for first languages. It dominated CS1 for a couple of decades. What made it work so well as a first instructional language?

Pascal struck a nice balance for its time. It was small enough that students could master it all, and it also provided constructs for structured programming. It had the sort of syntax that enabled a compiler to provide students guidance about errors, but its compilers did not seem overbearing. It had a few "gotchas", such as the ; as a statement separator, but not so many that students were constantly perplexed. (Hey to C++.) Students were able to try things out and get programs to work without becoming demoralized by a seemingly endless stream of complaints.

(Aside: I have to admit that I liked Pascal's ; statement separator. I understood it conceptually and, in a strange way, appreciated it aesthetically. Most others seem to have disagreed with me...)

Python has attracted a lot of interest as a CS1 language in recent years. It's the first popular language in a long while that brings to mind Pascal's feel for me. However, Pascal had two things that supported the teaching of CS majors that Python does not: manifest types and pointers. I love dynamically-typed languages with managed memory and prefer them for my own work, but using that sort of language in CS1 creates some new challenges when preparing students for upper-division majors courses.

So, Pascal holds a special place for me as a CS1 language, though it was not the language I learned there. We used it to teach CS1 for many years and it served me and our students well. I think it balances a good level of forgiveness with a reasonable level of structure, all in a relatively small package.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

June 06, 2014 4:24 PM

Programming Languages and the Right Level of Forgiveness

In the last session of my May term course on agile software development, discussion eventually turned to tools and programming languages. We talked about whether some languages are more suited to agile development than others, and whether some languages are better tools for a given developer team at a given time. Students being students, we also discussed the languages used in CS courses, including the intro course.

Having recently thought some about choosing the right languages for early CS instruction, I was interested to hear what students thought. Haskell and Scala came up; they are the current pet languages of students in the course. So did Python, Java, and Ada, which are languages our students have seen in their first-year courses. I was the old guy in the room, so I mentioned Pascal, which I still consider a good standard for comparing CS1 languages, and classic Basic, which so many programmers of my generation and earlier learned as their first exposure to the magic of making computers do our bidding.

Somewhere in the conversation, an interesting idea came up regarding the first language that people learn: good first languages provide the right amount of forgiveness when programmers make mistakes.

A language that is too forgiving will allow the learner to be sloppy and fall into bad habits.

A language that is not forgiving enough can leave students dispirited under a barrage of "not good enough": type errors and syntax gotchas at every turn.

What we mean by 'forgiving' is hard to define. For this and other reasons, not everyone agrees with this claim.

Even when people agree in principle with this idea, they often have a hard time agreeing on where to draw the line between too forgiving and not forgiving enough. As with so many design decisions, the correct answer is likely a local maximum that balances the forces at play among the teachers, students, and desired applications involved.

I found Basic to be just right. It gave me freedom to play, to quickly make interesting programs run, and to learn from programs that didn't do what I expected. For many people's taste, though, Basic is too forgiving and leads to diseased minds. (Hey to Edsger Dijkstra.) Maybe I was fortunate to learn how to use GOSUBs early and well.

Haskell seems like a language that would be too unforgiving for most learners. Then again, neither my students nor I have experience with it as a first-year language, so maybe we are wrong. We could imagine ways in which learning it first would lead to useful habits of thought about types and problem decomposition. We are aware of schools that use Haskell in CS1; perhaps they have made it work for them. Still, it feels a little too judgmental...

In the end, you can't overlook context and the value of good tools. Maybe these things shift the line of "just right" forgiveness for different audiences. In any case, finding the right level seems to be a useful consideration in choosing a language.

I suspect this is true when choosing languages to work in professionally, too.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

May 30, 2014 4:09 PM

Programming is Social

The February 2014 issue of Math Horizons included A Conversation With Steven Strogatz, an interview conducted by Patrick Honner. The following passage came to mind this week:

PH: Math is collaborative?

SS: Yeah, math is social. ... The fact that math is social would come as a surprise to the people who think of it as antisocial.

PH: It might also come as a surprise to some math teachers!

SS: It's extremely social. Mathematicians constantly spend time talking to each other about places where they're stuck. They get insights from each other, new ways of looking at things. Sometimes it's just to commiserate.

Programming is social, too. Most people think it's not. With assistance from media portrayals of programmers and sloppy stereotypes of our own, they think most of us would prefer to work alone in the dark. Some do, of course, but even then most programmers I know like to talk shop with other programmers all the time. They like to talk about the places where they are stuck, as well as the places they used to be stuck. War stories are the currency of the programmer community.

I think a big chunk of the "programming solo" preference many programmers profess is learned habit. Most programming instruction and university course work encourages or requires students to work alone. What if we started students off with pair programming in their CS 1 course, and other courses nurtured that habit throughout the rest of their studies? Perhaps programmers would learn a different habit.

My agile software development students this semester are doing all of their project work via pair programming. Class time is full of discussion: about the problem they are solving, about the program they are evolving, and about the intricacies of Java. They've been learning something about all three, and a large part of that learning has been social.

They've only been trying out XP for a couple of weeks, so naturally the new style hasn't replaced their old habits. I see them fall out of pairing occasionally. One partner will switch off to another computer to look up the documentation for a Java class, and pretty soon both partners are quietly looking at their own screens. Out of deference to me or the course, though, they return after a couple of minutes and resume their conversation. (I'm trying to be a gentle coach, not a ruthless tyrant, when it comes to the practices.)

I suspect a couple members of the class would prefer to program on their own, even after noticing the benefits of pairing. Others really enjoy pair programming but may well fall back into solo programming after the class ends. Old habits die hard, if at all. That's too bad, because most of us are better programmers when pairing.

But even if they do choose, or fall back into, old habits, I'm sure that programming will remain a social activity for them, at some level. There are too many war stories to tell.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

May 23, 2014 12:27 PM

Words Matter, Even in Code

Jim Weirich on dealing with failure in Ruby, via Avdi Grimm's blog:

(An aside, because I use exceptions to indicate failures, I almost always use the "fail" keyword rather than the "raise" keyword in Ruby. Fail and raise are synonyms so there is no difference except that "fail" more clearly communicates that the method has failed. The only time I use "raise" is when I am catching an exception and re-raising it, because here I'm *not* failing, but explicitly and purposefully raising an exception. This is a stylistic issue I follow, but I doubt many other people do).

Words matter: the right words, used at the right times. Weirich always cared about words, and it showed both in his code and in his teaching and writing.
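In code, the distinction looks something like this. The sketch is mine, not Weirich's, and the method names are invented:

    # This method cannot do its job, so it fails.
    def parse_record(line)
      fail ArgumentError, "blank line" if line.strip.empty?
      line.split(",")
    end

    # This method is not failing; it re-raises, adding context.
    def parse_file(lines)
      lines.each_with_index.map do |line, i|
        begin
          parse_record(line)
        rescue ArgumentError => e
          raise e, "#{e.message} (at line #{i + 1})"
        end
      end
    end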

The students in my agile class got to see my obsession with word choice and phrasing in class yesterday, when we worked through the story cards they had written for their project. I asked questions about many of their stories, trying to help them express what they intended as clearly as possible. Occasionally, I asked, "How will you write the test for this?" In their proposed test we found what they really meant and were able to rephrase the story.

Writing stories is hard, even for experienced programmers. My students are doing this for the first time, and they seemed to appreciate the need to spend time thinking about their stories and looking for ways to make them better. Of course, we've already discussed the importance of good names, and they've already experienced that way in which words matter in their own code.

Whenever I hear someone say that oral and written communication skills aren't all that important for becoming a good programmer, I try to help them see that they are, and why. Almost always, I find that they are not programmers and are just assuming that we techies spend all our time living inside mathematical computer languages. If they had ever written much software, they'd already know.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

May 19, 2014 4:09 PM

Becoming More Agile in Class, No. 2

After spending a couple of days becoming familiar with pair programming and unit tests, we moved on to the next step on Day 4: refactoring. I had the students study the "before" code base from Martin Fowler's book, Refactoring, to identify several ways they thought we could improve it. Then they worked in pairs to implement their ideas. The code itself is pretty simple -- a small part of the information system for a movie rental store -- which let the students focus on practice with tools, running tests, and keeping the code base "green".

We all know Fowler's canonical definition of refactoring:

Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure.

... but it's easy to forget that refactoring really is about design. Programmers with limited experience in Java or OOP can bring only so much to the conversation about improving an OO program written in Java. We can refactor confidently and well only if we have a target in mind, one we understand and can envision in our code. Further, creating a good software design requires taste, and taste generally comes from experience.

I noticed this lack of experience manifesting itself in the way my students tried to decompose the work of a refactoring into small, safe steps. When we struggle with decomposing a refactoring, we naturally struggle with choosing the next step to work on. Kent Beck calls this the challenge of succession. Ordering the steps of a refactoring is a more subtle challenge than many programmers realize at first.
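For what it's worth, here is the scale of step I try to model for them, sketched in Ruby rather than the Java we used in class, and not Fowler's actual code. One Extract Method, nothing else, with the tests run before and after:

    # An invented stand-in for the domain, just for the sketch.
    Rental = Struct.new(:title, :charge)

    # Before: formatting knowledge buried inside a larger method.
    def statement(rentals)
      lines = rentals.map { |r| "\t#{r.title}\t#{r.charge}\n" }
      "Rental Record\n" + lines.join
    end

    # After one small step: the formatting expression is extracted and
    # named. Behavior is unchanged, so the tests stay green.
    def statement(rentals)
      "Rental Record\n" + rentals.map { |r| line_for(r) }.join
    end

    def line_for(rental)
      "\t#{rental.title}\t#{rental.charge}\n"
    end

Choosing a step this small is the point: each one is easy to order, easy to verify, and easy to undo.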

This session reminded me why I like to teach design and refactoring in parallel: students come to appreciate new code smells and quickly learn how to refactor code into a better state. This way, programming skill grows alongside design skill.

On Day 5, we tried to put the skills from the three previous days all together, using an XP-style test-code-refactor-repeat cycle to implement a bit of code. Students worked on either the Checkout kata from Dave Thomas or a tic-tac-toe game based on a write-up by Gojko Adzic. No, these are not the most exciting programs to build, but as I told the class, this makes it possible for them to focus on the XP practices and habits of mind -- small steps, unit tests, and refactoring -- without having to work too hard to grok the domain.

My initial impression as the students worked was that the exercise wasn't going as well as I had hoped: the step size was too big, the tests were too intrusive, and the refactoring was almost non-existent. Afterwards, though, I realized that programmers learning such foreign new habits must go through this phase. The best I can do is inject an occasional suggestion or question, hoping that it helps speed them along the curve.

This morning, I decided to have each student pair up with someone who had worked on the other task last time, flip a coin, and work on one of the same two tasks. This way, each pair had someone working on the same problem again and someone working on a new problem. I instructed them to start from scratch -- new code, new thoughts -- and to have the person new to the task write the first test.

The goal was to create an asymmetry within each pair. Working on the same piece again would be valuable for the partner doing so, in the way playing finger exercises or etudes is valuable for a musician. At the same time, the other partner would see a new problem, bringing fresh eyes and thoughts to the exercise. This approach seems like a good one, as it varies the experience for both members of the pair. I know how important varying the environment can be for student learning, but I sometimes forget to do that often enough in class.

The results seemed so much better today. Students commented that they made better progress this time around, not because one of them had worked on the same problem last time, but because they were feeling more comfortable with the XP practices. One student said something to the effect of:

Last time, we were trying to work on the simplest or smallest piece of code we could write. This time, we were trying to work on the smallest piece of functionality we could add to the program.

That's a solid insight from an undergrad, even one with a couple of years' programming experience.

I also liked the conversation I was hearing among the pairs. They asked each other, "Should we do this feature next, or this other?" and said, "I'm not sure how we can test this." -- and then talked it over before proceeding. One pair had a wider disparity in OO experience, so the more experienced programmer was thinking out loud as he drove, taking into account comments from his partner as he coded.

This is a good sign. I'm under no illusion that they have all suddenly mastered ordering features, writing unit tests, or refactoring. We'll hit bumps over the next three weeks. But they all seem to be pretty comfortable with working together and collaborating on code. That's an essential skill on an agile team.

Next up: the Planning Game for a project that we'll work on for the rest of the class. They chose their own system to build, a cool little Android game app. That will change the dynamic a bit for customer collaboration and story writing, but I think that the increased desire to see a finished product will increase their motivation to master the skills and practice. My job as part-time customer, part-time coach will require extra vigilance to keep them on track.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

May 14, 2014 4:52 PM

Becoming More Agile in Class

Days 2 and 3 of my Agile Software Development May term course are now in the books. This year, I decided to move as quickly as we could in the lab. Yesterday, the students did their first pair-programming session, working for a little over an hour on one of the industry-standard exercises, Conway's Game of Life. Today, they did their first pair programming with unit tests, using Bill Wake's Test-First Challenge to implement the beginnings of a simple data model for spreadsheets.

I always enjoy watching students write code and interacting with them while they do it. The thing that jumped out to me yesterday was just how much code some students write before they ever think about compiling it, let alone testing it. Another was how some students manage to get through a year of programming-heavy CS courses without mastering their basic tools: text editor, compiler, and language. It's hard to work confidently when your tools feel uncomfortable in your hands.

There's not much I can do to help students develop greater facility with their tools than give them lots of practice, and we will do that. However, writing too much code before testing even its syntactic correctness is a matter of mindset and habit. So I opened today's session with a brief discussion, and then showed them what I meant in the form of a short bit of code I wrote yesterday while watching them. Then I turned them loose with Wake's spreadsheet tests and encouragement to help each other write simple code, compile frequently even with short snippets, and run the tests as often as their code compiles.
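To give a flavor of where they start, the first tests in Wake's challenge say little more than "a cell holds what you put into it". Here is a rendering in Ruby -- the names are mine, and the pairs worked in their own languages:

    require "minitest/autorun"

    # The spirit of the first test: a cell holds what you put into it.
    class TestSheet < Minitest::Test
      def test_cell_stores_what_was_put_there
        sheet = Sheet.new
        sheet.put("A1", "42")
        assert_equal "42", sheet.get("A1")
      end
    end

    # Just enough code to pass: the simplest Sheet that could work.
    class Sheet
      def initialize
        @cells = {}
      end

      def put(ref, value)
        @cells[ref] = value
      end

      def get(ref)
        @cells.fetch(ref, "")
      end
    end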

Today, we had an odd number of students in class, something that's likely to be our standard condition this term, so I paired with one of the students on a spreadsheet. He wanted to work in Haskell, and I was game. I refreshed my Haskell memories a bit and even contributed some helpful bits of code, in addition to meta-contributions on XP style.

The student is relatively new to the language, so he's still developing the Haskell in his mind. There were times we struggled because we were thinking of the problem in a stateful way. As you surely know, that's not the best way to work in Haskell. Our solutions were not always elegant, but we did our best to get into the rhythm of writing tests, writing code, and running them.

As the period was coming to an end, our code had just passed a test that had been challenging us. Almost simultaneously, a student in another pair thrust his arms in the air as his pair's code passed a challenging test, too, much deeper in the suite. We all decided to declare victory and move on. We'll all get better with practice.

Next up: refactoring, and tools to support it and automated testing.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

May 12, 2014 5:01 PM

Teaching a Compressed Class

May term started today, so my agile course is off the ground. We will meet for 130 minutes every morning through June 6, excepting only Memorial Day. That's a lot of time spent together in a short period of time.

As I told the students today, each class is almost a week's worth of class in a regular semester. This means committing a fair amount of time out of class every day, on the order of 5-7 hours. There isn't a lot of time for our brains to percolate on the course content. We'll be moving steadily for four weeks.

This makes May term unsuitable, in my mind at least, for a number of courses. I would never teach CS 1 in May term. Students are brand new to the discipline, to programming, and usually to their first programming language. They need time for their brains to percolate. I don't think I'd want to teach upper-division CS courses in May term if they have a lot of content, either. Our brains don't absorb a lot of information well in a short amount of time, so letting it sink in more slowly, helped by practice and repetition, seems best.

My agile course is, on the other hand, almost custom made for a compressed semester. There isn't a lot of essential "content". The idea is straightforward. I don't expect students to memorize lists of practices, or the rules of tools. I expect them to do the practices. Doing them daily, in extended chunks of time, with immediate feedback, is much better than taking a day off between practice sessions.

Our goal is, in part, to learn new habits and then reflect on how well they fit, on where they might help us most and where they might get in the way. We'll have better success learning new habits in the compressed term than we would with breaks. And, as much as I want students to work daily during a fifteen-week semester to build habits, it usually just doesn't happen. Even when the students buy in and intend to work that way, life's other demands get in the way. Failing with good intentions is still failing, and sometimes feels worse than failing without them.

So we begin. Tomorrow we start working on our first practice, a new way of working with skills to be learned through repetition every day the rest of the semester. Wish us luck.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

May 09, 2014 4:11 PM

Transition

Spring semester ends today. May term begins Monday. I haven't taught during the summer since 2010, when I offered a course on agile software development. I'm reprising that course this month, with nine hardy souls signed on for the mission. That means no break for now, just a new start. I like those.

I'm sure I could blog for hours on the thoughts running through my head for the course. They go beyond the readings we did last time and the project we built, though all that is in the mix, too.

For now, though, three passages that made the highlights of my recent reading. All fit nicely with the theme of college days and transition.

~~~~

First, this reminder from John Graham, a "self-made merchant" circa 1900, in a letter to his son at college.

Adam invented all the different ways in which a young man can make a fool of himself, and the college yell at the end of them is just a frill that doesn't change essentials.

College is a place all its own, but it's just a place. In many ways, it's just the place where young people spend a few years while they are young.

~~~~

Next, a writer tells a story of studying with Annie Dillard in college. During their last session together, she told the class:

If I've done my job, you won't be happy with anything you write for the next 10 years. It's not because you won't be writing well, but because I've raised your standards for yourself.

Whatever we "content" teach our students, raising their standards and goals is sometimes the most important thing we do. "Don't compare yourselves to each other", she says. Compare yourselves to the best writers. "Shoot there." This advice works just as well for our students, whether they are becoming software developers or computer scientists. (Most of our students end up being a little bit of both.)

It's better to aim at the standard set by Ward Cunningham or Alan Kay than at the best we can imagine ourselves doing right now.

~~~~

Now that I think about it, this last one has nothing to do with college or transitions. But it made me laugh, and after a long academic year, with no break imminent, a good laugh is worth something.

What do you call a rigorous demonstration that a statement is true?
  1. If "proof", then you're a mathematician.
  2. If "experiment", then you're a physicist.
  3. If you have no word for this concept, then you're an economist.

This is the first of several items in The Mathematical Dialect Quiz at Math with Bad Drawings. It adds a couple of new twists to the tongue-in-cheek differences among mathematicians, computer scientists, and engineers. With bad drawings.

Back to work.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

May 07, 2014 3:39 PM

Thinking in Types, and Good Design

Several people have recommended Pat Brisbin's Thinking in Types for programmers with experience in dynamically-typed languages who are looking to grok Haskell-style typing. He wrote it after helping one of his colleagues get unstuck with a program that "seemed conceptually simple but resulted in a type error" in Haskell when implemented in a way similar to a solution in a language such as Python or Ruby.

This topic is of current interest to me at a somewhat higher level. Few of our undergrads have a chance to program in Haskell as a part of their coursework, though a good number of them learn Scala while working at a local financial tech company. However, about two-thirds of undergrads now start with one or two semesters of Python, and types are something of a mystery to them. This affects their learning of Java and colors how they think about types if they take my course on programming languages.

So I read this paper. I have two comments.

First, let me say that I agree with my friends and colleagues who are recommending this paper. It is a clear, concise, and well-written description of how to use Haskell's types to think about a problem. It uses examples that are concrete enough that even our undergrads could implement them with a little help. I may use this as a reading in my languages course next spring.

Second, I think this paper does more than simply teach people about types in a Haskell-like language. It also gives a great example of how thinking about types can help programmers create better designs for their programs, even if they are working in an object-oriented language! Further, it hits right at the heart of the problem we face these days, with students who are used to working in scripting languages that provide high-level but very generic data structures.

The problem that Brisbin addresses happens after he helps his buddy create a type class and two instances, and they reach this code:

    renderAll [ball, playerOne, playerTwo]

renderAll takes a list of values that are Render-able. Unfortunately, in this case, the arguments come from two different classes... and Haskell does not allow heterogeneous lists. We could try to work around this feature of Haskell and "make it fit", but as Brisbin points out, doing so would cause you to lose the advantages of using Haskell in the first place. The compiler wouldn't be able to find errors in the code.

The Haskell way to solve the problem is to replace the generic list of stuff we pass to renderAll with a new type. With a new Game type that composes a ball with two players, we are able to achieve several advantages at once:

  • create a polymorphic render method for Game that passes muster with the type checker
  • allow the type checker to ensure that this element of our program is correct
  • make the program easier to extend in a type-safe way
  • and, perhaps most importantly, express the intent of the program more clearly

It's this last win that jumped off the page for me. Creating a Game class would give us a better object-oriented design in his colleague's native language, too!
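To sketch that move in Ruby -- the names are invented, and I assume the ball and the players each know how to render themselves:

    # Before: a bare, heterogeneous list hides a concept.
    #     render_all([ball, player_one, player_two])

    # After: the concept gets a name and a home.
    class Game
      def initialize(ball, player_one, player_two)
        @ball = ball
        @players = [player_one, player_two]
      end

      # The game renders itself by delegating to its parts.
      def render
        ([@ball] + @players).each(&:render)
      end
    end

    # Game.new(ball, player_one, player_two).render

The list still exists, but now it is an implementation detail inside an object whose name tells us what the program is about.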

Students who become accustomed to programming in languages like Python and Ruby often grow attached to untyped lists, arrays, hashes, and tuples as their go-to collections. They are oh so handy, often the quickest route to a program that works on the small examples at hand. But those very handy data structures promote sloppy design, or at least enable it; they make it easy not to see very basic objects living in the code.

Who needs a Game class when a Python list or Ruby array works out of the box? I'll tell you: you do, as soon as you try to do almost anything else in your program. Otherwise, you begin working around the generality of the list or array, writing code to handle special cases that really aren't special cases at all. They are simply unbundled objects running wild in the program.

Good design is good design. Most of the features of a good design transcend any particular programming style or language.

So: This paper is a great read! You can use it to learn better how to think like a Haskell programmer. And you can use it to learn even if thinking like a Haskell programmer is not your goal. I'm going to use it, or something like it, to help my students become better OO programmers.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

April 23, 2014 3:29 PM

Simple Tests

As Corey Haines tells us, it really can be this simple:

def assert_equal(expected, actual, message)
  if expected != actual
    raise "Expected #{expected}, got #{actual}\n#{message}"
  end
end
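Using it is a one-liner. For example:

    assert_equal(5, 2 + 3, "integer addition")

If the expectation fails, the raise stops the run and prints the message; if it passes, silence. That is all a first test needs.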

Don't let the overhead of learning or using a test harness prevent you from starting. Write a test, then write some code. Or, if you prefer: Write some code, then write a test.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

April 21, 2014 2:45 PM

The Special Case Object Pattern in "Confident Ruby"

I haven't had a chance to pick up a copy of Avdi Grimm's new book, Confident Ruby, yet. I did buzz by the book's Pragmatic Programmers page, where I was able to pick up a sample chapter or two for elliptical reading.

The chapter "Represent special cases as objects" was my first look. This chapter and the "Represent do-nothing cases as null objects" chapter that follows deal with situations in which our program is missing a kind of object. The result is code that has too many responsibilities because there is no object charged with handling them.

The chapter on do-nothing cases is @avdi's re-telling of the Null Object pattern. Bobby Woolf workshopped his seminal write-up of this pattern at PLoP 1996 (the first patterns conference I attended) and later published an improved version in the fourth Pattern Languages of Program Design book. I had the great pleasure to correspond with Bobby as he wrote his original paper and to share a few ideas about the pattern.

@avdi's special cases chapter is a great addition to the literature. It shows several different ways in which our code can call out for a special case object in place of a null reference. It then shows how creating a new kind of object can make our code better in each case, giving concrete examples written in Ruby, in the context of processing input to a web app.

I was discussing the pattern and the chapter with a student, who asked a question about this example:

    if current_user
      render_logout_button
    else
      render_login_button
    end

This is the only example in which the if check is not eliminated after introducing the special case object, an instance of the new class, GuestUser. Instead, @avdi adds an authenticated? predicate to the User and GuestUser classes, has them return true and false respectively, and then changes the original expression to:

    if current_user.authenticated?
      render_logout_button
    else
      render_login_button
    end

As the chapter tells us, using the authenticated? predicate makes the conditional statement express the programmer's intent more clearly. But it also says that "we can't get rid of the conditional". My student asked, "Why not?"

Of course we can. The question is whether we want to. (I have a hard time using words like "cannot", "never", and "always", because I can usually imagine an exception to the absolute...)

In this case, there is a lingering smell in the code that uses the special case object: authenticated? is a surrogate for a type check. Indeed, it behaves just like a query to find the object's class so that we can tailor our behavior to the receiver's type. That's just the sort of thing we don't have to do in an OO program.

The standard remedy for this code smell is to push the behavior into the classes and send the object, whatever its type, a message. Rather than ask a user if it is authenticated so that we can render the correct button, we might ask it to render the correct button itself:

    current_user.render_button

...

    class User
      def render_button
        render_logout_button
      end
    end

    class GuestUser
      def render_button
        render_login_button
      end
    end

Unfortunately, it's not quite this simple. The render_logXXX_button methods don't live in the user classes, so the render_button methods need to send those messages to some other object. If the user object already knows to whom to send them, great. If not, then the sender of the render_button message will need to pass itself along as an argument, so that the receiver can send the appropriate message back.
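A minimal sketch of that second approach in Ruby, where view is my hypothetical name for the original rendering context:

    current_user.render_button(self)    # self is the rendering context

    class User
      def render_button(view)
        view.render_logout_button
      end
    end

    class GuestUser
      def render_button(view)
        view.render_login_button
      end
    end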

Either of these approaches requires us to let some knowledge from the original context leak into our User and GuestUser classes, and that creates a new form of coupling. Ideally, there will be a way to mitigate this coupling in the form of some other shared association. Ruby web developers know the answer to this better than I.

In any case, this may be what @avdi means when he says that we can't get rid of the if check. Doing so may create more downside than upside.

This turned into a great opportunity to discuss design with my student. Design is about trade-offs. Things never seem quite as simple in the trenches as they do when we learn the rules and heuristics of design. There is no perfect solution. Our goal as programmers should be to develop the ability to make informed decisions in these situations, taking into account the context in which we are working.

Patterns document design solutions and so must be used with care. One of the things I love about the pattern form is that it encourages the writer to make as explicit as possible the context in which the solution applies and the forces that make its use more or less applicable. This helps the reader to face the possible trade-offs with his or her eyes wide open.

So, one minor improvement @avdi might make in this chapter is to elaborate on the reason underlying the assertion that we can't eliminate this particular if check. Otherwise, students of OOP are likely to ask the same question my student asked.

Of course, the answer may be obvious to Ruby web developers. In the end, working with patterns is like all other design: the more experience we have, the better.

This is a relatively minor issue, though. From what I've seen, "Confident Ruby" will be a valuable addition to most Ruby programmers' bookshelves.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

April 17, 2014 3:30 PM

The "Subclass as Client" Pattern

A few weeks ago, Reginald Braithwaite wrote a short piece discouraging us from creating class hierarchies. His article uses Javascript examples, but I think he intends his advice to apply everywhere:

So if someone asks you to explain how to write a class hierarchy? Go ahead and tell them: "Don't do that!"

If you have done much object-oriented programming in a class-based language, you will recognize his concern with class hierarchies: A change to the implementation of a class high up in the hierarchy could break every class beneath it. This is often called the "fragile base class" problem. Fragile code can't be changed without a lot of pain, fixing all the code broken by the change.

I'm going to violate the premise of Braithwaite's advice and suggest a way that you can make your base classes less fragile and thus make small class hierarchies more attractive. If you would like to follow his advice, feel free to tell me "Don't do that!" and stop reading now.

The technique I suggest follows directly from a practice that OO programmers use to create good objects, one that Braithwaite advocates in his article: encapsulating data tightly within an object.

JavaScript does not enforce private state, but it's easy to write well-encapsulated programs: simply avoid having one object directly manipulate another object's properties. Forty years after Smalltalk was invented, this is a well-understood principle.

The article then shows a standard example of a bank account object written in this style, in which client code uses the object without depending on its implementation. So far, so good.

What about classes?

It turns out, the relationship between classes in a hierarchy is not encapsulated. This is because classes do not relate to each other through a well-defined interface of methods while "hiding" their internal state from each other.

Braithwaite then shows an example of a subclass method that illustrates the problem:

    ChequingAccount.prototype.process = function (cheque) {
      this._currentBalance = this._currentBalance - cheque.amount();
      return this;
    }

The ChequingAccount directly accesses its _currentBalance member, which it inherits from the Account prototype. If we now change the internal implementation of Account so that it does not provide a _currentBalance member, we will break ChequingAccount.

The problem, we are told, is that objects are encapsulated, but classes are not.

... this dependency is not limited in scope to a carefully curated interface of methods and behaviour. We have no encapsulation.

However, as the article pointed out earlier, JavaScript does not enforce private state for objects! Even so, it's easy to write well-encapsulated programs -- by not letting one object directly manipulate another object's properties. This is a design pattern that makes it possible to write OO programs even when the language does not enforce encapsulation.

The problem isn't that objects are encapsulated and classes are not. It's that we tend to treat superclasses differently than we treat other classes.

When we write code for two independent objects, we think of their classes as black boxes, sealed off from external inspection. The data and methods defined in the one class belong to it and its objects. Objects of one class must interact with objects of another via a carefully curated interface of methods and behavior.

But when we write code for a subclass, we tend to think of the data and methods defined in the superclass as somehow "belonging to" instances of the subclass. We take the notion of inheritance too literally.

My suggestion is that you treat your classes like you treat objects: Don't let one class look into another class and access its state directly. Adopt this practice even when the other class is a superclass, and the state is an inherited member.

Many OO programs have this pattern. I usually call it the "Subclass as Client" pattern. Instances of a subclass act as clients of their superclass, treating it -- as much as possible -- as an independent object providing a set of well-defined behaviors.

When code follows this pattern, it takes Braithwaite's advice for designing objects up a level and follows it more faithfully. Even instance variables inherited from the superclass are encapsulated, accessible only through the behaviors of the superclass.

I don't program in Javascript, but I've written a lot of Java over the years, and I think the lessons are compatible. Here's my story.

~~~~~

When I teach OOP, one of the first things my students learn about creating objects is this:

All instance variables are private.

Like Javascript, Java doesn't require this. We can tell the compiler to enforce it, though, through use of the private modifier. Now, only methods defined in the same class can access the variable.

For the most part, students are fine with this idea -- until we learn about subclasses. If one class extends another, it cannot access the inherited data members. The natural thing to do is what they see in too many Java examples in their texts and on the web: change private variables in the superclass to protected. Now, all is right with the world again.

Except that they have stepped directly into the path of the fragile base class problem. Almost any change to the superclass risks breaking all of its subclasses. Even in a sophomore OO course, we quickly encounter the problem of fragile base classes in our programs. But what other choice do we have?

Make each class a server to its subclasses. Keep the instance variables private, and (in Braithwaite's words) carefully curate an interface of methods for subclasses to use. The class may be willing to expose more of its identity to its subclasses than to arbitrary objects, so define protected methods that are accessible only to its subclasses.

This is an intentional extension of the class's interface for explicit interaction with subclasses. (Yes, I know that protected members in Java are accessible to every class in the package. Grrr.)
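Here is a minimal sketch of the pattern in Ruby, with hypothetical account classes that echo Braithwaite's example:

    class Account
      def initialize(balance)
        @balance = balance
      end

      protected

      # A curated interface for subclasses. They send messages
      # instead of touching @balance directly.
      def debit(amount)
        @balance -= amount
      end
    end

    class ChequingAccount < Account
      def process(cheque)
        debit(cheque.amount)    # a message to the superclass's interface
        self
      end
    end

If Account later changes how it stores its balance, ChequingAccount keeps working, because it depends only on debit.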

This is the same discipline we follow when we write well-behaved objects in any language: encapsulate data and define an interface for interaction. When applied to the class-subclass relationship, it helps us to avoid the dangers of fragile base classes.

Forty years after Smalltalk was invented, this principle should be better understood by more programmers. In Smalltalk, variables are encapsulated within their classes, which forces subclasses to access them through methods defined in the superclass. This language feature encourages the writer of the class to think explicitly about how instances of a subclass will interact with the class. (Unfortunately, those methods are public to the world, so programmers have to enforce their scope by convention.)

Of course, a lazy programmer can throw away this advantage. When I first learned OO in Smalltalk, I quickly figured out that I could simply define accessors with the same names as the instance variables. Hurray! My elation did not last long, though. Like my Java students, I quickly found myself with a maze of class-subclass entanglements that made programming unbearable. I had re-invented the Fragile Base Class problem.

Fortunately, I had the Smalltalk class library to study, as well as programs written by better programmers than I. Those programs taught me the Subclass as Client pattern, and I learned that it was possible to use subclasses well when classes were designed carefully. This is just one of the many ways that Smalltalk taught me OOP.

~~~~~

Yes, you should prefer composition to inheritance, and, yes, you should strive to keep your class hierarchies as small and shallow as possible. But if you apply basic principles of object design to your superclasses, you don't need to live in absolute fear of fragile base classes. You can "do that" if you are willing to carefully curate an interface of methods that define the behavior of a class as a superclass.

This advice works well only for the class hierarchies you build for yourself. If you need to work with a class from an external package you don't control, then you can't control the quality of that class's interface. Think carefully before you subclass an external class and depend on its implementation.

One technique I find helpful in this regard is to build a wrapper class around the external class, carefully define an interface for subclasses, and then extend the wrapper class. This at least isolates the risk of changes in the library class to a single class in my program.
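In Ruby, that might look something like this minimal sketch, with ExternalWidget standing in for the library class:

    # ExternalWidget comes from a package we don't control.
    class WidgetWrapper
      def initialize(widget)
        @widget = widget              # composition, not inheritance
      end

      # The interface my program and its subclasses may depend on.
      def draw
        @widget.render                # hypothetical library call
      end
    end

    # Extend the wrapper, never ExternalWidget itself.
    class FramedWidget < WidgetWrapper
      def draw
        puts "+----------+"
        super
        puts "+----------+"
      end
    end

If the library's interface changes, only WidgetWrapper has to change.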

Of course, if you are programming in Javascript, you might want to look to the Self community for more suitable OO advice than to Smalltalk!


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

March 31, 2014 3:21 PM

Programming, Defined and Re-imagined

By Chris Granger of Light Table fame:

Programming is our way of encoding thought such that the computer can help us with it.

Read the whole piece, which recounts Granger's reflection after the Light Table project left him unsatisfied and he sought answers. He concludes that we need to re-center our idea of what programming is and how we can make it accessible to more people. Our current idea of programming doesn't scale because, well,

It turns out masochism is a hard sell.

Every teacher knows this. You can sell masochism to a few proud souls, but not to anyone else.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 29, 2014 10:06 AM

Sooner

That is the advice I find myself giving to students again and again this semester: Sooner.

Review the material we cover in class sooner.

Ask questions sooner.

Think about the homework problems sooner.

Clarify the requirements sooner.

Write code sooner.

Test your code sooner.

Submit a working version of your homework sooner. You can submit a more complete version later.

A lot of this advice boils down to the more general Get feedback sooner. In many ways, it is a dual of the advice, Take small steps. If you take small steps, you can ask, clarify, write, and test sooner. One of the most reliable ways to do those things sooner is to take small steps.

If you are struggling to get things done, give sooner a try. Rather than fail in a familiar way, you might succeed in an unfamiliar way. When you do, you probably won't want to go back to the old way again.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

March 18, 2014 2:30 PM

Deploy So That You Can Learn The Rest

In this interview prior to Monday's debut of FiveThirtyEight, Joe Coscarelli asked Nate Silver if the venture was ready to launch. Silver said that they were probably 75-80% ready and that it was time to go live.

You're going to make some mistakes once you launch that you can't really deal with until you actually have a real product.

If they waited another month, they'd probably feel like they were ... 75-80% ready. There are some things you can't learn "unless your neck is on the line".

It ought not be surprising that Silver feels this way. His calling card is using data to make better decisions. Before you can have big data, or good data, you have to have data. It is usually better to start collecting it now than to start collecting it later.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

March 14, 2014 3:47 PM

We're in a Dr. Seuss Book

Sass, Flexbox, Git, Grunt? Frank Chimero whispers:

(Look at that list, programmers. You need to get better at naming things. No wonder why people are skittish about development. It's like we're in a Dr. Seuss book.)

For new names, it's time to hunt.
I will not Git!
I will not Grunt!

Nevermore shall we let these pass.
No more Flexbox!
No more Sass!


Posted by Eugene Wallingford | Permalink | Categories: Software Development

March 11, 2014 4:52 PM

Change The Battle From Arguments To Tests

In his recent article on the future of the news business, Marc Andreessen has a great passage in his section on ways for the journalism industry to move forward:

Experimentation: You may not have all the right answers up front, but running many experiments changes the battle for the right way forward from arguments to tests. You get data, which leads to correctness and ultimately finding the right answers.

I love that clause: "running many experiments changes the battle for the right way forward from arguments to tests".

While programming, it's easy to get caught up in what we know about the code we have just written and assume that this somehow empowers us to declare sweeping truths about what to do next.

When students are first learning to program, they often fall into this trap -- despite the fact that they don't know much at all. From other courses, though, they are used to thinking for a bit, drawing some conclusions, and then expressing strongly-held opinions. Why not do it with their code, too?

No matter who we are, whenever we do this, sometimes we are right, and sometimes, we are wrong. Why leave it to chance? Run a simple little experiment. Write a snippet of code that implements our idea, and run it. See what happens.

Programs let us test our ideas, even the ideas we have about the program we are writing. Why settle for abstract assertions when we can do better? In the end, even well-reasoned assertions are so much hot air. I learned this from Ward Cunningham: It's all talk until the tests run.


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development, Teaching and Learning

March 08, 2014 10:18 AM

Sometimes a Fantasy

This week I saw a link to The Turing School of Software & Design, "a seven-month, full-time program for people who want to become professional developers". It reminded me of Neumont University, a ten-year-old school that offers a B.S. degree program in computer science that students can complete in two and a half years.

While riding the bike, I occasionally fantasize about doing something like this. With the economics of universities changing so quickly [ 1 | 2 ], there is an opportunity for a new kind of higher education. And there's something appealing about being able to work closely with a cadre of motivated students on the full spectrum of computer science and software development.

This could be an accelerated form of traditional CS instruction, without the distractions of other things, or it could be something different. Traditional university courses are pretty confining. "This course is about algorithms. That one is about programming languages." It would be fun to run a studio in which students serve as apprentices making real stuff, all of us learning as we go along.

A few years ago, one of our ChiliPLoP hot topic groups conducted a greenfield thought experiment to design an undergrad CS program outside of the constraints of any existing university structure. Student advancement was based on demonstrating professional competencies, not completing packaged courses. It was such an appealing idea! Of course, there was a lot of hard work to be done working out the details.

My view of university is still romantic, though. I like the idea of students engaging the great ideas of humanity that lie outside their major. These days, I think it's conceivable to include the humanities and other disciplines in a new kind of CS education. In a recent blog entry, Hollis Robbins floats the idea of Home College for the first year of a liberal arts education. The premise is that there are "thousands of qualified, trained, energetic, and underemployed Ph.D.s [...] struggling to find stable teaching jobs". Hiring a well-rounded tutor could be a lot less expensive than a year at a private college, and more lucrative for the tutor than adjuncting.

Maybe a new educational venture could offer more than targeted professional development in computing or software. Include a couple of humanities profs, maybe a social scientist, and it could offer a more complete undergraduate education -- one that is economical in both time and money.

But the core of my dream is going broad and deep in CS without the baggage of a university. Sometimes a fantasy is all you need. Other times...


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

March 07, 2014 2:24 PM

Take Small Steps

If a CS major learns only one habit of professional practice in four years, it should be:

Take small steps.

A corollary:

If things aren't working, take smaller steps.

I once heard Kent Beck say something similar, in the context of TDD and XP. When my colleague Mark Jacobson works with students who are struggling, he uses a similar mantra: Solve a simpler problem. As Dr. Nick notes, students and professionals alike should scale the step size according to their level of knowledge or their confidence about the problem.

When I tweeted these thoughts yesterday, two pieces of related advice came in:

  • Slow down. -- Big steps are usually a sign of trying to hurry. Beginners are especially prone to this.

  • Lemma: Keep moving. -- Small steps keep us moving more reliably. We can always fool ourselves into believing that the next big step is all we need...

Of course, I've always been a fan of baby steps and unusual connections to agile software development. They apply quite nicely to learners in many settings.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

March 01, 2014 11:35 AM

A Few Old Passages

I was looking over a couple of files of old notes and found several quotes that I still like, usually from articles I enjoyed as well. They haven't found their way into a blog entry yet, but they deserve to see the light of day.

Evidence, Please!

From a short note on the tendency even among scientists to believe unsubstantiated claims, both in and out of the professional context:

It's hard work, but I suspect the real challenge will lie in persuading working programmers to say "evidence, please" more often.

More programmers and computer scientists are trying to collect and understand data these days, but I'm not sure we've made much headway in getting programmers to ask for evidence.

Sometimes, Code Before Math

From a discussion of The Expectation Maximization Algorithm:

The code is a lot simpler to understand than the math, I think.

I often understand the language of code more quickly than the language of math. Reading, or even writing, a program sometimes helps me understand a new idea better than reading the math. Theory is, however, great for helping me to pin down what I have learned more formally.

Grin, Wave, Nod

From Iteration Inside and Out, a review of the many ways we loop over stuff in programs:

Right now, the Rubyists are grinning, the Smalltalkers are furiously waving their hands in the air to get the teacher's attention, and the Lispers are just nodding smugly in the back row (all as usual).

As a Big Fan of all three languages, I am occasionally conflicted. Grin? Wave? Nod? Look like the court jester by doing all three simultaneously?


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

February 16, 2014 10:48 AM

Experience Happens When You Keep Showing Up

You know what they say about good design coming from experience, and experience coming from bad design? That phenomenon is true of most things non-trivial. Here's an example from men's college basketball.

The University of Florida has a veteran team. The University of Kentucky has a young team. Florida's players are very good, but not generally considered to be in the same class as Kentucky's highly-regarded players. Yesterday, the two teams played a close game on Kentucky's home floor.

Once they fell behind by five with less than two minutes remaining, Kentucky players panicked. Florida players didn't. Why not? "Well, we have a veteran group here that's panicked before -- that's been in this situation and not handled it well," [Patric] Young said.

How did Florida's players maintain their composure at the end of a tight game on the road against another good team? They had been in that same situation three times before, and failed. They didn't panic this time in large part because they had panicked before and learned from those experiences.

Kentucky's starters have played a total of 124 college games. Florida's four seniors have combined to play 491. That's a lot of experience -- a lot of opportunities to panic, to guess wrong, to underestimate a situation, or otherwise to come up short. And a lot of opportunities to learn.

The young players at Kentucky hurt today. As the author of the linked game report notes, Florida's players have hurt like that before, for coming up short in much the same way, "and they used that pain to get better".

It turns out that composure comes from experience, and experience comes from lack of composure.

As a teacher, I try to convince students not to shy away from the error messages their compiler gives them, or from the convoluted code they eventually sneak past it. Those are the experiences they'll eventually look back to when they are capable, confident programmers. They just need the opportunity to learn.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

February 14, 2014 3:07 PM

Do Things That Help You Become Less Wrong

My students and I debriefed a programming assignment in class yesterday. In the middle of class, I said, "Now for a big question: How do you know your code is correct?"

There were a lot of knowing smiles and a lot of nervous laughter. Most of them don't.

Sure, they ran a few test cases, but after making additions and changes to the code, some were just happy that it still ran. The output looked reasonable, so it must be done. I suggested that they might want to think more about testing.

This morning I read a great quote from Nathan Marz that I will share with my students:

Feedback is everything. Most of the time you're wrong, and feedback is the only way to realize your mistakes and help you become less wrong. This applies to everything.

Most of the time you're wrong. Do things that help you become less wrong. Getting feedback, early and often, is one of the best ways to do this.

A comment by a student earlier in the period foreshadowed our discussion of testing, which made me feel even better. In response to the retrospective question, "What design or programming choices went well for you?", he answered unit tests.

That set me up quite nicely to segue from manual testing into automated testing. If you aren't testing your code early and often, then manual testing is a huge improvement. But you can do even better by pushing those test cases into a form that can be executed quickly and easily, with the code doing the tedious work of verifying the output.

My students are writing code in many different languages, so I showed them testing frameworks in Ruby, Java, and Python. The code looks simple, even with the boilerplate imposed by the frameworks.
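For instance, a complete test file in Ruby's minitest framework can be this short; the word_count function is a stand-in for student code:

    require 'minitest/autorun'

    # A tiny function under test, for illustration only.
    def word_count(text)
      text.split.length
    end

    class TestWordCount < Minitest::Test
      def test_counts_words_in_a_simple_sentence
        assert_equal 4, word_count("feedback is everything here")
      end
    end

Running the file runs the tests and reports the results; the framework does the tedious verification for us.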

The big challenges in getting students to write unit tests are the same as for getting professionals to write them: lack of time, and misplaced confidence. I hope that a few of my students will see that the real time sink is debugging bad code, and that a fear of changing code stems from a lack of confidence. The best way to be confident is to have evidence.

The student who extolled unit tests works in Racket and so has test cases in RackUnit. He set me up nicely for a future discussion, too, when he admitted out loud that he wrote his tests first. This time, it was I who smiled knowingly.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

January 31, 2014 3:13 PM

"What Should It Be?"

When asked to design and implement a program, beginning programmers often aren't sure what data type or data structure to use for a particular value. Should they use an array or a list? Or they've decided to use a record but can't decide exactly what fields to include, or what names to give them.

"What should it be?", they ask.

I often have a particular implementation in mind, based on what we've been studying or on my own experience as a programmer, but I prefer not to tell them what to do. This is a great opportunity for them to learn to think about design.

Instead, I ask questions. "What have you considered?" "Do you think one or the other is better?" "Why?"

We discuss how so often there is no "right" answer. There are merely trade-offs. They have to choose. This is a design decision.

But, in making this decision, there's another opportunity to learn something about design. They don't have to commit now and forever to an implementation before proceeding with the rest of their program. Because the rest of the program shouldn't know about their decision anyway!

They should make an object that encapsulates the choice. They are then able to start building the rest of the program without fear that it depends on the details of their design choice. The rest of the program will interact with the object in terms of what the object means in the program, not in terms of how it is implemented. Later, if they change their minds, they will be able to change the implementation details without disturbing the rest of the code.
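For example, a student unsure whether a course roster should be an array or something else can hide the choice behind a small class. A hypothetical sketch in Ruby:

    class Roster
      def initialize
        @students = []       # an array today; maybe a hash or a set later
      end

      def enroll(student)
        @students << student
      end

      def enrolled?(student)
        @students.include?(student)
      end

      def size
        @students.length
      end
    end

The rest of the program speaks in terms of enrolling and looking up students, never in terms of arrays.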

Yes, this is basic stuff, but beginners often struggle with basic stuff. They've learned about ADTs or OOP, and they can talk abstractly about abstraction. But when it comes time to write code, indecision descends upon them. They are afraid of messing up.

If I can help allay their fears of proceeding, then I've contributed something to their work that day. I suggest that writing the rest of the program might even help them figure out which alternative is better. I like to listen to my code, even if that idea seems strange or absurd to them. Some day soon, it may not.

In any case, they have the beginnings of a program, and perhaps a better idea of what design is all about.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

January 24, 2014 2:18 PM

Could I be a programmer?

... with apologies to Annie Dillard.

A well-known programmer got collared by a university student who asked, "Do you think I could be a programmer?"

"Well," the programmer said, "I don't know... Do you like function calls?"

The programmer could see the student's amazement. Function calls? Do I like function calls? I am twenty years old and do I like function calls?

If the student had liked function calls, of course, he could begin, like a joyful painter I knew. I asked him how he came to be a painter. He said, "I liked the smell of paint."


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

January 08, 2014 3:06 PM

"I'm Not a Programmer"

In The Exceptional Beauty of Doom 3's Source Code, Shawn McGrath first says this:

I've never really cared about source code before. I don't really consider myself a 'programmer'.

Then he says this:

Dyad has 193k lines of code, all C++.

193,000 lines of C++? Um, dude, you're a programmer.

Even so, the point is worth thinking about. For most people, programming is a means to an end: a way to create something. Many CS students start with a dream program in mind and only later, like McGrath, come to appreciate code for its own sake. Some of our graduates never really get there, and appreciate programming mostly for what they can do with it.

If the message we send from academic CS is "come to us only if you already care about code for its own sake", then we may want to fix our message.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

December 20, 2013 3:01 PM

Sometimes, Good Design is Simple

In an article on the Moonpig billing system, Mark Dominus writes:

Sometimes I see other people [screw] up a project over and over, and I say "I could do that better", and then I get a chance to try, and I discover it was a lot harder than I thought, I realize that those people who tried before are not as stupid as I believed.

That did not happen this time.

Sometimes, good design is pretty simple. Separate interface from implementation. Create simple abstraction layers to separate different levels of functionality. Encapsulate data and behavior in objects that circumscribe potential change.

I liked a few of the specific tactics described, too:

Don't use raw primitives from the language, even standard classes. "Instead of using raw DateTime [a standard Perl class], we wrapped it in a derived class called Moonpig::DateTime."

Define convenience functions that hide underlying data implementations. Moonpig does this in several places, most notably money and time.

Use mutable data sparingly, and never for values. One way Moonpig does this is to implement "values with history", an idea I first learned from Ralph Johnson in Smalltalk. Each new value for an entity is pushed onto an array. When a piece of code asks for the current value, it receives the top of the array.
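A sketch of the values-with-history idea in Ruby -- the names here are mine, not Moonpig's:

    class HistoricalValue
      def initialize(initial)
        @history = [initial]
      end

      # Never overwrite; push each new value onto the history.
      def update(new_value)
        @history.push(new_value)
      end

      # The current value is the top of the array.
      def current
        @history.last
      end
    end

    balance = HistoricalValue.new(100)
    balance.update(75)
    balance.current    # => 75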

Object-oriented programming is centered around objects. That means encapsulated behavior. Other concepts, such as classes and inheritance, are add-ons. Dominus is especially hard on inheritance, based on past experience. I agree that it must be used carefully and sparingly. I like how Moonpig uses roles to eliminate the need for classes entirely in the application.

This was a fun read.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

December 17, 2013 3:32 PM

Always Have At Least Two Alternatives

Paraphrasing Kent Beck:

Whenever I write a new piece of code, I like to have at least two alternatives in mind. That way, I know I am not doing the worst thing possible.

I heard Kent say something like this at OOPSLA in the late 1990s. This is advice I give often to students and colleagues, but I've never had a URL that I could point them to.

It's tempting for programmers to start implementing the first good idea that comes to mind. It's especially tempting for novices, who sometimes seem surprised that they have even one good idea. Where would a second one come from?

More experienced students and programmers sometimes trust their skill and experience a little too easily. That first idea seems so good, and I'm a good programmer... Famous last words. Reality eventually catches up with us and helps us become more humble.

Some students are afraid: afraid they won't get done if they waste time considering alternatives, or afraid that they will choose wrong anyway. Such students need more confidence, the kind born out of small successes.

I think the most likely explanation for why beginners don't already seek alternatives is quite simple. They have not developed the design habit. Kent's advice can be a good start.

One pithy statement is often enough of a reminder for more experienced programmers. By itself, though, it probably isn't enough for beginners. But it can be an important first step for students -- and others -- who are in the habit of doing the first thing that pops into their heads.

Do note that this advice is consistent with XP's counsel to do the simplest thing that could possibly work. "Simplest" is a superlative. Grammatically, that suggests having at least three options from which to choose!


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development, Teaching and Learning

December 10, 2013 3:33 PM

Your Programming Language is Your Raw Material, Too

Recently someone I know retweeted this familiar sentiment:

If carpenters were hired like programmers:
"Must have at least 5 years experience with the Dewalt 18V 165mm Circular Saw"

This meme travels around the world in various forms all the time, and every so often it shows up in one of my inboxes. And every time I think, "There is more to the story."

In one sense, the meme reflects a real problem in the software world. Job ads often use lists of programming languages and technologies as requirements, when what the company presumably really wants is a competent developer. I may not know the particular technologies on your list, or be expert in them, but if I am an experienced developer I will be able to learn them and become an expert.

Understanding and skill run deeper than a surface list of tools.

But. A programming language is not just a tool. It is a building material, too.

Suppose that a carpenter uses a Dewalt 18V 165mm circular saw to add a room to your house. When he finishes the project and leaves your employ, you won't have any trace of the Dewalt in his work product. You will have a new room.

He might have used another brand of circular saw. He may not have used a power tool at all, preferring the fine craftsmanship of a handsaw. Maybe he used no saw of any kind. (What a magician!) You will still have the same new room regardless, and your life will proceed in the very same way.

Now suppose that a programmer uses the Java programming language to add a software module to your accounting system. When she finishes the project and leaves your employ, you will have the results of running her code, for sure. But you will have a trace of Java in her work product. You will have a new Java program.

If you intend to use the program again, to generate a new report from new input data, you will need an instance of the JVM to run it. If you want to modify the program to work differently, then you will also need a Java compiler to create the byte codes that run in the JVM. If you want to extend the program to do more, then you again will need a Java compiler and interpreter.

Programs are themselves tools, and we use programming languages to build them. So, while the language itself is surely a tool at one level, at another level it is the raw material out of which we create other things.

To use a particular language is to introduce a slew of other dependencies to the overall process: compilers, interpreters, libraries, and sometimes even machine architectures. In the general case, to use a particular language is to commit at least some part of the company's future attention to both the language and its attendant tooling.

So, while I am sympathetic to the sentiment behind our recurring meme, I think it's important to remember that a programming language is more than just a particular brand of power tool. It is the stuff programs are made of.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

December 08, 2013 11:48 AM

Change Happens When People Talk to People

I finally got around to reading Atul Gawande's Slow Ideas this morning. It's a New Yorker piece from last summer about how some good ideas seem to resist widespread adoption, despite ample evidence in their favor, and ways that one might help accelerate their spread.

As I read, I couldn't help but think of parallels to teaching students to write programs and helping professionals develop software more reliably. We know that development practices such as version control, short iterations, and pervasive testing lead to better software and more reliable process. Yet they are hard habits for many programmers to develop, especially when they have conflicting habits in place.

Other development practices seem counterintuitive. "Pair programming can't work, right?" In these cases, we have to help people overcome both habits of practice and habits of thought. That's a tall order.

Gawande's article is about medical practice, from surgeons to home practitioners, but his conclusions apply to software development as well. For instance: People have an easier time changing habits when the benefit is personal, immediate, and visceral. When the benefit is not so obvious, a whole new way of thinking is needed. That requires time and education.

The key message to teach surgeons, it turned out, was not how to stop germs but how to think like a laboratory scientist.

This is certainly true for software developers. (If you replace "germs" with "bugs", it's an even better fit!) Much of the time, developers have to think about evidence the ways scientists do.

This lesson is true not just for surgeons and software developers. It is true for most people, in most ways of life. Sometimes, we all have to be able to think and act like a scientist. I can think of no better argument for treating science as important for all students, just as we do reading and writing.

Other lessons from Gawande's article are more down-to-earth:

Many of the changes took practice for her, she said. She had to learn, for instance, how to have all the critical supplies -- blood-pressure cuff, thermometer, soap, clean gloves, baby respiratory mask, medications -- lined up and ready for when she needed them; how to fit the use of them into her routine; how to convince mothers and their relatives that the best thing for a child was to be bundled against the mother's skin. ...

So many good ideas in one paragraph! Many software development teams could improve by putting them in action:

  • Construct a work environment with essential tools ready at hand.
  • Adjust routine to include new tools.
  • Help collaborators see and understand the benefit of new habits.
  • Practice, practice, practice.

Finally, the human touch is essential. People who understand must help others to see and understand. But when we order, judge, or hector people, they tend to close down the paths of communication, precisely when we need them to be most open. Gawande's colleagues have been most successful when they built personal relationships:

"It wasn't like talking to someone who was trying to find mistakes," she said. "It was like talking to a friend."

Good teachers know this. Some have to learn it the hard way, in the trenches with their students. But then, that is how Gawande's colleagues learned it, too.

"Slow Hands" is good news for teachers all around. It teaches ways to do our job better. But also, in many ways, it tells us that teaching will continue to matter in an age dominated by technological success:

People talking to people is still how the world's standards change.


Posted by Eugene Wallingford | Permalink | Categories: Managing and Leading, Software Development, Teaching and Learning

December 04, 2013 3:14 PM

Agile Moments, "Why We Test" Edition

Case 1: Big Programs.

This blog entry tells the sad story of a computational biologist who had to retract six published articles. Why? Their conclusions depended on the output of a computer program, and that program contained a critical error. The writer of the entry, who is not the researcher in question, concludes:

What this should flag is the necessity to aggressively test all the software that you write.

Actually, you should have tests for any program you use to draw important conclusions, whether you wrote it or not. The same blog entry mentions that a grad student in the author's previous lab had found several bugs in a molecular dynamics program used by many computational biologists. How many published results were affected before they were found?

Case 2: Small Scripts.

Titus Brown reports finding bugs every time he reused one of his Python scripts. Yet:

Did I start doing any kind of automated testing of my scripts? Hell no! Anyone who wants to write automated tests for all their little scriptlets is, frankly, insane. But this was one of the two catalysts that made me personally own up to the idea that most of my code was probably somewhat wrong.

Most of my code has bugs but, hey, why write tests?

Didn't a famous scientist define insanity as doing the same thing over and over but expecting different results?

I consider myself insane, too, but mostly because I don't write tests often enough for my small scripts. We say to ourselves that we'll never reuse them, so we don't need tests. But we don't throw them away, and then we do reuse them, perhaps with a tweak here or there.

We all face time constraints. When we run a script the first time, we may well pay enough attention to the output that we are confident it is correct. But perhaps we can all agree that the second time we use a script, we should write tests for it if we don't already have them.

There are only three numbers in computing: 0, 1, and many. The second time we use a program is a sign from the universe that we need the added confidence provided by tests.

To be fair, Brown goes on to offer some good advice, such as writing tests for code after you find a bug in it. His article is an interesting read, as is almost everything he writes about computation and science.

Case 3: The Disappointing Trade-Off.

Then there's this classic from Jamie Zawinski, as quoted in Coders at Work:

I hope I don't sound like I'm saying, "Testing is for chumps." It's not. It's a matter of priorities. Are you trying to write good software or are you trying to be done by next week? You can't do both.

Sigh. If you don't have good software by next week, maybe you aren't done yet.

I understand that the real world imposes constraints on us, and that sometimes worse is better. Good enough is good enough, and we rarely need a perfect program. I also understand that Zawinski was trying to be fair to the idea of testing, and that he was surely producing good enough code before releasing.

Even still, the pervasive attitude that we can either write good programs or get done on time, but not both, makes me sad. I hope that we can do better.

And I'm betting that the computational biologist referred to in Case 1 wishes he had had some tests to catch the simple error that undermined five years' worth of research.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

December 03, 2013 3:17 PM

The Workaday Byproducts of Striving for Higher Goals

Why set audacious goals? In his piece about the Snowfall experiment, David Sleight says yes, and not simply for the immediate end:

The benefits go beyond the plainly obvious. You need good R&D for the same reason you need a good space program. It doesn't just get you to the Moon. It gives you things like memory foam, scratch-resistant lenses, and Dustbusters. It gets you the workaday byproducts of striving for higher goals.

I showed that last sentence a little Twitter love, because it's something people often forget to consider, both when they are working in the trenches and when they are selecting projects to work on. An ambitious project may have a higher risk of failure than something more mundane, but it also has a higher chance of producing unexpected value in the form of new tools and improved process.

This is also something that university curricula don't do well. We tend to design learning experiences that fit neatly into a fifteen-week semester, with predictable gains for our students. That sort of progress is important, of course, but it misses out on opportunities for students to produce their own workaday byproducts. And that's an important experience for students to have.

It also gives a bad example of what learning should feel like, and what it should do for us. Students generally learn what we teach them, or what we make easiest for them to learn. If we always set before them tasks of known, easily-understood dimensions, then they will have to learn after leaving us that the world doesn't usually work like that.

This is one of the reasons I am such a fan of project-based computer science education, as in the traditional compiler course. A compiler is an audacious enough goal for most students that they get to discover their own personal memory foam.


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development, Teaching and Learning

November 25, 2013 2:56 PM

The Moment When Design Happens

Even when we plan ahead a bit, the design of a program tends to evolve. Gary Bernhardt gives an example in his essay on abstraction:

If I feel the need to violate the abstraction, I need to reconsider how to modify the boundaries to match that need, rather than violating the boundaries by crossing them.

This is the moment when design happens...

This is a hard design lesson to give students, because it is likely to click with them only after living with the consequences of violating the abstraction. This requires working with the same large program over time, preferably one they are building along the way.

This is one of the reasons I so like our senior project courses. My students are building a compiler this term, which gives them a chance to experience a moment when design happens. Their abstract syntax trees and symbol tables are just the sort of abstractions that invite violation -- and reward a little re-design.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

November 21, 2013 3:06 PM

Agile Thoughts, Healthcare.gov Edition

Clay Shirky explains the cultural attitudes that underlie Healthcare.gov's problems in his recent essay on the gulf between planning and reality. The danger of this gulf exists in any organization, whether business or government, but especially in large organizations. As the number of levels grows between the most powerful decision makers and the workers in the trenches, there is an increasing risk of developing "a culture that prefers deluding the boss over delivering bad news".

But this is also a story of the danger inherent in so-called Big Design Up Front, especially for a new kind of product. Shirky oversimplifies this as the waterfall method, but the basic idea is the same:

By putting the most serious planning at the beginning, with subsequent work derived from the plan, the waterfall method amounts to a pledge by all parties not to learn anything while doing the actual work.

You may learn something, of course; you just aren't allowed to let it change what you build, or how.

Instead, waterfall insists that the participants will understand best how things should work before accumulating any real-world experience, and that planners will always know more than workers.

If the planners believe this, or they allow the workers to think they believe this, then workers will naturally avoid telling their managers what they have learned. In the best case, they don't want to waste anyone's time if sharing the information will have no effect. In the worst case, they might fear the results of sharing what they have learned. No one likes to admit that they can't get the assigned task done, however unrealistic it is.

As Shirky notes, many people believe that a difficult launch of Healthcare.gov was unavoidable, because political and practical factors prevented developers from testing parts of the project as they went along and adjusting their actions in response. Shirky hits this one out of the park:

That observation illustrates the gulf between planning and reality in political circles. It is hard for policy people to imagine that Healthcare.gov could have had a phased rollout, even while it is having one.

You can learn from feedback earlier, or you can learn from feedback later. Pretending that you can avoid problems you already know exist never works.

One of the things I like about agile approaches to software development is they encourage us not to delude ourselves, or our clients. Or our bosses.


Posted by Eugene Wallingford | Permalink | Categories: General, Managing and Leading, Software Development

November 19, 2013 4:49 PM

First Model, Then Improve

Not long ago, I read Unhappy Truckers and Other Algorithmic Problems, an article by Tom Vanderbilt that looks at efforts to optimize delivery schedules at UPS and similar companies. At the heart of the challenge lies the traveling salesman problem. However, in practice, the challenge brings companies face-to-face with a bevy of human issues, from personal to social, psychological to economic. As a result, solving this TSP is more complex than what we see in the algorithms courses we take in our CS programs.

Yet, in the face of challenges both computational and human, the human planners working at these companies do a pretty good job. How? Over the course of time, researchers figured out that finding optimal routes shouldn't be their main goal:

"Our objective wasn't to get the best solution," says Ted Gifford, a longtime operations research specialist at Schneider. "Our objective was to try to simulate what the real world planners were really doing."

This is a lesson I learned the hard way, too, back in graduate school, when my advisor's lab was trying to build knowledge-based systems for real clients, in chemical engineering, aeronautics, business, and other domains. We were working with real people who were solving hard problems under serious constraints.

At the beginning I was a typically naive programmer, armed with fancy AI techniques and unbounded enthusiasm. I soon learned that, if you walk into a workplace and propose to solve all the peoples' problems with a program, things don't go as smoothly as the programmer might hope.

First of all, this impolitic approach generally creates immediate pushback. These are people, with personal investment in the way things work now. They tend to bristle when a 20-something grad student walks in the door promoting the wonder drug for all their ills. Some might even fear that you are right, and success for your program will mean negative consequences for them personally. We see this dynamic in Vanderbilt's article.

There's a deeper reason that things don't go so smoothly, though, and it's the real lesson of Vanderbilt's piece. Until you implement the existing solution to the problem, you don't really understand the problem yet.

These problems are complex, often with many more constraints than typical theoretical solutions have dealt with. The humans solving the problem often have many years of experience contributing to their approach. They have deep knowledge of the domain, but also repeated exposure to the exceptions and edge cases that sometimes confound theoretical solutions. They use heuristics that are hard to tease apart or articulate.

I learned that it's easy to solve a problem if you are solving the wrong one.

A better way to approach these challenges is: First, model the existing system, including the extant solution. Then, look for ways to improve on the solution.

This approach often gives everyone involved greater confidence that the programmers understand -- and so are solving -- the right problem. It also enables the team to make small, incremental changes to the system, with a correspondingly higher probability of success. Together, these two outcomes greatly increase the chance of human buy-in from the current workers. This makes it easier for the whole team to recognize the need for larger-scale changes to the process, and to support and contribute to an improved solution.

Vanderbilt tells a similarly pragmatic story. He writes:

When I suggest to Gifford that he's trying to understand the real world, mathematically, he concurs, but adds: "The word 'understand' is too strong--we are happy to get positive outcomes."

Positive outcomes are what the company wants. Fortunately for the academics who work on such problems in industry, achieving good outcomes is often an effective way to test theories, encounter their shortcomings, and work on improvements. That, too, is something I learned in grad school. It was a valuable lesson.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 04, 2013 2:41 PM

Those Silly Tests

I love this passage by Mark Dominus in Overlapping Intervals:

This was yet another time when I felt slightly foolish as I wrote the automated tests, assuming that the time and effort I spent on testing this trivial function would be time and effort thrown away on nothing -- and then they detected a real fault. Someday perhaps I'll stop feeling foolish writing tests for functions like this one; until then, many cases just like this one will help me remember that I must write the tests even though I feel foolish doing it.

Even excellent programmers feel silly writing tests sometimes. But they also benefit from writing them. Dominus was saved here by his test-writing habit, or by his sense of right and wrong.

Helping students develop that habit or that moral sense is a challenge. Even so, I rarely come across a situation where my students or I write or run too many tests. I regularly encounter cases where we write or run too few.

Dominus's blog entry also has a great passage on a larger lesson from that coding experience. In the end, his clever solution to a tricky problem results not from "just thinking" but from deeper thought: from "applying carefully-learned and practiced technique". That's an important form of thinking, too.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

October 30, 2013 11:41 AM

Discipline Can Be Structural As Well As Personal

There is a great insight in an old post by Brian Marick, Discipline and Skill, which I re-read this week. The topic sentence asserts:

Discipline can be a personal virtue, but it must also be structural.

Extreme Programming illustrates this claim. It draws its greatest power from the structural discipline it creates for developers. Marick goes on:

For example, one of the reasons to program in pairs is that two people are less likely to skip a test than one is. Removing code ownership makes it more likely someone within glaring distance will see that you didn't leave code as clean as you should have. The business's absolute insistence on getting working -- really working -- software at frequent intervals makes the pain of sloppiness strike home next month instead of next year, stiffening the resolve to do the right thing today.

XP consists of a lot of relatively simple actions, but simple actions can be hard to perform, especially consistently and especially in opposition to deeply ingrained habits. XP practices work together to create structural discipline that helps developers "do the right thing".

We see the use of social media playing a similar role these days. Consider diet. People who are trying to lose weight or exercise more have to do some pretty simple things. Unfortunately, those things are not easy to do consistently, and they are opposed by deep personal and cultural habits. In order to address this, digital tool providers like FitBit make it easy for users to sync their data to a social media account and share with others.

This is a form of social discipline, supported by tools and practices that give structure to the actions people want to take. Just like XP. Many behaviors in life work this way.

(Of course, I'm already on record as saying that XP is a self-help system. I have even fantasized about XP's relationship to self-help in the cinema.)


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development

October 16, 2013 11:38 AM

Poetry as a Metaphor for Software

I was reading Roger Hui's Remembering Ken Iverson this morning on the elliptical, and it reminded me of this passage from A Conversation with Arthur Whitney. Whitney is a long-time APL guru and the creator of the A, K, and Q programming languages. The interviewer is Bryan Cantrill.

BC: Software has often been compared with civil engineering, but I'm really sick of people describing software as being like a bridge. What do you think the analog for software is?

AW: Poetry.

BC: Poetry captures the aesthetics, but not the precision.

AW: I don't know, maybe it does.

A poet's use of language is quite precise. It must balance forces in many dimensions, including sound, shape, denotation, and connotation. Whitney seems to understand this. Richard Gabriel must be proud.

Brevity is a value in the APL world. Whitney must have a similar preference for short language names. I don't know the source of his names A, K, and Q, but I like Hui's explanation of where J's name came from:

... on Sunday, August 27, 1989, at about four o'clock in the afternoon, [I] wrote the first line of code that became the implementation described in this document.

The name "J" was chosen a few minutes later, when it became necessary to save the interpreter source file for the first time.

Beautiful. No messing around with branding. Gotta save my file.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 10, 2013 5:17 AM

Software Design is a Bet on a Particular Future

This truth is expressed nicely by Reginald Braithwaite:

Software design is the act of making bets about the future. A well-designed program is a bet on what will change in the future, and what will not change. And a well-designed program communicates the nature of that bet by being relatively flexible about things that the designers think are most likely to change, and being relatively inflexible about the things the designers think are least likely to change.

That's what refactoring is all about, of course. Sometimes, a particular guess turns out to be wrong. We have the wrong factors, the wrong components, for adding a new feature. So we change the shape of the code -- we factor it into different components -- to reflect our new best understanding of the future. Then we move on.

Sometimes, though, there are forces that make more desirable a relatively monolithic piece of code (or, as Braithwaite points out, a system decomposed into relatively less flexible components). In these cases, we need to defactor, to use Braithwaite's term: we recombine some or all of the parts to create a new design.

Predicting the future is hard, even for experienced programmers. One of the goals of agile design is to not think too far ahead, because that means committing to a future too far removed from what we already know to be true about our program.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

October 07, 2013 12:07 PM

StrangeLoop: Exercises in Programming Style

[My notes on StrangeLoop 2013: Table of Contents]

Crista Lopes

I had been looking forward to Crista Lopes's StrangeLoop talk since May, so I made sure I was in the room well before the scheduled time. I even had a copy of the trigger book in my bag.

Crista opened with something that CS instructors have learned the hard way: Teaching programming style is difficult and takes a lot of time. As a result, it's often not done at all in our courses. But so many of our graduates go into software development for their careers, where they come into contact with many different styles. How can they understand them -- well, quickly, or at all?

To many people, style is merely the appearance of code on the screen or on the printed page. But it's not. It is something more, something entirely different. Style is a constraint. Lopes used images of a few paintings in distinctive styles to illustrate the idea. If an artist limits herself to pointillism or cubism, how can she express important ideas? How does the style limit the message, or enhance it?

But we know this is true of programming as well. The idea has been a theme in my teaching for many years. I occasionally write about the role of constraints in programming here, including Patterns as a Source of Freedom, a few programming challenges, and a polymorphism challenge that I've run as a workshop.

Lopes pointed to a more universal example, though: the canonical The Elements of Programming Style. Drawing on this book and other work in software, she said that programming style ...

  • is a way to express tasks
  • exists at all scales
  • recurs at multiple scales
  • is codified in programming language

For me, the last bullet ties back most directly to the idea of style as constraint. A language makes some things easier to express than others. It can also make some things harder to express. There is a spectrum, of course. For example, some OO languages make it easy to create and use objects; others make it hard to do anything else! But the language is an enabler and enforcer of style. It is a proxy for style as a constraint on the programmer.

Back to the talk. Lopes asked, Why is it so important that we understand programming style? First, a style provides the reader with a frame of reference and a vocabulary. Knowing different styles makes us more effective consumers of code. Second, one style can be more appropriate for a given problem or context than another style. So, knowing different styles makes us more effective producers of code. (Lopes did not use the producer-consumer distinction in the talk, but it seems to me a nice way to crystallize her idea.)

the cover of Raymond Queneau's Exercises in Style

Then, Lopes said, I came across Raymond Queneau's playful little book, "Exercises in Style". Queneau constrains himself in many interesting ways while telling essentially the same story. Hmm... We could apply the same idea to programming! Let's do it.

Lopes picked a well-known problem, the common word problem famously solved in a Programming Pearls column more than twenty-five years ago. This is a fitting choice, because Jon Bentley included in that column a critique of Knuth's program by Doug McIlroy, who considered both engineering concerns and program style in his critique.

The problem is straightforward: identify and print the k most common terms that occur in a given text document, in decreasing order. For the rest of the talk, Lopes presented several programs that solve the problem, each written in a different style, showing code and highlighting its shape and boundaries.

Python was her language of choice for the examples. She was looking for a language that many readers would be able to follow and understand, and Python has the feel of pseudo-code about it. (I tell my students that it is the Pascal of their time, though I may as well be speaking of hieroglyphics.) Of course, Python has strengths and weaknesses that affect its fit for some styles. This is an unavoidable complication for all communication...
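To make the problem concrete, here is my own minimal Python sketch in one plain style -- not one of the examples from the talk:

    # My own sketch, in one plain style: print the k most common words
    # in a text file, most frequent first.
    import collections
    import re
    import sys

    def common_words(filename, k):
        with open(filename) as f:
            words = re.findall(r"[a-z]+", f.read().lower())
        return collections.Counter(words).most_common(k)

    if __name__ == "__main__":
        for word, count in common_words(sys.argv[1], int(sys.argv[2])):
            print(count, word)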

Also, Lopes did not give formal names to the styles she demonstrated. Apparently, in previous versions of this talk, audience members had wanted to argue over the names more than the styles themselves! Vowing not to make that mistake again, she numbered her examples for this talk.

That's what programmers do when they don't have good names.

In lieu of names, she asked the crowd to live-tweet to her what they thought each style is or should be called. She eventually did give each style a fun, informal name. (CS textbooks might be more evocative if we used her names instead of the formal ones.)

I noted eight examples shown by Lopes in the talk, though there may have been more:

  • monolithic procedural code -- "brain dump"
  • a Unix-style pipeline -- "code golf"
  • procedural decomposition with a sequential main -- "cook book"
  • the same, only with functions and composition -- "Willy Wonka"
  • functional decomposition, with a continuation parameter -- "crochet"
  • modules containing multiple functions -- "the kingdom of nouns"
  • relational style -- (didn't catch this one)
  • functional with decomposition and reduction -- "multiplexer"

Lopes said that she hopes to produce solutions using a total of thirty or so styles. She asked the audience for help with one in particular: logic programming. She said that she is not a native speaker of that style, and Python does not come with a logic engine built-in to make writing a solution straightforward.

Someone from the audience suggested she consider yet another style: using a domain-specific language. That would be fun, though perhaps tough to roll from scratch in Python. By that time, my own brain was spinning away, thinking about writing a solution to the problem in Joy, using a concatenative style.

Sometimes, it's surprising just how many programming styles and meaningful variations people have created. The human mind is an amazing thing.

The talk was, I think, a fun one for the audience. Lopes is writing a book based on the idea. I had a chance to review an early draft, and now I'm looking forward to the finished product. I'm sure I'll learn something new from it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

October 04, 2013 3:12 PM

StrangeLoop: Rich Hickey on Channels and Program Design

[My notes on StrangeLoop 2013: Table of Contents]

Rich Hickey setting up for his talk

Rich Hickey spoke at one of the previous StrangeLoops I attended, but this was the first time I saw one of his talks in person. I took the shaky photo seen at the right as proof. I must say, he gives a good talk.

The title slide read "Clojure core.async Channels", but Hickey made a disclaimer upfront: this talk would be about what channels are and why Clojure has them, not the details of how they are implemented. Given that there were plenty of good compiler talks elsewhere at the conference, this was a welcome change of pace. It was also a valuable one, because many more people will benefit from what Hickey taught about program design than would have benefited from staring at screens full of Clojure macros. The issues here are important ones, and ones that few programmers understand very well.

The fundamental problem is this: Reactive programs need to be machines, but functions make bad machines. Even sequences of functions.

The typical solution to this problem these days is to decompose the system logic into a set of response handlers. Alas, this leads to callback hell, a modern form of spaghetti code. Why? Even though the logic has been decomposed into pieces, it is still "of a piece", essentially a single logical entity. When this whole is implemented across multiple handlers, we can't see it as a unit, or talk about it easily. We need to, though, because we need to design the state machine that it comprises.

Clojure's solution to the problem, in the form of core.async, is the channel. This is an implementation of Tony Hoare's communicating sequential processes. One of the reasons that Hickey likes this approach is that it lets a program work equally well in fully threaded apps and in apps with macro-generated inversion of control.

Hickey then gave some examples of code using channels and talked a bit about the implications of the implementation for system design. For instance, the language provides handy put! and take! operators for integrating channels with code at the edge of non-core.async systems. I don't have much experience with Clojure, so I'll have to study a few examples in detail to really appreciate this.
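For readers who, like me, don't think in Clojure yet, the essence of a channel translates into other languages. Here is a rough analogy in Python, using a bounded queue between two threads -- my sketch, not core.async:

    # A rough analogy in Python, not core.async itself: a bounded queue
    # acts as the channel, decoupling producer from consumer.
    import queue
    import threading

    channel = queue.Queue(maxsize=10)       # bounded, as Hickey advises

    def producer():
        for i in range(5):
            channel.put(i)                  # rough analogue of put!
        channel.put(None)                   # sentinel: no more values

    def consumer():
        while (item := channel.get()) is not None:   # rough analogue of take!
            print("got", item)

    threading.Thread(target=producer).start()
    consumer()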

For me, the most powerful part of the talk was an extended discussion of communication styles in programs. Hickey focused on the trade-offs between direct communication via shared state and indirect communication via channels. He highlighted six or seven key distinctions between the two and how these affect the way a system works. I can't do this part of the talk justice, so I suggest you watch the video of the talk. I plan to watch it again myself.

I had always heard that Hickey was eminently quotable, and he did not disappoint. Here are three lines that made me smile:

  • "Friends don't let friends put logic in handlers."
  • "Promises and futures are the one-night stands" of asynchronous architecture.
  • "Unbounded buffers are a recipe for a bad program. 'I don't want to think about this bug yet, so I'll leave the buffer unbounded.'"

That last one captures the indefatigable optimism -- and self-delusion -- that characterizes so many programmers. We can fix that problem later. Or not.

In the end, this talk demonstrates how a good engineer approaches a problem. Clojure and its culture reside firmly in the functional programming camp. However, Hickey recognizes that, for the problem at hand, a sequence of functional calls is not the best solution. So he designs a solution that allows programmers to do FP where it fits best and to do something else where FP doesn't. That's a pragmatic way to approach problems.

Still, this solution is consistent with Clojure's overall design philosophy. The channel is a first-class object in the language. It converts a sequence of functional calls into data, whereas callbacks implement the sequence in code. As code, we see the sequence only at run-time. As data, we see it in our program and can use it in all the ways we can use any data. This consistent focus on making things into data is an attractive part of the Clojure language and the ecosystem that has been cultivated around it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 27, 2013 4:26 PM

StrangeLoop: Add All These Things

[My notes on StrangeLoop 2013: Table of Contents]

I took a refreshing walk in the rain over the lunch hour on Friday. I managed to return late and, as a result, missed the start of Avi Bryant's talk on algebra and analytics. Only a few minutes, though, which is good. I enjoyed this presentation.

Bryant didn't talk about the algebra we study in eighth or ninth grade, but the mathematical structure math students encounter in a course called "abstract" or "modern" algebra. A big chunk of the talk focused on an even narrower topic: why +, and operators like it, are cool.

One reason is that grouping doesn't matter. You can add 1 to 2, and then add 4 to the result, and have the same answer as if you added 4 to 1, and then added 2 to the result. This is, of course, the associative property.

Another is that order doesn't matter. 1 + 2 is the same as 2 + 1. That's the commutative property.

Yet another is that, if you have nothing to add, you can add nothing and have the same value you started with. 4 + 0 = 4. 0 is the identity element for addition.

Finally, when you add two numbers, you get a number back. This is not quite as true in computers as in math, because an operation can cause an overflow or underflow and create an error. But looked at through fuzzy lenses, this is true in our computers, too. This is the closure property for addition of integers and real numbers.

Addition isn't the only operation on numbers that has these properties. Finding the maximum value in a set of numbers does, too. The maximum of two numbers is a number. max(x,y) = max(y,x), and if we have three or more numbers, it doesn't matter how we group them; max will find the maximum among them. The identity value is tricky -- there is no smallest number... -- but in practice we can finesse this by using the smallest number of a given data type, or even allowing max to take "nothing" as a value and return its other argument.

When we see a pattern like this, Bryant said, we should generalize:

  • We have a function f that takes two values from a set and produces another member of the same set.
  • The order of f's arguments doesn't matter.
  • The grouping of f's arguments doesn't matter.
  • There is some identity value, a conceptual "zero", in the sense that f(i,zero) for any i is simply i.

There is a name for this pattern. When we have such a set and operation, we have a commutative monoid.

     ⊕ : S × S → S
     x ⊕ y = y ⊕ x
     x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z
     x ⊕ 0 = x

I learned about this and other such patterns in grad school when I took an abstract algebra course for kicks. No one told me at the time that I'd be seeing them again as soon as someone created the Internet and it unleashed a torrent of data on everyone.

Just why we are seeing the idea of a commutative monoid again was the heart of Bryant's talk. When we have data coming into our company from multiple network sources, at varying rates of usage and data flow, and we want to extract meaning from the data, it can be incredibly handy if the meaning we hope to extract -- the sum of all the values, or the largest -- can be computed using a commutative monoid. You can run multiple copies of your function at the entry point of each source, and combine the partial results later, in any order.

Bryant showed this much more attractively than that, using cute little pictures with boxes. But then, there should be an advantage to going to the actual talk... With pictures and fairly straightforward examples, he was able to demystify the abstract math and deliver on his talk's abstract:

A mathematician friend of mine tweeted that anyone who doesn't understand abelian groups shouldn't build analytics systems. I'd turn that around and say that anyone who builds analytics systems ends up understanding abelian groups, whether they know it or not.

That's an important point. Just because you haven't studied group theory or abstract algebra doesn't mean you shouldn't do analytics. You just need to be prepared to learn some new math when it's helpful. As programmers, we are all looking for opportunities to capitalize on patterns and to generalize code for use in a wider set of circumstances. When we do, we may re-invent the wheel a bit. That's okay. But also look for opportunities to capitalize on patterns recognized and codified by others already.

Unfortunately, not all data analysis is as simple as summing or maximizing. What if I need to find an average? The average operator doesn't form a commutative monoid with numbers. It falls short in almost every way. But, if you switch from the set of numbers to the set of pairs [n, c], where n is a number and c is a count of how many times you've seen n, then you are back in business. Counting is addition.

So, we save the average operation itself as a post-processing step on a set of number/count pairs. This turns out to be a useful lesson, as finding the average of a set is a lossy operation: it loses track of how many numbers you've seen. Lossy operations are often best saved for presenting data, rather than building them directly into the system's computation.
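In code, the trick is small. A sketch of the idea (mine, not Bryant's):

    # Average as a commutative monoid over (sum, count) pairs.
    from functools import reduce

    def prepare(n):
        return (n, 1)

    def combine(a, b):          # associative and commutative
        return (a[0] + b[0], a[1] + b[1])

    IDENTITY = (0, 0)

    def present(pair):          # the lossy step, saved for the end
        total, count = pair
        return total / count if count else None

    # Partial results from two "sources" can be merged in either order.
    left  = reduce(combine, map(prepare, [1, 2, 3]), IDENTITY)
    right = reduce(combine, map(prepare, [4, 5]), IDENTITY)
    assert present(combine(left, right)) == present(combine(right, left)) == 3.0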

Likewise, finding the top k values in a set of numbers (a generalized form of maximum) can be handled just fine as long as we work on lists of numbers, rather than numbers themselves.

This is actually one of the Big Ideas of computer science. Sometimes, we can use a tool or technique to solve a problem if only we transform the problem into an equivalent one in a different space. CS theory courses hammer this home, with oodles of exercises in which students are asked to convert every problem under the sun into 3-SAT or the clique problem. I look for chances to introduce my students to this Big Idea when I teach AI or any programming course, but the lesson probably gets lost in the noise of regular classwork. Some students seem to figure it out by the time they graduate, though, and the ones who do are better at solving all kinds of problems (and not by converting them all to 3-SAT!).

Sorry for the digression. Bryant didn't talk about 3-SAT, but he did demonstrate several useful problem transformations. His goal was more practical: how we can use this idea of a commutative monoid to extract as many interesting results from the stream of data as possible.

This isn't just an academic exercise, either. When we can frame several problems in this way, we are able to use a common body of code for the processing. He called this body of code an aggregator, comprising three steps (sketched in code after the list):

  • prepare the data by transforming it into the space of a commutative monoid
  • reduce the data to a single value in that space, using the appropriate operator
  • present the result by transforming it back into its original space
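Here is a minimal sketch of such an aggregator in Python -- my own illustration of the three steps, not Bryant's code -- using top-k as the operation:

    # A generic aggregator: prepare, then reduce, then present.
    from functools import reduce

    def aggregate(items, prepare, combine, identity, present):
        return present(reduce(combine, (prepare(x) for x in items), identity))

    # Top-k (here k = 3) as an operation on lists, as described above.
    def top3(a, b):
        return sorted(a + b, reverse=True)[:3]

    print(aggregate([5, 1, 9, 7, 3],
                    prepare=lambda x: [x],
                    combine=top3,
                    identity=[],
                    present=lambda xs: xs))   # -> [9, 7, 5]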

In practice, transforming the problem into the space of a monoid presents challenges in the implementation. For example, it is straightforward to compute the number of unique values in a collection of streams by transforming each item into a set of size one and then using set union as the operator. But union requires unbounded space, and this can be inconvenient when dealing with very large data sets.
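In its exact, space-hungry form, the unique count is a three-argument change to the aggregator sketched above:

    # Exact unique count, reusing the aggregate() sketch above.
    n = aggregate(["a", "b", "a", "c"],
                  prepare=lambda x: {x},
                  combine=lambda s, t: s | t,
                  identity=set(),
                  present=len)
    print(n)   # -> 3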

One approach is to compute an estimated number of uniques using a hash function and some fancy arithmetic. We can make the expected error in the estimate smaller and smaller by using more and more hash functions. (I hope to write this up in simple code and blog about it soon.)

Bryant looked at one more problem, computing frequencies, and then closed with a few more terms from group theory: semigroup, group, and abelian group. Knowing these terms -- actually, simply knowing that they exist -- can be useful even for the most practical of practitioners. They let us know that there is more out there, should our problems become harder or our needs become larger.

That's a valuable lesson to learn, too. You can learn all about abelian groups in the trenches, but sometimes it's good to know that there may be some help out there in the form of theory. Reinventing wheels can be cool, but solving the problems you need solved is even cooler.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

September 22, 2013 3:51 PM

StrangeLoop: Jenny Finkel on Machine Learning at Prismatic

[My notes on StrangeLoop 2013: Table of Contents]

The conference opened with a talk by Jenny Finkel on the role machine learning plays at Prismatic, the customized newsfeed service. It was a good way to start the conference, as it introduced a few themes that would recur throughout, had a little technical detail but not too much, and reported a few lessons from the trenches.

Prismatic is trying to solve the discovery problem: finding content that users would like to read but otherwise would not see. This is more than simply a customized newsfeed from a singular journalistic source, because it draws from many sources, including other readers' links, and because it tries to surprise readers with articles that may not be explicitly indicated by their profiles.

The scale of the problem is large, but different from the scale of the raw data facing Twitter, Facebook, and the like. Finkel said that Prismatic is processing only about one million timely docs at a time, with the set of articles turning over roughly weekly. The company currently uses 5,000 categories to classify the articles, though that number will soon go up to the order of 250,000.

The complexity here comes from the cross product of readers, articles, and categories, along with all of the features used to try to tease out why readers like the things they do and don't like the others. On top of this are machine learning algorithms that are themselves exponentially expensive to run. And with articles turning over roughly weekly, they have to be amassing data, learning from it, and moving on constantly.

The main problem at the heart of a service like this is: What is relevant? Everywhere one turns in AI, one sees this question, or its more general cousin, Is this similar? In many ways, this is the problem at the heart of all intelligence, natural and artificial.

Prismatic's approach is straight from AI, too. They construct a feature vector for each user/article pair and then try to learn weights that, when applied to the values in a given vector, will rank desired articles high and undesired articles low. One of the key challenges when doing this kind of work is to choose the right features to use in the vector. Finkel mentioned a few used by Prismatic, including "Does the user follow this topic?", "How many times has the reader read an article from this publisher?", and "Does the article include a picture?"
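The scoring machinery itself is tiny; the craft is in choosing the features. A schematic sketch in Python, with feature names and weights invented for illustration:

    # Schematic ranking by dot product; feature names and weights are invented.
    def score(weights, features):
        return sum(weights[name] * value for name, value in features.items())

    weights = {"follows_topic": 2.0, "reads_from_publisher": 0.5, "has_picture": 0.3}

    articles = {
        "article-1": {"follows_topic": 1, "reads_from_publisher": 4, "has_picture": 1},
        "article-2": {"follows_topic": 0, "reads_from_publisher": 9, "has_picture": 0},
    }

    ranked = sorted(articles, key=lambda a: score(weights, articles[a]), reverse=True)
    print(ranked)   # higher-scoring articles first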

With a complex algorithm, lots of data, and a need to constantly re-learn, Prismatic has to make adjustments and take shortcuts wherever possible in order to speed up the process. This is a common theme at a conference where many speakers are from industry. First, learn your theory and foundations; then learn the pragmatics and heuristics needed to turn basic techniques into the backbone of practical applications.

Finkel shared one pragmatic idea of this sort that Prismatic uses. They look for opportunities to fold user-specific feature weights into user-neutral features. This enables their program to compute many user-specific dot products using a static vector.

She closed the talk with five challenges that Prismatic has faced that other teams might be on the lookout for:

Bugs in the data. In one case, one program was updating a data set before another program could take a snapshot of the original. With the old data replaced by the new, they thought their ranker was doing better than it actually was. As Finkel said, this is pretty typical for an error in machine learning. The program doesn't crash; it just gives the wrong answer. Worse, you don't even have reason to suspect something is wrong in the offending code.

Presentation bias. Readers tend to look at more of the articles at the top of a list of suggestions, even if they would have enjoyed something further down the list. This is a feature of the human brain, not of computer programs. Any time we write programs that interact with people, we have to be aware of human psychology and its effects.

Non-representative subsets. When you are creating a program that ranks things, its whole purpose is to skew a set of user/article data points toward the subset of articles that the reader most wants to read. But this subset probably doesn't have the same distribution as the full set, which hampers your ability to use statistical analysis to draw valid conclusions.

Statistical bleeding. Sometimes, one algorithm looks better than it is because it benefits from the performance of another. Consider two ranking algorithms, one an "explorer" that seeks out new content and one an "exploiter" that recommends articles that have already been found to be popular. In comparing their performances, the exploiter will tend to look better than it is, because it benefits from the successes of the explorer without being penalized for its failures. It is crucial to make sure that one measure you take is not dependent on another. (Thanks to Christian Murphy for the prompt!)

Simpson's Paradox. The iPhone and the web have different clickthrough rates. They once found themselves in a situation where one recommendation algorithm performed worse than another on both platforms, yet better overall. This can really disorient a team as it assesses the results of its experiments. The issue here is usually a hidden variable that is confounding the results.
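A set of invented clickthrough counts shows how the paradox can arise:

    # Invented clickthrough counts (clicks, impressions) illustrating the paradox.
    A = {"iphone": (81, 87),   "web": (192, 263)}
    B = {"iphone": (234, 270), "web": (55, 80)}

    def rate(clicks, impressions):
        return clicks / impressions

    for platform in ("iphone", "web"):
        assert rate(*A[platform]) > rate(*B[platform])   # A wins on each platform

    def overall(counts):
        clicks = sum(c for c, _ in counts.values())
        impressions = sum(n for _, n in counts.values())
        return clicks / impressions

    assert overall(B) > overall(A)   # ... and yet B wins overall

Algorithm A wins on each platform, yet B wins overall, because B received most of its traffic on the platform where clickthroughs come easily. The platform is the hidden variable.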

(I remember discussing this classic statistical illusion with a student in my early years of teaching, when we encountered a similar illusion in his grade. I am pretty sure that I enjoyed our discussion of the paradox more than he did...)

This part of a talk is of great value to me. Hearing about another team's difficulties rarely helps me avoid the same problems in my own projects, but it often does help me recognize those problems when they occur and begin thinking about ways to work around them. This was a good way for me to start the conference.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

September 10, 2013 3:40 PM

A Laugh at My Own Expense

This morning presented a short cautionary tale for me and my students, from a silly mistake I made in a procmail filter.

Back story: I found out recently that I am still subscribed to a Billy Joel fan discussion list from the 1990s. The list has been inactive for years, or I would have been filtering its messages to a separate mailbox. Someone has apparently hacked the list, as a few days ago it started spewing hundreds of spam messages a day.

I was on the road for a few days after the deluge began and was checking mail through a shell connection to the mail server. Because I was busy with my trip and checking mail infrequently, I just deleted the messages by hand. When I got back, Mail.app soon learned they were junk and filtered them away for me. But the spam was still hitting my inbox on the mail server, where I read my mail occasionally even on campus.

After a session on the server early this morning, I took a few minutes to procmail them away. Every message from the list has a common pattern in the Subject: line, so I copied it and pasted it into a new procmail recipe to send all list traffic to /dev/null:

    :0
    * ^Subject.*[billyjoel]
    /dev/null

Do you see the problem? Of course you do.

I didn't at the time. My blindness probably resulted from a combination of the early hour, a rush to get over to the gym, and the tunnel vision that comes from focusing on a single case. It all looked obvious.

This mistake offers programming lessons at several different levels.

The first is at the detailed level of the regular expression. Pay attention to the characters in your regex -- all of them. Those brackets really are in the Subject: line, but by themselves they mean something else in the regex. I need to escape them:

    * ^Subject.*\[billyjoel\]

This relates to a more general piece of problem-solving advice. Step back from the individual case you are solving and think about the code you are writing more generally. When you are focused on the annoying messages from the list, the brackets are just characters in a stream. Looked at from the perspective of the file of procmail recipes, they are metacharacters.
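The same trap is easy to demonstrate in any regex engine. A quick illustration using Python's re module -- my example, not the procmail matcher itself:

    # The same trap, illustrated with Python's re module.
    import re

    subject = "Subject: [billyjoel] casual fan chat"

    # Unescaped, the brackets form a character class: this matches any
    # single character drawn from b, i, l, y, j, o, e -- far too much.
    print(bool(re.search(r"Subject.*[billyjoel]", "Subject: hi Bill")))   # True!

    # Escaped, the brackets are literal and match only the list tag.
    print(bool(re.search(r"Subject.*\[billyjoel\]", subject)))            # True
    print(bool(re.search(r"Subject.*\[billyjoel\]", "Subject: hi Bill"))) # False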

The second is at the level of programming practice. Don't /dev/null something until you know it's junk. Much better to send the offending messages to a junk mbox first:

    * ^Subject.*\[billyjoel\]
    in.tmp.junk

Once I see that all and only the messages from the list are being matched by the pattern, I can change that line to send list traffic where it belongs. That's a specific example of the sort of defensive programming that we all should practice. Don't commit to solutions too soon.

This, too, relates to more general programming advice about software validation and verification. I should have exercised a few test cases to validate my recipe before turning it loose unsupervised on my live mail stream.

I teach my students this mindset and program that way myself, at least most of the time. Of course, the time you most need test cases will be the time you don't write them.

The day provided a bit of irony to make the story even better. The topic of today's session in my compilers course? Writing regular expressions to describe the tokens in a language. So, after my mail admin colleague and I had a good laugh at my expense, I got to tell the story to my students, and they did, too.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

September 05, 2013 3:13 PM

Code Should Speak for Itself

Matt Welsh recently posted a blog entry on experiences rewriting a large system in Go and some of his thoughts about the language afterwards. One of the few shortcomings in his mind had to do with how Go's type inference made it hard for him to know the type of a variable. Sure, an IDE or other tool could help, but Welsh says:

I staunchly refuse to edit code with any tool that requires using a mouse.

That's mostly how I feel, too, though I'm more an emacs man. I use IDEs and appreciate what they can give. I used Eclipse a fair amount back when it was young, but it's gotten so big and complex these days that I shudder at the thought of starting it up. RubyMine gave me many pleasant moments when I used it for a while a couple of years ago.

When I use IDEs, I prefer simpler IDEs, such as Dr. Racket or even Dr. Java, to complex ones anyway. They don't generally provide as much support, but they do help. When not helping, they mostly stay out of the way while I am writing code.

For me, the key word in Welsh's refusal is "require". If I need a mouse or a lot of IDE support just to use a language, that's a sign that the code either isn't telling me everything it should, or there's too much to think about.

Code should speak for itself, and it should say only those things that the programmer needs to know.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

August 31, 2013 11:32 AM

A Good Language Conserves Programmer Energy

Game programmer Jeff Wofford wrote a nice piece on some of the lessons he learned by programming a game in forty-eight hours. One of the recurring themes of his article is the value of a high-powered scripting language for moving fast. That's not too surprising, but I found his ruminations on this phenomenon to be interesting. In particular:

A programmer's chief resource is the energy of his or her mind. Everything that expends or depletes that energy makes him or her less effective, more tired, and less happy.

A powerful scripting language sitting atop the game engine is one of the best ways to conserve programmer energy. Sometimes, though, a game programmer must work hard to achieve the performance required by users. For this reason, Wofford goes out of his way not to diss C++, the tool of choice for many game programmers. But C++ is an energy drain on the programmer's mind, because the programmer has to be in a constant state of awareness of machine cycles and memory consumption. This is where the trade-off with a scripting language comes in:

When performance is of the essence, this state of alertness is an appropriate price to pay. But when you don't have to pay that price -- and in every game there are systems that have no serious likelihood of bottlenecking -- you will gain mental energy back by essentially ignoring performance. You cannot do this in C++: it requires an awareness of execution and memory costs at every step. This is another argument in favor of never building a game without a good scripting language for the highest-level code.

I think this is true of almost every large system. I sure wish that the massive database systems at the foundation of my university's operations had scripting languages sitting on top. I even want to script against the small databases that are the lingua franca of most businesses these days -- spreadsheets. The languages available inside the tools I use are too clunky or not powerful enough, so I turn to Ruby.

Unfortunately, most systems don't come with a good scripting language. Maybe the developers aren't allowed to provide one. Too many CS grads don't even think of "create a mini-language" as a possible solution to their own pain.

Fortunately for Wofford, he has both the skills and the inclination. One of his to-dos after the forty-eight hour experience is all about language:

Building a SWF importer for my engine could work. Adding script support to my engine and greatly refining my tools would go some of the distance. Gotta do something.

Gotta do something.

I'm teaching our compiler course again this term. I hope that the dozen or so students in the course leave the university knowing that creating a language is often the right next action and having the skills to do it when they feel compelled to do something.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

August 22, 2013 2:45 PM

A Book of Margin Notes on a Classic Program?

I recently stumbled across an old How We Will Read interview with Clive Thompson and was intrigued by his idea for a new kind of annotated book:

I've had this idea to write a provocative piece, or hire someone to write it, and print it on-demand it with huge margins, and then send it around to four people with four different pens -- red, blue, green and black. It comes back with four sets of comments all on top of the text. Then I rip it all apart and make it into an e-book.

This is an interesting mash-up of ideas from different eras. People have been writing in the margins of books for hundreds of years. These days, we comment on blog entries and other on-line writing in plain view of everyone. We even comment on other people's comments. Sites such as Findings.com, home of the Thompson interview, aim to bring this cultural practice to everything digital.

Even so, it would be pretty cool to see the margin notes of three or four insightful, educated people, written independently of one another, overlaid in a single document. Presentation as an e-book offers another dimension of possibilities.

Ever the computer scientist, I immediately began to think of programs. A book such as Beautiful Code gives us essays from master programmers talking about their programs. Reading it, I came to appreciate design decisions that are usually hidden from readers of finished code. I also came to appreciate the code itself as a product of careful thought and many iterations.

My thought is: Why not bring Thompson's mash-up of ideas to code, too? Choose a cool program, perhaps one that changed how we work or think, or one that unified several ideas into a standard solution. Print it out with huge margins, and send it to three or four insightful, thoughtful programmers who read it, again or for the first time, and mark it up with their own thoughts and ideas. It comes back with four sets of comments all on top of the text. Rip it apart and create an e-book that overlays them all in a single document.

Maybe we can skip the paper step. Programming tools and Web 2.0 make it so easy to annotate documents, including code, in ways that replace handwritten comments. That's how most people operate these days. I'm probably showing my age in harboring a fondness for the written page.

In any case, the idea stands apart from the implementation. Wouldn't it be cool to read a book that interleaves and overlays the annotations made by programmers such as Ward Cunningham and Grady Booch as they read John McCarthy's first Lisp interpreter, the first Fortran compiler from John Backus's team, QuickDraw, or Qmail? I'd stand in line for a copy.

Writing this blog entry only makes the idea sound more worth doing. If you agree, I'd love to hear from you -- especially if you'd like to help. (And especially if you are Ward Cunningham and Grady Booch!)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

August 13, 2013 3:11 PM

Refactoring is Underrated

I think revision is hugely underrated. It is very seldom recognized as a place where the higher creativity can live, or where it can manifest. I think it was Yeats who said that literary revision was the only place in life where a man could truly improve himself.

-- William Gibson, The Art of Fiction No. 211

I find it a lot easier to come up with clean, simple designs when I have code in my hands to work with, rather than requirements. Even detailed requirements are abstract with respect to our programs. Code is the raw material.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

July 25, 2013 10:01 AM

Software is Hard...

... but here I am, writing another program.

She could feel her own lungs suspended as she worked, and she forced herself to inhale, suddenly frustrated by the insurmountable inability to make the paint correspond exactly and precisely to what was in her head. It was always doomed from the outset, but here she was, making another goddamned painting.

(From The Great Man, by Kate Christensen.)


Posted by Eugene Wallingford | Permalink | Categories: Software Development

July 03, 2013 10:22 AM

Programming for Everyone, Venture Capital Edition

Christina Cacioppo left Union Square Ventures to learn how to program:

Why did I want to do something different? In part, because I wanted something that felt more tangible. But mostly because the story of the internet continues to be the story of our time. I'm pretty sure that if you truly want to follow -- or, better still, bend -- that story's arc, you should know how to write code.

So, rather than settle for her lot as a non-programmer, beyond the accepted school age for learning these things -- technology is a young person's game, you know -- Cacioppo decided to learn how to build web apps. And build one.

When did we decide our time's most important form of creation is off-limits? How many people haven't learned to write software because they didn't attend schools that offered those classes, or the classes were too intimidating, and then they were "too late"? How much better would the world be if those people had been able to build their ideas?

Yes, indeed.

These days, she is enjoying the experience of making stuff: trying ideas out in code, discarding the ones that don't work, and learning new things every day. Sounds like a programmer to me.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

June 26, 2013 2:30 PM

An Opportunity to Learn, Born of Deprivation

Earlier this summer, my daughter was talking about something one of her friends had done with Instagram. As a smug computer weenie, I casually mentioned that she could do that, too.

She replied, "Don't taunt me, Dad."

You see, no one in our family has a cell phone, smart or otherwise, so none of us use Instagram. That's not a big deal for dear old dad, even though (or perhaps because) he's a computer scientist. But she is a teenager growing up in an entirely different world, filled with technology and social interaction, and not having a smart phone must surely seem like a form of child abuse. Occasionally, she reminds us so.

This gave me a chance to explain that Instagram filters are, at their core, relatively simple little programs, and that she could learn to write them. And if she did, she could run them on almost any computer, and make them do things that even Instagram doesn't do.

I had her attention.

So, this summer I am going to help her learn a little Python, using some of the ideas from media computation. At the end of our first pass, I hope that she will be able to manipulate images in a few basic ways: changing colors, replacing colors, copying pixels, and so on. Along the way, we can convert color images to grayscale or sepia tones, posterize images, embed images, and make simple collages.
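For the curious, the core of such a filter is only a few lines. Here is a minimal grayscale sketch using the Pillow library -- an assumption on my part, since media computation courses typically supply their own image library:

    # A minimal grayscale filter using Pillow (pip install Pillow).
    # Media computation courses usually supply their own image library;
    # this is the same idea in a mainstream one. File names are hypothetical.
    from PIL import Image

    img = Image.open("photo.jpg").convert("RGB")
    pixels = img.load()

    for x in range(img.width):
        for y in range(img.height):
            r, g, b = pixels[x, y]
            gray = (r + g + b) // 3            # simple average of the channels
            pixels[x, y] = (gray, gray, gray)

    img.save("photo-gray.jpg")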

That will make her happy. Even if she never feels the urge to write code again, she will know that it's possible. And that can be empowering.

I have let my daughter know that we probably will not write code that does as good a job as what she can see in Instagram or Photoshop. Those programs are written by pros, and they have evolved over time. I hope, though, that she will appreciate how simple the core ideas are. As James Hague said in a recent post, the key ideas in most apps require relatively few lines of code, with lots and lots of lines wrapped around them to handle edge cases and plumbing. We probably won't write much code for plumbing... unless she wants to.

Desire and boredom often lead to creation. They also lead to the best kind of learning.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

June 25, 2013 2:46 PM

Data Ingestion

This sentence in Reid Draper's Data Traceability made me laugh recently:

I previously worked in the data ingestion team at a music data company.

Nice turn of phrase. I suppose that another group digests the data, and yet another expels it.

Draper's sentence came to mind again yesterday while I was banging my head on a relatively simple problem, transforming a CSV file generated by my university's information system, replete with embedded quotes and commas, into something more manageable. As data ingestion goes, this isn't much of a problem at all. There are plenty of libraries that do the heavy lifting for you, in most any language you choose, Ruby included.

Of course, I was just writing a quick-and-dirty script, so I was rolling my own CSV-handling code. As usual, "quick and dirty" is often dirty, but rarely quick. I tweeted a bit of my frustration, in response to which @geoffwozniak wrote:

Welcome to the world of enterprise data ingress.

If I had to deal with these files every day, I might head for the egress. ... or master a good library, so that I could bang my head on more challenging data ingestion problems.
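For the record, the library route really is short. A minimal sketch in Python, with a hypothetical file name:

    # What mastering a good library buys you, in Python at least:
    # the csv module handles embedded quotes and commas for free.
    import csv

    with open("enrollment.csv", newline="") as f:   # file name hypothetical
        for row in csv.reader(f):
            print(row)    # each row is a list of properly unquoted fields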


Posted by Eugene Wallingford | Permalink | Categories: Software Development

June 20, 2013 12:32 PM

Agile Moments: Stories, Tests, and Refactoring in Visual Design

In A Place for Sharing Ideas and Stories, designers Teehan+Lax tell the story of their role in creating Medium, "a better place to read and write things that matter". The section "Forging ahead", about features added to the platform after its launch, made me think of some of the ideas we use when designing code test-first, only at a much higher level.

We reset by breaking the team up into new feature teams. Each feature team would have at least one designer, one front-end developer and a back-end developer. Some teams would take on multiple features depending on their complexity. We used one page briefs that were easy to write, easy to understand and helped guide the teams when working through their feature(s).

They consisted of questions like:

  • Who is this page for?
  • What problem does this page solve for the user?
  • How do we know they need it?
  • What is the primary action we want users to take on this page?
  • What might prompt a user to take this action?
  • How will we know that this page is doing what we want it to do?

This bullet list embodies several elements of agile development. For each feature, the brief acts like a story card that boils the feature down to a clear need of the user, a clear action, and, most important in my mind, a test: How will we know that this page is doing what we want it to do? In a lot of my work, this is a crucial element. As Kent Beck says, "How will I know I'm done?"

The paragraph preceding the list highlights a couple of other attributes common to agile development. One, teams are working on stories in parallel on a common artifact. Two, the teams include a designer, a front-end developer, and a back-end developer. The team doesn't include a user, whose presence can be a huge advantage for developers, but the author mentions elsewhere that nearly everyone on the team was a user:

As the internal product progressed and its features and capabilities became clearer, we would reduce the amount of ad hoc meetings and focus on getting stuff built into the product so we could actually use it. Talking about work is great at first, but usage is what breathes life into the product.

The author also stresses the value of physical co-location of the designers and developers over even well-supported electronic communication, which echoes for me the value of having a user in the room with the developers.

What happened as features were enhanced or added by separate teams? "... maintaining some sort of design integrity across the entire product."

This thing was about to get full of user choices (read: complexity) in a hurry -- The product was now at a critical point in its life.

Of course, complexity and incoherence can creep into products even when they are designed and built by one team, when it works on multiple features in rapid succession.

The solution for Medium sounded familiar:

We took a week off current fixes and features and focused on redesigning three pages from scratch: Home, Collection and Post. Some of it went live, some of it went away.

And:

We hated to see some of the stuff we'd designed (and even built) not go live, but it needed to die so the product could grow through simplification.

This reminded me a lot of refactoring, even with differences from the way we refactor at the code level. As teams added features without considering the effect of the changes on the global structure of the product, they accumulated something akin to "design debt". So after a while they dedicated time to paying off the debt and bringing back to the product the sense of wholeness that had been lost. I am curious to know whether the teams ever looked back at the page briefs to verify that the redesigned pages still did what they wanted them to do. That would be the equivalent of "running the tests".

We programmers really do have it nice. Lines and units of code are a bit more separable than the visual design elements of a product. This allows us to refactor all the time, if we are so inclined, not just in batch after a large set of changes. The presence of concrete tests, written as code, allows us to test the efficacy of our changes relatively easily before we move forward. While down in the trenches writing code, it's easy to forget just how liberating -- and empowering -- this combination of separability and testability is for design and redesign.

It's also easy to forget sometimes that similar challenges face designers and creators across domains and disciplines. Many of the same themes run through the stories of the things we create, whether software in the small, software in the large, or physical artifacts. Reading Teehan+Lax's story reminded me of that, too.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

May 28, 2013 2:57 PM

The Willingness to Delete Working Code

Don DeLillo

... and to do code katas, too:

I find I'm more ready to discard pages than I used to be. I used to look for things to keep. I used to find ways to save a paragraph or a sentence, maybe by relocating it. Now I look for ways to discard things. If I discard a sentence I like, it's almost as satisfying as keeping a sentence I like. I don't think I've become ruthless or perverse--just a bit more willing to believe that nature will restore itself. The instinct to discard is finally a kind of faith. It tells me there's a better way to do this page even though the evidence is not accessible at the present time.

Says Don DeLillo, in The Art of Fiction No. 135. Even in programming, the willingness to cut a chunk of working code, or to rm -f a file, generally follows from a deep-seated belief that nature will restore itself. We are often happy to find that nature does a better job the second time around.

~~~

PHOTO: Adapted from http://www.flickr.com/photos/thousandrobots/5371974016/, (CC BY-SA 2.0).


Posted by Eugene Wallingford | Permalink | Categories: Software Development

May 23, 2013 3:57 PM

Bad Examples Are Everywhere

... even in the Java class libraries.

Earlier today, @fogus joked:

The java.awt.Point class was only created because someone needed a 1st example to show how to make an object for a book they were writing.

My response was only half-joking:

And I use it as an example of how not to make a class.

If you have ever seen the Point class, you might understand why. Two public instance variables, seven methods for reading and writing the instance variables, and only one method (translate) that could conceivably be considered a behavior. But it's not; it's just a relative writer.

When this is the first class we show our students and ask them to use, we immediately handicap them with an image of objects as buckets of data and programs as manipulation of values. We may as well teach them C or Pascal.
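The contrast is easy to see even outside Java. A caricature in Python -- mine, not anyone's library:

    # A caricature of the two designs, in Python rather than Java.
    # Style 1: a bucket of data. Every caller pokes at raw values.
    class PointData:
        def __init__(self, x, y):
            self.x = x
            self.y = y

    p = PointData(3, 4)
    p.x += 10                      # manipulation of values, everywhere

    # Style 2: an object with behavior. Decisions live inside the class.
    class Point:
        def __init__(self, x, y):
            self._x, self._y = x, y

        def translated(self, dx, dy):
            return Point(self._x + dx, self._y + dy)

        def distance_to(self, other):
            return ((self._x - other._x) ** 2 + (self._y - other._y) ** 2) ** 0.5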

This has long been a challenge for teaching OOP in CS1. If a class has simple enough syntax for the novice programmer to understand, it is generally a bad example of an object. If a class has interesting behavior, it is generally too complex for the novice programmer to understand.

This is one of the primary motivations for authors to create frameworks for their OOP/CS1 textbooks. One of the earliest such frameworks I remember was the Graphics Package (GP) library in Object-Oriented Programming in Pascal, by Connor, Niguidula, and van Dam. Similar approaches have been used in more recent books, but the common thread is an existing set of classes that allow users to use and create meaningful objects right away, even as they learn syntax.

a hadrosaurus, which once roamed the land with legacy CS profs

A lot of these frameworks, GP included, have a Point class as egregious as Java's. But with these frameworks, the misleading Point class need not be the first thing students see, and when it is seen, it is used in a context that consists of rich objects interacting as objects should.

These frameworks create a new challenge for the legacy CS profs among us. We like to "begin with the fundamentals" and have students write programs "from scratch", so that they "understand the entire program" from the beginning. Because, you know, that's the way we learned to program.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

May 21, 2013 3:05 PM

Exercises in Exercises in Style

I just registered for Strange Loop 2013, which doesn't happen until this fall. This has become a popular conference, deservedly so, and it didn't seem like a good idea to wait to register and risk being shut out.

One of the talks I'm looking forward to is by Crista Lopes. I mentioned Crista in a blog entry from last year's Strange Loop, for a talk she gave at OOPSLA 2003 that made an analogy between programming language and natural language. This year, she will give a talk called Exercises in Style that draws inspiration from a literary exercise:

Back in the 1940s, a French writer called Raymond Queneau wrote an interesting book with the title Exercises in Style featuring 99 renditions of the exact same short story, each written in a different style. This talk will shamelessly do the same for a simple program. From monolithic to object-oriented to continuations to relational to publish/subscribe to monadic to aspect-oriented to map-reduce, and much more, you will get a tour through the richness of human computational thought by means of implementing one simple program in many different ways.

If you've been reading this blog for long, you can imagine how much I like this idea. I even checked Queneau's book out of the library and announced on Twitter my plan to read it before the conference. From the response I received, I gather a lot of conference attendees plan to do the same. You gotta love the audience Strange Loop cultivates.

I actually have a little experience with this idea of writing the same program in multiple styles, only on a much smaller scale. For most of the last twenty years, our students have learned traditional procedural programming in their first-year sequence and object-oriented programming in the third course. I taught the third course twice a year for many years. One of the things I often did early in the course was to look at the same program in two forms, one written in a procedural style and one written in OOP. I hoped that the contrast between the programs would help them see the contrast between how we think about programs in the two styles.

I've been teaching functional programming regularly for the last decade, after our students have seen procedural and OO styles in previous courses, but I've rarely done the "exercises in style" demo in this course. For one thing, it is a course on languages and interpreters, not a course on functional programming per se, so the focus is on getting to interpreters as soon as possible. We do talk about differences in the styles in terms of their concepts and the underlying differences (and similarities!) in their implementation. But I think about doing so every time I prep the next offering of the course.

Not doing "exercises in style" can be attractive, too. Small examples can mislead beginning students about what is important, or distract them with concepts they'd won't understand for a while. The wrong examples can damage their motivation to learn. In the procedural/object-oriented comparison, I have had reasonable success in our OOP course with a program for simple bank accounts and a small set of users. But I don't know how well this exercise would work for a larger and more diverse set of styles, at least not at a scale I could use in our courses.

I thought of this when @kaleidic tweeted, "I hope @cristalopes includes an array language among her variations." I do, too, but my next thought was, "Well, now Crista needs to use an example problem for which an array language is reasonably well-suited." If the problem is not well suited to array languages, the solution might look awkward, or verbose, or convoluted. A newcomer to array languages is left to wonder, "Is this a problem with array languages, or with the example?" Human nature being what it is, too many of us are prone to protect our own knowledge and assume that something is wrong with the new style.

An alternative approach is to get learners to suspend their disbelief for a while, learn some nuts and bolts, and then help them to solve bigger problems using the new style. My students usually struggle with this at first, but many of them eventually reach a point where they "get" the style. Solving a larger problem gives them a chance to learn the advantages and disadvantages of their new style, and retroactively learn more about the advantages and disadvantages of the styles they already know well. These trade-offs are the foundation of a really solid understanding of style.

I'm really intrigued by Queneau's idea. It seems that he uses a small example not to teach about each style in depth but rather to give us a taste. What does each style feel like in isolation? It is up to the aspiring writer to use this taste as a starting point, to figure out where each style might take you when used for a story of the writer's choosing.

That's a promising approach for programming styles, too, which is one of the reasons I am so looking forward to Crista's talk. As a teacher, I am a shameless thief of good ideas, so I am looking forward to seeing the example she uses, the way she solves it in the different styles, and the way she presents them to the crowd.

Another reason I'm looking forward to the talk is that I love programs, and this should be just plain fun.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

April 30, 2013 4:53 PM

Exceleration

A student stopped in for a chat late last week to discuss the code he was writing for a Programming Languages assignment. This was the sort of visit a professor enjoys most. The student had clearly put in plenty of time on his interpreter and had studied the code we had built in class. His code already worked. He wanted to talk about ways to make his code better.

Some students never reach this point before graduation. In Coders at Work, Bernie Cosell tells a story about leading teams of new hires at BBN:

I would get people -- bright, really good people, right out of college, tops of their classes -- on one of my projects. And they would know all about programming and I would give them some piece of the project to work on. And we would start crossing swords at our project-review meetings. They would say, "Why are you complaining about the fact that I have my global variables here, that I'm not doing this, that you don't like the way the subroutines are laid out? The program works."

They'd be stunned when I tell them, "I don't care that the program works. The fact that you're working here at all means that I expect you to be able to write programs that work. Writing programs that work is a skilled craft and you're good at it. Now, you have to learn how to program."

I always feel that we have done well by our students if we can get them to the point of caring about their craft before they leave us. Some students come to us already having this mindset, which makes for a very different undergraduate experience. Professors enjoy working with these students, too.

But what stood out to me most from this particular conversation was something the student said, something to this effect:

When we built the lexical addresser in class a few weeks ago, I didn't understand the idea and I couldn't write it. So I studied it over and over until I could write it myself and understand exactly why it worked. We haven't looked at lexical addressing since then, but the work I did has paid off every time we've written code to process programs in our little languages, including this assignment. And I write code more quickly on the exams now, too.

When he finished speaking, I could hardly contain myself. I wish I could bottle this attitude and give it to every student who ever thinks that easy material is an opportunity to take it easy in a course for a while. Or who thinks that the best response to difficult material is to wait for something easier to come along next chapter.

Both situations are opportunities to invest energy in the course. The returns on investment are deeper understanding of the material, sharper programming skills, and the ability to get stuff done.

This student is reaping now the benefits of an investment he made five weeks ago. It's a gift that will keep on giving long after this course is over.

I encourage students to approach their courses and jobs in this way, but the message doesn't always stick. As Clay Stone from City Slickers might say, I'm happy as a puppy with two peters whenever it does.

While walking this morning, I coined a word for this effect: exceleration. It's a portmanteau combining "excellence" and "acceleration", which fits this phenomenon well. As with compound interest and reinvested dividends, this sort of investment builds on itself over time. It accelerates learners on their path to mastering their craft.

Whatever you call it, that conversation made my week.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

April 25, 2013 4:03 PM

Toward a Course on Reading Code

Yesterday, I tweeted absent-mindedly:

It would be cool to teach a course called "Reading Code".

Reading code has been on my mind for a few months now, as I've watched my students read relatively small pieces of code in my Programming Languages course and as I've read a couple of small libraries while riding the exercise bike. Then I ran across John Regehr's short brainstorm on the topic, and something clicked. So I tweeted.

Reading code, or learning to do it, must be on the minds of a lot of people, because my tweet elicited quite a few questions and suggestions. It is an under-appreciated skill. Computer science programs rarely teach students how to do it, and when they do, the teaching is usually implicit: students hear a prof or other students talk about code they've read.

Several readers wanted to know what the course outline would be. I don't know. That's one of the things about Twitter or even a blog: it is easy to think out loud absent-mindedly without having much content in mind yet. It's also easier to express an interest in teaching a course than to design a good one.

Right now, I have only a few ideas about how I'd start. Several readers suggested Code Reading by Spinellis, which is the only textbook I know on the topic. It may be getting a little old these days, but many of the core techniques are still sound.

I was especially pleased that someone recommended Richard Gabriel's idea for an MFA in Software, in which reading plays a big role. I've used some of Dick's ideas in my courses before. Ironically, the last time I mentioned the MFA in Software idea in my blog was in the context of a "writing code" course, at the beginning of a previous iteration of Programming Languages!

That's particularly funny to me because someone replied to my tweet about teaching a course called "Reading Code" with:

... followed by a course "Writing Readable Code".

Anyone who has tried to grade thirty evolving language interpreters each week appreciates this under-appreciated skill.

Chris Demwell responded to my initial tweet with direct encouragement: Write the course, or at least an outline, and post it. I begged indulgence for lack of time as the school year ends and said that maybe I can take a stab this summer. Chris's next tweet attempted to pull me into the 2010s:

1. Write an outline. 2. Post on github. 3. Accept pull requests. Congrats, you're an editor!

The world has indeed changed. This I will do. Watch for more soon. In the meantime, feel free to e-mail me your suggestions. (That's an Old School pull request.)


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

April 09, 2013 3:16 PM

Writing a Book Is Like Flying A Spaceship

I've always liked this quote from the preface of Pragmatic Ajax, by Gehtland, Galbraith, and Almaer:

Writing a book is a lot like (we imagine) flying a spaceship too close to a black hole. One second you're thinking "Hey, there's something interesting over there," and a picosecond later, everything you know and love has been sucked inside and crushed.

Programming can be like that, too, in a good way. Just be sure to exit the black hole on the other side.


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development

April 01, 2013 3:16 PM

Good Sentences, Programming State Edition

I've read a couple of interesting papers recently that included memorable sentences related to program state.

First, Stuart Sierra in On the Perils of Dynamic Scope:

Global state is the zombie in the closet of every Clojure program.

This essay explains the difference between scope and extent, a distinction that affects how easy it is to understand some of what happens in a program with closures and first-class functions with free variables. Sierra also shows the tension between variables of different kinds, using examples from Clojure. An informative read.

Next, Rob Pike in Go at Google: Language Design in the Service of Software Engineering, a write-up of his SPLASH 2012 keynote address:

The motto [of the Go language] is, "Don't communicate by sharing memory, share memory by communicating."

Imperative programmers who internalize this simple idea are on their way to understanding and using functional programming style effectively. The inversion of sharing and communication turns a lot of design and programming patterns inside out.

Pike's notes provide a comprehensive example of how a new language can grow out of the needs of a particular set of applications, rather than out of programming language theory. The result can look a little hodgepodge, but using such a language often feels just fine. (This reminds me of a different classification of languages with similar practical implications.)

~~~~

(These papers weren't published on April Fool's Day, so I don't think I've been punked.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 27, 2013 12:46 PM

Programming Language as Operating System

We are deep in the semester now, using Racket in our programming languages course. I was thinking recently about how little of Racket's goodness we use in this course. We use it primarily as a souped-up R5RS Scheme and handy IDE. Tomorrow we'll see some of Racket's tools for creating new syntax, which will explore one of the rich niches of the system my students haven't seen yet.

I'm thinking about ways to introduce a deeper understanding of The Racket Way, in which domain concepts are programming language constructs and programming languages are extensible and composable. But it goes deeper. Racket isn't just a language, or a set of languages. It is an integrated family of tools to support language creation and use. To provide all these services, Racket acts like an operating system -- and gives you full programmatic access to the system.

(You can watch the video of Matthew Flatt's StrangeLoop talk "The Racket Way" at InfoQ -- and you should.)

The idea is bigger than Racket, of course. Dan Ingalls expressed this idea in his 1981 Byte article, Design Principles Behind Smalltalk:

Operating System: An operating system is a collection of things that don't fit into a language. There shouldn't be one.

Alan Kay talks often about this philosophy. The divide between programming language and operating system makes some things more difficult for programmers, and complicates the languages and tools we use. It also creates a divide in the minds of programmers and imposes unnecessary limitations on what programmers think is possible. One of the things that appealed to me in Flatt's StrangeLoop talk is that it presented a vision of programming without those limits.

There are implications of this philosophy, and costs. Smalltalk isn't just a language, with compilers and tools that you use at your Unix prompt. It's an image, and a virtual machine, and an environment. You don't use Smalltalk; you live inside it.

After you live in Smalltalk for a while, it feels strange to step outside and use other languages. More important, when you live outside Smalltalk and use traditional languages and tools, Smalltalk feels uncomfortable at best and foreboding at worst. You don't learn Smalltalk; you assimilate. -- At least that's what it feels like to many programmers.

But the upside of the "programming language as operating system" mindset you find in Smalltalk and Racket can be huge.

This philosophy generalizes beyond programming languages. emacs is a text editor that subsumes most everything else you do, if you let it. (Before I discovered Smalltalk in grad school, I lived inside emacs for a couple of years.)

You can even take this down to the level of the programs we write. In a blog entry on delimited continuations, Andy Wingo talks about the control this construct gives the programmer over how their programs work, saying:

It's as if you were implementing a shell in your program, as if your program were an operating system for other programs.

When I keep seeing the same idea pop up in different places, with a form that fits the niche, I'm inclined to think I am seeing one of the Big Ideas of computer science.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 22, 2013 9:17 AM

Honest Answers: Learning APL

After eighteen printed pages showing the wonders of APL in A Glimpse of Heaven, Bernard Legrand encourages programmers to give the language a serious look. But he cautions APL enthusiasts not to oversell the ease of learning the language:

Beyond knowledge of the basic elements, correct APL usage assumes knowledge of methods for organising data, and ways specific to APL, of solving problems. That cannot be learnt in a hurry, in APL or any other language.

Legrand is generous in saying that learning APL takes the same amount of time as learning any other language. In my experience, both as a learner of languages and as a teacher of programmers, languages and programming styles that are quite different from one's experience take longer to learn than more familiar topics. APL is one of those languages that requires us to develop entirely new ways of thinking about data and process, so it will take most people longer to learn than yet another C-style imperative language or OO knock-off.

But don't be impatient. Wanting to move too quickly is a barrier to learning and performing at all scales, and too often leads us to give up too soon. If you give up on APL too soon, or on functional programming, or OOP, you will never get to glimpse the heaven that experienced programmers see.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

March 11, 2013 4:25 PM

Does Readability Give a False Sense of Understandability?

In Good for Whom?, Daniel Lyons writes about the readability of code. He starts with Dan Ingalls's classic Design Principles Behind Smalltalk, which places a high value on a system being comprehensible by a single person, and then riffs on readability in J and Smalltalk.

Early on, Lyons made me smile when he noted that, while J is object-oriented, it's not likely to be used that way by many people:

... [because] to use advanced features of J one must first use J, and there isn't a lot of that going on either.

As a former Smalltalker, I know how he feels.

Ultimately, Lyons is skeptical about claims that readability increases the chances that a language will attract a large audience. For one thing, there are too many counterexamples in both directions. Languages like C, which "combines the power of assembly language with the readability of assembly language" [ link ], are often widely used. Languages such as Smalltalk, Self, and Lisp, which put a premium on features such as purity and factorability, which in turn enhance readability, never seem to grow beyond a niche audience.

Lyons's insight is that readability can mislead. He uses as an example the source code of the J compiler, which is written in C but in a style mimicking J itself:

So looking at the J source code, it's easy for me to hold my nose and say, that's totally unreadable garbage; how can that be maintained? But at the same time, it's not my place to maintain it. Imagine if it were written in the most clean, beautiful C code possible. I might be able to dupe myself into thinking I could maintain it, but it would be a lie! Is it so bad that complex projects like J have complex code? If it were a complex Java program instead, I'd still need substantial time to learn it before I would stand a chance at modifying it. Making it J-like means I am required to understand J to change the source code. Wouldn't I have to understand J to change it anyway?

There is no point in misleading readers who have trouble understanding J-like code into thinking they understand the compiler, because they don't. A veneer of readability cannot change that.

I know how Lyons feels. I sometimes felt the same way as I learned Smalltalk by studying the Smalltalk system itself. I understood how things worked locally, within a method and then within a class, but I didn't understand the full network of classes that made up the system. And I had the scars -- and trashed images -- to prove it. Fortunately, Smalltalk was able to teach me many things, including object-oriented programming, along the way. Eventually I came to understand better, if not perfectly, how Smalltalk worked down in its guts, but that took a lot of time and work. Smalltalk's readability made the code accessible to me early, but understanding still took time.

Lyons's article brought to mind another insight about code's understandability that I blogged about many years ago in an entry on comments in code. This insight came from Brian Marick, himself no stranger to Lisp or Smalltalk:

[C]ode can only ever be self-explanatory with respect to an expected reader.

Sometimes, perhaps it's just as well that a language or a program not pretend to be more understandable than it really is. Maybe a barrier to entry is good, by keeping readers out until they are ready to wield the power it affords.

If nothing else, Lyons's stance can be useful as a counterweight to an almost unthinking admiration of readable syntax and programming style.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

March 08, 2013 3:50 PM

Honest Answers: Debugging

We have all been there:

Somehow, at some point in every serious programming project, it always comes down to the last option: stare at the code until you figure it out. I wish I had a better answer, but I don't. Anyway, it builds character.

This is, of course, the last resort. We need to teach students better ways to debug before they have to fall back on what looks a lot like wishful thinking. Fortunately, John Regehr lists this approach as the last resort in his lecture on How to Debug. Before he tells students to fall back to the place we all have to fall back to occasionally, he outlines an explicit, evidence-driven process for finding errors in a program.

I like that Regehr includes this advice for what to do after you find a bug: step back and figure out what error in thinking led to the bug.

An important part of learning from a mistake is diagnosing why you made it, and then taking steps wherever possible to make it difficult or impossible to make the same mistake again. This may involve changing your development process, creating a new tool, modifying an existing tool, learning a new technique, or some other act. But it requires an act. Learning rarely just happens.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

March 07, 2013 3:31 PM

A Programming Koan

Student: "I didn't have time to write 150 lines of code for the homework."

Master: "That's fine. It requires only 50."

Student: "Which 50?"

I have lived this story several times recently, as the homework in my Programming Languages course has become more challenging. A few students do not complete the assignment because they do not spend enough time on the course, either in practice or performance. But most students do spend enough time, both in practice and on the assignment. Indeed, they spend much more time on the assignment than I intend.

When I see their code, I know why. They have written long solutions: code with unnecessary cases, unnecessary special cases, and unnecessary helper functions. And duplication -- lots and lots of duplication. They run out of time to write the ten lines they need to solve the last problem on the set because they spent all their time writing thirty lines on each of the preceding problems, where ten would have done quite nicely.

Don't let anyone fool you. Students are creative. The trick is to help them harness their creativity for good. The opposite of good here is not evil, but bad code -- and too much code.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

February 27, 2013 11:52 AM

Programming, Writing, and Clear Thinking

This Fortune Management article describes a technique Jeff Bezos uses in meetings of his executive team: everyone begins by "quietly absorbing ... six-page printed memos in total silence for as long as 30 minutes".

There is a good reason, Bezos knows, for an emphasis on reading and the written word:

There is no way to write a six-page, narratively structured memo and not have clear thinking.

This is certainly true for programming, that unique form of writing that drives the digital world. To write a well-structured, six-page computer program to perform a task, you have to be able to think clearly about your topic.

Alas, the converse is not true, at least not without learning some specific skills and practicing a lot. But then again, that makes it just like writing narratives.

My Programming Languages students this semester are learning that, for functional programming in Scheme, the size limit is somewhat south of six pages. More along the lines of six lines.

That's a good thing if your goal is clear thinking. Work hard, clarify your thoughts, and produce a small function that moves you closer to your goal. It's a bad thing if your goal is to get done quickly.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

February 25, 2013 4:07 PM

Thelonious Monk Teaches Software Design

Thelonious Monk at the piano

Thelonious Monk was a cool cat on the piano, but I think he could feel at home as a programmer. For example:

The _inside_ of the tune is the part that makes the _outside_ sound good.

Monk would understand that the design of your code matters as much as the design of your program's user interface. That is certainly true for developers who will have to maintain and modify the code over time. But it is also true for your program's users. It's hard for a program to be well designed on the outside, however pretty, when it is poorly designed on the inside.

Don't play _everything_ (or every time); let some things go by. Some music is just _imagined_. What you _don't_ play can be more important than what you _do_ play.

Some of the most effective software design happens in the negative space around software components. Alan Kay's original notions for designing objects stressed the messages that pass between objects more than the objects themselves. When we unfocus our eyes a bit and look at our system as a whole, the parts we don't design can come into focus.

a freehand rendition of the Japanese character 'ma'

And like Monk's missing notes, the code you don't write can be as important as the code you do, or more. The You Aren't Gonna Need It mindset tells us not to solve problems that don't exist yet. Live in the current spec. The result will be a minimal system, in terms of code size, with maximal effect.

You've got to dig it to _dig_ it, you dig?

A lot of people don't dig XP. But that's because they don't _dig_ it, you dig? Sometimes it takes surrendering old habits and thought processes all the way, pulling on a whole new way of approaching music or software, and letting it seep into your being for a while before you can really dig it. Some people begin skeptical but come to dig it after immersion.

This is true for a lot of practices that seem unusual or awkward, not just XP. As Alan Kay is also fond of saying, "Don't dip your toe in the water. Get wet."

~~~~

The quotes above are from a document archived by Steve Lacy, by way of Lists of Note.

PHOTO. Thelonious Monk, circa 1947 by William P. Gottlieb. Source: Wikipedia.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

February 19, 2013 3:36 PM

The Lumbering Lethargy of Software Engineering

Graham Lee makes an ironic observation in Does the history of making software exist?:

"[S]oftware engineering" ... was introduced to suggest a professionalism beyond the craft discipline that went before it, only to become a symbol of lumbering lethargy among adherents of the craft discipline that came after it.

It's funny how terms evolve and communities develop sometimes.

There are a lot of valuable lessons to be learned from the discipline of software engineering. As a mindset, it can shape how we build systems, with good results. Taken too far, it can become a mindset that stifles and overloads the process of making software.

As a university professor, I have to walk a fine line, exposing students to the valuable lessons without turning the creation of software into a lethargic, lumbering process. My courses tend to look different from similar courses taught by software engineering profs. I presume that they feel different to students.

As a programmer, I walk a fine line, too, trying to learn valuable lessons from wherever I can. Often that's from the software engineering community. But I don't want to fall into a mindset where the process becomes more important than the result.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

February 18, 2013 12:59 PM

Code Duplication as a Hint to Think Differently

Last week, one of my Programming Languages students sent me a note saying that his homework solution worked correctly but that he was bothered by some duplicated code.

I was so happy.

Any student who has me for class for very long hears a lot about the dangers of duplication for maintaining code, and also that duplication is often a sign of poor design. Whenever I teach OOP or functional programming, we learn ways to design code that satisfies the DRY principle and ways to eliminate duplication via refactoring when it does sneak in.

I sent the student an answer, along with hearty congratulations for recognizing the duplication and wanting to eliminate it. My advice was a short sequence of transformations, which I work through below.

When I sat down to blog the solution, I had a sense of deja vu... Hadn't I written this up before? Indeed I had, a couple of years ago: Increasing Duplication to Eliminate Duplication. Even in the small world of my own teaching, it seems there is nothing new under the sun.

Still, there was a slightly different feel to the way I talked about this in class later that day. The question had come earlier in the semester this time, so the code involved was even simpler. Instead of processing a vector or a nested list of symbols, we were processing a flat list of symbols. And, instead of applying an arbitrary test to the list items, we were simply counting occurrences of a particular symbol, s.

The duplication occurred in the recursive case, where the procedure handles a pair:

    (if (eq? s (car los))
        (+ 1 (count s (cdr los)))      ; <---
        (count s (cdr los)))           ; <---

Then we make the two sub-cases more parallel:

    (if (eq? s (car los))
        (+ 1 (count s (cdr los)))      ; <---
        (+ 0 (count s (cdr los))))     ; <---

And then use distributivity to push the choice down a level:

    (+ (if (eq? s (car los)) 1 0)
       (count s (cdr los)))            ; <--- just once!

This time, I made a point of showing the students that not only does this solution eliminate the duplication, it also more closely follows the guideline to follow the shape of the data:

When defining a program to process an inductively-defined data type, the structure of the program should follow the structure of the data.

This guideline helps many programmers begin to write recursive programs in a functional style, rather than an imperative style.

Note that in the first code snippet above, the if expression is choosing among two different solutions, depending on whether we see the symbol s in the first part of the pair or not. That's imperative thinking.

But look at the list-of-symbols data type:

    <list-of-symbols> ::= ()
                        | (<symbol> . <list-of-symbols>)

How many occurrences of s are in a pair? Obviously, the number of s's found in the car of the list plus the number of s's found in the cdr of the list. If we design our solution to match the code to the data type, then the addition operation should be at the top to begin:

    (+ ; number of s's found in the car
       ; number of s's found in the cdr
       )

If we define the answer for the problem in terms of the data type, we never create the duplication-by-if in the first place. We think about solving the subproblems for the car and the cdr, fill in the blanks, and arrive immediately at the refactored code snippet above.

I have been trying to help my students begin to "think functionally" sooner this semester. There is a lot of room for improvement yet in my approach. I'm glad this student asked his question so early in the semester, as it gave me another chance to model "follow the data" thinking. In any case, his thinking was on the right track.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

December 27, 2012 12:08 PM

Agile Moments: Predicting What Users Want

the icon for VoodooPad

In this MacStories interview, Gus Mueller talks about the response to his latest release of VoodooPad.

Interest in VoodooPad 5 far surpassed my expectations for it. I know that VoodooPad has a lot of fans out there, but I guess I just hadn't heard from them in a while.

People really seem to like the Markdown syntax support and the new JavaScript events system I've built in for customization. I also added ePub export to VP5, which I expected more interest in -- but that hasn't seemed to materialize. I'm never very good at predicting which features people will like the most.

Gus is a Mac developer with a solid following and a couple of very popular titles, the wiki editor VoodooPad and the image editor Acorn. I am a long-time user of VoodooPad and an occasional user of Acorn's progenitor, FlySketch, and so have experienced firsthand Gus's open relationship with the users of his products.

If even he can't predict which features his users will like most and least, there isn't a lot of hope for the rest of us. Our best strategy is to follow the agile advice: release often, get feedback soon and frequently, and learn from what the users of our software tell us.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

December 14, 2012 3:50 PM

A Short Introduction to the Law of Demeter

Preface

My students spent the last three weeks of the semester implementing some of the infrastructure for a Twitter-like messaging app. They grew the app in three iterations, with new and changed features at each step. In the readme file for one of the versions, a student commented:

I wound up having a lot of code of the sort
  thisCollection.thisInsideCollection.thisAttribute.thisMethod()

I smiled and remembered again that even students who would never write such code in a smaller, more focused setting can be lulled into writing it in the process of growing a big, complicated program. I made a mental note to pay a little extra attention to the Law of Demeter the next time I teach the course.

I actually don't talk much about the Law of Demeter in this sophomore-level course, because it's a name we don't need. But occasionally I'd like to point a student to a discussion of it, and there don't seem to be a lot of resources at the right level for these students. So I decided to draft the beginnings of a simple reference. I welcome your suggestions on how to make it better.

You might also check out The Paperboy, The Wallet, and The Law Of Demeter, a nice tutorial I recently came across.

What is the Law of Demeter?

The Law of Demeter isn't a law so much as a general principle for software design. It is often referred to in the context of object-oriented programming, so you may see it phrased in terms of objects:

Objects should have short reach.

Or:

An object should not try to know too much, or need to.

The law's Wikipedia entry has a nice object-free formulation:

Only talk to your immediate friends.

If those are too squishy for you, the Wikipedia entry also has a more formal summary:

The fundamental notion is that a given object should assume as little as possible about the structure or properties of anything else (including its subcomponents).

Framed this way, the Law of Demeter is just a restatement of OOP 101. Objects are independent. They encapsulate their state and behavior. Instance variables are private, and we should be suspicious of getter methods.

As a matter of programming style, I often introduce this principle in a pragmatic way:

An operation should live with the data it uses.

Those are all general statements of the Law of Demeter. You will sometimes see a much more specific, formal statement of this sort:

A method m of an object obj may send messages only to these objects:
  • obj itself
  • obj's instance variables
  • m's formal parameters
  • any objects created within m

This version of the Law is actually an enumeration of specific ways that we can obey the more general principle. In the case of existing code, this version can help us recognize that the general principle has been violated.

Why is the Law of Demeter important?

This principle is often pitched as being about loose coupling: we should minimize the amount of knowledge that any component has about the implementation of any other component.

Another way to think about this principle is from the perspective of the receiver. An object wants access to its parts in order to do its job. From this angle, the Law of Demeter is fundamentally about encapsulation. The receiver's implementation should be private, and chained messages tend to leak implementation detail. When we hide those details from the object's collaborators, we shield other parts of the system from changes to the implementation.

How can we follow the Law of Demeter?

Consider my student's example from above:

  thisCollection.thisInsideCollection.thisAttribute.thisMethod()

The simplest way to eliminate the sender's need to know about thisInsideCollection and thisAttribute is to

  • add a method to thisCollection's class that accomplishes thisInsideCollection.thisAttribute.thisMethod(), and
  • send the new message to thisCollection:
      thisCollection.doSomething()
    

Coming up with a good name for the new message doSomething forces us to think about what this behavior means in our program. Many times, finding a good name helps us to clarify how we think about the objects that make up our program.
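Here is a minimal sketch of the refactoring in Java. The domain classes (Blog, Post) and the cutoff-date behavior are hypothetical, invented only to make the steps concrete; the shape of the change is what matters. A caller that once chained blog.getPosts().get(0).getDate().isBefore(cutoff) now sends a single message:

    import java.time.LocalDate;
    import java.util.List;

    class Post {
        private final LocalDate date;
        Post(LocalDate date) { this.date = date; }
        // The date stays private; Post answers questions about it.
        boolean postedBefore(LocalDate cutoff) { return date.isBefore(cutoff); }
    }

    class Blog {
        private final List<Post> posts;
        Blog(List<Post> posts) { this.posts = posts; }
        // The new message: callers no longer reach through the collection.
        boolean hasPostBefore(LocalDate cutoff) {
            return posts.stream().anyMatch(p -> p.postedBefore(cutoff));
        }
    }

This sketch has already applied the process twice: the caller delegates to Blog, and Blog delegates the date question to Post.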

Notice that the new method in thisCollection might still violate our design principle, because thisCollection needs to know that thisInsideCollection is implemented in terms of thisAttribute and that thisAttribute responds to thisMethod(). That's okay. You can apply the same process again, looking for a way to push the behavior that operates on thisAttribute into the object that knows about it.

More generally, when you notice yourself violating the Law of Demeter, think of the violation as an opportunity to rethink the objects in your program and how they relate to one another. Why is this object performing this task? Why doesn't its collaborator provide that service? Perhaps we can give this responsibility to another object. Who knows how to help this object with this task?

Sometimes, we even find that we can move thisMethod() into thisCollection and take our object out of the loop entirely, letting the object that sent us the message communicate directly with thisCollection. That is a great way to make our code less tightly coupled.

A Potential Wrinkle

There is a potential problem when we program in a language like Java, though. What if thisCollection is a primitive Java class, say, a Vector or a HashMap? If so, you cannot add a new method to the class. This is a sign of another problem lurking in your program: this object depends on your choice of data structure for thisCollection. What role does thisCollection play in your domain? Maybe it's a catalog of users, or the set of users that follow another user.

Make a new class that represents this domain object, and make your data structure an instance variable in that class. Now you can write your program in terms of the domain object, not the Java primitive. This solves several problems for you:

  • Your code reflects the problem domain more faithfully.
  • Your code is easier to modify. You can now change the data structure used to implement the domain object without affecting the rest of the program.
  • Now you can solve the Law of Demeter violation by adding a method to the new class.

This new wrinkle is sometimes called Primitive Obsession. OO masters know to beware its temptations. I sometimes like to play a little game in which every base type and every primitive class has been pushed down to the bottom layer of my program and wrapped in a domain object. This is often overkill -- taking a good thing too far -- but such an exercise can help you see just how often it really is a good thing to hide implementation detail and program in terms of the objects in your domain.
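
Here is a small sketch of such a domain wrapper in Java, with hypothetical names. Suppose thisCollection was a raw set of user ids representing the users who follow someone:

    import java.util.HashSet;
    import java.util.Set;

    // A domain object wrapping a primitive collection.  The Set is now
    // an implementation detail that can change without touching callers.
    class Followers {
        private final Set<String> userIds = new HashSet<>();

        void add(String userId)         { userIds.add(userId); }
        boolean includes(String userId) { return userIds.contains(userId); }
        int count()                     { return userIds.size(); }
    }

Any Law of Demeter fix can now live inside Followers, right next to the data it needs.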


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

December 10, 2012 2:54 PM

Brief Flashes of Understanding, Fully Awake

The programmers and teachers among you surely know this feeling well:

As I drift back to sleep, I can't help thinking that it's a wonderful thing to be right about the world. To weigh the evidence, always incomplete, and correctly intuit the whole, to see the world in a grain of sand, to recognize its beauty, its simplicity, its truth. It's as close as we get to God in this life, and we reside in the glow of such brief flashes of understanding, fully awake, sometimes, for two or three seconds, at peace with our existence. And then we go back to sleep.

Or tackle the next requirement.

(The passage is from Richard Russo's Straight Man, an enjoyable send-up of modern man in an academic life.)


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

December 07, 2012 11:17 AM

Agglutination and Crystallization

Alan Kay talks about programming languages quite a bit in this wide-ranging interview. (Aren't all interviews with Kay wide-ranging?) I liked this fuzzy bifurcation of the language world:

... a lot of them are either the agglutination of features or ... a crystallization of style.

My initial reaction was that I'm a crystallization-of-style guy. I have always had a deep fondness for style languages, with Smalltalk at the head of the list and Joy and Scheme not far behind.

But I'm not a purist when it comes to neat and scruffy. As an undergrad, I really liked programming in PL/I. Java never bothered me as much as it bothered some of my purist friends, and I admit unashamedly that I enjoy programming in it.

These days, I like Ruby as much as I like any language. It is a language that lies in the fuzz between Kay's categories. It has an "everything is an object" ethos but, man alive, is it an amalgamation of syntactic and semantic desiderata.

I attribute my linguistic split personality to this: I prefer languages with a "real center", but I don't mind imposing a stylistic filter on an agglutinated language. PL/I always felt comfortable because I programmed with a pure structured programming vibe. When I program in Java or Ruby now, somewhere in the center of my mind is a Smalltalk programmer seeing the language through a Smalltalk lens. I have to make a few pragmatic concessions to the realities of my tool, and everything seems to work out fine.

This semester, I have been teaching with Java. Next semester, I will be teaching with Scheme. I guess I can turn off the filter.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 26, 2012 3:24 PM

Quotes of the Day: Constraints on Programming

Obligation as constraint

Edward Yang has discovered the Three Bears pattern. He calls it "extremist programming".

When learning a new principle, try to apply it everywhere. That way, you'll learn more quickly where it does and doesn't work well, even if your initial intuitions about it are wrong.

Actually, you don't learn in spite of your initial intuitions being wrong. You learn because your initial intuitions were wrong. That's when learning happens best.

(I mention Three Bears every so often, such as Bright Lines in Learning and Doing, and whenever I discuss limiting usage of language features or primitive data values.)

Blindness as constraint

In an interview I linked to in my previous entry, Brian Eno and Ha-Joon Chang talk about the illusion of freedom. Whenever you talk about freedom, as in a "free market" or "free jazz",

... what you really mean is "constrained by rules that we've stopped thinking about".

Free jazz isn't entirely free, because you are constrained by what your muscles can do. Free markets aren't entirely free, because there are limits we simply choose not to talk about. Perhaps we once did talk about them and have chosen not to any more. Perhaps we never talked about them and don't even recognize that they are present.

I can't help but think of computer science faculty who claim we shouldn't be teaching OO programming in the first course, or any other "paradigm"; we should just teach basic programming first. They may be right about not teaching OOP first, but not because their approach is paradigm-free. It isn't.

(I mention constraints as a source of freedom every so often, including the ways in which patterns free students to create and the way creativity needs to be developed.)


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

November 18, 2012 9:13 AM

Programming Languages Quote of the Day

... comes from Gilad Bracha:

I firmly believe that a time traveling debugger is worth more than a boatload of language features[.]

This passage comes as part of a discussion of what it would take to make Bret Victor's vision of programming a reality. Victor demonstrates powerful ideas using "hand crafted illustrations of how such a tool might behave". Bracha -- whose work on Smalltalk and Newspeak has long inspired me -- reflects on what it would take to offer Victor's powerful ideas in a general-purpose programming environment.

Smalltalk as a language and environment works at a level where we can conceive of providing the support Victor and Bracha envision, but most of the language tools people use today are too far removed from the dynamic behavior of the programs being written. The debugger is the most notable example.

Bracha suggests that we free the debugger from the constraints of time and make it a tool for guiding the evolution of the program. He acknowledges that he is not the first person to propose such an idea, pointing specifically to Bill Lewis's proposal for an omniscient debugger. What remains is the hard work needed to take the idea farther and provide programmers more transparent support for understanding dynamic behavior while still writing the code.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 17, 2012 12:23 PM

Why a CS Major Might Minor in Anthropology

Paul Klipp wrote a nice piece recently on emic and etic approaches to explaining team behavior. He explains what emic and etic approaches are and then shows how they apply to the consultant's job. For example:

Let's look at an example closer to home for us software folks. You're an agile coach, arriving in a new environment with a mission from management to "make this team more agile". If you, like so many consultants in most every field, favor an etic approach, you will begin by doing a gap analysis between the behaviors and artifacts that you see and those with which you are most familiar. That's useful, and practically inevitable. The next natural step, however, may be less helpful. That is to judge the gaps between what this team is doing and what you consider to be normal as wrong.... By deciding, as a consultant or coach, to now attempt to prepare an emic description of the team's behaviors, you force yourself to set aside your preconceptions and engage in meaningful conversations with the team in order to understand how they see themselves. Now you have two tools in your kit, where you might before have had one, and more tools prepares you for more situations.

When I speak to HS students and their parents, and when I advise freshmen, I suggest that they consider picking up a minor or a second major. I tell them that it almost doesn't matter which other discipline they choose. College is a good time to broaden oneself, to enjoy learning for its own sake. Some minors and second majors may seem more directly relevant to a CS grad's career interests, but you never know what domain or company you will end up working in. You never know when having studied a seemingly unrelated discipline will turn out to be useful.

Many students are surprised when I recommend social sciences such as psychology, sociology, and anthropology as great partners for CS. Their parents are, too. Understanding people, both individually and in groups, is important in any profession, but it is perhaps more important for CS grads than for most. We build software -- for people. We teach new languages and techniques -- to people. We contract out our services to organizations -- of people. We introduce new practices and methodologies to organizations -- of people. Ethnography may be more important to a software consultant's success than any set of business classes.

I had my first experience with this when I was a graduate student working in the area of knowledge-based systems. We built systems that aimed to capture the knowledge of human experts, often teams of experts. We found that they relied a lot on tacit knowledge, both in their individual expertise and in the fabric of their teams. It wasn't until I read some papers from John McDermott's research group at Carnegie Mellon that I realized we were all engaged in ethnographic studies. It would have been so useful to have had some background in anthropology!


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

November 16, 2012 2:50 PM

Preferring Objects over Class Methods in OOP

Bryan Helmkamp recently posted Why Ruby Class Methods Resist Refactoring on the Code Climate blog. It explains nicely, with an example, how using class methods makes it harder to refactor in a way that makes you feel like you have actually improved the code.

This is a hard lesson for beginners to learn, but even experienced Ruby and Java programmers sometimes succumb to the urge for simplicity that "just" writing a method offers. It doesn't take long for a simple class method to grow into something more complex, which gets in the way of growing the program.

As Helmkamp points out, class methods have the insidious feature of coupling your code to a particular class. Even when we program with instances rather than classes, we like to avoid that sort of coupling. Thus was born the factory method.

Deep in the process of teaching OO to undergraduates, I was drawn to an important truth found later in the post, just after mention of the class-name coupling:

You can't easily swap in a new class, but you can easily swap in a new instance.

One of the great advantages of OOP is being able to plug a different object into a program and thus change or extend the program's behavior without modifying its code. Dynamic polymorphism via substitutable objects is in many ways the first principle of object-oriented programming.

That's why the first refactoring I usually apply whenever I encounter a class method is Introduce Object.
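
Though Helmkamp writes about Ruby, the shape of the fix is easy to sketch in Java. All of the names here are hypothetical:

    // Before: callers are welded to the class name.
    //
    //     String cleaned = TextCleaner.clean(input);   // class method
    //
    // After: callers hold an instance typed to an interface.
    interface Cleaner {
        String clean(String text);
    }

    class TextCleaner implements Cleaner {
        public String clean(String text) {
            return text.trim().toLowerCase();
        }
    }

    class Importer {
        private final Cleaner cleaner;
        Importer(Cleaner cleaner) { this.cleaner = cleaner; }
        // Importer sends a message to whatever object it was given.
        String read(String raw) { return cleaner.clean(raw); }
    }

A test can hand Importer a stub Cleaner, and production code can swap in a smarter one. No call sites change.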


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

November 06, 2012 3:34 PM

A Good Name Is About An Idea, Not An Implementation

In The Poetry of Function Naming, Stephen Wolfram captures something that all programmers eventually learn:

[Naming functions is] an unforgiving and humbling activity. And the issue is almost always the same. The reason you can't find a good name is because you don't really understand with complete and ultimate clarity what the function does.

Sometimes we can't come up with the perfect name for a function or a variable until after we have written code that uses it. The act of writing the program helps us to learn about the program's content.

Later in the same blog entry, Wolfram says something that made me think of my previous blog entry, on how some questions presuppose how they are to be answered:

But in a computer language, each function name ultimately refers to a particular piece of functionality that is defined in an absolute way, and can be implemented by a specific precise program.

When we write OO programs, a name doesn't always refer to a specific function. With polymorphic variables, we don't usually know which method will be executed when we use a name. Any object that provides the protocol required by the variable's type, or implements the interface so named, may be stored in the variable. It may even be an instance of a class I know nothing about.

For this reason, when I teach OO programming, I am careful to talk about sending a message rather than "invoking a method" or "calling a function". The receiver of the message interprets the message, not the sender.
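
A tiny sketch in Java, with made-up classes, shows the difference:

    interface Shape {
        double area();
    }

    class Circle implements Shape {
        private final double radius;
        Circle(double radius) { this.radius = radius; }
        public double area() { return Math.PI * radius * radius; }
    }

    class Square implements Shape {
        private final double side;
        Square(double side) { this.side = side; }
        public double area() { return side * side; }
    }

    // The sender of someShape.area() does not know, and need not know,
    // which of these methods will run.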

This doesn't invalidate what Wolfram says, though it does point to a way in which we might rephrase it more generally. The name of a good method isn't about specific functionality so much as about expectation. It's about the core idea associated with an interaction, not any particular implementation of that idea.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

November 03, 2012 11:17 AM

When "What" Questions Presuppose "How"

John Cook wrote about times in mathematics when maybe you don't need to do what you were asked to do. As one example, he used remainder from division. In many cases, you don't need to do division, because you can find the answer using a different, often simpler, method.
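
Cook's observation translates directly into code. Here is a hedged little example in Java: two ways to answer "is n even?", only one of which computes a remainder at all:

    class Parity {
        // Asks for a remainder, which typically implies a division.
        static boolean evenByRemainder(int n) { return n % 2 == 0; }

        // Never divides: inspects the low-order bit directly.
        static boolean evenByBitTest(int n) { return (n & 1) == 0; }
    }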

We see a variation of John's theme in programming, too. Sometimes, a client will ask for a result in a way that presupposes the method that will be used to produce it. For example, "Use a stack to evaluate these nested expressions." We professors do this to students a lot, because they want the students to learn the particular technique specified. But you see subtle versions of this kind of request more often than you might expect outside the classroom.

An important part of learning to design software is learning to tease apart the subtle conflation of interface and implementation in the code we write. Students who learn OO programming after a traditional data structures course usually "get" the idea of data abstraction, yet still approach large problems in ways that let implementations leak out of their abstractions in the form of method names and return values. Kent Beck talked about how this problem afflicts even experienced programmers in his blog entry Naming From the Outside In.

Primitive Obsession is another symptom of conflating what we need with how we produce it. For beginners, it's natural to use base types to implement almost any behavior. Hey, the extreme programming principle You Ain't Gonna Need It encourages even us more experienced developers not to create abstractions too soon, until we know we need them and in what form. The convenience offered by hashes, featured so prominently in the scripting languages that many of us use these days, makes it easy to program for a long time without having to code a collection of any sort.

But learning to model domain objects as objects -- interfaces that do not presuppose implementation -- is one of the powerful stepping stones on the way to writing supple code, extendible and adaptable in the face of reasonable changes in the spec.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

October 30, 2012 4:22 PM

Mathematical Formulas, The Great Gatsby, and Small Programs

... or: Why Less Code is Better

In Don't Kill Math, Evan Miller defends analytic methods in the sciences against Bret Victor's "visions for computer-assisted creativity and scientific understanding". (You can see some of my reactions to Victor's vision in a piece I wrote about his StrangeLoop talk.)

Miller writes:

For the practicing scientist, the chief virtue of analytic methods can be summed up in a single word: clarity. An equation describing a quantity of interest conveys what is important in determining that quantity and what is not at all important.

He goes on to look at examples such as the universal law of gravitation and shows that a single formula gives even a person with "minimal education in physics" an economical distillation of what matters. The clarity provided by a good analytic solution affords the reader two more crucial benefits: confident understanding and memorable insights.

Writer Peter Turchi describes a related phenomenon in fiction writing, in his essay You and I Know, Order is Everything. A story can pull us forward by omitting details and thus creating in the reader a desire to learn more. Referring to a particularly strategic paragraph, he writes:

That first sentence created a desire to know certain information: What is this significant event? ... We still don't have an answer, but the context for the question is becoming increasingly clear -- so while we're eager to have those initial questions answered, we're content to wait a little longer, because we're getting what seems to be important information. By the third paragraph, ... we think we've got a clue; but by that time the focus of the narrative is no longer the simple fact of what's going on, but the [larger setting of the story]. The story has shifted our attention from a minor mystery to a more significant one. On some level or another nearly every successful story works this way, leading us from one mystery to another, like stepping stones across a river.

In a good story, eventually...

... we recognize that the narrator was telling us more than we could understand, withholding information but also drawing our attention to the very thing we should be looking at.

In two very different contexts, we see the same forces at play. The quality of a description follows from the balance it strikes between what is in the description and what is left out.

To me, this is another example of how a liberal education can benefit students majoring in both the sciences and the humanities [ 1 | 2 ]. We can learn about many common themes and patterns of life from both traditions. Neither is primary. A student can encounter the idea first in the sciences, or first in the humanities, whichever interests the student more. But apprehending a beautiful pattern in multiple domains of discourse can reinforce the idea and make it more salient in the preferred domain. This also broadens our imaginations, allowing us to see more patterns and more subtlety in the patterns we already know.

So: a good description, a good story, depends in some part on how clearly it conveys what is important and what is not. What are the implications of this pattern for programming? A computer program is, after all, a description: an executable description of a domain or phenomenon.

I think this pattern gives us insight into why less code is usually better than more code. Given two functionally equivalent programs of different lengths, we generally prefer the shorter program because it contains only what matters. The excess code found in the longer program obscures from the reader what is essential. Furthermore, as with Miller's concise formulas, a short program offers its reader the gift of more confident understanding and the opportunity for memorable insights.

What is not in a program can tell us a lot, too. One of the hallmarks of expert programmers is their ability to see the negative space in a design or a program and learn from it. My students, who are generally novice programmers, struggle with this. They are still learning what matters and how to write descriptions at all, let alone concise ones. They are still learning how to focus their programming tools, and on what.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

October 16, 2012 4:45 PM

The Parable of the OO Programming Student

As The Master was setting out on a journey, a young man ran up, knelt down before him, and asked him, "Good teacher, what must I do to inherit the eternal bliss of OO?"

The Master answered him, "Why do you call me good? No OO programmer is good but The Creator alone.

"You know the commandments: "'An object should have only a single responsibility.'

"'Software entities should be open for extension, but closed for modification.'

"'Objects should be replaceable with instances of their subtypes without altering the correctness of that program.'

"'Tell, don't ask.'

"'You shall not indulge in primitive obsession.'

"'All state is private.'"

The young man replied and said to Him, "Teacher, all of these I have observed from my youth when first I learned to program."

The Master, looking at him, loved him and said to him, "You are lacking in one thing. Go, surrender all primitive types, and renounce all control structures. Write all code as messages passed between encapsulated objects, with extreme late-binding of all things. Then will you have treasure in Heaven; then come, follow me."

At that statement the young man's face fell, and he went away sad, for he possessed many data structures and algorithms.

The Master looked around and said to his disciples, "How hard it is for those who have a wealth of procedural programming experience to enter the kingdom of OO."

... with apologies to The Gospel of Mark.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

October 02, 2012 4:14 PM

The Pareto Principle and Programming Purity

After talking about the advantages of making the changeable aspects of the system as declarative as possible, William Payne writes:

Having done a bit of Prolog programming in the dim and distant past, my intuition is that trying to make everything declarative is a mistake; one ends up tying oneself into knots. The mental gymnastics simply are not worth it. However, splitting the program into declarative-and-non-declarative parts seems reasonable.

This is an application of the Pareto Principle to programming, here in the form of purity of style. The Pareto Principle says that, "for many events, roughly 80% of the effects come from 20% of the causes".

When I was first learning functional programming in Lisp as an AI researcher, a more experienced researcher told me that about 90% of a big system could be purely functional. The remaining 10% should include all side-effecting operations, cordoned off from the rest of the app into its own abstraction layer. Since that time, I've heard 85-15 used as a reasonable split for big Scheme programs.

The lesson is: don't kill yourself trying to be 100% pure. As Payne says, the mental gymnastics simply are not worth it. You'll end up with code that is easier to understand, maintain, and modify if you allow yourself a little impurity, in small, controlled doses.
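In Python, the split might look like this -- a minimal sketch of the shape, with invented names, not a recipe:

    # the pure 85-90%: functions from inputs to outputs -- no I/O, no mutation
    def parse_line(line):
        name, amount = line.split(",")
        return name, int(amount)

    def summarize(transactions):
        total = sum(amount for _, amount in transactions)
        return f"{len(transactions)} transactions, total {total}"

    # the impure 10-15%: every side effect cordoned off in a thin shell
    def main(path):
        with open(path) as f:
            transactions = [parse_line(line) for line in f]
        print(summarize(transactions))

The impure shell calls the pure core, never the other way around, so the bulk of the program stays easy to test and to reason about.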


Posted by Eugene Wallingford | Permalink | Categories: Software Development

September 30, 2012 12:45 PM

StrangeLoop 8: Reactions to Bret Victor's Visible Programming

The last talk I attended at StrangeLoop 2012 was Bret Victor's Visible Programming. He has since posted an extended version of his presentation, as a multimedia essay titled Learnable Programming. You really should read his essay and play the video in which he demonstrates the implementation of his ideas. It is quite impressive, and worthy of the discussion his ideas have engendered over the last few months.

In this entry, I give only a high-level summary of the idea, react to only one of his claims, and discuss only one of his design principles in any detail. This entry grew much longer than I originally intended. If you would like to skip most of my reaction, jump to the mini-essay that is the heart of this entry, Programming By Reacting, in the REPL.

~~~~

Programmers often discuss their productivity as at least a partial result of the programming environments they use. Victor thinks this is dangerously wrong. It implies, he says, that the difficulty with programming is that we aren't doing it fast enough.

But speed is not the problem. The problem is that our programming environments don't help us to think. We do all of our programming in our minds, then we dump our ideas into code via the editor.

Our environments should do more. They should be our external imagination. They should help us see how our programs work as we are writing them.

This is an attractive guiding principle for designing tools to help programmers. Victor elaborates this principle into a set of five design principles for an environment:

  • read the vocabulary -- what do these words mean?
  • follow the flow -- what happens when?
  • see the state -- what is the computer thinking?
  • create by reacting -- start somewhere, then sculpt
  • create by abstracting -- start concrete, then generalize

Victor's talk then discussed each design principle in detail and showed how one might implement the idea using JavaScript and Processing.js in a web browser. The demo was cool enough that the StrangeLoop crowd broke into applause at least twice during the talk. Read the essay.

~~~~

As I watched the talk, I found myself reacting in a way I had not expected. So many people have spoken so highly of this work. The crowd was applauding! Why was I not as enamored? I was impressed, for sure, and I was thinking about ways to use these ideas to improve my teaching. But I wasn't falling head over heels in love.

A Strong Claim

First, I was taken aback by a particular claim that Victor made at the beginning of his talk as one of the justifications for this work:

If a programmer cannot see what a program is doing, she can't understand it.

Unless he means this metaphorically, seeing "in the mind's eye", then it is simply wrong. We do understand things we don't see in physical form. We learn many things without seeing them in physical form. During my doctoral study, I took several courses in philosophy, and only rarely did we have recourse to images of the ideas we were studying. We held ideas in our head, expressed in words, and manipulated them there.

We did externalize ideas, both as a way to learn them and think about them. But we tended to use stories, not pictures. By speaking an idea, or writing it down, and sharing it with others, we could work with them.

So, my discomfort with one of Victor's axioms accounted for some of my unexpected reaction. Professional programmers can and do manipulate ideas abstractly. Visualization can help, but when is it necessary, or even most helpful?

Learning Versus Doing

This leads to a second element of my concern. I think I had a misconception about Victor's work. His talk and its title, "Visible Programming", led me to think his ideas are aimed primarily at working programmers, that we need to make programs visible for all programmers.

The title of his essay, "Learnable Programming", puts his claims into a different context. We need to make programs visible for people who are learning to program. This seems a much more reasonable position on its face. It also lets me see the axiom that bothered me so much in a more sympathetic light: If a novice programmer cannot see what a program is doing, then she may not be able to understand it.

Seeing how a program works is a big part of learning to program. A few years ago, I wrote about "biction" and the power of drawing a picture of what code does. I often find that if I require a student to draw a picture of what his code is doing before he can ask me for debugging help, he will answer his own question before getting to me.

The first time a student experiences this can be a powerful experience. Many students begin to think of programming in a different way when they realize the power of thinking about their programs using tools other than code. Visible programming environments can play a role in helping students think about their programs, outside their code and outside their heads.

I am left puzzling over two thoughts:

  • How much of the value my students see in pictures comes not from seeing the program work but from drawing the picture themselves -- the act of reflecting about the program? If our tools visualize the code for them, will we see the same learning effect that we see when they draw their own pictures?

  • Certainly Victor's visible programming tools can help learners. How much will they help programmers once they become experts? Ben Shneiderman's Designing the User Interface taught me that novices and experts have different needs, and that it's often difficult to know what works well for experts until we run experiments.

Mark Guzdial has written a more detailed analysis of Victor's essay from the perspective of a computer science educator. As always, Mark's ideas are worth reading.

Programming By Reacting, in the REPL

My favorite parts of this talk were the sections on creating by reacting and abstracting. Programmers, Victor says, don't work like other creators. Painters don't stare at a blank canvas, think hard, create a painting in their minds, and then start painting the picture they know they want to create. Sculptors don't stare at a block of stone, envision in their mind's eye the statue they intend to make, and then reproduce that vision in stone. They start creating, and react, both to the work of art they are creating and to the materials they are using.

Programmers, Victor says, should be able to do the same thing -- if only our programming environments helped us.

As a teacher, I think this is an area ripe for improvement in how we help students learn to program. Students open up their text editor or IDE, stare at that blank screen, and are terrified. What do I do now? A lot of my work over the last fifteen to twenty years has been in trying to find ways to help students get started, to help them to overcome the fear of the blank screen.

My approaches haven't been through visualization, but through other ways to think about programs and how we grow them. Elementary patterns can give students tools for thinking about problems and growing their code at a scale larger than characters or language keywords. An agile approach can help them start small, add one feature at a time, proceed in confidence with working tests, and refactor to make their code better as they go along. Adding Victor-style environment support for the code students write in CS1 and CS2 would surely help as well.

However, as I listened to Victor describe support for creating by reacting, and then abstracting variables and functions out of concrete examples, I realized something. Programmers don't typically write code in an environment with data visualizations of the sort Victor proposes, but we do program in the style that such visualizations enable.

We do it in the REPL!

A simple, interactive computer programming environment enables programmers to create by reacting.

  • They write short snippets of code that describe how a new feature will work.
  • They test the code immediately, seeing concrete results from concrete examples.
  • They react to the results, shaping their code in response to what the code and its output tell them.
  • They then abstract working behaviors into functions that can be used to implement another level of functionality.

Programmers from the Lisp and Smalltalk communities, and from the rest of the dynamic programming world, will recognize this style of programming. It's what we do, a form of creating by reacting, from concrete examples in the interaction pane to code in the definitions pane.
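A short interactive session shows the rhythm. The example is mine, not Victor's, and the language hardly matters:

    >>> s = "the quick brown fox"
    >>> [len(w) for w in s.split()]       # react: is this the shape I want?
    [3, 5, 5, 3]
    >>> def word_lengths(text):           # it works, so abstract it
    ...     return [len(w) for w in text.split()]
    ...
    >>> word_lengths("jumps over the lazy dog")
    [5, 4, 3, 4, 3]

Each line is a concrete experiment; the function definition comes only after the snippet has earned its keep.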

In the agile software development world, test-first development encourages a similar style of programming, from concrete examples in the test case to minimal code in the application class. Test-driven design stimulates an even more consciously reactive style of programming, in which the programmer reacts both to the evolving program and to the programmer's evolving understanding of it.

The result is something similar to Victor's goal for programmers as they create abstractions:

The learner always gets the experience of interactively controlling the lower-level details, understanding them, developing trust in them, before handing off that control to an abstraction and moving to a higher level of control.

It seems that Victor would like to offer even more support for novices than these tools provide, down to visualizing what the program does as they type each line of code. An IDE with autocomplete is perhaps the closest analog in our current arsenal. Perhaps we can do more, not only for novices but also for professionals.

~~~~

I love the idea that our environments could do more for us, to be our external imaginations.

Like many programmers, though, as I watched this talk, I occasionally wondered, "Sure, this works great if you're creating art in Processing. What about when I'm writing a compiler? What should my editor do then?"

Victor anticipated this question and pre-emptively answered it. Rather than asking, How does this scale to what I do?, we should turn the question inside out and ask, These are the design requirements for a good environment. How do we change programming to fit?

I doubt such a dogmatic turn will convince skeptics with serious doubts about this approach.

I do think, though, that we can reformulate the original question in a way that focuses on helping "real" programmers. What does a non-graphical programmer need in an external imagination? What kind of feedback -- frequent, even in-the-moment -- would be most helpful to, say, a compiler writer? How could our REPLs provide even more support for creating, reacting, and abstracting?

These questions are worth asking, whatever one thinks of Victor's particular proposal. Programmers should be grateful for his causing us to ask them.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

September 29, 2012 4:04 PM

StrangeLoop 7: The Racket Way

I have been using Racket since before it was Racket, back when it was "just another implementation of Scheme". Even then, though, it wasn't just another implementation of Scheme, because it had such great libraries, a devoted educational community around it, and an increasingly powerful facility for creating and packaging languages. I've never been a deep user of Racket, though, so I was eager to see this talk by one of its creators and learn from him.

Depending on your perspective, Racket is either a programming language (that looks a lot like Scheme), a language plus a set of libraries, or a platform for creating programs. This talk set out to show us that Racket is more.

Matthew Flatt opened with a cute animated fairy tale, about three princesses who come upon a wishing well. The first asks for stuff. The second asks for more wishes. The third asks for a kingdom full of wishing wells. Smart girl, that third one. Why settle for stuff when you can have the source of all stuff?

This is, Flatt said, something like computer science. There is a similar progression of power from:

  • a document, to
  • a language for documents, to
  • a language for languages.

Computer scientists wish for a way to write programs that do... whatever.

This is the Racket way:

  1. Everything is a program.
  2. Concepts are programming language constructs.
  3. Programming languages are extensible and composable.

The rest of the talk was a series of impressive mini-demos that illustrated each part of the Racket way.

To show what it means to say that everything is a program, Flatt demoed Scribble, a language for producing documents -- even the one he was using to give his talk. Scribble allows writers to abstract over every action.

To show what it means to say that concepts are programming language constructs, Flatt talked about the implementation of Dr. Racket, the flexible IDE that comes with the system. Dr. Racket needs to be able to create, control, and terminate processes. Relying on the OS to do this for it means deferring to what that OS offers. In the end, that means no control.

Dr. Racket needs to control everything, so the language provides constructs for these concepts. Flatt showed as examples threads and custodians. He then showed this idea at work in an incisive way: he wrote a mini-Dr. Racket, called Racket, Esq. -- live using Racket. To illustrate its completeness, he then ran his talk inside racket-esq. Talk about a strange loop. Very nice.

To show what it means to say that programming languages are extensible and composable, Flatt showed a graph of the full panoply of Racket's built-in languages and demoed several languages. He then used some of the basic language-building tools in Racket -- #lang, require, define-syntax, syntax-rules, and define-syntax-rule -- to build the old text-based game Adventure, which needs a natural language-like scripting language for defining worlds. Again, very nice -- so much power in so many tools.

This kind of power comes from taking seriously a particular way of thinking about the world. It starts with "Everything is a program." That is the Racket way.

Flatt is a relaxed and confident presenter. As a result, this was a deceptively impressive talk. It reinforced its own message by the medium in which it was delivered: using documents -- programs -- written and processed in Racket. I am not sure how anyone could see a slideshow with "hot" code, a console for output, and a REPL within reach, all written in the environment being demoed, and not be moved to rethink how they write programs. And everything else they create.

As Flatt intimated briefly early on, The Racket Way of thinking is not -- or should not be -- limited to Racket. It is, at its core, the essence of computer science. The duality of code and data makes what we do so much more powerful than most people realize, and makes what we can do so much more powerful than what most of us actually do with the tools we accept. I hope that Flatt's talk inspires a few more of us not to settle for less than we have to.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 29, 2012 3:40 PM

StrangeLoop 6: Y Y

I don't know if it was coincidence or by design of the conference organizers, but Wednesday morning was a topical repeat of Tuesday morning for me: two highly engaging talks on functional programming. I had originally intended to write them up in a single entry, but that write-up grew so long that I decided to give them their own entries.

Y Not?

Watching talks and reading papers about the Y combinator are something of a spectator code kata for me. I love to see new treatments, and enjoy seeing even standard treatments every now and then. Jim Weirich presented it at StrangeLoop with a twist I hadn't seen before.

Weirich opened, as speakers often do, by setting expectations. This is a motivational talk, he said, so it should be...

  • non-technical. But it's not. It is highly technical.
  • relevant. But it's not. It is extremely pointless.
  • good code. But it's not. It shows the worst Clojure code ever.

But it will be, he promises, fun!

Before diving in, he had one more joke, or at least the first half of one. He asked for audience participation, then asked his volunteer to calculate cos(n) for some value of n I missed. Then he asked the person to keep hitting the cosine button repeatedly until he told him to stop.

At the dawn of computing, two different approaches were taken in an effort to answer the question, What is effectively computable?

Alan Turing devised what we now call a universal Turing machine to embody the idea. Weirich showed a video demonstration of a physical Turing machine to give his audience a sense of what a TM is like.

(If you'd like to read more about Turing and the implication of his universal machine, check out this reflection I wrote earlier this year after a visit by Doug Hofstadter to my campus. Let's just say that the universal TM means more than just an answer to what functions are effectively computable.)

A bit ahead of Turing, Alonzo Church devised an answer to the same question in the form of the lambda calculus, a formal logical system. As with the universal TM, the lambda calculus can be used to compute everything, for a particular value of everything. These days, nearly every programming language has lambdas of some form.

... now came the second half of the joke running in the background. Weirich asked his audience collaborator what was in his calculator's display. The assistant called out some number, 0.7... Then Weirich showed his next slide -- the same number, taken out many more digits. How was he able to do this? There is a number n such that cos(n) = n. By repeatedly pressing his cosine button, Weirich's assistant eventually reached it. That number n is called the fixed point of the cosine function. Other functions have fixed points too, and they can be a source of great fun.
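The joke is easy to replay in code. A quick Python check (the iteration count is chosen generously):

    import math

    x = 1.0
    for _ in range(100):
        x = math.cos(x)          # press the cosine button again

    print(x)                     # 0.7390851... -- the fixed point of cosine
    print(abs(math.cos(x) - x))  # effectively zero: cos(x) = x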

Then Weirich opened up his editor and wrote some code from the ground up to teach some important concepts of functional programming, using the innocuous function 3(n+1). With this short demo, Weirich demonstrated the idea of higher-order functions, including function factories, and a set of useful functional refactorings (sketched in code after the list):

  • Introduce Binding
    -- where the new binding is unused in the body
  • Inline Definition
    -- where a call to a function is replaced by the function body, suitably parameterized
  • Wrap Function
    -- where an expression is replaced by a function call that computes the expression
  • Tennent Correspondence Principle
    -- where an expression is turned into a thunk
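
To give the flavor of a few of these refactorings, here is my reconstruction in Python -- Weirich worked in Clojure -- applied to the innocuous function:

    f = lambda n: 3 * (n + 1)

    # Introduce Binding: add a binding that the body does not use
    f = lambda n: (lambda unused: 3 * (n + 1))(None)

    # Wrap Function: replace an expression with a call that computes it
    f = lambda n: (lambda m: 3 * (m + 1))(n)

    # Tennent Correspondence Principle: turn an expression into a thunk,
    # then call the thunk immediately
    f = lambda n: (lambda: 3 * (n + 1))()

    assert f(4) == 15            # every version computes the same answer

Each step looks pointless in isolation; chained together on factorial, they squeeze the named recursion out of the definition.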

At the end of his exercise, Weirich had created a big function call that contained no named function definitions, yet computed the same answer.

He asks the crowd for applause, then demurs. This is 80-year-old technology. Now you know, he says, what a "chief scientist" at New Context does. (Looks a lot like what an academic might do...)

Weirich began a second coding exercise, the point behind all his exposition to this point: He wrote the factorial function, and began to factor and refactor it just as he had the simpler 3(n+1). But now inlining the function breaks the code! There is a recursive call, and the name is now out of scope. What to do?

He refactors, and refactors some more, until the body of factorial is an argument to a big melange of lambdas and applications of lambdas. The result is a function that computes the fixed point of any function passed it.

That is Y. The Y combinator.
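For the record, here is an applicative-order Y in Python -- the same idea Weirich derived in Clojure, though this particular spelling is mine:

    # applicative-order Y: computes the fixed point of a function on functions
    Y = lambda f: (lambda x: f(lambda v: x(x)(v)))(
                   lambda x: f(lambda v: x(x)(v)))

    # factorial's body, with the recursive call abstracted into a parameter
    almost_fact = lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1)

    fact = Y(almost_fact)
    print(fact(5))               # 120

Y hands almost_fact a version of itself to call: exactly the fixed-point idea that the cosine gag foreshadowed.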

Weirich talked a bit about Y and related ideas, and why it matters. He closed with a quote from Wittgenstein, from Philosophical Investigations:

The aspects of things that are most important for us are hidden because of their simplicity and familiarity. (One is unable to notice something -- because it is always before one's eyes.) The real foundations of his enquiry do not strike a man at all. Unless that fact has at some time struck him. -- And this means: we fail to be struck by what, once seen, is most striking and most powerful.

The thing that sets Weirich's presentation of Y apart from the many others I've seen is its explicit use of refactoring to derive Y. He created Y from a sequence of working pieces of code, each the result of a refactoring we can all understand. I love to do this sort of thing when teaching programming ideas, and I was pleased to see it used to such good effect on such a challenging idea.

The title of this talk -- Y Not? -- plays on Y's interrogative homonym. Another classic in this genre echoes the homonym in its title, then goes on to explain Y in four pages of English and Scheme. I suggest that you study @rpg's essay while waiting for Weirich's talk to hit InfoQ. Then watch Weirich's talk. You'll like it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

September 28, 2012 3:59 PM

StrangeLoop 5: Miscellany -- At All Levels

Most of the Tuesday afternoon talks engaged me less deeply than the ones that came before. Part of that was the content, part was the style of delivery, and part was surely that my brain was swimming in so many percolating ideas that there wasn't room for much more.

Lazy Guesses

Oleg Kiselyov, a co-author of the work behind yesterday's talk on miniKanren, gave a talk on how to implement guessing in computer code. That may sound silly, for a couple of reasons. But it's not.

First, why would we want to guess at all? Don't we want to follow principles that guarantee we find the right answer? Certainly, but those principles aren't always available, and even when they are the algorithms that implement them may be computationally intractable. So we choose to implement solutions that restrict the search space, for which we pay a price along some other dimension, often expressiveness.

Kiselyov mentioned scheduling tasks early in his talk, and any student of AI can list many other problems for which "generate and test" is a surprisingly viable strategy. Later in the talk, he mentioned parsing, which is also a useful example. Most interesting grammars have nondeterministic choices in them. Rather than allow our parsers to make choices and fail, we usually adopt rules that make the process predictable. The result is an efficient parser, but a loss in what we can reasonably say in the language.

So, perhaps the ability to make good guesses is valuable. What is so hard about implementing them? The real problem is that there are so many bad guesses. We'd like to use knowledge to guide the process of guessing again, to favor some guesses over others.

The abstract for the talk promises a general principle on which to build guessing systems. I must admit that I did not see it. Kiselyov moved fast at times through his code, and I lost sight of the big picture. I did see discussions of forking a process at the OS level, a fair amount of OCaml code, parser combinators, and lazy evaluation. Perhaps my attention drifted elsewhere at a key moment.

The speaker closed his talk by showing a dense slide and saying, "Here is a list of buzzwords, some of which I said in my talk and some of which I didn't say in my talk." That made me laugh: a summary of a talk he may or may not have given. That seemed like a great way to end a talk about guessing.

Akka

I don't know much about the details of Akka. Many of my Scala-hacking former students talk about it every so often, so I figured I'd listen to this quick tour and pick up a little more. The underlying idea, of course, is Hewitt's Actor model. This is something I'm familiar with from my days in AI and my interest in Smalltalk.

The presenter, Akka creator Jonas Boner, reminded the audience that Actors were a strong influence on the original Smalltalk. In many ways, it is truer to Kay's vision of OOP than the languages we use today.

This talk was a decent introduction to Hewitt's idea and its implementation in Akka. My two favorite things from the talk weren't technical details, but word play:

  • The name "Akka" has many inspirations, including a mountain in northern Sweden, a goddess of the indigenous people of northern Scandinavia, and a palindrome of Actor Kernel / Kernel Actor.

  • Out of context, this quote made the talk for me:
    We have made some optimizations to random.
    Ah, aren't we all looking for those?

Expressiveness and Abstraction

This talk by Ola Bini was a personal meditation on the expressiveness of language. Bini, whose first slide listed him as a "computational metalinguist", started from the idea that, informally, the expressiveness of a language is inversely proportional to the distance between our thoughts and the code we have to write in that language.

In the middle part of the talk, he considered a number of aspects of expressiveness and abstraction. In the latter part, he listed ideas from natural language and wondered aloud what their equivalents would be in programming languages, among them similes, metaphors, repetition, elaboration, and multiple equivalent expressions with different connotations.

During this part of the talk, my mind wandered, too, to a blog entry I wrote about parts of speech in programming languages back in 2003, and a talk by Crista Lopes at OOPSLA that year. Nouns, verbs, pronouns, adjectives, and adverbs -- these are terms I use metaphorically when teaching students about new languages. Then I thought about different kinds of sentence -- declarative, interrogative, imperative, and exclamatory -- and began to think about their metaphorical appearances in our programming languages.

Another fitting way for a talk to end: my mind wondering at the end of a wondering talk.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 26, 2012 8:12 PM

StrangeLoop 3: Functional Programming 1 -- Monads and Patterns

The StrangeLoop program had a fair amount of functional programming talks, and I availed myself of two to complete the first morning of the conference.

Monad Examples for Normal People

The web is full of tutorials claiming to explain monads in a way that anyone can understand. If any of them had succeeded, we wouldn't need another. How could I not attend a talk claiming to slay this dragon?

Dustin Getz began with a traditional starting point: a sequence of operations that can be viewed as a composition of functions. That works great for standard business logic. But consider a common exceptional case: given a name, looking up an account number can fail. This requires us to break the symmetry of the code with guards, and the guards break composition, because now the return type of the function doesn't fit.

The Maybe monad factors these guards out of the business logic. If we further need to record and capture error codes, we can use the Error monad, which factors the same sort of plumbing out of the business logic and also serves as a facade for a tuple of value and error code.
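A sketch of the idea in Python, with hypothetical lookups standing in for the talk's examples:

    # Maybe-style bind: the guard lives here, once, instead of in every function
    def bind(value, f):
        return None if value is None else f(value)

    def account_for(name):       # a lookup that may fail...
        return {"alice": 42}.get(name)

    def balance_of(account):     # ...composed with another that may fail
        return {42: 100.0}.get(account)

    print(bind(bind("alice", account_for), balance_of))   # 100.0
    print(bind(bind("bob", account_for), balance_of))     # None -- no guards

The business logic reads as a straight pipeline; the guard is written once, in bind.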

After these simple examples, the speaker dove into the sort of exercise that tends to lose the interest of programmers in the trenches building apps: building a Lisp interpreter in Python, using monads to compose the elements of the interpreter. The environment consists of a combination of the reader monad and the writer monad; the interpreter consists of a combination of the environment and the error monad. Several other monads play a role in representing values, built-in procedures, state, and non-standard control operators. An interpreter is, he said, "monad heaven".

The best part of this talk's message was in viewing a monad as a design pattern that abstracts repetitive plumbing out of applications in a way that preserves function composition.

After the talk, someone asked a question to the effect, "I get by fine with macros and higher-order functions. When do I need monads?" Getz answered from his personal experience: monads enable him to write more elegant code, by factoring repetition that other tools could not reach as nicely.

This wasn't the monad explanation to end the need for new monad explanations, but it was a decent effort. With Getz's focus on factoring code and the question mentioning macros, I could not help but think of this essay that presents monads in the light of code transformation, and Brian Marick's approach of treating a monad as a macro. Perhaps we are getting near a metaphor for monads that will help "normal people" grok them without resorting to abstract math.

Functional Design Patterns

From the moment I saw the StrangeLoop line-up, I was excited to see Stuart Sierra speak on functional design patterns. Sierra is one of the few prominent people in the Lisp and Clojure worlds to acknowledge the value of design patterns in functional style -- heck, even to acknowledge they exist.

He opened his talk in a way that confronted the view commonly held among Lispers. He conceded that, for many, "design pattern" is a loaded term, bringing to mind an OO cult and the ominous voice of the Gang of Four. The thing is, Sierra said, Design Patterns is a pretty good book, given the time it was written and the programming language community to which it speaks. However, in the functional language community, the design patterns in that book are best known for being debunked by Peter Norvig in a 1998 tutorial.

Sierra reminded this audience that patterns can occur at all levels of a program. He pointed to a lower-profile patterns book of the mid-1990s, Pattern-Oriented Software Architecture (now a five-volume series), which organized patterns at multiple levels:

  • architectural -- across components
  • design -- within components
  • idiom -- within a particular language

Sierra then went on to list, and describe briefly, several common patterns he has noticed in functional programs and used himself in Clojure. Like POSA, he organized them into categories. Before proceeding, he admitted to any Haskell programmers in the room that, yes, many of these patterns are monadic in nature.

I'd very much like to write about some of Sierra's patterns in greater detail than a single entry permits, including providing links to blog entries he and others have written about them. For now, let me list the ones I jotted down, in Sierra's categories:

  • state patterns
    • State/Event, aka Event Sourcing
    • Consequences

  • data-building patterns
    • Accumulator
    • Reduce/Combine
    • Recursive Expansion

  • control flow patterns
    • Pipeline
    • Wrapper
    • Token
    • Observer
    • Strategy

Before describing Reduce/Combine, Sierra took a short digression to talk about MapReduce, a pattern accepted by many in the world of big data. He reminded us that this pattern is predicated on the spinning disk becoming the bottleneck of our system. In the future, this pattern will become less useful as other forces come to dominate our system architecture.
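As a rough illustration of the Reduce/Combine shape -- my reading of the pattern, not Sierra's code -- process independent chunks, then combine the partial results:

    from functools import reduce

    chunks = [[1, 2, 3], [4, 5], [6, 7, 8]]

    # reduce each chunk independently -- these could run in parallel
    partials = [sum(chunk) for chunk in chunks]

    # then combine the partial results
    total = reduce(lambda a, b: a + b, partials, 0)
    print(total)                 # 36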

Two of the control flow patterns, Observer and Strategy, are held in common with the GoF catalog, though in the context of functional programming a few new variations become more obvious. It also seemed to me that Sierra's Wrapper is a lot like the GoF's Decorator, though he did not make an explicit connection.

As I wrote a couple of years ago, the time is right for functional design patterns. I'm so glad that Sierra has been documenting patterns of this sort and articulating the value of thinking in terms of patterns. The key is not to look to OO programs for patterns of value to functional programmers, but to look at functional programs for recurring choices among design alternatives. (It's not too surprising that many OO design patterns don't mean much to functional programmers, just as it's not surprising that FP patterns dealing with, say, state are afterthoughts in the OO world.)

I look forward to reviewing Sierra's slides, available at the StrangeLoop 2012 GitHub repo, and coming back to the topic of functional design patterns soon.

The first day of StrangeLoop was off to a great start.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

September 25, 2012 8:35 PM

StrangeLoop 1: A Miscellany of Ideas

For my lunch break, I walked a bit outside, to see the sun and bend my knee a bit. I came back for a set of talks without an obvious common thread. After seeing the talks, I saw a theme: ideas for writing programs more conveniently or more concisely.

ClojureScript

David Nolen talked about ClojureScript, a Clojure-like language that compiles to JavaScript. As he noted, there is a lot of work in this space, both older and newer. The goal of all that work is to write JavaScript more conveniently, or to generate it from something else. The goal of ClojureScript is to bring the expressiveness and flexible programming style of the Lisp world to the JS world. Nolen's talk gave us some insights into the work being done to make the compiler produce efficient JavaScript, as well as into why you might use ClojureScript in the first place.

Data Structures and Hidden Code

The message of this talk by Scott Vokes is that your choice in data structures plays a big role in determining how much code you have to write. You can make a lot of code disappear by using more powerful data structures. We can, of course, generalize this claim from data structures to data. This is the theme of functional and object-oriented programming, too. This talk highlights how often we forget the lowly data structure when we think of writing less code.

As Vokes said, your choice in data structures sets the "path of least resistance" for what your program will do and also for the code you will write. When you start writing code, you often don't know what the best data structure for your application is. As long as you don't paint yourself into a corner, you should be able to swap a new structure in for the old. The key to this is something novice programmers learn early, writing code not in terms of a data structure but in terms of higher-level behaviors. Primitive obsession can become implementation obsession if you aren't careful.
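A minimal sketch of that discipline in Python, with invented names:

    class TaskQueue:
        # today a list; tomorrow a heap or a skip list, with no caller changes
        def __init__(self):
            self._items = []

        def add(self, task):
            self._items.append(task)

        def next_task(self):
            return self._items.pop(0)

Clients write queue.add(t) and queue.next_task(); none of them ever mentions the list.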

The meat of this talk was a quick review of four data structures that most programmers don't learn in school: skip lists, difference lists, rolling hashes, and jumpropes, a structure Vokes claims to have invented.

This talk was a source of several good quotes, including

  • "A data structure is just a stupid programming language." -- Bill Gosper
  • "A data structure is just a virtual machine." -- Vokes himself, responding to Gosper
  • "The cheapest, fastest, and most reliable components are those that aren't there." -- Gordon Bell

The first two quotes there would make nice mottos for a debate between functional and OO programming. They also are two sides of the same coin, which destroys the premise of the debate.

miniKanren

As a Scheme programmer and a teacher of programming languages, I have developed great respect and fondness for the work of Dan Friedman over the last fifteen years. As a computer programmer who began his studies deeply interested in AI, I have long had a fondness for Prolog. How could I not go to the talk on miniKanren? This is a small implementation (~600 lines written in a subset of Scheme) of Kanren, a declarative logic programming system described in The Reasoned Schemer.

This talk was like a tag-team vaudeville act featuring Friedman and co-author William Byrd. I can't do this talk justice in a blog entry. Friedman and Byrd interleaved code demos with exposition as they

  • showed miniKanren at its simplest, built from three operators (fresh, conde, and run)
  • extended the language with a few convenient operators for specifying constraints, types, and exclusions, and
  • illustrated how to program in miniKanren by building a language interpreter, EOPL style.

The cool endpoint of using logic programming to build the interpreter is that, by using variables in a specification, the interpreter produces legal programs that meet a given specification. It generates code via constraint resolution.

If that weren't enough, they also demo'ed how their system can, given a language grammar, produce quines -- programs p such that

    (equal p (eval p))
-- and twines, pairs of programs p and q such that
    (and (equal p (eval q))
         (equal q (eval p)))

Then they live-coded an implementation of typed lambda calculus.

Yes, all in fifty minutes. Like I said, you really need to watch the talk at InfoQ as soon as it's posted.

In the course of giving the talk, Friedman stated a rule that my students can use:

Will's law: If your function has a recursion, do the recursion last.

Will followed up with cautionary advice:

Will's second law: If your function has two recursions, call Will.

We'll see how serious he was when I put a link to his e-mail address in my Programming Languages class notes next spring.
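Will's law is, of course, the shape of tail recursion. A small Python sketch of the contrast (Scheme would actually eliminate the tail call; Python merely keeps the shape visible):

    # recursion not last: the 1 + happens after the recursive call returns
    def length(xs):
        return 0 if not xs else 1 + length(xs[1:])

    # recursion last: the pending work rides along in an accumulator instead
    def length_acc(xs, acc=0):
        return acc if not xs else length_acc(xs[1:], acc + 1)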


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 25, 2012 7:31 PM

Blogging from StrangeLoop


This week I have the pleasure of spending a couple of days expanding my mind at StrangeLoop 2012. I like StrangeLoop because it's a conference for programmers. The program is filled with hot programming topics and languages, plus a few keynotes to punctuate our mental equilibria. The 2010 conference gave me plenty to think about, but I had to skip 2011 while teaching and recovering. This year was a must-see.

I'll be posting entries from the conference as time permits me to write them.

You can find links to other write-ups of the conference, as well as slides from some talks and other material, at the StrangeLoop 2012 github site.

Now that the conference has ended, I can say with confidence that StrangeLoop 2012 was even better than StrangeLoop 2010.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 20, 2012 8:09 PM

Computer Science is a Liberal Art

Over the summer, I gave a talk as part of a one-day conference on the STEM disciplines for area K-12, community college, and university advisors. They were interested in, among other things, the kind of classes that CS students take at the university and the kind of jobs they get when they graduate.

In the course of talking about how some of the courses our students take (say, algorithms and the theory of computing) seem rather disconnected from many of the jobs they get (say, web programmer and business analyst), I claimed that the more abstract courses prepare students to understand the parts of the computing world that never change, and the ones that do. The specific programming languages or development stack they use after they graduate to build financial reporting software may change occasionally, but the foundation they get as a CS major prepares them to understand what comes next and to adapt quickly.

In this respect, I said, a university CS education is not job training. Computer Science is a liberal art.

This is certainly true when you compare university CS education with what students get at a community college. Students who come out of a community college networking program often possess specific marketable skills at a level we are hard-pressed to meet in a university program. We bank our program's value on how well it prepares students for a career in which networking infrastructure changes multiple times and our grads are asked to work at the intersection of networks and other areas of computing, some of which may not exist yet.

It is also true relative to the industries they enter after graduation. A CS education provides a set of basic skills and, more important, several ways to think about problems and formulate solutions. Again, students who come out of a targeted industry or 2-year college training program in, say, web dev, often have "shovel ready" skills that are valuable in industry and thus highly marketable. We bank our program's value on how well it prepares students for a career in which ASP turns to JSP turns to PHP turns to JavaScript. Our students should be prepared to ramp up quickly and have a shovel in their hands producing value soon.

And, yes, students in a CS program must learn to write code. That's a basic skill. I often hear people comment that computer science programs do not prepare students well for careers in software development. I'm not sure that's true, at least at schools like mine. We can't get away with teaching all theory and abstraction; our students have to get jobs. We don't try to teach them everything they need to know to be good software developers, or even many particular somethings. That should and will come on the job. I want my students to be prepared for whatever they encounter. If their company decides to go deep with Scala, I'd like my former students to be ready to go with them.

In a comment on John Cook's timely blog entry How long will there be computer science departments?, Daniel Lemire suggests that we emulate the model of medical education, in which doctors serve several years in residency, working closely with experienced doctors and learning the profession deeply. I agree. Remember, though, that aspiring doctors go to school for many years before they start residency. In school, they study biology, chemistry, anatomy, and physiology -- the basic science at the foundation of their profession. That study prepares them to understand medicine at a much deeper level than they otherwise might. That's the role CS should play for software developers.

(Lemire also smartly points out that programmers have the ability to do residency almost any time they like, by joining an open source project. I love to read about how Dave Humphrey and people like him bring open-source apprenticeship directly into the undergrad CS experience and wonder how we might do something similar here.)

So, my claim that Computer Science is a liberal arts program for software developers may be crazy, but it's not entirely crazy. I am willing to go even further. I think it's reasonable to consider Computer Science as part of the liberal arts for everyone.

I'm certainly not the first person to say this. In 2010, Doug Baldwin and Alyce Brady wrote a guest editors' introduction to a special issue of the ACM Transactions on Computing Education called Computer Science in the Liberal Arts. In it, they say:

In late Roman and early medieval times, seven fields of study, rooted in classical Greek learning, became canonized as the "artes liberales" [Wagner 1983], a phrase denoting the knowledge and intellectual skills appropriate for citizens free from the need to labor at the behest of others. Such citizens had ample leisure time in which to pursue their own interests, but were also (ideally) civic, economic, or moral leaders of society.

...

[Today] people ... are increasingly thinking in terms of the processes by which things happen and the information that describes those processes and their results -- as a computer scientist would put it, in terms of algorithms and data. This transformation is evident in the explosion of activity in computational branches of the natural and social sciences, in recent attention to "business processes," in emerging interest in "digital humanities," etc. As the transformation proceeds, an adequate education for any aspect of life demands some acquaintance with such fundamental computer science concepts as algorithms, information, and the capabilities and limitations of both.

The real value in a traditional Liberal Arts education is in helping us find better ways to live, to expose us to the best thoughts of men and women in hopes that we choose a way to live, rather than have history or accident choose a way to live for us. Computer science, like mathematics, can play a valuable role in helping students connect with their best aspirations. In this sense, I am comfortable at least entertaining the idea that CS is one of the modern liberal arts.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

September 19, 2012 4:57 PM

Don't Stop The Car

I'm not a Pomodoro guy, but this advice from The Timer Knows Best applies more generally:

Last month I was teaching my wife to drive [a manual transmission car], and it's amazing how easy stick shifting is if the car is already moving.... However, when the car is stopped and you need to get into 1st gear, it's extremely difficult. [So many things can go wrong:] too little gas, too much clutch, etc. ...

The same is true with the work day. Once you get going, you want to avoid coming to a standstill and having to get yourself moving again.

As I make the move from runner to cyclist, I have learned how much easier it is to keep moving on a bike than it is to start moving.

This is true of programming, too. Test-driven development helps us get started by encouraging us to focus on one new piece of functionality to implement. Keep it small, make it work, and move on to another small step. Pretty soon you are moving, and you are on your way.

Another technique many programmers use to get started is to end the previous day by coding a failing test. The failing test focuses you quickly and recruits your own memory to help recreate the feeling of motion. It's like a way to leave the car running in second gear.
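For instance, in pytest style -- a hypothetical example, with parse_int standing in for whatever you were building:

    # the last edit of the day: a failing test that names tomorrow's first task
    def test_parses_negative_numbers():
        assert parse_int("-3") == -3     # parse_int doesn't handle '-' yet

The next morning, running the suite drops you straight back into second gear.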

I'm trying to help my students, who are mostly still learning how to write code, learn how to get started when they program. Many of them seem repeatedly to find themselves sitting still, grinding their gears and trying to figure out how to write the next bit of code and get it running. Ultimately, the answer may come down to the same thing we learn when we learn to drive a stick: practice, practice, practice, and eventually you get the feel of how the gearshift works.


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development

September 05, 2012 5:24 PM

Living with the Masters

I sometimes feel guilty that most of what I write here describes connections between teaching or software development and what I see in other parts of the world. These connections are valuable to me, though, and writing them down is valuable in another way.

I'm certainly not alone. In Why Read?, Mark Edmundson argues for the value of reading great literature and trying on the authors' views of the world. Doing so enables us to better understand our own view of the world. It also gives us the raw material out of which to change our worldview, or build a new one, when we encounter better ideas. In the chapter "Discipline", Edmundson writes:

The kind of reading that I have been describing here -- the individual quest for what truth a work reveals -- is fit for virtually all significant forms of creation. We can seek vital options in any number of places. They may be found for this or that individual in painting, in music, in sculpture, in the arts of furniture making or gardening. Thoreau felt he could derive a substantial wisdom by tending his bean field. He aspired to "know beans". He hoed for sustenance, as he tells us, but he also hoed in search of tropes, comparisons between what happened in the garden and what happened elsewhere in the world. In his bean field, Thoreau sought ways to turn language -- and life -- away from old stabilities.

I hope that some of my tropes are valuable to you.

The way Edmundson writes of literature and the liberal arts applies to the world of software in much more direct ways, too. First, there is the research literature of computing and software development. One can seek truth in the work of Alan Kay, David Ungar, Ward Cunningham, or Kent Beck. One can find vital options in the life's work of Robert Floyd, Peter Landin, or Alan Turing; Herbert Simon, Marvin Minsky, or John McCarthy. I spent much of my time in grad school immersed in the writings and work of B. Chandrasekaran, which affected my view of intelligence in both humans and machines.

Each of these people offers a particular view into a particular part of the computing world. Trying out their worldviews can help us articulate our own worldviews better, and in the process of living their truths we sometimes find important new truths for ourselves.

We in computing need not limit ourselves to the study of research papers and books. As Edmundson says, the individual quest for the truth revealed in a work "is fit for virtually all significant forms of creation". Software is a significant form of creation, one not available to our ancestors even sixty years ago. Live inside any non-trivial piece of software for a while, especially one that has withstood the buffets of human desire over a period of time, and you will encounter truth -- truths you find there, and truths you create for yourself. A few months trying on Smalltalk and its peculiar view of the world taught me OOP and a whole lot more.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development, Teaching and Learning

September 01, 2012 10:18 AM

Making Assumptions


Patrick Honner has been writing a series of blog posts reviewing problems from the June 2012 New York State Math Regents exams. A recent entry considered a problem in which students were asked to compute the probability that a dart hits the bull's eye on a dartboard. This question requires the student to make a specific assumption: "that every point on the target is equally likely to be hit". Honner writes:

... It's not necessarily bad that we make such assumptions: refining and simplifying problems so they can be more easily analyzed is a crucial part of mathematical modeling and problem solving.

What's unfortunate is that, in practice, students are kept outside this decision-making process: how and why we make such assumptions isn't emphasized, which is a shame, because exploring such assumptions is a fundamental mathematical process.

The same kinds of assumptions are built into even the most realistic problems that we set before our students. But discussing assumptions is an essential part of doing math. Which assumptions are reasonable? Which are necessary? What is the effect of a particular assumption on the meaning of the problem, on the value of the answer we will obtain? This kind of reasoning is, in many ways, the real math in a problem. Once we have a formula or two, we are down to crunching numbers. That's arithmetic.

Computer science teachers face the same risks when we pose problems to our students, including programming problems. Discovering the boundaries of a problem and dealing with the messy details that live on the fringe are an essential part of making software. When we create assignments that can be neatly solved in a week or two, we hide "a fundamental computing process" from our students. We also rob them of a lot of fun.

As Honner says, though, making assumptions is not necessarily bad. In the context of teaching a course, they are necessary. Sometimes, we need to focus our students' attention on a specific new skill to be learned or honed. Tidying up the boundaries of a problem brings that skill into greater relief and eliminates what are, at the moment, unnecessary distractions.

It is important, though, for a computing curriculum to offer students increasing opportunities to confront the assumptions we make and begin to make assumptions for themselves. That level of modeling is also a specific skill to be learned and honed. It also can make class more fun for the professor, if a lot messier when it comes time to evaluate student work and assign grades.

Even when we have to make assumptions prior to assigning a problem, discussing them explicitly with students can open their eyes to the rest of the complexity in making software. Besides, some students already sense or know that we are hiding details from them, and having the discussion is a way to honor their knowledge -- and earn their respect.

So, the next time you assign a problem, ask yourself: What assumptions have I made in simplifying this problem? Are they necessary? If not, can I loosen them? If yes, can my students benefit from discussing them?

And be prepared... If you leave a few messy assumptions lying around a problem for your students to confront and make on their own, some students will be unhappy with you. As Honner says, we teachers spend a lot of time training students to make implicit assumptions unthinkingly. In some ways, we are too successful for our own good.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

August 31, 2012 3:22 PM

Two Weeks Along the Road to OOP

The month has flown by, preparing for and now teaching our "intermediate computing" course. Add to that a strange and unusual set of administrative issues, and I've found no time to blog. I did, however, manage to post what has become my most-retweeted tweet ever:

I wish I had enough money to run Oracle instead of Postgres. I'd still run Postgres, but I'd have a lot of cash.

That's an adaptation of a tweet originated by @petdance and retweeted my way by @logosity. I polished it up, sent it off, and -- it took off for the sky. It's been fun watching its ebb and flow, as it reaches new sub-networks of people. From this experience I must learn at least one lesson: a lot of people are tired of sending money to Oracle.

The first two weeks of my course have led the students a few small steps toward object-oriented programming. I am letting the course evolve, with a few guiding ideas but no hard-and-fast plan. I'll write about the course's structure after I have a better view of it. For now, I can summarize the first four class sessions:

  1. Run a simple "memo pad" app, trying to identify behavior (functions) and state (persistent data). Discuss how different groupings of the functions and data might help us to localize change.
  2. Look at the code for the app. Discuss the organization of the functions and data. See a couple of basic design patterns, in particular the separation of model and view. (A small sketch of that separation follows this list.)
  3. Study the code in greater detail, with a focus on the high-level structure of an OO program in Java.
  4. Study the code in greater detail, with a focus on the lower-level structure of classes and methods in Java.
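Here is a minimal sketch of the model/view separation from session 2, written in Ruby -- my own toy version, not the course's Java code:

     class MemoPad                      # the model: state plus behavior
       def initialize
         @memos = []
       end

       def add(text)
         @memos << text
       end

       def each_memo(&block)
         @memos.each(&block)
       end
     end

     class ConsoleView                  # the view: rendering only
       def initialize(pad)
         @pad = pad
       end

       def render
         @pad.each_memo { |memo| puts "* #{memo}" }
       end
     end

     pad = MemoPad.new
     pad.add("grade Homework 1")
     ConsoleView.new(pad).render

Because MemoPad knows nothing about rendering, swapping in a GUI view changes nothing in the model -- exactly the localization of change we discussed in session 1.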

The reason we can spend so much time talking about a simple program is that students come to the course without (necessarily) knowing any Java. Most come with knowledge of Python or Ada, and their experiences with such different languages create an interesting space in which to encounter Java. Our goal this semester is for students to learn their second language as much as possible on their own, rather than having me "teach" it to them. I'm trying to expose them to a little more of the language each day, as we learn about design in parallel. This approach works reasonably well with Scheme and functional programming in a programming languages course. I'll have to see how well it works for Java and OOP, and adjust accordingly.

Next week we will begin to create things: classes, then small systems of classes. Homework 1 has them implementing a simple array-based class to an interface. It will be our first experience with polymorphic objects, though I plan to save that jargon for later in the course.

Finally, this is the new world of education: my students are sending me links to on-line sites and videos that have helped them learn programming. They want me to check them out and share them with the other students. Today I received a link to The New Boston, which has among its 2500+ videos eighty-seven beginning Java and fifty-nine intermediate Java titles. Perhaps we'll come to a time when I can out-source all instruction on specific languages and focus class time on higher-level issues of design and programming...


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development, Teaching and Learning

August 13, 2012 3:56 PM

Lessons from Unix for OO Design

Pike and Kernighan's Program Design in the UNIX Environment includes several ideas I would like for my students to learn in my Intermediate Computing course this fall. Among them:

... providing the function in a separate program makes convenient options ... easier to invent, because it isolates the problem as well as the solution.

In OO, objects are the packages that create possibilities for us. The beauty of this lesson is the justification: because a class isolates the problem as well as the solution.

This solution affects no other programs, but can be used with all of them.

This is one of the great advantages of polymorphic objects.

The key to problem-solving on the UNIX system is to identify the right primitive operations and to put them at the right place.

Methods should live in the objects whose data they manipulate. One of the hard lessons for novice OO programmers coming from a procedural background is putting methods with the thing, not a faux actor.

UNIX programs tend to solve general problems rather than special cases.

Objects that are too specific should be rare, at least for beginning programmers. Specificity in interface often indicates that implementation detail is leaking out.

Merely adding features does not make it easier for users to do things -- it just makes the manual thicker.

Keep objects small and focused. A big interface is often evidence of an object waiting to be born.
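To make that last lesson concrete, here is a small Ruby illustration -- my example, not one from the paper. When address-handling methods pile up on a Customer class, the cluster is an object waiting to be born:

     class Address
       attr_reader :street, :city, :zip

       def initialize(street, city, zip)
         @street, @city, @zip = street, city, zip
       end

       def to_s
         "#{street}, #{city} #{zip}"
       end

       def same_city_as?(other)
         city == other.city
       end
     end

     class Customer
       attr_reader :name, :address     # address behavior now lives in Address

       def initialize(name, address)
         @name, @address = name, address
       end
     end

     home = Address.new("123 Oak St.", "Cedar Falls", "50613")
     puts Customer.new("Eugene", home).address

Both resulting interfaces are smaller and more focused than the one fat Customer they replace.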

~~~~

In many ways, The Unix Way is contrary to object-oriented programming. Or so many of my Linux friends tell me. But I'm quite comfortable with the parallels found in these quotes, because they are more about good design in general than about Unix or OOP themselves.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

July 23, 2012 3:14 PM

Letting Go of Old Strengths

Ward Cunningham commented on what it's like to be "an old guy who's still a programmer" in his recent Dr. Dobb's interview:

A lot of people think that you can't be old and be good, and that's not true. You just have to be willing to let go of the strengths that you had a year ago and get some new strengths this year. Because it does change fast, and if you're not willing to do that, then you're not really able to be a programmer.

That made me think of the last comment I made in my posts on JRubyConf:

There is a lot of stuff I don't know. I won't run out of things to read and learn and do for a long, long time.

This is an ongoing theme in the life of a programmer, in the life of a teacher, and in the life of an academic: the choice we make each day between keeping up and settling down. Keeping up is a lot more fun, but it's work. If you aren't comfortable giving up what you were awesome at yesterday, it's even more painful. I've mostly been lucky to enjoy learning new stuff more than I've enjoyed knowing the old stuff. May you be so lucky.


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development, Teaching and Learning

July 16, 2012 3:02 PM

Refactoring Everywhere: In Code and In Text

Charlie Stross is a sci-fi writer. Some of my friends have recommended his fiction, but I've not read any. In Writing a novel in Scrivener: lessons learned, he, well, describes what he has learned writing novels using Scrivener, an app for writers well known in the Mac OS X world.

I've used it before on several novels, notably ones where the plot got so gnarly and tangled up that I badly needed a tool for refactoring plot strands, but the novel I've finished, "Neptune's Brood", is the first one that was written from start to finish in Scrivener...

... It doesn't completely replace the word processor in my workflow, but it relegates it to a markup and proofing tool rather than being a central element of the process of creating a book. And that's about as major a change as the author's job has undergone since WYSIWYG word processing came along in the late 80s....

My suspicion is that if this sort of tool spreads, the long-term result may be better structured novels with fewer dangling plot threads and internal inconsistencies. But time will tell.

Stross's lessons don't all revolve around refactoring, but being able to manage and manipulate the structure of the evolving novel seems central to his satisfaction.

I've read a lot of novels that seemed like they could have used a little refactoring. I always figured it was just me.

The experience of writing anything in long form can probably be improved by a good refactoring tool. I know I find myself doing some pretty large refactorings when I'm working on the set of lecture notes for a course.

Programmers and computer scientists have the advantage of being more comfortable writing text as code, using tools such as LaTeX and Scribble, or homegrown systems. My sense, though, is that fewer programmers use tools like this, at least at full power, than might benefit from doing so.

Like Stross, I have a predisposition against using tools with proprietary data formats. I've never lost data stored in plaintext to version creep or application obsolescence. I do use apps such as VoodooPad for specific tasks, though I am keenly aware of the exit strategy (export to text or RTFD) and the pain trade-off at exit (the more VoodooPad docs I create, the more docs I have to remember to export before losing access to the app). One of the things I like most about MacJournal is that it's nothing but a veneer over a set of Unix directories and RTF documents. The flip side is that it can't do for me nearly what Scrivener can do.

Thinking about a prose writing tool that supports refactoring raises an obvious question: what sort of refactoring operations might it provide automatically? Some of the standard code refactorings might have natural analogues in writing, such as Extract Chapter or Inline Digression.

Thinking about automated support for refactoring raises another obvious question, the importance of which is surely as clear to novelists as to software developers: Where are the unit tests? How will we know we haven't broken the story?

I'm not being facetious. The biggest fear I have when I refactor a module of a course I teach is that I will break something somewhere down the line in the course. Your advice is welcome!


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development, Teaching and Learning

July 11, 2012 2:45 PM

A Few Comments on the Alan Kay Interview, and Especially Patterns

Alan Kay

Many of my friends and colleagues on Twitter today are discussing the Interview with Alan Kay posted by Dr. Dobb's yesterday. I read the piece this morning while riding the exercise bike and could not contain my desire to underline passages, star paragraphs, and mark it up with my own comments. That's hard to do while riding hard, hurting a little, and perspiring a lot. My desire propelled me forward in the face of all these obstacles.

Kay is always provocative, and in this interview he leaves no oxen ungored. As most people do whenever they read outrageous and provocative claims, I cheered when Kay said something I agreed with and hissed -- or blushed -- when he said something that gored me or one of my pet oxen. Twitter is a natural place to share one's cheers and boos for an article with or by Alan Kay, given the amazing density of soundbites one finds in his comments about the world of computing.

(One might say the same thing about Brian Foote, the source of both soundbites in that paragraph.)

I won't air all my cheers and hisses here. Read the article, if you haven't already, and enjoy your own. I will comment on one paragraph that didn't quite make me blush:

The most disastrous thing about programming -- to pick one of the 10 most disastrous things about programming -- there's a very popular movement based on pattern languages. When Christopher Alexander first did that in architecture, he was looking at 2,000 years of ways that humans have made themselves comfortable. So there was actually something to it, because he was dealing with a genome that hasn't changed that much. I think he got a few hundred valuable patterns out of it. But the bug in trying to do that in computing is the assumption that we know anything at all about programming. So extracting patterns from today's programming practices ennobles them in a way they don't deserve. It actually gives them more cachet.

Long-time Knowing and Doing readers know that patterns are one of my pet oxen, so it would have been natural for me to react somewhat as Keith Ray did and chide Kay for what appears to be a typical "Hey, kids, get off my lawn!" attitude. But that's not my style, and I'm such a big fan of Kay's larger vision for computing that my first reaction was to feel a little sheepish. Have I been wasting my time on a bad idea, distracting myself from something more important? I puzzled over this all morning, and especially as I read other people's reactions to the interview.

Ultimately, I think that Kay is too pessimistic when he says we hardly know anything at all about programming. We may well be closer to the level of the Egyptians who built the pyramids than we are to the engineers who built the Empire State Building. But I simply don't believe that people such as Ward Cunningham, Ralph Johnson, and Martin Fowler don't have a lot to teach most of us about how to make better software.

Wherever we are, I think it's useful to identify, describe, and catalog the patterns we see in our software. Doing so enables us to talk about our code at a level higher than parentheses and semicolons. It helps us bring other programmers up to speed more quickly, so that we don't all have to struggle through all the same detours and tar pits our forebears struggled through. It also makes it possible for us to talk about the strengths and weaknesses of our current patterns and to seek out better ideas and to adopt -- or design -- more powerful languages. These are themes Kay himself expresses in this very same interview: the importance of knowing our history, of making more powerful languages, and of education.

Kay says something about education in this interview that is relevant to the conversation on patterns:

Education is a double-edged sword. You have to start where people are, but if you stay there, you're not educating.

The real bug in what he says about patterns lies at one edge of the sword. We may not know very much about how to make software yet, but if we want to remedy that, we need to start where people are. Most software patterns are an effort to reach programmers who work in the trenches, to teach them a little of what we do know about how to make software. I can yammer on all I want about functional programming. If a Java practitioner doesn't appreciate the idea of a Value Object yet, then my words are likely wasted.

Ward Cunningham

Ironically, many argue that the biggest disappointment of the software patterns effort lies at the other edge of education's sword: an inability to move the programming world quickly enough from where it was in the mid-1990s to a better place. In his own Dr. Dobb's interview, Ward Cunningham observed with a hint of sadness that an unexpected effect of the Gang of Four Design Patterns book was to extend the life of C++ by a decade, rather than reinvigorating Smalltalk (or turning people on to Lisp). Changing the mindset of a large community takes time. Many in the software patterns community tried to move people past a static view of OO design embodied in the GoF book, but the vocabulary calcified more quickly than they could respond.

Perhaps that is all Kay meant by his criticism that patterns "ennoble" practices in a way they don't deserve. But if so, it hardly qualifies in my mind as "one of the 10 most disastrous things about programming". I can think of a lot worse.

Kurt Vonnegut

To all this, I can only echo the Bokononists in Kurt Vonnegut's novel Cat's Cradle: "Busy, busy, busy." The machinery of life is usually more complicated and unpredictable than we expect or prefer. As a result, reasonable efforts don't always turn out as we intend them to. So it goes. I don't think that means we should stop trying.

Don't let my hissing about one paragraph in the interview dissuade you from reading the Dr. Dobb's interview. As usual, Kay stimulates our minds and encourages us to do better.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

July 06, 2012 4:19 PM

Assume a Good Compiler and Write Readable Code

Greg Robbins and Ron Avitzur, the authors of MacOS's original Graphing Calculator, offer nine tips for Designing Applications for the Power Macintosh. All of them are useful whatever your target machine. One of my favorites is:

5. Avoid programming cleverness. Instead, assume a good compiler and write readable code.

This is good programming advice in nearly every situation, for all the software engineering reasons we know. Perhaps surprisingly, it is good advice even when you are writing code that has to be fast and small, as Robbins and Avitzur were:

Cycle-counting and compiler-specific optimizations are favorite pastimes of hackers, and sometimes they're important. But we could never have completed the Graphing Calculator in under six months had we worried about optimizing each routine. Rather, we dealt with speed problems only when they were perceptible to users.

We made no attempt to look at performance bottlenecks or at the compiled code of the Calculator until after running execution profiles. We were surprised where the time was being spent. Most of the time that the Calculator is compute-bound it's either in the math libraries or in QuickDraw. So little time is spent in our code that even compiling it unoptimized didn't slow it down perceptibly. Improving our code's performance meant calling the math libraries less often.

This has been my experience with every large program or set of programs I've written, too. I know where the code is spending its time. Then I run the profiler, and it shows me I'm wrong. Donald Knuth famously warned us against small efficiencies and premature optimization.
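Measuring is cheap enough that there's rarely an excuse to guess. A minimal Ruby sketch using the standard library's Benchmark -- my example, not the authors':

     require 'benchmark'

     # check an assumption about a hot spot before "optimizing" it
     n = 100_000
     Benchmark.bm(12) do |bm|
       bm.report("interpolate:") { n.times { |i| "item #{i}" } }
       bm.report("concatenate:") { n.times { |i| "item " + i.to_s } }
     end

More often than not, the numbers that come back point somewhere other than where I would have aimed.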

Robbins and Avitzur's advice also has a user-centered dimension.

Programmers are often tempted to spend time saving a few bytes or cycles or to fine-tune an algorithm. If the change isn't visible to users, however, the benefits may not extend beyond the programmer's satisfaction. When most of the code in an application is straightforward and readable, maintenance and improvements are easier to make. Those are changes that users will notice.

We write code for our users. Programmer satisfaction comes second. This passage reminds me of a lesson I internalized from the early days of extreme programming: At the end of the day, if you haven't added value for your customer, you haven't earned your day's pay.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

July 03, 2012 3:28 PM

A Little Zen, A Little Course Prep

I listened to about 3/4 of Zen and the Art of Motorcycle Maintenance on a long-ish drive recently. It's been a while since I've read the whole book, but I listen to it on tape once a year or so. It always gets my mind in the mood to think about learning to read, write, and debug programs.

This fall, I will be teaching our third course for the first time since I became head seven years ago. In that time, we changed the name of the course from "Object-Oriented Programming" to "Intermediate Computing". In many ways, the new name is an improvement. We want students in this course to learn a number of skills and tools in the service of writing larger programs. At a fundamental level, though, OOP remains the centerpiece of everything we do in the course.

As I listened to Pirsig make his way across the Great Plains, a few ideas stood out as I prepare to teach one of my favorite courses:

The importance of making your own thing, not just imitating others. This is always a challenge in programming courses, but for most people it is essential if we hope for students to maximize their learning. It underlies several other parts of Pirsig's zen and art, such as caring about our artifacts, and the desire to go beyond what something is to what it means.

The value of reading code, both good and bad. Even after only one year of programming, most students have begun to develop a nose for which is which, and nearly all have enough experience that they can figure out the difference with minimal interference from the instructor. If we can get them thinking about what features of a program make it good or bad, we can move on to the more important question: How can we write good programs? If we can get students to think about this, then they can see the "rules" we teach them for what they really are: guidelines, heuristics that point us in the direction of good code. They can learn the rules with an end in mind, and not as an end in themselves.

The value of grounding abstractions in everyday life. When we can ground our classwork in students' own experiences, they are better prepared to learn from it. Note that this may well involve undermining their naive ideas about how something works, or turning a conventional wisdom from their first year on its head. The key is to make what they see and do matter to them.

One idea remains fuzzy in my head but won't let me go. While defining the analytic method, Pirsig talks briefly about the difference between analysis by composition and analysis by function. Given that this course teaches object-oriented programming in Java, there are so many ways in which this distinction could matter: composition and inheritance, instance variables and methods, state and behavior. I'm not sure whether there is anything particularly useful in Pirsig's philosophical discussion of this, so I'll think some more about it.

I'm also thinking a bit about a non-Zen idea for the course: Mark Guzdial's method of worked examples and self-explanation. My courses usually include a few worked examples, but Mark has taken the idea to another level. More important, he pairs it with an explicit step in which students explain examples to themselves and others. This draws on results from research in CS education showing that learning and retention are improved when students explain something in their own words. I think this could be especially valuable in a course that asks students to learn a new style of writing code.

One final problem is on my mind right now, a more practical matter: a textbook for the course. When I last taught this course, I used Tim Budd's Understanding Object-Oriented Programming with Java. I have written in the past that I don't like textbooks much, but I always liked this book. I liked the previous multi-language incarnation of the book even more. Unfortunately, one of the purposes of this course is to have students learn Java reasonably well.

Also unfortunate is that Budd's OOP/Java book is now twelve years old. A lot has happened in the Java world in the meantime. Besides, as I found while looking for a compiler textbook last fall, the current asking price of over $120 seems steep -- especially for a CS textbook published in 2000!

So I persist in my quest. I'd love to find something that looks like it is from this century, perhaps even reflecting the impending evolution of the textbook we've all been anticipating. Short of that, I'm looking for a modern treatment of both OO principles and Java.

Of course, I'm a guy who still listens to books on tape, so take my sense of what's modern with a grain of salt.

As always, any pointers and suggestions are appreciated.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

June 28, 2012 4:13 PM

"Doing research is therefore writing software."

The lede from RA Manual: Notes on Writing Code, by Gentzkow and Shapiro:

Every step of every research project we do is written in code, from raw data to final paper. Doing research is therefore writing software.

The authors are economists at the University of Chicago. I have only skimmed the beginning of the paper, but I like what little I've seen. They take seriously the writing of computer programs.

  • "This document lays out some broad principles we should all follow."
  • "We encourage you to invest in reading more broadly about software craftsmanship, looking critically at your own code and that of your colleagues, and suggesting improvements or additions to the principles below."
  • "Apply these principles to every piece of code you check in without exception."
  • "You should also take the time to improve code you are modifying or extending even if you did not write the code yourself."

...every piece of code you check in... Source code management and version control? They are a couple of steps up on many CS professors and students.

Thanks to Tyler Cowen for the link.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

June 28, 2012 12:37 PM

What Big Software Needs

Unix guru Rob Pike, on "programming in the large":

There's this idea about "programming in the large" and somehow C++ and Java own that domain. I believe that's just a historical accident, or perhaps an industrial accident. But the widely held belief is that it has something to do with object-oriented design.

Big software needs methodology to be sure, but not nearly as much as it needs strong dependency management and clean interface abstraction and superb documentation tools, none of which is served well by C++ (although Java does noticeably better).

That is as succinct a summary as I've seen of what people need from a language in order to write and maintain large programs: strong dependency management, clean interface abstraction, and superb documentation tools. I think that individuals and small teams need them as much as large teams, but that you experience the pain of not having them much sooner when you work on larger teams.

the logo of the Go programming language

The quoted passage is from Less is exponentially more, the text of a talk he gave this month about the biggest surprise he experienced from the rollout of Go, the programming language he and several colleagues created at Google. He had expected Go to attract C and C++ programmers, because Go was designed to do the things that C++ is used for. Instead, it attracts programmers from Python and Ruby. I'm tempted to quote Pike's conclusion, because it's so succinct, but instead I'll let you read his blog post yourself.

It was interesting to read this talk the day after seeing Leo Meyerovich's blog post on the sociology of programming languages. After reading Pike's thoughts on the spread of Go, I'm more motivated to read the paper Meyerovich introduces, on the principles for programming language adoption.

Irrespective of the adoption question: Pike's talk has no code in it, yet it conveys the spirit of Go better than anything I had read before.

~~~~

Go logo comes courtesy of the project's open-source repository at Google Code.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

June 25, 2012 12:11 PM

Test-First Development and Implementation of Processing.js

While describing the lessons learned by the team that wrote Processing.js, Mike Kamermans talks about one of the benefits of writing tests before writing code:

The usual process, in which code is written and then test cases are written for that code, actually creates biased tests. Rather than testing whether or not your code does what it should do, according to the specification, you are only testing whether your code is bug-free. In Processing.js, we instead start by creating test cases based on what the functional requirements for some function or set of functions is, based on the documentation for it. With these unbiased tests, we can then write code that is functionally complete, rather than simply bug-free but possibly deficient.

When you implement a language processor, you can use the language specification as a primary guide. Testing your code efficiently, though, means translating the features of the spec into test cases -- code. When you port a language from one platform to another, you can usually use the documentation of the original implementation as a guide, too. The Processing.js team had the benefit that the original implementation also came with a large set of test cases. This allowed them to write code against tests from the beginning, and then write their own tests before writing code that went beyond the scope of the original.

The next time I teach our compiler course, I hope to do a better job getting students to write tests sooner, if not first, as a matter of habit. Perhaps I will seed the teams with a few tests to help them get started.
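Such a seed might look like the following -- a hypothetical sketch using Ruby's Test::Unit, with Scanner and next_token standing in for whatever names the course's spec uses. The point is that it is written from the spec, before any scanner code exists:

     require 'test/unit'

     class TestScanner < Test::Unit::TestCase
       # written from the language spec, not from an implementation;
       # it fails until someone writes a Scanner that satisfies it
       def test_recognizes_an_integer_literal
         token = Scanner.new("42").next_token
         assert_equal(:integer, token.kind)
         assert_equal(42, token.value)
       end
     end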

~~~~

The passage above comes from the chapter on Processing.js in Volume 2 of The Architecture of Open Source Applications. This was a good read that taught me a bit about Javascript, HTML 5, and the web browser as a platform. But most of all it explained the thought process that went into porting a powerful Java package to an architecture with considerably different strengths and weaknesses. Language theory is all fine and good, but language implementors have to make pragmatic choices in the service of users. It's essential to remember that, in the end, what matters is that the compiler or interpreter produce correct results -- and not, in the case of a port, that the resulting code resemble the original (another lesson the Processing.js team learned).

Another lesson this chapter teaches is to acknowledge when your program doesn't meet everyone's expectations. This was a serious challenge for Processing.js, because Java makes possible behaviors that a web browser does not. When you can't make something work the way people expect, tell them. Processing.js provides documentation for people who come to it with a Processing background, and documentation for people who come to it with a JavaScript background.

Next up on my reading list from AOSA: the design of LLVM.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

June 19, 2012 3:04 PM

Basic Arithmetic, APL-Style, and Confident Problem Solvers

After writing last week about a cool array manipulation idiom, motivated by APL, I ran across another reference to "APL style" computation yesterday while catching up with weekend traffic on the Fundamentals of New Computing mailing list. And it was cool, too.

Consider the sort of multi-digit addition problem that we all spend a lot of time practicing as children:

        365
     +  366
     ------

The technique requires converting two-digit sums, such as 5 + 6 = 11 in the rightmost column, into a units digit and carrying the tens digit into the next column to the left. The process is straightforward but creates problems for many students. That's not too surprising, because there is a lot going on in a small space.

David Leibs described a technique, which he says he learned from something Kenneth Iverson wrote, that approaches the task of carrying somewhat differently. It takes advantage of the fact that a multi-digit number is really a vector of digits weighted by a vector of powers of ten.

First, we "spread the digits out" and add them, with no concern for overflow:

        3   6   5
     +  3   6   6
     ------------
        6  12  11

Then we normalize the result by shifting carries from right to left, "in fine APL style".

        6  12  11
        6  13   1
        7   3   1

According to Leibs, Iverson believed that this two-step approach was easier for people to get right. I don't know if he had any empirical evidence for the claim, but I can imagine why it might be true. The two-step approach separates into independent operations the tasks of addition and carrying, which are conflated in the conventional approach. Programmers call this separation of concerns, and it makes software easier to get right, too.
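The separation shows up clearly if we implement the technique. Here is a minimal Ruby sketch, assuming the addends arrive as same-length vectors of digits:

     def normalize(digits)                    # shift carries from right to left
       carry = 0
       result = digits.reverse.map do |d|
         carry, d = (d + carry).divmod(10)
         d
       end
       result << carry if carry > 0           # final carry becomes a leading digit
       result.reverse                         # (assumes that carry is one digit)
     end

     def add(a, b)
       sums = a.zip(b).map { |x, y| x + y }   # add the columns, ignoring overflow
       normalize(sums)
     end

     add([3, 6, 5], [3, 6, 6])                # => [7, 3, 1]

Each function does one simple thing, just as each step of the hand technique asks the student to do one simple thing.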

Multiplication can be handled in a conceptually similar way. First, we compute an outer product by building a digit-by-digit times table for the digits:

     +---+---------+
     |   |  3  6  6|
     +---+---------+
     | 3 |  9 18 18|
     | 6 | 18 36 36|
     | 5 | 15 30 30|
     +---+---------+

This is straightforward, simply an application of the basic facts that students memorize when they first learn multiplication.

Then we sum the diagonals running southwest to northeast, again with no concern for carrying:

     (9) (18+18) (18+36+15) (36+30) (30)
      9      36         69      66   30

In the traditional column-based approach, we do this implicitly when we add staggered columns of digits, only we have to worry about the carries at the same time -- and now the carry digit may be something other than one!

Finally, we normalize the resulting vector right to left, just as we did for addition:

         9  36  69  66  30
         9  36  69  69   0
         9  36  75   9   0
         9  43   5   9   0
        13   3   5   9   0
     1   3   3   5   9   0

Again, the three components of the solution are separated into independent tasks, enabling the student to focus on one task at a time, using for each a single, relatively straightforward operator.
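Continuing the Ruby sketch from above and reusing normalize, the whole multiplication becomes an outer product, a sum over diagonals, and one normalization pass:

     def multiply(a, b)
       # outer product: a digit-by-digit times table
       table = a.map { |x| b.map { |y| x * y } }

       # sum the southwest-to-northeast diagonals:
       # table[i][j] contributes to output position i + j
       sums = Array.new(a.size + b.size - 1, 0)
       table.each_with_index do |row, i|
         row.each_with_index { |product, j| sums[i + j] += product }
       end

       normalize(sums)                  # the same carry step as for addition
     end

     multiply([3, 6, 5], [3, 6, 6])     # => [1, 3, 3, 5, 9, 0], i.e., 133,590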

(Does this approach remind some of you of Cannon's algorithm for matrix multiplication in a two-dimensional mesh architecture?)

Of course, Iverson's APL was designed around vector operations such as these, so it includes operators that make implementing such algorithms as straightforward as the calculate-by-hand technique. Three or four Greek symbols and, voilà, you have a working program. If you are Dave Ungar, you are well on your way to a compiler!

the cover of High-Speed Math Self-Taught, by Lester Meyers

I have a great fondness for alternative ways to do arithmetic. One of the favorite things I ever got from my dad was a worn copy of Lester Meyers's High-Speed Math Self-Taught. I don't know how many hours I spent studying that book, practicing its techniques, and developing my own shortcuts. Many of these techniques have the same feel as the vector-based approaches to addition and multiplication: they seem to involve more steps, but the steps are simpler and easier to get right.

A good example of this that I remember learning from High-Speed Math Self-Taught is a shortcut for multiplying by 12.5: first multiply by 100, then divide by 8. For instance, 48 × 12.5 becomes 4800 ÷ 8 = 600. How can a multiplication and a division be faster than a single multiplication? Well, multiplying by 100 is trivial: just add two zeros to the number, or shift the decimal point two places to the right. The division that remains involves a single-digit divisor, which is much easier than multiplying by a three-digit number in the conventional way. The three-digit number even has its own decimal point, which complicates matters further!

To this day, I use shortcuts that Meyers taught me whenever I'm updating the balance in my checkbook register, calculating a tip in a restaurant, or doing any arithmetic that comes my way. Many people avoid such problems, but I seek them out, because I have fun playing with the numbers.

I am able to have fun in part because I don't have to worry too much about getting a wrong answer. The alternative technique allows me to work not only faster but also more accurately. Being able to work quickly and accurately is a great source of confidence. That's one reason I like the idea of teaching students alternative techniques that separate concerns and thus offer hope for making fewer mistakes. Confident students tend to learn and work faster, and they tend to enjoy learning more than students who are handcuffed by fear.

I don't know if anyone has tried teaching Iverson's APL-style basic arithmetic to children to see if it helps them learn faster or solve problems more accurately. Even if not, it is both a great demonstration of separation of concerns and a solid example of how thinking about a problem differently opens the door to a new kind of solution. That's a useful thing for programmers to learn.

~~~~

Postscript. If anyone has a pointer to a paper or book in which Iverson talks about this approach to arithmetic, I would love to hear from you.

IMAGE: the cover of Meyers's High-Speed Math Self-Taught, 1960. Source: OpenLibrary.org.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

June 13, 2012 2:19 PM

The First Rule of Programming in Ruby

When I was thinking about implementing the cool programming idiom I blogged about yesterday, I forgot the first rule of programming in Ruby:

It's already in there.

It didn't take long after my post went live before readers began to point out that Ruby already offers the functionality of my sub and J's { operator, via Array#values_at:

     >> [10, 5, 9, 6, 20, 17, 1].values_at(6, 1, 3, 2, 0, 5, 4)
     => [1, 5, 6, 9, 10, 17, 20]

I'm not surprised. I should probably spend a few minutes every day browsing the documentation for a randomly-chosen Ruby class. There's so much to find! There's also too much to remember out of context, but I never know when I'll come across a need and have a vague recollection that Ruby already does what I need.

Reader Gary Wright pointed out that I can get closer to J's syntax by aliasing values_at with an unused array operator, say:

     class Array
       def %(*args)
         values_at(*args.first)
       end
     end

Now my use of % is as idiomatic in Ruby as { is in J:

     >> [10, 5, 9, 6, 20, 17, 1] % [6, 1, 3, 2, 0, 5, 4]
     => [1, 5, 6, 9, 10, 17, 20]

I am happy to learn that Ruby already has my method and am just as happy to have spent time thinking about and implementing it on my own. I like to play with the ideas as much as I like knowing the vast class library of a modern scripting language.

(If I were writing Ruby code for a living, my boss might not look upon my sense of exploration so favorably... There are advantages to being an academic!)


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

June 12, 2012 3:44 PM

Faking a Cool Programming Idiom in Ruby

Last week, James Hague blogged about a programming idiom you've never heard of: fetching multiple items from an array with a single operation.

Let's say the initial array is this:
     10 5 9 6 20 17 1

Fetching the values at indices 0, 1, 3, and 6 gives:

     10 5 6 1

You can do this directly in APL-style languages such as J and R. In J, for example, you use the { operator:

     0 1 3 6 { 10 5 9 6 20 17 1

Such an operator enables you to do some crazy things, like producing a sorted array by accessing it with a permuted set of indices. This:

     6 1 3 2 0 5 4 { 10 5 9 6 20 17 1

produces this:

     1 5 6 9 10 17 20

When I saw this, my first thought was, "Very cool!" It's been a long time since I programmed in APL, and if this is even possible in APL, I'd forgotten.

One of my next thoughts was, "I bet I can fake that in Ruby...".

I just need a way to pass multiple indices to the array, invoking a method that fetches one value at a time and returns them all. So I created an Array method named sub that takes the indices as an array.

     class Array
       def sub slots
         slots.map {|i| self[i] }
       end
     end

Now I can say:

     [10, 5, 9, 6, 20, 17, 1].sub([6, 1, 3, 2, 0, 5, 4])

and produce [1, 5, 6, 9, 10, 17, 20].

The J solution is a little cleaner, because my method requires extra syntax to create an array of indices. We can do better by using Ruby's splat operator, *. The splat gathers up loose arguments into a single collection.

     class Array
       def sub(*slots)             # splat the parameter
         slots.map {|i| self[i] }
       end
     end

This definition allows us to send sub any number of integer arguments, all of which will be captured into the parameter slots.

Now I can produce the sorted array by saying:

     [10, 5, 9, 6, 20, 17, 1].sub(6, 1, 3, 2, 0, 5, 4)

Of course, Ruby allows us to omit the parentheses when we send a message as long as the result is unambiguous. So we can go one step further:

     [10, 5, 9, 6, 20, 17, 1].sub 6, 1, 3, 2, 0, 5, 4

Not bad. We are pretty close to the APL-style solution now. Instead of {, we have .sub. And Ruby requires comma-separated argument lists, so we have to use commas when we invoke the method. These are syntactic limitations placed on us by Ruby.

Still, with relatively simple code we are able to fake Hague's idiom quite nicely. With a touch more complexity, we could write sub to allow either the unbundled indices or a single array containing all the indices. This would make the code fit nicely with other Ruby idioms that produce array values.
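One way to do that -- a quick sketch, surely not the most elegant Ruby possible:

     class Array
       def sub(*slots)
         # accept either sub(1, 2, 3) or sub([1, 2, 3])
         slots = slots.first if slots.size == 1 && slots.first.is_a?(Array)
         slots.map { |i| self[i] }
       end
     end

     [10, 5, 9, 6, 20, 17, 1].sub 6, 1, 3, 2, 0, 5, 4     # => [1, 5, 6, 9, 10, 17, 20]
     [10, 5, 9, 6, 20, 17, 1].sub [6, 1, 3, 2, 0, 5, 4]   # => [1, 5, 6, 9, 10, 17, 20]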

If you have stronger Ruby-fu than I and can suggest a more elegant implementation, please share. I'd love to learn something new!


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

June 05, 2012 3:33 PM

Writing and Rewriting

An interviewer once asked writer Stephen King about extensive rewrites, and King responded:

One of the ways the computer has changed the way I work is that I have a much greater tendency to edit "in the camera" -- to make changes on the screen. With 'Cell' that's what I did. I read it over, I had editorial corrections, I was able to make my own corrections, and to me that's like ice skating. It's an OK way to do the work, but it isn't optimal. With 'Lisey' I had the copy beside the computer and I created blank documents and retyped the whole thing. To me that's like swimming, and that's preferable. It's like you're writing the book over again. It is literally a rewriting.

The idea of typing an existing text made me think of Zed Shaw's approach to teaching programming, which has grown into Learn Code the Hard Way. You can learn a lot about a body of words or code by reading it just enough to type it, and letting your brain do the rest. I'm not sure how well this approach would work for a group of complete novices. I suspect that a few would like it and that many would not. I like having it around, though, because I like having as diverse a set of tools as possible for reaching students.

For someone who already knows how to write -- or, in King's case, who actually wrote the text he is retyping -- the act offers a different set of trade-offs than rewriting or refactoring in place. It also offers a very different experience from (re)writing from scratch, or deleting text so you won't be tempted to keep it.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

May 29, 2012 2:44 PM

Some Final Thoughts and Links from JRubyConf

You are probably tired of hearing me go on about JRubyConf, so I'll try to wrap up with one more post. After the first few talks, the main effect of the rest of the conference was to introduce me to several cool projects and a few interesting quotes.

Sarah Allen speaking on agile business development

Sarah Allen gave a talk on agile business development. Wow, she has been part of creating several influential pieces of software, including AfterEffects, ShockWave, and FlashPlayer. She talked a bit about her recent work to increase diversity among programmers and reminded us that diversity is about more than the categories we usually define:

I may be female and a minority here, but I'm way more like everybody in here than everybody out there.

Increasing diversity means making programming accessible to people who wouldn't otherwise program.

Regarding agile development, Sarah reminded us that agile's preference for working code over documentation is about more than just code:

Working code means not only "passes the tests" but also "works for the customer".

... which is more about being the software they need than simply getting right answers to some tests written in JUnit.

Nate Schutta opened day two with a talk on leading technical change. Like Venkat Subramaniam on day one, Schutta suggested that tech leaders consider management's point of view when trying to introduce new technology, in particular the risks that managers face. If you can tie new technology to the organization's strategic goals and plans, then managers can integrate it better into other actions. Schutta attributed his best line to David Hussman:

Change must happen with people, not to them.

The award for the conference's most entertaining session goes to Randall Thomas and Tammer Saleh for "RUBY Y U NO GFX?", their tag-team exegesis of the history of computer graphics and where Ruby fits into that picture today. They echoed several other speakers in saying that JRuby is the bridge to the rest of the programming world that Ruby programmers need, because the Java community offers so many tools. For example, it had never occurred to me to use JRuby to connect my Ruby code to Processing, the wonderful open-source language for programming images and animations. (I first mentioned Processing here over four years ago in its original Java form, and most recently was thinking of its JavaScript implementation.)

Finally, a few quickies:

  • Jim Remsik suggested Simon Sinek's TED talk, How great leaders inspire action, with the teaser line, It's not what you do; it's why you do it.

  • Yoko Harada introduced me to Nokogiri, a parser for HTML, XML, and the like.

  • Andreas Ronge gave a talk on graph databases as a kind of NoSQL database and specifically about Neo4j.rb, his Ruby wrapper on the Java library Neo4J.

  • I learned about Square, which operates in the #fintech space being explored by the Cedar Valley's own T8 Webware and by Iowa start-up darling Dwolla.

  • rapper Jay Z
    I mentioned David Wood in yesterday's entry. He also told a great story involving rapper Jay-Z, illegal music downloads, multi-million-listener audiences, Coca Cola, logos, and video releases that encapsulated in a nutshell the new media world in which we live. It also gives a very nice example of why Jay-Z will soon be a billionaire, if he isn't already. He gets it.

  • The last talk I attended before hitting the road was by Tony Arcieri, on concurrent programming in Ruby, and in particular his concurrency framework Celluloid. It is based on the Actor model of concurrency, much like Erlang and Scala's Akka framework. Regarding these two, Arcieri said that Celluloid stays truer to the original model's roots than Akka by having objects at its core, and that he currently views any differences in behavior between Celluloid and Erlang as bugs in Celluloid.

One overarching theme for me of my time at JRubyConf: There is a lot of stuff I don't know. I won't run out of things to read and learn and do for a long, long time.

~~~~

IMAGE 1: my photo of Sarah Allen during her talk on agile business development. License: Creative Commons Attribution-ShareAlike 3.0 Unported.

IMAGE 2: Jay-Z, 2011. Source: Wikimedia Commons.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

May 28, 2012 10:58 AM

The Spirit of Ruby... and of JRuby

JRubyConf was my first Ruby-specific conference, and one of the things I most enjoyed was seeing how the spirit of the language permeates the projects created by its community of users. It's one thing to read books, papers, and blog posts. It's another to see the eyes and mannerisms of the people using the language to make things they care about. Being a variant, JRuby has its own spirit. Usually it is in sync with Ruby's, but occasionally it diverges.

the letter thnad

The first talk after lunch was by Ian Dees, talking about his toy programming language project Thnad. (He took the name from one of the new letters of the alphabet in Dr. Seuss's On Beyond Zebra.) Thnad looks a lot like Klein, a language I created for my compiler course a few years ago, a sort of functional integer assembly language.

The Thnad project is a great example of how easy it is to roll little DSLs using Ruby and other DSLs created in it. To implement Thnad, Dees uses Parslet, a small library for generating scanners and parsers PEG-style, and BiteScript, a Ruby DSL for generating Java bytecode and classes. This talk demonstrated the process of porting Thnad from JRuby to Rubinius, a Ruby implementation written in Ruby. (One of the cool things I learned about the Rubinius compiler is that it can produce s-expressions as output, using the switch -S.)
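To give a flavor of the Parslet style -- a toy of my own, not Thnad's actual grammar -- a parser is a class whose rules are built from combinators:

     require 'parslet'

     class MiniParser < Parslet::Parser
       rule(:name)   { match('[a-z]').repeat(1).as(:name) }
       rule(:number) { match('[0-9]').repeat(1).as(:number) }
       rule(:call)   { name >> str('(') >> number >> str(')') }
       root(:call)
     end

     MiniParser.new.parse("double(21)")
     # => {:name=>"double"@0, :number=>"21"@7}

The grammar reads almost like the EBNF you would write on the board, in plain Ruby.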

Two other talks exposed basic tenets of the Ruby philosophy and the ways in which implementations such as JRuby and Rubinius create new value in the ecosystem.

On Wednesday afternoon, David Wood described how his company, the Jun Group, used JRuby to achieve the level of performance its on-line application requires. He told some neat stories about the evolution of on-line media over the last 15-20 years and how our technical understanding for implementing such systems has evolved in tandem. Perhaps his most effective line was this lesson learned along the way, which recalled an idea from the keynote address the previous morning:

Languages don't scale. Architectures do. But language and platform affect architecture.

In particular, after years of chafing, he had finally reached peace with one of the overarching themes of Ruby: optimize for developer ease and enjoyment, rather than performance or scalability. This is true of the language and of most of the tools built around it, such as Rails. As a result, Ruby makes it easy to write many apps quickly. Wood stopped fighting the lack of emphasis on performance and scalability when he realized that most apps don't succeed anyway. If one does, you have to rewrite it anyway, so suck it up and do it. You will have benefited from Ruby's speed of delivery.

This is the story of Twitter, apparently, and it is what Wood's team did. They spent three person-months to port their app from MRI to JRuby, and are now quite happy.

Where does some of that performance bump come from? Concurrency. Joe Kutner gave a talk after the Thnad talk on Tuesday afternoon about using JRuby to deploy efficient Ruby web apps on the JVM, in which he also exposed a strand of Ruby philosophy and a place where JRuby diverges.

The canonical implementations of Ruby and Python use a Global Interpreter Lock to ensure that non-thread-safe code does not interfere with the code in other threads. In effect, the interpreter maps all threads onto a single thread in the kernel. This may seem like an unnecessary limitation, but it is consistent with Matz's philosophy for Ruby: Programming should be fun and easy. Concurrency is hard, so don't allow it to interfere with the programmer's experience.

Again, this works just fine for many applications, so it's a reasonable default position for the language. But it does not work so well for web apps, which can't scale if they can't spawn new, independent threads. This is a place where JRuby offers a big win by running atop the JVM, with its support for multithreading. It's also a reason why the Kilim fibers GSoC project mentioned by Charles Nutter in the State of JRuby session is so valuable.
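The difference is easy to see with a few CPU-bound threads -- a tiny example of my own, not Kutner's:

     # wall-clock time shrinks on JRuby, where these threads can run on
     # separate cores; on MRI, the GIL makes them take turns on one core
     threads = 4.times.map do
       Thread.new do
         sum = 0
         5_000_000.times { |i| sum += i }
         sum
       end
     end
     threads.each(&:join)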

In this talk, I learned about three different approaches to delivering Ruby apps on the JVM:

  • Warbler, a light and simple tool for packaging .war files,
  • Trinidad, which is a JRuby wrapper for a Tomcat server, and
  • TorqueBox, an all-in-one app server that appears to be the hot new thing.

Links, links, and more links!

Talks such as these reminded me of the feeling of ease and power that Ruby gives developers, and the power that language implementors have to shape the landscape in which programmers work. They also gave me a much better appreciation for why projects like Rubinius and JRuby are essential to the Ruby world because of -- not despite -- their deviations from a core principle of the language.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

May 25, 2012 4:07 PM

JRubyConf, Day 1: The State of JRuby

Immediately after the keynote address, the conference really began for me. As a newcomer to JRuby, this was my first chance to hear lead developers Charles Nutter and Tom Enebo talk about the language and community. The program listed this session as "JRuby: Full of Surprises", and Nutter opened with a slide titled "Something About JRuby", but I just thought of the session as a "state of the language" report.

Nutter opened with some news. First, JRuby 1.7.0.preview1 is available. The most important part of this for me is that Ruby 1.9.3 is now the default language mode for the interpreter. I still run Ruby 1.8.7 on my Macs, because I have never really needed more and that kept my installations simple. It will be nice to have a 1.9.3 interpreter running, too, for cases where I want to try out some of the new goodness that 1.9 offers.

Second, JRuby has been awarded eight Google Summer of Code placements for 2012. This was noteworthy because there were no Ruby projects at all in 2010 or 2011, for different reasons. Several of the 2012 projects are worth paying attention to:

  • creating a code generator for Dalvik byte code, which will give native support for JRuby on Android
  • more work on Ruboto, the current way to run Ruby on Android, via Java
  • implementing JRuby fibers using Kilim fibers, for lighter-weight and faster concurrency than Java threads can provide
  • work on krypt, "SSL done right" for Ruby, which will eliminate the existing dependence on OpenSSL
  • filling in some of the gaps in the graphics framework Shoes, both Swing and SWT versions

Charles Nutter discussing details of the JRuby compiler

Enebo then described several projects going on with JRuby. Some are smaller, including closing gaps in the API for embedding Ruby code in Java, and Noridoc, a tool for generating integrated Ruby documentation for Ruby and Java APIs that work together. Clever -- "No ri doc".

One project is major: work on the JRuby compiler itself. This includes improving the intermediate representation used by JRuby, separating more cleanly the stages of the compiler, and writing a new, better run-time system. I didn't realize until this talk just how much overlap there is among the existing scanner, parser, analyzer, and code generator of JRuby. I plan to study the JRuby compiler in some detail this summer, so this talk whetted my appetite. One of the big challenges facing the JRuby team is to identify execution hot spots that will allow the compiler to do a better job of caching, inlining, and optimizing byte codes.

This led naturally into Nutter's discussion of the other major project going on: JRuby support for the JVM's new invokedynamic instruction. He hailed invokedynamic as "the most important change to the JVM -- ever". Without it, JRuby's method invocation logic is opaque to the JVM optimizer, including caching and inlining. With it, the JRuby compiler will be able not only to generate optimizable function calls but also more efficient treatment of instance variables and constants.

Charles Nutter donning his new RedHat

Nutter showed some performance data comparing JRuby to MRI Ruby 1.9.3 on some standard benchmarks. Running on Java 6, JRuby is between 1.3 and 1.9 times faster than the C-based compiler on the benchmark suites. When they run it on Java 7, performance jumps to speed-ups of between 2.6 and 4.3. That kind of speed is enough to make JRuby attractive for many compute-intensive applications.

Just as Nutter opened with news, he closed with news. He and Enebo are moving to RedHat. They will work with various RedHat and JBoss teams, including TorqueBox, which I'll mention in an upcoming JRubyConf post. Nutter and Enebo have been at EngineYard for three years, following a three-year stint at Sun. It is good to know that, as the corporate climate around Java and Ruby evolves, there is usually a company willing to support open-source JRuby development.

~~~~

IMAGE 1: my photo of Charles Nutter talking about some details of the JRuby compiler. License: Creative Commons Attribution-ShareAlike 3.0 Unported.

IMAGE 2: my photo of Charles Nutter and Tom Enebo announcing their move to RedHat. License: Creative Commons Attribution-ShareAlike 3.0 Unported.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

May 24, 2012 3:05 PM

JRubyConf 2012: Keynote Address on Polyglot Programming

JRubyConf opened with a keynote address on polyglot programming by Venkat Subramaniam. JRuby is itself a polyglot platform, serving as a nexus between a highly dynamic scripting language and a popular enterprise language. Java is not simply a language but an ecosphere consisting of language, virtual machine, libraries, and tools. For many programmers, the Java language is the weakest link in its own ecosphere, which is one reason we see so many people porting their favorite languages run on the JVM, or creating new languages with the JVM as a native backend.

Subramaniam began his talk by extolling the overarching benefits of being able to program in many languages. Knowing multiple programming languages changes how we design software in any language. It changes how we think about solutions. Most important, it changes how we perceive the world. This is something that monolingual programmers often do not appreciate. When we know several languages well, we see problems -- and solutions -- differently.

Why learn a new language now, even if you don't need to? So that you can learn a new language more quickly later, when you do need to. Subramaniam claimed that the amount of time required to learn a new language is inversely proportional to the number of languages a person has learned in the last ten years. I'm not sure whether there is any empirical evidence to support this claim, but I agree with the sentiment. I'd offer one small refinement: The greatest benefits come from learning different kinds of language. A new language that works just like the ones you already know won't stretch your mind.

Not everything is heaven for the polyglot programmer. Subramaniam also offered some advice for dealing with the inevitable downsides. Most notable among these was the need to "contend with the enterprise".

Many companies like to standardize on a familiar and well-established technology stack. Introducing a new language into the mix raises questions and creates resistance. Subramaniam suggested that we back up one step before trying to convince our managers to support a polyglot environment and make sure that we have convinced ourselves. If you are really convinced of a language's value, you will find a way to use it. Then, when it comes time to convince your manager, be sure to think about the issue from her perspective. Make sure that your argument speaks to management's concerns. Identify the problem. Explain the proposed solution. Outline the costs of the solution. Outline its benefits. Show how the move can be a net win for the organization.

The nouveau JVM languages begin with a head start over other new technologies because of their interoperability with the rest of the Java ecosphere. They enable you to write programs in a new language or style without having to move anyone else in the organization. You can experiment with new technology while continuing to use the rest of the organization's toolset. If the experiments succeed, managers can have hard evidence about what works well and what doesn't before making larger changes to the development environment.

I can see why Subramaniam is a popular conference speaker. He uses fun language and coins fun terms. When talking about people who are skeptical of unit testing, he referred to some processes as "Jesus-driven development". He admonished programmers who are trying to do concurrency in JRuby without knowing the Java memory model, because "If you don't understand the Java memory model, you can't get concurrency right." But he followed that immediately with, "Of course, even if you do know the Java memory model, you can't get concurrency right." Finally, my favorite: At one point, he talked about how some Java developers are convinced that they can do anything they need to do in Java, with no other languages. He smiled and said, "I admire Java programmers. They share an unrelenting hope."

There were times, though, when I found myself wanting to defend Java. That happens to me a lot when I hear talks like this one, because so many complaints about it are really about OOP practiced poorly; Java is almost an innocent bystander. For example, the speaker chided Java programmers for suffering from primitive obsession. This made me laugh, because most Ruby programmers seem to have an unhealthy predilection for strings, hashes, and integers.

In other cases, Subramaniam demonstrated the virtues of Ruby by showing a program that required a gem and then solved a thorny problem with three lines of code. Um, I could do that in Java, too, if I used the right library. And Ruby programmers probably don't want to draw too much attention to gems and the problems many Ruby devs have with dependency management.

But those are small issues. Over the next two days, I repeatedly heard echoes of Subramaniam's address in the conference's other talks. This is the sign of a successful keynote.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

May 11, 2012 2:31 PM

Get Busy; Time is Short

After an award-winning author had criticized popular literature, Stephen King responded with advice that is a useful reminder to us all:

Get busy. You have a short life span. You need to stop this crap about sitting there and talking about what we do, and actually do it. Because God gave you some talent, but he also gave you a certain number of years.

You don't have to be an award-winning author to waste precious time commenting on other people's work. Anyone with a web browser can fill his or her day talking about stuff, and not actually making stuff. For academics, it is a professional hazard. We need to balance the analytic and the creative. We learn by studying others' work and writing about it, but we also need to make time to make.

(The passage above comes from Stephen King, The Art of Fiction No. 189, in the wonderful on-line archive of interviews from the Paris Review.)


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development

May 10, 2012 12:18 PM

Code Signatures in Lisp

Recently, @fogus tweeted:

I wonder if McCarthy had to deal with complaints of parentheses count in the earliest Lisps?

For me, this tweet immediately brought to mind Ward Cunningham's experiment with file signatures as an aid in browsing unfamiliar code, which he presented at a workshop on "software archeology" at OOPSLA 2001. In his experiment, Ward collapsed each file in the Java 1.3 source code distribution into a single line consisting of only braces, quotes, and semicolons. For example, the AWT class java.awt.peer.ComponentPeer looked like this:

    ;;;;;;;;{;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;} 

while java.awt.print.PageFormat looked like this:

    {;{;;}{;;};}{;;{;}{;};}{;;{;}{;};}{;{;;;;;;"";};}{;{;;;;;;"";};}{;{;}{;};}{;{;}{;};}{}{}{;}{;}{{;}{;}}{;}{;;;;;;}{}{;{;;;;;;;;;;;;;;;;;;;;;;};}}

As Ward said, it takes some time to get used to the "radical summarization" of files into such punctuation signatures. He was curious about how such a high-level view of a code base might help a programmer understand the regularities and irregularities in the code, via an interactive process of inspection and projection.

Perhaps this came to mind as a result of experiences I had when I was first learning to program in Scheme. Coming from other languages with more syntax, I developed a bad habit of writing code like this:

    (define factorial
      (lambda (n)
        (if (zero? n)
            1
            (* n (factorial (- n 1)))
        )))

When real Scheme and Lisp programmers saw my code, they suggested that I put those closing parens at the end of the multiplication line. They were even more insistent when I dropped closing parens onto separate lines in the middle of a larger piece of code, say, with a let expression of several complex values. I objected that the line breaks helped me to see the structure of my code better. They told me to trust them; after I had more experience, I wouldn't need the help, and my code would be cleaner and more idiomatic.

They were right. Eventually, I learned to read Scheme code more like real Schemers do. I found myself drawn to the densest parts of the code, in which those closing parens often played a role, and learned to see that that's where the action usually was.

I think it was the connection between counting parentheses and the structure of code that brought to mind Ward's work. And then I wondered: what would it be like to take the signature of Lisp or Scheme code in terms of its maligned primary punctuation, the parentheses?

In a few spare minutes, I fiddled with the idea. As an example, consider the following Lisp function, which is part of an implementation of CLOS written by Patrick Henry Winston and Berthold Horn to support their AI and Lisp textbooks:

    (defun call-next-method ()
      (if *around-methods*
          (apply (pop *around-methods*) *args*)
        (progn
          (do () ((endp *before-methods*))
            (apply (pop *before-methods*) *args*))
          (multiple-value-prog1
              (if *primary-methods*
	          (apply (pop *primary-methods*) *args*)
                (error "Oops, no applicable primary method!")) 
            (do () ((endp *after-methods*))
              (apply (pop *after-methods*) *args*))))))

Collapsing this function into a single line of parentheses results in:

    (()((())((()(())(()))(((())())(()(())(()))))))

The semicolons in Java code give the reader a sense of the code's length; collapsing Lisp in this way loses the line breaks. So I wrote another function to insert a | where the line breaks had been, which results in:

    (()|(|(())|(|(()(())|(()))|(|(|(())|())|(()(())|(()))))))

This gives a better idea of how the code plays out on the page, but it loses all sense of the code's structure, which is so important when reading Lisp. So I implemented a third signature, one that surrenders the benefit of a single line in exchange for a better sense of structure. This signature preserves leading white space and line breaks but otherwise gives us just the parentheses:

    (()
      (
          (())
        (
          (()(())
          (()))
          (
            (
           (())
               ())
        (()(())
          (()))))))

Interesting. It's almost art.
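
The flat signature takes only a few lines in any language. Here is a minimal sketch in Java of the line-marked version, for readers who want to play along -- a naive hack that does not skip parentheses inside strings or comments:

    import java.nio.file.Files;
    import java.nio.file.Path;

    public class LispSignature {
        // Collapse Lisp source into its parenthesis signature,
        // marking each original line break with '|'.
        static String flatSignature(String source) {
            StringBuilder sig = new StringBuilder();
            for (char c : source.toCharArray()) {
                if (c == '(' || c == ')') sig.append(c);
                else if (c == '\n')       sig.append('|');
            }
            return sig.toString();
        }

        public static void main(String[] args) throws Exception {
            System.out.println(
                flatSignature(Files.readString(Path.of(args[0]))));
        }
    }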

I think there is a lot of room left to explore here in terms of punctuation. To capture the nature of Scheme and Lisp programs, we would probably want to include other characters, such as the hash, the comma, quotes, and backquotes. These would expose macro-related expressions to the human reader. To expand the experiment to include Clojure, we would of course want to include [] and {} in the signatures.

I'm not an every-day Schemer, so I am not sure how much either the flat signatures or the structured signatures would help seasoned Lisp or Scheme programmers develop an intuitive sense of a function's size, complexity, and patterns. As Ward's experiment showed, the real value comes when signing entire files, and for that task flat signatures may have more appeal. It would be neat to apply this idea to a Lisp distribution of non-trivial size -- say, the full distribution of Racket or Clojure -- and see what might be learned.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

May 07, 2012 3:21 PM

The University as a Gym for the Mind

In recent years, it has become even more common for people to think of students as "paying customers" at the university. People inside universities, especially the teachers, have long tried to discourage this way of thinking, but it is becoming much harder to make the case. Students and parents are being required to shoulder an ever larger share of the bill for higher education, and with that comes a sense of ownership. Still, educators can't help but worry. The customer isn't always right.

Rob Miles relates a story that might help us make the case:

the gym: a university for the body

You can join a gym to get fit, but just joining doesn't make you fit. It simply gives you access to machinery and expertise that you can use to get fit. If you fail to listen to the trainer or make use of the equipment then you don't get a better body, you just get poorer.

You can buy all the running shoes you like, but if you never lace them up and hit the road, you won't become a runner.

I like this analogy. It also puts into perspective a relatively recent phenomenon, the assertion that software developers may not need a university education. Think about such an assertion in the context of physical fitness:

A lot of people manage to get in shape physically without joining a gym. To do so, all you need is the gumption (1) to learn what you need to do and (2) to develop and stick to a plan. For example, there is a lot of community support among runners, who are willing to help beginners get started. As runners become part of the community, they find opportunities to train in groups, share experiences, and run races together. The result is an informal education as good as most people could get by paying a trainer at a gym.

The internet and the web have provided the technology to support the same sort of informal education in software development. Blogs, user groups, codeathons, and GitHub all offer the novice opportunities to get started, "train" in groups, share experiences, and work together. With some gumption and hard work, a person can become a pretty good developer on his or her own.

But it takes a lot of initiative. Not all people who want to get in shape are ready or able to take control of their own training. A gym serves the useful purpose of getting them started. But each person has to do his or her own hard work.

Likewise, not all learners are ready to manage their own educations and professional development -- especially at age 18, when they come out of a K-12 system that can't always prepare them to be completely independent learners. Like a gym, a university serves the useful purpose of helping such people get started. And just as important, as at the gym, students have to do their own hard work to learn, and to prepare to learn on their own for the rest of their careers.

Of course, other benefits may get lost when students bypass the university. I am still idealistic enough to think that a liberal education, even a liberal arts education, has great value for all people. We are more than workers in an economic engine. We are human beings with a purpose larger than our earning potentials.

But the economic realities of education these days and the concurrent unbundling of education made possible by technology mean that we will have to deal with issues such as these more and more in the coming years. In any case, perhaps a new analogy might help us help people outside the university understand better the kind of "customer" our students need to be.

(Thanks to Alfred Thompson for the link to Miles's post.)


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

April 30, 2012 4:00 PM

Processing Old Languages and Thinking of New

I'm beginning to look at the compilers my students produced this semester. I teach a relatively traditional compiler course; we want students to experience as many different important ideas as possible within the time constraints of a semester. As you might expect, my students' programs read in a text file and produce a text file. These files contain a high-level program and an assembly language program, respectively.

I love seeing all the buzz floating around non-textual languages and new kinds of programming environments such as Bret Victor's reactive documents and Light Table. Languages and environments like these make my traditional compiler course seem positively archaic. I still think the traditional course adds a lot of value to students' experience. Before you can think outside of the box, you have to start with a box.

These new programming ideas really are outside the confines of how we think about programs. Jonathan Edwards reminds us how tightly related languages and tools are:

As long as we are programming in descendants of assembly language, we will continue to program in descendants of text editors.

Edwards has been exploring this pool of ideas for a few years now. I first mentioned his work in this blog back in 2004. As he has learned, the challenge we face in trying to re-think how we program is complicated by the network of ideas in which we work. It isn't just syntax or language or IDE or support tools that we have to change. To change one in a fundamental way requires changing them all.

On top of that, once researchers create something new, we will have to find a way to migrate there. That involves educating lots of existing practitioners. Here's hoping that the small steps people are taking with Tangle and Light Table can help us bridge the gap.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

March 18, 2012 5:20 PM

Thinking Out Loud about the Compiler in a Pure OO World

John Cook pokes fun at OO practice in his blog post today. The "Obviously a vast improvement" comment deftly ignores the potential source of OOP's benefits, but then that's the key to the joke.

A commenter points to a blog entry by Smalltalk veteran Travis Griggs. I agree with Griggs's recommendation to avoid using verbs-turned-into-nouns as objects, especially lame placeholder words such as "manager" and "loader". As he says, they usually denote objects that fiddle with the private parts of other objects. Those other objects should be providing the services we need.

Griggs allows reasonable linguistic exceptions to the advice. But he also acknowledges the pull of pop culture which, given my teaching assignment this semester, jumped out at me:

There are many 'er' words that despite their focus on what they do, have become so commonplace, that we're best to just stick with them, at least in part. Parser. Compiler. Browser.

I've thought about this break in my own OO discipline before, and now I'm thinking about it again. What would it be like to write compilers without creating parsers and code generators -- and compilers themselves -- as objects?

We could ask a program to compile itself:

     program.compileTo( targetMachine )

But is the program a program, or does it start life as a text file? If the program starts as a text file, perhaps we say

     program.parse()

to create an abstract syntax tree, which we could then ask to compile itself:

     ast.compileTo( targetMachine )

(Instead of sending a parse() message, we might send an asAbstractSyntax() message. There may be no functional difference, but I think the two names indicate subtle differences in mindset.)

When my students write their compilers in Java or another OO language, we discuss in class whether abstract syntax trees ought to be able to generate target code for themselves. The danger lies in binding the AST class to the details of a specific target machine. We can separate the details of the target machine from the AST by passing an argument with the compileTo() message -- but what should we pass?

Given all the other things we have to learn in the course, my students usually end up following Griggs's advice and doing the traditional thing: pass the AST as an argument to a CodeGenerator object. If we had more time, or a more intensive OO design course prior to the compilers course, we could look at techniques that enable a more OO approach without making things worse in the process.

Looking back farther to the parse behavior, would it ever make sense to send an argument with the parse() message? Perhaps a parse table for an LL(1) or LR(1) algorithm? Or the parsing algorithm itself, as a strategy object? We quickly run the risk of taking steps in the direction that Cook joshes about in his post.

Or perhaps parsing is a natural result of creating a Program object from a piece of text. In that approach, when we say

     Program source = new Program( textfile );

the internal state of source is an abstract syntax tree. This may sound strange at first, but a program isn't really a piece of text. It's just that we are conditioned to think that way by the languages and tools most of us learn first and use most heavily. Smalltalk taught us long ago that this viewpoint is not necessary. (Lisp, too, though in a different way.)
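
To make the shape of this design concrete, here is a minimal sketch in Java. The names are hypothetical placeholders, not a worked-out design:

    // Hypothetical interfaces for the design discussed above.
    interface TargetMachine {
        void emit(String instruction);
    }

    interface AbstractSyntaxTree {
        void compileTo(TargetMachine target);
    }

    class Program {
        private final AbstractSyntaxTree ast;

        // Parsing happens as a natural part of construction:
        // a Program never exists as "just text".
        Program(String sourceText) {
            this.ast = parse(sourceText);
        }

        void compileTo(TargetMachine target) {
            ast.compileTo(target);
        }

        private AbstractSyntaxTree parse(String sourceText) {
            // ... build the abstract syntax tree; elided here ...
            throw new UnsupportedOperationException("parser elided");
        }
    }

Note that the AST binds only to a TargetMachine interface, not to a concrete machine, which is one way to soften the coupling problem raised above.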

These are just some disconnected thoughts on a Sunday afternoon. There is plenty more to say, and plenty of prior art. I think I'll fire up a Squeak image sometime soon and spend some time reacquainting myself with its take on parsing and code generation, in particular the way it compiles its core out to C.

I like doing this kind of "what if?" thinking. It's fun to explore along the boundary between the land where OOP works naturally and the murky territory where it doesn't seem to fit so well. That's a great place to learn new stuff.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

March 13, 2012 8:00 PM

The Writer's Mindset for Programmers

Several people have pointed out that these tips on writing from John Steinbeck are useful for programmers; Chris Freeman even mapped them to writing code. I like to make such connections, most recently to the work of Roger Rosenblatt (in several entries, including The Summer Smalltalk Taught Me OOP) and John McPhee, a master of non-fiction (in an entry on writing, teaching, and programming). Lately I have been reading about and from David Foster Wallace, as I wrote a few weeks ago. Several quotes from interviews he gave in the 1990s and 2000s reminded me of programming, both doing it and learning to do it.

The first ties directly into the theme from the entry on my summer of Smalltalk. As Wallace became more adept at making the extensive cuts to his wide-ranging stories suggested by his editor, he adopted a familiar habit:

Eventually, he learned to erase passages that he liked from his hard drive, in order to keep himself from putting them back in.

It's one thing to kill your darlings. It's another altogether to keep them from sneaking back in. In writing as in programming, sometimes rm -r *.* is your friend.

A major theme in Wallace's work -- and life -- was the struggle not to fall into the comfortable patterns of thought engendered by the culture around us. The danger is, those comfortable ruts separate us from what is real:

Most 'familiarity' is mediated and delusive.

Programmers need to keep this in mind when they set out to learn a new programming language or a new style of programming. We tend to prefer the familiar, whether it is syntax or programming model. Yet familiarity is conditioned by so many things, most prominently recent experience. It deludes us into thinking some things are easier or better than others, often for no other reason than the accident of history that brought us to a particular language or style first. When we look past the experience that gets in the way, we enable ourselves to appreciate the new thing as it is, and not as the lens of our experience distorts it.

Of course, that's easier said than done. This struggle consumed Wallace the writer his entire life.

Even so, we don't want to make the mistake of floating along the surface of language and style. Sometimes, we think that makes us free to explore all ideas unencumbered by commitment to any particular syntax, programming model, or development methodology. But it is by focusing our energies and thinking -- using specific tools, writing in a specific cadre of languages, and working in a particular style -- that we enable ourselves to create, to do something useful:

If I wanted to matter -- even just to myself -- I would have to be less free, by deciding to choose in some kind of definite way.

This line is a climactic revelation of the protagonist in Wallace's posthumously published unfinished novel, The Pale King. It reminds us that freedom is not always so free.

It is much more important for a programmer to be a serial monogamist than a confirmed bachelor. Digging deep into language and style is what makes us stronger, whatever language or style we happen to work in at any point in time. Letting comfortable familiarity mediate our future experiences is simply a way of enslaving ourselves to the past.

In the end, reading Wallace's work and the interviews he gave shows us again that writers and programmers have a lot in common. Even after we throw away all the analogies between our practices, processes, and goals, we are left with an essential identity that we programmers share with our fellow writers:

Writing fiction takes me out of time. That's probably as close to immortal as we'll ever get.

Wallace said this in the first interview he gave after the publication of his first novel. It is a feeling I know well, and one I never want to live without.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

February 19, 2012 12:17 PM

The Polymorphism Challenge

Back at SIGCSE 2005, Joe Bergin and I ran a workshop called The Polymorphism Challenge that I mentioned at the time but never elaborated on. It's been on my mind again for the last week. First I saw a link to an OOP challenge aimed at helping programmers move toward OO's ideal of small classes and short methods. Then Kent Beck tweeted about the Anti-IF Campaign, which, as its name suggests, wants to help programmers "avoid dangerous ifs and use objects to build a code that is flexible, changeable, and easily testable".

That was the goal of The Polymorphism Challenge. I decided it was time to write up the challenge and make our workshop materials available to everyone.

Background

Since the mid-1990s, Joe and I have been part of a cabal of CS educators trying to teach object-oriented programming style better. We saw dynamic polymorphism as one of the key advantages to be found in OOP. Getting procedural programmers to embrace it, including many university instructors, was a big challenge.

At ChiliPLoP 2003, our group was riffing on the idea of extreme refactoring, and Joe and I created a couple of contrived examples that eliminated if statements from a specific part of Karel the Robot that seemed to require them.

This led Joe to propose a programming exercise he called an étude, similar to what these days are called katas, which I summarized in Practice for Practice's Sake:

Write a particular program with a budget of n if-statements or fewer, for some small value of n. Forcing oneself to not use an if statement wherever it feels comfortable forces the programmer to confront how choices can be made at run-time, and how polymorphism in the program can do the job. The goal isn't necessarily to create an application to keep and use. Indeed, if n is small enough and the task challenging enough, the resulting program may well be stilted beyond all maintainability. But in writing it the programmer may learn something about polymorphism and when it should be used.

Motivated by the Three Bears pattern, Joe and I went a step further. Perhaps the best way to know that you don't need if-statements everywhere is not to use them anywhere. Turn the dials to 11 and make 'em all go away! Thus was born the challenge, as a workshop for CS educators at SIGCSE 2005. We think it is useful for all programmers. Below are the materials we used to run the workshop, with only light editing.

Task

Working in pairs, you will write (or re-write) simple but complete programs that normally use if statements, completely removing all selection structures in favor of polymorphism.

Objective

The purpose of this exercise is not to demonstrate that if statements are bad, but that they aren't necessary. Once you can program effectively this way, you have a better perspective from which to choose the right tool. It is directed at skilled procedural programmers, not at novices.

Rules

You should attempt to build a solution to one of the challenge problems without using if statements or the equivalent.

You may use the libraries arbitrarily, even when you are pretty sure that they are implemented with if statements.

You may use exceptions only when really needed and not as a substitute for if statements. Similarly, while loops are not to be used to simulate if statements. Your problems should be solved with polymorphism: dynamic and parameterized.

Note that if you use (for example) a hash map and the program cannot control what is used as a key in a get (user input, for example), then it might return null. You are allowed to use an exception to handle this situation, or even an if. If you can't get all if statements out of your code, then see how few you really need to live with.

Challenges

Participants worked in pairs. They had a choice of programming scenarios, some of which were adapted from work by others:

This pdf file contains the full descriptions given to participants, including some we did not try with workshop participants. If you come up with a good scenario for this challenge, or variations on ours, please let me know.

Hints

When participants hit a block and asked for pointers, we offered hints of various kinds, such as:

•  When you have two behaviors, put them into different objects. The objects can be created from the same class or from related classes. If they are from the same class, you may want to use parameterized polymorphism. When the classes are different, you can use dynamic polymorphism. This is the easy step. Java interfaces are your friend here.

•  When you have the behaviors in different objects, find a way to bring the right object into play at the right time. That way, you don't need to use ad hoc methods to distinguish among them. This is the hard part. Sometimes you can develop a state change diagram that makes it easier. Then you can replace one object with another at the well-defined state change points.

•  Note that you can eliminate a lot of if statements by capturing early decisions -- perhaps made using if statements -- in different objects. These objects can then act as "flags with behavior" when they are passed around the program. The flag object then encapsulates the earlier decision. Now try to capture those decisions without if statements.

(Note that this technique alone is a big win in improving the maintainability of code. You replace many if statements spread throughout a program with one, giving you a single point of change in future.)

•  Delegation from one object to another is a real help in this exercise. This leads you to the Strategy design pattern. An object M can carry with it another, S, that encapsulates the strategy M has for solving a problem. To perform the associated behavior, M delegates to S. By changing the strategy object, you can change the behavior of the object that carries it. M seems to behave polymorphically, but it is really S that does the work.

•  You can modify or enhance strategies using the Decorator design pattern. A decorator D implements the same interface as the thing it decorates, M. When sent a message, the decorator performs some action and also sends the same message to the object it decorates. Thus the behavior of M is executed, but also more. Note that D can provide additional functionality both before and after sending the message to M. A functional method can return quite a different result when sent through a decorated object.

•  You can often choose strategies or other objects that encapsulate decisions using the Factory design pattern. A hash map can be a simple factory. (A small sketch of this hint appears after these hints.)

•  You can sometimes use max and min to map a range of values onto a smaller set and then use an index into a collection to choose an object. max and min are library functions so we don't care here how they might be implemented.
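
Here is a tiny sketch of that factory hint in modern Java -- an illustration of the technique only, not a solution to any of the challenge problems:

    import java.util.Map;
    import java.util.function.UnaryOperator;

    public class NoIf {
        // A hash map acting as a simple factory of strategy objects.
        // The table lookup replaces an if/else-if chain over the key.
        static final Map<String, UnaryOperator<Integer>> OPS = Map.of(
            "double", n -> n * 2,
            "square", n -> n * n,
            "negate", n -> -n
        );

        public static void main(String[] args) {
            // getOrDefault captures the "unknown key" decision in an
            // object, too -- no if statement needed for the error case.
            UnaryOperator<Integer> op = OPS.getOrDefault("square", n -> n);
            System.out.println(op.apply(21));   // prints 441
        }
    }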

At the end of the workshop, we gave one piece of general advice: Doing this exercise once is not enough. Like an étude, it can be practiced often, in order to develop and internalize the necessary skills and thought patterns.

Conclusion

I'll not give any answers or examples here, so that readers can take the challenge themselves and try their hand at writing code. In future posts, I'll write up examples of the techniques described in the hints, and perhaps a few others.

Joe wrote up his étude in Variations on a Polymorphic Theme. In it, he gives some advice and a short example.

Serge Demeyer, Stéphane Ducasse, and Oscar Nierstrasz wrote a wonderful paper that places if statements in the larger context of an OO system, Transform Conditionals: a Reengineering Pattern Language.

If you like the Polymorphism Challenge, you might want to try some other challenges that ask you to do without features of your preferred language or programming style that you consider essential. Check out these Programming Challenges.

Remember, constraints help us grow. Constraints are where creative solutions happen.

I'll close with the same advice we gave at the end of the workshop: Doing this exercise once is not enough, especially for OO novices. Practice it often, like an étude or kata. Repetition can help you develop the thought patterns of an OO programmer, internalize them, and build the skills you need to write good OO programs.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

February 15, 2012 4:41 PM

Architecture Without Architects

David Byrne's essay on collective-creation introduced me to a dandy little picture book by Bernard Rudofsky called Architecture Without Architects. I love a slim book, and I love architecture, so I didn't need much of a push to pick it up at the library.

For an agile software developer, the book's title evokes something visceral. I think software architecture often happens best when it happens organically, emerging as the programmer grows the program piecemeal, feature by feature. This is a pragmatic view, as expressed succinctly by Brian Marick in his recent The Aim of Architecture:

Architecture isn't true, it's useful.

In software, as Marick says, knowing a program's architecture helps us to navigate our way around the program and add new code. An agile developer is willing to let architecture describe the existing program, rather than prescribe its shape. A descriptive, emergent architecture will be more helpful in what we need it for than a prescribed, often inaccurate architecture created ahead of time.

That's the mindset I brought to Architecture Without Architects. I found, though, that it is about more than piecemeal growth and emergence; it talks about buildings and spaces created by regular people. Some people call this "vernacular" architecture, but Rudofsky uses a number of terms aimed at elevating the idea beyond the vulgar, among them "indigenous", "spontaneous", and "non-pedigreed". I think Rudofsky likes "non-pedigreed" best because it most accurately expresses the distinction between the creations he studies and "real" architecture: the only difference is the credential held by the builder.

He lays responsibility for this harmful distinction at the feet of historians, who emphasize "the parts played by architects and their patrons" at the expense of "the communal enterprise" of the built environment. But all of us share in the blame:

Part of our trouble results from the tendency to ascribe to architects -- or, for that matter, to all specialists -- exceptional insights into problems of living when, in truth, most of them are concerned with problems of business and prestige.

One of the goals of this book is to encourage the study of non-pedigreed architecture, to describe a typology and to document important examples. "There is much to learn," says Rudofsky, "from architecture before it became an expert's art".

So, it turns out that Architecture Without Architects is not about the same sense of "without architect" that we in the software world usually mean. Agile developers are, for the most part, professionals, not hobbyists or regular Joes cobbling together programs on the side. Part of that is cultural. People who would never think of writing a program for themselves think nothing of diddling around their houses. Part of it is technological. It's pretty easy to go to the nearest home improvement center and buy modular components that non-professionals can use to change the shape and decoration of their houses. Programming, not so much.

There are, of course, a few hobbyists tinkering around with programs in their spare time. More important, there are plenty of people with few or no credentials in computing or software engineering making a living by writing programs. Sometimes, they have switched careers out of necessity or choice. Other times, they have slowly drifted into software development over the course of a career.

In yet other cases, they retain their professional identity in another discipline and write code to help them do their jobs. Greg Wilson's Software Carpentry project is aimed at one such group of people: professional scientists who find themselves writing and maintaining software as an essential part of doing their science. Rudofsky may be right when he chides us for attributing exceptional insight to professional architects, and if so we are certainly right not to attribute exceptional insight to pedigreed software developers. But Wilson is building a brand by reminding us that, while it may not take exceptional insight to write programs, doing it well does require some skill and knowledge.

I think that Rudofsky's interest in vernacular architecture has other parallels in the software world. For example, technologies such as SourceForge and now GitHub have enabled developers to showcase, celebrate, and benefit from the communal enterprise of writing programs. Programmers may be sitting home working on their own, but they aren't really alone. They are sharing what they write, sharing in what others write, and otherwise participating in vibrant communities of developers.

Then there is the idea of credentials. While many programmers do have degrees in computer science or engineering, most professionals don't have advanced academic degrees or an academic bent. They write code in a world shaped by forces beyond those usually talked about in algorithms and data structures textbooks. The software patterns movement that grew up in the 1990s aimed to document valuable lessons learned programming "in the wild". Like Rudofsky's typology of indigenous architecture, catalogs of design patterns collected vernacular wisdom. As Rudofsky said about the creations of the anonymous builders of most of the world we actually live in,

The beauty of this architecture has long been dismissed as accidental, but today we should be able to recognize it as the result of rare good sense in the handling of practical problems.

Say whatever else you want about the Gang of Four book, it captured a lot of OO wisdom learned in the trenches, often from working with unforgiving building materials like C++.

I enjoyed Architecture Without Architects greatly. After an eight-page preface in which Rudofsky lays down most of the ideas I've summarized here, the book consists of 150 or so pages of pictures accompanied by explanatory paragraphs. There was a lot of interesting detail and even a little wisdom in those paragraphs. If you like architecture, whether housing or programming, you might enjoy spending a couple of hours with this book.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

February 08, 2012 11:15 AM

Refactoring as Curve Fitting

Nathan Marz has written a nice article on what he calls suffering-oriented programming, a development style that places great value on "You Aren't Gonna Need It". It also has a guideline for when you should build something: you feel the pain of not having it. If it doesn't hurt yet, then you don't need it.

As you might imagine, Marz reports that the most important characteristic of this style is a relentless focus on refactoring. In refactoring, we often speak of code smells. Suffering-oriented programmers operate within a different metaphor. They are waiting for their code to make their lives difficult, such as when the queues and workers in Marz's stream processing system became unworkable at a larger scale. We need someone like Kent Beck to coin a catchy phrase for this -- code owies, perhaps.

I enjoyed the analogy Marz uses between refactoring and curve fitting:

"Making it beautiful" is where you use your design and abstraction skills to distill the problem space into simple abstractions that can be composed together. I view the development of beautiful abstractions as similar to statistical regression: you have a set of points on a graph (your use cases) and you're looking for the simplest curve that fits those points (a set of abstractions).

Do the simplest thing that could possibly work.

Marz presents a very practical instantiation of agile development without the hype that accompanies some usages of the term. It is worth a read.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

February 01, 2012 5:00 PM

"You Cannot Trust Your Creativity Yet"

You've got to learn your instrument.
Then, you practice, practice, practice.
And then, when you finally get up there on the bandstand,
forget all that and just wail. -- Charlie Parker

I signed up for an opportunity to read early releases of a book in progress, Bootstrapping Design. Chapter 4 contains a short passage that applies to beginning programmers, too:

Getting good at design means cultivating your taste. Right now, you don't have it. Eventually you will, but until then you cannot trust your creativity. Instead, focus on simplicity, clarity, and the cold, hard science of what works.

M.C. Escher, 'Hands'

This is hard advice for people to follow. We like to use our brains, to create, to try out what we know. I see this desire in many beginning programming students. The danger grows as our skills grow. One of my greatest frustrations comes in my Programming Languages course. Many students in the course are reluctant to use straightforward design patterns such as mutual recursion.
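
Mutual recursion is worth a quick illustration, because the pattern simply follows the shape of the data. A minimal sketch in recent Java, using sealed interfaces and pattern-matching switch (the course uses Scheme, but the idea is the same): an s-expression is either an atom or a list of s-expressions, so we write one function for expressions and another for lists of expressions, and let them call each other.

    import java.util.List;

    public class CountAtoms {
        // The data definition is mutually recursive...
        sealed interface SExp permits Atom, SList {}
        record Atom(String name) implements SExp {}
        record SList(List<SExp> items) implements SExp {}

        // ...so the functions that process it are, too.
        static int countAtoms(SExp e) {
            return switch (e) {
                case Atom a  -> 1;
                case SList l -> countAtomsInList(l.items());
            };
        }

        static int countAtomsInList(List<SExp> items) {
            return items.stream().mapToInt(CountAtoms::countAtoms).sum();
        }

        public static void main(String[] args) {
            SExp e = new SList(List.of(
                new Atom("a"),
                new SList(List.of(new Atom("b"), new Atom("c")))));
            System.out.println(countAtoms(e));   // prints 3
        }
    }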

At one level, I understand their mindset. They have started to become accomplished programmers in other languages, and they want to think, design, and program for themselves. Oftentimes, their ill-formed approaches work okay in the small, even if the code makes the prof cringe. As our programs grow, though, the disorder snowballs. Pretty soon, the code is out of the student's control. The prof's, too.

A good example of this phenomenon, in both its positive and negative forms, happened toward the end of last semester's course. A series of homework assignments had the students growing an interpreter for a small, Scheme-like language. It eventually became the largest functional program they had ever written. In the end, there was a big difference between code written by students who relied on "the cold, hard science" we covered in class and code written by students who had wandered off into the wilderness of their own creativity. Filled with common patterns, the former was relatively easy to read, search, and grade. The latter... not so much. Even some very strong students began to struggle with their own code. They had relied too much on their own approaches for decomposing the problem and organizing their programs, but those ideas weren't scaling well.

I think what happens is that, over time, small errors, missteps, and complexities accumulate. It's almost like the effect of rounding error when working with floating point numbers. I vividly remember experiencing that in my undergrad Numerical Analysis courses. Sadly, few of our CS students these days take Numerical Analysis, so their understanding of the danger is mostly theoretical.

Perhaps the most interesting embodiment of trusting one's own creativity too much occurred on the final assignment of the term. After several weeks and several assignments, we had a decent sized program. Before assigning the last set of requirements, I gave everyone in the class a working solution that I had written, for reference. One student was having so much trouble getting his own program to work correctly, even with reference to my code, that he decided to use my code as the basis for his assignment.

Imagine my surprise when I saw his submission. He used my code, but he did not follow the example. The code he added to handle the new requirements didn't look anything like mine, or like what we had studied in class. It repeated many of the choices that had gotten him into hot water over the course of the earlier assignments. I could not help but chuckle. At least he is persistent.

It can be hard to trust new ideas, especially when we don't understand them fully yet. I know that. I do the same thing sometimes. We feel constrained by someone else's programming patterns and want to find our own way. But those patterns aren't just constraints; they are also a source of freedom. I try to let my students grow in freedom as they progress through the curriculum, but sometimes we encounter something new like functional programming and have to step back into the role of uncultivated programmer and grow again.

There is great value in learning the rules first and letting our tastes and design skill evolve slowly. Seniors taking project courses are ready, so we turn them loose to apply their own taste and creativity on Big Problems. Freshmen usually are not yet able to trust their own creativity. They need to take it slow.

To "think outside the box", you you have to start with a box. That is true of taste and creativity as much as it is of knowledge and skill.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

January 25, 2012 3:45 PM

Pragmatism and the Scientific Spirit

the philosopher William James

Last week, I found myself reading The Most Entertaining Philosopher, about William James. It was good fun. I have always liked James. I liked the work of his colleagues in pragmatism, C.S. Peirce and John Dewey, too, but I always liked James more. For all the weaknesses of his formulation of pragmatism, he always seemed so much more human to me than Peirce, who did the heavy theoretical lifting to create pragmatism as a formal philosophy. And he always seemed a lot more fun than Dewey.

I wrote an entry a few years ago called The Academic Future of Agile Methods, which described the connection between pragmatism and my earlier work in AI, as well as agile software development. I still consider myself a pragmatist, though it's tough to explain just what that means. The pragmatic stance is too often confounded with a self-serving view of the world, a "whatever works is true" philosophy. Whatever works... for me. James's references to the "cash value" of truth didn't help. (James himself tried to undo the phrase's ill effects, but it has stuck. Even in the 1800s, it seems, a good sound bite was better than the truth.)

As John Banville, the author of the NY Times book review piece, says, "It is far easier to act in the spirit of pragmatism than to describe what it is." He then gives "perhaps the most concise and elegant definition" of pragmatism, by philosopher C. I. Lewis. It is a definition that captures the spirit of pragmatism as well as any few lines can:

Pragmatism could be characterized as the doctrine that all problems are at bottom problems of conduct, that all judgments are, implicitly, judgments of value, and that, as there can be ultimately no valid distinction of theoretical and practical, so there can be no final separation of questions of truth of any kind from questions of the justifiable ends of action.

This is what drew me to pragmatism while doing work in knowledge-based systems, as a reaction to the prevailing view of logical AI that seemed based in idealist and realist epistemologies. It is also what seems to me to distinguish agile approaches to software development from the more common views of software engineering. I applaud people who are trying to create an overarching model for software development, a capital-T Theory, but I'm skeptical. The agile mindset is, or at least can be, pragmatic. I view software development in much the way James viewed consciousness: "not a thing or a place, but a process".

As I read again about James and his approach, I remember my first encounters with pragmatism and thinking: Pragmatism is science; other forms of epistemology are mathematics.


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development

January 19, 2012 4:23 PM

An Adventure in Knowing Too Much ... Or Thinking Too Little

To be honest, it is a little of both.

I gave the students in my compilers course a small homework assignment. It's a relatively simple translation problem whose primary goals are to help students refresh their programming skills in their language of choice and to think about the issues we will be studying in depth in coming weeks: scanning, parsing, validating, and generating.

I sat down the other day to write a solution ... and promptly made a mess.

In retrospect, my problem was that I was somewhere in between "do it 'simple'" and "do it 'right'". Unlike most of the students in the class, I already know a lot about building compilers. I could use all that knowledge and build a multi-stage processor that converts a string in the source language (a simple template language) into a string in the target language (ASCII text). But writing a scanner, a parser, a static analyzer, and a text generator seems like overkill for such a simple problem. Besides, my students aren't likely to write such a solution, which would make my experience less valuable in helping them to solve the problem, and my program less valuable as an example of a reasonable solution.

So I decided to keep things simple. Unfortunately, though, I didn't follow my own agile advice and do the simplest thing that could possibly work. As with the full-compiler option, I don't really want the simplest program that could possibly work. This problem is simple enough to solve with a single-pass algorithm, processing the input stream at the level of individual characters. That approach would work, but it would bury the issues we are exploring in the course under a lot of low-level code managing states and conditions. Our goal for the assignment is understanding, not efficiency or language hackery.

I was frustrated with myself, so I walked away.

Later in the day, I was diddling around the house and occasionally mulling over my situation. Suddenly a solution came to mind. It embodied a simple understanding of the problem, a middle ground between too simple and too complex that was just right.

I had written my original code in a test-first way, but that didn't help me avoid my mess. I know that pair programming would have. My partner would surely have seen through the complexity I was spewing to the fact that I was off track and said, "Huh? Cut that out." Pair programming is an unsung hero in cases like this.

I wonder if this pitfall is a particular risk for CS academics. We teach courses that are full of details, with the goal of helping students understand the full depth of a domain. We often write quick and dirty code for our own purposes. These are at opposite ends of the software development spectrum. In the end, we have to help students learn to think somewhere in the middle. So, we try to show students well-designed solutions that are simple enough, but no simpler. That's a much more challenging task than writing a program at either extreme. Not being full-time developers, perhaps our instincts for finding the happy medium aren't as sharp as they might be.

As always, though, I had fun writing code.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

December 19, 2011 4:49 PM

"I Love The Stuff You Never See"

I occasionally read and hear people give advice about how to find a career, vocation, or avocation that someone will enjoy and succeed in. There is a lot of talk about passion, which is understandable. Surely, we will enjoy things we are passionate about, and perhaps then we want to put in the hours required to succeed. Still, "finding your passion" seems a little abstract, especially for someone who is struggling to find one.

This weekend, I read A Man, A Ball, A Hoop, A Bench (and an Alleged Thread)... Teller!. It's a story about the magician Teller, one half of the wonderful team Penn & Teller, and his years-long pursuit of a particular illusion. While discussing his work habits, Teller said something deceptively simple:

I love the stuff you never see.

I knew immediately just what he meant.

I can say this about teaching. I love the hours spent creating examples, writing sample code, improving it, writing and rewriting lecture notes, and creating and solving homework assignments. When a course doesn't go as I had planned, I like figuring out why and trying to fix it. Students see the finished product, not the hours spent creating it. I enjoy both.

I don't necessarily enjoy all of the behind-the-scenes work. I don't really enjoy grading. But my enjoyment of the preparation and my enjoyment of the class itself -- the teaching equivalent of "the performance" -- carries me through.

I can also say the same thing about programming. I love to fiddle with source code, organizing and rewriting it until it's all just so. I love to factor out repetition and discover abstractions. I enjoy tweaking interfaces, both the interfaces inside my code and the interfaces my code's users see. I love that sudden moment of pleasure when a program runs for the first time. Users see the finished product, not the hours spent creating it. I enjoy both.

Again, I don't necessarily enjoy everything that I have to do behind the scenes. I don't enjoy twiddling with configuration files, especially at the interface to the OS. Unlike many of my friends, I don't always enjoy installing and uninstalling all the libraries I need to make everything work in the current version of the OS and interpreter. But that time seems small compared to the time I spend living inside the code, and that carries me through.

In many ways, I think that Teller's simple declaration is a much better predictor of what you will enjoy in a career or avocation than other, fancier advice you'll receive. If you love the stuff other folks never see, you are probably doing the right thing for you.


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development, Teaching and Learning

December 15, 2011 4:08 PM

Learning More Than What Is Helpful Right Now

Stanley Fish wrote this week about the end of a course he taught this semester, on "law, liberalism and religion". In this course, his students read a number of essays and articles outside the usual legal literature, including works by Locke, Rawls, Hobbes, Kant, and Rorty. Fish uses this essay to respond to recent criticisms that law schools teach too many courses like this, which are not helpful to most students, who will, by and large, graduate to practice the law.

Most anyone who teaches in a university hears criticisms of this sort now and then. When you teach computer science, you hear them frequently. Most of our students graduate and enter the practice of software development. How useful are the theory of computation and the principles of programming languages? Teach 'em Java Enterprise Edition and Eclipse and XSLT and Rails.

My recent entry Impractical Programming, With Benefits starts from the same basic premise that Fish starts from: There is more to know about the tools and methodologies we use in practice than meets the eye. Understanding why something is as it is, and knowing that something could be better, are valuable parts of a professional's preparation for the world.

Fish talks about these values in terms of the "purposive" nature of the enterprise in which we practice. You want to be able to think about the bigger picture, because that determines where you are going and why you are going there. I like his connection to Searle's speech acts and how they help us to see how the story we tell gives rise to the meaning of the details in the story. He uses football as his example, but he could have used computer science.

He sums up his argument in this way:

That understanding is what law schools offer (among other things). Law schools ask and answer the question, "What's the game here?"; the ins and outs of the game you learn later, as in any profession. The complaint ... is that law firms must teach their new hires tricks of the trade they never learned in their contracts, torts and (God forbid) jurisprudence classes. But learning the tricks would not amount to much and might well be impossible for someone who did not know -- in a deep sense of know -- what the trade is and why it is important to practice it.

Such a deep understanding is even more important in a discipline like computing, because our practices evolve at a much faster rate than legal practices. Our tools change even more frequently. When we taught functional programming ten or fifteen years ago, many of our students simply humored me. This wasn't going to help them with Windows programming, but, hey, they'd learn it for our sake. Now they live in a world where Scala, Clojure, and F# are in the vanguard. I hope what they learned in our Programming Languages course has helped them cope with the change. Some of them are even leading the charge.

The practical test of whether my Programming Languages students learned anything useful this semester will come not next year, but ten or fifteen years down the road. And, as I said in the Impractical Programming piece, a little whimsy can be fun in its own right, even while it stretches your brain.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

December 08, 2011 4:23 PM

Quick and Wrong and Fast and Slow

I've been reading a lot about Daniel Kahneman's new book, Thinking, Fast and Slow. One of the themes of the book is how our brains include two independent systems for organizing and accessing knowledge. System One is incredibly fast but occasionally (perhaps often) wrong. It likely developed early in our biological history and provided humans with an adaptive advantage in a dangerous world. System Two developed later, after humans had survived to create more protective surroundings. It is slow -- conscious, deliberative -- and more often right.

One reviewer summarized the adaptive value of System One in this way:

In the world of the jungle, it is safer to be wrong and quick than to be right and slow.

This phrase reminded me of an old post by Allan Kelly, on the topic of gathering requirements for software. The entry's title is also its punch line:

You are better off being generally right than precisely wrong.

These two quotes are quite different in important ways. Yet they are related in some interesting ways, too.

It is easier to be fast and generally right than to be fast and precisely right. The pattern-matching mechanism in our brains and the heuristics we use consciously are fast, but they are often imprecise. If generally right is good enough, then fast is possible.

Attempts to be slow and precisely right often end up being slow and precisely wrong. Sometimes, the world changes while we are thinking. Other times, we end up solving the wrong problem because we didn't understand our goals or the conditions of the world as well as we thought we did at the outset.

Evolution has given us two mechanisms with radically different trade-offs and, it turns out, a biological bias toward quick and wrong.

When I talk with friends who dislike or don't understand agile approaches, I find that they often think that agile folks overemphasize the use of System One in software development. Why react, be wrong, and learn from the mistake, when we could just think ahead and do it right the first time?

In one way, they are right. Some proponents of agile approaches speak rather loosely about Big Design Up Front and about You Aren't Gonna Need It. They leave the impression that one can program without thinking, so long as one takes small enough steps and learns from feedback. They also leave the impression that everyone should work this way, in all contexts. Neither of these impressions is accurate.

I try to help my skeptical friends understand how a "quick and (sometimes) wrong" mindset can be useful for me even in contexts where I could conceivably plan ahead well and far. I try to help them understand that I really am thinking all the time I'm working, but that I treat any products of thought that are not yet in code as contingent, awaiting the support of evidence gained through running code.

And then I let them work in whatever way makes them successful and comfortable.

I think that being aware of the presence of Systems One and Two, and the fundamental trade-off between them, can help agile developers work better. Making conscious, well-founded decisions about how far to think ahead, about what and how much to test, and about when and how often to refactor is, in the end, choosing which part of our brain to use at any given moment. Context matters. Experience matters. Blindly working in a quick-and-generally-right way is no more productive an approach for most of us than working in a slow-and-sometimes-precisely-wrong way.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

December 02, 2011 4:28 PM

Impractical Programming, with Benefits

I linked to Jacob Harris's recent In Praise of Impractical Programming in my previous entry, in the context of programming's integral role in the modern newsroom. But as the title of his article indicates, it's not really about the gritty details of publishing a modern web site. Harris rhapsodizes about wizards and the magic of programming, and about a language that is for many of my colleagues the poster child for impractical programming languages, Scheme.

If you clicked through to the article but stopped reading when you ran into MIT 6.001, you might want to go back and finish reading. It is a story of how one programmer looks back on college courses that seemed impractical at the time but that, in hindsight, made him a better programmer.

There is a tension in any undergraduate CS program between the skills and languages of today and big ideas that will last. Students naturally tend to prefer the former, as they are relevant now. Many professors -- though not all -- prefer academic concepts and historical languages. I encounter this tension every time I teach Programming Languages and think, should I continue to use Scheme as the course's primary language?

As recently as the 1990s, this question didn't amount to much. There weren't any functional programming languages at the forefront of industry, and languages such as C++, Java, and Ada didn't offer the course much.

But now there are Scala and Clojure and F#, all languages in play in industry, not to mention several "pretty good Lisps". Wouldn't my students benefit from the extensive libraries of these languages? Their web-readiness? The communities connecting the languages to Hadoop and databases and data analytics?

I seriously consider these questions each time I prep the course, but I keep returning to Scheme. Ironically, one reason is precisely that it doesn't have all those things. As Harris learned,

Because Scheme's core syntax is remarkably impoverished, the student is constantly pulling herself up by her bootstraps, building more advanced structures off simpler constructs.

In a course on the principles of programming languages, small is a virtue. We have to build most of what we want to talk about. And there is nothing quite so stark as looking at half a page of code and realizing, "OMG, that's what object-oriented programming is!", or "You mean that's all a variable reference is?" Strange as it may sound, the best way to learn deeply the big concepts of language may be to look at the smallest atoms you can find -- or build them yourself.
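A tiny example makes the point. This is my sketch here, not code from the course, but it is the kind of half-page revelation students stumble into: a closure over local state, dispatching on a message, and suddenly that's an object.

    ;; an "object" built from Scheme's smallest atoms: the closure holds
    ;; an instance variable, and dispatch on a symbol plays the role of
    ;; a method call
    (define (make-counter)
      (let ((count 0))
        (lambda (msg)
          (case msg
            ((increment!) (set! count (+ count 1)) count)
            ((value)      count)))))

    (define c (make-counter))
    (c 'increment!)    ; => 1
    (c 'increment!)    ; => 2
    (c 'value)         ; => 2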

Harris "won't argue that "journalism schools should squander ... dearly-won computer-science credits on whimsical introductions to programming" such as this. I won't even argue that we in CS spend too many of our limited credits on whimsy. But we shouldn't renounce our magic altogether, either, for perfectly practical reasons of learning.

And let's not downplay too much the joy of whimsy itself. Students have their entire careers to muck around in a swamp of XML and REST and Linux device drivers, if that's what they desire. There's something pretty cool about watching DrRacket spew, in a matter of a second or two, twenty-five lines of digits as the value of a trivial computation.
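Something along these lines, say. (My example, not an assignment from the course; the exact digit count is beside the point.)

    ;; a trivial computation whose exact answer fills the screen: 1000!
    (define (factorial n)
      (if (zero? n)
          1
          (* n (factorial (- n 1)))))

    (factorial 1000)   ; => a 2568-digit exact integer, in well under a second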

As Harris says,

... if you want to advance as a programmer, you need to take some impractical detours.

He closes with a few suggestions, none of which lapse into the sort of navel-gazing and academic irrelevance that articles like this one sometimes do. They all come down to having fun and growing along the way. I second his advice.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

November 30, 2011 7:07 PM

A Definition of Design from Charles Eames

Our council of department heads meets in the dean's conference room, in the same building that houses the Departments of Theater and Art, among others. Before this morning's meeting, I noticed an Edward Tufte poster on the wall and went out to take a look. It turns out that the graphic design students were exhibiting posters they had made in one of their classes, while studying accomplished designers such as Tufte and Paul Rand, the creator of the NeXT logo for Steve Jobs.

As I browsed the gallery, I came across a couple of posters on the work of Charles and Ray Eames. One of them prominently featured this quote from Charles:

Design is a plan for arranging elements in such a way as best to accomplish a particular purpose.

This definition works just as well for software design as it does for graphic design. It is good to be reminded occasionally how universal the idea of design is to the human condition.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 18, 2011 2:10 PM

Teachers Working Themselves Out Of a Job

Dave Rooney wrote a recent entry on what he does when coaching in the realm of agile software development. He summarizes his five main tasks as:

  • Listen
  • Ask boatloads of questions
  • Challenge assumptions
  • Teach/coach Agile practices
  • Work myself out of a job

All but the fourth are a part of my job every day. Listening, asking questions, and challenging assumptions are essential parts of helping people to learn, whatever one's discipline or level of instruction. As a CS prof, I teach a lot of courses that involve or require programming, and I look for opportunities to inject pragmatic programming skills and agile development practices.

What of working myself out of a job? For consultants like Rooney, this is indeed the goal: help an organization get on a track where they don't need his advice, where they can provide coaching from inside, and where they become self-sufficient in their own practices.

In a literal sense, this is not part of my job. If I do my job well, I will remain employed by the same organization, or at least have that option available to me.

But in another sense, my goals with respect to working myself out of a job are the same as a consultant's, only at the level of individual students. I want to help students reach a point where they can learn effectively on their own. As much as possible, I hope for them to become more self-sufficient, able to learn as an equal member of the larger community of programmers and computer scientists.

A teacher's goal is, in part, to prepare students to move on to a new kind of learning, where they don't need us to structure the learning environment or organize the stream of ideas and information they learn from. Many students come to us pretty well able to do this already; they need only to realize that they don't need us!

With most universities structured more around courses than one-on-one tutorials, I don't get to see the process through with every student I teach. One of the great joys is to have the chance to work with the same student many times over the years, through multiple courses and one-on-one through projects and research.

In any case, I think it's healthy for teachers to approach their jobs from the perspective of working themselves out of a job. Not to worry; there is always another class of students coming along.

Of course, universities as we know them may be disappearing. But the teachers among us will always find people who want to learn.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

November 16, 2011 2:38 PM

Memories of a Programming Assignment

Last Friday afternoon, the CS faculty met with the department's advisory board. The most recent addition to the board is an alumnus who graduated a decade ago or so. At one point in the meeting, he said that a particular programming project from his undergraduate days stands out in his mind after all these years. It was a series of assignments from my Object-Oriented Programming course called Cat and Mouse.

I can't take credit for the basic assignment. I got the idea from Mike Clancy in the mid 1990s. (He later presented his version of the assignment on the original Nifty Assignments panel at SIGCSE 1999.) The problem includes so many cool features, from coordinate systems to stopping conditions. Whatever one's programming style or language, it is a fun challenge. When done OO in Java, with great support for graphics, it is even more fun.

But those properties aren't what he remembers best about the project. He recalls that the project took place over several weeks and that each week, I changed the requirements of the assignment. Sometimes, I added a new feature. Other times, I generalized an existing feature.

What stands out in his mind after all these years is getting the new assignment each week, going home, reading it, and thinking,

@#$%#^. I have to rewrite my entire program.

You see, he had hard-coded assumptions throughout his code. Concerns were commingled, not separated. Objects were buried inside larger pieces of code. Model was tangled up with view.

So, he started from scratch. Over the course of several weeks, he built an object-oriented system. He came to understand dynamic polymorphism and patterns such as MVC and decorator, and found ways to use them effectively in his code.

He remembers the dread, but also that this experience helped him learn how to write software.

I never know exactly what part of what I'm doing in class will stick most with students. From semester to semester and student to student, it probably varies quite a bit. But the experience of growing a piece of software over time in the face of growing and changing requirements is almost always a powerful teacher.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

November 14, 2011 3:43 PM

Using Null Object to Get Around Truthiness

Avdi Grimm recently posted a short but thorough introduction to a common problem encountered by Ruby programmers: nullifying an action when the required object is not present. In a mixed-paradigm language, this becomes an issue when we use if-statements to guard the action. Ruby, like many languages, treats a couple of distinguished values -- nil and false -- as falsey and all other values as truthy. Thus, unless an object is falsey, it will be seen as true by the if-statement. This requires us to do extra tests or wrap our objects to ensure that tests pass and fail at the appropriate times. Grimm explains the problem and walks us through several approaches to solving it in Ruby. I recommend you read it.

Some of my functional programming friends teased me about this article, along the lines of a pithy tweet:

Rubyists have discovered Maybe/Option.

Maybe is a wrapper type in Haskell that allows programmers to indicate that an expected value may not be available; Option is the Scala analog. Wrapper types are the natural way to handle the problem of falsey-ness in functional programming, especially in statically-typed languages. (Clojure, a popular dynamically-typed functional language in the Lisp tradition, doesn't make use of this idea.)

When I write Haskell code, as rare as that is, I use Maybe as a natural part of typing my programs. As a functional programmer more generally, I see its great value in writing compact code. The alternative in Scheme is to use if expressions to switch on value, separating null (usually) base cases from inductive cases.

However, when I work as an OO programmer, I don't miss Maybe or even falsey objects more generally. Indeed, I think the best advice in Grimm's article comes in his conclusion:

If we're trying to coerce a homemade object into acting falsey, we may be chasing a vain ideal. With a little thought, it is almost always possible to transform code from typecasing conditionals to duck-typed polymorphic method calls. All we have to do is remember to represent the special cases as objects in their own right, whether that be a Null Object or something else.

My pithy tweet-like conclusion is this: If you are checking for truthiness, you're doing OO wrong.

To do it right, use the Null Object pattern.

Ruby is an OO language, but it gives us great freedom to write code in other styles as well. Many programmers in the Ruby community, mostly those without prior OO experience, miss out on the beauty and compactness one can achieve using standard OO patterns.

I see a lot of articles on the web about the Null Object pattern, but most of them involve extending the class NilClass or its lone instance, nil. That is fine if you are trying to add generic null object behavior to a system, often in service of truthiness and falsey-ness. A better approach in most contexts is to implement an object that behaves like a "nothing" in your application. If you are writing a program that consists of users, create an unassigned user. If you are creating a logging facility and need the ability for a service not to use a log, create a null log. If you are creating an MVC application and need a read-only controller, create a null controller.
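Grimm's article is about Ruby, but the idea transfers to any setting with polymorphic dispatch. Here is a minimal sketch of the null-log case in Scheme, using procedures as the objects; the names are my own invention.

    ;; both logs present the same interface: a one-argument procedure.
    ;; client code calls the log unconditionally -- no truthiness test.
    (define (make-console-log)
      (lambda (message)
        (display message)
        (newline)))

    (define (make-null-log)
      (lambda (message)     ; accepts the message and does nothing
        'ignored))

    (define (run-service log)
      (log "service started")
      ;; ... the real work goes here ...
      (log "service finished"))

    (run-service (make-console-log))   ; prints two lines
    (run-service (make-null-log))      ; runs silently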

In some applications, the idea of the null object disappears into another primitive object. When we implement a binary tree, we create an object that has references to two other tree objects. If all tree nodes behave similarly except that some do not have children, then we can create a NullTree object to serve as the values of the instance variables in the actual leaves. If leaf nodes behave differently than interior nodes, then we can create a Leaf object to serve as the values of the interior nodes' instance variables. Leaf subsumes any need for a null object.
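Again a sketch, with closures standing in for objects and hypothetical names: because a Leaf answers the same messages as an interior node, no code ever asks "is this child null?"

    ;; leaves and interior nodes share one interface, so size and height
    ;; are computed with no type tests at all
    (define (make-leaf)
      (lambda (msg)
        (case msg
          ((size)   1)
          ((height) 0))))

    (define (make-node left right)
      (lambda (msg)
        (case msg
          ((size)   (+ 1 (left 'size) (right 'size)))
          ((height) (+ 1 (max (left 'height) (right 'height)))))))

    (define tree (make-node (make-leaf) (make-node (make-leaf) (make-leaf))))
    (tree 'size)     ; => 5
    (tree 'height)   ; => 2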

One of the consequences of using the Null Object pattern is the elimination of if statements that switch on the type of object present. Such if statements are a form of ad hoc polymorphism. Programmers using if statements while trying to write OO code should not be surprised that their lives are more difficult than they need be. The problem isn't with OOP; it's with not taking advantage of dynamic polymorphism, one of the essential benefits of OOP.

If you would like to learn more about the Null Object pattern, I suggest you read its canonical write-ups.

If you are struggling with making the jump to OOP, away from typecasing switch statements and explicit loops over data structures, take up the programming challenge of writing increasingly large programs with no if statements and no for statements. Sometimes, you have to go to extremes before you feel comfortable in the middle.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

November 12, 2011 10:40 AM

Tools, Software Development, and Teaching

Last week, Bret Victor published a provocative essay on the future of interaction design that reminds us we should be more ambitious in our vision of human-computer interaction. I think it also reminds us that we can and should be more ambitious in our vision of most of our pursuits.

I couldn't help but think of how Victor's particular argument applies to software development. First he defines "tool":

Before we think about how we should interact with our Tools Of The Future, let's consider what a tool is in the first place.

I like this definition: A tool addresses human needs by amplifying human capabilities.


That is, a tool converts what we can do into what we want to do. A great tool is designed to fit both sides.

The key point of the essay is that our hands have much more consequential capabilities than our current interfaces use. They feel. They participate with our brains in numerous tactile assessments of the objects we hold and manipulate: "texture, pliability, temperature; their distribution of weight; their edges, curves, and ridges; how they respond in your hand as you use them". Indeed, this tactile sense is more powerful than the touch-and-slide interfaces we have now and, in many ways, is more powerful than even sight. These tactile senses are real, not metaphorical.

As I read the essay, I thought of the software tools we use, from language to text editors to development processes. When I am working on a program, especially a big one, I feel much more than I see. At various times, I experience discomfort, dread, relief, and joy.

Some of my colleagues tell me that these "feelings" are metaphorical, but I don't think so. A big part of my affinity for so-called agile approaches is how these sensations come into play. When I am afraid to change the code, it often means that I need to write more or better unit tests. When I am reluctant to add a new feature, it often means that I need to refactor the code to be more hospitable. When I come across a "code smell", I need to clean up, even if I only have time for a small fix. YAGNI and doing the simplest thing that can possibly work are ways that I feel my way along the path to a more complete program, staying in tune with the code as I go. Pair programming is a social practice that engages more of my mind than programming alone.

Victor closes with some inspiration for inspiration:

In 1968 -- three years before the invention of the microprocessor -- Alan Kay stumbled across Don Bitzer's early flat-panel display. Its resolution was 16 pixels by 16 pixels -- an impressive improvement over their earlier 4 pixel by 4 pixel display.

Alan saw those 256 glowing orange squares, and he went home, and he picked up a pen, and he drew a picture of a goddamn iPad.

We can think bigger about so much of what we do. The challenge I take from Victor's essay is to think about the tools I use to teach: what needs do they fulfill, and how well do they amplify my own capabilities? Just as important are the tools we give our students as they learn: what needs do they fulfill, and how well do they amplify our students' capabilities?


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

October 27, 2011 6:44 PM

A Perfect Place to Cultivate an Obsession

And I urge you to please notice when you are happy,
and exclaim or murmur or think at some point,
"If this isn't nice, I don't know what is."
-- Kurt Vonnegut, Jr.

I spent the entire day teaching and preparing to teach, including writing some very satisfying code. It was a fine way to spend a birthday.

With so much attention devoted to watching my students learn, I found myself thinking consciously about my teaching and also about some learning I have been doing lately, including remembering how to write idiomatic Ruby. Many of my students really want to be able to write solid, idiomatic Scheme programs to process little languages. I see them struggle with the gap between their desire and their ability. It brought to mind something poet Mary Jo Bang said in a recent interview about her long effort to become a good writer:

For a long time the desire to write and knowing how to write well remained two separate things. I recognized good writing when I saw it but I didn't know how to create it.

I do all I can to give students examples of good programs from which they can learn, and also to help them with the process of good programming. In the end, the only way to close the gap is to write a lot of code. Writing deliberately and reflectively can shorten the path.

Bang sees the same in her students:

Industriousness can compensate for a lot. And industry plus imagination is a very promising combination.

Hard work is the one variable we all control while learning something new. Some of us are blessed with more natural capacity to imagine, but I think we can stretch our imaginations with practice. Some CS students think that they are learning to "engineer" software, a cold, calculating process. But imagination plays a huge role in understanding difficult problems, abstract problems.

Together, industry and time eventually close the gap between desire and ability:

And I saw how, if you steadily worked at something, what you don't know gradually erodes and what you do know slowly grows and at some point you've gained a degree of mastery. What you know becomes what you are. You know photography and you are a photographer. You know writing and you are a writer.

... You know programming, and you are a programmer.

Erosion and growth can be slow processes. As time passes, we sometimes find our learning accelerates, a sort of negative splits for mental exercise.

We work hardest when we are passionate about what we do. It's hard for homework assigned in school to arouse passion, but many of us professors do what we can. The best way to have passion is to pick the thing you want to do. Many of my best students have had a passion for something and then found ways to focus their energies on assigned work in the interest of learning the skills and gaining the experience they need to fulfill their passion.

One last passage from Bang captures perfectly for me what educators should strive to make "school":

It was the perfect place to cultivate an obsession that has lasted until the present.

As a teacher, I see a large gap between my desire to create the perfect place to cultivate an obsession and my ability to deliver. For now, the desire and the ability remain two separate things. I recognize good learning experiences when I see them, and occasionally I stumble into creating one, but I don't yet know how to create them reliably.

Hard work and imagination... I'll keep at it.

If this isn't nice, I don't know what is.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

October 24, 2011 7:38 PM

Simple/Complex Versus Easy/Hard

A few years ago, I heard a deacon give a rather compelling talk to a group of college students on campus. When confronted with a recommended way to live or act, students will often say that living or acting that way is hard. These same students are frustrated with the people who recommend that way of living or acting, because the recommenders -- often their parents or teachers -- act as if it is easy to live or act that way. The deacon told the students that their parents and teachers don't think it is easy, but they might well think it is simple.

How can this be? The students were confounding "simple" and "easy". A lot of times, life is simple, because we know what we should do. But that does not make life easy, because doing a simple thing may be quite difficult.

This made an impression on me, because I recognized that conflict in my own life. Often, I know just what to do. That part is simple. Yet I don't really want to do it. To do it requires sacrifice or pain, at least in the short term. To do it means not doing something else, and I am not ready or willing to forego that something. That part is difficult.

Switch the verb from "do" to "be", and the conflict becomes even harder to reconcile. I may know what I want to be. However, the gap between who I am and who I want to be may be quite large. Do I really want to do what it takes to get there? There may be a lot of steps to take which individually are difficult. The knowing is simple, but the doing is hard.

This gap surely faces college students, too, whether it means wanting to get better grades, wanting to live a healthier life, or wanting to reach a specific ambitious goal.

When I heard the deacon's story, I immediately thought of some of my friends, who like very much the idea of being a "writer" or a "programmer", but they don't really want to do the hard work that is writing or programming. Too much work, too much disappointment. I thought of myself, too. We all face this conflict in all aspects of life, not just as it relates to personal choices and values. I see it in my teaching and learning. I see it in building software.

I thought of this old story today when I watched Rich Hickey's talk from StrangeLoop 2011, Simple Made Easy. I had put off watching this for a few days, after tiring of a big fuss that blew up a few weeks ago over Hickey's purported views about agile software development techniques. I knew, though, that the dust-up was about more than just Hickey's talk, and several of my friends recommended it strongly. So today I watched. I'm glad I did; it is a good talk. I recommend it to you!

Based only on what I heard in this talk, I would guess that Hickey misunderstands the key ideas behind XP's practices of test-driven development and refactoring. But this could well be a product of how some agilistas talk about them. Proponents of agile and XP need to be careful not to imply that tests and refactoring make change or any other part of software development easy. They don't. The programmer still has to understand the domain and be able to think deeply about the code.

Fortunately, I don't base what I think about XP practices on what other people think, even if they are people I admire for other reasons. And if you can skip or ignore any references Hickey makes to "tests as guard rails" or to statements that imply refactoring is debugging, I think you will find this really is a very good talk.

Hickey's important point is that simple/complex and easy/hard are different dimensions. Simplicity should be our goal when writing code, not complexity. Doing something that is hard should be our goal when it makes us better, especially when it makes us better able to create simplicity.

Simplicity and complexity are about the interconnectedness of a system. In this dimension, we can imagine objective measures. Ease and difficulty are about what is most readily at hand, what is most familiar. Defined as they are in terms of a person's experience or environment, this dimension is almost entirely subjective.

And that is good because, as Hickey says a couple of times in the talk, "You can solve the familiarity problem for yourself." We are not limited to our previous experience or our current environment; we can take on a difficult challenge and grow.

Alan Kay often talks about how it is worth learning to play a musical instrument, even though playing is difficult, at least at the start. Without that skill, we are limited in our ability to "make music" to turning on the radio or firing up YouTube. With it, you are able make music. Likewise riding a bicycle versus walking, or learning to fly an airplane versus learning to drive a car. None of these skills is necessarily difficult once we learn them, and they enable new kinds of behaviors that can be simple or complex in their own right.

One of the things I try to help my students see is the value in learning a new, seemingly more difficult language: it empowers us to think new and different thoughts. Likewise making the move from imperative procedural style to OOP or to functional programming. Doing so stretches us. We think and program differently afterward. A bonus is that something that seemed difficult before is now less daunting. We are able to work more effectively in a bigger world.

In retrospect, what Hickey says about simplicity and complexity is actually quite compatible with the key principles of XP and other agile methods. Writing tests is a part of how we create systems that are as simple as we can make them in the local neighborhood of a new feature. Tests can also help us to recognize complexity as it seeps into our program, though they are not enough by themselves to keep it out. Refactoring is an essential part of how we eliminate complexity by improving design globally. Refactoring in the presence of unit tests does not make programming easy. It doesn't replace thinking about design; indeed, it is thinking about design. Unit tests and refactoring do help us to grapple with complexity in our code.

Also in retrospect, I gotta make sure I get down to St. Louis for StrangeLoop 2012. I missed the energy this year.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

October 06, 2011 3:21 PM

Programming != Teaching

A few weeks ago I wrote a few entries that made connections to Roger Rosenblatt's Unless It Moves the Human Heart: The Craft and Art of Writing. As I am prone to doing, I found a lot of connections between writing, as described by Rosenblatt, and programming. I also saw connections between teaching of writers and teaching of programmers. The most recent entry in that series highlighted how teachers want their students to learn how to think the same way, not how to write the same way.

Rosenblatt also occasionally explores similarities between writing and teaching. Toward the end of the book, he points out a very important difference between the two:

Wouldn't it be nice if you knew that your teaching had shape and unity, and that when a semester came to an end, you could see that every individual thing you said had coalesced into one overarching statement? But who knows? I liken teaching to writing, but the two enterprises diverge here, because any perception of a grand scheme depends on what the students pick up. You may intend a lovely consistency in what you're tossing them, but they still have to catch it. In fact, I do see unity to my teaching. What they see, I have no clue. It probably doesn't matter if they accept the parts without the whole. A few things are learned, and my wish for more may be plain vanity.

Novelists, poets, and essayists can achieve closure and create a particular whole. Their raw material is words and ideas, which the writer can make dance. The writer can have an overarching statement in mind, and making it real is just a matter of hard work and time.

Programmers have that sort of control over their raw material, too. As a programmer, I relish taking on the challenge of a hard problem and creating a solution that meets the needs of a person. If I have a goal for a program, I can almost always make it happen. I like that.

Teachers may have a grand scheme in mind, too, but they have no reliable way of making sure that their scheme comes true. Their raw material consists not only of words and ideas. Indeed, their most important raw material, their most unpredictable raw material, is students. Try as they might, teachers don't control what students do, learn, or think.

I am acutely aware of this thought as we wrap up the first half of our programming languages course. I have introduced students to functional programming and recursive programming techniques. I have a pretty good idea what I hope they know and can do now, but that scheme remains in my head.

Rosenblatt is right. It is vanity for us teachers to expect students to learn exactly what we want for them. It's okay if they don't. Our job is to do what we can to help them grow. After that, we have to step aside and let them run.

Students will create their own wholes. They will assemble their wholes from the parts they catch from us, but also from parts they catch everywhere else. This is a good thing, because the world has a lot more to teach than I can teach them on my own. Recognizing this makes it a lot easier for me as a teacher to do the best I can to help them grow and then get out of their way.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

October 03, 2011 8:11 AM

What to Build and How to Build

Update: This entry originally appeared on September 29. I bungled my blog directory and lost two posts, and the simplest way to get the content back on-line is to repost.

I remember back in the late 1990s and early 2000s, when patterns were still a hot topic in the software world and many pattern writers were trying to make the conceptual move to pattern languages. It was a fun time to talk about software design. At some point, there was a long and illuminating discussion on the patterns mailing list about whether patterns should describe what to build or how to build. Richard Gabriel and Ron Goldman -- creators of the marvelous essay-as-performance-art Mob Software -- patiently taught the community that the ultimate goal is what. Of course, if we move to a higher level of abstraction, a what-pattern becomes a how-pattern. But the most valuable pattern languages teach us what to build and when, with some freedom in the how.

This is the real challenge that novice programmers face, in courses like CS1 or in self-education: figuring out what to build. It is easy enough for many students to "get" the syntax of the programming language they are learning. Knowing when to use a loop, or a procedure, or a class -- that's the bigger challenge.

Our CS students are usually in the same situation even later in their studies. They are still learning what to build, even as we teach them new libraries, new languages, and new styles.

I see this a lot with students who are learning to program in a functional style. Mentally, many think they are focused on the how (e.g., How do I write this in Scheme?). But when we probe deeper, we usually find that they are really struggling with what to say. We spend some time talking about the problem, and they begin to see more clearly what they are trying to accomplish. Suddenly, writing the code becomes a lot easier, if not downright easy.

This is one of the things I really respect in the How to Design Programs curriculum. Its design recipes give beginning students a detailed, complete, repeatable process for thinking about problems and what they need to solve a new problem. Data, contracts, and examples are essential elements in understanding what to build. Template solutions help bridge the what and the how, but even they are, at the student's current level of abstraction, more about what than how.

The structural recursion patterns I use in my course are an attempt to help students think about what to build. The how usually follows directly from that. As students become fluent in their functional programming language, the how is almost incidental.
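As a small illustration -- my sketch, not one of the course's actual pattern write-ups -- a structural recursion pattern tells the student what each piece of the solution must account for, and the code follows almost mechanically:

    ;; the pattern says WHAT to build: an answer for the empty list, and
    ;; a rule combining the first item with the answer for the rest.
    ;; the HOW -- this code -- falls out of that description.
    (define (count-evens lst)
      (if (null? lst)
          0                                 ; answer for the empty list
          (+ (if (even? (car lst)) 1 0)     ; contribution of the first item
             (count-evens (cdr lst)))))     ; answer for the rest of the list

    (count-evens '(1 2 3 4 6))   ; => 3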


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

October 03, 2011 7:20 AM

Softmax, Recursion, and Higher-Order Procedures

Update: This entry originally appeared on September 28. I bungled my blog directory and lost two posts, and the simplest way to get the content back on-line is to repost.

John Cook recently reported that he has bundled up some of his earlier writings about the soft maximum as a tech report. The soft maximum is "a smooth approximation to the maximum of two real variables":

    softmax(x, y) = log(exp(x) + exp(y))

When John posted his first blog entry about the softmax, I grabbed the idea and made it a homework problem for my students, who were writing their first Scheme procedures. I gave them a link to John's page, so they had access to this basic formula as well as a Python implementation of it. That was fine with me, because I was simply trying to help students become more comfortable using Scheme's unusual syntax:

    (define softmax
      (lambda (x y)
        (log (+ (exp x)
                (exp y)))))

On the next assignment, I asked students to generalize the definition of softmax to more than two variables. This gave them an opportunity to write a variable arity procedure in Scheme. At that point, they had seen only a couple of simple examples of variable arity, such as this implementation of addition using a binary + operator:

    (define plus              ;; notice: no parentheses around
      (lambda args            ;; the args parameter in lambda
        (if (null? args)
            0
            (+ (car args) (apply plus (cdr args))) )))

Many students followed this pattern directly for softmax:

    (define softmax-var
      (lambda args
        (if (null? (cdr args))
            (car args)
            (softmax (car args)
                     (apply softmax-var (cdr args))))))

Some of their friends tried a different approach. They saw that they could use higher-order procedures to solve the problem -- without explicitly using recursion:

    (define softmax-var
      (lambda args
        (log (apply + (map exp args)))))

When students saw each other's solutions, they wondered -- as students often do -- which one is correct?

John's original blog post on the softmax tells us that the function generalizes as we might expect:

    softmax(x1, x2, ..., xn) = log(exp(x1) + exp(x2) + ... + exp(xn))

Not many students had looked back for that formula, I think, but we can see that it matches the higher-order softmax almost perfectly. (map exp args) constructs a list of the exp(xi) values. (apply + ...) adds them up. (log ...) produces the final answer.

What about the recursive solution? If we look at how its recursive calls unfold, we see that this procedure computes:

    softmax(x1, softmax(x2, ..., softmax(xn-1, xn)...))

This is an interesting take on the idea of a soft maximum, but it is not what John's generalized definition says, nor is it particularly faithful to the original 2-argument function.

How might we roll our own recursive solution that computes the generalized function faithfully? The key is to realize that the function needs to iterate not over the maximizing behavior but the summing behavior. So we might write:

    (define softmax-var
      (lambda args
        (log (accumulate-exps args))))

    (define accumulate-exps
      (lambda (args)
        (if (null? args)
            0
            (+ (exp (car args))
               (accumulate-exps (cdr args))))))

This solution turns softmax-var into an interface procedure and then uses structural recursion over a flat list of arguments. One advantage of using an interface procedure is that the recursive procedure accumulate-exps no longer has to deal with variable arity, as it receives a list of arguments.

It was remarkable to me and some of my students just how close the answers produced by the two student implementations of softmax were, given how different the underlying behaviors are. Often, the answers were identical. When different, they differed only in the 12th or 15th decimal digit. As several blog readers pointed out, softmax is associative, so the two solutions are identical mathematically. The differences in the values of the functions result from the vagaries of floating-point precision.
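A two-line derivation shows why the two agree mathematically. Since exp(log(a)) = a, the nesting in the recursive version collapses:

    softmax(x, softmax(y, z)) = log(exp(x) + exp(log(exp(y) + exp(z))))
                              = log(exp(x) + exp(y) + exp(z))
                              = softmax(x, y, z)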

The programmer in me left the exercise impressed by the smoothness of the soft maximum. The idea is resilient across multiple implementations, which makes it seem all the more useful to me.

More important, though, this programming exercise led to several interesting discussions with students about programming techniques, higher-order procedures, and the importance of implementing solutions that are faithful to the problem domain. The teacher in me left the exercise pleased.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

September 24, 2011 8:04 PM

Much Code To Study, Learning with Students

I mentioned last time that I've been spending some time with Norvig's work on Strachey's checkers program in CPL. This is fun stuff that can be used in my programming languages course. But it isn't the only new stuff I've been learning. When you work with students on research projects and independent studies, opportunities to learn await at every turn.

A grad student is taking the undergrad programming languages course and so has to do some extra projects to earn his grad credit. He is a lover of Ruby and has been looking at a couple of Scheme interpreters implemented in Ruby, Heist and Bus-Scheme. I'm not sure where this will lead yet, but that is part of the exercise. The undergrad who faced the "refactor or rewrite?" decision a few weeks ago teaches me something new every week, not only through his experiences writing a language processor but also about his program's source and target languages, Photoshop and HTML/CSS.

Another grad student is working on a web application and teaching me other things about Javascript. Now we are expanding into one tool I've long wanted to study in greater detail, Processing.js, and perhaps into another I only just learned of from Dave Humphrey, a beautiful little data presentation library called D3.

And as if that weren't enough, someone tweets that Avdi Grimm is sharing his code and notes as he implements Smalltalk Best Practice Patterns in Ruby. Awesome. This Avdi guy is rapidly becoming one of my heroes.

All of these projects are good news. One of the great advantages of working at a university is working with students and learning along with them. Right now, I have a lot on my plate. It's daunting but fun.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

September 13, 2011 7:17 PM

Learning to Think the Same, Not Write the Same

We have begun to write code in my programming languages course. The last couple of sessions we have been writing higher-order procedures, and next time we begin a three-week unit learning to write recursive programs following the structural recursion patterns in my small pattern language Roundabout.

One of the challenges for students is to learn how to use the patterns without feeling a need to mimic my style perfectly. Some students, especially the better ones, will chafe if I try to make them write code exactly as I do. They are already good programmers in other styles, and they can become good functional programmers without aping me. Many will see the patterns as a restriction on how they think, though in fact the patterns are a source of great freedom. They don't force you to write code in a particular way; they give you tools for thinking about problems as you program.

Again, there is something for us to learn from our writing brethren. Consider a writing course like the one Roger Rosenblatt describes in his book, Unless It Moves the Human Heart: The Craft and Art of Writing, which I have referred to several times, most recently in The Summer Smalltalk Taught Me OOP. No student in Rosenblatt's course wants him to expect them to leave the course writing just like he does. They are in the course to learn elements of craft, to share and critique work, and to get advice from someone with extensive experience as a writer. Rosenblatt is aware of this, too:

Wordsworth quoted Coleridge as saying that every poet must create the taste by which he is relished. The same is true of teachers. I really don't want my students to write as I do, but I want them to think about writing as I do. In them I am consciously creating a certain taste for what I believe constitutes skillful and effective writing.

The course is as much about learning how to think about writing as it is about learning how to write. That's what a good pattern language can do for us: help us learn to think about a class of problems or a class of solutions.

I think this happens whether a teacher intends it consciously or not. Students learn how to think and do by observing their teachers thinking and doing. A programming course is usually better if the teacher designs the course deliberately, with careful consideration of the patterns to demonstrate and the order in which students experience them in problems and solutions.

In the end, I want my students to think about writing recursive programs as I do, because experience as both a programmer and a teacher tells me that this way of thinking will help them become good functional programmers as soon as possible. But I do not want them to write exactly as I do; they need to find their own style, their own taste.

This is yet another example of the tremendous power teachers wield every time they step foot in a classroom. As a student, I was always most fond of the teachers who wielded it carefully and deliberately. So many of them live on in me in how I think. As a teacher, I have come to respect this power in my own hands and try to wield it respectfully with my students.

P.S. For what it's worth, Coleridge is one of my favorite poets!


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

September 12, 2011 12:14 PM

You Keep Using That Word....

Elisabeth Hendrickson recently posted a good short piece, Testing is a Whole Team Activity. She gives examples of some of the comments she hears frequently which indicate a misunderstanding about the relationship between coding and testing. My favorite example was the first:

"We're implementing stories up to the last minute, so we can never finish testing within the sprint."

Maybe I like this one so much because I hear it from students so often, especially on code that they find challenging and are having a hard time finishing.

"If we took the time to test our code, we would not get done on time."

"What evidence do you have that your code works for the example cases?"

"Well, none, really, but..."

"Then how do you know you are done?"

"'done'?" You keep using that word. I do not think it means what you think it means.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

September 08, 2011 8:19 PM

The Summer Smalltalk Taught Me OOP

Or, Throw 'em Away Until You Get It Right

Just before classes started, one of my undergrad research students stopped by to talk about a choice he was weighing. He is writing a tool that takes as input a Photoshop document and produces as output a faithful, understandable HTML rendition of the design and content. He has written a lot of code in Python over the last few months, using an existing library of tools for working with Photoshop docs. Now that he understands the problem well and has figured out what a good solution looks like, he is dissatisfied with his existing code. He thinks he can create a better, simpler solution by writing his own Photoshop-processing tools.

The choice is: refactor the code incrementally until he has replaced all the library code, or start from scratch?

My student's dilemma took me back over twenty years, to a time when I faced the same choice, when I too was first learning my craft.

I was one of my doctoral advisor's first graduate students. In AI, this means that we had a lot of infrastructure to build in support of the research we all wanted to do. We worked in knowledge-based systems, and in addition to doing research in the lab we also wanted to deliver working systems to our collaborators on and off campus. Building tools for regular people meant having reasonable interfaces, preferably with GUI capabilities. This created an extra problem, because our collaborators used PCs, Macs, and Unix workstations. My advisor was a Mac guy. At the time, I was a Unix man in the process of coming to love Macs. In that era, there was no Java and precious few tools for writing cross-platform GUIs.

Our lab spent a couple of years chasing the ideal solution: a way to write a program and run it on any platform, giving users the same experience on all three. The lab had settled more or less on PCL, a portable Common Lisp. It wasn't ideal, but we grad students -- who were spending a lot of time implementing libraries and frameworks -- were ready to stop building infrastructure and start working full-time on our own code.

Then my advisor discovered Smalltalk.

The language included graphics classes in its base image, which offered the promise of write-once-run-anywhere apps for clients. And it was object-oriented, which matched the conceptual model of the software we wanted to build to a T. I had just spent several months trying to master CLOS, Common Lisp's powerful object system, but Smalltalk looked like just what we wanted. So we made the move -- and told our advisor that this would be the last move. Smalltalk would be our home.

I learned the basics of the language by working through every tutorial I could get my hands on, first Digitalk Smalltalk, then ObjectWorks. They were pretty good. Then I wrote some toy programs of my own, to show I was ready to create on my own.

So I started writing my first Smalltalk system: a parser and interpreter for a domain-specific AI language with a built-in inference engine, a simple graphical and table-driven tool for writing programs in the language, and a graphical front end for running systems.

There was a lot going on there, at least for a green graduate student only recently called up to the big leagues. But I had done my homework. I was ready. I was sure of it. I was cocky.

I crashed.

It turned out that I didn't really understand the role that data should play in an OO program. My program soon became a tangle of data dependencies, designed before I understood my solution all that well, and the tangle made the code increasingly turgid.

So I threw it away, rm -r'ed it into nothingness. I started from scratch, sure that Version 2 would be perfect.

I crashed again.

The program was better this time around, but it turned out that I didn't really understand how objects should interact in a large OO program. My program soon became a tangle of objects and wrappers and adapters, whose behavior I could not follow in even the simplest scenarios. The tangle made the code increasingly murky.

So I threw it away -- rm -r again -- and started from scratch. Surely Version 3, based on several weeks of coding and learning, would be just what we wanted.

I crashed yet again. This time, the landing was more gentle, because I really was making progress. But as I coded my third system, I began to see ways to structure the program that would make the code easier to grow as I added features, and easier to change as I got better at design. I was just beginning to glimpse the Land of Design Patterns. But I always seemed to learn each lesson one day too late to use it.

My program was moving forward, creakily, but I just knew it could be better. I did not like the idea of maintaining this code for several years, as we modified apps fielded with our collaborators and as we used it as part of the foundation for the lab's vision.

So I threw it away and wrote my system from scratch one last time. The result was not a perfect program, but one that I could live with and be proud of. It only took me four major iterations and several months of programming.

Looking back, I faced the same decision my student faced recently with his system. Refactor or start over? He has the advantage of having written a better first program than I had, yet he made the sound decision to rewrite.

Sometimes, refactoring really is the better approach. You can keep the system running while slowly corralling data dependencies, spurious object interactions, and suboptimal design. Had I been a more experienced programmer, I may well have chosen to refactor from Version 3 to Version 4 of my program. But I wasn't. Besides, I had neither a suite of unit tests nor access to automated refactoring tools. Refactoring without either of these makes the process scarier and more dangerous than it needs to be.

Maybe refactoring is the better approach most or all of the time. I've read all about how the Great Rewrite is one of those Things You Should Never Do.

But then, there is an axiom from Turing Award winner Fred Brooks that applies better to my circumstance of writing the initial implementation of a program: "... plan to throw one away; you will, anyhow". I find Brooks's advice most useful when I am learning a lot while I am programming. For me, that is one context, at least, in which starting from scratch is a big win: when my understanding is changing rapidly, whether of domain, problem, or tools. In those cases, I am usually incapable of refactoring fast enough to keep up with my learning. Starting over offers a faster path to a better program than refactoring.

On that first big Smalltalk project of mine, I was learning so much, so fast. Smalltalk was teaching me object-oriented programming, through my trial and error and through my experience with the rest of the Smalltalk class hierarchy. I had never written a language interpreter or other system of such scale before, and I was learning lessons about modularity and language processing. I was eager to build a new and improved system as quickly as I could.

In such cases, there is nothing like the sound of OS X's shredder. Freedom. No artificial constraints from what has suddenly become legacy code. No limits from my past ignorance. A fresh start. New energy!

This is something we programmers can learn from the experience of other writers, if we are willing. In Unless It Moves the Human Heart: The Craft and Art of Writing, Roger Rosenblatt tells us that Edgar Doctorow ...

... had written 150 pages of The Book of Daniel before he'd realized he had chosen the wrong way to tell the story. ... So one morning Edgar tossed out the 150 pages and started all over.... I wanted the class to understand that Edgar was happy to start from scratch, because he had picked the wrong door the first time.

Sometimes, the best thing a programmer can admit is that he or she has picked the wrong door and walked down the wrong path.

But Brooks warns of a danger programmers face on second efforts: the Second System Effect. As Scott Rosenberg writes in Code Reads #1: The Mythical Man-Month:

Brooks noted that an engineer or team will often make all the compromises necessary to ship their first product, then founder on the second. Throughout project number one, they stored up pet features and extras that they couldn't squeeze into the original product, and told themselves, "We'll do it right next time." This "second system" is "the most dangerous a man ever designs."

I never shipped the first version of my program, so perhaps I eluded this pitfall out of luck. Still, I was cocky when I wrote Version 1, and then I was cocky when I wrote Version 2. But both versions humbled me, humbled me hard. I was a good programmer, maybe, but I wasn't good enough. I had a lot to learn, and I wanted to learn it all.

So it was relatively easy to start over on #3 and #4. I was learning, and I had the luxury of time. Ah, to be a grad student again!

In the end, I wrote a program that I could release to users and colleagues with pride. Along the way, Smalltalk taught me a lot about OOP, and writing the program taught me a lot about expert system shells. It was time well spent.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

September 04, 2011 7:28 PM

In Praise of Usefulness

In two recent entries [ 1 | 2 ], I mentioned that I had been recently reading Roger Rosenblatt's Unless It Moves the Human Heart: The Craft and Art of Writing. Many of you indulge me my fascination with writers talking about writing, and I often see parallels between what writers of code and writers of prose and poetry do. That Rosenblatt also connects writing to teaching, another significant theme of my blog, only makes this book more stimulating to me.

"Unless it moves the human heart" is the sort of thing writers say about their calling, but not something many programmers say. (The book title quotes Aussie poet A. D. Hope.) It is clearly at the heart of Rosenblatt's views of writing and teaching. But in his closing chapter, Rosenblatt includes a letter written to his students as a postscript on his course that speaks to a desire most programmers have for the lives' work: usefulness. To be great, he says, your writing must be useful to the world. The fiction writer's sense of utility may differ from the programmer's, but at one level the two share an honorable motive.

This paragraph grabbed me as advice as important for us programmers as it is for creative writers. (Which software people do you think of as you read it?)

How can you know what is useful to the world? The world will not tell you. The world will merely let you know what it wants, which changes from moment to moment, and is nearly always cockeyed. You cannot allow yourself to be directed by its tastes. When a writer wonders, "Will it sell?" he is lost, not because he is looking to make an extra buck or two, but rather because, by dint of asking the question in the first place, he has oriented himself to the expectations of others. The world is not a focus group. The world is an appetite waiting to be defined. The greatest love you can show it is to create what it needs, which means you must know that yourself.

What a brilliant sentence: The world is an appetite waiting to be defined. I don't think Ward Cunningham went around asking people if they needed wiki. He built it and gave it to them, and when they saw it, their appetite took form. It is indeed a great form of love to create what the world needs, whether the people know it yet or not.

(I imagine that at least a few of you were thinking of Steve Jobs and the vision that gave us the Mac, iTunes, and the iPad. I was too, though Ward has always been my hero when it comes to making useful things I had not anticipated.)

Rosenblatt tells his students that, to write great stories and poems and essays, they need to know the world well and deeply. This is also sound advice to programmers, especially those who want to start the next great company or revolutionize their current employers from the inside out. This is another good reason to read, study, and think broadly. To know the world outside of one's Ruby interpreter, outside the JavaScript spec and HTML 5, one must live in it and think about it.

It seems fitting on this Labor Day weekend for us to think about all the people who make the world we live in and keep it running. Increasingly, those people are using -- and writing -- software to give us useful things.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

August 27, 2011 9:36 AM

Extravagant Ideas

One of the students in my just-started Programming Languages course recently mentioned that he has started a company, Glass Cannon Games, to write games for the Box and Android platforms. He is working out of my university's student incubator.

Last summer, I wrote a bit about entrepreneurship and a recent student of mine, Nick Cash, who has started Book Hatchery to "help authors publish their works digitally".

Going a bit further back, I mentioned an alumnus, Wade Arnold, winning a statewide award for his company, T8 Webware. Readers of this blog most recently encountered Wade in my entry on the power of intersections.

Over the last decade, Wade has taken a few big ideas and worked hard to make them real. That's what Nick and, presumably, Ian are doing, too.

Most entrepreneurs start with big thoughts. I try to encourage students to think big thoughts, to consider an entrepreneurial career. The more ideas they have, the more options they have in careers and in life. Going to work for a big company is the right path for some, but some want more and can do their own thing -- if only they have the courage to start.

This idea is important for more than just starting start-ups. We can "think big and write small" even for the more ordinary programs we write. Sometimes we need a big idea to get us started writing code. Sometimes, we even need hubris. Every problem a novice faces can appear bigger than it is. Students who are able to think big often have more confidence. That is the confidence they need to start, and to persevere.

It is fun as a teacher to be able to encourage students to think big. As writer Roger Rosenblatt says,

One of the pleasures of teaching writing courses is that you can encourage extravagant thoughts like this in your students. These are the thoughts that will be concealed in plain and modest sentences when they write. But before that artistic reduction occurs, you want your students to think big and write small.

Many students come into our programming courses unsure, even a little afraid. Helping them free themselves to have extravagant ideas is one of the best things a teacher can do for them. Then they will be motivated to do the work they need to master syntax and idioms, patterns and styles.

A select few of them will go a step further and believe something even more audacious, that

... there's no purpose to writing programs unless you believe in significant ideas.

Those will be the students who start the Glass Cannons, the Book Hatcheries, and the T8s. We are all better off when they do.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

August 26, 2011 2:29 PM

Thoughts at the Start of the Semester

We have survived Week 1. This semester, I again get to teach Programming Languages, a course I love and about which I blog for a while every eighteen months or so.

I had thought I might blog as I prepped for the course, but between my knee and department duties, time was tight. I've also been slow to settle on new ideas for the course. In my blog ideas folder, I found notes for an entry debriefing the last offering of the course, from Spring 2010, and thought that might crystallize some ideas for me. Alas, the notes held nothing useful. They were just a reminder to write, which went unheeded during a May term teaching agile software development.

Yesterday I started reading a new book -- not a book related to my course, but Roger Rosenblatt's Unless It Moves the Human Heart: The Craft and Art of Writing. I love to read writers talking about writing, and this book has an even better premise: it is a writer talking about writing as he teaches writing to novices! So there is plenty of inspiration in it for me, even though it contains not a single line of Scheme or Ruby.

Rosenblatt recounts teaching a course called "Writing Everything". Most of the students in the course want to learn how to write fiction, especially short stories. Rosenblatt has them also read and write poems, in which they can concentrate on language and sounds, and essays, in which they can learn to develop ideas.

This is not the sort of course you find in CS departments. The first analogy that came to mind was a course in which students wrote, say, a process scheduler for an OS, a CRUD database app for a business, and an AI program. The breadth and diversity of apps might get the students to think about commonalities and differences in their programming practice. But a more parallel course would ask students to write a few procedural programs, object-oriented programs, and functional programs. Each programming style would let the student focus on different programming concepts and distinct elements of their craft.

I'd have a great time teaching such a "writing" course. Writing programs is fun and hard to learn, and we don't have many opportunities in a CS program to talk about the process of writing and revising code. Software engineering courses have a lot of their own content, and even courses on software design and development often deal more with new content than with programming practice. In most people's minds, there is not room for a new course like this one in the curriculum. In CS programs, we have theory and applications courses to teach. In Software Engineering programs, they seem far too serious about imitating other engineering disciplines to have room for something this soft. If only more schools would implement Richard Gabriel's idea of an MFA in software...

Despite all these impediments, I think a course in which students simply practiced programming in the large(r) and focused on their craft could be of great value to most CS grads.

I will let Rosenblatt's book inspire me and leak into my Programming Languages course where helpful. But I will keep our focus on the content and skills that our curriculum specifies for the course. By learning the functional style of programming and a lot about how programming languages work, students will get a chance to develop a few practical skills, which we hope will pay off in helping them to be better programmers all around, whether in Java, Python, Ruby, Scala, Clojure, or Ada.

One meta-message I hope to communicate both explicitly and implicitly is that programmers never stop learning, including their professor. Rosenblatt has the same goal in his writing course:

I never fail to say "we" to my students, because I do not want them to get the idea that you ever learn how to write, no matter how long you've done it.

Beyond that, perhaps the best I can do is let my students see that I am still mesmerized by the cool things we are learning. As Rosenblatt says,

Observing a teacher who is lost in the mystery of the material can be oddly seductive.

Once students are seduced, they will take care of their own motivation and their own learning. They won't be able to help themselves.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

August 21, 2011 9:32 AM

Overcoming a Disconnect Between Knowing and Doing

Before reading interviews with Hemingway and Jobs, I read a report of Ansel Adams's last interview. Adams was one of America's greatest photographers of the 20th century, of course, and several of his experiences seem to me to apply to important issues in software development.

It turns out that both photography and software development share a disconnect between teaching and doing:

One of the problems is the teaching of photography. In England, I was told, there's an institute in which nobody can teach photography until they've had five years' experience in the field, until they've had to make a go of it professionally.

Would you recommend that?

I think that teachers should certainly have far more experience than most of the ones I know of have had. I think very few of them have had practical experience in the world. Maybe it's an impossibility. But most of the teachers work pretty much the same way. The students differ more from each other than the teachers do.

Academics often teach without having experience making a living from the material they teach. In computer science, that may make sense for topics like discrete structures. There is a bigger burden in most of the topics we teach, which are done in industry and which evolve at a more rapid rate. New CS profs usually come out of grad school on the cutting edge of their specialties, though not necessarily on top of all the trends in industry. Those who take research-oriented positions stay on the cutting edge of their areas, but the academic pressure is often to become narrower in focus and thus farther from contemporary practice. Those who take positions at teaching schools have to work really hard to stay on top of changes out in the world. Teaching a broad variety of courses makes it difficult to stay on top of everything.

Adams's comment does not address the long-term issue, but it takes a position on the beginning of careers. If every new faculty member had five years of professional programming experience, I dare say most undergrad CS courses would be different. Some of the changes might be tied too closely to those experiences (someone who spent five years at Rockwell Collins writing SRSs and coding in Ada would learn different things from someone who spent five years writing e-commerce sites in Rails), but I think there would usually be some common experiences that would improve their courses.

When I first read Adams's comment, I was thinking about how the practitioner would learn and hone elements of craft that the inexperienced teacher didn't know. But the most important thing that most practitioners would learn is humility. It's easy to lecture rhapsodically about some abstract approach to software development when you haven't felt the pain it causes, or faced the challenges left even when it succeeds. Humility can be a useful personal characteristic in a teacher. It helps us see the student's experience more accurately and to respond by changing how and what we teach.

Short of having five years of professional experience, teachers of programming and software development need to read and study all the time -- and not just theoretical tomes, but also the work of professional developers. Our industry is blessed with great books by accomplished developers and writers, such as Design Patterns and Refactoring. The web and practitioners' conferences such as StrangeLoop are an incredible resource, too. As Fogus tweeted recently, "We've reached an exciting time in our industry: college professors influenced by Steve Yegge are holding lectures."

Other passages in the Adams interview stood out to me. When he shared his intention to become a professional photographer, instead of a concert pianist:

Some friends said, "Oh, don't give up music. ... A camera cannot express the human soul." The only argument I had for that was that maybe the camera couldn't, but I might try through the camera.

What a wonderful response. Many programmers feel this way about their code. CS attracts a lot of music students, either during their undergrad studies or after they have spent a few years in the music world. I think this is one reason: they see another way to create beauty. Good news for them: their music experience often gives them an advantage over those who don't have it. Adams believed that studying music was valuable to him as a photographer:

How has music affected your life?

Well, in music you have this absolutely necessary discipline from the very beginning. And you are constructing various shapes and controlling values. Your notes have to be accurate or else there's no use playing. There's no casual approximation.

Discipline. Creation and control. Accuracy and precision. Being close isn't good enough. That sounds a lot like programming to me!


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

August 13, 2011 2:47 PM

Trust People, Not Technology

Another of the interviews I've read recently was The Rolling Stone's 1994 interview with Steve Jobs, when he was still at NeXT. This interview starts slowly but gets better as it goes on. The best parts are about people, not about technology. Consider this, on the source of Jobs's optimism:

Do you still have as much faith in technology today as you did when you started out 20 years ago?

Oh, sure. It's not a faith in technology. It's faith in people.

Explain that.

Technology is nothing. What's important is that you have a faith in people, that they're basically good and smart, and if you give them tools, they'll do wonderful things with them. It's not the tools that you have faith in -- tools are just tools. They work, or they don't work. It's people you have faith in or not.

I think this is a basic attitude held by many CS teachers, about both technology and about the most important set of people we work with: students. Give them tools, and they will do wonderful things with them. Expose them to ideas -- intellectual tools -- and they will do wonderful things. This mentality drives me forward in much the same way as Jobs's optimism does about the people he wants to use Apple's tools.

I also think that this is an essential attitude when you work as part of a software development team. You can have all the cool build, test, development, and debugging tools money can buy, but in the end you are trusting people, not technology.

Then, on people from a different angle:

Are you uncomfortable with your status as a celebrity in Silicon Valley?

I think of it as my well-known twin brother. It's not me. Because otherwise, you go crazy. You read some negative article some idiot writes about you -- you just can't take it too personally. But then that teaches you not to take the really great ones too personally either. People like symbols, and they write about symbols.

I don't have to deal with celebrity status in Silicon Valley or anywhere else. I do get to read reviews of my work, though. Every three years, the faculty of my department evaluate my performance as part of the dean's review of my work and his decision to consider me for another term. I went through my second such review last winter. And, of course, frequent readers here have seen my comments on student assessments, which we do at the end of each semester. I wrote about assessments of my spring Intelligent Systems course back in May. Despite my twice annual therapy sessions in the form of blog entries, I have a pretty good handle on these reviews, both intellectually and emotionally. Yet there is something visceral about reading even one negative comment that never quite goes away. Guys like Jobs probably do their best not to read newspaper articles and unsolicited third-party evals.

I'll have to try the twin brother gambit next semester. My favorite lesson from Jobs's answer, though, is the second part: While you learn to steel yourself against bad reviews, you learn not to take the really great ones too personally, either. Outliers is outliers. As Kipling said, all people should count with you, but none too much. The key in these evaluations is to gather information and use it to improve your performance. And that almost always comes out of the middle of the curve. Treating raves and rants alike with equanimity keeps you humble and sane.

Ultimately, I think one's stance toward what others say comes back to the critical element in the first passage from Jobs: trust. If you trust people, then you can train yourself to accept reviews as a source of valuable information. If you don't, then the best you can do is ignore the feedback you receive; the worst is that you'll damage your psyche every time you read them. I'm fortunate to work in a department where I can trust. And, like Jobs, I have a surprising faith in my students' fairness and honesty. It took a few years to develop that trust and, once I did, teaching came to feel much safer.


Posted by Eugene Wallingford | Permalink | Categories: Managing and Leading, Personal, Software Development, Teaching and Learning

August 12, 2011 8:45 PM

"Always Stop When You Know What Is Going To Happen Next"

A few weeks ago, I ran across an article that quoted from published interviews with nine creative people. Over the last couple of days, I have been reading the original interviews that most interested me. Three in particular jogged my mind about creativity, art, and the making of things -- all of which are a part of how I view craft as a programmer and teacher.

Who would have thought that an interview with author Ernest Hemingway in The Paris Review's "The Art of Fiction No. 21" would make me think about computer programming and test-driven development? But it did. When asked about his writing schedule, Hemingway described a morning habit that I myself enjoy:

When I am working on a book or a story I write every morning as soon after first light as possible. There is no one to disturb you and it is cool or cold and you come to your work and warm as you write. You read what you have written and, as you always stop when you know what is going to happen next, you go on from there. You write until you come to a place where you still have your juice and know what will happen next and you stop and try to live through until the next day when you hit it again. You have started at six in the morning, say, and may go on until noon or be through before that. When you stop you are as empty, and at the same time never empty but filling, as when you have made love to someone you love. Nothing can hurt you, nothing can happen, nothing means anything until the next day when you do it again. It is the wait until the next day that is hard to get through.

I love this paragraph. While discussing mundane details of how he starts his writing day, Hemingway seamlessly shifts into a simile comparing writing -- and stopping -- to making love. Writers and other artists can say such things and simply be viewed for what they are: artists. I dare say many computer programmers feel exactly the same about writing programs. Many times I have experienced the strange coincident feelings of emptiness and fullness after a long day or night coding, and the longing to begin again tomorrow. Yet I knew, as Hemingway did, that the breaks were a necessary part of the discipline one needed to write well and consistently over the long haul. (Think sustainable pace, my friends.)

Yet, if one of us programmers were to say what Hemingway said above, to compare the feeling we have when we stop programming to the feeling we have after making love to a person we love, most people would have to fight back a smirk and suppress an urge to joke about nerds never having sex and not being able to get girls (or guys). The impolite among them would say it out loud. If you are careful in choosing your friends and perhaps a bit lucky, you will surround yourself with friends who react to you saying this with a sympathetic nod, because they know that you, too, are a writer, and something of an artist.

On a more practical note, the writing habit Hemingway describes resembles a habit many of us programmers have. In the world of TDD, you will often hear people say, "Stop at the end of the day with a failing test." My friend, poet and programmer Richard Gabriel, has spoken of ending the day in the middle of a line of code. Both ideas echo Hemingway's advice, because they leave us in the same great position the next morning: ready to start the day by doing something obvious, something concrete.
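
To make the habit concrete, here is a minimal sketch in Python's unittest style; the parse_date function is hypothetical, a stand-in for whatever one happens to be growing. The second test is the one written at quitting time, and it fails on purpose:

    import unittest

    def parse_date(s):
        # Handles only the ISO format so far; the U.S. format is
        # tomorrow morning's first task.
        year, month, day = s.split("-")
        return (int(year), int(month), int(day))

    class TestParseDate(unittest.TestCase):
        def test_iso_format(self):
            self.assertEqual(parse_date("2011-08-12"), (2011, 8, 12))

        def test_us_format(self):
            # The failing test left at the end of the day.
            self.assertEqual(parse_date("8/12/2011"), (2011, 8, 12))

    if __name__ == "__main__":
        unittest.main()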

But why is that so important?

But are there times when the inspiration isn't there at all?

Naturally. But if you stopped when you knew what would happen next, you can go on. As long as you can start, you are all right. The juice will come.

Writing is hard. Starting is hard. But if you are a writer, you must write, you must start. Likewise a programmer. As many people will tell you, inspiration is a fickle and overrated gift. Hemingway speaks elsewhere in the interview of days filled with inspiration, but they are rare. The writer writes regardless of inspiration. In writing, one often creates the very inspiration he seeks.

As long as you can start, you are all right.

Later in the piece, Hemingway has something interesting to say about a different sort of starting: starting a career. When asked if financial security can be a detriment to good writing, he says:

If it came early enough and you loved life as much as you loved your work it would take much character to resist the temptations. Once writing has become your major vice and greatest pleasure only death can stop it. Financial security then is a great help as it keeps you from worrying. Worry destroys the ability to write. [Worry] attacks your subconscious and destroys your reserves.

This made me think of young CS grads in start-up companies, working to get by on a minimal budget while fulfilling a passion to make something. Being poor may not be as good for our souls as some would have us think, but it does inoculate us from temptations available to us only if we have resources. Once programming is your habit -- "your major vice and greatest pleasure" -- then you are on the path for a productive life as a programmer. If financial success comes too early, or if you are born with resources, you can still become a programmer, but you may have to battle the attraction of things that will get in the way of the work necessary to develop your craft.

This is one of the motives behind the grueling six-year trial to which we subject new profs in our universities: to instill habits of work and thought before they receive the temptation-heavy mantle of tenure. Unlike Hemingway's prescription, though, at most research schools the tenure-track phase usually includes an unhealthy dose of uncertainty and worry. But then again, maybe being a poor, struggling young writer or artist does, too.

That's one reason I like Hemingway's answer so much. He does not romanticize being poor. He acknowledges that, once one has the habit of writing, financial security can be a great benefit, because it relieves the writer of the stress that can kill her productivity.

Despite enjoying these insightful passages so much, I cannot say that this is a great interview. Hemingway is too often unwilling to talk about elements of the craft of writing, and he expends too many words telling interviewer George Plimpton -- an intelligent man and accomplished journalist himself -- that his questions are clichéd, worn, or stupid. Still, I was driven to read through to the end, and I enjoyed it.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

August 11, 2011 7:59 PM

Methods, Names, and Assumptions in Adding New Code to a Program

Since the mid-1990s, there has been a healthy conversation around refactoring, the restructuring of code to improve its internal structure without changing its external behavior. Thanks to Martin Fowler, we have a catalog of techniques for refactoring that help us restructure code safely and reliably. It is a wonderful tool for learners and practitioners alike.

When it comes to writing new code, we are not so lucky. Most of us learn to program by learning to write new code, yet we rarely learn techniques for adding code to a program in a way that is as safe, reliable, and effective as the refactorings we know and love.

You might think that adding code would be relatively simple, at least compared to restructuring a large, interconnected web of components. But how can we move with the same confidence when adding code as we do when we follow a meticulous refactoring recipe under the protection of good unit tests? Test-driven design is a help, but I have never felt like I had the same sort of support writing new code as when I refactor.

So I was quite happy a couple of months ago to run across J.B. Rainsberger's Adding Behavior with Confidence. Very, very nice! I only wish I had read it a couple of months ago when I first saw the link. Don't make the same mistake; read it now.

Rainsberger gives a four-step process that works well for him:

  1. Identify an assumption that the new behavior needs to break.
  2. Find the code that implements that assumption.
  3. Extract that code into a method whose name represents the generalisation you're about to make.
  4. Enhance the extracted method to include the generalisation.

I was first drawn to the idea that a key step in adding new behavior is to make a new method, procedure, or function. This is one of the basic skills of computer programming. It is one of the earliest topics covered in many CS1 courses, and it should be taught sooner in many others.

Even still, most beginners seem to fear creating new methods. Even more advanced students will regress a bit when learning a new language, especially one that works differently than the languages they know well. A function call introduces a new point of failure: parameter passing. When worried about succeeding, students generally try to minimize the number of potential points of failure.

Notice, though, that Rainsberger starts not with a brand new method, empty except for the new code to be written. This technique asks us first to factor out existing code into a new method. This breaks the job of writing the new code into two smaller steps: first refactor, relying on a well-known technique and the existing tests to provide safety; second, add the new code. (These are Steps 3 and 4 in Rainsberger's technique.)
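
Here is a small sketch of how the steps might play out, in Python with invented code. Suppose the assumption to break is "a report is always sorted by name", and the new behavior to add is sorting by date:

    from collections import namedtuple

    Entry = namedtuple("Entry", ["name", "date"])

    # Before: the assumption was buried inline in build_report.
    #
    #     def build_report(entries):
    #         entries = sorted(entries, key=lambda e: e.name)
    #         return ["%s: %s" % (e.name, e.date) for e in entries]

    # Step 3: extract the assumption into a method whose name states
    # the generalization. Step 4: enhance it to include the new case.
    def sort_for_report(entries, order):
        keys = {"name": lambda e: e.name, "date": lambda e: e.date}
        return sorted(entries, key=keys[order])

    def build_report(entries, order="name"):
        return ["%s: %s" % (e.name, e.date)
                for e in sort_for_report(entries, order)]

The extract step leans on a well-known refactoring; the enhancement that follows is then a small, focused change.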

That isn't what really grabbed my attention first, however. The real beauty for me is that extracting a method forces us to give it a name. I think that naming gives us great power, and not just in programming. A lot of times, CS textbooks make a big deal about procedures as a form of abstraction, and they are. But that often feels so abstract... For programmers, especially beginners, we might better focus on the fact that procedures help us to name things in our programs. Names, we get.

By naming a procedure that contains a few lines of code, we get to say what the code does. Even the best-factored code that uses good variable names tends to say how something is done, not what it is doing. Creating and calling a method separates the two: the call site says what the code does, and the method body implements how it is done. This separation gives us new power: to refactor the code in other ways, certainly. Rainsberger reminds us that it also gives us power to add code more reliably!

"How can I add code to a program? Write a new function." This is an unsurprising, unhelpful answer most of the time, especially for novices who just see this as begging the question. "Okay, but what do I do then?" Rainsberger makes it a helpful answer, if a bit surprising. But he also puts it in a context with more support, what to do before we start writing the new code.

Creating and naming procedures was the strongest take-home point for me when I first read this article. As the ideas steeped in my mind for a few days, I began to have a greater appreciation for Rainsberger's focus on assumptions. Novice thinkers have trouble with assumptions. This is true whether they are learning to program, learning to read and analyze literature, or learning to understand and argue public policy issues. They have a hard time seeing assumptions, both the ones they make and the ones made by other writers. When the assumptions are pointed out, they are often unsure what to do with them, and are tempted to skip right over them. Assumptions are easy to ignore sometimes, because they are implicit and thus easy to lose track of when deep in an argument.

Learning to understand and reason about assumptions is another important step on the way to mature thinking. In CS courses, we often introduce the idea of preconditions and postconditions in Data Structures. (Students also see them in a discrete structures course, but there they tend to be presented as mathematical tools. Many students dismiss their value out of hand.) Writing pre- and postconditions for a method is a way to make assumptions in your program explicit. Unfortunately, most beginners don't yet see the value in writing them. They feel like an extra, unnecessary step in a process dominated by the uncertainty they feel about their code. Assuring them that these invariants help is usually like pushing a rock up a hill. Tomorrow, you get to do it again.
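
One low-ceremony way to make the payoff visible is to write the invariants as executable assertions, so that a violated assumption fails a running program rather than hiding in a comment. A sketch, using an invented queue class:

    class Queue:
        def __init__(self):
            self.items = []

        def enqueue(self, item):
            self.items.append(item)

        def dequeue(self):
            # Precondition: the queue must not be empty.
            assert len(self.items) > 0, "dequeue requires a non-empty queue"
            old_size = len(self.items)
            front = self.items.pop(0)
            # Postcondition: the queue shrank by exactly one element.
            assert len(self.items) == old_size - 1
            return front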

One thing I like about Rainsberger's article is that it puts assumptions into the context of a larger process aimed at helping us write code more safely. Mathematical reasoning about code does that, too, but again, students often see it as something apart from the task of programming. Rainsberger's approach is undeniably about code. This technique may encourage programmers to begin thinking about assumptions sooner, more often, and more seriously.

As I said, I haven't seen many articles or books talk about adding code to a program in quite this way. Back in January, "Uncle Bob" Martin wrote an article in the same spirit as this, called The Transformation Priority Premise. It offers a grander vision, a speculative framework for all additions to code. If you know Uncle Bob's teachings about TDD, this article will seem familiar; it fits quite nicely with the mentality he encourages when using tests to drive the growth of a program. While his article is more speculative, it seems worthy of more thought. It encourages the tiniest of steps as each new test provokes new code in our program. Unfortunately, it takes such small steps that I fear I'd have trouble getting my students, especially the novices, to give it a fair try. I have a hard enough time getting most students to grok the value of TDD, even my seniors!

I have similar concerns about Rainsberger's technique, but his pragmatism and unabashed focus on code gives me hope that it may be useful in teaching students how to add functionality to their programs.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

July 06, 2011 12:23 PM

Don't Forget Your 3000-LOC Check-Up!

Yesterday, Michael Feathers tweeted:

If a code base is more complicated than a car, shouldn't it have a maintenance plan too?

I asked him, "Refactoring every 3K miles?", and he joked back, "Well, maybe every 3K lines." I had thought about using LOC in my tweet decided to stick with the auto analogy. Whatever its weaknesses, LOC seems to be the first place programmers' minds go when we talk about the volume of code. (Though, as much traveling as Feathers and other big-time consultants and speakers do in a year, maybe 3000 miles is the right magnitude after all.)

Feathers makes a serious point, even if he didn't mean it too seriously. When we buy a car, we implicitly accept the notion of scheduled maintenance: change the oil every so often; have the engine tuned up every so often; replace the battery and rotate the tires every so often. We accept it because we know that it makes our car run better and last longer.

When we buy software, we want it to run forever, as is. Or the company who sells it to us wants us to run it as-is forever -- or buy a new version. Imagine having to buy a new car as soon as your current car started coughing, wheezing, or seizing up on dirty oil.

I mentioned refactoring in my joke because it is part of the maintenance plan built in to XP and used in so many agile approaches to software development. XP discourages long-form maintenance in the form of refactoring every few months or even weeks. Instead, it encourages a sort of continuous maintenance, in a tight test-code-refactor cycle. It's kind of like checking your car's oil, fluids, tires, etc., after every use.

When we do that to a car, it's usually because the car is in bad shape, breaking down as we try to extend its life. But continuous refactoring of a code base is usually a sign of robust health. It means that we know our code is in good shape and ready for use -- and extension. Teams that maintain their code on automobile-like time scales are usually sitting on a time bomb. Users may be able to use the code, but the programmers dare not touch its internals.

My other thought as I tweeted was about our inability, or perhaps unwillingness, to make this idea come alive for the students in our university CS programs.

It is an inability because it is so hard to create ways for students to live with any body of code longer than a semester or two, and time is a necessary ingredient in facing the need for maintenance. Of all my project courses, compilers seems the most frequent teacher of this lesson. A semester may not be long, but a compiler is complex enough, and a non-trivial language spec hard enough to understand, to accelerate the sense of age and deterioration.

It is perhaps an unwillingness because most every CS faculty I know makes very little effort to change courses and degree programs to make this lesson approachable. The good news is that making changes to bring this idea within our students' learning horizons also brings a lot of other important software development lessons within their horizons.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

June 27, 2011 5:03 PM

"You can code. That is pretty damn cool."

I've been off-line a lot lately, doing physical therapy for my knee and traveling a bit. That means I have a lot of fun reading to catch up on! One page that made the rounds recently is Advice From An Old Programmer, from Zed Shaw's intro book, Learn Python The Hard Way. Shaw has always been a thoughtful developer and an entertaining writer with a unique take on programming. Now he has put his money where his mouth is with a book that aims to teach programming in a style he thinks most effective for learners.

I look forward to digging into the book soon, but for now his advice page has piqued a lot of interest. For example:

Programming as a profession is only moderately interesting. It can be a good job, but you could make about the same money and be happier running a fast food joint. You're much better off using code as your secret weapon in another profession.

As a matter of personal opinion, I disagree with the first sentence, and could never make the switch discussed in the second. But I do think that the idea of programming as a secret weapon in other professions has a lot to offer people who would never want to be computer scientists or full-time software developers. It's a powerful tool that frees you from wishing you had a programmer around. It changes how you can think about problems in your discipline and lets you ask new questions.

Finally, Shaw tells his readers not to worry when non-programmers treat them badly because they are now nerds who can program. He gives good reasons why you shouldn't care about such taunts, and then sums it up in a Zed Shaw-like killer closing line:

You can code. They cannot. That is pretty damn cool.

Amen.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

June 15, 2011 7:53 AM

Barbarians at the Agile Gate

For the last year or so, there seems to have been something of a backlash against agile software development. A lot of conference speakers and bloggers have been telling us about the failures of the move to agile approaches, both the misconceptions at its base and the misdirections in its evolution. I have been especially uncomfortable watching writers and consultants who have been teaching the world the ways of agile development join the bandwagon.

Don't get me wrong. I am aware that agile development has its weaknesses; all things do. I'm also aware that the social and technical movement leading to its adoption across our industry has had its problems; all movements do. I do believe that we should understand the weaknesses in our practices and our methods for teaching the world about them, so that we can learn how to do those things better. Still, it's been a little disconcerting to watch the backlash proceed.

Then again, in my short career, I've seen something similar happen to design patterns, object-oriented programming, and structured programming. It seems the natural order of history in our business. I've long wondered why.

Then I came across The Return of the Barbarian, which tells the story of human history as a cycle between barbarian culture and civilization. If we think of the software world in similar terms, our own history makes sense. Agile software development was once the barbarian. Now it's the civilized culture, ripe to be conquered by a new, hungry barbarian.

How so? Think back to before the days of agile. The software world, broadly speaking, had an entrenched culture that we all recognize. Developers and clients were different breeds. Developers and users, too. We documented stable requirements, designed up front a software system to deliver them, and passed the design on to programmers, who wrote the code. When the process didn't go as well as planned, programmers worked longer and harder to meet their deadlines. This was an idealized world, to be sure, but we strove hard to meet the ideal. People who had made this style of development work -- smart people, energetic people -- codified their knowledge in textbooks and training courses, so that the rest of us could learn how to duplicate their successes. Anyone willing to study could learn the techniques and become part of civilized society.

Collectively, we were wise, but individually, we were weak, really just going through the motions.

Along came agile. It turns out that there were world-class developers -- smart people, energetic people -- who did things differently. They worked closely with their clients and users. They collaborated heavily among themselves. They took small steps, refactored their code, and grew their software, rather than erecting it. To many in the industry, this was a romantic way to live and work, and they wanted to join in the fun.

The dominant software culture didn't see it that way, though. Agile looked, well, barbaric. What happens when a cowboy programmer does something to break the system? How can we write programs if we don't know everything about what it should do? Where are the systematic controls on the process? Those entrenched in civilized software development culture saw agile methods as a step backward, to a less advanced time. But they were wrong, as we read in "The Return of the Barbarian":

The reason this seems like a strange phenomenon is that we confuse refinement with advancement. Finely-crafted jewelry is not more advanced than roughly-hewn jewelry. A Boeing 747 is about a million times more capable than the Wright Flyer I, but it does not contain a million times as much intelligence. It is merely more refined.... The difference between advancement and refinement is clearest in disruption. A beautifully-crafted sword is not more advanced than a crude gun. It is merely more refined.

The problem was, the software civilization built around structured programming was more refined than the agile approaches, but not necessarily more advanced. The system itself was quite intelligent, with much wisdom and knowledge encoded in its practices, its textbooks, and its other literature. But individually, developers did not need to be as sharp, because the system guided them to success (as much as it could).

The above passage is soon followed by a stark distinction:

The intelligence manifest in an artifact is simply the amount of human thought that has been externalized into it. Refinement on the other hand, is a measure of the amount of work that has gone into it. In Hegelian terms, intelligence in design is fundamentally a predatory quality put in by barbarian-Masters. Refinement in design is a non-predatory quality put in by civilized-Slaves.

It was in this context that agile -- a new barbarian culture -- swooped in and made inroads. The existing culture derided it as a fad, but it was in fact a set of advanced values, principles, and practices, less polished than the refined extant culture but full of deep thought and human experience.

Over time, we saw the inevitable cultural evolution. In order to teach agile values, principles, and practices to a wider audience, barbarian-masters wrote down their wisdom in the form of books and conference talks and podcasts and index cards. In order to reach the managerial class of the corporations that build and buy software, first- and second-generation agilistas created management seminars and certification programs and the trappings of institutional respectability. The practices were packaged, adjusted... refined. Soon everyone could "do" agile, or at least pretend to, by following the rules. Some folks got it, though not everyone; most everyone tried to do it.

The agile barbarian has become the civilized Organization Man. Where is the romance in that?

Now new barbarians are at the gate, ready to shove the once-romantic revolutionaries of OO and patterns and agile development out into the streets. It is the natural order.

Here's the thing. Agile approaches still work as well as they did back when they were the knowledge in the minds of great programmers, who were unhindered by the rules, following them instinctively and breaking them when necessary in the pursuit of a great program. That knowledge was born out of experience and embodied wisdom gained from living in the civilization that came before.

If we want to become like the agile masters, we need to do more than imitate them. We need to live the values, principles, and practices -- to make them our own, not something written in a book or a blog.

Perhaps this is just a fanciful retelling of our history to explain away the agile movement's shortcomings and failures. But I think it carries a grain of truth. A new barbarian knocks down the door as soon as we become too civilized, when we convert our deep, compiled, pragmatic, contextual knowledge into handbooks and notecards and consultants' packaged talks and on-site training courses.

To the extent that my tale is true, I think it offers us a path forward: a reminder to embrace our barbarian past. Each of us can develop his or her own individual intelligence, rather than relying on the external trappings of a civilized movement. Don't let the next revolution cause you to abandon what works well. Look for useful knowledge and practice in the ways of the next wave of barbarians, and use it to get better.

Instead of arguing with others about test-first design, pair programming and refactoring, about collaboration with users, collective code ownership, and sustainable pace, we should simply do them, in the way that fits best our context. This leads by example. It may not scale as fast as the civilized approach, but as we have seen, that approach is fraught with its own dangers.

Besides, the goal isn't to scale or to create a movement. It's to write great programs.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

May 23, 2011 2:34 PM

Plan A Versus Plan B

Late last week, Michael Nielsen tweeted:

"The most successful people are those who are good at Plan B." -- James Yorke

This is one of my personal challenges. I am a pretty good Plan A person. Historically, though, I am a mediocre Plan B person. This is true of creating Plan B, but more importantly of recognizing and accepting the need for Plan B.

Great athletes are good at Plan B. My favorite Plan B from the sporting world was executed by Muhammad Ali in the Rumble in the Jungle, his heavyweight title fight against George Foreman in October 1974. Ali was regarded by most at that time as the best boxer in the world, but in Foreman he encountered a puncher of immense power. At the end of Round 1, Ali realized that his initial plan of attacking Foreman was unlikely to succeed, because Foreman was also a quick fighter who had begun to figure out Ali's moves. So Ali changed plans, taking on greater short-term risk by allowing Foreman to hit him as much as he wanted, so long as the blows were not the kind likely to end the fight immediately. Over the next few rounds, Foreman began to wear down, unaccustomed to throwing so many punches for so many rounds against an opponent who did not weaken. Eventually, Ali found his opening, attacked, and ended the fight in Round 8.

This fight is burned in my mind for the all-time great Plan B moment: Ali sitting on his stool between the first and second rounds, eyes as wide and white as platters. I do not ever recall seeing fear in Muhammad Ali's eyes at any other time in his career, before or after this fight. He believed that Foreman could knock him out. But rather than succumb to the fear, he gathered himself, recalculated, and fought a different fight. Plan B. The Greatest indeed.

Crazy software developer that I am, I see seeds of Plan B thinking in agile approaches. Keep Plan A simple, so that you don't overcommit. Accept Plan B as a matter of course, refactoring in each cycle to build what you learn from writing the code back into the program. React to your pair's ideas and to changes in the requirements with aplomb.

There is good news: We can learn how to be better at Plan B. It takes effort and discipline, just as changing any of our habits does. For me, it is worth the effort.

~~~~

If you would like to learn more about the Rumble in the Jungle, I strongly recommend the documentary film When We Were Kings, which tells the story of this fight and how it came to be. Excellent sport, excellent art, and you can see Ali's Plan B moment with your own eyes.


Posted by Eugene Wallingford | Permalink | Categories: General, Personal, Software Development

May 20, 2011 11:55 AM

Learning From Others

I've been reading through some of the back entries in Vivek Haldar's blog and came across the entry Coding Blind. Haldar notes that most professionals and craftsmen learn their trade at least in part by watching others work, but that's not how programmers learn. He says that if carpenters learned the way programmers do, they'd learn the theory of how to hammer nails in a classroom and then do it for the rest of their careers, with every other carpenter working in a different room.

Programmers these days have a web full of open-source code to study, but that's not the same. Reading a novel doesn't give you any feel at all for what writing a novel is like, and the same is true for programming. Most CS instructors realize this early in their careers: showing students a good program shows them what a finished program looks like, but it doesn't give them any feel at all for what writing a program is like. In particular, most students are not ready for the false starts and the rewriting that even simple problems will cause them.

Many programming instructors try to bridge this gap by writing code live in class, perhaps with student participation, so that students can experience some of the trials of programming in a less intimidating setting. This is, of course, not a perfect model; instructors tend not to make the same kind of errors as beginners, or as many, but it does have some value.

Haldar points out one way that other kinds of writers learn from their compatriots:

Great artists and writers often leave behind a large amount of work exhaust other than their finished masterpieces: notebooks, sketches, letters and journals. These auxiliary work products are as important as the finished item in understanding them and their work.

He then says, "But in programming, all that is shunned." This made me chuckle, because I recently wrote a bit about my experience having students maintain engineering notebooks for our Intelligent Systems course. I do this so that they have a record of their thoughts, a place to dump ideas and think out loud. It's an exercise in "writing to learn", but Haldar's essay makes me think of another potential use of the notebooks: for other students to read and learn from. Given how reluctant my students were to write at all, I suspect that they would be even more reluctant to share their imperfect thoughts with others in the course. Still, perhaps I can find a way to marry these ideas.

cover of rpg's Writers' Workshops

This makes me think of another way that writers learn from each other, writers' workshops. Code reviews are a standard practice in software, and PLoP, the Pattern Languages of Programs conference, has adapted the writers' workshop form for technical writers. One of the reasons I like to teach certain project courses in a studio format is that it gives all the teams an opportunity to see each other's work and to talk about design, coding, and anything else that challenges or excites them. Some semesters, it works better than others.

Of course, a software team itself has the ability to help its members learn from one another. One thing I noticed more this semester than in the past was students commenting that they had learned from their teammates by watching them work. Some of the students who said this viewed themselves as the weakest links on their teams and so saw this as a chance to approach their more accomplished teammates' level. Others thought of themselves as equals to their teammates yet still found themselves learning from how others tackled problems or approached learning a new API. This is a team project succeeding as we faculty hope it might.

Distilling experience with techniques in more than just a finished example or two is one of the motivations for the software patterns community. It's one of the reasons I felt so comfortable with both the literary form and the community: its investment in and commitment to learning from others' practice. That doesn't operate at quite the fundamental level of watching another carpenter drive a nail, but it does strike close to the heart of the matter.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

May 13, 2011 2:26 PM

Patterns for Naming Things

J.B. Rainsberger's short entry grabbed my attention immediately. I think that Rainsberger is talking about a pair of complementary patterns that all developers learn at some point or other as they write more and bigger programs. He elegantly captures the key ideas in only a few words.

These patterns balance common forces between giving things long names and giving things short names. A long name can convey more information, but a short name is easier to type, format, and read. A long name can be a code smell that indicates a missing abstraction, but a short name can be a code smell that indicates premature generalization, a strange kind of YAGNI violation.

The patterns differ in the contexts in which they appear successfully. Long names are most useful the first time or two you implement an idea. At that point, there are few or no other examples of the idea in our code, so there is not yet a need for an abstraction. A long name can convey valuable information about the idea. As an idea appears more often, two or more long names will begin to overlap, which is a form of duplication. We are now ready to factor out the abstraction common to them. Now the abstraction conveys some or all of the information and short names become more valuable.
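
A tiny sketch, with invented Python code, of how the balance shifts:

    # Early on, one long name carries the whole idea.
    def total_price_with_state_sales_tax(subtotal):
        return subtotal * 1.07

    # After the idea recurs -- shipping surcharges, discounts -- the
    # abstraction carries the information, and short names suffice.
    def apply_rate(amount, rate):
        return amount * (1 + rate)

    SALES_TAX = 0.07

    def total_price(subtotal):
        return apply_rate(subtotal, SALES_TAX)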

I need to incorporate these into any elementary pattern language I document, as well as in the foundation patterns layer of any professional pattern language. One thing I would like to think more about is how these patterns relate to Kent Beck's patterns Intention-Revealing Name and Type-Revealing Name.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

May 12, 2011 10:18 AM

What Students Said

Curly says, 'one thing'

On the last day of my Intelligent Systems course, I asked my students three retrospective questions. Each question asked them to identify one thing...

  • one thing you learned about AI by doing this project
  • one thing you learned about writing software by doing this project
  • one thing that makes your program "intelligent"

Question 3 is a topic for another day, when I will talk a bit about AI. Today I am thinking more about what students learned about writing software. As one of our curriculum's designated "project courses", Intelligent Systems has the goal of giving students an experience building a significant piece of software, as part of a team. What do the students themselves think they learned?

A couple of answers to the first question were of more general software development interest:

I learned that some concepts are easy to understand conceptually but difficult to implement or actually use.

I learned to be open-minded about several approaches to solving a problem. ... be prepared to accept that an approach might take a lot of time to understand and end up being [unsuitable].

There is nothing like trying to solve a real problem to teach you how hard some solutions are to implement. Neural networks were the most frequently mentioned concept that is easy to understand but hard to make work in practice. Many students come out of their AI course thinking neural nets are magic; it turns out magic can be hard to serve up. I suspect this is true of many algorithms and techniques students learn over the course of their studies.

I don't recall talking about agile software development much during this course, though no doubt it leaks out in how I typically talk about writing software. Still, I was surprised at the theme running through student responses to the second question.

For example:

Design takes time. Multiple iterations, revise and test.

A couple of teams discovered spike solutions, sorta:

You may write a lot of worthless or bad code to help with the final solution. We produced a lot of bad code that was never used in the end product, but it helped us get to that point.

These weren't true spikes, because the teams didn't set out with the intention of using the code to learn. But most didn't realize that they could or should do this. Now that they know, they might behave differently in the future. Most important, they learned that it's okay to "code to learn".

Many students came to appreciate collective code ownership and tools that support it:

When writing software in a group, it is important to make your code readable: descriptive [names] and comments that describe what is going on.

I learned how to maintain a project with a repository so that each team member can keep his copy up-to-date. ... I also learned how to use testing suites.

Tests also showed up in one of my favorite student comments, about refactoring:

I learned that when refactoring even small code you need unit tests to make sure you are doing things correctly. Brute forcing only gets you into trouble and hours of debugging bad code.

Large, semester-long projects usually give students their first opportunity to experience refactoring. Living inside a code base for a while -- especially code they themselves have written -- teaches them a lot about what software development is really like. Many are willing to accept that living with someone else's code can be difficult but believe that their own code will be fine. Turns out it's not. Most students then come to appreciate the value of refactoring techniques. I need to help them learn refactoring tools better.
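
The student's point about unit tests is easy to make concrete. Here is a minimal JUnit 5 sketch; WordCounter is a hypothetical stand-in for whatever code the team wants to restructure:

    import static org.junit.jupiter.api.Assertions.assertEquals;
    import org.junit.jupiter.api.Test;

    // The code under refactoring.
    class WordCounter {
        static int count(String s) {
            String trimmed = s.trim();
            return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        }
    }

    // A tiny safety net: these tests pin down current behavior, so we
    // can rename, extract, and restructure count() and find out
    // immediately if we have broken something.
    class WordCounterTest {
        @Test
        void countsWordsSeparatedByWhitespace() {
            assertEquals(3, WordCounter.count("make it work"));
        }

        @Test
        void ignoresLeadingAndTrailingBlanks() {
            assertEquals(2, WordCounter.count("  hello world  "));
        }

        @Test
        void emptyStringHasNoWords() {
            assertEquals(0, WordCounter.count(""));
        }
    }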

Finally, this comment from the first student retrospective I read captures a theme I saw throughout:

It is best to start off simple and make something work, rather than trying to solve the entire problem at once and get lost in its complexity.

This is in many ways the heart of agile software development and the source for all the other practices we find so useful. Whatever practices my own students adopt in the coming years, I hope they are guided by this idea.

~~~~

Some of you will recognize the character in the image above as Curly, the philosopher-cowboy from City Slickers. One of the great passages of that 1991 film has Curly teaching protagonist Mitch about the secret of life, "One thing. Just one thing."

I am not the first software person to use Curly as inspiration. Check out, for example, Curly's Law: Do One Thing. Atwood shows how "do one thing" is central to several core principles of modern software development.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

May 10, 2011 4:32 PM

Course Post-Mortem and Project Notebooks

I'm pretty much done with my grading for the semester. All that's left is freezing the grades and submitting them.

Intelligent Systems is a project course, and I have students evaluate their and their teammates' contributions to the project. One part of the evaluation is to allocate the points their team earns on the project to the team members according to the quality and quantity of their respective contributions. As I mentioned to the class earlier in the semester, point allocations from semester to semester tend to exhibit certain features. With few exceptions:

  • Students are remarkably generous to one another, as long as the teammate makes a reasonable effort under the circumstances.
  • If anything, students tend to undervalue their own contribution.
  • The allocations are remarkably consistent across teammates on the same team.
  • The allocations are remarkably consistent with what I would assign, based on my interactions with the team over the course of the project.

All that adds up to me being rather satisfied with the grades that fall out of the grinder at the end of the semester.

One thing that has not changed since I last taught this course ten years ago or so is that most students don't like the idea of an engineer's notebook. I ask each student to maintain a record of their notes while working on the project, along with a weekly log intended to be a periodic retrospective of their work and progress, their team's work and progress, and the problems they encounter and solve along the way. Students have never liked keeping notebooks. Writing doesn't seem to be a habit we develop in our majors, and by the time they reach their ultimate or penultimate semester, the habit of not writing is deeply ingrained.

One thing that may have changed in the last decade: students seem more surly at being asked to keep a notebook. In the past, students either did write or didn't write. This year, for the most part, students either didn't write or didn't write much except to say how much they didn't like being asked to write. I have to admire their honesty at the risk of being graded more harshly for having spoken up. (Actually, I am proud they trust me enough to still grade them fairly!) I can't draw a sound conclusion from one semester's worth of data, but I will watch for a trend in future semesters.

One thing that did change this semester: I allowed students to blog instead of maintaining a paper notebook. I was surprised that only two students took me up on the offer. Both ended up with records well above the average for the class. One of the students treated his blog a bit more formally than I think of an engineer's notebook, but the other seemed to treat it much as he would have a paper journal. This was a win, one I hope to replicate in the future.

The Greeks long ago recorded that old habits die hard, if at all. In the future, I will have to approach the notebook differently, including more and perhaps more persuasive arguments for it up front and more frequent evaluation and feedback during the term. I might even encourage or require students to blog. This is 2011, after all.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

May 05, 2011 3:35 PM

Reaching for Too Much, in Life and Software Development

I just read this passage from The Rhythm of Life, by Matthew Kelly:

You never can get enough of what you don't really need.

Fulfillment comes not from having more and more of everything forever into oblivion. Fulfillment comes from having what you need.

Kelly is talking about how we live our lives. However, I could not help but think of You Aren't Gonna Need It and agile software development.

From there, Kelly takes a moral turn, but even then I hear the agile voice within:

The whole world is chasing illegitimate wants with reckless abandon. We use all of our time, effort, and energy in the pursuit of our illegitimate wants, hypnotized by the lie that our illegitimate wants are the key to our happiness.

At the same time, the gentle voice within us is constantly calling out to us, trying to encourage us not to ignore the wisdom we already possess.

There is a lot to be said for learning to be content with implementing the features we are working on right now, not features we think are coming in the future. Perhaps if we can learn to be content in life we can also learn to be content in code.


Posted by Eugene Wallingford | Permalink | Categories: Personal, Software Development

May 03, 2011 4:26 PM

There Is No Normal

Part of what made diagnosing my knee injury challenging is that the injury has not presented normally. Normally, this condition follows an obvious trauma. I did not suffer one. Normally, the symptoms include occasional locking of the joint and occasionally feeling as if the joint is going to give out. I have not experienced either. Normally, there is more pain than I seem to be having.

The doctors were surprised by this unusual presentation, but it didn't worry them much. They are used to the fact that there is no normal.

The human body is a complex machine, and people subject their bodies to a complex set of stimuli and conditions. As a result, the body responds in an unbelievable number of ways. What we think of as the "normal" path of most diseases, injuries, and processes is a composite of many examples. Each symptom or observation has some likelihood of occurring, but it is common for a particular case to look quite unusual.

This is something we learn when we study statistical methods. It's possible that no number in a set is equal to the average of all the numbers in the set; the average of 1 and 2 is 1.5, which matches neither. It's possible that no member of a set is normal in the sense of sharing all the features that are common to most members.

A large software system is a complex machine, and people subject software to a complex set of stimuli and conditions. As a result, the software responds in a surprising number of ways. When we think of this from the perspective of people as users, we realize just how important it is to design for usability, reliability, and robustness.

Programmers are people who interact with software, too, and we subject our programs to a wide-ranging set of demands. When we think about "there is no normal" from this perspective, we better understand why it is so challenging to debug, extend, and maintain programs.

Our programs may not be as complex as the human body, and we try to design them rather than let them evolve unguided. But I think it's still useful to program with a mindset that there is no normal. That way, like my doctor, we can handle cases that seem unusual with aplomb.


Posted by Eugene Wallingford | Permalink | Categories: Personal, Software Development

May 02, 2011 3:52 PM

Thinking and Doing in the Digital Age

Last week, someone I follow tweeted this link in order to share this passage:

You will be newbie forever. Get good at the beginner mode, learning new programs, asking dumb questions, making stupid mistakes, soliciting help, and helping others with what you learn (the best way to learn yourself).

That blog entry is about the inexorable change of technology in the modern world and how, if we want to succeed in this world, we need a mindset that accommodates change. We might even argue that we need a mindset that welcomes or seeks out change. To me, this is one of the more compelling reasons for us to broaden the common definition of the liberal arts to include computing and other digital forms of communication.

As much as I like the quoted passage, I liked a couple of others as much or more. Consider:

Understanding how a technology works is not necessary to use it well. We don't understand how biology works, but we still use wood well.

As we introduce computing and other digital media to more people, we need to balance teaching how to use new ideas and techniques with teaching their underlying implementations. Some tools change how we work without us knowing how they work, or needing to know. It's easy for people like me to get so excited about, say, programming that we exaggerate its importance. Not everyone needs to program all the time.

Then again, consider this:

The proper response to a stupid technology is to make a better one yourself, just as the proper response to a stupid idea is not to outlaw it but to replace it with a better idea.

In the digital world as in the physical world, we are not limited by our tools. We can change how our tools work, through configuration files and scripts. We can make our own tools.

Finally, an aphorism that captures differences between how today's youth think about technology and how people my age often think (emphasis added):

Nobody has any idea of what a new invention will really be good for. To evaluate, don't think; try.

This has always been true of inventions. I doubt many people appreciated just how different the world would be after the creation of the automobile or the transistor. But with digital tools, the cost of trying things out has been driven so low, relative to the cost of trying things in the physical world, that the cost is effectively zero. In so many situations now, the net value of trying things exceeds the net value of thinking.

I know that sounds strange, and I certainly don't mean to say that we should all just stop thinking. That's the sort of misinterpretation too many people made of the tenets of extreme programming. But the simple fact is, thinking too much means waiting too long. While you are thinking -- waiting to start -- someone else is trying, learning faster, and doing things that matter.

I love this quote from Elisabeth Hendrickson, who reminded herself of the wisdom of "try; don't think" when creating her latest product:

... empirical evidence trumps speculation. Every. Single. Time.

The scientific method has been teaching us the value of empiricism over pure thought for a long time. In the digital world, the value is even more pronounced.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

April 21, 2011 8:10 PM

Agile Approaches and the Small

Seth Godin recently blogged on the economies of small:

I think we embraced scale as a goal when the economies of that scale were so obvious that we didn't even need to mention them. Now that it's so much easier to produce a product in the small and market a product in the small, and now that it's so beneficial to offer a service to just a few, with focus and attention, perhaps we need to rethink the very goal of scale.

Agile approaches to software development exploit the economies of small:

  • small promises
  • short cycles
  • small teams
  • small steps
  • small changes
  • continuous integration

Traditional software engineering, based as it is on an engineering metaphor, seems invested in traditional economies of scale. That was a natural result of the metaphor, but also of the tools of the time. Back when hardware and software made it more difficult to write software rapidly over short iterations, scale in the traditional sense -- large -- helped to make processes and maybe even people more efficient.

Things have changed.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

April 19, 2011 6:04 PM

A New Blog on Patterns of Functional Programming

(... or, as my brother likes to say about re-runs, "Hey, it's new to me.")

I was excited this week to find, via my Twitter feed, a new blog on functional programming patterns by Jeremy Gibbons, especially an entry on recursion patterns. I've written about recursion patterns, too, though in a different context and for a different audience. Still, the two pieces are about a common phenomenon that occurs in functional programs.

I poked around the blog a bit and soon ran across articles such as Lenses are the Coalgebras for the Costate Comonad. I began to fear that the patterns on this blog would not be able to help the world come to functional programming in the way that the Gang of Four book helped the world come to object-oriented programming. As difficult as the GoF book was for every-day programmers to grok, it eventually taught them much about OO design and helped to make OO programming mainstream. Articles about coalgebras and the costate comonad are certainly of value, but I suspect they will be most valuable to an audience that is already savvy about functional programming. They aren't likely to reach every-day programmers in a deep way or help them learn The Functional Way.

But then I stumbled across an article that explains OO design patterns as higher-order datatype-generic programs. Gibbons didn't stop with the formalism. He writes:

Of course, I admit that "capturing the code parts of a pattern" is not the same as capturing the pattern itself. There is more to the pattern than just the code; the "prose, pictures, and prototypes" form an important part of the story, and are not captured in a HODGP representation of the pattern. So the HODGP isn't a replacement for the pattern.

This is one of the few times that I've seen an FP expert speak favorably about the idea that a design pattern is more than just the code that can be abstracted away via a macro or a type class. My hope rebounds!

There is work to be done in the space of design patterns of functional programming. I look forward to reading Gibbons's blog as he reports on his work in that space.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

April 14, 2011 10:20 PM

Al Aho, Teaching Compiler Construction, and Computational Thinking

Last year I blogged about Al Aho's talk at SIGCSE 2010. Today he gave a major annual address sponsored by the CS department at Iowa State University, one of our sister schools. When former student and current ISU lecturer Chris Johnson encouraged me to attend, I decided to drive over for the day to hear the lecture and to visit with Chris.

Aho delivered a lecture substantially the same as his SIGCSE talk. One major difference was that he repackaged it in the context of computational thinking. First, he defined computational thinking as the thought processes involved in formulating problems so that their solutions can be expressed as algorithms and computational steps. Then he suggested that designing and implementing a programming language is a good way to learn computational thinking.

With the talk so similar to the one I heard last year, I listened most closely for additions and changes. Here are some of the points that stood out for me this time around, including some repeated points:

  • One of the key elements for students when designing a domain-specific language is to exploit domain regularities in a way that delivers expressiveness and performance.
  • Aho estimates that humans today rely on somewhere between 0.5 and 1.0 trillion lines of software. If we assume that the total cost associated with producing each line is $100, then we are talking about a most serious investment. I'm not sure where he found the $100/LOC number, but...
  • Awk contains a fast, efficient regular expression matcher. He showed a figure from the widely read Regular Expression Matching Can Be Simple And Fast, with a curve showing Awk's performance -- quite close to the Thompson NFA curve from the paper. Algorithms and theory do matter.
  • It is so easy to generate compiler front ends these days using good tools in nearly every implementation language. This frees up time in his course for language design and documentation. This is a choice I struggle with every time I teach compilers. Our students don't have as strong a theory background as Aho's do when they take the course, and I think they benefit from rolling their own lexers and parsers by hand. But I'm tempted by what we could do with the extra time, including processing a more compelling source language and better coverage of optimization and code generation.
  • An automated build system and a complete regression test suite are essential tools for compiler teams. As Aho emphasized in both talks, building a compiler is a serious exercise in software engineering. I still think it's one of the best SE exercises that undergrads can do.
  • The language for quantum computing looks cool, but I still don't understand it.

After the talk, someone asked Aho why he thought functional programming languages were becoming so popular. Aho's answer revealed that he, like any other person, has biases that cloud his views. Rather than answering the question, he talked about why most people don't use functional languages. Some brains are wired to understand FP, but most of us are wired for, and so prefer, imperative languages. I got the impression that he isn't a fan of FP and that he's glad to see it lose out in the social Darwinian competition among languages.

If you'd like to see an answer to the question that was asked, you might start with Guy Steele's StrangeLoop 2010 talk. Soon after that talk, I speculated that documenting functional design patterns would help ease FP into the mainstream.

I'm glad I took most of my day for this visit. The ISU CS department and chair Dr. Carl Chang graciously invited me to attend a dinner this evening in honor of Dr. Aho and the department's external advisory board. This gave me a chance to meet many ISU CS profs and to talk shop with a different group of colleagues. A nice treat.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

April 12, 2011 7:55 PM

Commas, Refactoring, and Learning to Program

The most important topic in my Intelligent Systems class today was the comma. Over the last week or so, I had been grading their essays on communicating the structure and intent of programs. I was not all that surprised to find that their thoughts on communicating the structure and intent of programs were not always reflected in their essays. Writing well takes practice, and these essays are for practice. But the thing that stood out most glaringly from most of the papers was the overuse, misuse, and occasional underuse of the comma. So after I gave a short lecture on case-based reasoning, we talked about commas. Fun was had by all, I think.

On a more general note, I closed our conversation with a suggestion that perhaps they could draw on lessons they learn writing, documenting, and explaining programs to help them write prose. Take small steps when writing new content, not worrying as much about form as about the idea. Then refactor: spend time reworking the prose, rewriting, condensing, and clarifying. In this phase, we can focus on how well our text communicates the ideas it contains. And, yes, good structure can help, whether at the level of sentences, paragraphs, or the whole essay.

I enjoyed the coincidence of later reading this passage in Roy Behrens's blog, The Poetry of Sight:

Fine advice from poet Richard Hugo in The Triggering Town: Lectures and Essays on Poetry and Writing (New York: W.W. Norton, 1979)--

Lucky accidents seldom happen to writers who don't work. You will find that you may rewrite and rewrite a poem and it never seems quite right. Then a much better poem may come rather fast and you wonder why you bothered with all that work on the earlier poem. Actually, the hard work you do on one poem is put in on all poems. The hard work on the first poem is responsible for the sudden ease of the second. If you just sit around waiting for the easy ones, nothing will come. Get to work.

This is an important lesson for programmers, especially relative beginners, to learn. The hard work you do on one program is put in on all programs. Get to work. Write code. Refactor. Writing teaches writing.

~~~~

Long-time readers of this blog may recall that I once recommended The Triggering Town in an entry called Reading to Write. It is still one of my favorites -- and due for another reading soon!


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development

April 10, 2011 9:23 PM

John McPhee on Writing, Teaching, and Programming

John McPhee is one of my favorite non-fiction writers. He is a long-form journalist who combines equal measures of detailed fact gathering and a literary style that I enjoy as a reader and aspire to as a writer. For years, I have used selections from The John McPhee Reader in advice to students on how to gather requirements for software, including knowledge acquisition for AI systems.

This weekend I enjoyed Peter Hessler's interview of McPhee in The Paris Review, John McPhee, The Art of Nonfiction No. 3. I have been thinking about several bits of McPhee's wisdom in the context of both writing and programming, which is itself a form of writing. I also connected with a couple of his remarks about teaching young writers -- and programmers.

One theme that runs through the interview serves as a universal truth connecting writing and programming:

Writing teaches writing.

In order to write or to program, one must first learn the basics, low-level skills such as grammar, syntax, and vocabulary. Both writers and programmers typically go on to learn higher-level skills that deal with the structure of larger works and the patterns that help creators create and readers understand. In the programming world, we call these "design" skills, though I imagine that's too much an engineering term to appeal to writers.

Once you have these skills under your belt, there isn't much more to teach, but there is plenty to learn. We help newbies learn by sharing what we create, by reading and critiquing each other's work, and by talking about our craft. But doing it -- writing, whether it's stories, non-fiction, or computer programs -- that's the thing.

McPhee learned this in many ways, not the least of which was one of the responses he received to his first novel, which he wrote in lieu of a dissertation (much to the consternation of many Princeton English professors!). McPhee said

It had a really good structure and was technically fine. But it had no life in it at all. One person wrote a note on it that said, You demonstrated you know how to saddle a horse. Now go find the horse.

He still had a lot to learn. This is a challenge for many young programmers whom I teach. As they learn the skills they need to become competent programmers, even excellent ones, they begin to realize they also need a purpose. At a miniconference on campus last week, a successful former student encouraged today's students to find and nurture their own passions. In those passions they will also find the energy and desire to write, write, write, which is the only way he knew of to master the craft of programming.

Finding passion is hard, especially for students who come through an educational system that sometimes seems more focused on checking off boxes than on growing a person.

Luckily, though, finding problems to work on (or stories to write) can be much less difficult. It requires only that we are observant, that we open our eyes and pay attention. As McPhee says:

There are zillions of ideas out there--they stream by like neutrons.

For McPhee, most of the ideas he was willing to write about, spending as much as three years researching and writing, relate to things he did when he was a kid. That's not too far from the advice we give young software developers: write the programs you need or want to use. It's okay to start with what you like and know even if no one else wants those things. First of all, maybe they do. And second, even if they really don't, those are the problems on which you will be willing to work. Programming teaches programming.

Keep in mind: finding ideas isn't enough. You have to do the work. In the end, that is the measure of a writer as well as the measure of a programmer.

If you have already found your passion, then finding cool things to do gets even easier. Passion and obsession seem to heighten our senses, making it easier to spot potential new ideas and solutions. I just saw a great example of this in the movie The Social Network, when an exhausted Mark Zuckerberg found the insight for adding Relationship Status to Facebook from a friend's plaintive request for help finding out whether a girl in his Art History class was available.

So, you have an idea. How long does it take to write?

... It takes as long as it takes. A great line, and it's so true of writing. It takes as long as it takes.

Despite what we learn in school, this is true of most things. They take however long they take. This was a hard lesson for me to learn. I was a pretty good student in school, and I learned early on how to prosper in the rhythm of the quarter, semester, and school year. Doing research in grad school helped me to see that real problems are much messier, much less predictable than the previous sixteen years of school had led me to believe.

As a CS major, though, I began to learn this lesson in my last year as an undergrad, writing the program at the core of my senior project. It takes as long as it takes, whatever the university's semester calendar says. Get to work.

As a teacher, I found most touching an answer McPhee gave when asked why he still teaches writing courses at Princeton. He is well past the usual retirement age and might be expected to slow down, or at least spend all of his time on his own writing. Every teacher who reads the answer will feel its truth:

But above all, interacting with my students--it's a tonic thing. Now I'm in my seventies and these kids really keep me alive. To talk to a nineteen-year-old who's really a good writer, and he's sitting in here interested in talking to me about the subject--that's a marvelous thing, and that's why I don't want to stop.

As I read this, my mind began to recount so many students who have changed me as a programmer and teacher. The programmer, artist, and musician who wanted to write a program that could express artistic style, who is now a filmmaker inflamed with understanding man and his relationship to the world. The high school kid with big ideas about fonts, AI, and design whose undergrad research qualified for a national ACM competition and who is now a research scientist at Apple. The brash PR student who wanted to become a programmer and did, writing a computer science thesis and an even more important communications studies thesis, who is now set on changing how we study and understand human communication in the age of the web. The precocious CS student whose ideas were bigger than my courses before he set foot in my classroom, who worked hard learning things beyond what we were teaching and eventually doubling back to learn what he had missed, an entrepreneur with a successful tech start-up who is now helping a new generation of students learn and dream.

The list could go on. Teaching keeps us alive. Students learn, we hope, and so do we. They keep us in the present, where the excitement of new ideas is fresh. And, as McPhee admits with no sense of shame or embarrassment, it is flattering, too.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

March 29, 2011 8:13 PM

Global Variables Considered

Last week, a student stopped in to ask a question. He had written a program for one of his courses of which he was especially proud. It consisted in large part of two local functions, and used recursion in a way that created an elegant, clear solution.

Yet his professor dinged his grade severely. The student had used a global variable.

That's when the question for me arrived.

But why are we taught not to use global variables?

First, let me say that this a strong student. He is not the sort to beg for points, and he wasn't asking me this question as a snark or a complaint. He really wanted to know the answer.

My first response was cynical and at least partly tongue-in-cheek. We teach you not to use global variables because we were taught not to use global variables.

My second response was to point out that "global" is a relative term, not an absolute one. In OO languages, we write classes that contain instance variables and methods that operate on them. The instance variables are global to the class's methods and hidden from the class's clients. The programming world seems to like such "globals" just fine.

That is the beginning of my trouble trying to create an argument that supports the way in which his program was graded. In his program, written in a traditional procedural language, the offending variable was local to one procedure but global to two nested procedures. That sounds awfully similar to an ordinary Java class's instance variables!
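
To see the analogy, consider this hypothetical reconstruction of that shape in Java -- not the student's actual code -- with one variable shared by a driver method and a recursive helper:

    import java.util.List;

    // Stand-ins for the student's problem domain.
    interface Edge { Node target(); int weight(); }
    interface Node { boolean isGoal(); List<Edge> edges(); }

    // Assumes an acyclic graph with non-negative edge weights.
    class PathFinder {
        private int bestSoFar = Integer.MAX_VALUE;   // shared by both methods

        int shortestPathFrom(Node start) {
            explore(start, 0);
            return bestSoFar;
        }

        private void explore(Node node, int cost) {
            if (cost >= bestSoFar) return;           // reads the shared variable
            if (node.isGoal()) {
                bestSoFar = cost;                    // writes the shared variable
                return;
            }
            for (Edge e : node.edges())
                explore(e.target(), cost + e.weight());
        }
    }

Inside the class, bestSoFar is every bit as global as the student's offending variable; outside the class, it is invisible.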

On the extreme end of the global/local continuum we have a language like Cobol. All data is declared at the top of a program in an elaborate Data Division, and the "paragraphs" of the Procedure Division refer back to it. Not many computer scientists spend much time defending Cobol, but its design and organization make perfectly good sense in context, and programmers are able to write understandable, large programs.

As the student and I talked, I explained two primary reasons for the historical bias against globals:

Readability. When a variable lives outside the code that manipulates it, there is a chance that it can become separated in space from that code. As a large program evolves over time, it seems that the chance the variable will become separated from the related code approaches 1. That makes the code hard to understand. When the reader encounters a variable, she may have a hard time knowing what it means without seeing the code that uses it. When she encounters a procedure with a reference to a faraway variable, she may have a hard time knowing what the code does without easy reference to the variable and any other code that uses it.

This force is counteracted effectively in some circumstances. In OOP, we try not to write classes that are too long, which means that the instance vars and the methods will be relatively close to one another in the file or on the printed page. Furthermore, there is a convention that the vars will be declared at the top or bottom of the class, so the reader can always find them easily enough. That's part of what makes Cobol's layout work: readers study the Data Division first and then read the Procedure Division with an eye to the top of the file.

My student's program had a structure that mirrored a small class: a procedure with a variable and two local procedures of reasonable size. I can imagine endorsing the relatively global variable because it was part of an understandable, elegant program.

Not-Obvious Dependencies. When two or more procedures operate on the same variable that lives outside all of them, there is a risk that the lack of readability rises to something worse: an inability to divine how the program works. The two procedures exert influence over each other's behavior through the values stored in the shared variable. In OO programs, this interaction is an expected part of how objects behave, and we try to keep methods and the class as a whole small enough to counteract the problem.

In the general case, though, we can end up with variables and procedures scattered throughout a program and interacting in non-obvious ways. A change to one procedure might affect another. Adding another procedure that refers to or changes the variable complicates matters for all existing procedures in the co-dependent relationship. Hidden dependencies are the worst kind of all.
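
A tiny hypothetical example makes the danger vivid:

    // Two procedures coupled through a shared global. Each looks fine
    // in isolation, but one silently changes the other's behavior.
    class Report {
        static String separator = ", ";       // the shared variable

        static String joinNames(String[] names) {
            return String.join(separator, names);
        }

        static void switchToTabularFormat() {
            separator = "\t";                 // joinNames() now acts differently
        }
    }

A reader of joinNames() cannot know what it will do without tracking down every other reference to separator, including references added long after joinNames() was written.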

This is what really makes global variables bad for us. Unless we can counteract this force effectively, we really don't want to use them.

These are two simple technical reasons that programmers prefer not to use global variables. CS professors tend to simplify them into the dictum, "no global variables allowed", and make it a hard and fast rule for beginners. Unfortunately, sometimes we forget to take the blinders off after our students -- or we ourselves! -- become more accomplished programmers. The dictum becomes dogma, and a substitute for professional judgment.

I have what I regard as a healthy attitude about global variables. But I admitted to my student that I have my own weakness in the dictum-turned-dogma arena. When I teach OOP to beginners, we follow the rule All instance variables are private. I'm a reasonable guy and so am willing to talk to students who want to violate the rule, but it's pretty much non-negotiable. I've never had a first- or second-year OOP student convince me that one of his IVs should be protected or -- heaven forbid! -- public. But even in my own code, after a couple of decades of doing OOP, I rarely violate this rule. I say "rarely" only in the interest of being conservative in my assessment. I can't remember the last time I wrote a class with a public instance variable.

Not all teachers are good at giving up their dogma. Some professors don't even realize that what they believe is best thought of as an oversimplification for the purposes of helping novices develop good habits of thought.

Ironically, last semester I ran across the paper Global Variable Considered Harmful, by Bill Wulf and Mary Shaw. (If you can't get through the ACM paywall, you can also find the paper here.) This is, I think, the first published attempt to explain why the global variable is a bad idea. Read it -- it's a nice treatment of the issues as they existed back in 1973. Forty years later, I am comfortable using variables that are relatively global to one or more procedures under controlled conditions. At this point in my programming career, I am willing to use my professional judgment to help me make good programs, not just programs that follow the rules.

I shared the Wulf and Shaw paper with my student. I hope he got a kick out of it, and I hope he used it to inform his already reliable professional judgment. The paper might even launch him ahead of the profs who teach the prohibition on global variables as if it were revealed truth.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

March 25, 2011 4:40 PM

Another Conference Changes Its Name

Ralph Johnson reports that "the conference on aspect oriented software development is renaming itself to 'Modularity'", much as OOPSLA has become SPLASH.

For the last couple of decades, computer science research has been focusing on more and more specific domains. The area of artificial intelligence, for example, soon spawned journals and conferences devoted specifically to sub-areas such as machine learning, expert systems, computer vision, and many others. It's interesting for me to see conferences such as AOSD and OOPSLA going the other direction, moving from the technology that spawned the conference in the first place to the more general idea or goal that motivates its community.

Of course, these conferences aren't purely academic; they have always had a strong alliance between industry and academia. Perhaps that is one of the reasons they are willing to rebrand themselves. Certainly, the changing economic model that drives this sort of conference is playing a big role.

By the way, Ralph's entry isn't really about the changing conference name. He merely uses that as a launching point for something more interesting: a first cut at cataloging different types of modularity. That is the best reason to read it!


Posted by Eugene Wallingford | Permalink | Categories: Software Development

March 23, 2011 8:13 PM

SPLASH 2011 and the Educators' Symposium

I have been meaning to write about SPLASH 2011 and especially the Educators' Symposium for months, and now I find that Mark Guzdial has beaten me to the punch -- with my own words, no less! Thanks to Mark for spreading the news. Go ahead and read his post if you'd like to see the message I sent to the SIGCSE membership calling for their submissions. Or visit the call for participation straightaway and see what the program committee has in mind. Proposals are due on April 8, only a few weeks hence. Dream big -- we are.

For now, though, I will write the entry I've been intending all these months:

The Next Ten Years of Software Education

SPLASH 2011 in Portland, Oregon

By the early 2000s, I had become an annual attendee of OOPSLA and had served on a few Educators' Symposium program committees. Out of the blue, John Vlissides asked me to chair the 2004 symposium. I was honored and excited. I eventually got all crazy and cold called Alan Kay and asked him to deliver our keynote address. He inspired us with a vision and ambitious charge, which we haven't been able to live up to yet.

When I was asked to chair again in 2005, we asked Ward Cunningham to deliver our keynote address. He inspired us with his suggestions for nurturing simple ideas and practices. It was a very good talk. The symposium as a whole, though, was less successful at shaking things up than in 2004. That was likely my fault.

I have been less involved in the Educators' Symposium since 2006 or 2007, and even less involved in OOPSLA more broadly. Being department head keeps me busy. I have missed the conference.

Fast-forward to 2010. OOPSLA has become SPLASH, or perhaps more accurately been moved under the umbrella of SPLASH. This is something that we had talked about for years. 2011 conference chair Crista Lopes was looking for an Educators' Symposium chair and asked me for any names I might suggest. I admitted to her that I would love to get involved again, and she asked me to chair. I'm back!

OOPSLA was OO, or at least that's what its name said. It had always been about more, but the name brand was of little value in a corporate world in which OOP is mainstream and perhaps even passe. Teaching OOP in the university and in industry has changed a lot over the last ten years, too. Some think it's a solved problem. I think that's not true at all, but certainly many people have stopped thinking very hard about it.

In any case, conference organizers have taken the plunge. SPLASH != OOPSLA and is now explicitly not just about OO. The new conference acknowledges itself to be about programming more generally. That makes the Educators' Symposium something new, too, something more general. This creates new opportunities for the program committee, and new challenges.

We have decided to build the symposium around a theme of "The Next Ten Years". What ideas, problems, and technologies should university educators and industry trainers be thinking about? The list of possibilities is long and daunting: big data, concurrency, functional programming, software at Internet scale... and even our original focus, object-oriented programming. Our goal for the end of the symposium is to be able to write a report outlining a vision for software development education over the next ten years. I don't expect that we will have many answers, if any, but I do expect that we can at least begin to ask the right questions.

And now here's your chance to help us chart a course into the future, whether you plan to submit a paper or proposal to the symposium:

Who would be a killer keynote speaker?

What person could inspire us with a vision for computer science and software, or could ask us the questions we need to be asking ourselves?

Finding the right keynote speaker is one of the big questions I'm thinking about these days. Do you have any ideas? Let me know.

(And yes, I realize that Alan Kay may well still be one of the right answers!)

In closing, let me say that whenever I say "we" above, I am not speaking royally. I mean the symposium committee that has graciously offered their time and energy to designing and implementing this challenge: Curt Clifton, Danny Dig, Joe Bergin, Owen Astrachan, and Rick Mercer. There are also a handful of people who have been helping informally. I welcome you to join us.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

March 04, 2011 5:25 PM

The Growing Buzz around Empirical Analysis of Repositories

This has turned into a recurring theme, due to a hopeful trend out in industry.

Last semester, I wrote a bit about studying program repositories as a way to understand how programmers work. Then last month, I wrote about simple empirical analysis of code, referring to Michael Feathers's article on how we can learn a lot about our program's design by looking at our commit log. Feathers went on to write a short note about getting empirical about refactoring, in which he expanded on the idea of looking at our code to understand its design better.

Now we have Turbulence, a package for pulling useful metrics about our code out of a git repository. The package began its life when Feathers and Corey Haines wrote a script to plot code churn versus its complexity. Haines has written a bit about the Turbulence project.

It doesn't end there. Developers are using Turbulence and adding to its code base. Feathers has called for a renewed focus on design in the wild using the data we have at our fingertips. The physicians have begun to heal themselves, and they are leading the way for the rest of us.

One nice side effect of this trend is making available to a wider audience some of the academic research that has been done in this vein, such as Nagappan and Ball's paper on code churn and defect density. (I had the pleasure of meeting Ball when we served on a panel at OOPSLA several years ago.)

As many people are saying, we swim in data. We just have to find ways to use it well. I remain ever amazed at what our tools enable us to do.

All this talk about git has me resolved to go all the way and make a full switch to it. I've dabbled with git a bit and consumed a lot of software off GitHub, but now it's time to do all my development in it. Fortunately, there are a few excellent resources to help me, including the often-lauded Git Immersion guided tour by Jim Weirich and crew, and Scott Chacon's visually engaging Getting Git slidedeck. My trip to SIGCSE and the spring break that follows immediately after can't come too soon!


Posted by Eugene Wallingford | Permalink | Categories: Software Development

February 22, 2011 4:06 PM

Nothing From Scratch

Over the weekend, I enjoyed re-reading Brian Foote's The Craftsmen vs. the Scavengers, which is subtitled "The Ruminations of a Foot Soldier on the Reuse Revolution". Foote expressed an idea that has been visited and re-visited every so often in recent years:

In a world dominated by code reuse, all programming will, in one sense, be maintenance programming.

Foote was a student of Ralph Johnson, who has written and spoken occasionally about the idea that software development is program transformation. I blogged about Ralph's idea and what it meant for me nearly five years ago, just before teaching an intro CS course that emphasized the modification and extension of existing code.

Some people worry that if we don't start students off with writing their own code from scratch they won't really learn to program. Most of the students in that CS1 course have turned out to be pretty good programmers; that's just anecdote, of course, and not evidence that the approach is right or wrong. But at least I don't seem to have done them irreparable harm.

This idea is comfortable to me as an old Smalltalk programmer. As Foote elaborates, the Smalltalk toolset supports this style of programming and, more importantly, the Smalltalk culture encouraged code reuse, sharing, and a sense of collective code ownership. We all felt we were in the same boat -- the same image -- together.

The commingling of Foote's assertion and my recollection of that CS1 course caused my mind to wander down another path. What about those times when we do start with a blank slate on a new project? If we approach the task as programming from scratch, we might well design a complete solution and try to implement it as a single piece. When I do maintenance programming, I don't usually think that way, even when I have a major change to make to the program. I'm usually too scared to change too many things at once! Instead, I make a series of small changes to the code, coaxing and pruning the system toward the goal state. This is, I think, just what XP and TDD encourage us to do even when we code on a blank slate. It's an effective way to think about writing new programs.

The very next section of Foote's paper also caused echoes in my mind. It suggests that, in a culture of reuse, bad design might drive out good design simply by being there first and attracting an audience. Good design takes a while to evolve, and by the time it matures a mediocre library or framework might already control the niche. This may have happened in some parts of the Smalltalk world, but I was lucky not to encounter it very often. Foote's idea comes across as another form of Gresham's Law for software design, so long as we are willing to mangle the original sense of that term. The effect is similar: the world ends up trafficking in a substandard currency.

It sobers me to note that Foote wrote this paper in the summer of 1989. These ideas aren't new and have been on the software world's mind since at least the shift to OOP began twenty-five years ago or so. There truly is nothing new for me to say. As a community, though, we revisit these ideas from different vantage points as we evolve. Perhaps we can come away with new understanding as a result.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

February 10, 2011 4:04 PM

This and That: Problems, Data, and Programs

Several articles caught my eye this week which are worth commenting on, but at this point none has triggered a full entry of its own. Some of my favorite bloggers do what they call "tab sweeps", but I don't store cool articles in browser tabs. I cache URLs and short notes to myself. So I'll sweep up three of my notes as a single entry, related to programming.

Programmer as Craftsman

Seth Godin writes about:

... the craftsperson, someone who takes real care and produces work for the ages. Everyone else might be a hack, or a factory guy or a suit or a drone, but a craftsperson was someone we could respect.

There's a lot of talk in the software development world these days about craftsmanship. All the conversation and all the hand-waving boil down to this. A craftsman is the programmer we all respect and the programmer we all want to be.

Real Problems...

Dan Meyer is an erstwhile K-12 math teacher who rails against the phony problems we give kids when we ask them to learn math. Textbooks do so in the name of "context". Meyer calls it "pseudocontext". He gives an example in his entry Connect These Two Dots, and then explains concisely what is wrong with pseudocontext:

Pseudocontext sends two signals to our students, both false:
  • Math is only interesting in its applications to the world, and
  • By the way, we don't have any of those.

Are we really surprised that students aren't motivated to practice and develop their craft on such nonsense? Then we do the same things to CS students in our programming courses...

... Are Everywhere These Days

Finally, Greg Wilson summarizes what he thinks "computational science" means in one of his Software Carpentry lessons. It mostly comes down to data and how we understand it:

It's all just data.

Data doesn't mean anything on its own -- it has to be interpreted.

Programming is about creating and composing abstractions.

...

The tool shapes the hand.

We drown in data now. We collect it faster than we can understand it. There is room for more programmers, better programmers, across the disciplines and in CS.

We certainly shouldn't be making our students write Fahrenheit-to-Celsius converters or process phony data files.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

February 07, 2011 9:03 PM

Teaching and Learning in a Code Base

In a pair of tweets today, Brian Marick offered an interesting idea for designing instruction for programmers:

A useful educational service: examine a person's codebase. Devise a new feature request that would be hard, given existing code and skill...

... Keep repeating as the codebase and skill improve. Would accelerate a programmer's skill at dealing with normal unexpected change.

This could also be a great way to help each programmer develop competencies that are missing from his or her skill set. I like how this technique would create an individualized learning path for each student. The cost, of course, is in the work needed by the instructor to study the codebases and devise the feature requests. With a common set of problems to work on, over time an instructor might be able to develop a checklist of (codebase characteristic, feature request) pairs that covered a lot of the instructional space. This idea definitely deserves some more thought!

Of course, we can sometimes analyze valuable features of a codebase with relatively simple programs. Last month, Michael Feathers blogged about measuring the closure of code, in which he showed how we can examine the Open/Closed Principle in a codebase by extracting and plotting the per-file commit frequencies of source files in a project's version control repository. Feathers discussed how developers could use this information intentionally to improve the quality of their code. I think this sort of analysis could be used to great effect in the classroom. Students could see the OCP graphically for a number of projects and, combined with their programming knowledge of the projects, begin to appreciate what the OCP means to a programmer.
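
The extraction itself is simple enough to sketch. Here is a minimal Java program, assuming git is on the PATH and that it runs in the root of a repository; it prints each file's commit count, ready to be plotted:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.util.HashMap;
    import java.util.Map;

    public class CommitFrequencies {
        public static void main(String[] args) throws Exception {
            // "git log --name-only --pretty=format:" lists the files
            // touched by each commit, one per line.
            Process git = new ProcessBuilder("git", "log", "--name-only",
                                             "--pretty=format:").start();
            Map<String, Integer> counts = new HashMap<>();
            try (BufferedReader in = new BufferedReader(
                     new InputStreamReader(git.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null)
                    if (!line.isBlank())
                        counts.merge(line, 1, Integer::sum);
            }
            // Most frequently changed files first: candidates for a
            // conversation about the Open/Closed Principle.
            counts.entrySet().stream()
                  .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                  .forEach(e -> System.out.println(e.getValue() + "\t" + e.getKey()));
        }
    }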

A serendipitous side effect would be for students to experience CS as an empirical discipline. This would help us prepare both developers who, like Feathers, use analytical data in their practice, and CS grads who understand the ways in which CS can and should be an empirical endeavor.

I actually blogged a bit about studying program repositories last semester, for the purpose of understanding how to design better programming languages. That work used program repositories for research purposes. What I like about Marick's and Feathers's recent ideas is that they bring to mind how studying a program repository can aid instruction, too. This didn't occur to me so much back when one of my grad students studied relationships among open-source software packages with automated analysis of a large codebase. I'm glad to have received a push in that direction now.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

February 03, 2011 3:30 PM

Science and Engineering in CS

A long discussion on the SIGCSE members listserv about math requirements for CS degrees has drifted, as most curricular discussions seem to do, to "What is computer science?" Somewhere along the way, someone said, "Computer Science *is* a science, by name, and should therefore be one by definition". Brian Harvey responded:

The first thing I tell my intro CS students is "Computer Science isn't a science, and it isn't about computers." (It should be called "information engineering.")

I think that this assertion is wrong, at least without a couple of "only"s thrown in, but it is a great way to start a conversation with students.

I've been seeing the dichotomy between CS as science and CS as system-building again this semester in my Intelligent Systems course. The textbook my students used in their AI course last semester is, like nearly every undergrad AI text, primarily an introduction to the science of AI: a taxonomy of concepts, results of research that help to define and delimit the important ideas. It contains essentially no pragmatic results for building intelligent systems. Sure, students learn about state-space search, logic as a knowledge representation, planning, and learning, along with algorithms for the basic methods of the field. But they are not prepared for the fact that, when they try to implement search or logical inference for a given problem, they still have a huge amount of work to do, with little guidance from the text.

In class today, we discussed this gap in two contexts: the gap one sees between low-level programming and high-level programming languages, and the difference between general-purpose languages and domain-specific languages.

My students seemed to understand my point of view, but I am not sure they really grok it. That happens best after they gain experience writing code and feel the gap while making real systems run. This is one of the reasons I'm such a believer in projects, real problems, and writing code. We don't always understand ideas until we see them in concrete form.

I don't imagine that intro CS students have any of the experience they need to understand the subtleties academics debate about what computer science is or what computer scientists do. We are almost surely better off asking them to do something that matters to them, whether a small problem or a larger project. In these problems and projects, students can learn from us and from their own work what CS is and how computer scientists think.

Eventually, I hope that the students writing large-ish AI programs in my course this semester learn just how much more there is to writing an intelligent system than just implementing a general-purpose algorithm from their text. The teams that are using pre-existing packages as part of their system might even learn that integrating software systems is "more like performing a heart transplant than snapping together LEGO blocks". (Thanks to John Cook for that analogy.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

January 27, 2011 4:48 PM

Turning Up the Knob on Functional OOP and Imperative OOP

One of my students has been learning Haskell as a prelude to exploring purely functional data structures. Recently, he wrote a short blog entry describing some of the ideas he has found most exciting. It ended with a couple of code snippets showing the elegance and brevity of list comprehensions compared to what he was used to in imperative languages. The student apologized for his imperative example, because he wrote it in Java using objects. In his mind, that made it object-oriented, not imperative.

This is a common misconception. Most OOP is imperative. Objects have state that changes.

Of course, one can write object-oriented code in a functional style by emphasizing return values and by creating new objects instead of changing existing objects. Certain kinds of objects, such as money, should probably be implemented as value objects without modifiable state. But most OO practice and intent is stateful, hence imperative.
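To make the contrast concrete, here is a money object written both ways, a quick sketch in Python rather than anyone's production code:

    from dataclasses import dataclass

    # Imperative OO style: the object's state changes in place.
    class MutableMoney:
        def __init__(self, amount):
            self.amount = amount

        def add(self, other):
            self.amount += other.amount    # callers see the mutation

    # Functional OO style: a value object. Operations return new
    # objects and never modify an existing one.
    @dataclass(frozen=True)
    class Money:
        amount: int

        def add(self, other):
            return Money(self.amount + other.amount)    # no side effects

With the frozen version, two references to the same Money object can never surprise each other, which is exactly why money makes a good value object.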

I've read many a blog entry over the last few years in which OO gurus extol functional OO as a way to write better code. I think we can overdo it, though. Whenever I take this idea too far, I soon find myself contorting the code in a way that seems to serve the idea but not the program I am writing. Still, sometimes it can be fun to turn the knob on the functional OO dial up to 10 and try to write purely functional OO code, with no side effects of any sort. This kind of programming challenge has always appealed to me. It can teach you a lot about the strengths and limits of the tools you use.

It occurs to me that one way to enforce the rules of the functional OO challenge would be to turn off the imperative features in my language. That can be tough to do in a language with libraries full of stateful objects. But simply turning off the assignment operator in a language such as Java would make many of us struggle to write even simple programs.

Actually, I had the idea of turning off assignment statements late in a long conversation I had with myself while thinking about my student's comment and my response to him. If most OO is imperative, I wonder what it would be like to write "purely imperative" OO code. This would mean creating objects that never returned a value in response to a message. In a sense, these objects would be pure state and action, at least from the perspective of other objects in the system.

At first, this idea seemed absurd. What value could come from it?

This stylistic challenge is quite easy to enforce, either in practice or in tools: simply require all methods to be void. Voilà! No return statements are allowed. No values can be passed from one object to another in response to a message. An object would affect the state of the program either by modifying its own state or by sending a state-changing message to another object, perhaps one it received as an argument along with a message.

Talk about Tell, Don't Ask! In this style of programming, I can only tell objects to do things. I can't ask for any data in return.

So, perhaps some value could come from this little challenge after all. I would have to take Tell, Don't Ask -- and encapsulation -- seriously. Programming in this way can help us see just how much we can accomplish with truly independent objects, providers of services who encapsulate their state and take full responsibility for its management. I think that, in many respects, this idea is faithful to the original idea of objects and OOP -- perhaps more faithful than our current incarnation of them in languages with functions.
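As a toy illustration of the challenge, consider this sketch, mine and in Python, with every name hypothetical. All methods return None, so a counter cannot be asked for its count; it can only be told to report the count to some listener:

    class Display:
        def show(self, value):
            print(f"count = {value}")    # a side effect, not a return value

    class Counter:
        def __init__(self):
            self._count = 0              # encapsulated state, never exposed

        def increment(self):
            self._count += 1             # modify own state; return nothing

        def report_to(self, listener):
            listener.show(self._count)   # tell, don't ask

    counter = Counter()
    counter.increment()
    counter.increment()
    counter.report_to(Display())         # prints: count = 2

Data still flows between objects, but only as arguments riding along with messages, never as answers to queries.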

I think that this could also help us in another way. Functional programming offers us one path to increased parallelism by eliminating state changes and thus making each computation independent of global context. Purely imperative programming offers another path, one that fits the early OO vision of encapsulated agents interacting via message passing. This is similar to the actor model that we see these days in languages such as Scala and Erlang. Of course, this model goes back to the work of Carl Hewitt, which inspired the evolution of both Scheme and Smalltalk!

I have not thought through all the implications of my thought experiment yet. Maybe it's nonsense; maybe it's a solved problem. Still, I think it might be fun to turn the dial up to 10 on stateful programming and try to implement a non-trivial program with no return statements. How far could I go before things got uncomfortable? How far could I go before I found myself contorting the code in a way that serves the idea more than the program I was writing?

Sometimes I am surprised just how many interesting thoughts can fall out of the simplest conversations. Too bad time to play with them doesn't fall out, too.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

December 20, 2010 3:32 PM

From Occasionally Great to Consistently Good

Steve Martin's memoir, "Born Standing Up", tells the story of Martin's career as a stand-up comedian, from working shops at Disneyland to being the biggest-selling concert comic ever at his peak. I like hearing people who have achieved some level of success talk about the process.

This was my favorite passage in the book:

The consistent work enhanced my act. I learned a lesson: It was easy to be great. Every entertainer has a night when everything is clicking. These nights are accidental and statistical: Like the lucky cards in poker, you can count on them occurring over time. What was hard was to be good, consistently good, night after night, no matter what the abominable circumstances.

"Accidental greatness" -- I love that phrase. We all like to talk about excellence and greatness, but Martin found that occasional greatness was inevitable -- a statistical certainty, even. If you play long enough, you are bound to win every now and then. Those wines are not achievement of performance so much as achievements of being there. It's like players and coaches in athletics who break records for the most X in their sport. "That just means I've been around a long time," they say.

The way to stick around a long time, as Martin was able to do, is to be consistently good. That's how Martin was able to be present when lightning struck and he became the hottest comic in the world for a few years. It's how guys like Don Sutton won 300+ games in the major leagues: by being good enough for a long time.

Notice the key ingredients Martin identified for becoming consistently good: consistent work; practice, practice, practice, and more practice; and continuous feedback from audiences into his material and his act.

We can't control the lightning strikes of unexpected, extended celebrity or even those nights when everything clicks and we achieve a fleeting moment of greatness. As good as those feel, they won't sustain us. Consistent work, reflective practice, and small, continuous improvements are things we can control. They are all things that any of us can do, whether we are comics, programmers, runners, or teachers.


Posted by Eugene Wallingford | Permalink | Categories: General, Running, Software Development, Teaching and Learning

December 11, 2010 11:55 AM

Don't Forget to Solve Your Problem a Second Time

The Toyota Production System encourages workers to solve every problem two ways. First, you fix the issue at hand. Second, you work back through the system to determine the systemic reason that the issue arose. In this way, you eliminate this as a problem in the future, or at least make it less likely.

As I wrote recently, I don't think of writing software as a production process. But I do think that software developers can benefit from the "solve it twice" mentality. When we encounter a bug in our program or a design problem in our system or a delivery problem on a project, we should address the specific issue at hand. Then we should consider how we might prevent this sort of problem from recurring. There are several ways that we might improve:

  • We may need better or different tools.
  • We may be able to streamline or augment our process.
  • We may need to think about different things while working.
  • We may need to know something more deeply, or something new.

This approach would benefit us as university students and university professors, too. If students and professors thought more often in terms of continuous improvement and committed to fixing problems the second time, too, we might all have lower mean times to competence.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

December 03, 2010 3:56 PM

A Commonplace for Novelists and Programmers

To be a moral human being is to pay, be obliged to pay, certain kinds of attention.

-- Susan Sontag, "The Novelist and Moral Reasoning"

For some reason, this struck me yesterday as important, not just as a human being, but also as a programmer. I am reminded that many of my friends and colleagues in the software world speak passionately of our moral obligations when writing code. The software patterns community, especially, harkens to Christopher Alexander's call regarding the moral responsibility of creators.

(If you want to read more from Sontag without tracking down her essay, check out this transcript excerpted from a speech she gave in 2004.)


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

December 01, 2010 3:45 PM

"I Just Need a Programmer"

As head of the Department of Computer Science at my university, I often receive e-mail and phone calls from people with The Next Great Idea. The phone calls can be quite entertaining! The caller is an eager entrepreneur, drunk on their idea to revolutionize the web, to replace Google, to top Facebook, or to change the face of business as we know it. Sometimes the caller is a person out in the community; other times the caller is a university student in our entrepreneurship program, often a business major. The young callers project an enthusiasm that is almost infectious. They want to change the world, and they want me to help them!

They just need a programmer.

Someone has to take their idea and turn it into PHP, SQL, HTML, CSS, Java, and Javascript. The entrepreneur knows just what he or she needs. Would I please find a CS major or two to join the project and do that?

Most of these projects never find CS students to work on them. There are lots of reasons. Students are busy with classes and life. Most CS students have jobs they like. Those jobs pay hard cash, if not a lot of it, which is more attractive to most students than the promise of uncertain wealth in the future. And the idea does not excite other people as much as it excites the entrepreneur, who created it and is on fire with its possibilities.

A few of the idea people who don't make connections with a CS student or other programmer contact me a second and third time, hoping to hear good news. The younger entrepreneurs can become disheartened. They seem to expect everyone to be as excited by their ideas as they are. (The optimism of youth!) I always hope they find someone to help them turn their ideas into reality. Doing that is exciting. It also can teach them a lot.

Of course, it never occurs to them that they themselves could learn how to program.

A while back, I tweeted something about receiving these calls. Andrei Savu responded with a pithy summary of the phenomenon I was seeing:

@wallingf it's sad that they see software developers as commodities. product = execution != original idea

As I wrote about at greater length in a recent entry, the value of a product comes from the combination of having an idea and executing it. Having the former or the ability to do the latter isn't worth much by itself. You have to put the two together.

Many "idea people" tend to think most or all of the value inheres to having the idea. Programmers are a commodity, pulled off the shelf to clean up the details. It's just a small matter of programming, right?

On the other side, some programmers tend to think that most or all of the value inheres in executing the idea. But you can't execute what you don't have. That's what makes it possible for me and my buddy to sit around over General Tso's chicken and commiserate about lost wealth. It's not really lost; we were never in its neighborhood. We were missing a vital ingredient. And there is no time machine or other mechanism for turning back the clock.

I still wish that some of the idea people had learned how to program, or were willing to learn, so that they could implement their ideas. Then they, too, could know the superhuman strength of watching ideas become tangible. Learning to program used to be an inevitable consequence of using computers. Sadly, that's no longer true. The inevitable consequence of using computers these days seems to be interacting with people we may or may not know well and watching videos.

Oh, and imagining that you have discovered The Next Great Thing, which will topple Google or Facebook. Occasionally, I have an urge to tell the entrepreneurs who call me that their ideas almost certainly won't change the world. But I don't, for at least two reasons. First, they didn't call to ask my opinion. Second, every once in a while a Microsoft or Google or Facebook comes along and does change the world. How am I to know which idea is that one in a gazillion that will? If my buddy and I could go back to 2000 and tell our younger and better-looking selves about Facebook, would those guys be foresightful enough to sit down and write it? I suspect not.

How can we know which idea is that one that will change the world? Write the program, work hard to turn it into what people need and want, and cross our fingers. Writing the program is the ingredient the idea people are missing. They are doing the right thing to seek it out. I wonder what it would be like if more people could implement their own ideas.


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development

November 29, 2010 2:34 PM

Cleaning Up Versus Not Getting Dirty

In his 2008 letter to Amazon shareholders, Jeff Bezos wrote this oft-quoted footnote:

At a fulfillment center recently, one of our Kaizen experts asked me, "I'm in favor of a clean fulfillment center, but why are you cleaning? Why don't you eliminate the source of dirt?" I felt like the Karate Kid.

When I first read this in a blog entry, I immediately thought of refactoring. I favor a style of programming in which "cleaning up" is a fundamental step: pick a small bit of new functionality, do the simplest thing I can to make it work, and then clean up the program's design and implementation. Should I instead eliminate the source of dirt, and think far enough ahead that the program is always clean?

It didn't take me long to realize that I'm neither smart enough nor well enough informed about most problems to do that. I will have to clean up every so often, no matter how far I think ahead. Besides, I find so much value in taking small steps and doing simple things that I am willing to clean up.

Why is that? Why am I willing to clean up, rather than keep things clean from the start? Why does refactoring work for software developers?

If things are too clean, you probably are not creating new things.

Kaizen notions are attractive to many in the "lean" world of software development, and they are important -- in context. Production and creation are different kinds of task. Keeping things clean and efficient has great value in production environments, including factories and perhaps in certain kinds of software development. But when you are making new things, there is great value in exploration, and exploration is messy.

Bezos wrote that footnote on this passage:

Everywhere we look (and we all look), we find what experienced Japanese manufacturers would call "muda" or waste. I find this incredibly energizing. I see it as potential -- years and years of variable and fixed productivity gains and more efficient, higher velocity, more flexible capital expenditures.

Amazon is a company that makes most of its profit by delivering product to customers more efficiently and less expensively than its competitors. If it can eliminate a source of muda, it becomes a better company. That's why the Kaizen expert's advice gave Bezos a Karate Kid moment.

For me, the Karate Kid moment was just the opposite: when I learned that programmers had vocabulary for talking about refactoring and that some experts had made it a deliberate part of their development process. Wax on, wax off.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

November 19, 2010 4:45 PM

Debugging the Law


Recently, Kevin Carey's Decoding the Value of Computer Science got a lot of play among CS faculty I know. Carey talks about how taking a couple of computer programming courses way back at the beginning of his academic career has served him well all these years, though he ended up majoring in the humanities and working in the public affairs sector. Some of my colleagues suggested that this article gives great testimony about the value of computational thinking. But note that Carey didn't study abstractions about computation, or theory, or design. He studied BASIC and Pascal. He learned computer programming.

Indeed, programming plays a central role in the key story within the story. In his first job out of grad school, Carey encountered a convoluted school financing law in my home state of Indiana. He wrote code to simulate the law in SAS and, between improving his program and studying the law, he came to understand the convolution so well that he felt confident writing a simpler formula "from first principles". His formula became the basis of an improved state law.

That's right. The code was so complicated and hard to maintain that he threw the system away and wrote a new one. Every programmer has lived this experience with computer code. Carey tried to debug a legal code and found its architecture to be so bad that he was better off creating a new one.

CS professors should use this story every time they try to sell the idea of universal computer programming experience to the rest of the university!

The idea of refactoring legal code via a program that implements it is not really new. When I studied logic programming and Prolog in grad school, I read about the idea of expressing law as a Prolog program and using the program to explore its implications. Later, I read examples where Prolog was used to do just that. The AI and law community still works on problems of this sort. I should dig into some of the recent work to see what progress, if any, has been made since I moved away from that kind of work.

My doctoral work involved modeling and reasoning about legal arguments, which are very much like computer programs. I managed to think in terms of argumentation patterns, based on the work of Stephen Toulmin (whose work I have mentioned here before). I wish I had been smart or insightful enough to make the deeper connection from argumentation to software development ideas such as architecture and refactoring. It seems like there is room for some interesting cross-fertilization.

(As always, if you know about work in this domain, please let me know!)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

November 16, 2010 3:59 PM

Agile Moments: The Value of Smaller

Yesterday I read two articles that highlight, in different ways, the value of smaller things.

Smaller Teams

Why Google Can't Build Instagram relates a story about how Larry Ellison coaxed efficiencies from teams:

If a team wasn't productive, he'd come every couple of weeks and say "let me help you out." What did he do? He took away another person until the team started shipping and stopped having unproductive meetings.

This turns the Mythical Man-Month on its head. I wonder if I should try this in my project courses?

Smaller Scope

In the same blog entry, Scoble says:

[Instagram] actually started out as a service that did a lot more than just photographs. But, they learned they couldn't complete such a grand vision and do it well. So they kept throwing out features.

If you can't do all that you dreamed of doing, do less -- and do it well. Articles like this one imply that, as organizations get larger and more visible, they lose the ability to reduce scope and focus on quality in a smaller project. I'm not sure they lose their ability so much as their will.

Smaller Promises

Paul Dyson tells a familiar story about the conflict between what a name comes to connote and the actions that are what it should denote:

I recently sat in front of a customer's project manager -- a very smart and reasonable person -- and accidentally used the A-word ["agile"] when describing how we were going to deliver our product and required customisations to them, and they sneered.

They actually snorted in disgust.

When I then explained we would get them live and using the base product quickly, followed by weekly incremental improvements with regular reviews and plenty of opportunity for rework they were very happy.

But they didn't see any connection between the two things.

The hype that seems inevitably to smother so many great ideas in the software world has, for many parts of our world, made "agile" meaningless at best and risible at worst. That's too bad, because when we ruin good words we lose a useful avenue for communication.

Later in the same piece, Dyson offers his solution:

... it's not about being agile/Agile or achieving agility, or being lean/Lean and efficient. It's about delivering software. And I figure the best way to champion that is actually just to get better at doing it.

I love those last two sentences. The best way to show people the value of patterns or TDD or refactoring or almost any practice is to do it. It's about delivering software.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

November 11, 2010 8:00 AM

A Gresham's Law for Software

Recently, Kent Beck tweeted:

cleaning up junit in preparation for 4.9 release with @dsaff. why do bad design decisions spread faster than good ones?

I immediately thought of Gresham's Law from economics, which is commonly expressed as "bad money drives good money out of circulation". That sounds like what Kent is saying: bad design decisions drive good ones out of our code.

In reality, the two ideas are not alike. Gresham's Law refers to a human behavior we can all understand. Suppose we have two coins, each denominated as worth one dollar. The first coin is made of gold, a rare metal of enduring social and economic value. The second is made of nickel, a common metal not valued by the people using the coins. Under these conditions, people will tend to hoard the gold coins and use the nickel coins in trade. The result is that, eventually, there will be few or no gold coins in circulation. Hence the aphorism: "bad money drives good money out of circulation".

(Sir Thomas Gresham, a financial agent at the time of Queen Elizabeth I, was not the first person to note this behavior. According to Wikipedia, Aristophanes remarked on the phenomenon in his play "The Frogs", at the end of the 5th century BC.)

That's not what is happening in JUnit or other software systems. Kent and his partners aren't hoarding good design decisions and using bad ones in their place, in order to benefit later from having the good decisions at hand. Good design decisions have value only when they are deployed. They are good only in a context where they balance forces in a pleasing or supportive way. "Spending" bad design decisions in code doesn't get rid of them; it requires that we live with them every time we touch the code!

So, equating the phenomenon that Kent described with Gresham's Law would be to misuse the law. If I did so, I wouldn't be alone. Robert Mundell discusses faulty renderings of the principle in an academic paper. It wouldn't even be the first time I made the mistake. I remember vividly the night I took a midterm exam in my undergrad macroeconomics class. I misstated Gresham's Law in one of my responses. My instructor was wandering around the room. He saw my answer, leaned over, and whispered into my ear that I should think about that question again. I did and was able to correct my answer. (That guy was a way cool teacher, and not just for helping me out on the exam.)

When I saw Kent's tweet, I did not make that mistake again. I answered his question with "a perverse malformation of Gresham's Law?" John Mitchell offered his own colorful phrase: viral toxicity. That is almost certainly a better reflection of the phenomenon than mine, yet the ring of "the bad drives out the good" still appeals to me.

I think Kent's question is worth thinking about some more. Bad design does seem to infect other parts of the system and spread, by requiring us to deform other code to make it work well with the bad code, to fit with the bad structures we've already created. Perhaps Viral Toxicity is a sort of Gresham's Law for software, a phenomenon that developers need to be aware of and guard against. Maybe if we talk about the viral toxicity of bad design, other people will understand better the value and even necessity of regular refactoring!


Posted by Eugene Wallingford | Permalink | Categories: Software Development

November 03, 2010 2:12 PM

Ideas from Readers on Recent Posts

A few recent entries have given rise to interesting responses from readers. Here are two.

Fat Arrows

Relationships, Not Characters talked about how the most important part of design often lies in the space between the modules we create, whether objects or functions, not the modules themselves. After reading this, John Cook reminded me about an article by Thomas Guest, Distorted Software. Near the end of that piece, which talks about design diagrams, Guest suggests that the arrows in application diagrams should be larger, so that they would be proportional to the time their components take to develop. Cook says:

We typically draw big boxes and little arrows in software diagrams. But most of the work is in the arrows! We should draw fat arrows and little boxes.

I'm not sure that would make our OO class diagrams better, but it might help us to think more accurately!

My Kid Could Do That

Ideas, Execution, and Technical Achievement wistfully admitted that knowing how to build Facebook or Twitter isn't enough to become a billionaire. You have to think to do it. David Schmüdde mentioned this entry in his recent My Kid Could Do That, which starts:

One of my favorite artists is Mark Rothko. Many reject his work thinking that they're missing some genius, or offended that others see something in his work that they don't. I don't look for genius because genuine genius is a rare commodity that is only understood in hindsight and reflection. The beauty of Rothko's work is, of course, its simplicity.

That paragraph connects with one of the key points of my entry: Genius is rare, and in most ways irrelevant to what really matters. Many people have ideas; many people have skills. Great things happen when someone brings these ingredients together and does something.

Later, he writes:

The real story with Rothko is not the painting. It's what happens with the painting when it is placed in a museum, in front of people at a specific place in the world, at a specific time.

In a comment on this post, I thanked Dave, and not just because he discusses my personal reminiscence. I love art but am a novice when it comes to understanding much of it. My family and I saw an elaborate Rothko exhibit at the Smithsonian this summer. It was my first trip to the Smithsonian complex -- a wonderful two days -- and my first extended exposure to Rothko's work. I didn't reject his art, but I did leave the exhibit puzzled. What's the big deal?, I wondered. Now I have a new context in which to think about that question and Rothko's art. I didn't expect the new context to come from a connection a reader made to my post on tech start-up ideas that change the world!

I am glad to know that thinkers like Schmüdde are able to make connections like these. I should note that he is a professional artist (both visual and aural), a teacher, and a recovering computer scientist -- and a former student of mine. Opportunities to make connections arise when worlds collide.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Personal, Software Development

November 02, 2010 11:00 AM

Relationships, Not Characters

Early in his book "Impro", Keith Johnstone quotes playwright Edward Bond to this effect:

Drama is about relationships, not about characters.

This immediately brought to mind Alan Kay's insistence that object-oriented programmers too often focus so much on the objects in their programs that they lose sight of something more important: the space between the objects. A few years ago, I wrote about this idea in an entry called Software in Negative Space. It remains one of my most-read articles.

The secret to good OO design is in the ma, the web of relationships that make up a complex system, not in the objects themselves.

I think this is probably true of good design in any style, because it is really about how complex systems can manage and use encapsulation. The very best OO design patterns show us how multiple objects interact via message passing to resolve a thorny set of forces. The individual objects don't solve the problem; the solution lies in their interfaces and delegation of responsibility. We need to think about our designs at that level, too.

Now that functional programming is receiving so much mainstream attention, this is a good time to think about when and how good functional design is about relationships, not (just) functions. A combinator is an example: the particular functions are not as important as the way they hook together to solve a bigger problem. Designing functions in this style requires thinking more about the interfaces that expose abstractions to the world, and how other modules use them as services, and less about their implementation. Relationships.
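A tiny example of what I mean, sketched in Python rather than in a functional language, with all the particulars invented for illustration:

    from functools import reduce

    def compose(*fns):
        # Combine functions right to left: compose(f, g)(x) == f(g(x)).
        return reduce(lambda f, g: lambda x: f(g(x)), fns)

    # The individual functions are trivial; the design lives in how
    # they hook together.
    slug = compose(lambda s: s.replace(" ", "-"),
                   str.strip,
                   str.lower)

    print(slug("  Relationships, Not Characters  "))
    # prints: relationships,-not-characters

The interesting design decision is the contract of the compose combinator, not any one of the three little functions it glues together.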


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

October 30, 2010 1:01 PM

The Time is Right for Functional Design Patterns

Back in 1998, I documented some of the ideas that I used to teach functional programming in Scheme. The result was the beginnings of a small pattern language I called Roundabout. When I workshopped this paper at PLoP, I had a lot of fun, but it was a challenge. My workshop consisted of professional developers, most working in Java, with little or no experience in Scheme. Worse, many had been exposed to Lisp as undergraduates and had had negative experiences. Even though they all seemed open to my paper, subconsciously their eyes and ears were closed.

We gathered over lunch so that I could teach a quick primer on how to read Scheme. The workshop went well, and I received much useful feedback.

Still, that wasn't the audience for Roundabout. They were all OO programmers. To the extent they were looking for patterns to use, they were looking for GoF-style OO patterns, C++, Java, and enterprise patterns. I had a hard time finding an audience for Roundabout. Most folks in the OO world weren't ready yet; they were still trying to learn how to do OOD and OOP really well. I gave a short talk on how I use Roundabout in class at an ICFP workshop, but the folks there already knew these patterns well, and most were beyond the novice level at which those patterns live. Besides, the functional programming world wasn't all that keen on the idea of patterns at all, not patterns in the style of Christopher Alexander.

Fast forward to 2010. We now have Scala and Clojure on the JVM. A local developer I know is working hard to wrap his head around FP. Last week, he sent me a link to an InfoQ talk by Aino Corry on Functional Design Patterns. The talk is about patterns more generally, what they are and how GoF patterns fit in the functional programming world. At about the 19:00 mark, she mentions... Roundabout! My colleague is surprised to hear my name and tweets his excitement.

My work on functional design patterns is resurfacing. Why? The world is changing. With Scala and Clojure poised to make noise in the Java enterprise world, functional programming is here. People are talking about Scheme and especially Haskell again. Functional programming is trendy now, with famous OO consultants talking it up and making the rounds at conferences and workshops giving talks about how important it is. (Some folks have been saying that for a while...)

The software patterns "movement" grew out of a need felt by many programmers around the world to learn how to do OO design and programming. Most had been weaned on traditional procedural programming and built up years of experience programming in that style, only to find that their experience didn't transfer smoothly into the OO world. Patterns were an ideal vehicle for documenting OO expertise and communicating it to programmers as they learned the new style.

We now face a similar transition in industry and even academia, as languages like Scala and Clojure begin to change how professionals build their systems. They are again finding that their experience -- now with OO -- does not transfer into the functional world without a few hitches. What we need now are papers that document functional design and programming patterns, both at the most basic level (like Roundabout) and at a higher level (like GoF). We have some starting points from which to begin the journey. There has been some good work done on refactoring functional-style programs, and refactoring is a complement to patterns.

This is a great opportunity for experienced functional programmers to teach their style to a large new audience that now has incentive to learn it. This is also a great opportunity to demonstrate again the value of the software pattern as a form for documenting and teaching how to build things. The question is, what to do next.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

October 26, 2010 4:58 PM

Mindset, Faith, and Code Retreats

If I were more honest with myself, I would probably have to say something like this more often:

He also gave me the white book, XP Explained, which I dismissed outright as rubbish. The ideas did not fit my mental model and therefore they were crap.

Like many people, I am too prone to impose my way of thinking and working on everything. Learning requires changing how I think and do, and that can only happen when I don't dismiss new ideas as wrong.

I found that passage in Corey's Code Retreat, a review of a code retreat conducted by Corey Haines. The article closes with the author's assessment of what he had learned over the course of the day, including this gem:

... faith is a necessary component of design ...

This is one of the hardest things for beginning programmers to understand, and that gets in the way of their learning. Without much experience writing code, they often are overwhelmed by the uncertainty that comes with making anything that is just beyond their experience. And that is where the most interesting work lies: just beyond our experience.

Runners training for their first marathon often feel the same way. But experience is no antidote for this affliction. Despair and anger are common emotions, and they sometimes strike us hardest when we know how to solve problems in one way and are asked to learn a new way to think and do.

Some people are naturally optimistic and open to learning. Others have cultivated an attitude of openness. Either way, a person is better prepared to have faith that they will eventually get it. Once we have experience, our faith is buttressed by our knowledge that we probably will reach a good design -- and that, if we don't, we will know how to respond.

This article testifies to the power of a reflective code retreat led by a master. After reading it, I want to attend one! I think this would be a great thing for our local software community to try. For example, a code retreat could help professional programmers grok functional programming better than just reading books about FP or about the latest programming language.

~~~~

The article also opens with a definition of engineering I had not seen before:

... the strategy for causing the best change in a poorly understood situation within the available resources ...

I will think about it some more, but on first and second readings, I like this.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

October 21, 2010 8:50 AM

Strange Loop Redux

StrangeLoop 2010 logo

I am back home from St. Louis and Des Moines, up to my neck in regular life. I recorded some of my thoughts and experiences from Strange Loop in a set of entries here.

Unlike most of the academic conferences I attend, Strange Loop was not held in a convention center or in a massive conference hotel. The primary venue for the conference was the Pageant Theater, a concert nightclub in the Delmar Loop:

The Pageant Theater

This setting gave the conference's keynotes something of an edgy feel. The main conference lodging was the boutique Moonrise Hotel a couple of doors down:

The Moonrise Hotel

Conference sessions were also held in the Moonrise and in the Regional Arts Commission building across the street. The meeting rooms in the Moonrise and the RAC were ordinary, but I liked being in human-scale buildings that had some life to them. It was a refreshing change from my usual conference venues.

It's hard to summarize the conference in only a few words, other than perhaps to say, "Two thumbs up!" I do think, though, that one of the subliminal messages in Guy Steele's keynote is also a subliminal message of the conference. Steele talked for half an hour about a couple of his old programs and all of his machinations twenty-five or forty years ago to make them run in the limited computing environments of those days. As he reconstructed the laborious effort that went into those programs in the first place, the viewer can't help but feel that the joke's on him. He was programming in the Stone Age!

But then he gets to the meat of his talk and shows us that how we program now is the relic of a passing age. For all the advances we have made, we still write code that transitions from state to state to state, one command at a time, just like our cave-dwelling ancestors in the 1950s.

It turns out that the joke is on us.

The talks and conversations at Strange Loop were evidence that one relatively small group of programmers in the midwestern US are ready to move into the future.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Software Development

October 16, 2010 9:09 AM

Strange Loop, Day 2 Afternoon

After eating some dandy deep-dish BBQ chicken at Pi Pizzeria with the guys from T8 Webware (thanks, Wade!), I returned for a last big run of sessions. I'll save the first session for last, because my report of it is the longest.

Android Squared

I went to this session because so many of my students want me to get down in the trenches with phone programming. I saw a few cool tools, especially RetroFit, a new open-source framework for Android. There are not enough hours in a day for me to explore every tool out there. Maybe I can have a student do an Android project.

Java Puzzlers

And I went to this session because I am weak. I am a sucker for silly programming puzzles, especially ones that take advantage of the dark corners of our programming languages. This session did not disappoint in this regard. Oh, the tortured code they showed us! I draw from this experience a two-part moral:

  1. Bad programmers can write really bad code, especially in a complex language.
  2. A language that is too complex makes bad programmers of us all.

Brian Marick on Outside-In TDD

Marick demoed a top-down, outside-in style of TDD in Clojure using Midje, his homebrew test package. This package and style make heavy use of mock objects. Though I've dinked around a bit in Scala, I've done almost nothing in Clojure, so I'll have to try this out. The best quote of the session echoed a truth about all programming: You should have no words in your test that are not specifically about the test.
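I can't reproduce Marick's Midje code from memory, so here is a rough analogue of the outside-in style in Python, with unittest.mock standing in for Midje's mocking. Every name is my own invention, meant only to show the shape of the approach: specify the outer function first, and mock the collaborator you haven't written yet.

    from statistics import mean
    from unittest.mock import patch

    # The code under test: summary() is the outside; fetch_scores()
    # is the inner collaborator we pretend not to have written yet.
    def fetch_scores(exam):
        raise NotImplementedError("to be written later")

    def summary(exam):
        scores = fetch_scores(exam)
        return {"mean": mean(scores), "count": len(scores)}

    # The outside-in test: mock the collaborator, specify the outer behavior.
    @patch(f"{__name__}.fetch_scores")
    def test_summary_uses_fetched_scores(mock_fetch):
        mock_fetch.return_value = [80, 90, 100]
        assert summary("exam-1") == {"mean": 90, "count": 3}
        mock_fetch.assert_called_once_with("exam-1")

    test_summary_uses_fetched_scores()

The test reads as a specification of the outer layer, and every word in it is about the test, which is the point of the quote above.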

Douglas Crockford, Open-Source Heretic

Maybe my brain was fried by the end of the two days, or perhaps I'm simply not clever enough. While I was able to chuckle several times through this closing keynote, I never saw the big picture or the point of the talk. There were plenty of interesting, if disconnected, stories and anecdotes. I enjoyed Crockford's coverage of several historical mark-up languages, including Runoff and Scribe. (Runoff was the forebear of troff, a Unix utility I used throughout grad school -- I even produced my wedding invitations using it! Fans of Scribe should take a look at Scribble, a mark-up tool built on top of Racket.) He also told an absolutely wonderful story about Grace Murray Hopper's A-0, the first compiler-like tool and likely the first open-source software project.

Panel on the Future of Languages

Panels like this often don't have a coherent summary. About all I can do is provide a couple of one-liners and short answers to a couple of particularly salient questions.

Joshua Bloch: Today's patterns are tomorrow's language features. Today's bugs are tomorrow's type system features.

Douglas Crockford: Javascript has become what Java was meant to be, the language that runs everywhere, the assembly language of the web.

Someone in the audience asked, "Are changes in programming languages driven by academic discovery or by practitioner pain?" Guy Steele gave the best answer: The evolution of ideas is driven by academics. Uptake is driven by practitioner needs.

So, what is the next big thing in programming languages? Some panelists gave answers grounded in today's problems: concurrency, a language that could provide an end-to-end solution for the web, and security. One panelist offered laziness. I think that laziness will change how many programmers think -- but only after functional programming has blanketed the mainstream. Collectively, several panelists offered variations of sloppy programming, citing as early examples Erlang's approach to error recovery, NoSQL's not-quite data consistency, and Martin Rinard's work on acceptability-oriented computing.

The last question from the audience elicited some suggestions you might be able to use. What language, obscure or otherwise, should people learn in order to learn the language you really want them to learn? For this one, I'll give you a complete list of the answers:

  • Io. (Bruce Tate)
  • Rebol. (Douglas Crockford)
  • Forth. Factor. (Alex Payne)
  • Scheme. Assembly. (Josh Bloch)
  • Clojure. Haskell. (Guy Steele)

I second all of these suggestions. I also second Steele's more complete answer: Learn any three languages you do not know. The comparisons and contrasts among them will teach you more than any one language can.

Panel moderator Ted Neward closed the session with a follow-up question: "But what should the Perl guys learn while they are waiting for Perl 6?" We are still waiting for the answer.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 15, 2010 11:46 PM

Strange Loop, Day 2 Morning

Perhaps my brain is becoming overloaded, or I have been less disciplined in picking talks to attend, or the slate of sessions for Day 2 is less coherent than Day 1's. But today has felt scattered, and so far less satisfying than yesterday. Still, I have had some interesting thoughts.

Billy Newport on Enterprise NoSQL

This was yet another NoSQL talk, and yet not, because it was different from the preceding ones at the conference. This talk was not about any particular technologies. It was about mindsets.

Newport explained that NoSQL means "not only SQL". These two general approaches to data storage offer complementary strengths and weaknesses. This means that they are best used in different contexts.

I don't do enough programming for big data apps to appreciate all the details of this talk. Actually, I understood most of the basic concepts, but they soon started blurring in my mind, because I don't have personal experience on which to hang them. A few critical points stood out:

  • In the SQL world, the database is the "system of record" for all data, so consistency is a given. In the NoSQL world, having multiple systems of record is normal. In order to ensure consistency, the application uses business rules to bring data back into sync. This requires a big mind shift for SQL guys.

  • In the SQL world, the row is a bottleneck. In the NoSQL world, any node can handle the request. So there is no bottleneck, which means the NoSQL approach scales transparently. But see the first bullet.

These two issues are enough to see one of Newport's key points. The differences between the two worlds are not only technical but also cultural. SQL and NoSQL programmers use different vocabulary and have different goals. Consider that "in NoSQL, 'query' is a dirty word". NoSQL programmers do everything they can to turn queries into look-ups. For the SQL programmer, the query is a fundamental concept.

The really big idea I took away from this talk is that SQL and NoSQL solve different problems. The latter optimizes for one dominant question, while the former seeks to support an infinite number of questions. Most of the database challenges facing NoSQL shops boil down to this: "What happens if you ask a different question?"
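Here is a caricature of the difference in a few lines of Python (my sketch, not Newport's). The NoSQL designer picks the dominant question up front and maintains its answer at write time, so the read becomes a look-up:

    # SQL mindset: normalize once, then ask any question later, e.g.,
    #   SELECT COUNT(*) FROM orders WHERE customer_id = ?
    # NoSQL mindset: maintain the answer to the dominant question
    # at write time, so reads are look-ups.

    orders_by_customer = {}                   # customer_id -> running count

    def record_order(customer_id):
        orders_by_customer[customer_id] = orders_by_customer.get(customer_id, 0) + 1

    def order_count(customer_id):             # a look-up, not a query
        return orders_by_customer.get(customer_id, 0)

    record_order("c42")
    record_order("c42")
    print(order_count("c42"))                 # prints: 2

Ask a different question, say, orders per day, and this structure cannot answer it; you would have to maintain another view and keep the two in sync with business rules.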

Dean Wampler on Scala

The slot in which this tutorial ran was longer than the other sessions at the conference. This allowed Wampler to cover a lot of details about Scala. I didn't realize how much of an "everything but the kitchen sink" language Scala is. It seems to include just about every programming language feature I know about, drawn from just about every programming language I know about.

I left the talk a bit sad. Scala contains so much. It performs so much syntactic magic, with so many implicit conversions and so many shortcuts. On the one hand, I fear that large Scala programs will overload programmers' minds the way C++ does. On the other, I worry that its emphasis on functional style will overload programmers' minds the way Haskell does.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 15, 2010 12:34 PM

Guy Steele's Own Strange Loop

The last talk of the afternoon of Day 1 was a keynote by Guy Steele. My notes for his talk are not all that long, or at least weren't when I started writing. However, as I expected, Steele presented a powerful talk, and I want to be able to link directly to it later.

Steele opened with a story about a program he wrote forty years ago, which he called the ugliest program he ever wrote. It fit on a single punch card. To help us understand this program, he described in some detail the IBM hardware on which it ran. One problem he faced as a programmer was that the dumps were undifferentiated streams of bytes. Steele wanted line breaks, so he wrote an assembly language program to do that -- his ugliest program.

Forty years later, all he has is the punch card -- no source. Steele's story then turned into CSI: Mainframe. He painstakingly reverse-engineered his code from punches on the card. We learned about instruction format, data words, register codes... everything we needed to know how this program managed to dump memory with newlines and fit on a single card. The number of hacks he used, playing on puns between op codes and data and addresses, was stunning. That he could resurrect these memories forty years later was just as impressive.

I am just old enough to have programmed assembly for a couple of terms on punch cards. This talk brought back memories, even how you can recognize data tables on a card by the unused rows where there are no op codes. What a wonderful forensics story.

The young guys in the room liked the story, but I think some were ready for the meat of the talk. But Steele told another story, about a program for computing sin 3x on a PDP-11. To write this program, Steele took advantage of changes in assembly languages between the IBM mainframe and the PDP-11 to create more readable code. Still, he had to use several idioms to make it run quickly in the space available.

These stories are all about automating resource management, from octal code to assemblers on up to virtual memory and garbage collection. These techniques let the programmer release concerns about managing memory resources to tools. Steele's two stories demonstrate the kind of thinking that programmers had to do back in the days when managing memory was the programmer's job. It turns out that the best way to think about memory management is not to think about it at all.

At this point, Steele closed his own strange loop back to the title of his talk. His thesis is this: the best way to think about parallel programming is not to have to.

If we program using a new set of idioms, then parallelism can be automated in our tools. The idioms aren't about parallelism; they are more like functional programming patterns that commit the program less to underlying implementation.

There are several implications of Steele's thesis. Here are two:

  • Accumulators are bad. Divide and conquer is good.

  • Certain algebraic properties of our code are important. Programmers need to know and preserve them in the code they write.

Steele illustrated both of these implications by solving an example problem that would fit nicely in a CS1 course: finding all the words in a string. With such a simple problem, everyone in the room has an immediate intuition about how to solve it. And nearly everyone's intuition produces a program using accumulators that violates several important algebraic properties that our code might have.

One thing I love about Steele's talks: he grounds ideas in real code. He developed a complete solution to the problem in Fortress, the language Steele and his team have been creating at Sun/Oracle for the last few years. I won't try to reproduce the program or the process. I will say this much. One, the process demonstrated a wonderful interplay between functions and objects. Two, in the end, I felt like we had just used a process very similar to the one I use when teaching students to create this functional merge sort function:

    (define mergesort
       (lambda (lst)
          ;; wrap each element in a one-element (sorted) list,
          ;; then let merge-all combine them into a single sorted list
          (merge-all (map list lst))))

Steele closed his talk with the big ideas that his programs and stories embody. Among the important algebraic properties that programs should have whenever possible are ones we all learned in grade school, explicitly or implicitly. Though they may still sound scary, they all have intuitive common meanings:

  • associative -- grouping doesn't matter
  • commutative -- order doesn't matter
  • idempotent -- duplicates don't matter
  • identity -- this value doesn't matter
  • zero -- other values don't matter

Steele said that "wiggle room" was the key buzzword to take away from his talk. Preserving invariants of these algebraic properties give the compiler wiggle room to choose among alternative ways to implement the solution. In particulars, associativity and commutativity give the compiler wiggle room to parallelize the implementation.

(Note that the merge-all operation in my mergesort program satisfies all five properties.)

One way to convert an imperative loop to a parallel solution is to think in terms of grouping and functions:

  1. Bunch mutable state together as a state "value".
  2. Look at the loop as an application of one or more state transformation functions.
  3. Look for an efficient way to compose these transformation functions into a single function.

The first two steps are relatively straightforward. The third step is the part that requires ingenuity!
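To ground the recipe, here is a sketch of the word-splitting example in Python. Steele's original was in Fortress; this reconstruction is mine, from memory of the general approach, so treat the details as assumptions. A partial result is either a chunk (a fragment containing no spaces) or a segment (a dangling left fragment, a list of complete words, and a dangling right fragment), and the combining function is associative:

    def maybe_word(s):
        return [s] if s else []            # drop empty fragments

    def combine(a, b):
        # The associative combining operator: the wiggle room lives here.
        if a[0] == "chunk" and b[0] == "chunk":
            return ("chunk", a[1] + b[1])
        if a[0] == "chunk":                # chunk + segment
            return ("segment", a[1] + b[1], b[2], b[3])
        if b[0] == "chunk":                # segment + chunk
            return ("segment", a[1], a[2], a[3] + b[1])
        return ("segment", a[1],           # segment + segment
                a[2] + maybe_word(a[3] + b[1]) + b[2],
                b[3])

    def char_state(c):
        return ("segment", "", [], "") if c == " " else ("chunk", c)

    def words(text):
        state = ("chunk", "")
        for c in text:                     # any grouping of combines works,
            state = combine(state, char_state(c))    # including in parallel
        if state[0] == "chunk":
            return maybe_word(state[1])
        return maybe_word(state[1]) + state[2] + maybe_word(state[3])

    print(words("this is  a test"))        # prints: ['this', 'is', 'a', 'test']

Because combine is associative, a smart compiler or runtime is free to split the text anywhere, process the halves in parallel, and combine the partial results. The sequential loop above is just one grouping among many.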

In this style of programming, associative combining operators are a big deal. Creating new, more diverse associative combining operators is the future of programming. Creating new idioms -- the patterns of programs written in this style -- is one of our challenges. Good programming languages of the future will provide, encourage, and enforce invariants that give compilers wiggle room.

In closing, Steele summarized our task as this: We need to do for processor allocation what garbage collection did for memory allocation. This is essential in a world in which we have parallel computers of wildly different sizes. (Multicore processors, anyone?)

I told some of the guys at the conference that I go to hear Guy Steele irrespective of his topic. I've been fortunate enough to be in a small OOPSLA workshop on creativity with Steele, gang-writing poetry and Sudoku generators, and I have seen him speak a few times over the years. Like his past talks, this talk makes me think differently about programs. It also crystallizes several existing ideas in a way that clarifies important questions.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

October 08, 2010 3:53 PM

Strange Loops

StrangeLoop 2010 logo

As busy as things are here with class and department duties, I am excited to be heading to StrangeLoop 2010 next week. The conference description sounds like it is right up my alley:

Strange Loop is a developer-run software conference. Innovation, creativity, and the future happen in the magical nexus "between" established areas. Strange Loop eagerly promotes a mix of languages and technologies in this nexus, bringing together the worlds of bleeding edge technology, enterprise systems, and academic research.

Of particular interest are new directions in data storage, alternative languages, concurrent and distributed systems, front-end web, semantic web, and mobile apps.

One of the reasons I've always liked OOPSLA is that it is about programming. It also mixes hard-core developers with academics, tools with theory. Unfortunately, I'll be missing OOPSLA (now called SPLASH) again this year. I hope that StrangeLoop will inspire me in a similar way. The range of technologies and speakers on the program tells me it probably will.

2010 Des Moines Marathon logo

The day after I return from St. Louis, I hit the road again for the Des Moines Marathon, where I'll run strange loops of a different sort. My training has gone pretty well since the end of July, with mileage and speed work hitting targets I set back in June. Before my taper, I managed three weeks of 50 miles or more, including three 20+ mile long runs. Will that translate into a good race day? We never know. But I'm looking forward to the race, as well as the Saturday expo and dinner with a good buddy the night before.

If nothing else, the marathon will give me a few hours to reflect on what I learn at StrangeLoop and to think about what I will do with it!


Posted by Eugene Wallingford | Permalink | Categories: Running, Software Development

September 22, 2010 4:38 PM

What Agile Isn't

Traffic on the XP mailing list has been heavy for the last few weeks, with a lot of "meta" discussion spawned by messages from questioners seeking a firmer understanding of this thing we call "agile software development". I haven't had time to read even a small fraction of the messages, but I do check in occasionally. Often, I'll home in on comments from specific writers whose expertise and experience I value. Other times, I'll follow a sub-sub-plot to see a broader spectrum of ideas.

Two recent messages really stood out to me as important signposts in the long-term conversation about agile software development. First, Charlie Poole reminded everyone that Agile Isn't a "Thing".

The ongoing thread about whether [agile] is always/sometimes/not always/never/whatever "right" for a given environment seems to me to be missing something rather important. It seems to be based on the assumption that "agile" is some particular thing that we can talk about unambiguously.

It isn't.

If you come to the "agile" community looking for one answer to any question, or agreement on specific practices, or a dictum that developers or managers can use to change minds, you'll be disappointed. It's much more nebulous than that:

Agile is a set of values. They fit anywhere that those values are respected, including places where folks are trying to move the company culture away from antithetical values and towards those of agile.

If you are working with a group of people who share these values, or who are open to them, then you can "do agile" by looking for ways to bring your group's practices more in alignment with your values. You can accomplish this in almost any environment. But to get specific about agile, Charlie reminds us, you probably have to shift the conversation to specific approaches to agile development, and even specific practices.

When I use the term "agile", I try not to use it solo. I like to say "agile software development" or simply "agile development". Software configuration management guru Brad Appleton wrote the second post to catch my eye and goes a step beyond "Agile Isn't a 'Thing'" to the root of the issue: "agile" is an adjective!

"Agile" is something that you are (or are not), not something that you "do".

So simple. Thanks, Brad.

I can talk about "agility" as a noun, where it is the quality attained by "being agile". I can talk about "agile" a modifier of a noun/thing, even if the "thing" it is modifying is a set of values, principles, beliefs, behaviors, etc.

He doesn't stop there, though, for which I'm glad. You can try to develop software in an agile way -- with an openness to change, typically using short iterations and continuous feedback -- and thus try to be more agile. You can adopt a set of values, but if you don't change what you do then you probably won't be any more agile.

I also liked that Brad points out it's not reasonable to expect to realize the promise of developing software in an agile way if one ignores the premise of the agile approaches. For example, executing a certain method or set of practices won't enable you to respond to change with facility unless you also take actions that keep the cost of change low and pay attention to whether or not these actions are succeeding. Most importantly, "agile" is not a synonym for happiness or success. "Being agile" may be a way to be happier as a developer or person, but we should not confuse the goal of "being agile" with the goal of being happy or successful.

The title of this entry, "What Agile Isn't", ought to sound funny, because it isn't entirely grammatically correct. If we put "Agile" in quotes, at least then we could be honest in indicating that we are talking about a word.

Don't worry too much about whether you fit someone else's narrow definition of "agile". Just keep trying to get better by choosing -- deliberately and with care -- actions and practices that will move you in the direction of your goal. The rest will take care of itself.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

September 20, 2010 4:28 PM

Alan Kay on "Real" Object-Oriented Programming

In Alan Kay's latest comment-turned-blog entry, he offers thoughts in response to Moti Ben-Ari's CACM piece about object-oriented programming. Throughout, he distinguishes "real oop" from what passes under the term these days. His original idea of object-oriented software is really quite simple to express: encapsulated modules all the way down, with pure messaging. This statement boils down to its essence many of the things he has written and talked about over the years, such as in his OOPSLA 2004 talks.

This is a way to think about and design systems, not a set of language features grafted onto existing languages or onto existing ways of programming. He laments what he calls "simulated data structure programming", which is, sadly, the dominant style one sees in OOP books these days. I see this style in nearly every OOP textbook -- especially those aimed at beginners, because those books generally start with "the fundamentals" of the old style. I see it in courses at my university, and even dribbling into my own courses.

One of the best examples of an object-oriented system is one most people don't think of as a system at all: the Internet:

It has billions of completely encapsulated objects (the computers themselves) and uses a pure messaging system of "requests, not commands", etc.

Distributed client-server apps make good examples and problems in courses on OO design precisely because they separate control of the client and the server. When we write OO software in which we control both sides of the message, it's often too tempting to take advantage of how we control both objects and to violate encapsulation. These violations can be quite subtle, even when we take into account idioms such as access methods and the Law of Demeter. To what extent does one component depend on how another component does its job? The larger the answer, the more coupled the components.
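
As a made-up illustration (none of this code comes from Kay's post), compare a client that would reach through an object's structure with one that simply sends a message and lets the receiver do the work:

  class Address(val city: String)
  class Customer(val address: Address)

  class Order(private val customer: Customer) {
    // Tell, don't ask: the order answers the question itself, so its
    // clients never learn how it represents the customer or the address.
    def shipsTo(city: String): Boolean = customer.address.city == city
  }

  object CouplingDemo {
    def main(args: Array[String]): Unit = {
      val order = new Order(new Customer(new Address("Cedar Falls")))
      // With customer private, a client cannot even write the chained
      // expression order.customer.address.city; it must send a message.
      println(order.shipsTo("Cedar Falls"))   // true
    }
  }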

Encapsulation isn't an end unto itself, of course. Nor are other features of our implementation:

The key to safety lies in the encapsulation. The key to scalability lies in how messaging is actually done (e.g., maybe it is better to only receive messages via "postings of needs"). The key to abstraction and compactness lies in a felicitous combination of design and mathematics.

I'd love to hear Kay elaborate on this "felicitous combination of design and mathematics"... I'm not sure just what he means!

As an old AI guy, I am happy to see Kay make reference to the Actor model proposed by Carl Hewitt back in the 1970s. Hewitt's ideas drew some of their motivation from the earliest Smalltalk and gave rise not only to Hewitt's later work on concurrent programming but also to Scheme. Kay even says that many of Hewitt's ideas "were more in the spirit of OOP than the subsequent Smalltalks."

Another old AI idea that came to my mind as I read the article was blackboard architecture. Kay doesn't mention blackboards explicitly but does talk about how messaging might work better if, instead of objects sending messages to specific targets, they were to "post their needs". In a blackboard system, objects capable of satisfying needs monitor the blackboard and offer to respond to a request as they are able. The blackboard metaphor maintains some currency in the software world, especially in the distributed computing world; it even shows up as an architectural pattern in Pattern-Oriented Software Architecture. This is a rich metaphor with much room for exploration as a mechanism for OOP.
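
Here is a toy sketch of the mechanism, with every name invented for the example: an object posts a need to the board rather than addressing a receiver, and any knowledge source able to satisfy the need responds.

  case class Need(topic: String, description: String)

  trait KnowledgeSource {
    def canSatisfy(need: Need): Boolean
    def satisfy(need: Need): String
  }

  class Blackboard(sources: List[KnowledgeSource]) {
    // The poster never names a receiver; it only describes what it needs.
    def post(need: Need): Option[String] =
      sources.find(_.canSatisfy(need)).map(_.satisfy(need))
  }

  object BlackboardDemo {
    def main(args: Array[String]): Unit = {
      val speller = new KnowledgeSource {
        def canSatisfy(n: Need): Boolean = n.topic == "spelling"
        def satisfy(n: Need): String = "checked: " + n.description
      }
      println(new Blackboard(List(speller)).post(Need("spelling", "teh cat")))
    }
  }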

Finally, as a CS educator, I couldn't help but notice Kay repeating a common theme of his from the last decade, if not longer:

The key to resolving many of these issues lies in carrying out education in computing in a vastly different way than is done today.

That is a tall order all its own, much harder in some ways than carrying out software development in a vastly different way than is done today.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 14, 2010 9:58 PM

Thinking About Things Your Users Don't Know

Recently, one of the software developers I follow on Twitter posted a link to 10 Things Non-Technical Users Don't Understand About Your Software. It documents the gap the author has noticed between himself as a software developer and the people who use the software he creates. A couple, such as copy and paste and data storage, are so basic that they might surprise new developers. Others, such as concurrency and the technical jargon of software, aren't all that surprising, but developers need to keep them in mind when building and documenting their systems. One, the need for back-ups, eludes even technical users. Unfortunately,

You can mention the need for back-ups in your documentation and/or in the software, but it is unlikely to make much difference. History shows that this is a lesson most people have to learn the hard way (techies included).

Um, yeah.

a polar bear leaps the chasm between ice floes

As I read this article, I began to think that it would be fun and enlightening to write a series of blog entries on the things that CS majors don't understand about our courses. I could start with ten as a target length, but I'm pretty sure that I can identify even more. As the author of the non-technical users paper points out, the value in such a list is most definitely not to demean the students or users. Rather, it is exceedingly useful for professors to remember that their students are not like them and to keep these differences in mind as they design their courses, create assignments, and talk with the students. Like almost everyone who interacts with people, we can do a better job if we understand our audience!

So, I'll be on the look-out for topics specific to CS students. If you have any suggestions, you know how to reach me.

After I finished reading the article, I looked back at the list and realized that many of these things are themselves things that CS majors don't understand about their courses. Consider especially these:

the jargon you use

It took me several years to understand just how often the jargon I used in class sounded like white noise to my students. I'm under no illusion that I now speak in the clearest vocabulary and that all my students understand what I'm saying as I say it. But I think about this often as I prepare and deliver my lectures, and I think I'm better than I used to be.

they should read the documentation

I used to be surprised when, on our student assessments, a student responds to the question "What could I have done better to improve my learning in this course?" with "Read the book". (Even worse, some students say "Buy the book"!) Now, I'm just saddened. I can say only so much in class. Our work in class can only introduce students to the concepts we are learning, not cover them in their entirety. Students simply must read the textbook. In upper-division courses, they may well need to read secondary sources and software documentation, too. But they don't always know that, and we need to help them know it as soon as possible.

Finally, my favorite:

the problem exists between keyboard and chair

Let me sample from the article and substitute students for users:

Unskilled students often don't realize how unskilled they are. Consequently they may blame your course (and lectures and projects and tests) for problems that are of their own making.

For many students, it's just a matter of learning that they need to take responsibility for their own learning. Our K-12 schools often do not prepare them very well for this part of the college experience. Sometimes, professors have to be sensitive in raising this topic with students who don't seem to be getting it on their own. A soft touch can do wonders with some students; with others, polite but direct statements are essential.

The author of this article closes his discussion of this topic with advice that applies quite well in the academic setting:

However, if several people have the same problem then you need to change your product to be a better fit for your users (changing your users to be a better fit to your software unfortunately not being an option for most of us).

You see, sometimes the professor's problem exists between his keyboard and his chair, too!


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

September 10, 2010 2:22 PM

Recursion, Trampolines, and Software Development Process

Yesterday I read an old article on tail calls and trampolining in Scala by Rich Dougherty, which summarizes nicely the problem of recursive programming on the JVM. Scala is a functional language, which lends itself to recursive and mutually recursive programs. Compiling those programs to run on the JVM presents problems because (1) the JVM's control stack is shallow and (2) the JVM doesn't support tail-call optimization. Fortunately, Scala supports first-class functions, which enables the programmer to implement a "trampoline" that avoids growing the stack. The resulting code is harder to understand, and so harder to maintain, but it runs without growing the control stack. This is a nice little essay.
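
The trampoline itself is simple enough to sketch. This is a generic version of the technique, not Dougherty's code: each function returns a description of its next call rather than making the call, and a small driver loop "bounces" until it reaches an answer.

  // Each function returns a description of the next call instead of
  // making the call, so the JVM stack never deepens.
  sealed trait Bounce[A]
  case class Done[A](result: A) extends Bounce[A]
  case class Call[A](next: () => Bounce[A]) extends Bounce[A]

  object Trampoline {
    def run[A](bounce: Bounce[A]): A = bounce match {
      case Done(a) => a
      case Call(k) => run(k())   // a self tail call, which scalac optimizes
    }

    // Mutual recursion that would blow the stack if written directly.
    def even(n: Long): Bounce[Boolean] =
      if (n == 0) Done(true) else Call(() => odd(n - 1))
    def odd(n: Long): Bounce[Boolean] =
      if (n == 0) Done(false) else Call(() => even(n - 1))

    def main(args: Array[String]): Unit =
      println(run(even(1000000L)))   // true, with no StackOverflowError
  }

Scala 2.8 ships this pattern as scala.util.control.TailCalls, but the hand-rolled version shows the mechanics plainly.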

Dougherty's conclusion about trampoline code being harder to understand reminded me of a response by reader Edward Coffin to my July entry on CS faculty sending bad signals about recursion. He agreed that recursion usually is not a problem from a technical standpoint but pointed out a social problem (paraphrased):

I have one comment about the use of recursion in safety-critical code, though: it is potentially brittle with respect to changes made by someone not familiar with that piece of code, and brittle in a way that makes breaking the code difficult to detect. I'm thinking of two cases here: (1) a maintainer unwittingly modifies the code in a way that prevents the compiler from making the formerly possible tail-call optimization and (2) the organization moves to a compiler that doesn't support tail-call optimization from one that did.

Edward then explained how hard it is to warn the programmers that they have just made changes to the code that invalidate essential preconditions. This seems like a good place to comment the code, but we can't rely on programmers paying attention to such comments, or even on the comments accompanying the code forever. The compiler may not warn us, and it may be hard to write test cases that reliably fail when the optimization is missed. Scala's @tailrec annotation is a great tool to have in this situation!
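
For direct recursion, the annotation turns the silent hazard Coffin describes into a compile-time error. A small example of my own:

  import scala.annotation.tailrec

  object SafeSum {
    // The annotation makes the precondition explicit: if a maintainer
    // later rewrites this so it is no longer tail recursive, the compiler
    // rejects the change instead of letting the stack quietly grow.
    @tailrec
    def sum(xs: List[Int], acc: Int = 0): Int = xs match {
      case Nil    => acc
      case h :: t => sum(t, acc + h)
    }
    // For example, `case h :: t => h + sum(t, acc)` would now fail to
    // compile, because the recursive call is no longer in tail position.
  }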

"Ideally," he writes, "these problems would be things a good organization could deal with." Unfortunately, I'm guessing that most enterprise computing shops are probably not well equipped to handle them gracefully, either by personnel or process. Coffin closes with a pragmatic insight (again, paraphrased):

... it is quite possible that [such organizations] are making the right judgement by forbidding it, considering their own skill levels. However, they may be using the wrong rationale -- "We won't do this because it is generally a bad idea." -- instead of the more accurate "We won't do this because we aren't capable of doing it properly."

Good point. I don't suppose it's reasonable for me or anyone to expect people in software shops to say that. Maybe the rise of languages such as Scala and Clojure will help both industry and academia improve the level of developers' ability to work with functional programming issues. That might allow more organizations to use a good technical solution when it is suitable.

That's one of the reasons I still believe that CS educators should take care to give students a balanced view of recursive programming. Industry is beginning to demand it. Besides, you never know when a young person will get excited about a problem whose solution feels so right as a recursion and set off to write a program to grow his excitement. We also want our graduates to be able to create solutions to hard problems that leverage the power of recursion. We need for students to grok the Big Idea of recursion as a means for decomposing problems and composing systems. The founding of Google offers an instructive example of recursion used as inductive definition, as discussed in this Scientific American article on web science:

[Page and Brin's] big insight was that the importance of a page -- how relevant it is -- was best understood in terms of the number and importance of the pages linking to it. The difficulty was that part of this definition is recursive: the importance of a page is determined by the importance of the pages linking to it, whose importance is determined by the importance of the pages linking to them. [They] figured out an elegant mathematical way to represent that property and developed an algorithm they called PageRank to exploit the recursiveness, thus returning pages ranked from most relevant to least.

Much like my Elo ratings program that used successive approximation, PageRank may be implemented in some other way, but it began as a recursive idea. Students aren't likely to have big recursive ideas if we spend years giving them the impression it is an esoteric technique best reserved for their theory courses.
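
To make the recursive idea concrete, here is a toy successive-approximation PageRank; the three-page graph, damping factor, and fixed iteration count are all inventions for the sketch, not Page and Brin's algorithm as deployed:

  object TinyPageRank {
    // links maps each page to the pages it links to.
    def pageRank(links: Map[String, List[String]],
                 damping: Double = 0.85,
                 iterations: Int = 50): Map[String, Double] = {
      val pages = links.keys.toList
      val n = pages.size
      var rank = pages.map(p => p -> 1.0 / n).toMap

      for (_ <- 1 to iterations) {
        rank = pages.map { p =>
          // A page's rank comes from the ranks of the pages linking to it.
          val incoming = for ((q, outs) <- links.toList if outs.contains(p))
                           yield rank(q) / outs.size
          p -> ((1 - damping) / n + damping * incoming.sum)
        }.toMap
      }
      rank
    }

    def main(args: Array[String]): Unit = {
      val web = Map("a" -> List("b", "c"), "b" -> List("c"), "c" -> List("a"))
      pageRank(web).foreach(println)
    }
  }

Each pass feeds the current ranks back through the recursive definition, and the values settle toward a fixed point -- the same successive-approximation move as in the Elo ratings program.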

So, yea! for Scala, Clojure, and all the other languages that are making recursion respectable in practice.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

September 06, 2010 12:45 PM

Empiricism, Bias, and Confidence

This morning, Mike Feathers tweeted a link to an old article by Donald Norman, Simplicity Is Highly Overrated, and mentioned that he disagrees with Norman. Many software folks disagreed with Norman when he first wrote the piece, too. We in software, often being both designers and users, have learned to appreciate simplicity, both functionally and aesthetically. And, as Kent Beck suggested, products such as the iPod are evidence contrary to the claim that people prefer the appearance of complexity. Norman offered examples in support of his position, too, of course, and claimed that he has observed them over many years and in many cultures.

This seems like a really interesting area for study. Do people really prefer the appearance of complexity as a proxy for functionality? Is the iPod an exception, and if so why? Are software developers different from the more general population when it comes to matters of function, simplicity, and use?

When answering these questions, I am leery of relying on self-inspection and anecdote. Norman said it nicely in the addendum to his article:

Logic and reason, I have to keep explaining, are wonderful virtues, but they are irrelevant in describing human behavior.

He calls this the Engineer's Fallacy. I'm glad Norman also mentions economists, because much of the economic theory that drives our world was created from deep analytic thought, often well-intentioned but usually without much evidence to support it, if any at all. Many economists themselves recognize this problem, as in this familiar quote:

If economists wished to study the horse, they wouldn't go and look at horses. They'd sit in their studies and say to themselves, "What would I do if I were a horse?"

This is a human affliction, not just a weakness of engineers and economists. Many academics accepted the Sapir-Whorf Hypothesis, which conjectures that our language restricts how we think, despite little empirical support for a claim so strong. The hypothesis affected work in disciplines such as psychology, anthropology, and education, as well as linguistics itself. Fortunately, others subjected the hypothesis to study and found it lacking.

For a while, it was fashionable to dismiss Sapir-Whorf. Now, as a recent New York Times article reports, researchers have begun to demonstrate subtler and more interesting ways in which the language we speak shapes how we think. The new theories follow from empirical data. I feel a lot more confident in believing the new theories, because we have derived them from more reliable data than we ever had for the older, stronger claim.

(If you read the Times article, you will see that Whorf was an engineer, so maybe the tendency to develop theories from logical analysis and sparse data really is more prominent in those of us trained in the creation of artifacts to solve problems...)

We see the same tendencies in software design. One of the things I find attractive about the agile world is its predisposition toward empiricism. Just yesterday Jason Gorman posted a great example, Reused Abstractions Principle. For me, software abstractions that we discover empirically have a head-start toward confident believability over the ones we design aforethought. We have seen them instantiated in actual code. Even more, we have seen them twice, so they have already been reused -- in advance of creating the abstraction.
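
A deliberately tiny, made-up example of the principle at work: the abstraction is extracted only after the same shape has turned up twice in live code, so it is reused the moment it is born.

  object ReusedAbstractions {
    // Two concrete functions found in real code, each folding a list.
    def totalPrice(prices: List[Double]): Double = prices.foldLeft(0.0)(_ + _)
    def joinWords(words: List[String]): String = words.foldLeft("")(_ + _)

    // Only after the shape has appeared twice do we name the abstraction.
    // It is "reused" at birth: both existing call sites can adopt it.
    def combineAll[A](items: List[A], zero: A, op: (A, A) => A): A =
      items.foldLeft(zero)(op)

    def main(args: Array[String]): Unit = {
      println(combineAll[Double](List(1.0, 2.5), 0.0, _ + _))
      println(combineAll[String](List("a", "b"), "", _ + _))
    }
  }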

Given how frequently even domain experts are wrong in their forecasts of the future and their theorizing about the world, how frequently we are all betrayed by our biases and other subconscious tendencies, I prefer when we have reliable data to support claims about human preferences and human behavior. A flip but too often true way to say "design aforethought" is "make up".


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development

August 31, 2010 7:18 PM

Notes from a Master Designer

Someone recently tweeted about an interview with Fred Brooks in Wired magazine. Brooks is one of the giants of our field, so I went straight to the page. I knew that I wanted to write something about the interview as soon as I saw this exchange, which followed up questions about how a 1940s North Carolina schoolboy ended up working with computers:

Wired: When you finally got your hands on a computer in the 1950s, what did you do with it?

Brooks: In our first year of graduate school, a friend and I wrote a program to compose tunes. The only large sample of tunes we had access to was hymns, so we started generating common-meter hymns. They were good enough that we could have palmed them off to any choir.

It never surprises me when I learn that programmers and computer scientists are first drawn to software by a desire to implement creative and intelligent tasks. Brooks was first drawn to computers by a desire to automate data retrieval, which at the time must have seemed almost as fantastic as composing music. In a Communications of the ACM interview printed sometime last year, Ed Feigenbaum called AI the "manifest destiny" of computer science. I often think he is right. (I hope to write about that interview soon, too.)

But that's not the only great passage in Brooks's short interview. Consider:

Wired: You say that the Job Control Language you developed for the IBM 360 OS was "the worst computer programming language ever devised by anybody, anywhere." Have you always been so frank with yourself?

Brooks: You can learn more from failure than success. In failure you're forced to find out what part did not work. But in success you can believe everything you did was great, when in fact some parts may not have worked at all. Failure forces you to face reality.

As an undergrad, I took a two-course sequence in assembly language programming and JCL on an old IBM 370 system. I don't know how much the JCL on that machine had advanced beyond Brooks's worst computer programming language ever devised, if it had at all. But I do know that the JCL course gave me a negative-split learning experience unlike any I had ever had before or have had since. As difficult as that was, I will be forever grateful for Dr. William Brown, a veteran of the IBM 360/370 world, and what he taught me that year.

There are at least two more quotables from Brooks that are worth hanging on my door some day:

Great design does not come from great processes; it comes from great designers.

Hey to Steve Jobs.

The insight most likely to improve my own work came next:

The critical thing about the design process is to identify your scarcest resource.

This one line will keep me on my toes for many projects to come.

If great design comes from great designers, then how can the rest of us work toward the goal of becoming a great designer, or at least a better one?

Design, design, and design; and seek knowledgeable criticism.

Practice, practice, practice. But that probably won't be enough. Seek out criticism from thoughtful programmers, designers, and users. Listen to what they have to say, and use it to improve your practice.

A good start might be to read this interview and Brooks's books.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

August 26, 2010 9:39 PM

Thinking about Software Development and My Compilers Course

Our semester is underway. I've had the pleasure of meeting my compilers course twice and am looking forward to diving into some code next week. As I read these days, I am keenly watching for things I can bring into our project, both the content of defining and interpreting language and the process of student teams writing a compiler. Of course, I end up imposing this viewpoint on whatever I read! Lately, I've been seeing a lot that makes me think about the development process for the semester.

Greg Wilson recently posted three rules for supervising student programming projects. I think these rules are just as useful for the students as they work on their projects. In a big project course, students need to think about time, technology, and team interaction realistically in a way they may never have had to before. I especially like the rule, "Steady beats smart every time". It gives students hope when things get tough, even if they are smart. More importantly, it encourages them to start and to keep moving. That's the best way to make progress, no matter how smart you are. (I gave similar advice during my last offering of compilers.) My most successful project teams in both the compilers course and in our Intelligent Systems course were the ones who humbly kept working, one shovel of dirt at a time.

I'd love to help my compiler students develop in an agile way, to the extent they are comfortable. Of course, we don't have time for a full agile development course while learning the intricacies of language translation. In most of our project courses, we teach some project management alongside the course content. This means devoting a relatively small amount of time to team and management functions. So I will have to stick to the essential core of agile: short iterations plus continuous feedback. As Hugh Beyer writes:

Everything else is there to make that core work better, faster, or in a more organized way. Throw away everything else if you must but don't trade off this core.

For the last couple of weeks, I have been thinking about ways to decompose the traditional stages of the compiler project (scanning, parsing, semantic analysis, and code generation) into shorter iterations. We can certainly implement the parser in two steps, first writing code to recognize legal programs and then adding code to produce abstract syntax. The students in my most recent offering of the compilers course also suggested splitting the code generation phase of the project into two parts, one for implementing the run-time system and one for producing target code. I like this idea, but we will have to come up with ways to test the separate pieces and get feedback from the earlier piece of our code.
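
As a toy illustration of that first split -- the grammar and all names here are concocted for the sketch -- iteration one merely recognizes legal input, and iteration two grows the same code to build abstract syntax:

  // Toy grammar: expr ::= number | number "+" expr
  sealed trait Expr
  case class Num(n: Int) extends Expr
  case class Add(left: Expr, right: Expr) extends Expr

  object TwoStepParser {
    // Iteration 1: recognize legal programs, nothing more. This is
    // testable on its own against a suite of legal and illegal inputs.
    def recognizes(tokens: List[String]): Boolean = tokens match {
      case n :: Nil if n.nonEmpty && n.forall(_.isDigit)         => true
      case n :: "+" :: rest if n.nonEmpty && n.forall(_.isDigit) => recognizes(rest)
      case _                                                     => false
    }

    // Iteration 2: the same structure, now producing abstract syntax.
    def parse(tokens: List[String]): Option[Expr] = tokens match {
      case n :: Nil if n.nonEmpty && n.forall(_.isDigit) => Some(Num(n.toInt))
      case n :: "+" :: rest if n.nonEmpty && n.forall(_.isDigit) =>
        parse(rest).map(Add(Num(n.toInt), _))
      case _ => None
    }

    def main(args: Array[String]): Unit = {
      println(recognizes(List("1", "+", "2")))   // true
      println(parse(List("1", "+", "2")))        // Some(Add(Num(1),Num(2)))
    }
  }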

Another way we can increase feedback is to do more in-class code reviews of the students' compilers as they write them. A student from the same previous course offering wrote to me only yesterday, in response to my article on learning from projects in industry, suggesting that reviews of student code would have enhanced his project courses. Too often professors show students only their own code, which has been designed and implemented to be clean and easy to understand. A lot of the most important learning happens in working at the rough edges, encountering problems that make things messy and solving them. Other students' code has to confront and solve the same problems, and reading that code and sharing experiences is a great way to learn.

I'm a big fan of this idea, of course, and have taught several of my courses using a studio style in the past. Now I just need to find a way to bring more of that style into my compilers course.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

August 18, 2010 5:04 PM

You May Be in the Software Business

In the category programming for all, Paul Graham's latest essay explains his ideas about What Happened to Yahoo. (Like the classic Marvin Gaye album and song, there is no question mark.) Most people may not care about programming, but they ought to care about programs. More and more, the success of an organization depends on software.

Which companies are "in the software business" in this respect? ... The answer is: any company that needs to have good software.

If this was such a blind spot for an Internet juggernaut like Yahoo, imagine how big a surprise it must be for everyone else.

If you employ programmers, you may be tempted to stay within your comfort zone and treat your tech group just like the rest of the organization. That may not work very well. Programmers are a different breed, especially great programmers. And if you are in the software business, you want good programmers.

Hacker culture often seems kind of irresponsible. ... But there are worse things than seeming irresponsible. Losing, for example.

Again: If this was such a blind spot for an Internet juggernaut like Yahoo, imagine how big an adjustment it would be for everyone else.

I'm in a day-long retreat with my fellow department heads in the arts and sciences, and it's surprising how often software has come up in our discussions. This is especially true in recruitment and external engagement, where consistent communication is so important. It turns out the university is in the software business. Unfortunately, the university doesn't quite get that.


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development

August 14, 2010 11:30 AM

The Tests Are For Me

Yesterday, one of my former students tweeted,

"Write a test first." But unit tests are low priority. The business wants high priority items first.

When I asked, "Is correct code a high priority?", he rephrased my question as, "Is proof of correct code a high priority?"

I knew what he meant and so dropped the exchange, sad that companies still settle for code without tests in the name of expediency. I try not to be dogmatic about writing tests with code, let alone driving the design with tests, and know there are times when not writing tests, at least now, feels okay.

But then it occurred to me. When I write tests, the tests aren't really for "them", whoever "them" might be: my bosses, my colleagues, or my users. My best test suites are the ones I write for myself. They are a part of how I write code.

When I'm not doing TDD and not writing tests in parallel with my code -- when not writing tests feels okay -- I am almost always not writing interesting code. Perhaps I know the domain so well that I can write the code with my eyes closed. Perhaps the domain does not engage me enough that I care to get into the deep flow that comes with testing, growing, and refactoring my program. If the task is dull or routine, then tests seem unnecessary, a drag.

(Perhaps, though, I especially need to write tests in these situations, to guard against producing bad code as a result of complacency and boredom!)

When I am writing code I enjoy, the tests are for me. Saying to me, "Don't take time to write tests" is like telling me not to use version control. It's like saying to me, "Don't take time to format your code properly" or "Don't bother naming your variables properly". Not writing tests seems foreign. Not writing tests is an impediment to writing code at all.

It's really not a matter of "taking time to write a test first". I write tests because that's how I write code.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

August 04, 2010 2:58 PM

Agile Moments: Metaphor as Practice, Metaphors for Practices

In the last few weeks, I've seen a few interesting metaphors related to agile development. Surprisingly, one of them was actually a metaphor, XP-style.

The Mute Button

Like many newcomers to XP, my students tend not to get the reason that "metaphor" was one of the original XP practices. I try to give examples that I've seen, but my set of examples is too small. That's one reason I was excited when some agile practitioners in industry created a new discussion list on metaphor in software. Joshua Kerievsky posted a link to a blog entry he had written about two metaphors that have helped him recently.

In one case, his company has started using the idea of a playlist for the products it sells, instead of a book. In the other, which is the main theme of his entry, he extrapolates from the presence of a "mute" feature in the Twitter client Twittelator to a riff on thinking about Twitter as if it were television. There are some interesting similarities, as well as interesting differences. But it's a good example of how a salient metaphor can be a source of common experience for guiding the design of a software system.

Refactoring "Technical Debt"

A few months back, I wrote an entry on technical debt, suggesting that some technical debt might be okay to carry, so long as we incur it for investment, not consumption. Not everyone believes this, of course. I've been heartened by Kent Beck's writing over the last couple of years about his own questioning of when we might break our own rules, at least the rules as they have been calcified over time or hardened against the challenges of skeptical audiences.

Last month, Steve Freeman proposed a new picture of bad code: it isn't technical debt; it's an unhedged call option. This metaphor highlights the risk we run when we allow bad code to remain in the build. My metaphor's willingness to carry debt for investment implies a risk, too, because some investments fail to deliver as hoped. Freeman's metaphor raises this risk to a more dangerous level. I like his story and think it applies quite nicely in many contexts.

Still, I'm willing to accept lower quality code in trade for something of greater value now -- as long as I keep my eye on the balance sheet and remain vigilant about the debt load I carry. If the load begins to creep higher, or if I begin to spend too many resources servicing the debt, then I want to clean the code up right now. The cost of the debt has risen above the value of the investment it bought me. One of the nice things about software is that we can make changes to improve its structure, if we are willing to spend the time doing so.

What is Refactoring?

Finally, here is a metaphor you can use to explain refactoring to people who don't get it yet: refactoring is a time machine. I'll be smarter tomorrow, more experienced, better informed about what the system I am building today should look like. That is when I should hop in the time machine and make my code look the way it ought, given what I know then. Boy, that takes a lot of pressure off trying to fake the perfect design today, when I don't know yet what it is.

(If I could travel in a blogging time machine, I might go back to last Friday, unmention the Kent Beck failing today, and find a way to use it for the first time here!)


Posted by Eugene Wallingford | Permalink | Categories: Software Development

July 23, 2010 2:29 PM

Agile Moments: Values and Integrity Come Before Practices

As important as the technical practices of the agile software developer are, it is good to keep in mind that they are a means to an end. Jeff Langr and Tim Ottinger do a great job of summarizing the characteristics of agile development teams and reminding us that they are not specific practices. Ottinger boiled it down to a simple phrase in a recent discussion of this piece on the XP mailing list: "... start with the values."

Later in the same thread, Langr commented on what made teaching XP practices so frustrating:

... I am very happy now to be developing in a team doing TDD, and to not be debating daily with apathetic and/or duplicitous people who want to make excuses. You're right, you can reach the lazier folks, given enough time.... It's the sly ones who spent more time crafting excuses, instead of earnestly trying to learn, who drove me nuts.

Few people can learn something entirely new to them without making a good-faith effort. When someone pretends to make a good-faith effort but then schemes not to learn, everyone's time is wasted -- and a lot of the coach's or teacher's limited energy, too. I have been fortunate in my years teaching CS to encounter very few students who behave this way. Most do make an effort to learn, and their struggles are signs that I need to try something different. And every once in a while I run into a student who is honest and says outright, "I'm not going to do that." This lets us both move straight on to more productive uses of our time.

On the same theme of values and integrity, I love this line from John Cook:

If the smart thing to do doesn't scale, maybe we shouldn't scale.

His entry is in part about values. When someone says that we can't treat people well "because that doesn't scale", they are doing what many managerial strategies and software development processes do to people. It also reminds me of why many in the agile development community favor small teams. Not everything that we value scales, and we value those things enough that we are willing to look for ways to work "small".


Posted by Eugene Wallingford | Permalink | Categories: Software Development

July 21, 2010 4:17 PM

Two Classic Programs Available for Study

I just learned that the Computer History Museum has worked with Apple Computer to make source code for MacPaint and QuickDraw available to the public. Both were written by Bill Atkinson for the original Mac, drawing on his earlier work for the Lisa. MacPaint was the iconic drawing program of the 1980s. The utility and quality of this one program played a big role in the success of the Mac. Andy Hertzfeld, another Apple pioneer, credited QuickDraw with much of the Mac's success, because its speed made possible the novel interface that defined the machine to the public. These programs were engineering accomplishments of a different time:

MacPaint was finished in October 1983. It coexisted in only 128K of memory with QuickDraw and portions of the operating system, and ran on an 8 Mhz processor that didn't have floating-point operations. Even with those meager resources, MacPaint provided a level of performance and function that established a new standard for personal computers.

Though I came to Macs in 1988 or so, I was never much of a MacPaint user; I was aware of the program through friends who showed me works they created using it. Now we can look under the hood to see how the program did what it did. Atkinson implemented MacPaint in one 5,822-line Pascal program and four assembly language files for the Motorola 68000 totaling 3,583 lines. QuickDraw consists of 17,101 lines of Motorola 68000 assembly in thirty-seven modules.

I speak Pascal fluently and am eager to dig into the main MacPaint program. What patterns will I recognize? What design features will surprise me, and teach me something new? Atkinson is a master programmer, and I'm sure to learn plenty from him. He was working in an environment that constrained his code's size so severely that he had to approach programming differently than I ever have.

This passage from the Computer History Museum piece shares a humorous story that highlights how Atkinson spent much of his time tightening up his code:

When the Lisa team was pushing to finalize their software in 1982, project managers started requiring programmers to submit weekly forms reporting on the number of lines of code they had written. Bill Atkinson thought that was silly. For the week in which he had rewritten QuickDraw's region calculation routines to be six times faster and 2000 lines shorter, he put "-2000" on the form. After a few more weeks the managers stopped asking him to fill out the form, and he gladly complied.

This reminded me of one of my early blog entries about refactoring. Code removed is code earned!

I don't know assembly language nearly as well as I know Pascal, let alone Motorola 68000 assembly, but I am intrigued by the idea of being able to study more than 20,000 lines of assembly language that work together on a common task and also expose a library API for other graphics programs. Sounds like great material for a student research project, or five...

I am a programmer, and I love to study code. Some people ask why anyone would want to read listings of any program, let alone a dated graphics program from more than twenty-five years ago. If you use software but don't write it, then you probably have no reason to look under this hood. But keep in mind that I study how computation works and how it solves problems in a given context, especially when it has limited access to time, space, or both.

But... People write programs. Don't we already know how they work? Isn't that what we teach CS students, at least ones in practical undergrad departments? Well, yes and no. Scientists from other disciplines often ask this question, not as a question but as an implication that CS is not science. I have written on this topic before, including this entry about computation in nature. But studying even human-made computation is a valuable activity. Building large systems and building tightly resource-constrained programs are still black arts.

Many programmers could write a program with the functionality of MacPaint these days, but only a few could write a program that offers such functionality under similar resource limitations. That's true even today, more than two decades after Atkinson and others of his era wrote programs like this one. Knowledge and expertise matter, and most of it is hidden away in code that most of us never get to see. Many of the techniques used by masters are documented either not well or not at all. One of the goals of the software patterns community is to document techniques and the design knowledge needed to use them effectively. And one of the great services of the free and open-source software communities is to make programs and their source code accessible to everyone, so that great ideas are available to anyone willing to work to find them -- by reading code.

Historically, engineering has almost always run ahead of science. Software scientists study source code in order to understand how and why a program works, in a qualitatively different way than is possible by studying a program from the outside. By doing so, we learn about both engineering (how to make software) and science (the abstractions that explain how software works). Whether CS is a "natural" science or not, it is science, and source code embodies what it studies.

For me, encountering the release of source code for programs such as MacPaint feels something like a biologist discovering a new species. It is exciting, and an invitation to do new work.

Update: This is worth an update: a portrait of Bill Atkinson created in MacPaint. Well done.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Software Development

July 11, 2010 11:59 AM

Form Matters

While reading the July/August 2010 issue of Running Times, I ran across an article called "Why Form Matters" that struck me as just as useful for programmers as for runners. Unfortunately, the new issue has not been posted on-line yet, so I can't link to the article. Perhaps I can make some of the connections for you.

For runners, form is the way our body works when we run: the way we hold our heads and arms; the way our feet strike the ground; the length and timing of our strides. For programmers, I am thinking of what we often call 'process', but also the smaller habitual practices we follow when we code, from how we engage a new feature to how and when we test our code, to how we manage our code repository. Like running, the act of programming is full of little features that just happen when we work. That is form.

The article opened with a story about a coach trying to fix Bill Rodgers' running form at a time when he was the best marathoner in the world. The result was surprising: textbook form, but lower efficiency. Rodgers changed his form to something better and became a worse runner.

Some runners take this to mean, "Don't fix what works. My form works for me, however bad it is." I always chuckle when I hear this and think, "When you are the best marathoner in the world, let's talk. Until then, you might want to consider ways that you can get better." And you can be sure that Bill Rodgers was always looking for ways that he could get better.

There are a lot of programmers who resist changing style or form because, hey, what I do works for me. But just as all top running coaches ask their pupils -- even the best runners -- to work on their form, all programmers should work on their form, the practices they use in the moment-to-moment activity of writing code. Running form is sub-conscious, but so is the part of our programming practice that has the biggest effect on our productivity. These are the habits and the default answers that pop into our head as we work.

If you buy this connection between running form and programming practice, then there is a lot for programmers to learn from this article. First, what of that experiment with Bill Rodgers?

No reputable source claims that, at any one instant, significantly altering your form from what your body is used to will make you faster.

If you decide to try out a new set of practices, say, to go agile and practice XP, you probably won't be faster at the end of the day. New habits take time. The body and mind require practice and acclimation. When we work in teams to build software, we have to go through a process of acculturation. Time.

But that doesn't mean ... that the form your body naturally gravitates toward is what will make you fastest.

There are many reasons that you may have fallen into the practices you use now. The courses and instructors you had in school, the language(s) you learned first, and the programming culture you cut your professional teeth in all lead you in a particular direction. You will naturally try to get better within the context of these influences.

Even when you have been working to get better, you may (in AI terms) reach a local max biased by the initial conditions on the search. So:

"... there is a difference between doing something reasonably well and maximizing performance."

Sometimes, we need a change in kind rather than yet another change in degree.

Nor does it mean that your "natural" form is in your best long-term interest.

Initial conditions really do have a huge effect on how we develop as runners. When we start running, our muscles are weak and we have little stamina. This affects our initial running form, which we then rehearse slowly over many months as we become better runners. The result is often that we now have stronger muscles, more stamina, and bad form!

The same is true for programmers, both solo and in teams. If we are bad at testing and refactoring when we start, we develop our programming skills and get better while not testing and refactoring. What we practice is what we become.

Now, consider this cruel irony faced by runners:

"This belief system that just doing it over and over is somehow going to make us better is really crazy. Longtime runners actually suffer from the body's ability to become efficient. You become so efficient that you start recruiting fewer muscle fibers to do the same exercise, and as you begin using [fewer] muscle fibers you start to get a little bit weaker. Over time, that can become significant. Once you've stopped recruiting as many fibers you start exerting too much pressure on the fibers you are recruiting to perform the same action. And then you start getting muscle imbalance injuries...."

We programmers may not have to worry about muscle imbalance injuries, but we can find ourselves putting all of our emphasis on our mastery of a small set of coding skills, which then become responsible for all facets of quality in the software we produce. There may be no checks and balances, no practices that help reinforce the quality we are trying to wring out of our coding skills.

How do runners break out of this rut, which is the result of locally maximizing performance? They do something wildly different. Elites might start racing at a different distance or even move to the mountains, where they can run on hills and at altitude. We duffers can also try a race at a new distance, which will encourage us to train differently. Or we might simply change our training regimen: add a track workout once a week, or join a running group that will challenge us in new ways.

Sometimes we just need a change, something new that will jolt us out of our equilibrium and stress our system in a new way. Programmers can do this, too, whether it's by learning a new language every year or by giving a whole new style a try.

"Running is the one sport where people think, 'I don't have to worry about my technique. ...' We also have a sport where people don't listen to what the top people are doing. ..."

... I can't think of one top runner in the last two decades who hasn't worked on form, either directly through technique drills, indirectly through strength work or simply by being mindful of it while running.

The best runners work on their form. So do the best programmers. You and I should, too. Of course,

It's important when discussing running form to remember that there's no "perfect" form that we should all aspire to.

Even though I'm a big fan of XP and other approaches, I know that there are almost as many reliable ways to deliver great software as there are programmers. The key for all of us is to keep getting better -- not just strengthening our strengths, which can lead to the irony of overtraining, but also finding our weaknesses and building up those muscles. If you tend toward domains and practices where up-front plans work best for you, great. Just don't forget to work on practices that can make you better. And, every once in a while, try something crazy and new. You never know where that might lead you.

"... if I went out and said we're going to do functional testing on a set of people, you're going to find weaknesses in every single one of them. The body has adapted to who you are, but has the body adapted to the best possible thing you can offer it? No."

Runners owe it to their bodies to try to offer them the best form possible. Programmers owe it to themselves, their employers, and their customers to try to find the best techniques and process for writing code. Sometimes, that requires a little hill climbing in the search, jumping off into some foreign territory and seeing how much better we can get from there. For runners, this may literally be hill climbing!

After the opening of the Running Times article, it turned to discussion of problems and techniques very specific to running. Even I didn't want to overburden my analogy by trying to connect those passages to software development. But then the article ended with a last bit of motivation for skeptical runners, and I think it's perfect for skeptical programmers, too:

If you're thinking, "That's all well and good for college runners and pros who have all day for their running, but I have only an hour a day total for my running, so I'm better off spending that time just getting in the miles," [Pete] Magill has an answer for you.

"... if you have only an hour a day to devote to your running, the first thing you've got to do is learn to run. If you bring bad form into your running, all you're going to be doing for that hour a day is reinforcing bad form. ..."

"A lot of people waste far more time being injured from running with muscle imbalances and poorly developed form than they do spending time doing drills or exercises or short hills or setting aside a short period each week to work on form itself."

Sure, practicing and working to get better is hard and takes time. But what is the alternative? Think about all the years, days, and minutes you spend making software. If you do it poorly -- or even well, but less efficiently than you might -- how much time are you wasting? Practice is an investment, not a consumable.

We programmers are not limited to improving our form by practicing off-line. We can also change what we do on-line: we can write a test, take a short step, and refactor. We can speed up the cycle between requirement and running code, learn from the feedback we get -- and get better at the same time.

The next time you are writing code, think about your form. Surprise yourself.


Posted by Eugene Wallingford | Permalink | Categories: Running, Software Development

June 30, 2010 5:17 PM

Changing Default Actions

Learning to do test-driven design requires a big shift in mindset for many developers. I was impressed with how well the students in my recent agile development course took to the idea of writing tests first. Even the most skeptical students seemed willing to go along with the group in using tests to specify the code they needed to write. Other agile practices, such as pair programming and communal development, helped to support all of the students, willing or skeptical, to move in the right direction.

My friend Steve Berczuk suggests another way to support the change in habit:

Rather than frame the testing challenge with the default being the old way of not testing:

Write a test when it makes sense.

Change your perspective to the default being to test:

Write a test unless you can explain why you did not.

I like how Steve shifts the focus onto default actions. The actions we take by default arise when our mental habits come into contact with the world. Some of my students prefer to talk about their "instincts", but the principle is the same: When things get hard -- or easy -- what will you do?

We can change our habits. We can develop our instincts. Yes, it is hard to do. However we make the change, we have to change the individual actions we take at each moment of choice.

The way to turn running into a habit is to run. When I have a run planned for a morning but wake up feeling rotten, my default has to be to run. I need to have a really good reason not to run, a reason I am willing to tell my family and running friends without shame. This is another example of using positive peer pressure to help myself act in a desired way.

There are good reasons not to run some days. However, when I am creating a new habit, I have to place the burden of proof on the old habit: Why not?

When I follow this discipline, there is a risk of overusing the technique I am learning. If my default answer is to just keep running, I will run on some mornings when I really should take a break. I may find out during the run, in which case I need to listen to my body immediately and adapt. Or I may find out later, when I see that my times from the workout were substandard or when I am sore or fatigued beyond reason later in the day. Whenever I recognize the problem, I can examine the outcome and try to learn the reason why I should not have run. This will allow me to make a sound exception to my default in the future.

The same risk comes when we try this technique while learning test-driven design or any new programming practice. I may write a test I don't have to write. I may write code that is too simple and find that I need more after all. This risk is an integral part of learning. I must learn when not to do something just as much as I need to learn when to do it. The risk of running when I ought not run carries a greater potential cost than writing a test when I need not, because physical injury may result. The only real cost of writing an unnecessary test or taking too small a step forward in my code is the time lost.

As a runner, the way I minimize the risk of injury or other significant cost is to listen to my body. As a programmer, I likewise have to listen to my code, and keep it clean through refactoring. Done steadily and faithfully, the side effect is a new habit, better instincts.

The key to Steve's suggestion is that changing practice isn't just about habit and instinct, as important as they are. It's also about attitude. There are times when my surface attitude is compliant with picking up a new practice, but my ingrained attitude gets in the way. My conscious mind may say, "I want to learn how to do TDD", while subconsciously I react as if "I don't need to write a test here". Taking the initiative to change my default action consciously helps me to bridge the gap. I think that's why I find Steve's idea so helpful.


Posted by Eugene Wallingford | Permalink | Categories: Running, Software Development, Teaching and Learning

June 28, 2010 4:04 PM

Agile Moments: Incremental Design

I love the opening of Research Through Development of Installed Tools, a short section in the conclusion of Richard Stallman's 1979 memo, EMACS: The Extensible, Customizable Display Editor:

The conventional wisdom has it that when a program intended for multiple users is to be written, specifications should be designed in advance. If this is not done, the result will be inferior. The place to try anything new is in a research project which users will not see.

Some people know better than this, but they have been silenced.

If only it were so. The section explains why incremental design was essential to the creation of Emacs:

EMACS could not have been reached by a process of careful design, because such processes arrive only at goals which are visible at the outset, and whose desirability is established on the bottom line at the outset. Neither I nor anyone else visualized an extensible editor until I had made one, nor appreciated its value until he had experienced it. EMACS exists because I felt free to make individually useful small improvements on a path whose end was not in sight.

Agile development teams also like to learn from the act of creating software, allowing goals for the software to emerge as the software grows and allowing the value of features to be assessed in the context of the overall system.

Of course, design still mattered to Stallman and the other developers of Emacs:

While there was no overall goal, each small change had a specific purpose in terms of improving the text editor in general use, and each step had to be individually well designed and reliable.

The resulting design was, according to Hal Abelson (quoted here), good enough to support a new kind of software development community: "Its structure was robust enough that you'd have people all over the world who were loosely collaborating [and] contributing to it. I don't know if that had been done before."

Agile teams use test-driven design, refactoring, and metaphor to keep the quality of their designs on track. Like the Emacs project, agile projects take advantage of real users to keep the usability of their systems on track.

Stallman also talks about the value of experimentation in writing new software, relegating upfront design to implementing new versions of existing features. These new implementations can take "advantage of hindsight". I'm often pleasantly reminded just how often an experimental mindset can benefit me as a software developer, even when implementing systems in domains where I have some experience.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

June 25, 2010 12:19 PM

YAGNI: It's Not Just for Agile Programmers Any More

This is the quote of the day from my reading, drawn from a relatively old blog by Giles Bowkett:

YAGNI and "scratch your own itch" don't just keep code clean, elegant, and succinct, they also keep it honest. The worst code you will ever encounter in your career will contain program logic which does something completely different than it claims to, either in its comments or its method, variable, and object names. Programmers spend more time talking about good and evil than priests or preachers do. The reason is simple: bad code is nothing but lies.

I think beginning programmers don't often realize how many different ways code can lie to us. Moreover, their lack of experience building large programs and living with programs over time usually means that they have no clue at all why this matters, or how important it is.

I am also surprised that so many experienced programmers seem not to grok this yet, especially the role played by doing the simplest thing you can to implement a feature. YAGNI helps us to write more honest code because it helps us to be more honest to ourselves.
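
To make the point concrete, here is a small sketch of my own -- not from Bowkett's post, and with names invented for illustration -- of a method that lies and a method that tells the truth:

    class Account
      attr_reader :balance

      def initialize(balance = 0)
        @balance = balance
      end

      # This method lies. Its name promises only a check, but its body
      # quietly changes state, too. Readers who trust the name will be misled.
      def validate_deposit(amount)
        raise ArgumentError, "amount must be positive" if amount <= 0
        @balance += amount
      end

      # This method tells the truth. The name says exactly what happens.
      def deposit(amount)
        raise ArgumentError, "amount must be positive" if amount <= 0
        @balance += amount
      end
    end

The first method is the kind of small dishonesty that accumulates into the worst code Bowkett describes. Doing the simplest thing that could possibly work leaves fewer places for such lies to hide.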


Posted by Eugene Wallingford | Permalink | Categories: Software Development

June 12, 2010 11:09 AM

Readings from the Agile Development Course

Last time I mentioned that students in my agile software development course found several of the reading assignments to be valuable. For what it's worth, here is at least a relatively complete list of the readings I assigned in the course of four weeks, in no particular order.

When I still thought we might use a distributed version control system, I asked students to read Hg Init and then a few items on git, including Git for the Lazy, the official git tutorial man page, and Everyday Git. Then, when I decided to show discretion in at least one part of the project and use centralized version control, I asked the class to read several items on Subversion.

I'm under no illusion that every student read every page of every reading. This is an ambitious list for agile beginners to tackle in four weeks while also working on a big project. Still, it's clear from discussion that many students read a lot of this material, had questions, and even had comments for me.

As always, I'd love to hear any comments you have about this list, or any additions you can suggest for future offerings.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

June 11, 2010 7:31 PM

Some Feedback on Agile Development Course

Today I finished reviewing materials from my agile software development course so that I could assign final grades. The last thing that students wrote for me was a short evaluation of the course and some of the things we did. Their comments are definitely valuable to me as I think about offering the course again.

What worked best this semester? I was surprised that the class was nearly unanimous in saying pair programming. Their reasons were varied but consistent. By pairing, they faced less down time because one of the two developers usually had an idea for how to proceed. Talking about the roadblock helped them get around it. Several students commented that programming was more fun when working with someone. One student said that he "learned how to explain things better". I imagine that part of this was that, as the team came to build up trust and respect for each other, he wanted to be more patient and helpful when responding to questions. Another part was probably simply practice; pair programming increases communication bandwidth.

The students who did not answer "pair programming" said "learning Ruby". I was hopeful that students would enjoy Ruby and pick it up quickly. But my hope was not grounded in much empirical evidence, so I was a little worried. The language was no worse than a break-even proposition for some students, and it was a big win for most. And as a result I had more fun!

I asked students, Who were the best partners you worked with? The best part of these answers is that they were not dominated by a few names, as I thought they might be. There was a pretty good spread of names listed, with only a couple of students mentioned repeatedly. I take this to mean that many students contributed relatively well to the experience of their teammates, which is the sort of horizontal distribution of knowledge that is desired for XP teams. I did note an interesting distinction made by one student between the partner I learned the most from and the partner with whom I felt the most productive.

Which of the assigned readings was most valuable? I will post a list of most of the readings I assigned over the course of the four weeks in a separate entry. Of those, two were identified most frequently by students as being valuable: Bill Wake's The Test-First Stoplight, which was assigned early in the semester to give students a sense of the rhythm of test-driven design before diving in as a team, and What is Software Design? by Jack Reeves, which was assigned late in the semester after the team had worked collaboratively for a couple of weeks on a system that had no upfront design and very little documentation. In class discussion, a couple of students disagreed with a few of Reeves's points, but even in those cases the paper engaged and challenged the reader. That's about all I can ask from a paper.

When asked, What did you learn best in the course?, the answers fell into roughly two groups: TDD and the value of tests and Ruby and OOP. The flip side is that many students also said, "I wish I could have learned more!" Again, that's about all I can ask from any course, that it leave students both happy to have learned something and eager to learn more.

I don't mind that Ruby and object-oriented programming were the prized learning outcomes of a course on agile software development. A couple of times during the course I noted that a good project course is by its nature about everything. We can design courses in neat little bundles, but the work of making anything -- and perhaps especially software -- is a tangled weave of knowledge and discipline and habit that comes together in the act of creating something real. If Ruby and OOP were what some students most needed to learn in May, I'm glad that's what they learned. I will trust that the agile ideas will take root when they are most needed.

How could I improve the course? This is always a tough question to ask students in a non-anonymous setting, because no matter how much they might trust me I know that some will be reluctant to say what they really think. Still, I received some answers that will help me design the next iteration of this course. Several students expressed a desire for more time -- to write tests, to pair program, to work on the system outside of class, to practice refactoring, .... It seems that desire for more time is a constant in human experience. (At least they weren't so tired of the course that they all said, "Good riddance!") Clearly, there is a trade-off between a four-week semester focused on one course and a fifteen-week semester given over to four or five courses. The loss of absolute number of hours available is a cost of the former. I'll have to think about whether that cost is outweighed by its benefits.

One of the more mature and experienced team members offered an interesting comment: Having an instructor who is a programmer act as client and explain the project requirements to the developers affected a lot of things. He wondered what it would be like to have a non-technical client with the CS instructor acting solely as agile coach. An insightful observation and question!

All in all, this was valuable feedback. The students came through again, as they did throughout the course.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

June 04, 2010 4:38 PM

The End Comes Quickly

Teaching a 3-credit semester course in one month feels like running on a treadmill that speeds up every morning but never stops. Then the course is over, almost without warning. It reminds me a little bit of how the Christmas season felt to me when I was a child. (I should be honest with myself. That's still how the Christmas season feels to me, because I am still a kid.)

I've blogged about the course only twice since we dove head-long into software development, on TDD and incremental design and on the rare pleasure of pair programming with students. Today, we wrapped up the second of two week-long iterations. Here are the numbers, after we originally estimated 60 units of work for the ideal week:

  • Iteration 1
    • Budgeted work for week: 30 units.
    • Actual work for week: 29.5 units.
    • Story points delivered: 24 units.

  • Iteration 2 (holiday-shortened)
    • Budgeted work: 16 units.
    • Actual work: 18 units.
    • Story points delivered: 11 units.
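
For readers who want to see how such numbers can drive the next plan, here is a sketch of XP's "yesterday's weather" heuristic: budget no more than the team actually delivered last time, prorated for a shortened week. The helper below is hypothetical and for illustration only; it is not exactly how we set our budgets.

    # "Yesterday's weather", sketched for illustration. This helper is
    # hypothetical; our actual budgets were negotiated with the team.
    def next_budget(points_delivered, days_available, full_week = 5.0)
      points_delivered * (days_available / full_week)
    end

    puts next_budget(24, 5)   # => 24.0 units for a full week
    puts next_budget(24, 4)   # => 19.2 units for a holiday-shortened week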

I was reasonably happy with the amount of software the team was able to deliver, given all the factors at play, among them never having done XP beyond single practices in small exercises, never having worked together before, learning a new programming language, work schedules that hampered the developers' ability to pair program outside our scheduled class time, and working in a domain far beyond most of their experiences.

And that doesn't even mention what was perhaps the team's biggest obstacle: me. I had never coached an undergrad team in such an intense, focused setting before. In three weeks, I learned a lot about how to write better stories for a student team and how to coach students better when they ran into problems in the trenches. I hope that I would do a much better job as coach if we were to start working on a second release on Monday. As my good friend Joe Bergin told me in e-mail today, "Just being agile."

We did short retrospectives at the end of both iterations, with the second melting into a retrospective on the project as a whole. In general, the students seemed satisfied with the progress they made in each iteration, even when they still felt uncomfortable with some of the XP practices (or remained downright skeptical). Most thought that the foundation practices -- story selection, pair programming, test-first programming, and continuous integration -- worked well in both iterations.

When asked, "What could we improve?", many students gave the same answers, because they recognized we were all still learning. After the first iteration, several team members were still uncomfortable with no Big Design Up Front (BDUF), and a couple thought that we might have avoided needing to devote a day to refactoring if only we had done more design at the beginning. I was skeptical, though, and said so. If we had tried to design the system at the beginning of the project, knowing what we knew then, would we have had as good and as complete a system as we had at the end of the iteration? No way. We learned a lot building our first week's system, and it prepared us for the design we did while refactoring. I could be wrong, but I don't think so.

Most of the developers agreed that the team could be more productive if they were not required to do all of their programming in pairs. With a little guidance from me as the coach, the team decided to loosen the restriction on pairing as follows:

  • If a pair completes a story together, one member of the pair was permitted to work alone to refactor the code they worked on. Honor system: the solo programmer would not create new code, only refactor, and the solo programmer would not make changes that strayed very far from what the pair understood while working together.

  • If a pair is nearly finished with a story, one member of the pair was permitted to work alone to quickly wrap up the story. Honor system: the programmer would work for only 15-30 minutes alone; if it became clear that the work remaining was more involved than a quick wrap-up, the solo programmer would stop immediately and resume working with a partner at the next opportunity.

  • A team member may experiment alone, doing a quick spike in an effort to understand some part of the system. Honor system: the solo programmer would commit none of the spike's code; upon returning to the studio, the developer would collaborate on an equal basis within the pair, sharing what was learned via the spike but not ramming it through to the mainline system without agreement and understanding from the partner.

  • At the next daily stand-up, any developer who had worked solo since the previous class session would explain all work done solo to the entire team.

After the second iteration, the team was happy with this adaptation. Only three of the ten developers had worked alone during the iteration, all doing work well within the letter of the new rules and, even more important, well within the spirit of the new rules, too. The rest of the team was happy with the story wrap-up, the refactoring, and the experimentation that had been done. This seemed like a nice win for the group, as it was the one chance to adapt the XP practices to its own needs, and the modification worked well for them. Just being agile!

As part of the course retrospective, I asked the students whether they would have preferred working in a domain they understood better, perhaps allowing them to focus better on the new practices and new programming language. Here are the notes I had made for myself before class, to be shared after they had a chance to answer:

My thoughts on the domain:

I think it was essential that we work in a domain that pushed you out of your comfort zone.

  • It is hard enough to break habits at all, let alone while working on a problem you already understand well -- or think you do.
  • The benefits of agile approaches come in helping the team to learn and to incorporate that learning into the system.
  • Not knowing the domain forced you to ask lots of questions. That's how real projects work. That's also the best way to work on any system, even ones you think you already understand.
  • There is a realness to reality. Choices matter -- especially when the user is a real person who is passionate about the product.

I was so impressed with the answers the students gave. They covered nearly all of my points, sometimes better than I did. One student identified the trade-off between working in familiar and unfamiliar domains. Another student pointed out that not knowing the domain made the team slow down and think, which helped them design better tests and code. Yet another remarked that there probably was no domain that they all knew equally well anyway. The comment that struck me as most insightful was, roughly, "If we worked in a domain we all understand, then we would probably all understand it differently." That captures the problem of requirements analysis as well as anything I've ever read in a software engineering textbook.

It occurred to me while writing this that I should share the list of readings I asked students to study. It's nothing special, papers most people know about, but it might be a subset of all possible readings that others might find useful. Sharing the list will also make it possible for you to help me make it better for the next time I offer the course! I'll gather up all of my links and post the list soon.

A few days ago, alumnus Wade Arnold tweeted:

Sad to see May graduates in Computer Science applying to do website design and updates. Seriously where did the art of programming go?

The small and largely disconnected programming problems that we assign students in most courses may engage some students in the joy of programming, but I suspect that these problems do not engage enough students deeply enough. I remain convinced that courses like this one, with a real problem explored more deeply and more broadly and with the student developers more in control of what they do and how, are one of the few things we can do in a traditional university setting to help students grok what software development is all about -- and why it can satisfy in ways that other activities often cannot.

Next up on the teaching front: the compilers course. But first, I turn my thoughts to sustainable pace and look forward to breathing free for a few days.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

June 01, 2010 9:39 PM

The Rare Pleasure of Pairing

When my student team began the second iteration of our project this morning, one member had no partner. So I jumped in to fill the void for half an hour or so. When the missing student arrived, he joined us and we worked as a trio.

What a treat! Most of the time, I code alone, and far too often the only code I work closely with is code I have written. Working with another programmer was a lot of fun. We talked as we studied and tried out our ideas about how the code worked on one another. Studying someone else's code added a second dimension to the fun, because neither of us brought much experience with this part of the system to our pairing session. That meant real study, and real learning.

Our programming session reminded me just how valuable tests can be. We bounced back and forth between the class we needed to extend and its tests. We would study the code, come up with an idea of how it worked, and then inspect the tests to confirm or disconfirm our idea. At one point, we were studying a particularly confusing method. We finally figured out what it was doing and went to the tests to check our understanding. But there was no test to help us... and there should have been. So we wrote a test that embodied what we thought should happen, ran it, and -- voilà -- it passed. That felt good.

The story my partner and I picked out turned out to be effectively solved by the existing code. Rather than taking the easy way out, marking the story as done, and grabbing another story card, we decided to clean up the code a bit. We were a bit disturbed at having to study that confusing method so long and, more generally, at having to work so hard to understand the class as a whole. So we refactored the class to express the code's intent more clearly. The biggest product of our clean-up was a helper class to structure the parts of a journal entry and name them. This meant taking a horizontal slice of data that was originally sliced vertically. This clarified several of the internal interfaces and even simplified one hairy loop, with an if statement that selected on value type, into two simpler loops. Ahh.
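
The loop split is easier to see in a schematic before-and-after. The names here are invented for illustration, not our project's code:

    # Schematic sketch only; the names are invented, not our actual code.
    Part = Struct.new(:kind, :amount)

    # Before: one hairy loop, branching on the value type of each part.
    def totals_before(parts)
      debits = credits = 0
      parts.each do |part|
        if part.kind == :debit
          debits += part.amount
        else
          credits += part.amount
        end
      end
      [debits, credits]
    end

    # After: slice the data horizontally first, then run two simpler loops.
    def totals_after(parts)
      debits  = parts.select { |p| p.kind == :debit }.inject(0) { |sum, p| sum + p.amount }
      credits = parts.select { |p| p.kind == :credit }.inject(0) { |sum, p| sum + p.amount }
      [debits, credits]
    end

    entry = [Part.new(:debit, 100), Part.new(:credit, 100)]
    p totals_before(entry)   # => [100, 100]
    p totals_after(entry)    # => [100, 100]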

Is the new code better than the previous version? We think so, but we won't know until the next developers to touch it, whether others or ourselves, bump into it again. I will say that it is at least more explicit. The intention of the code shows up in the names of local variables and formal parameters; it shows up in blocks that send messages to objects of the same kind. It shows up in tests that make finer-grained assertions about the behavior of the system. That feels like an improvement to me and, hopefully, my partners. We'll know more soon enough.

I would love to have more chances like the one I had today.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

May 24, 2010 8:31 PM

Teaching TDD and Incremental Design

Today in the lab while developing code, we encountered another example of stories colliding. We are early in the project, and most parts of the system are still inchoate. So collisions are to be expected.

Today, two pairs were working on stories involving Account objects, which at the start of the day knew only their account number, account name, and current balance. In both stories, the code would have to presume a more knowledgeable object, one that knows something about the date on which the balance is in effect and the various entries that have modified the balance over time.

One team asked, "How can we proceed without knowing the result of the other story?" More importantly, how can either team proceed without knowing how journal transactions will be recorded as entries to the accounts? Implicit in the question, sometimes, is a suggestion disguised as a question: Isn't this an example of where we should do a little up-front design?

In a professional setting, a little up-front design might be the right answer. But with newcomers to XP and TDD, I am trying to have us all think in as pure an XP way as possible. Finding the right point on the continuum between too little up-front design and too much is something better done once the developer has more experience with both ways of working.

This situation is actually a perfect place for us to reinforce the idea behind TDD and why it can help us write better software. Whatever the two pairs do right now, there will likely be some conflicts that need to be merged. Taking that as a given, how can the pairs proceed best? As they write their tests, each should ask itself,

What is the simplest interface we can possibly use to implement this story?

When we write a test, we design a little part of our system's internal interface. Students are used to knowing everything about an already-designed object when they write code to use the object. Programming test-first forces us to think about the interface first, without being privy to implementation. This is good, as it will encourage us to design components that are as loosely coupled as possible. Stories we implement later will impose more specific details on how the object behaves, and we can handle more detailed implementation issues then. This is good, as it encourages us (1) to write simple tests that do not presume any more about the object than is required, and (2) to do the simplest thing that could possibly work to implement the new behavior, because those later stories may well cause our implementation to be extended or changed altogether.
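
As a sketch of what this looks like in practice -- hypothetical names, not the team's code -- a first test might commit a pair to nothing more than the interface its story needs:

    require 'test/unit'

    # The simplest implementation that could possibly pass the test below.
    Account = Struct.new(:number, :name, :balance)

    # The test designs only the interface this story needs; it says nothing
    # about entries, dates, or how a balance is computed. Later stories will
    # impose those details when they must.
    class AccountInterfaceTest < Test::Unit::TestCase
      def test_account_answers_only_what_this_story_needs
        account = Account.new(101, "Checking", 500)
        assert_equal "Checking", account.name
        assert_equal 500, account.balance
      end
    end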

Our story collision is both an obstacle of sorts and an opportunity to let our tests drive us forward in small steps!

This collision also has another lesson in store for us. The whole team has been avoiding a couple of stories about closing journals at the end of the month. Implementing these stories will teach us a lot about what accounts know and look like. By avoiding them, the team has made implementing some of our simplest stories more contingent than they need to be.

Over the weekend, Kent Beck tweeted:

got stuck. wrote a test. unstuck.

A bit later he followed up:

i get stuck trying to write all the logic at once. feels great to deliberately ignore cases that had me stumped. "that's another test"

This is a skill I hope my students can develop this month. In order for that to happen, I need to watch for opportunities to point them in the right direction. When a pair is at an impasse, unsure of what to do next, I need to suggest that they step back and try to take a smaller step. Write a test for that something smaller, and see where that leads them. The Account saga is a useful example for me to keep in mind.

If nothing else, teaching this course in real time -- in a lab with students testing, designing, and coding all the time -- makes clear something we all know. It is one thing to be able to do something, to react and act in response to the world. It is another thing altogether to teach or coach others to do the same thing. I have to have ready at hand questions to ask and suggestions to make as students encounter situations that I handle subconsciously through ingrained experience. Teaching this course is fun on a lot of levels.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

May 19, 2010 4:39 PM

TDD Exploration on an Agile Student Project

Man, I am having fun teaching my agile course. Writing code is fun, and talking design and technique with students in real time is fun. Other than being so intense as to tire me out every day, I think I could get used to this course-in-a-month model.

I had expected that this week would be our first iteration, but it became clear early on that the student team did not understand the domain of our project -- a simple home accounting system I might use -- well enough to begin a development iteration. Progress would have been too slow, and integration too halting, to make the time well-spent.

So, in agile fashion, we adapted. We began our time together yesterday by discussing the problem a bit more at the one-page story level. The team was having difficulty with the idea of special journals and different kinds of transactions. We collectively decided to focus on the general journal for recording all transactions, and so adjusted our thinking and our stories.

Then we took inspiration from XP's practice of a spike solution. In XP, a spike is a simple program that helps a team to explore a thorny technical or design problem and learn enough to begin working on live code. As Ward Cunningham relates, a spike is the answer to the question, "What is the simplest thing we can program that will convince us we are on the right track?" For our team, the problem wasn't technical or design-related; it was a purely a matter of insufficient domain understanding.

We paired up, took a simple story, wrote a test, and wrote code. The story was:

Record a check written on May 15 for cash, in the amount of $100.

By writing even one test for this story, the team began to learn new things about the system to be built. For example,

  • A transaction is atomic. You can't add a debit to a journal independent of its balancing credit.

  • Recording a transaction does not update any account. That happens when the journal is closed at the end of the period.

  • There is a difference between the model, the core computation and data of a program, and the view, the way users see or experience the program's behavior. We can and usually should think about these parts of our program as independent.

The first two of these lessons are about the domain. The third is about software design. Both kinds of lesson are essential ones for a young team to learn, or be reminded of, before building the system.

Some pairs explored this story and its implications for an hour or more. Others tried to forge ahead further, with a second story:

Record receipt of a $200 paycheck, with $100 going to my checking account, $20 to my prepaid medical expense account, and $80 to income tax withholding.

Again, these teams learned something: A transaction may consist of multiple debits or credits. This also means that a transaction must be able to record multiple amounts, unlike in the first story, because several debits may total up to the value of a single credit. Finally, if there are multiple debits, the sum of their values must total exactly to the value of the single credit.
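
A test that captures that last lesson might look something like the sketch below. The class and its interface are hypothetical, not what the team wrote:

    require 'test/unit'

    # Hypothetical sketch, not the team's code: a transaction whose debits
    # must total exactly to its single credit, as in the paycheck story.
    class Transaction
      def initialize(credit, debits)
        total = debits.inject(0) { |sum, d| sum + d }
        raise ArgumentError, "debits must balance the credit" unless total == credit
        @credit, @debits = credit, debits
      end
    end

    class TransactionTest < Test::Unit::TestCase
      def test_debits_must_total_exactly_to_the_credit
        assert_nothing_raised { Transaction.new(200, [100, 20, 80]) }
        assert_raise(ArgumentError) { Transaction.new(200, [100, 20]) }
      end
    end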

Each little bit of learning will help the team to begin to code productively and to be prepared to grow the design of the system.

The development team was not the only party who learned a lot with this spike. By watching the pairs implement a story or two, offering advice and answering questions, I did, too. I play two roles on this project, both as a teacher of sorts. I am the customer for the product and fully intend to use it when the course ends. This makes me a teacher of the domain, both specifically this program and generally Accounting 101. I am also the coach, which finds me helping to guide the XP process as well as teaching students a bit about Ruby, software design, and OO.

By collaborating with the development team as they wrote spike-like code, I learned a lot about how to write better stories. This is true of me as customer, who realized that my original stories lacked the concrete detail the team needed to be able to focus on essential features of the program. It is also true of me as coach, who realized that certain stories would be especially useful in helping the team arrive at a more valuable design more quickly.

It is more than okay for the coach and the customer to learn as much from working with the team as the team members themselves; it is expected. That's one of the great attractions and one of the great assets of agile software development. My students are getting to see that early in their experience with XP.

Tomorrow, I think we will shake off our experimental mindset and begin to write our program. I don't have a lot of experience with a team starting a brand-new system from Code Zero in XP; most of my greenfield development has been on projects where I am the only programmer or the only one writing the initial code base. I am considering trying out advice that @jamesshore tweeted last week:

When starting brand-new product, codebase too small for 8 programmers to work separately. Instead, use projector. 1 driver, 7 nav

This seems like a great way to build an initial code base around a set of commonly-understood objects and interfaces. If I try it out, I will need to avoid one temptation. I will surely want to drive, but I should let a student -- a member of the development team -- control the keyboard!


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

May 17, 2010 5:19 PM

Course Notes from the Treadmill

Teaching a course two hours a day, especially a course that is essentially new in this incarnation, feels like running on ice. I'm enjoying every day, but tomorrow becomes today way too fast!

At the end of last week, I began to feel the effects of compressing a 3-credit course into four weeks. At the end of Week 1, we are a quarter of the way through the course. But one week is not really enough time for all these new ideas to soak into a student's brain or fingertips. TDD, refactoring, pairing, .... Ruby, an IDE, a VCS, ... Our brains take time to adjust. The students are doing remarkably well under the conditions, but some of them are feeling the rush of days, too.

I most noticed the compression in my conflicting desires to do stuff and to talk more about stuff before doing anything big. Most professors tend to err on the side of talking more, but that isn't the best way to learn most disciplines. I decided that we had seen enough background on XP and that students had practiced enough on small exercises such as the spreadsheet TDD challenge and refactoring Fowler's code, Ruby style. It was time to start building software, and learn as we go. So today we played the Planning Game and put ourselves in position to write Line 1 of code tomorrow.

It's been interesting talking to students about XP's practices. Pairing seemed odd to many of them at first, but they seem to have taken to it quickly. They are social beings. Refactoring seems like the Right Thing To Do to many of them, but in practice it is hard. Using a tool like Reek to identify some smells and an IDE like RubyMine to perform some of the refactoring will help, but RubyMine does not yet implement enough different refactorings to really dampen their fear of breaking code.
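
Here is the flavor of the small refactorings we practice, using invented code rather than anything from the course. Extract Method splits one muddled method into two honest ones:

    # Before: one method mixing arithmetic with formatting, the kind of
    # small smell a tool like Reek can point out.
    def receipt(items)
      total = 0
      items.each { |item| total += item[:price] * item[:quantity] }
      "Total: $#{'%.2f' % total}"
    end

    # After: Extract Method separates the calculation from the presentation.
    # (In a script, these definitions replace the ones above.)
    def total(items)
      items.inject(0) { |sum, item| sum + item[:price] * item[:quantity] }
    end

    def receipt(items)
      "Total: $#{'%.2f' % total(items)}"
    end

    puts receipt([{ :price => 1.50, :quantity => 2 }])   # => Total: $3.00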

TDD is causing a couple of programmers fits, because it inverts how they think about coding. When it comes time to write tests for the app they are building -- no longer a small exercise in their minds -- I expect us to struggle as we think about simple design steps. I hope, though, that this practice will get them over the hump to see how writing tests early or first can really affect how we think about our code.

I am still surprised when developers bemoan their inability to deliver working code and then balk so mightily at a practice that could help them in a major way. But, as we all know, old habits die hard. When the mind is ready, change can happen. All we can hope for in a course is to try to be in position for change to occur whenever the mind becomes ready.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

May 14, 2010 9:15 PM

Greatness, Skill, and Metaprogramming

It's been a long week teaching and doing end-of-year reports for the department, not to mention putting out daily fires. I have a few things to say about the agile development course at the 1/4 mark, but another day.

While writing reports this evening, I listened to several talks and interviews. One was Giles Bowkett's talk on meta-programming at the 2008 Mountain West Ruby Conference. Actually, Bowkett objects to the idea of meta-programming, as I discussed a few months ago. At one level, I agree with him; it's all just programming. In this talk, he elaborates on this position and does a little just-programming in Ruby to generate code.
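
A minimal example of that kind of just-programming -- my sketch, not code from his talk -- is ordinary Ruby writing methods on our behalf:

    # Plain Ruby generating code: define_method writes one summing method
    # per kind of entry. The Journal class here is invented for illustration.
    class Journal
      def entries
        @entries ||= []
      end

      [:debit, :credit].each do |kind|
        define_method("total_#{kind}s") do
          entries.select { |e| e[:kind] == kind }
                 .inject(0) { |sum, e| sum + e[:amount] }
        end
      end
    end

    journal = Journal.new
    journal.entries << { :kind => :debit, :amount => 100 }
    journal.entries << { :kind => :credit, :amount => 100 }
    puts journal.total_debits    # => 100
    puts journal.total_credits   # => 100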

The part of this talk that stood out for me this evening was the part of his conclusion in which he discusses Paul Graham's recent work. Bowkett summarizes most of Graham's writing about Lisp, programming, and meta-programming as:

Great programmers can write better programmers than they can hire.

He disagrees with this sentiment in only one word: 'great'. After comically mocking an undue focus on greatness that he attributes to most Harvard grads, he explains that he prefers the more straightforward 'skilled': Skilled programmers can write better programmers than they can hire.

I prefer 'skilled' to 'great' too, because 'great' intimidates too many people. They think other people are or can be great, but that they themselves can be merely ordinary. Maybe so, but ordinary programmers can improve their skills and learn new things. Most ordinary programmers can become skilled programmers, even in the dark art of metaprogramming. They, too, can learn to write better programmers than they can hire, or be.

Of course, this implies that we can help most of the programmers we want to hire be better programmers, by helping them to develop the skills that they need to be good.

If you watch the talk, watch out for his egregious botching of Lisp syntax in the course of demeaning all those evil parentheses that Lisp foists on us. I would tell him the same thing I tell my students: the parentheses aren't nearly as bad -- or as numerous -- once you learn how to use them properly!


Posted by Eugene Wallingford | Permalink | Categories: Software Development

May 09, 2010 11:53 AM

Technical Debt as Consumption or Investment

A couple of days ago I retweeted a one-liner from @unclebobmartin:

All developers know that bad code slows them down; yet nearly all insist that writing bad code is faster.

This struck me as one of the themes I'd like for my agile software development students to pick up on this month. It's tempting to write code quick and dirty so that you can feel as if you are ready to move on to the next task. But as I mentioned here recently, "dirty remains long after quick has been forgotten". The feeling of completion is an illusion, one that we end up paying for later.

A student recently commented that this is the traditional trade-off between "pay now" and "pay later":

I'd compare it to buying a car today using a loan versus paying outright next year. It's cheaper if you're willing to wait, but can you?

This is an apt analogy, because it allows us to peel off a layer and consider the decision at a deeper level. For example, if I can borrow money at a lower net interest rate than I can earn investing my money elsewhere, then it makes sense for me to borrow. This is a facet of financial borrowing that I think we can learn from in the context of software development. If we can borrow, in the form of technical debt, at a lower net cost than the value of the benefit we can accrue by putting our efforts elsewhere, then it makes sense for us to incur the technical debt.
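
With made-up numbers, the rule reduces to a comparison:

    # A toy illustration of the borrowing rule above, with made-up numbers.
    def borrow?(net_cost_of_debt, return_on_alternative)
      net_cost_of_debt < return_on_alternative
    end

    puts borrow?(0.04, 0.07)   # => true:  pay 4% to earn 7% elsewhere
    puts borrow?(0.09, 0.07)   # => false: pay 9% to earn only 7%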

Seth Godin wrote recently that consumer debt is not your friend. Thinking in this way about software development, we can contrast borrowing for consumption and borrowing for investment. Taking on "consumer" technical debt is a sucker's bet, a losing proposition. This is the sort of debt that agile developers rightly warn us about. Rushing through a story with inadequate testing or with inattention to the shape of the system after we add our code -- just so that we can make a tick on the burndown chart and get more stories done -- this is a habit that eventually buries a team in an avalanche of debt. Pretty soon, we can barely keep up with the minimum monthly payment, and we watch our total debt grow faster than we can add new value to the system. The result is inevitable: bankruptcy.

However, this analogy tells us that there may be a kind of debt that we should be willing to take on. As I described my own thinking above, if I can borrow money at a lower net cost than I can earn investing elsewhere, then it makes sense for me to borrow. Godin writes of borrowing in order to improve your productivity or to buy things that go up in value. In the software development world, this is what I call investment technical debt. If a team makes a thoughtful, conscious decision to let a part of the system fall out of compliance with its usual standards for coding, testing or design because doing so lets them create greater value in another way, then they will be better off in the future for taking on the debt. This is investment, and done well it can pay.

This is the sort of debt that Kent Beck has written about in recent months when he dared to say that it might be okay not to write a test. He has taken a lot of grief from some XP folks, who seem to fear that talking about not following agile practices to the letter will give other developers license to do the wrong thing under the mantle of Beck's advice. I feel for those folks. Investment can be dangerous. People lose money in the stock market all the time, and developers drown in technical debt all the time. It is important for beginners to learn solid fiscal habits, and good development practices, before venturing too far into the world of investing.

But Kent is speaking truth, the same truth my student alluded to when raising the pay-now versus pay-later trade-off. With experience and expertise, developers can begin to take on technical debt for the purpose of investment -- and not only survive, but thrive as a result.

Of course, it is essential that the investor be as honest as possible with herself about being able to repay the debt later. The team must be able to bring its test coverage back up to safe levels for the long term, and it must be able to refactor the system to bring its living design back up to a level that supports ongoing development. Some teams like to pay down their debt in one or a few large focused episodes. This is akin to selling a stock to pay off a note in whole, and I have done this in my own programming more than once.

When I can, though, I prefer to amortize the work of paying off technical debt over a more extended set of development tasks. I still think of this in terms of a relatively quick repayment, maybe a few iterations, because I don't want the burden of the debt to affect the rhythm of development any longer than it must. (Similarly, I would never take out a 5-year loan to buy a car.) But I prefer the amortized approach because it fits better with how I want to think about my work: always conscious of the state of the system, always taking small steps to make my programs better. As a creature of habit, I like to work within a set of practices that keep me focused on continuous improvement.

That sort of development is not easy to do, or to learn. It's one of the challenges that newcomers to agile software development must overcome. I think back to another part of my student's comment: "It's cheaper if you're willing to wait, but can you?" I think this points out another common choice people face all the time: want versus need. The consumer debt situation in the U.S. is founded in large part on the fact that many consumers confuse wanting something with needing it. I imagine that some of the so-called software crisis has its roots in the same confusion. "We have to write subpar code in order to meet our goal..." -- only to find the team unable to meet its goal in part precisely because it cut corners earlier.

Godin's article closes with an exhortation to resist consumer debt in the face of temptation:

Stuff now is rarely better than stuff later, because stuff now costs you forever if you go into debt to purchase it. ... It takes discipline to forego pleasure now to avoid a lifetime of pain and fees.

Software developers are wise when they take this advice to heart, too. Investment debt is a good idea in certain circumstances, once you have the experience and expertise to take it on wisely, manage it, and pay it off promptly. Consumer debt is always a loser.

My summer agile course is less than 24 hours away. My mind is turning on all cylinders...


Posted by Eugene Wallingford | Permalink | Categories: Software Development

April 30, 2010 10:17 AM

Taking the Pulse of the Agile Community

Thanks to all of you who have written in response to my previous entry with suggestions for my May term course on agile software development. Most everyone recommended what I knew to be true: source control, automated builds, automated testing, and refactoring are the foundation of agile teams. Keep those suggestions coming!

Over the last semester I have been reading the XP mailing list a little more closely in an effort to discern the pulse of the community these days. Every so often an interesting thread pops up. For example, a few months back, the group talked about its general aversion for software done "quick and dirty". One poster quoted Steve McConnell as saying, "The trouble with quick and dirty is that dirty remains long after quick has been forgotten."

This thread stood out starkly against comments from a couple of my colleagues who view agile ideas as a poison, not just a bad idea but a set of temptations that prevent developers from learning The Right Way to make software. They often rail against XP and its ilk as encouraging quick-and-dirty development, producing bad code with no documentation before moving on willy-nilly to the next "story".

That sounded quite funny as I read professionals who use agile practices every day promote unit testing and especially test-driven development as ways to guard against a quick-and-dirty process. Similarly, building refactoring into your weekly, daily, or hourly development cycle is hardly a recipe for reckless development; it is a practice that shows deep care for the code base and for the quality of the software we deliver to our clients.

One of the most active threads on the list over the last few weeks has been a discussion of the "characteristics of a great XP team". This thread has been full of enlightening capsules from people who have been doing XP in the trenches for many years. Some of the discussion offered advice that applies to great teams of any sort, such as a desire to know the truth and adapt to it. Others took a stab at highlighting what XP itself brings to the table. In one especially insightful message, Bill Caputo suggested that, among other attributes, a great XP team...

  • can deliver well-tested software at a regular pace indefinitely, because it has "successfully flattened [the] cost-of-change curve".
  • "has mastered the art of adapting [its] process to the needs of [its] environment."
  • "have such a distribution of knowledge ... that any one person could leave [the] team, and [its] velocity would not be negatively impacted any more than any other person leaving."

Steve Gordon decomposed the question that launched the thread into two parts:

  1. What are the characteristics of a great software team?
  2. How does XP achieve -- or not achieve -- these characteristics?

I think Gordon's decomposition serves as a nice way for me and my students to approach our course, and I think Caputo's list is a good start on what we mean when we talk about agile teams.


Posted by Eugene Wallingford | Permalink | Categories: Software Development

April 29, 2010 8:47 PM

Turning My Thoughts to Agile Software Development

April came and went in a flurry. Now begins a busy time of transition. Today was the last session of my programming languages course. This semester taught me a few new things, which I hope to catalog and consider soon.

Ordinarily the next teaching I do after programming languages is the compiler course that follows. I will be teaching that course in the fall, as we seem to have attracted a healthy enrollment. But my next teaching assignment will be novelty and part-novelty all in one. I am teaching Agile Software Development in our May term, which runs May 10-June 4. This is a novelty for me in several ways. In all my years on the faculty, I have never taught summer school (!), and I have certainly never taught a 3-credit course in only four weeks. I expect the compressed schedule to create an intensity and focus unlike a regular course, but I fear that it will be hard to reflect much as we keep pedaling every day for two hours. Full speed ahead!

The course is only part novelty because I have taught Agile Software Development twice before, in regular semesters. I'm also quite in tune with the agile values, principles, and practices. Still, seven years is an eon in the software world, so much has changed since my last offerings in 2003 and prior. Tools such as testing frameworks have evolved, changed outright, or sprung up new. Scrum, lean, and kanban have become major topics of discussion even as the original practices of XP remain the foundation of most agile teams. Languages have faded and surged. There is a lot of new for me in this old course.

The compressed schedule offers opportunities I have not had before when teaching a development course. Class will meet two hours every business day for four weeks. Students will be immersed in this course. Most will be working in the afternoons, but few will be taking a second course. This allows us to engage the material and our projects with an intensity we can't often muster. (I'll also have to be careful to pace the course so that we don't wear ourselves out, which seems a danger. This is a chance for me and the class to practice one of XP's bedrock practices, sustainable pace!)

The class will be small, only a dozen or so, which also offers interesting possibilities for our project. The best way to learn new practices is to use them, and with the class meeting for nearly eleven hours a week we have a chance to dig in and use tools and practice the practices for extended periods, as a group. The chance to pair program and work with a story board has never been so vivid for one of my classes.

I hope that we are able to craft a course and project that help us bypass some of the flaws with typical course projects. Certainly, we will be collocated more frequently and for longer stretches than in my department's usual project course, and we will be together enough to learn to work as a team. There shouldn't be the constant context switching between courses that students face during the academic year. Whether we can manage close interaction with a customer depends a lot on the availability of others and on the project we end up pursuing.

We do face many of the same challenges as my software engineering course last fall. Our curriculum creates a Babel of several programming languages. Students will come to the course with a cavernous range of experience, skills, and maturity. That gap offers a good test of how pair programming, collective code ownership, and apprenticeship can help build and share culture and values. The lack of a common tongue is simply a challenge, though, if we hope to deliver software of value in four short weeks.

The next eleven days will find me busy, busy, busy, thinking about my course, organizing readings, and preparing a project and tools.

I am curious to hear what you think:

  • Which ideas, tools, and practices from the agile world ten years ago remain essential today?
  • What changes in the last decade fundamentally changed what we mean by agile development?
  • What readings -- especially accessible primary sources available on the web -- do you recommend?


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

April 22, 2010 8:36 PM

At Some Point, You Gotta Know Stuff

A couple of days ago, someone tweeted a link to Are you one of the 10% of programmers who can write a binary search?, which revisits a passage by Jon Bentley from twenty-five years ago. Bentley observed back then that 90% of professional programmers were unable to produce a correct version of binary search, even with a couple of hours to work. I'm guessing that most people who read Bentley's article put themselves in the elite 10%.

Mike Taylor, the blogger behind The Reinvigorated Programmer, challenged his readers. Write your best version of binary search and report the results: is it correct or not? One of his conditions was that you were not allowed to run tests and fix your code. You had to make it run correctly the first time.

Writing a binary search is a great little exercise, one I solve every time I teach a data structures course and occasionally in courses like CS1, algorithms, and any programming language- or style-specific course. So I picked up the gauntlet.

You can see my solution in a comment on the entry, along with a sheepish admission: I inadvertently cheated, because I didn't read the rules ahead of time! (My students are surely snickering.) I wrote my procedure in five minutes. The first test case I ran pointed out a bug in my stop condition, (>= lower upper). I thought for a minute or so, changed the condition to (= lower (- upper 1)), and the function passed all my tests.

In a sense, I cheated the intent of Bentley's original challenge in another way. One of the errors he found in many professional developers' solutions was an overflow when computing the midpoint of the array's range. The solution that popped into my mind immediately, (lower + upper)/2, fails when lower + upper exceeds the size of the variable used to store the intermediate sum. I wrote my solution in Scheme, which handles bignums transparently. My algorithm would fail in any language that doesn't. And to be honest, I did not even consider the overflow issue; having last read Bentley's article many years ago, I had forgotten about that problem altogether! This is yet another good reason to re-read Bentley occasionally -- and to use languages that do heavy lifting for you.
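
For the record, here is a quick Ruby sketch of mine -- not Bentley's code, and not my original Scheme -- that shows the overflow-safe way to compute the midpoint:

    # Ruby's integers, like Scheme's bignums, never overflow, but this form
    # of the midpoint is the one to use in languages with fixed-width integers.
    def binary_search(array, target)
      lower, upper = 0, array.length - 1
      while lower <= upper
        mid = lower + (upper - lower) / 2   # (lower + upper) / 2 can overflow elsewhere
        case array[mid] <=> target
        when 0  then return mid
        when -1 then lower = mid + 1
        else         upper = mid - 1
        end
      end
      nil
    end

    p binary_search([1, 3, 5, 7, 9], 7)   # => 3
    p binary_search([1, 3, 5, 7, 9], 4)   # => nil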

But.

One early commenter on Taylor's article said that the no-tests rule took away some of his best tools and his usual way of working. Even if he could go back to basics, working in an unfamiliar mode probably made him less comfortable and less likely to produce a good solution. He concluded that, for this reason, a challenge with a no-tests rule is not a good test of whether someone is a good programmer.

As a programmer who prefers an agile style, I felt the same way. Running that first test, chosen to encounter a specific possibility, did exactly what I had designed it to do: expose a flaw in my code. It focused my attention on a problem area and caused me to re-examine not only the stopping condition but also the code that changed the values of lower and upper. After that test, I had better code and more confidence that my code was correct. I ran more tests designed to examine all of the cases I knew of at the time.

As someone who prides himself in his programming-fu, though, I appreciated the challenge of trying to design a perfect piece of code in one go: pass or fail.

This is a conundrum to me. It is similar to a comment that my students often make about the unrealistic conditions of coding on an exam. For most exams, students are away from their keyboards, their IDEs, their testing tools. Those are big losses to them, not only in the coding support they provide but also in the psychological support they provide.

The instructor usually sees things differently. Under such conditions, students are also away from Google and from the buddies who may or may not be writing most of their code in the lab. To the instructor, this nakedness is a gain. "Show me what you can do."

Collaboration, scrapheap programming, and search engines are all wonderful things for software developers and other creators. But at some point, you gotta know stuff. You want to know stuff. Otherwise you are doomed to copy and paste, to having to look up the interface to basic functions, and to being able to solve only those problems Google has cached the answers to. (The size of that set is growing at an alarming rate.)

So, I am of two minds. I agree with the commenter who expressed concern about the challenge rules. (He posted good code, if I recall correctly.) I also think that it's useful to challenge ourselves regularly to solve problems with nothing but our own two hands and the cleverness we have developed through practice. Resourcefulness is an important trait for a programmer to possess, but so are cleverness and meticulousness.

Oh, and this was the favorite among the ones I read:

I fail. ... I bring shame to professional programmers everywhere.

Fear not, fellow traveler. However well we delude ourselves about living in a Garrison Keillor world, we are all in the same boat.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

April 15, 2010 8:50 PM

Listen To Your Code

On a recent programming languages assignment, I asked students to write a procedure named if->boolean, whose spec was to recognize certain undesirable if expressions and return in their places equivalent but simpler boolean expressions. This procedure could be part of a simple refactoring engine for a Scheme-like language, though I don't know that we discussed it in such terms.

One student's procedure made me smile in a way only a teacher can smile. As expected, his procedure was a cond expression selecting among the undesirable ifs. His procedure began something like this:

    (define if->boolean
      (lambda (exp)
        (cond ((not (if? exp))
                 exp)
              ; Has the form (if condition #t #f)
              ((and (list? exp)
                    (= (length exp) 4)
                    (eq? (car exp) 'if)
                    (exp? (cadr exp))
                    (true-lit? (caddr exp))
                    (false-lit? (cadddr exp)))
                 (if->boolean (cadr exp)))
              ...

The rest of the procedure was more of the same: comments such as

    ; Has the form (if condition #t another)

followed by big and expressions to recognize the noted undesirable if and a call to a helper procedure that constructed the preferred boolean expression. The code was long and tedious, but the comments made its intent clear enough.

Next to his code, I wrote a comment of my own:

Listen to your code. It is saying, "Write syntax procedures!"

How much clearer this code would have been had it read:

    (define if->boolean
      (lambda (exp)
        (cond ((not (if? exp))   exp)
              ((trivial-if? exp) (if->boolean (cadr exp)))
              ...
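
The syntax procedure this version assumes is nothing more than the student's big and expression given an intention-revealing name -- a sketch that reuses his own tests verbatim:

    ; Does exp have the form (if condition #t #f)?
    (define trivial-if?
      (lambda (exp)
        (and (list? exp)
             (= (length exp) 4)
             (eq? (car exp) 'if)
             (exp? (cadr exp))
             (true-lit? (caddr exp))
             (false-lit? (cadddr exp)))))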

When I talked about this code in class (presented anonymously, in case the student preferred not to be identified), I made it clear that I was not being all that critical of the solution. It was thorough and correct code. Indeed, I praised it for the concise comments that made the intent of the code clearer than it would have been without them.

Still, the code could have been better, and the students in the class -- many of whom had written code similar to this, though not always as good -- knew it. For several weeks now we have been talking about syntax procedures as a way to define the interface of an ADT and as a way to hide detail that complicates a piece of code.

Syntax Procedure is one pattern in a family of patterns we have learned in order to write structurally-recursive code over an inductive data type. Syntax procedures are, of course, much more broadly applicable than their role in structural recursion, but they are especially useful in helping us to isolate code for manipulating a particular data representation from the code that processes the data values and recurses over their parts. That can be especially useful when students are at the same time still getting used to a language as unusual to them as Scheme.

[Image: Michelangelo's Pieta]

The note I wrote on the student's printout was one measure of chiding (Really, have you completely forgotten about the syntax procs we've been writing for weeks?) mixed with nine -- or ninety-nine -- measures of stylistic encouragement:

Yes! You have written a good piece of code, but don't stop here. The comment you wrote to help yourself create this code, which you left in so that you would be able to understand the code later, is a sign. Recognize the sign, and use what it says to make your code better.

Most experienced programmers can tell us about the dangers of using comments to communicate a program's intent. When a comment falls out of sync with the code it decorates, woe to future readers trying to understand and modify it. Sometimes, we need a comment to explain a design decision that shapes the code, which the code itself cannot tell us. But most of the time, a comment is just that, a decorator: something meant to spruce up the place when the place doesn't look as good as we know it should. If the lack of syntax procedures in my student's code is a code smell, then his comment is merely deodorant.

Listen to your code. This is one of my favorite pieces of advice to students at all levels, and to professional programmers as well. I even wrote this advice up in a pattern of its own, called Speak the Problem's Language. I knew this pattern from many years of writing Smalltalk to build knowledge-based systems in domains from accounting to engineering. Then I read Peter Norvig's Paradigms of Artificial Intelligence Programming, and his Chapter 2 expressed the wisdom so well that I wanted to make it available as a core coding pattern in all of the pattern languages I was writing at the time. It is still one of my favorites. I hope my student comes to grok it, too.

After class, another student stopped to chat. She is a double major in CS and graphic design, and she wanted to say how odd it was to hear a computer science prof saying, "Listen to your code." Her art professors tell her this sort of thing all the time. Let the painting tell you where it wants to go. And, The sculpture is already in the stone; your job is to set it free.

Whatever we want to say about software 'engineering', when we write a program to do something new, our act of creation is not all that different from the painter's, the sculptor's, or the graphic designer's. We shape the code, and it shapes us. Listen.

This was a pretty good way to spend an afternoon talking to students.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development

April 13, 2010 9:03 PM

Unexpected Encounters with Knowledge

In response to a question from Francesco Cirillo -- "Reflecting on your career choices, is there anything you would have done differently?" -- Ward Cunningham says:

I'm pretty happy with my career, though I never did enough calculus homework, if I think about how much calculus has influenced how I think in terms of small units of change.

Calculus came up in my programming languages course today, while we were talking about maps, finite functions, and discrete math. Our K-12 system aims students toward calculus, and when they arrive at the university, they often end up taking a calculus course if they haven't already. Many CS students struggle in calc. They can't help but notice the dearth of applications of calculus in most of their CS courses and naturally ask, "Why are we required to take this class?"

This is a common discussion even among faculty. I can argue both sides of the case, though I admit to believing that understanding the calculus at some level is an essential part of being an educated person, just as understanding the literary and historical context in which one grows and lives is essential. The calculus is one of the crowning achievements of the Enlightenment and helped to usher in the scientific advances that define in large part the world in which we all live today. But Cunningham's reflection encourages us to think about calculus in a different light.

Notice that Cunningham does not talk about direct application of the calculus in any program he wrote. The only program he mentions specifically is WyCash, a portfolio management system. Nor does he talk in an abstract academic way about intellectual achievement and the Age of Reason.

He says instead that the calculus's notion of small units of change has affected the way he thinks. I'm confident that he is thinking here not only of agile software development, with its short iterations and rapid feedback cycle, but also of test-driven development, patterns, and wiki. One can accumulate value in the smallest of the slices. If one accumulates enough of them, then over time the value one amasses can be the area under quite a large curve of action.

This is an indirect application of knowledge. Ward either did enough calculus homework or paid enough attention in class that he was able to understand one of the central ideas underlying the discipline. That understanding probably lay fallow in his mind until he began to see how the idea was recurring in his programming, in his community building, and in his approach to software development. He was then able to think about the implications of the idea in his current work and learn from what we know about the calculus.

I am a fan of Ward's in large part because of his wonderful ability to make such connections. It is hard to anticipate this kind of connection across domains. That's why it's so important to be educated widely and to take seriously ideas from all corners of human accomplishment. Even calc class.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

March 31, 2010 3:22 PM

"Does Not Play Well With Others"

Today I ran across a recent article by Brian Hayes on his home-baked graphics. Readers compliment him all the time on the great graphics in his articles. How does he do it? they ask. The real answer is that he cares what they look like and puts a lot of time into them. But they want to know what tools he uses. The answer to that question is simple: He writes code!

His graphics language of choice is PostScript. But, while PostScript is a full-featured postfix programming language, it isn't the sort of language that many people want to write general-purpose code in. So Hayes took the next natural step for a programmer and built his own language processor:

... I therefore adopted the modus operandi of writing a program in my language of choice (usually some flavor of Lisp) and having that program write a PostScript program as its output. After doing this on an ad hoc basis a few times, it became clear that I should abstract out all the graphics-generating routines into a separate module. The result was a program I named lips (for Lisp-to-PostScript).

Most of what lips does is trivial syntactic translation, converting the parenthesized prefix notation of Lisp to the bracketless postfix of PostScript. Thus when I write (lineto x y) in Lisp, it comes out x y lineto in PostScript. The lips routines also take care of chores such as opening and closing files and writing the header and trailer lines required of a well-formed PostScript program.

Programmers write code to solve problems. More often than many people, including CS students, realize, programmers write a language processor -- or even create a little language of their own -- to make solving the problem more convenient. We have been covering the idea of syntactic abstractions in my programming languages course for the last few weeks, and Hayes offers us a wonderful example.
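
To see how little machinery the trivial syntactic translation needs, here is a sketch of the prefix-to-postfix step in Scheme. This is my illustration, not Hayes's lips code, and it handles only numbers, symbols, and nested calls:

    ;; Translate a prefix expression such as (lineto 72 144) into
    ;; PostScript's postfix form: emit each argument, then the operator.
    (define emit-ps
      (lambda (exp)
        (cond ((pair? exp)
               (string-append
                 (apply string-append
                        (map (lambda (arg)
                               (string-append (emit-ps arg) " "))
                             (cdr exp)))
                 (symbol->string (car exp))))
              ((number? exp) (number->string exp))
              (else (symbol->string exp)))))

    ;; (emit-ps '(lineto 72 144))          => "72 144 lineto"
    ;; (emit-ps '(moveto (add 36 36) 144)) => "36 36 add 144 moveto"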

Hayes describes his process and programs in some detail, both lips and his homegrown plotting program plot. Still, he acknowledges that the world has changed since the 1980s. Nowadays, we have more and better graphics standards and more and better tools available to the ordinary programmer -- many for free.

All of which raises the question of why I bother to roll my own. I'll never keep up -- or even catch up -- with the efforts of major software companies or the huge community of open-source developers. In my own program, if I want something new -- treemaps? vector fields? the third dimension? -- nobody is going to code it for me. And, conversely, anything useful I might come up with will never benefit anyone but me.

Why, indeed? In my mind, it's enough simply to want to roll my own. But I also live in the real world, where time is a scarce resource and the list of things I want to do grows seemingly unchecked by any natural force. Why then? Hayes answers that question in a way that most every programme