TITLE: Problems Are The Thing
AUTHOR: Eugene Wallingford
DATE: March 14, 2005 2:47 PM
DESC: The era of toy problems should pass. Let's make learning to make software be about *something*.
-----
BODY:
In my last blog, I discussed the search for fresh new examples for use in teaching an OO CS1 course. My LazyWeb request elicited many interesting suggestions, some of which have real promise. Thanks to all who have sent suggestions, and thanks for any others that might be on the way!

While simple graphical ball worlds make for fun toy examples, they don't seem as engaging to me as they did a few years back. As first demos, they are fine. But students these days often find them too simplistic. The graphics that they can reasonably create in the first course pale when compared to the programs they have seen and used. To use ball world, we need to provide infrastructure to support more realistic examples with less boilerplate code.

Strategy and memory games engage some students quite nicely. Among others, I have used Nim, Simon, and Mastermind. Mastermind has always been my favorite; it has so many wonderful variations and offers so many opportunities to capitalize on good OO design.

Games of all sorts seem to engage male students more than female students, which is a common point of conversation when discussing student retention and the diversity of computer science students. We need to use a broader and more interesting set of examples if we hope to attract and retain a broader group of students. (More on whether this should be an explicit goal later.)

I think that this divide is about something much more important than just the interests of men and women. The real problem with games as our primary examples is that these examples are really about nothing of consequence. Wanting to work on such a problem depends almost solely on one's interest in programming for programming's sake.
If students want to do something that matters, then these examples won't engage their interest.

This idea is at the core of our desire to create a new sort of CS 1 book. As one of my readers pointed out in an e-mail, most introductory programming instruction isn't about anything, either. It "simply marches through the features of whatever language is being used" at the time. The examples used are contrived to make the language feature usable -- not always even desirable, just usable.

Being about nothing worked for Seinfeld, but it's not the best way to help students learn -- at least not if that's all we offer them. It also limits the audience that we can hope to attract to computing.

So much of computer science instruction is about solutions and how to make them, but the solutions aren't to anything in particular. That appeals to folks who are already interested in the geeky side of programming. What about all those other folks who would make good computer scientists working on problems in other domains?

Trying to create interesting problems around programming techniques and language features is a good idea, but it's backward. Interesting problems come from real domains.

I learned the same lesson when studying AI as a graduate student in the 1980s. We could build all the knowledge-based systems we wanted as toys to demonstrate our techniques, but no one cared. And besides, where was the knowledge to come from? For toy problems, we had to make it up, or act as our own "domain experts". But you know, building KBS was tougher when we had to work with real domain experts: tax accountants and chemical engineers, plant biologists and practicing farmers. The real problems we worked on with real domain experts exercised our techniques in ways we did not anticipate, helping us to build richer theories at the same time we were building programs that people really used. I thank my advisor for encouraging this mindset in our laboratory from its inception.
Real domain problems are more likely to motivate students and teachers. They offer rich interconnections with related problems and related domains. Some of these problems can be quite simple, which is a good thing for teaching beginning students. But many have the messy nature that makes interesting ideas matter.

At OOPSLA last year, Owen Astrachan was touting the new science of networks as a source of interesting problems for introductory CS instruction. The books Linked and Six Degrees provide a popular introduction to this area of current interest throughout academia and the world of the Web. Even the idea of the power law can draw students into this area. I recently asked students to write a simple program that included calculating the logarithm of word counts in a document, without saying anything about why. Several students stopped by, intrigued, and asked for more. When I told them a little about the power law and its role in analyzing documents and explaining phenomena in economics and the Web, they all expressed interest in digging deeper.

Other problems of this sort are becoming popular. We have started an undergraduate program in bioinformatics and have begun to explore ways to build early CS instruction around examples from that domain. The availability of large databases and APIs opens new doors, too. Google and Amazon have opened their databases to outside programmers.

At last year's OOPSLA conference the invited talks all addressed the future of CS instruction by going back to a time in which computing research focused on applied problems. If you've been reading here for a while, then you have read about the importance of context in learning before. A big part of Alan Kay's talks there focused on how students can learn about the beauty of computing through doing real science. His eToys project has students build simulations of physical phenomena that they can observe, and in doing so learn a lot about mathematics and computation.
But this is a much bigger idea in Kay's work. If you haven't yet, read his Draper Prize talk, The Power Of The Context. It speaks with eloquence about how people have great ideas when they are immersed in an environment that stimulates thought and connections. Kay's article says a lot about how the working conditions in a lab, and perhaps most importantly the people, foster such an environment. In an instructional setting the teacher and fellow students define most of this part of the environment. Other people can be a part of the environment, too, through their ideas and creations -- see my article on Al Cullum and his catchphrase "a touch of greatness".

But the problems that we work on are also an indispensable element in the context that motivates people to do great work. In his Draper Prize talk, Kay speaks about how his Xerox PARC lab worked with educators, artists, and engineers on problems that mattered to them, along with all the messy distractions that those problems entail. Do you think that the PARC folks would have created as many interesting tools and ideas if they had been working on "Hello, World" and other toy problems of their own invention? I don't.

Alan's vision in his OOPSLA talks was to create the computing equivalent of Frank Oppenheimer's Exploratorium -- 500 or more exciting examples with which young people could learn about math, science, computation, reading, and writing, all in context. With that many problems at hand, we wouldn't have to worry as much about finding a small handful of examples for use each semester, as every student would likely find something that attracted him or her, something to engage their minds deeply enough that they would learn math and science and computing just to be able to work on the problem that had grabbed hold of them.

The real problem engages the students, and its rich context makes learning something new worthwhile.
Whenever the context of the problem is so messy that distractions inhibit learning, we as instructors have to create boundaries that keep the problem real but let the students focus on what matters.

Having students work on real problems offers advantages beyond motivation. Remember the problem of attracting and retaining a wider population of students? Real problems may help us there, too. We geeks may like programming for its own sake, but not everyone who could enrich our discipline does. Whatever the natural interests and abilities of students are, different kinds of people seem to have different values that affect their choice of academic disciplines and jobs.

This may explain some of the difficulty that computer science has attracting and retaining women. A study at the University of Michigan found that women tend to value working with and for people more than men do, and that this value accounted at least in part for why women tend to choose math and CS careers less frequently: they perceive that math and CS are less directly about people than some other disciplines, even in the sciences. If women do not already have this perception, many of our CS1 and CS2 courses will give it to them right away. But working on real problems from real domains might send a different signal: computing provides an opportunity to work with other people in more ways than just about any other discipline!

I know that this is one of the reasons I so loved working in knowledge-based systems. I had a chance to work with interesting people from all over the spectrum of ideas: lawyers, accountants, scientists, practitioners, .... And in the meantime I had to study each new discipline in order to understand it well enough to help the people with whom I worked. It was a constant stream of new and interesting ideas!

I don't want you to think that no one is using real problems in their courses. Certainly a number of people are.
For example, check out Mark Guzdial's media computation approach to introductory computing courses. Media computation -- programs that manipulate images, sounds, and video -- seems like a natural way to go for students of this generation. I think an example of this sort would make a great starting point for my group's work at ChiliPLoP next week. But Mark's project is one of only a few big projects aimed in this direction. If we are to reach an Exploratorium-like 500 great CS 1 examples, then we all need to pitch in.

One downside for instructors is that working with real problems requires a lot of work up front. If I want to use the science of networks or genomics as my theme for a course, then I need to study the area myself well in advance of teaching the course. I have to design all new classroom examples -- and programming assignments, and exam questions. I will probably have to build support software to shield my students from gratuitous complexity in the domain and the programming tools that students use.

Another potential downside for someone at a small school is that an applied theme in your course may appeal to some students but not others, and your school can only teach one or a few sections of the course at any one time. This is where Alan Kay's idea of having a large laboratory of possibilities becomes so appealing.

This approach requires work, but my ChiliPLoP colleagues seem willing to take the plunge. I'll keep you posted on our efforts and results as they progress.
-----