For the last couple of days, I've been speaking with a prospective non-traditional student who is seeking advice about what to study for a second bachelor's degree. His indecision about careers and school is worthy of its own post; I've seen it in so many other 20- and 30-somethings who are unhappy with their first careers and not sure of where to go next.
He wants to get into web development. What should he study? Well, he thinks he wants to get into web development, but he isn't sure exactly what it is. This is a common issue for prospective CS majors coming out of high school; they rarely know just what computer science is. It is apparently also an issue with the more visible discipline of web development. People see the products, but they are not sure at all how they are created. Is this a problem for budding electrical engineers or civil engineers?
This young man asked me for precise definitions of the terms "web design" and "web development", especially clear distinctions between the two, and a map from the terms onto specific undergrad majors. I explained that these are fuzzy terms, like so many of our best words and phrases. That didn't satisfy him much, because the fuzziness won't easily wipe away his indecision.
He was asking the head of the Computer Science department for advice, so I told him what I think about web development, as fairly as I could, but with an unmistakable fondness for my discipline. On the one hand, there is presentation, the web pages themselves and how they appear to users within a browser. On the other, there is the technology that underlies the web, the code and data that make it all work. To be versatile in the web game, and ultimately great, a person should learn both.
We do not offer a program in web development at my university, so a student must go to the departments that teach the ideas and tools behind these facets of the web. I suggested graphic design with a little psychology for the presentation side, and computer science for the technology side. We have an excellent graphic design program here, and our CS program introduces students to programming and to the ideas of data structures, databases, algorithms, and information storage and retrieval that drive the web. We even teach an undergrad course in user interface design, which is the closest our major courses get to presentation.
A freshman coming out of high school would do well to double-major in CS and graphic design. We have at least one such double-major now -- a dynamite student -- and a few others are thinking about taking this route. Majoring in both is a challenge, because it requires skills that are, on the surface, so unlike one another. I think the styles of thinking at the heart of these two disciplines are not so different after all. Besides, these majors don't just require skills; they help to develop them. It is a great double major.
However, most nontraditional students coming back to school are not interested in re-living the four- or five-year undergrad experience. They have lives to support, and often families, so they are usually focused on making a career change and getting on with it. As a result, one of these majors will have to do.
It seems to me that if you want to make a dent in the web world, you really want to know the technology and the ideas that make it work. Philip Greenspun teaches a software engineering course at MIT with web programming as its centerpiece, based in large part on the premise that this particular technology is so central to the experience of current CS grads when they get out in the world.
(I wonder what it would be like if I tried to teach Greenspun's course as my SE course?! My guess is that a lot of students would love it, a few would hate it, and the respective ratios for the faculty would be swapped.)
Not all students want to do the technology. I would rather students make this decision from a position of power -- knowing something about the technology -- than from a position of ignorance, but that's not always how the world works. Fine. Then the student can study graphic design and psychology and apply that knowledge to learning how to make web sites whose appearances sizzle. There is a shortage of great web sites out there, which tells me there may be a shortage of great web designers. Go for it.
In at least two of my exchanges with this young man, I ran into a concern that I expected to encounter. It is the fear that rises to the top of almost every neophyte's mind whenever he or she is confronted with the words "computer science": Will I have to do much programming? I think he was asking as much about the jobs he would get in web development as about the CS courses he would have to take to get a major here.
My part of these conversations can be summed up in a mantra I now keep close to my heart:
You don't have to program; you get to program.
We need to do something to change the default expectation young people have about programming. Seriously.
This particular student is a reasonable young man. He is college-educated and earnest about finding a good career. He is also a little afraid. In my recent experience, I have encountered many like him, and I'd like to help them as best I can. So, let's get back to the original question, "What should a person study in college if she wants to get into web development?" Faithful readers: What should I tell them? What am I missing?
The economics blog Marginal Revolution has an occasional series of posts called "Markets in Everything", in which the writers report examples of markets at work in various aspects of everyday life. I've considered doing something similar here with computing, as a way to document some concrete examples of computational thinking and -- gasp! -- computer programs playing a role in how we live, work, and play. Perhaps this will be a start.
Courtesy of Wicked Teacher of the West, I came across this story about NBA player Shane Battier, who stands out in an unusual way: by not standing out with his stats. A parallel theme of the story is how the NBA's Houston Rockets are using data and computer analysis in an effort to maximize their chances of victory. The connection to Battier is that the traditional statistics we associate with basketball -- points, rebounds, assists, blocked shots, and the like -- do not reflect his value. The Rockets think that Battier contributes far more to their chance of winning than his stat line shows.
The Rockets collect more detailed data about players and game situations, and Battier is able to use it to maximize his value. He has developed great instincts for the game, but he is an empiricist at heart:
The numbers either refute my thinking or support my thinking, and when there's any question, I trust the numbers. The numbers don't lie.
For an Indiana boy like myself, nothing could be more exciting than knowing that the Houston Rockets employ a head of basketball analytics. This sort of data analysis has long been popular among geeks who follow baseball, a game of discrete events in which the work of Bill James and like-minded statistician-fans of the American Pastime finds a natural home. I grew up a huge baseball fan and, like all boys my age, lived and died on the stats of my favorite players. But Indiana is basketball country, and basketball is my first and truest love. Combining hoops with computer science -- could there be a better job? There is at least one guy living the dream, in Houston.
I have written about the importance of solving real problems in CS courses, and many people are working to redefine introductory CS to put the concepts and skills we teach into context. Common themes include bioinformatics, economics, and media computation. Basketball may not be as important as sequencing the human genome, but it is real and it matters to enough people to support a major entertainment industry. If I were willing to satisfy my own guilty pleasures, I would design a CS 1 course around Hoosier Hysteria. Even if I don't, it's comforting to know that some people are beginning to use computer science to understand the game better.
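The kind of analysis the Rockets do begins with simple aggregation over play-by-play data. Here is a minimal sketch of one such statistic, an on/off-court point differential -- the kind of number that can reveal value a traditional stat line hides. The possession records and their layout are invented for illustration, not taken from any real data set:

```python
# Hypothetical possession records: (points scored, points allowed, player on court?)
possessions = [
    (2, 0, True), (0, 3, True), (2, 2, True),
    (0, 2, False), (3, 0, False), (2, 2, False),
]

def on_off_differential(possessions):
    """Net points per possession with the player on the court,
    minus net points per possession with him off."""
    on = [(f, a) for f, a, present in possessions if present]
    off = [(f, a) for f, a, present in possessions if not present]
    net = lambda rows: sum(f - a for f, a in rows) / len(rows)
    return net(on) - net(off)

# With this invented data, the differential comes out negative:
# the team fares worse per possession with the player on the floor.
print(on_off_differential(possessions))
```

A real analytics shop would adjust for teammates, opponents, and game situation, but even this toy version shows how a computational view of the game differs from counting points and rebounds.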
Many runners spend too much time thinking about the beginning and the end of a race, and little or no time focusing on the middle. In a 5K, that can work all right, because it's almost like a sustained sprint. You worry about not going out too fast in the first mile, and you think about kicking for the last mile or half mile, and there isn't much else. But in a longer race, especially a marathon, the middle is the bulk of the experience. Even if you think of the beginning as 5 miles long and the end as the 10K that starts at the 20-mile marker, fifteen miles remain in between. A lot of runners don't have much of a plan for that part of the race and so go on autopilot. That's too bad, because the middle often determines how fast I run the whole race and, more importantly, how I feel about my race when I'm done.
If you let the middle of a marathon fill your mind with thoughts about being done, it can make you miserable. Many runners think about their time while running; I know I do. "I'm on a pace for 3:45. My goal was 3:35. This is already turning out wrong..." As sports psychologist Jeff Simons says,
Time is what happens after you cross the line. Changing your time is all that happens while you are in the process of running. That's where the focus needs to be.
Treating the middle as valuable in its own right is one way to get the mind off being done and back where it should be: right where you are on the course.
I think this is also true of marathon training plans. Beginners and experienced runners alike start off excited and meticulous about their training in weeks one and two and three; they also know how important it is to pay careful attention to how they taper in the last few weeks leading up to the race. But the plan is twelve, sixteen, or even twenty weeks long. There is a lot of middle... How attentive I am during those weeks to my mileage, pace, diet, and body has as much effect on race day as my taper, maybe more.
Given where I am mileage- and health-wise, I am treating all the parts of my marathon training this year with care -- especially the middle. I have designed a plan that consists of a slow, meticulous build-up of mileage. I'm putting more emphasis on the middle weeks, during which I hope to prepare my body for long runs again with a series of 12-, 14- and 16-milers before getting to the heavy stuff -- 18- and 20-milers -- toward the end of my middle. I am consciously delaying speed until August, well into the middle of my training, by which time I hope my body has adapted to more stress than it's been able to withstand for the last many months. In my current state, the biggest contributor to my time will be stamina at the slower paces. Once I have that in hand, I'll worry about taking my speed up a notch, to see whether my body is ready for that set of stresses.
On Sunday morning, I ran my first 14-miler since 2007, when I was training for Marine Corps. It went very well, much better than the twelve miles I ran the week before. The body adapts in fits and starts, and it's always interesting to see how a long run will feel. This one left me feeling energized, though I did take a good nap in the afternoon! I feel ready for this week, which aims to let my body consolidate the gains of the last three weeks and so culminates in a 12-miler. Here's hoping that my body cooperates and lets me stay on a path to higher mileage throughout the rest of the middle.
I often find myself at meetings with administrator types, where I am the only person who has spent much time in a classroom teaching university students, at least recently. When talk turns to teaching, I sometimes feel like a pregnant woman at an ob-gyn convention. They may be part of a university and may even have taught a workshop or specialty course, but they don't usually know what it's like to design and teach several courses at a time, over many semesters in a row. That doesn't always stop them from having deeply-held and strong opinions about how to teach. Having been a student once isn't enough to know what it's like to teach.
I have had similar experiences as a university professor among professional software developers. All have been students and have learned something about teaching and learning from their time in school. But their lessons are usually biased by their own experiences. I was a student once, too, but that prepared me only a bit for being a teacher... There are so many different kinds of people in a classroom, so many different kinds of student. Not very many are just like me. (Thank goodness!)
Some software developers have taught. Many give half- or full-day tutorials at conferences. Others teach week-long courses on specific topics. A few have even taught a university course. Even so, I often don't feel like we are talking about the same animal when we talk about teaching. Teaching a training course for professional developers is not the same as teaching a university course to freshmen or even seniors. Teaching an upper-division or graduate seminar bears little resemblance to teaching an elective course for juniors, let alone a class of non-majors. Even teaching such a course as an adjunct can't deliver quite the same experience as teaching a full load of courses across the spectrum of our discipline, day in and day out for a few years in a row. The principle of sustainable pace pops up in a new context.
As with administrators, lack of direct experience doesn't always stop developers from having deeply-held and strong opinions about what we instructors should be doing in the classroom. It creates an interesting dialectic.
That said, I try to learn whatever I can from the developers with whom I'm able to discuss teaching, courses, content, and curricula. One of the reasons I so enjoy PLoP, ChiliPLoP, and OOPSLA is having an opportunity to meet reflective individuals who have thought deeply about their own experiences as students and who are willing to offer advice on how I can do better. But I do try to step back from their advice and put it into the context of what it's like to teach in a real university, not one we've invented. Some ideas sound marvelous in the abstract but die a grisly death on the banks of daily university life. Revolution is easy when the battlefield is in our heads.
When it comes to working with software developers, I am more concerned that they will feel like the pregnant woman when they discuss their area of expertise with me and my university colleagues. One of my goals is not to be "that guy" when talking about software development with developers. I prefer to speak from personal and professional experience, rather than from a theory I read in a book or a blog, or something another professor told me.
What we teach needs to have some connection to what developers do and what our students will need to do when they graduate. There is a lot more to a CS degree than just cutting code, but when we do talk about building software, we should be as accurate and as useful as we can be. This makes teaching a course like software engineering a bigger challenge for most CS profs than the more theoretical or foundational material such as algorithms or even programming languages.
One prescription is the same as above: I listen and try to learn whatever I can from developers when we talk about building software. Conferences like PLoP, ChiliPLoP, and OOPSLA give me opportunities I would not have otherwise, and I listen to alumni tell me about what they do -- and how -- whenever I can. I still have to sift what I learn into the context of the university, but it makes for great raw material.
Another prescription is to write code and use the tools and practices that people in industry use. Staying on top of that fast-moving game gets harder all the time. The world of software is alive and changing. We professors have to keep at it. Mostly, it's a matter of us staying alive, too.
Update: I've added a link to a known use.
(A pattern I've seen in running that applies more broadly.)
You are developing a physical skill or an ability that will take you well beyond your current level of performance. Perhaps you are a non-runner preparing for a 5K, or a casual runner training for a marathon, or an experienced runner coming back from a layoff.
To succeed, you will need endurance, the ability to perform steadily over a long period. You will also need strength, the ability to perform at a higher speed or with greater power over a shorter period of time. Endurance enables you to last for an amount of time longer than usual. It requires you to develop your slow-twitch muscles and your aerobic capacity, which depends on effective delivery of oxygen to your muscles. Strength enables you to work faster or harder, such as uphill or against an irregular force. It requires you to develop your fast-twitch muscles and your anaerobic capacity, which depends on muscles working effectively in the absence of oxygen.
You might try to develop strength first. Strength training involves many repetitions of intense work done for short durations. When you are beginning your training, you can handle short durations more easily than long ones. The high intensity will be uncomfortable, but it won't last for long. This does not work very well. First, you won't be able to work at an intense enough level to train your muscles properly, which means that your training sessions will not be as effective as you'd hope. Second, because your muscles are still relatively weak, subjecting them to intense work even for short periods greatly increases the risk of injury.
You might try to develop strength and endurance in parallel. This is a strategy commonly tried by people who are in a hurry to reach a specific level of performance. You do longer periods of low-intensity work on some days and longer periods of high-intensity work on others. This strengthens both your slow- and fast-twitch muscles and allows you to piggyback growth in one area on top of growth in the other. Unfortunately, this does not work well, either. There is a small decrease in the risk of injury from your strength training, but not as much as you might think. Our bodies adapt to change rather slowly, which means that your muscles don't grow stronger fast enough to prepare them for the intensity of strength training. Even when you don't injure yourself, you increase the risk of plateauing or fatigue.
Therefore, build a strong aerobic base first. Train for several weeks or even months at a relatively low level of intensity, resting occasionally to give your body a chance to adapt. This will build endurance, with slower speed or less power than you might want, but also strengthen your muscles, joints, and bones. Only then add to your regimen exercises that build your anaerobic capacity through many repetitions of high-intensity, short-duration activities. These will draw on the core strength developed earlier.
Continue to do workouts focused on endurance. These will give your body a chance to recover from the higher intensity workouts and time to adapt to those stresses in the form of more speed or power. For all but the most serious athletes, one or two strength workouts a week are sufficient. Doing more increases the risk of injury, fatigue, or loss of interest in training. As in so many endeavors, steady, regular practice tends to be much more valuable than occasional or spotty practice. This is especially true when the goal requires a long period of preparation, such as a marathon.
Examples. Every training program I have ever seen for runners, from 5Ks up to marathons, emphasizes the need for a strong aerobic base before worrying about speed or other forms of power. This is especially true for beginners. Some beginners are eager to improve quickly and often don't realize how hard training can be on their bodies. Others fear that if they are not working "hard enough", they are not making progress. Low-intensity endurance training does work your body hard enough, just not in short bursts that make you strain.
Greg McMillan describes the usual form this pattern takes, as well as a variation, in "Time To Rethink Your Marathon Training Program?". In its most common form, a runner first builds aerobic base, then works on strength in the form of hills and tempo runs, and finally works on speed. Gabriele Rosa, whom McMillan calls "arguably the world's greatest marathon coach", structures his training programs differently. He still starts his athletes with a significant period building aerobic base (Lengthen), followed by a period that develops anaerobic capability (Strengthen). But he starts the anaerobic phase with short track workouts that develop the runner's speed, down to 200m intervals, and only then has the runner move to strength workouts. Rosa's insight is that "the goal in marathon training is to fatigue the athlete with the duration of the workouts and not the speed, so speed needed to be developed first". This variation may not work well for runners coming back from injury or otherwise prone to injury, because the speed workouts stress the body in a more extreme way than the tempo and cruise workouts of longer-distance strength work.
Lengthen, then Strengthen applies even to more experienced runners coming back from periods of little or no training. Many such runners assume that they can quickly return to the level they were at before the layoff, but the body will have adapted to the lower level of exertion and will require retraining. Elite athletes returning from injury usually take several months to build their aerobic base before resuming hard training regimens.
I have written this pattern from the perspective of running, but it applies to other physical activities, too, such as biking and swimming. The risk of injury in some sports is lower than in running, due to less load on muscles, joints, and bones, but the principles of endurance and strength are the same.
Related Ideas. I think this pattern is also present in some forms of learning. For example, it is useful to build attention span and vocabulary when learning a new discipline before trying to build critical skills or deep expertise. The gentler form of learning provides a base of knowledge that is required for expert analysis or synthesis.
I realize that this application of the pattern is speculative. If you have any thoughts about it, or the pattern more generally, please let me know.
As I set out on my first 12-miler since running a half-marathon in May, I could not help recalling Barney's First Law of Running. Just keep running. It is dark, and the miles lie formidably ahead, but you conquer them in the simplest of ways: keep running.
A couple of weeks ago I began to think about training for a fall marathon. If I could run a full post-race week, maybe I was ready to try. Well, I have now run two full weeks, for the first time since a strong three-week stretch in April. Last week was my highest-mileage week -- 32 even -- since April 7-13, 2008, when I put in 33.5 miles. I have been fatigued, but I have managed to run each planned running day.
These have been strange days, indeed. In the span of five days, I ran my fastest 5-miler in recent times, ran a negative split 12-miler that started oppressively slow and finished reasonably, ran my slowest recorded 5-miler ever, and turned around the next day to shave 4 seconds off of last Friday's fast 5-miler.
The two fast 5-mile runs were on the track, my first real forays on the track since coming down last May with whatever ails me. One day on the track is a healthy practice for me mentally, because it helps me think of pace and speed in a way that longer-form runs outdoors don't. I'm not running "fast" yet, just faster. That I have been healthy enough to do three in the last week and a half is a positive sign, even if they have sapped me more than I would expect.
I have thrown in one cross-training twist. Since I began training for races quite a few years ago, I have tended to neglect stretching and other basic exercise. I was getting plenty of work on the road. My wife has recently started doing Classical Stretch, which is a perfect fit for her, because she danced a lot of ballet growing up. I've been doing a workout or two with her each night. Wow. This is what in the modern running world is called a core workout. It focuses on the usual body parts, such as the abs and hamstrings, but also on infrastructure like the back and hips. The athletic workouts are tough. I never realized how hard a workout one could get without using weights or other resistance. This could be good for getting me back closer to marathon condition.
The last time I started on a training plan for a marathon was July 30, 2007. That week, I ran 43 miles, including three 7-milers and a 5x800m track workout. I am nowhere near ready for that yet. I was already in great shape and had trained hard for a half-marathon earlier in the year. That plan required only twelve weeks, though, so I have some hope for getting ready to run an October race. It will be slower, less aggressive, but no less challenging, given where I am right now.
My October travel schedule complicates picking a race. Right now, I am giving most consideration to Indianapolis on October 17 and Mason City, Iowa, on October 25. Indianapolis would repeat the destination of my half earlier this year, but it would be all-new as a run; the half was downtown and on the west side, and the full is on the northeast side of town. This is a big city (373.1 square miles), so a repeat would not seem like one. Running there twice in one year, after growing up there and never racing there at all, would be ironic. The Mason City race is run by a local school system as a fundraiser and has a longer history than most anyone knows. The experience there would be the polar opposite of Chicago, my first marathon. It would require a different sort of mental preparation: fewer runners, much smaller crowds, and a lot more solitude. Can I be ready for that?
The cost, looser registration deadlines, and later race date have me leaning toward Mason City. The prospect of running a bigger race, on a day I'll already be in central Indiana, makes Indianapolis attractive. I need to decide by the end of the month for early registration, if nothing else, but more important are getting my training schedule in order and beginning to prepare my mind for the rigor of training.
Seth Godin says:
People will not pay for by-the-book rewrites of news that belongs to all of us. People will not pay for yesterday's news, driven to our house, delivered a day late, static, without connection or comments or relevance. Why should we?
Universities may not be subject to the same threats as newspapers.
But Godin's quote ought to cause a few university professors considerable uneasiness. In the many years since I began attending college as an undergrad, I have seen courses at every level and at every stop that fall under the terms of this rebuke.
After writing that two UNI CS grads had recently defended their doctoral dissertations, I heard about the possibility of a third. Turns out it was more than a possibility... Last Friday, Chris Johnson defended and has since submitted the final version of his dissertation to the University of Tennessee. His work is in the area of scientific visualization, with a focus on computation-intensive simulations. For the last few years, Chris has been working out of Ames, Iowa, and we may be lucky enough to have him remain close by.
The summer bonanza grows. Congratulations, Chris!
Weak developers will move heaven and earth to do the wrong thing. You can't limit the damage they do by giving them less powerful tools. They'll just swing the blunt tools harder.
I still agree with the sentiment, as well as the bigger point: Give people powerful tools, teach them what you can, and let them create. The best programmers will do amazing things; the rest will be none the worse off. If you help to create a culture that values learning and encourages responsibility for others, you may even find some of the weak growing into amazing programmers, too.
In trying to understand the role patterns and pattern languages play both in developing software and in learning to develop software, I often look for different angles from which to view patterns. I've written about the idea of patterns as descriptive grammar and the idea of patterns as a source of freedom in design. Both still seem useful to me as perspectives on patterns, and the latter is among the most-read articles on my blog. The notion of patterns-as-grammar also relates closely to one of the most commonly cited roles that patterns play for the developer or learner: that of a vocabulary for describing the meaningful components of a program.
This weekend, I read Brian Hayes's instructive article on compressive sensing, The Best Bits. Hayes talks about how it is becoming possible to imagine that digital cameras and audio recorders could record compressed streams -- say, a 10-megapixel camera storing a 3MB photo directly rather than recording 30MB and then compressing it after the fact. The technique he calls compressive sensing is a beautiful application of some straightforward mathematics and a touch of algorithmic thinking. I highly recommend it.
While reading this article, though, the idea of patterns as vocabulary came to mind in a new way, triggered initially by this passage:
... every kind of signal that people find meaningful has a sparse representation in some domain. This is really just another way of saying that a meaningful signal must have some structure or regularity; it's not a mere jumble of random bits.
Programs are meaningful signals and have structure and regularity beyond the jumble of seemingly random characters at the level of the programming language. The chasm between random language stuff and high-level structure is most obvious when working with beginners. They have to learn that structure can exist and that there are tools for creating it. But I think developers face this chasm all the time, too, whenever they dive into a new language, a new library, or a new framework. Where is the structure? Knowing it is there and seeing it are two different matters.
The idea of a sparse representation is fundamental to compression. We have to find the domain in which a signal, whether image or sound, can be represented in as few bits as possible while losing little or even none of the signal's information. A pattern language of programs does the same thing for a family of programs. It operates at a level (in Hayes' terms, in a domain) at which the signal of the program can be represented sparsely. By describing Java's I/O stream library as a set of decorators on a set of concrete streams, we convey a huge amount of information in very few words. That's compression. If we say nothing else, we have a lossy compression, in that we won't be able to reconstruct the library accurately from the sparse representation. But if we use more patterns to describe the library (such as Abstract Class and "Throw, Don't Catch"), we get a representation that pretty accurately captures the structure of the library, if not the bit-by-bit code that implements it.
This struck me as a useful way to think about what patterns do for us. If you've seen other descriptions of patterns as a means for compression, I'd love to hear from you.
What a summer for UNI CS alumni in academia! In the last few weeks, Andrew Drenner and Ryan Dixon both defended their Ph.D. dissertations, at the University of Minnesota and UC-Santa Barbara, respectively. Andrew is currently working with a robotics start-up spun off from his research lab, and Ryan is enjoying a short break before starting full-time at Apple next month.
I had the great fortune to work with Andrew and Ryan throughout their undergraduate years, in several courses and projects each. Some students are different from the rest, and these guys distinguished themselves immediately not only by their creativity and diligence but also by their tremendous curiosity. When a person with deep curiosity also has the desire to work hard to find answers, stand back. It is neat to see them both expanding what we know about topics they were working on as undergrads. Indeed, Ryan's project at Apple is very much in the spirit of his undergrad research project, which I was honored to supervise.
Congratulations, gentlemen! Many of your friends and family may think that this means you are no longer students. But you are really joining a new fraternity of students. We are honored to have you among us.
Sketchbooks are not about being a good artist,
they're about being a good thinker.
-- Jason Santa Maria
Five years ago today, I started this blog as a sort of sketchbook for words and ideas. I didn't know just what to expect, so I'm not surprised that it hasn't turned out as I might have guessed. Thinking out loud and writing things down can be like that. Trying to explain to myself and anyone who would listen what was happening as I lived life as a computer scientist and teacher has been a lot of fun.
In the beginning, I was preparing to teach a course on agile software development and planning to run my second marathon. These topics grew together in my mind, almost symbiotically, and the result was a lot of connections. The connections were made firmer by writing about them. They also gave me my first steady readership, as occasionally someone would share a link with a friend.
Things have changed since 2004. Blogging was an established practice in a certain core demographic but just ready to break out among the masses. Now, many of the bloggers whose work I cherished reading back then don't write as much as they used to. Newer tools such as Twitter give people a way to share links and aphorisms, and many people seem to live in the Twittersphere now. Fortunately, a lot of people still take the time to share their ideas in longer form.
Even though I go through stretches where I don't write much, my blog has become an almost essential element of how I go about life now. Yesterday's entry is a great example of me writing to synthesize experience in a way I might not otherwise. I had a few thoughts running around my head. They were mostly unrelated but somehow... they wanted to be connected. So I started writing, and ended up somewhere I may not have taken the time to go if I hadn't had to write complete sentences and say things in a way my friends would understand. That is good for me. For you readers? I hope so. A few of you keep coming back.
Five years down the road, I am no longer surprised by how computer science, writing, and running flow together. First, they are all a part of who I am right now, and our minds love to make connections. But then there is something common to all activities that challenge us. With the right spirit, we find that they drive us to seek excellence, and the pursuit of excellence -- whether we are Roger Federer, reaching the highest of heights, or Lance Armstrong, striving to reach those heights yet again, or just a simple CS professor trying to reach his own local max -- is a singular experience.
If you run regularly, you train your mind to cut through or ignore your resistance. You just do it. And in the middle of the run, you love it. When you come to the end, you never want to stop. And you stop, hungry for the next time. That's how writing is, too.
The more I blog, the more I want to write. And, in the face of some troubles over the last year, I wake up hungry to run.
A few years ago, I read a passage from Brian Marick that I tucked away for July 9, 2009:
I've often said that I dread the day when I look back on the me of five years ago without finding his naivete and misconceptions faintly ridiculous. When that day comes, I'll know I've become an impediment to progress.
Just last month, Brian quick-blogged on the same theme: continuing to grow enough that the me of five years ago looks naive, or stepping away from the stage. After five years blogging, my feeling on this is mixed. I look back and see some naivete, yes, but I often see some great stuff. "I thought that?" Sometimes I'm disappointed that a great idea from back then hasn't become more ingrained in my practices of today, but then I remember that it's a lot easier to think an idea than to live it. I do see progress, though. I also see new themes emerging in my thoughts and writing, which is a different sort of progress altogether.
I do take seriously that you are reading this and that you may even make an effort to come back to read more later. I am privileged to have had so many interactions with readers over these five years. Even when you don't send comments and links, I know you are there, spending a little of your precious time here.
So I think I'll stay on this stage a while longer. I am just a guy trying to evolve, and writing helps me along the way.
I recently ran across a link to a Dan Bricklin article from a few years ago, Why Johnny can't program. (I love the web!) Bricklin discusses some of the practical reasons why more people don't program. As he points out, it's not so much that people can't program as that they won't or choose not to program. Why? Because the task of writing code in a textual language isn't fun for everyone. What Bricklin calls "typed statement" programming fails all of Don Norman's principles of good design: visibility, good conceptual model, good mappings, and full and continuous feedback. Other programming models do better on these counts -- spreadsheets, rule-based expert system shells, WYSIWYG editors in which users generate HTML through direct manipulation -- and reach a wider audience. Martin Fowler recently talked about this style, calling it illustrative programming.
I had an agile moment as I read this paragraph from Bricklin about why debugging is hard:
One of the problems with "typed-statement" systems is that even though each statement has an effect, you only see the final result. It is often unclear which statement (or interaction of statements) caused a particular problem. With a "Forms" or direct manipulation system, the granularity is often such that each input change has a corresponding result change.
When we write unit tests for our code at about the same time we write the code, we improve our programming experience by creating intermediate results that help us to debug. But there's more. Writing tests helps us to construct a conceptual model of the program we are writing. They make visible the intended state of the program, and help us to map objects and functions in the code onto the behavior of the program at run-time. When we take small steps and run our tests frequently, they give us full and continuous feedback about the state of our program. Best of all, this understanding is recorded in the tests, which are themselves code!
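A tiny example shows what "intermediate results" means here. The function and tests below are invented for illustration; each small assertion makes a piece of the program's intended behavior visible and checkable on its own:

```python
# A minimal sketch of tests as visible intermediate results.
# The function and its tests are hypothetical.

def word_count(text):
    """Count words separated by whitespace."""
    return len(text.split())

# Each tiny test is an inspectable checkpoint while we build the program:
assert word_count("") == 0
assert word_count("one") == 1
assert word_count("two words") == 2
print("all tests pass")
```

Run after every small step, these checkpoints give the full and continuous feedback Norman asks for.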
In some ways, test-driven programming may improve on styles where we don't type statements. By writing tests, we participate actively in creating the model of our program. We are not simply passive users of someone else's constraint engine or inference engine. We construct our understanding as we construct our program.
Then again, some people don't need or want to write the reasoning component, so we need to provide access to tools they can use to be productive. Spreadsheets did that for regular folks. Scripting languages do it for programmers. Some people complain about scripting languages because they lack type safety, hide details, and are too slow. But the fact is that programmers are people, too, and they want tools that put them into a flow. They want languages that hit them in their sweet spot.
From all this you can see that the way a system requires an author to enter instructions into the computer affects the likelihood of acceptance by regular people. The more constrained the instructions the better. The more the instructions are clearly tied to various results the better. The more obvious and quickly the results may be examined the better.
TDD does all this, and more. It makes professional programmers more productive by providing better cognitive support for mental tasks we have to perform anyway. If we use TDD properly as we teach people to program, perhaps it can help us hit the sweet spot for more people, even in a "typed statement" environment.
Last week, I gave a talk on careers in computing to thirty or so high school kids in a math and science program on campus this summer. Because it's hard to make sense out of computing careers if one doesn't even know what computer science is, I started off with half an hour or so talking about CS. Part of that was distinguishing between discovering things, creating things, and studying things.
At the end, we had time for the usual question-and-answer session. The first question came from a young man who had looked quite uninterested throughout the talk: What is the most important thing you have discovered or invented?
Who says kids don't pay attention?
The Age of Fire
Yesterday, I took my laptop with me to do advising at freshmen orientation. It allows me to grab course enrollment data off the university web site (processed, but raw enough), rather than look at the print-outs the advising folks provide every morning. With that data and little more than grep and sorting on columns, I can find courses for my students much more easily than thumbing back and forth in the print-outs. And the results are certainly of a higher quality than my thumbing would give.
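The filter-and-sort itself is nothing fancy. Here is a hypothetical sketch in Python, over invented enrollment rows, of the same operation I do with grep and column sorts:

```python
# Hypothetical sketch: filter courses with open seats, most seats first.
# The course data below is invented for illustration.

rows = [
    ("CS 1510", "Intro to Computing", 12),
    ("CS 1800", "Discrete Structures", 0),
    ("CS 2530", "Intermediate Computing", 5),
]

open_courses = sorted((r for r in rows if r[2] > 0),
                      key=lambda r: r[2], reverse=True)
for num, title, seats in open_courses:
    print(num, seats)
```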
The looks on the other advisors' faces at our table made me think of how a group of prehistoric men must have looked when one of their compatriots struck two rocks together to make fire.
Computer Science's Dirty Little Secret
An alumnus sent me a link to an MSNBC article about Kodu, a framework for building Xbox-like games aimed at nine-year-olds.
I like how Matthew MacLaurin, lead developer, thinks:
MacLaurin ... says he hopes it doesn't just teach programming, but teaches us to appreciate programming as a modern art form.
The piece talks about "the growing importance of user-generated content in gaming" and how most people assume "that all of the creativity in video games takes place in the graphics and art side of the gaming studios, while the programming gets done by a bunch of math guys toiling over dry code." Author Winda Benedetti writes (emphasis added):
I had asked [McLaurin] if [Kodu] was like putting chocolate on broccoli -- a means of tricking kids into thinking the complex world of programming was actually fun.
But he insists that's not the case at all.
"It's teaching them that it was chocolate the whole time, it just looked like a piece of broccoli," he explains. "We're really saying that programming is the most fun part of creating games because of the way it surprises you. You do something really simple, and you get something really complex and cool coming back at you."
Programming isn't our dirty little secret. It is a shining achievement.
I am still amazed when lay people respond to me using a computer to solve daily problems, as if I have brought a computation machine from the future. Shocking! Yes, I actually use it to compute. The fact that people are surprised even when a computer scientist uses it that way should help us keep in mind just how little people understand what computer science is and what we can do with it.
Have an answer to the question, "What is the most important thing you have made?" ready at hand, and suitable for different audiences. When someone asks, that is the moment when you might be able to change a mind.
Quoted by Harry Lewis in Excellence Without a Soul:
A liberal education is what remains after you have forgotten the facts that were first learned while becoming educated.
-- Jorge Dominguez
I think this applies not only to a liberal education broadly construed but also to specialized areas of study -- and even to a "narrow" technical field such as computer science. What is left five or ten years from now will be the education our students have received. Students may not remember the intricacies of writing an equals method in Java. I won't mind one bit. What will they remember? This is the true test of the courses we create and of the curricula we design. Let's set our sights high enough to hit the target we seek.
Lately I've been trying to swear off scare quotes and other writing affectations. I use them above with sincere intention. Computer science is not as narrow as most people think. Students usually think it is, and so do many of their parents. I hope that what we teach and do alleviates this misconception. Sadly, too often those of us who study computer science -- and teach it -- think of the discipline too narrowly. We may not preach it that way, but we often practice it so.
With good courses, a good curriculum, and a little luck, students may even remember some of their CS education. I enjoyed reading how people like Tim O'Reilly have been formed by elements of their classical education. How are we forming our students in the spirit of a classical CS education? If any discipline needs to teach enduring truths, it is ours! The details disappear with every new chip, every new OS, every new software trend.
What is most likely to remain from our stints in school are habits. Sure, CS students must take with them some facts and truths: trade-offs matter; in some situations, the constant dominates the polynomial; all useful programming languages have primitives, means for combining them, and means for abstracting away detail. Yes, facts matter, but our nature is tied to its habits. I said last time that publishing the data I collect and use would be a good habit because habits direct how we think. I am a pragmatist in the strong sense that knowledge is habit of thought. Habit of action creates habit of thought. Knowledge is not the only value born in habit. As Aristotle taught us,
Excellence is an art won by training and habituation. We do not act rightly because we have virtue or excellence, but rather we have those because we have acted rightly.
Even an old CS student can remember some of his liberal arts education...
Finally, we will do well to remember that students learn as much or more from the example we set as from what we say in the classroom, or even in our one-on-one mentoring. All the more reason to create habits of action we don't mind having our students imitate.
Note. Someone might read Excellence Without a Soul and think that Harry Lewis is a classicist or a humanities scholar. He is a computer scientist, who just happened to spend eight years as Dean of Harvard College. Dominguez, whom Lewis quotes, is a political science professor at Harvard, but he claims to be paraphrasing Alfred North Whitehead -- a logician and mathematician -- in the snippet above. Those narrow technical guys...
My favorite Lewis book is, in fact, a computer science book, Elements of the Theory of Computation, which I mentioned here a while back. I learned theory of computation from that book -- as well as a lot of basic discrete math, because my undergrad CS program didn't require a discrete course. Often, we learn well enough what we need to learn when we need it. Elements remains one of my favorite CS books ever.
As I mentioned last time, this week I am getting back to some regular work after mostly wrapping up a big project, including cleaning off my desk. It is cluttered with a lot of loose paper that the Digital Age had promised to eliminate. Some is my own fault, paper copies of notes and agendas I should probably find a way not to print. Old habits die hard.
But I also have a lot of paper sent to me as department head. Print-outs; old-style print-outs from a mainframe. The only thing missing from a 1980s flashback is the green bar paper.
Some of these print-outs are actually quite interesting. One set is of grade distribution reports produced by the registrar's office, which show how many students earned As, Bs, and so on in each course we offered this spring and for each instructor who taught a course in our department. This sort of data can be used to understand enrollment figures and maybe even performance in later courses. Some upper administrators have suggested using this data in anonymous form as a subtle form of peer pressure, so that profs who are outliers within a course might self-correct their own distributions. I'm not ready to think about going there yet, but the raw data seems useful, and interesting in its own right.
I might want to do more with the data. This is the first time I recall receiving this, but in the fall it would be interesting to cross-reference the grade distributions by course and instructor. Do the students who start intro CS in the fall tend to earn different grades than those who start in the spring? Are there trends we can see over falls, springs, or whole years? My colleagues and I have sometimes wondered aloud about such things, but having a concrete example of the data in hand has opened new possibilities in my mind. (A typical user am I...)
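That cross-reference would be a short script. Here is a hypothetical sketch over invented grade records, tallying distributions by semester for comparison:

```python
# Hypothetical sketch: tally intro CS grade distributions by semester.
# The records below are invented for illustration.

from collections import Counter

grades = [
    ("fall",   "A"), ("fall",   "B"), ("fall",   "C"),
    ("spring", "A"), ("spring", "A"), ("spring", "B"),
]

by_term = {}
for term, grade in grades:
    by_term.setdefault(term, Counter())[grade] += 1

for term in sorted(by_term):
    print(term, dict(sorted(by_term[term].items())))
```

With real registrar data in a comma-separated file, the same few lines would answer the fall-versus-spring question directly.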
As a programmer, I have the ability to do such analyses with relatively straightforward scripts, but I can't. The data is closed. I don't receive actual data from the registrar's office; I receive a print-out of one view of the data, determined by people in that office. Sadly, this data is mostly closed even to them, because they are working with an ancient mainframe database system for which there is no support and a diminishing amount of corporate memory here on campus. The university is in the process of implementing a new student information system, which should help solve some of these problems. I don't imagine that people across campus will have much access to this data, though. That's not the usual M.O. for universities.
Course enrollment and grade data aren't the only ones we could benefit from opening up a bit. As a part of the big project I just wrapped up, the task force I was on collected a massive amount of data about expenditures on campus. This data is accessible to many administrators on campus, but only through a web interface that constrains interaction pretty tightly. Now that we have collected the data, processed almost all of it by hand (the roughness of the data made automated processing an unattractive alternative), and tabulated it for analysis, we are starting to receive requests for our spreadsheets from others on campus. These folks all have access to the data, just not in the cleaned-up, organized format into which we massaged it. I expressed frustration with our financial system in a mini-rant a few years ago, and other users feel similar limitations.
For me, having enrollment and grade data would be so cool. We could convert data into information that we could then use to inform scheduling, teaching assignments, and the like. Universities are inherently information-based institutions, but we don't always put our own understanding of the world into practice very well. Constrained resources and intellectual inertia slow us down or stop us altogether.
Hence my wistful hope while reading Tim Bray's "Hello-World" for Open Data. Vancouver has a great idea:
- Publish the data in a usable form.
- License it in a way that turns people loose to do whatever they want, but doesn't create unreasonable liability risk for the city.
- See what happens. ...
Would anyone on campus take advantage? Maybe, maybe not. I can imagine some interesting mash-ups using only university data, let alone linking to external data. But this isn't likely to happen. GPA data and instructor data are closely guarded by departments and instructors, and throwing light on it would upset enough people that any benefits would probably be shouted down. But perhaps some subset of the data the university maintains, suitably anonymized, could be opened up. If nothing else, transparency sometimes helps to promote trust.
I should probably do this myself, at the department level, with data related to schedule, budget, and so on. I occasionally share the spreadsheets I build with the faculty, so they can see the information I use to make decisions. This spring, we even discussed opening up the historic formula used in the department to allocate our version of merit pay.
(What a system that is -- so complicated that I've feared making more than small editorial changes to it in my time as head. I keep hoping to find the time and energy to build something meaningful from scratch, but that never happens. And it turns out that most faculty are happy with what we have now, perhaps for "the devil you know" reasons.)
I doubt even the CS faculty in my department would care to have open data of this form. We are a small crew, and they are busy with the business of teaching and research. It is my job to serve them by taking as much of this thinking out of their way as I can. Then again, who knows for sure until we try? If the cost of sharing can be made low enough, I'll have no reason not to share. But whether anyone uses the data might not even be the real point. Habits change when we change them, when we take the time to create new ones to replace the old ones. This would be a good habit for me to have.
I've been buried in a big project on campus for the last few months. Yesterday, we delivered our report to the president. Ah, time to breathe, heading into a holiday weekend! Of course, next week I'll get back to my regular work. Department stuff. Cleaning my desk. And thinking about teaching software engineering this fall.
A bit of side reading found via my Twitter friends has me thinking about testing, and the role it will play in the course. In the old-style software engineering course, testing is a "stage" in the "process", which betrays a waterfall view of the world even when the instructor and textbook say that they encourage iterative development. But testing usually doesn't get much attention in such courses, maybe one chapter that describes the theory of testing and a few of the kinds of testing we need to do.
It seems to me that testing can take a bigger place in the course, if only because it exemplifies the sort of empiricism that we should all engage in as software developers. When we test, we run experiments to gather evidence that our program works as specified. We should adopt a similar mindset about how we build our programs. How do we know that our design is a good one? Or that our team is functioning well? Or that we are investing enough time and energy in writing tests and refactoring our code?
That's one reason I like Joakim Karlsson's post about the principle of locality in code changes. There may be ways that he can improve his analysis, but the most important thing about this post is that he analyzed code at all. He had a question about how code edits work, so he wrote a program to ask subversion repositories for the answer. That's so much better than assuming that his own hypothesis was correct, or that conventional wisdom was.
In regard to the testing process itself, Michael Feathers wrote a piece on "canalizing" design that points out a flaw in how we usually test our code. We write tests that are independent of one another in principle but that our test engines always run in the same order. This inadvertent ordering creates an opportunity for programmers to write code that takes advantage of the implicit relationship between tests. But it's not really an advantage at all, because we then have dependencies in our code that we may not be aware of and which should not exist at all. Feathers suggests putting the tests in a set data structure and executing them from there. At least then the code makes explicit that there is no implied order to the tests, which reminds the programmers who modify the code later that they should not depend on the order of test execution.
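The suggestion is easy to picture in code. This is a hypothetical sketch, with invented test functions, of storing tests in a set so the structure itself declares that no ordering exists:

```python
# A sketch of tests stored in a set: the data structure itself says
# there is no implied execution order. The tests here are invented.

def test_add():
    assert 2 + 2 == 4

def test_upper():
    assert "hi".upper() == "HI"

def test_len():
    assert len([1, 2, 3]) == 3

# A set, not a list: readers cannot depend on any test running first.
tests = {test_add, test_upper, test_len}

for test in tests:
    test()
print(f"{len(tests)} tests ran, order unspecified")
```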
(I also like this idea for its suggestion that programs can and often should be dynamic structures, not dead sequences of text. Using a set of tests also moves us a step closer to making our code work well in a parallel environment. Explicit and implicit sequencing in programs makes it hard to employ the full power of multicore systems, and we need to re-think how we structure our programs if we want to break away from purely sequential machines. The languages guy in me sees some interesting applications of this idea in how we write our compilers.)
Finally, I enjoyed reading Gojko Adzic's description of Keith Braithwaite's "TDD as if you mean it" exercise. Like the programming challenges I have described, it asks developers to take an idea to its extreme to break out of habits and to learn just how the idea feels and what it can give. Using tests to drive the writing of code is more different from what most of us do than we usually realize. This exercise can help you to see just how different -- if you have an exercise leader like Keith to keep you honest.
However, I disagree with something Keith said in response to a comment about the relationship between TDD and functional programming:
I'm firmly convinced that sincere TDD leads one towards a functional style.
TDD will drive you toward the style of the language in which you think.
There will be functional components to your solution to support the tests, and some good OOP has a functional feel. But in my experience you can end up with very nice objects in an object-oriented program as a result of faithfully-executed TDD.
Another of Braithwaite's comments saved the day, though. He credits Alan Watts for this line that captures his intent in designing exercises like this:
I came not as a salesman but as an entertainer. I want you to enjoy these ideas because I enjoy them.
Love this! He has a scholar's heart.
There is a lot more to testing than unit tests or regression testing. Finding ways to introduce students to the full set of ideas while also giving them a visceral sense of testing in the trenches is a challenge. I have to teach enough to prepare a general audience and also prepare students who will go on to take our follow-up course, Software Testing. That's a course that undergraduates at most schools don't have the opportunity to take, a strong point of our program. But that course can't be an excuse not to do testing well in the software engineering course. It's not a backstop; it's a new ballgame.