December 18, 2014 2:33 PM

Technical Problems Are The Easy Ones

Perhaps amid the daily tribulations of a software project, Steven Baker writes

Oy. A moving goal line, with a skeleton crew, on a shoestring budget. Technical problems are the easy ones.

And here we all sit complaining about monads and Java web frameworks...

My big project this semester has not been developing software but teaching beginners to develop software, in our intro course. There is more to Intro than programming, but for many students the tasks of learning a language and trying to write programs comes to dominate most everything else. More on that soon.

Yet even with this different sort of project, I feel much as Baker does. Freshmen have a lot of habits to learn and un-learn, habits that go well beyond how they do Python. My course competes with several others for the students' attention, not to mention with their jobs and their lives outside of school. They come to class with a lifetime of experience and knowledge, as well some surprising gaps in what they know. A few are a little scared by college, and many worry that CS won't be a good fit for them.

The technical problems really are the easy ones.

Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

December 14, 2014 9:38 AM

Social Media, Baseball, and Addiction

In a recent interview with Rolling Stone, rock star Geddy Lee revealed himself as a fan of baseball, but not of social media:

Geddy Lee isn't a big fan of social media. "I sometimes look on Twitter to follow baseball transactions," he says. "But that's it. I'm also not on Facebook or anything. I see it as an addiction and I have enough addictions. God knows I pick up my phone enough to check baseball scores."

As a baseball fan without a smart phone, I am in no rush to judge. I don't need more addictions, either.

The recently-concluded winter baseball meetings likely kept Lee as busy following transactions as they kept me, with several big trades and free agent signings. My Reds and Tigers both made moves that affect expectations for 2015.

Pitchers and catchers report in a little over two months. Lee and I will be checking scores again soon enough.

Posted by Eugene Wallingford | Permalink | Categories: General, Photos

December 02, 2014 2:53 PM

Other People's Best Interests

Yesterday I read:

It's hard for me to figure out people voting against their own self-interests.

I'm not linking to the source, because it wouldn't be fair to single the speaker out, especially when so many other things in the article are spot-on. Besides, I hear many different people express this sentiment from time time, people of various political backgrounds and cultural experiences. It seems a natural human reaction when things don't turn out the way we think they should.

Here is something I've learned from teaching and from working with teams doing research and writing software:

If you find yourself often thinking that people aren't acting in their own self-interests, maybe you don't know what their interests are.

It certainly may be true that people are not acting in what you think is their own self-interest. But it's rather presumptuous to think that you other people's best interest better than they do.

Whenever I find myself in this position, I have some work to do. I need to get to know my students, or my colleagues, or my fellow citizens, better. In cases where it's really true, and I have strong reason to think they aren't acting in their own best interest, I have an opportunity to help them learn. This kind of conversation calls for great care, though, because often we are dealing with people's identities and most deeply-held believes.

Posted by Eugene Wallingford | Permalink | Categories: General, Managing and Leading, Teaching and Learning

December 01, 2014 2:05 PM

When YAGNI Becomes "Now We Need It"

The Codist told the story of DeltaGraph last week. It must be quite a feeling to have code you wrote twenty-five years ago still running in a commercial product today. Of course, such longevity creates challenges for the maintainers. The longer a program lives, the more likely that a feature that you thought you'd never need becomes desirable, even necessary.

Like a Windows version, circa 1989:

When I started the UI code I asked two questions of Deltapoint (1) will this code ever be ported to Windows (2) do you want to support multiple documents open at a time. The answer to both was no. Windows was still too primitive to care about and multiple open documents was fairly uncommon.

Within a few years, "You Aren't Gonna Need It" turned into "Now We Need It":

With 3.0 they actually began to port it to Windows 3.1, and it was an immense pain to do. I had spent no time trying to worry about cross platform issues as I had asked up front and been told no. Of course Windows was now becoming important. The port was never very stable as a result and I think it actually became a separate code base, making new features hard to do.

This is one of the balancing essential acts of writing software. Operating with a YAGNI mindset is generally a prudent way to ensure we don't write code that is unnecessarily complex and hard to work with. Yet we have to make our programs open to the changes that will inevitably follow. And some changes are easier to make if we have built hooks and fulcrums into the code.

So, we strive for ways not to build the things we don't need now, but make it possible to accommodate them later. For me, this has always been the great promise of object-oriented programming. I try to think that way even when I'm working in languages that don't support the style directly, but I'm not good enough yet. That only helps me admire the accomplishments of teams like the one that created DeltaGraph even more.

Posted by Eugene Wallingford | Permalink | Categories: Software Development

November 25, 2014 1:43 PM

Concrete Play Trumps All

Areschenko-Johannessen, Bundesliga 2006-2007
One of the lessons taught by the computer is that concrete play trumps all.

This comment appeared in the review of a book of chess analysis [ paywalled ]. The reviewer is taking the author to task for talking about the positional factors that give one player "a stable advantage" in a particular position, when a commercially-available chess program shows the other player can equalize easily, and perhaps even gain an advantage.

It is also a fitting comment on our relationship with computers these days more generally. In areas such as search and language translation, Google helped us see that conventional wisdom can often be upended by a lot of data and many processors. In AI, statistical techniques and neural networks solve problems in ways that models of human cognition cannot. Everywhere we turn, it seems, big data and powerful computers are helping us to redefine our understanding of the world.

We humans need not lose all hope, though. There is still room for building models of the world and using them to reason, just as there is room for human analysis of chess games. In chess, computer analysis is pushing grandmasters to think differently about the game. The result is a different kind of understanding for the more ordinary of us, too. We just have to be careful to check our abstract understanding against computer analysis. Concrete play trumps all, and it tests our hypotheses. That's good science, and good thinking.


(The chess position is from Areschenko-Johannessen 2006-2007, used as an example in Chess Training for Post-Beginners by Yaroslav Srokovski and cited in John Hartmann's review of the book in the November 2014 issue of Chess Life.)

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

November 23, 2014 8:50 AM

Supply, Demand, and K-12 CS

When I meet with prospective students and their parents, we often end up discussing why most high schools don't teach computer science. I tell them that, when I started as a new prof here, about a quarter of incoming freshmen had taken a year of programming in high school, and many other students had had the opportunity to do so. My colleagues and I figured that this percentage would go way up, so we began to think about how we might structure our first-year courses when most or all students already knew how to program.

However, the percentage of incoming students with programming experience didn't go up. It went way down. These days, about 10% of our freshman know how to program when they start our intro course. Many of those learned what they know on their own. What happened, today's parents ask?

A lot of things happened, including the dot-com bubble, a drop in the supply of available teachers, a narrowing of the high school curriculum in many districts, and the introduction of high-stakes testing. I'm not sure how much each contributed to the change, or whether other factors may have played a bigger role. Whatever the causes, the result is that our intro course still expects no previous programming experience.

Yesterday, I saw a post by a K-12 teacher on the Racket users mailing list that illustrates the powerful pull of economics. He is leaving teaching for software development industry, though reluctantly. "The thing I will miss the most," he says, "is the enjoyment I get out of seeing youngsters' brains come to life." He also loves seeing them succeed in the careers that knowing how to program makes possible. But in that success lies the seed of his own career change:

Speaking of my students working in the field, I simply grew too tired of hearing about their salaries which, with a couple of years experience, was typically twice what I was earning with 25+ years of experience. Ultimately that just became too much to take.

He notes that college professors probably know the feeling, too. The pull must be much stronger on him and his colleagues, though; college CS professors are generally paid much better than K-12 teachers. A love of teaching can go only so far. At one level, we should probably be surprised that anyone who knows how to program well enough to teach thirteen- or seventeen-year-olds to do it stays in the schools. If not surprised, we should at least be deeply appreciative of the people who do.

Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

November 20, 2014 3:23 PM

When I Procrastinate, I Write Code

I procrastinated one day with my intro students in mind. This is the bedtime story I told them as a result. Yes, I know that I can write shorter Python code to do this. They are intro students, after all.


Once upon a time, a buddy of mine, Chad, sent out a tweet. Chad is a physics prof, and he was procrastinating. How many people would I need to have in class, he wondered, to have a 50-50 chance that my class roster will contain people whose last names start with every letter of the alphabet?


This is a lot like the old trivia about how we only need to have 23 people in the room to have a 50-50 chance that two people share a birthday. The math for calculating that is straightforward enough, once you know it. But last names are much more unevenly distributed across the alphabet than birthdays are across the days of the year. To do this right, we need to know rough percentages for each letter of the alphabet.

I can procrastinate, too. So I surfed over to the US Census Bureau, rummaged around for a while, and finally found a page on Frequently Occurring Surnames from the Census 2000. It provides a little summary information and then links to a couple of data files, including a spreadsheet of data on all surnames that occurred at least 100 times in the 2000 census. This should, I figure, cover enough of the US population to give us a reasonable picture of how peoples' last names are distributed across the alphabet. So I grabbed it.

(We live in a wonderful time. Between open government, open research, and open source projects, we have access to so much cool data!)

The spreadsheet has columns with these headers:

    name,rank,count,prop100k,cum_prop100k,      \
                    pctwhite,pctblack,pctapi,   \

The first and third columns are what we want. After thirteen weeks, we know how to do compute the percentages we need: Use the running total pattern to count the number of people whose name starts with 'a', 'b', ..., 'z', as well as how many people there are altogether. Then loop through our collection of letter counts and compute the percentages.

Now, how should we represent the data in our program? We need twenty-six counters for the letter counts, and one more for the overall total. We could make twenty-seven unique variables, but then our program would be so-o-o-o-o-o long, and tedious to write. We can do better.

For the letter counts, we might use a list, where slot 0 holds a's count, slot 1 holds b's count, and so one, through slot 25, which holds z's count. But then we would have to translate letters into slots, and back, which would make our code harder to write. It would also make our data harder to inspect directly.

    ----  ----  ----  ...  ----  ----  ----    slots in the list

0 1 2 ... 23 24 25 indices into the list

The downside of this approach is that lists are indexed by integer values, while we are working with letters. Python has another kind of data structure that solves just this problem, the dictionary. A dictionary maps keys onto values. The keys and values can be of just about any data type. What we want to do is map letters (characters) onto numbers of people (integers):

    ----  ----  ----  ...  ----  ----  ----    slots in the dictionary

'a' 'b' 'c' ... 'x' 'y' 'z' indices into the dictionary

With this new tool in hand, we are ready to solve our problem. First, we build a dictionary of counters, initialized to 0.

    count_all_names = 0
    total_names = {}
    for letter in 'abcdefghijklmnopqrstuvwxyz':
        total_names[letter] = 0

(Note two bits of syntax here. We use {} for dictionary literals, and we use the familiar [] for accessing entries in the dictionary.)

Next, we loop through the file and update the running total for corresponding letter, as well as the counter of all names.

    source = open('app_c.csv', 'r')
    for entry in source:
        field  = entry.split(',')        # split the line
        name   = field[0].lower()        # pull out lowercase name
        letter = name[0]                 # grab its first character
        count  = int( field[2] )         # pull out number of people
        total_names[letter] += count     # update letter counter
        count_all_names     += count     # update global counter

Finally, we print the letter → count pairs.

    for (letter, count_for_letter) in total_names.items():
        print(letter, '->', count_for_letter/count_all_names)

(Note the items method for dictionaries. It returns a collection of key/value tuples. Recall that tuples are simply immutable lists.)

We have converted the data file into the percentages we need.

    q -> 0.002206197888442366
    c -> 0.07694634659082318
    h -> 0.0726864447688946
    f -> 0.03450702533438715
    x -> 0.0002412718532764804
    k -> 0.03294646311104032

(The entries are not printed in alphabetical order. Can you find out why?)

I dumped the output to a text file and used Unix's built-in sort to create my final result. I tweet Chad, Here are your percentages. You do the math.

Hey, I'm a programmer. When I procrastinate, I write code.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

November 17, 2014 3:34 PM

Empathy for Teachers Doing Research

I appreciate that there are different kinds of university, with different foci. Still, I couldn't help but laugh while reading Doug McKee's Balancing Research and Teaching at an Elite University. After acknowledging that he would "like to see the bar for teaching raised at least a little bit", McKee reminds us why they can't raise it too far: Faculty who spend too much time on their teaching will produce lower-quality research, and this will cause the prestige of the university to suffer. But students should know what's going on:

I'd also like to see more honesty about the primacy of research in the promotion process. This would make the expectations of the undergraduates more reasonable and give them more empathy for the faculty (especially the junior faculty).

Reading this, I laughed out loud, as my first thought was, "You're kidding, right?" I'm all for more empathy for professors; teaching is a tough job. But I doubt "My teaching isn't very good because I spend all my time on research, and the university is happy with that." is going to elicit much sympathy.

Then again, I may be seeing the world from the perspective of a different sort of university. Many students are at elite universities in large part for the prestige of the school. Faculty research is what maintains the university's prestige. So maybe those students view subpar classroom teaching simply as part of the cost of doing business.

Then there was this:

Undergraduates may not always get a great teacher in the classroom, but they are always learning from someone at the cutting edge of their discipline, and there is no substitute for that.

Actually, there is. It's called good teaching. Being talked at by an all-research, no-teaching Nobel laureate for fifty minutes a day, three days a week, may score high on the prestige meter, but it won't teach you how to think. Then again, with high enough admission standards, perhaps the students can take care of themselves.

I don't mean to sound harsh. Many research professors are also excellent teachers, and in some courses, being at the forefront of the discipline's knowledge is a great advantage. And I do feel empathy for faculty who find themselves in a position where the reward structure effectively forces them to devote little or no time to their teaching. Teaching well takes time.

But let's also keep in mind that those same faculty members chose to work at an elite university and, unlike many of their their students, the faculty know what that means for the balance between research and teaching in their careers.

Over the last decade or two, funding for public universities in most states has fallen dramatically compared to the cost of instruction. I hope state legislatures eventually remember that great teaching takes time, and take that into account when they are allocating resources among the institutions that focus on research and those that focus on students.

Posted by Eugene Wallingford | Permalink | Categories: Teaching and Learning

November 11, 2014 7:53 AM

The Internet Era in One Sentence

I just love this:

When a 14-year-old kid can blow up your business in his spare time, not because he hates you but because he loves you, then you have a problem.

Clay Shirky attributes it to Gordy Thompson, who managed internet services at the New York Times in the early 1990s. Back then, it was insightful prognostication; today, it serves as an epitaph for many an old business model.

Are 14-year-old kids making YouTube videos to replace me yet?

Posted by Eugene Wallingford | Permalink | Categories: Computing, General

November 09, 2014 5:40 PM

Storytelling as a Source of Insight

Last week, someone tweeted a link to Learning to Learn, a decade-old blog entry by Chris Sells. His lesson is two-fold. To build a foundation of knowledge, he asks "why?" a lot. That is a good tactic in many settings.

He learned a method for gaining deeper insight by accident. He was tabbed to teach a five-day course. When he reached Day 5, he realized that he didn't know the material well enough -- he didn't know what to say. The only person who could take over for him was out of town. So he hyperventilated for a while, then decided to fake it.

That's when learning happened:

As I was giving the slides that morning on COM structured storage (a particularly nasty topic in COM), I found myself learning how it it worked as I told the story to my audience. All the studying and experimentation I'd done had given me the foundation, but not the insights. The insights I gained as I explained the technology. It was like fireworks. I went from through those topics in a state of bliss as all of what was really going on underneath exploded in my head. That moment taught me the value of teaching as a way of learning that I've used since. Now, I always explain stuff as a way to gain insight into whatever I'm struggling with, whether it's through speaking or writing. For me, and for lots of others, story telling is where real insight comes from.

I have been teaching long enough to know that it doesn't always go this smoothly. Often, when I don't know what to say, I don't do a very good job. Occasionally, I fail spectacularly. But it happens surprisingly often just as Sells describes. If we have a base of knowledge, the words come, and in explaining an idea for the first time -- or the hundredth -- we come to understand it in a new, deeper way. Sometimes, we write to learn. Other times, we teach.

We can help our students benefit from this phenomenon, too. Of course, we ask them to write. But we can go further with in-class activities in which students discuss topics and explain them to one another. Important cognitive processing happens when students explain a concept that doesn't happen when they study on our own.

I think the teach-to-learn phenomenon is at play in the "why?" tactic we use to learn in the first place. The answer to a why is an reason, an explanation. Asking "why?" is the beginning of telling stories to ourselves.

Story telling is, indeed, a source of deeper insight.

Posted by Eugene Wallingford | Permalink | Categories: Teaching and Learning