November 25, 2014 1:43 PM

Concrete Play Trumps All

Areschenko-Johannessen, Bundesliga 2006-2007
One of the lessons taught by the computer is that concrete play trumps all.

This comment appeared in the review of a book of chess analysis [ paywalled ]. The reviewer is taking the author to task for talking about the positional factors that give one player "a stable advantage" in a particular position, when a commercially-available chess program shows the other player can equalize easily, and perhaps even gain an advantage.

It is also a fitting comment on our relationship with computers these days more generally. In areas such as search and language translation, Google helped us see that conventional wisdom can often be upended by a lot of data and many processors. In AI, statistical techniques and neural networks solve problems in ways that models of human cognition cannot. Everywhere we turn, it seems, big data and powerful computers are helping us to redefine our understanding of the world.

We humans need not lose all hope, though. There is still room for building models of the world and using them to reason, just as there is room for human analysis of chess games. In chess, computer analysis is pushing grandmasters to think differently about the game. The result is a different kind of understanding for the more ordinary of us, too. We just have to be careful to check our abstract understanding against computer analysis. Concrete play trumps all, and it tests our hypotheses. That's good science, and good thinking.

~~~~

(The chess position is from Areschenko-Johannessen 2006-2007, used as an example in Chess Training for Post-Beginners by Yaroslav Srokovski and cited in John Hartmann's review of the book in the November 2014 issue of Chess Life.)


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

November 23, 2014 8:50 AM

Supply, Demand, and K-12 CS

When I meet with prospective students and their parents, we often end up discussing why most high schools don't teach computer science. I tell them that, when I started as a new prof here, about a quarter of incoming freshmen had taken a year of programming in high school, and many other students had had the opportunity to do so. My colleagues and I figured that this percentage would go way up, so we began to think about how we might structure our first-year courses when most or all students already knew how to program.

However, the percentage of incoming students with programming experience didn't go up. It went way down. These days, about 10% of our freshmen know how to program when they start our intro course. Many of those learned what they know on their own. What happened? today's parents ask.

A lot of things happened, including the dot-com bubble, a drop in the supply of available teachers, a narrowing of the high school curriculum in many districts, and the introduction of high-stakes testing. I'm not sure how much each contributed to the change, or whether other factors may have played a bigger role. Whatever the causes, the result is that our intro course still expects no previous programming experience.

Yesterday, I saw a post by a K-12 teacher on the Racket users mailing list that illustrates the powerful pull of economics. He is leaving teaching for the software development industry, though reluctantly. "The thing I will miss the most," he says, "is the enjoyment I get out of seeing youngsters' brains come to life." He also loves seeing them succeed in the careers that knowing how to program makes possible. But in that success lies the seed of his own career change:

Speaking of my students working in the field, I simply grew too tired of hearing about their salaries which, with a couple of years experience, was typically twice what I was earning with 25+ years of experience. Ultimately that just became too much to take.

He notes that college professors probably know the feeling, too. The pull must be much stronger on him and his colleagues, though; college CS professors are generally paid much better than K-12 teachers. A love of teaching can go only so far. At one level, we should probably be surprised that anyone who knows how to program well enough to teach thirteen- or seventeen-year-olds to do it stays in the schools. If not surprised, we should at least be deeply appreciative of the people who do.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General, Teaching and Learning

November 20, 2014 3:23 PM

When I Procrastinate, I Write Code

I procrastinated one day with my intro students in mind. This is the bedtime story I told them as a result. Yes, I know that I can write shorter Python code to do this. They are intro students, after all.

~~~~~

Once upon a time, a buddy of mine, Chad, sent out a tweet. Chad is a physics prof, and he was procrastinating. How many people would I need to have in class, he wondered, to have a 50-50 chance that my class roster will contain people whose last names start with every letter of the alphabet?

    Adams
    Brown
    Connor
    ...
    Young
    Zielinski

This is a lot like the old trivia about how we only need to have 23 people in the room to have a 50-50 chance that two people share a birthday. The math for calculating that is straightforward enough, once you know it. But last names are much more unevenly distributed across the alphabet than birthdays are across the days of the year. To do this right, we need to know rough percentages for each letter of the alphabet.
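For the curious, the birthday math can be sketched in a few lines of Python. This is a minimal sketch that assumes birthdays are spread evenly over 365 days; the function name `shared_birthday_probability` is made up for illustration.

```python
def shared_birthday_probability(people, days=365):
    """Chance that at least two of `people` share a birthday,
    assuming birthdays are spread evenly over `days` days."""
    all_different = 1.0
    for already_seen in range(people):
        # the next person must avoid every birthday seen so far
        all_different *= (days - already_seen) / days
    return 1 - all_different

# with 23 people, the chance comes out just above one half
print(shared_birthday_probability(23))
```

The trick is to compute the chance that everyone's birthday is different and subtract it from 1. The same trick does not carry over directly to last names, because the letters are not equally likely.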

I can procrastinate, too. So I surfed over to the US Census Bureau, rummaged around for a while, and finally found a page on Frequently Occurring Surnames from the Census 2000. It provides a little summary information and then links to a couple of data files, including a spreadsheet of data on all surnames that occurred at least 100 times in the 2000 census. This should, I figure, cover enough of the US population to give us a reasonable picture of how people's last names are distributed across the alphabet. So I grabbed it.

(We live in a wonderful time. Between open government, open research, and open source projects, we have access to so much cool data!)

The spreadsheet has columns with these headers:

    name,rank,count,prop100k,cum_prop100k,      \
                    pctwhite,pctblack,pctapi,   \
                    pctaian,pct2prace,pcthispanic

The first and third columns are what we want. After thirteen weeks, we know how to compute the percentages we need: Use the running total pattern to count the number of people whose name starts with 'a', 'b', ..., 'z', as well as how many people there are altogether. Then loop through our collection of letter counts and compute the percentages.

Now, how should we represent the data in our program? We need twenty-six counters for the letter counts, and one more for the overall total. We could make twenty-seven unique variables, but then our program would be so-o-o-o-o-o long, and tedious to write. We can do better.

For the letter counts, we might use a list, where slot 0 holds a's count, slot 1 holds b's count, and so on, through slot 25, which holds z's count. But then we would have to translate letters into slots, and back, which would make our code harder to write. It would also make our data harder to inspect directly.

    ----  ----  ----  ...  ----  ----  ----    slots in the list
     0     1     2    ...   23    24    25     indices into the list
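To see what translating letters into slots and back would involve, here is a rough sketch. The helper names `slot_for` and `letter_for` are hypothetical, and the short list of names is a made-up sample, not census data.

```python
def slot_for(letter):
    """Translate 'a'..'z' into a list index 0..25."""
    return ord(letter) - ord('a')

def letter_for(slot):
    """Translate a list index 0..25 back into 'a'..'z'."""
    return chr(slot + ord('a'))

letter_counts = [0] * 26          # one counter per letter, all starting at 0

# made-up sample of lowercase names, for illustration only
for name in ['adams', 'brown', 'baker', 'zielinski']:
    letter_counts[slot_for(name[0])] += 1

print(letter_for(1), '->', letter_counts[1])
```

It works, but every read and write of a counter has to pass through a translation step.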

The downside of this approach is that lists are indexed by integer values, while we are working with letters. Python has another kind of data structure that solves just this problem, the dictionary. A dictionary maps keys onto values. The keys and values can be of just about any data type. What we want to do is map letters (characters) onto numbers of people (integers):

    ----  ----  ----  ...  ----  ----  ----    slots in the dictionary
    'a'   'b'   'c'   ...  'x'   'y'   'z'     indices into the dictionary

With this new tool in hand, we are ready to solve our problem. First, we build a dictionary of counters, initialized to 0.

    count_all_names = 0
    total_names = {}
    for letter in 'abcdefghijklmnopqrstuvwxyz':
        total_names[letter] = 0

(Note two bits of syntax here. We use {} for dictionary literals, and we use the familiar [] for accessing entries in the dictionary.)
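For readers who already know a little more Python, the same dictionary of counters can probably be built without an explicit loop. This is just an alternative sketch using the standard library's `string.ascii_lowercase` and `dict.fromkeys`; the loop above is easier to follow for a first encounter with dictionaries.

```python
import string

# dict.fromkeys builds a dictionary in which every key maps to the same value
total_names = dict.fromkeys(string.ascii_lowercase, 0)
count_all_names = 0

print(len(total_names), total_names['a'])
```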

Next, we loop through the file and update the running total for the corresponding letter, as well as the counter of all names.

    source = open('app_c.csv', 'r')
    for entry in source:
        field  = entry.split(',')        # split the line
        name   = field[0].lower()        # pull out lowercase name
        letter = name[0]                 # grab its first character
        count  = int( field[2] )         # pull out number of people
        total_names[letter] += count     # update letter counter
        count_all_names     += count     # update global counter
    source.close()
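One caveat: if the data file still begins with its row of column headers, `int( field[2] )` will choke on the word `count`. Here is a hedged variant that uses Python's `csv` module to read the header row and look up columns by name. It assumes the first line names `name` and `count` columns; the function name `count_by_first_letter` is hypothetical, and the sample rows are made up, not real census figures.

```python
import csv

def count_by_first_letter(lines):
    """Return (per-letter totals, overall total) for census-style rows.

    `lines` is any iterable of CSV text lines, such as an open file.
    Assumes the first line is a header naming 'name' and 'count' columns.
    """
    totals = {}
    overall = 0
    for row in csv.DictReader(lines):
        letter = row['name'][0].lower()                   # first character, lowercased
        count = int(row['count'])                         # number of people
        totals[letter] = totals.get(letter, 0) + count    # update letter counter
        overall += count                                  # update global counter
    return totals, overall

# made-up rows, for illustration only
sample = [
    'name,rank,count',
    'ADAMS,1,30',
    'BROWN,2,50',
    'BAKER,3,20',
]
totals, overall = count_by_first_letter(sample)
print(totals, overall)
```

To run it on the real data, pass an open file in place of `sample`. Opening the file in a `with` statement closes it automatically when the block ends.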

Finally, we print the letter → count pairs.

    for (letter, count_for_letter) in total_names.items():
        print(letter, '->', count_for_letter/count_all_names)

(Note the items method for dictionaries. It returns a collection of key/value tuples. Recall that tuples are simply immutable lists.)

We have converted the data file into the percentages we need.

    q -> 0.002206197888442366
    c -> 0.07694634659082318
    h -> 0.0726864447688946
    ...
    f -> 0.03450702533438715
    x -> 0.0002412718532764804
    k -> 0.03294646311104032

(The entries are not printed in alphabetical order. Can you find out why?)

I dumped the output to a text file and used Unix's built-in sort to create my final result. I tweet Chad, Here are your percentages. You do the math.
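If doing the math by hand sounds like work, a simulation is one way to estimate the answer instead. What follows is only a sketch: `chance_all_letters` and its parameters are hypothetical names chosen for illustration, and the two-letter alphabet is a toy with made-up proportions, not the census percentages.

```python
import random

def chance_all_letters(probabilities, class_size, trials=10000, seed=None):
    """Estimate the chance that a class of `class_size` people covers
    every key in `probabilities` (a dict of letter -> proportion)."""
    rng = random.Random(seed)
    letters = list(probabilities)
    weights = [probabilities[letter] for letter in letters]
    hits = 0
    for _ in range(trials):
        # draw one random roster of first letters, weighted by proportion
        roster = rng.choices(letters, weights=weights, k=class_size)
        if len(set(roster)) == len(letters):    # every letter showed up
            hits += 1
    return hits / trials

# toy two-letter alphabet with made-up, equal proportions
toy = {'a': 0.5, 'b': 0.5}
print(chance_all_letters(toy, class_size=2, seed=1))
```

For the toy alphabet, the exact answer is one half, so the estimate should land close to 0.5. Feeding in the real letter percentages and trying larger and larger class sizes would show roughly where the chance crosses 50-50.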

Hey, I'm a programmer. When I procrastinate, I write code.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

November 17, 2014 3:34 PM

Empathy for Teachers Doing Research

I appreciate that there are different kinds of university, with different foci. Still, I couldn't help but laugh while reading Doug McKee's Balancing Research and Teaching at an Elite University. After acknowledging that he would "like to see the bar for teaching raised at least a little bit", McKee reminds us why they can't raise it too far: Faculty who spend too much time on their teaching will produce lower-quality research, and this will cause the prestige of the university to suffer. But students should know what's going on:

I'd also like to see more honesty about the primacy of research in the promotion process. This would make the expectations of the undergraduates more reasonable and give them more empathy for the faculty (especially the junior faculty).

Reading this, I laughed out loud, as my first thought was, "You're kidding, right?" I'm all for more empathy for professors; teaching is a tough job. But I doubt "My teaching isn't very good because I spend all my time on research, and the university is happy with that." is going to elicit much sympathy.

Then again, I may be seeing the world from the perspective of a different sort of university. Many students are at elite universities in large part for the prestige of the school. Faculty research is what maintains the university's prestige. So maybe those students view subpar classroom teaching simply as part of the cost of doing business.

Then there was this:

Undergraduates may not always get a great teacher in the classroom, but they are always learning from someone at the cutting edge of their discipline, and there is no substitute for that.

Actually, there is. It's called good teaching. Being talked at by an all-research, no-teaching Nobel laureate for fifty minutes a day, three days a week, may score high on the prestige meter, but it won't teach you how to think. Then again, with high enough admission standards, perhaps the students can take care of themselves.

I don't mean to sound harsh. Many research professors are also excellent teachers, and in some courses, being at the forefront of the discipline's knowledge is a great advantage. And I do feel empathy for faculty who find themselves in a position where the reward structure effectively forces them to devote little or no time to their teaching. Teaching well takes time.

But let's also keep in mind that those same faculty members chose to work at an elite university and, unlike many of their students, the faculty know what that means for the balance between research and teaching in their careers.

Over the last decade or two, funding for public universities in most states has fallen dramatically compared to the cost of instruction. I hope state legislatures eventually remember that great teaching takes time, and take that into account when they are allocating resources among the institutions that focus on research and those that focus on students.


Posted by Eugene Wallingford | Permalink | Categories: Teaching and Learning

November 11, 2014 7:53 AM

The Internet Era in One Sentence

I just love this:

When a 14-year-old kid can blow up your business in his spare time, not because he hates you but because he loves you, then you have a problem.

Clay Shirky attributes it to Gordy Thompson, who managed internet services at the New York Times in the early 1990s. Back then, it was insightful prognostication; today, it serves as an epitaph for many an old business model.

Are 14-year-old kids making YouTube videos to replace me yet?


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

November 09, 2014 5:40 PM

Storytelling as a Source of Insight

Last week, someone tweeted a link to Learning to Learn, a decade-old blog entry by Chris Sells. His lesson is two-fold. To build a foundation of knowledge, he asks "why?" a lot. That is a good tactic in many settings.

He learned a method for gaining deeper insight by accident. He was tabbed to teach a five-day course. When he reached Day 5, he realized that he didn't know the material well enough -- he didn't know what to say. The only person who could take over for him was out of town. So he hyperventilated for a while, then decided to fake it.

That's when learning happened:

As I was giving the slides that morning on COM structured storage (a particularly nasty topic in COM), I found myself learning how it worked as I told the story to my audience. All the studying and experimentation I'd done had given me the foundation, but not the insights. The insights I gained as I explained the technology. It was like fireworks. I went from through those topics in a state of bliss as all of what was really going on underneath exploded in my head. That moment taught me the value of teaching as a way of learning that I've used since. Now, I always explain stuff as a way to gain insight into whatever I'm struggling with, whether it's through speaking or writing. For me, and for lots of others, story telling is where real insight comes from.

I have been teaching long enough to know that it doesn't always go this smoothly. Often, when I don't know what to say, I don't do a very good job. Occasionally, I fail spectacularly. But it happens surprisingly often just as Sells describes. If we have a base of knowledge, the words come, and in explaining an idea for the first time -- or the hundredth -- we come to understand it in a new, deeper way. Sometimes, we write to learn. Other times, we teach.

We can help our students benefit from this phenomenon, too. Of course, we ask them to write. But we can go further with in-class activities in which students discuss topics and explain them to one another. Important cognitive processing happens when students explain a concept that doesn't happen when they study on their own.

I think the teach-to-learn phenomenon is at play in the "why?" tactic we use to learn in the first place. The answer to a why is a reason, an explanation. Asking "why?" is the beginning of telling stories to ourselves.

Story telling is, indeed, a source of deeper insight.


Posted by Eugene Wallingford | Permalink | Categories: Teaching and Learning

November 07, 2014 2:12 PM

Three Students, Three Questions

In the lab, Student 1 asks a question about the loop variable on a Python for statement. My first thought is, "How can you not know that? We are in Week 11." I answer, he asks another question, and we talk some more. The conversation shows me that he has understood some ideas at a deeper level, but a little piece was missing. His question helped him build a more complete and accurate model of how programs work.

Before class, Student 2 asks a question about our current programming assignment. My first thought is, "Have you read the assignment? It answers your question." I answer, he asks another question, and we talk some more. The conversation shows me that he is thinking carefully about details of the assignment, but assignments at this level of detail are new to him. His question helped him learn a bit more about how to read a specification.

After class, Student 3 asks a question about our previous programming assignment. We had recently looked at my solution to the assignment and discussed design issues. "Your program is so clean and organized. My program is so ugly. How can I write better-looking programs?" He is already one of the better students in the course. We discuss the role of experience in writing clearly, and I explain that the best programs are often the result of revision and refactoring. They started out just good enough, and the author worked to make them better. The conversation shows me that he cares about the quality of his code, that elegance matters as much to him as correctness. His question keeps him moving along the path to becoming a good programmer.

Three students, three questions: all three are signs of good things to come. They also remind me that even questions which seem backward at first can point forward.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning

November 01, 2014 3:27 PM

Passion is a Heavy Burden

Mark Guzdial blogged this morning about the challenge of turning business teachers into CS teachers. Where is the passion? he asks.

These days, I wince every time I hear the word 'passion'. We apply it to so many things. We expect teachers to have passion for the courses they teach, students to have passion for the courses they take, and graduates to have passion for the jobs they do and the careers they build.

Passion is a heavy burden. In particular, I've seen it paralyze otherwise well-adjusted college students who think they need to try another major, because they don't feel a passion for the one they are currently studying. They don't realize that often passion comes later, after they master something, do it for a while, and come to appreciate it in ways they could never imagine before. I'm sure some of these students become alumni who are discontent with their careers, because they don't feel passion.

I think requiring all CS teachers to have a passion for CS sets the bar too high. It's an unrealistic expectation of prospective teachers and of the programs that prepare them.

We can survive without passionate teachers. We should set our sights on more realistic and relevant goals:

  • Teachers should be curious. They should have a desire to learn new things.
  • Teachers should be professional. They should have a desire to do their jobs well.
  • Teachers should be competent. They should be capable of doing their jobs well.

Curiosity is so much more important than passion for most people in most contexts. If you are curious, you will like encountering new ideas and learning new skills. That enjoyment will carry you a long way. It may even help you find your passion.

Perhaps we should set similarly realistic goals for our students, too. If they are curious, professional, and competent, they will most likely be successful -- and content, if not happy. We could all do worse.


Posted by Eugene Wallingford | Permalink | Categories: General, Personal, Teaching and Learning