January 08, 2021 2:12 PM

My Experience Storing an Entire Course Directory in Git

Last summer, I tried something new: I stored the entire directory of materials for my database course in Git. This included all of my code, the course website, and everything else. It worked well.

The idea came from a post or tweet many years ago by Martin Fowler who, if I recall correctly, had put his entire home directory under version control. It sounded like the potential advantages might be worth the cost, so I made a note to try it myself sometime. I wasn't quite ready last summer to go all the way, so I took a baby step by creating my new course directory as a git repo and growing it file by file.

My context is pretty simple. I do almost all of my work on a personal MacBook Pro or a university iMac in my office. My main challenge is to keep my files in sync. When I make changes to a small number of files, or when the stakes of a missing file are low, copying files by hand works fine, with low overhead and no tooling necessary.

When I make a lot of changes in a short period of time, however, as I sometimes do when writing code or building my website, doing things by hand becomes more work. And the stakes of losing code or web pages are a lot higher than losing track of some planning notes or code I've been noodling with. To solve this problem, for many years I have been using rsync and a couple of simple shell scripts to manage code directories and my course web sites.

So, the primary goal for using Git in a new workflow was to replace rsync. Not being a Git guru, as many of you are, I figured that this would also force me to live with git more often and perhaps expand my tool set of handy commands.

My workflow for the semester was quite simple. When I worked in the office, there were four steps:

  1. git merge laptop
  2. [ do some work ]
  3. git commit
  4. git push

On my laptop, the opening and closing git commands changed:

  1. git pull origin main
  2. [ do some work ]
  3. git commit
  4. git push origin laptop

My work on a course is usually pretty straightforward. The most common task is to create files and record information with commit. Every once in a while, I had to back up a step with checkout.

You may say, "But you are not using git for version control!" You would be correct. The few times I checked out an older version of a file, it was usually to eliminate a spurious conflict, say, a .DS_Store file that was out of sync. Locally, I don't need a lot of version control, but using Git this way was a form of distributed version control, making sure that, wherever I was working, I had the latest version of every file.

I think this is a perfectly valid way to use Git. In some ways, Git is the new Unix. It provided me with a distributed filesystem and a file backup system all in one. The git commands ran effectively as fast as their Unix counterparts. My repo was not very much bigger than the directory would have been on its own, and I always had a personal copy of the entire repo with me wherever I went, even if I had to use another computer.

Before I started, several people reminded me that Git doesn't always work well with large images and binaries. That didn't turn out to be much of a problem for me. I had a couple of each in the repo, but they were not large and never changed. I never noticed a performance hit.

The most annoying hiccup all semester was working with OS X's .DS_Store files, which record screen layout information for OS X. I like to keep my windows looking neat and occasionally reorganize a directory layout to reflect what I'm doing. Unfortunately, OS X seems to update these files at odd times, after I've closed a window and pushed changes. Suddenly the two repos would be out of sync only because one or more .DS_Store files had changed after the fact. The momentary obstacle was quickly eliminated with a checkout or two before merging. Perhaps I should have left the .DS_Stores untracked...

All in all, I was pretty happy with the experience. I used more git, more often, than ever before and thus am now a bit more fluent than I was. (I still avoid the hairier corners of the tool, as all right-thinking people do whenever possible.) Even more, the repository contains a complete record of my work for the semester, false starts included, with occasional ruminations about troubles with code or lecture notes in my commit messages. I had a little fun after the semester ended looking back over some of those messages and making note of particular pain points.

The experiment went well enough that I plan to track my spring course in Git, too. This will be a bigger test. I've been teaching programming languages for many years and have a large directory of files, both current and archival. Not only are there more files, there are several binaries and a few larger images. I'm trying decide if I should put the entire folder into git all at once upfront or start with an empty folder a lá last semester and add files as I want or need them. The latter would be more work at early stages of development but might be a good way to clear out the clutter that has built up over twenty years.

If you have any advice on that choice, or any other, please let me know by email or on Twitter. You all have taught me a lot over the years. I appreciate it.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

January 05, 2021 3:33 PM

Today's Reading

Lot's of good stuff on the exercise bike this morning...

Henry Rollins on making things because he must:

I'm a shipbuilder. I don't want to sail in them. I want you to sail in them. I'm just happy that they leave the harbor so I can have an empty workplace.

Deirdre Connolly on the wonder of human achievement:

We, ridiculous apes with big brains and the ability to cooperate, can listen to the universe shake from billions of light years away, because we learned math and engineering. Wild.

Sonya Mann on our ultimate task:

Our labor is the same as it ever was. Your job is to pioneer a resilient node in the network of civilization -- to dodge the punches, roll with the ones that you can't, and live to fight another day. That's what our ancestors did for us and it's what we'll do for those who come next: hold the line, explore when there's surplus, stay steady, and go down swinging when we have to.

Henry Rollins also said:

"What would a writer do in this situation?" I don't know, man. Ask one. And don't tell me what he said; I'm busy.

Back to work.


Posted by Eugene Wallingford | Permalink | Categories: General

January 03, 2021 5:08 PM

On the Tenth Day of Christmas...

... my daughter gave to me:

Christmas gifts from Sarah!

We celebrated Part 2 of our Zoom Family Christmas this morning. A package from one of our daughters arrived in the mail after Part 1 on Christmas Day, so we reprised our celebration during today's weekly call.

My daughter does not read my blog, at least not regularly, but she did search around there for evidence that I might already own these titles. Finding none, she ventured the long-distance gift. It was received with much joy.

I've known about I Am a Strange Loop for over a decade but have never read it. Somehow, Surfaces and Essences flew under my radar entirely. A book that is new to me!

These books will provide me many hours of enjoyment. Like Hofstadter's other books, they will probably bend my brain a bit and perhaps spark some welcome new activity.

~~~~~

Hofstadter appears in this blog most prominently in a set of entries I wrote after he visited my university in 2012:

I did mention I Am a Strange Loop in a later entry after all, a reflection on Alan Turing, representation, and universal machines. I'm glad that entry did not undermine my daughter's gift!


Posted by Eugene Wallingford | Permalink | Categories: General, Personal

January 01, 2021 12:01 PM

This Version of the Facts

The physicist Leo Szilard once announced to his friend Hans Bethe that he was thinking of keeping a diary: "I don't intend to publish it; I am merely going to record the facts for the information of God." "Don't you think God knows the facts?" Bethe asked. "Yes," said Szilard. "He knows the facts, but He does not know this version of the facts."

I began 2021 by starting to read Disturbing the Universe, Freeman Dyson's autobiographical attempt to explain to people who are not scientists what the human situation looks like to someone who is a scientist. The above passage opens the author's preface.

Szilard's motive seems like a pretty good reason to write a blog: to record the one's own version of the facts, for oneself and for the information of God. Unlike Szilard, we have an alternative in between publishing and not publishing. A blog is available for anyone to read, at almost no cost, but ultimately it is for the author, and maybe for God.

I've been using the long break between fall and spring semesters to strengthen my blogging muscle and redevelop my blogging habit. I hope to continue to write more regularly again in the coming year.

Dyson's book is a departure from my recent reading. During the tough fall semester, I found myself drawn to fiction, reading Franny and Zooey by J. D. Salinger, The Bell Jar by Sylvia Plath, The Lucky Ones by Rachel Cusk, and The Great Gatsby by F. Scott Fitzgerald, with occasional pages from André Gide's diary in the downtime between books.

I've written about my interactions with Cusk before [ Outline, Transit, Kudos ], so one of her novels is no surprise here, but what's with those classics from sixty years ago or more? These stories, told by deft and observant writers, seemed to soothe me. They took the edge off of the long days. Perhaps I could have seen a run of classic books coming... In the frustrating summer run-up to fall, I read Thomas Mann's Death in Venice and Ursula Le Guin's The Lathe of Heaven.

For some reason, yesterday I felt the urge to finally pick up Dyson's autobiography, which had been on my shelf for a few months. A couple of years ago, I read most of Dyson's memoir, Maker of Patterns, and found him an amiable and thoughtful writer. I even wrote a short post on one of his stories, in which Thomas Mann plays a key role. At the time, I said, "I've never read The Magic Mountain, or any Mann, for that matter. I will correct that soon. However, Mann will have to wait until I finish Dyson...". 2020 may have been a challenge in many ways, but it gave me at least two things: I read my first Mann (Death in Venice is much more approachable than The Magic Mountain...), and it returned me to Dyson.

Let's see where 2021 takes us.


Posted by Eugene Wallingford | Permalink | Categories: General, Personal

December 31, 2020 12:33 PM

Go Home and Study

In a conversation with Tyler Cowen, economist Garett Jones said:

... my job in the classroom is not to teach the details of any theory. My job is to give students a reason to feel passionate enough about the topic so that they'll go home for two or three hours and study it on their own.

Perhaps this is a matter of context, but I don't think this assetion is entirely accurate. It might give the wrong impression to the uninitiated by leaving an essential complementary task implicit.

One could read this as saying that the instructor's job is purely one of motivation. Closures! Rah-rah! Get students excited enough to go learn everything about them on the own, and the instructor has succeeded.

If you think that's true, then I can introduce you to many students who have struggled or failed to learn something new despite being excited to learn and putting in a lot of time. They were missing some prerequisite knowledge or didn't have the experience they needed to navigate the complexities of a new area of study. In principle, if they plugged away at it long enough, they would eventually get there, but then why bother having an instructor at all?

So I think that, as instructor, I have two jobs. I do need to motivate students to put in the time and effort they need to study. Learning happens inside the student, and that requires personal study. I also, though, have to help create the conditions under which they can succeed. This involves all sorts of things: giving them essential background, pointing them toward useful resources, helping them practice the skills they'll need to learn effectively, and so on.

Motivation is in some ways a necessary part of the instructor's job. If students don't want to invest time in study and practice, then they will not learn much. But motivation is not sufficient. The instructor must also put the student in position to succeed to learn effectively.


Posted by Eugene Wallingford | Permalink | Categories: Teaching and Learning

December 30, 2020 3:38 PM

That's a Shame

In the middle of an old post about computing an "impossible" integral, John Cook says:

In the artificial world of the calculus classroom, everything works out nicely. And this is a shame.

When I was a student, I probably took comfort in the fact that everything was supposed to work out nicely on the homework we did. There *was* a solution; I just had to find the pattern, or the key that turned the lock. I suspect that I was a good student largely because I was good at finding the patterns, the keys.

It wasn't until I got to grad school that things really changed, and even then course work was typically organized pretty neatly. Research in the lab was very different, of course, and that's where my old skills no longer served me so well.

In university programs in computer science, where many people first learn how to develop software, things tend to work out nicely. That is a shame, too. But it's a tough problem to solve.

In most courses, in particular introductory courses, we create assignments with that have "closed form" solutions, because we want students to practice a specific skill or learn a particular concept. Having a fixed target can be useful in achieving the desired outcomes, especially if we want to help students build confidence in their abilities.

It's important, though, that we eventually take off the training wheels and expose students to messier problems. That's where they have an opportunity to build other important skills they need for solving problems outside the classroom, which aren't designed by a benevolent instructor to have follow a pattern. As Cook says, neat problems can create a false impression that every problem has a simple solution.

Students who go on to use calculus for anything more than artificial homework problems may incorrectly assume they've done something wrong when they encounter an integral in the wild.

CS students need experience writing programs that solve messy problems. In more advanced courses, my colleagues and I all try to extend students' ability to solve less neatly-designed problems, with mixed results.

It's possible to design a coherent curriculum that exposes students to an increasingly messy set of problems, but I don't think many universities do this. One big problem is that doing so requires coordination across many courses, each of which has its own specific content outcomes. There's never enough time, it seems, to teach everything about, say, AI or databases, in the fifteen weeks available. It's easier to be sure that we cover another concept than it is to be sure students take a reliable step along the path from being able to solve elementary problems to being able to solve to the problems they'll find in the wild.

I face this set of competing forces every semester and do my best to strike a balance. It's never easy.

Courses that involve large systems projects are one place where students in my program have a chance to work on a real problem: writing a compiler, an embedded real-time system, or an AI-based system. These courses have closed form solutions of sorts, but the scale and complexity of the problems require students to do more than just apply formulas or find simple patterns.

Many students thrive in these settings. "Finally," they say, "this is a problem worth working on." These students will be fine when they graduate. Other students struggle when they have to do battle for the first time with an unruly language grammar or a set of fussy physical sensors. One of my challenges in my project course is to help this group of students move further along the path from "student doing homework" to "professional solving problems".

That would be a lot easier to do if we more reliably helped students take small steps along that path in their preceding courses. But that, as I've said, is difficult.

This post describes a problem in curriculum design without offering any solutions. I will think more about how I try to balance the forces between neat and messy in my courses, and then share some concrete ideas. If you have any approaches that have worked for you, or suggestions based on your experiences as a student, please email me or send me a message on Twitter. I'd love to learn how to do this better.

I've written a number of posts over the years that circle around this problem in curriculum and instruction. Here are three:

I'm re-reading these to see if past me has any ideas for present-day me. Perhaps you will find them interesting, too.

Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development, Teaching and Learning

December 28, 2020 9:23 AM

Fall Semester Hit Me Harder Than I Realized

(This is an almost entirely personal entry. If that's not your thing, feel free to move on.)

Last Monday morning, I sat down and wrote a blog entry. It was no big deal, just an observation from some reading I've been doing. A personal note, a throwaway. I'm doing the same this morning.

I'd forgotten how good that can feel, which tells you something about my summer and fall.

Last spring, I wrote that I would be teaching a new course in the fall. I was pretty excited for the change of pace, even if it meant not teaching my compiler course this year, and was already thinking ahead to course content, possible modes of delivery in the face of Covid-19, and textbooks.

Then came summer.

Some of my usual department head tasks, like handling orientation sessions for incoming first-year students, were on the calendar, but the need to conduct them remotely expanded what used to be a few hours of work each week to a few hours each day. (It must have been even worse for the university staff who organize and run orientation!) Uncertainty due to the pandemic and the indecisiveness of upper administration created new work, such as a seemingly endless discussion of fall schedule, class sizes, and room re-allocation.

One of the effects of all this work was that, when August rolled around, I was not much better prepared to teach my class than I had been in May when I solicited everyone's advice.

Once fall semester started, my common refrain in conversations with friends was, "It feels like I'm on a treadmill." As soon as I finished preparing a week's in-class session, I had to prepare the week's online activity. Then there were homework assignments to write, and grade. Or I had to write an exam, or meet with student's to discuss questions or difficulties, made all the more difficult but the stress the pandemic placed on them. I never felt like I could take a day or hour off, and when I did, class was still there in my mind, reminding of all I had to do before another week began and the cycle began again.

That went on for fourteen weeks. I didn't feel out of sorts so much as simply always busy. It would be over soon enough, my rational mind told me.

When the semester ended at Thanksgiving, the treadmill of new work disappeared and all that was left was the grading. I did not rush that work, letting it spread over most of the week and a half I had before grades were due. I figured that it was time to decompress a bit.

After grades were in and I had time to get back to normal, I noticed some odd things happening. First of all I was sleeping a lot: generous naps most days, and a couple of Fridays where I was in bed for ten hours of rest (followed, predictably, by a nap later in the day). I'm no insomniac by nature, but this was much more sleep than I usually take, or need.

My workout data told a tale of change, too. My elliptical and bike performances had been steadily showing the small improvements of increased capability through May or so. They leveled off into the summer months, when I was able to ride outside more with my wife. Then fall started, and my performance levels declined steadily into November. The numbers started to bounce back in December, and I feel as strong as I've felt in a long while.

I guess fall semester hit me harder than I realized.

In most ways, I feel like I'm back to normal now. I guess we will find out next week, when my attention turns to spring semester, both as department head and instructor. At least I get to teach programming languages, where I have a deep collection of session materials in hand and years of thinking and practice to buoy me up. Even with continued uncertainty due to the pandemic, I'm in pretty good shape.

Another effect of the summer and fall was that my already too-infrequent blogging slowed to a trickle. The fate of most blogs is a lack of drama. For me, blogging tends to flag when I am not reading or doing interesting work. The gradual expansion of administrative duties over the last few years has certainly taken its toll. But teaching a new course usually energizes me and leads to more regularly writing here. That didn't happen this fall.

With the semester now over, I have a better sense of the stress I must have been feeling. It affected my sleep, my workouts, and my teaching. It's no surprise that it affected my writing, too.

One of my goals for the coming year is to seek the sort of conscious, intentional awakening of the senses that Gide alludes to the passage quoted by that blog post. I'm also going to pay better attention to the signs that the treadmill is moving too fast. Running faster isn't always the solution.


Posted by Eugene Wallingford | Permalink | Categories: Personal, Teaching and Learning

December 27, 2020 10:10 AM

What Paul McCartney Can Teach Us About Software

From Sixty-Four Reasons to Celebrate Paul McCartney, this bit of wisdom that will sound familiar to programmers:

On one of the tapes of studio chatter at Abbey Road you can hear McCartney saying, of something they're working on, "It's complicated now. If we can get it simpler, and then complicate it where it needs to be complicated..."

People talk a lot about making software as simple as possible. The truth is, software sometimes has to be complicated. Some programs perform complex tasks. More importantly, programs these days often interact in complex environments with a lot of dissimilar, distributed components. We cannot avoid complexity.

As McCartney knows about music, the key is to make things as simple as can be and introduce complexity only where it is essential. Programmers face three challenges in this regard:

  • learning how to simplify code,
  • learning how to add complexity in a minimal, contained fashion, and
  • learning how to recognize the subtle boundary between essential simplicity and essential complexity.
I almost said that new programmers face those challenges, but after many years of programming, I feel like I'm still learning how to do all three of these things. I suspect other experienced programmers are still learning, too.

On an unrelated note, another passage in this article spoke to me personally as a programmer. While discussing McCartney's propensity to try new things and to release everything, good and bad, it refers to some of the songs on his most recent album (at that time) as enthusiastically executed misjudgments. I empathize with McCartney. My hard drive is littered with enthusiastically executed misjudgments. And I've never written the software equivalent of "Hey Jude".

McCartney just released a new album this month at the age of 78. The third album in a trilogy conceived and begun in 1970, it has already gone to #1 in three countries. He continues to write, record, and release, and collaborates frequently with today's artists. I can only hope to be enthusiastically producing software, and in tune with the modern tech world, when I am his age.


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development

December 21, 2020 8:30 AM

Watching the Scientific Method Take Off in Baseball

I make it a point never to believe anything
just because it's widely known to be so.
-- Bill James

A few years ago, a friend was downsizing and sent me his collection of The Bill James Abstract from the 1980s. Every so often I'll pick one up and start reading. This week, I picked up the 1984 issue.

my stack of The Bill James Abstract, courtesy of Charlie Richter

It's baseball, so I enjoy it a lot. My #1 team, the Cincinnati Reds, were in the summer doldrums for most of the 1980s but on their way to a surprising 1990 World Series win. My #2 team, the Detroit Tigers, won it all in 1984, with a performance so dominating that it seemed almost preordained. It's fun to reminisce about those days.

It's even more fascinating to watch the development of the scientific method in a new discipline.

Somewhere near the beginning of the 1984 abstract, James announces the stance that underlies all his work: never believe anything just because everyone else says it's true. Scattered through the book are elaborations of this philosophy. He recognizes that understanding the world of baseball will require time, patience, and revision:

In many cases, I have no clear evidence on the issue, no way of answering the question. ... I guess what I'm saying is that if we start trying to answer these questions now, we'll be able to answer them in a few years. An unfortunate side effect is that I'm probably going to get some of the answers wrong now; not only some of the answers but some of the questions.

Being wrong is par for the course for scientists; perhaps James felt some consolation in that this made him like Charles Darwin. The goal isn't to be right today. It is to be less wrong than yesterday. I love that James tells us that, early in his exploration, even some of the questions he is asking are likely the wrong questions. He will know better after he has collected some data.

James applies his skepticism and meticulous analysis to everything in the game: which players contribute the most offense or defense to the team, and how; how pitching styles affect win probabilities; how managers approach the game. Some things are learned quickly but are rejected by the mainstream. By 1984, for example, James and people like him knew that, on average, sacrifice bunts and most attempts to steal a base reduced the number of runs a team scores, which means that most of them hurt the team more than they help. But many baseball people continued to use them too often tactically and even to build teams around them strategically.

At the time of this issue, James had already developed models for several phenomena in the game, refined them as evidence from new seasons came in, and expanded his analysis into new areas. At each step, he channels his inner scientist: look at some part of the world, think about why it might work the way it does, develop a theory and a mathematical model, test the theory with further observations, and revise. James also loves to share his theories and results with the rest of us.

There is nothing new here, of course. Sabermetrics is everywhere in baseball now, and data analytics have spread to most sports. By now, many people have seen Moneyball (a good movie) or read the Michael Lewis book on which it was based (even better). Even so, it really is cool to read what are in effect diaries recording what James is thinking as he learns how to apply the scientific method to baseball. His work helped move an entire industry into the modern world. The writing reflects the curiosity, careful thinking, and humility that so often lead to the goal of the scientific mind:

to be less wrong than yesterday

Posted by Eugene Wallingford | Permalink | Categories: General, Personal

December 10, 2020 3:36 PM

The Fate of Most Blogs

... was described perfectly by André Gide on August 8, 1891, long before the digital computer:

More than a month of blanks. Talking of myself bores me. A diary is useful during conscious, intentional, and painful spiritual evolutions. Then you want to know where you stand. But anything I should say now would be harpings on myself. An intimate diary is interesting especially when it records the awakening of ideas; or the awakening of the senses at puberty; or else when you feel yourself to be dying.

There is no longer any drama taking place in me; there is now nothing but a lot of ideas stirred up. There is no need to write myself down on paper.

Of course, Gide kept writing for many years after that moment of doubt. The Journals of André Gide are an entertaining read. I feel seen, as the kids say these days.


Posted by Eugene Wallingford | Permalink | Categories: General