TITLE: DIY Empirical Analysis Via Scripting
AUTHOR: Eugene Wallingford
DATE: February 11, 2011 4:15 PM
DESC:
-----
BODY:
In my post on learning in a code base, I cited Michael Feathers's entry on measuring the closure of code. Michael's entry closes with a postscript:
It's relatively easy to make these diagrams yourself. grep the log of your VCS for lines that depict adds and modifications. Strip everything except the file names, sort it, run it through 'uniq -c', sort it again, and then plot the first column.
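For the curious, here is a minimal sketch of what such a pipeline might look like. It is my reading of the recipe, not Michael's own script. I assume git as the VCS and the tab-separated status lines that `git log --name-status` prints (`M` for a modification, `A` for an add); the function name `change_counts` is made up for illustration. Other version control systems format their logs differently, so the grep pattern and the cut step would need adjusting.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count how many commits added or modified each file.
# Assumes `git log --name-status` style input on stdin, where each change
# is a line of the form "<status><TAB><path>".
change_counts() {
    grep -E '^[AM][[:space:]]' |   # keep only adds (A) and modifications (M)
        cut -f2 |                  # strip everything except the file name
        sort |                     # group identical names together
        uniq -c |                  # prefix each name with its count
        sort -n                    # order from least to most changed
}

# Usage inside a git repository (the empty format hides commit headers):
#   git log --name-status --pretty=format: | change_counts | awk '{print $1}'
```

The last stage of the usage line keeps only the first column, the counts, which is the series to hand to whatever plotting tool you prefer.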
Ah, the Unix shell. A few years ago I taught a one-unit course on bash scripting, and I used problems like this as examples in class. Many students are surprised to learn just how much you can do with a short pipeline of Unix commands, operating on plain text data pulled from any source.
You can also do this sort of thing almost as easily in a more full-featured scripting language, such as Python or Ruby. That is one reason languages like them are so attractive to me for teaching programming in context.
Of course, using a powerful, fun language in CS1 creates a new set of problems for us. A while back, a CS educator on the SIGCSE mailing list pointed out one:
Starting in Python postpones the discovery that "CS is not for me".
After years of languages such as C++, Java, and Ada in CS1, which hastened the exit of many a potential CS major, it's ironic that our new problem might be students succeeding too long for their own good. When they do discover that CS isn't for them, they will be stuck with the ability to write scripts and analyze data. With all due concern for not wasting students' time, this is a problem we in CS should willingly accept.
-----