TITLE: Agile Moments, "Why We Test" Edition
AUTHOR: Eugene Wallingford
DATE: December 04, 2013 3:14 PM
DESC:
-----
BODY:

Case 1: Big Programs. This blog entry tells the sad story of a computational biologist who had to retract six published articles. Why? Their conclusions depended on the output of a computer program, and that program contained a critical error. The writer of the entry, who is not the researcher in question, concludes:
What this should flag is the necessity to aggressively test all the software that you write.
Actually, you should have tests for any program you use to draw important conclusions, whether you wrote it or not. The same blog entry mentions that a grad student in the author's previous lab had found several bugs in a molecular dynamics program used by many computational biologists. How many published results were affected before those bugs were found?
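Such a test need not be elaborate. Here is a minimal sketch in pytest style; the program name, the input file, and the expected value are hypothetical stand-ins for a tool and a case whose answer you have verified by hand:

    import subprocess

    def test_known_input_gives_known_output():
        # Run the third-party tool on an input whose correct output has
        # been checked by hand or against published results. ("mdtool"
        # and the input file are placeholders.)
        result = subprocess.run(
            ["mdtool", "--energy", "tests/known_structure.pdb"],
            capture_output=True, text=True, check=True,
        )
        energy = float(result.stdout.strip().split()[-1])
        # The tolerance allows for numerical noise, not for doubt about
        # the right answer.
        assert abs(energy - (-1234.5)) < 0.1

A single test like this cannot certify a program as correct, but it can catch the kind of gross error that silently invalidates published results.

Case 2: Small Scripts. Titus Brown reports finding bugs every time he reused one of his Python scripts. Yet: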
Did I start doing any kind of automated testing of my scripts? Hell no! Anyone who wants to write automated tests for all their little scriptlets is, frankly, insane. But this was one of the two catalysts that made me personally own up to the idea that most of my code was probably somewhat wrong.
Most of my code has bugs, but, hey, why write tests? Didn't a famous scientist define insanity as doing the same thing over and over but expecting different results? I consider myself insane, too, but mostly because I don't write tests often enough for my small scripts.

We say to ourselves that we'll never reuse them, so we don't need tests. But we don't throw them away, and then we do reuse them, perhaps with a tweak here or there. We all face time constraints. When we run a script the first time, we may well pay enough attention to the output that we are confident it is correct. But perhaps we can all agree that the second time we use a script, we should write tests for it if we don't already have them. There are only three numbers in computing: 0, 1, and many. The second time we use a program is a sign from the universe that we need the added confidence provided by tests.

To be fair, Brown goes on to offer some good advice, such as writing tests for code after you find a bug in it. His article is an interesting read, as is almost everything he writes about computation and science.
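That last piece of advice costs almost nothing to follow. A minimal sketch of such a regression test; the gc_content helper is a made-up stand-in for whatever scriptlet just bit you:

    # Suppose a script's helper silently mishandled lowercase input, and
    # we just fixed it. Capture the bug in a test so it can't come back.
    def gc_content(seq):
        seq = seq.upper()  # the one-line fix
        return (seq.count("G") + seq.count("C")) / len(seq)

    def test_gc_content_handles_lowercase():
        # This input triggered the original bug.
        assert gc_content("acgt") == 0.5

Case 3: The Disappointing Trade-Off. Then there's this classic from Jamie Zawinski, as quoted in Coders at Work: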
I hope I don't sound like I'm saying, "Testing is for chumps." It's not. It's a matter of priorities. Are you trying to write good software or are you trying to be done by next week? You can't do both.
Sigh. If you don't have good software by next week, maybe you aren't done yet. I understand that the real world imposes constraints on us, and that sometimes worse is better. Good enough is good enough, and we rarely need a perfect program. I also understand that Zawinski was trying to be fair to the idea of testing, and that he was surely producing good enough code before releasing. Even so, the pervasive attitude that we can either write good programs or get done on time, but not both, makes me sad. I hope that we can do better. And I'm betting that the computational biologist referred to in Case 1 wishes he had had some tests to catch the simple error that undermined five years' worth of research.

-----