TITLE: Empirical Data about Software Practices AUTHOR: Eugene Wallingford DATE: October 29, 2009 4:19 PM DESC: ----- BODY: I was happy to come across Greg Wilson's talk, Bits of Evidence, on empirical data in software engineering. When I started preparing to teach software engineering last summer, I looked for empirical data to support some of the claims that we make about building software. I wasn't all that successful. I figured that either I wasn't looking hard enough, or there wasn't much to find. The answer probably lies somewhere in the middle. Someone could do the SE world a great service by gathering, organizing, and providing links to all the good work that has been done. Wilson is one of the people I turn to for pointers to empirical SE results. I did have fun reading some classic old work in this area. One is McCabe's original article on cyclomatic complexity. This is very cool and has a nice tie to theory, but it simply describes a metric. It does not gather or present data from real programs to which the metric has been applied, and it doesn't provide any baseline for comparison. When he speaks of 10 as a reasonable upper bound for a module's cyclomatic complexity, or of code with cyclomatic complexity in the 3-7 range as "quite well structured", I wonder "Why?" He drew these values from experience looking at production code available to him, but these numbers feel a bit unreliable. I'm not a stickler who needs statistically significant analysis of carefully collected data, though. I like to learn from experience that is gathered and presented qualitatively. I enjoyed reading about an informal experiment into the complexity of code developed test-first. We need to be careful to take the results with a grain of salt, given the informality of the definitions and methodology used, but the experiment seems to say something useful. I also liked Martin Fowler's reflective analysis of dynamic type checking in production Ruby code from ThoughtWorks. 
He also wrote an interesting reflection on Ruby at ThoughtWorks that I learned from. Still, we need more of the sort of empirical evidence that Wilson offers in his talk. As a discipline, we can do a better job of paying attention to the validity of our claims, and of more frequently asking, "Data, please!" -----