TITLE: Empirical Data about Software Practices
AUTHOR: Eugene Wallingford
DATE: October 29, 2009 4:19 PM
I was happy to come across Greg Wilson's talk,
Bits of Evidence,
on empirical data in software engineering. When I started
preparing to teach software engineering last summer, I
looked for empirical data to support some of the claims
that we make about building software. I wasn't all that
successful. I figured that either I wasn't looking hard
enough, or there wasn't much. The answer probably lies
somewhere in the middle.
Someone could do the SE world a great service by gathering,
organizing, and providing links to all the good work that
has been done. Wilson is one of the people I turn to for
pointers to empirical SE results.
I did have fun reading some classic old work in this area.
One is McCabe's paper
on cyclomatic complexity. This is very cool and has a nice
tie to theory, but it simply describes a metric. It does
not gather or present data from real programs to which
the metric has been applied, and it doesn't provide any
baseline for comparison. When he speaks of 10 as a reasonable
upper bound for a module's cyclomatic complexity, or of
code with cyclomatic complexity in the 3-7 range as "quite
well structured", I wonder "Why?" He drew these values
from experience looking at production code available to
him, but these numbers feel a bit unreliable.
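For readers who have not met the metric, here is a rough sketch of how it can be computed. McCabe defines cyclomatic complexity on a control-flow graph as M = E - N + 2P (edges, nodes, connected components). For structured code, that works out to the number of decision points plus one. The sketch below uses that shortcut on Python source. The function name `cyclomatic_complexity`, the choice of which syntax nodes count as decisions, and the sample `EXAMPLE` code are all assumptions made for illustration; they are not taken from McCabe's paper, and real tools count more constructs.

```python
import ast

# Syntax nodes treated as one decision point each (an assumption of this
# sketch; comprehensions, match statements, etc. are deliberately ignored).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe's metric as (number of decision points) + 1."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two short-circuit branches.
            decisions += len(node.values) - 1
    return decisions + 1


# Hypothetical sample code, used only to exercise the function.
EXAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and n > 2:
            return "composite"
    return "other"
'''

if __name__ == "__main__":
    print(cyclomatic_complexity(EXAMPLE))
```

The sample has two `if` statements, one `for` loop, and one `and`, so it scores 5 by this count, inside the 3-7 range quoted above. Whether that range really marks well-structured code is exactly the question that calls for data.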
I'm not a stickler who needs statistically significant
analysis of carefully collected data, though. I like
to learn from experience that is gathered and presented
qualitatively. I enjoyed reading about an informal
experiment into the
complexity of code developed test-first.
We need to be careful to take the results with a grain
of salt, given the informality of the definitions and
methodology used, but the experiment seems to say
something useful. I also liked Martin Fowler's
of dynamic type checking in production Ruby code from
ThoughtWorks. He also wrote an interesting reflection,
Ruby at ThoughtWorks,
that I learned from.
Still, we need more of the sort of empirical evidence
that Wilson offers in his talk. As a discipline, we
can do a better job of paying attention to the validity
of our claims, and of more frequently asking, "Data,