TITLE: The Growing Buzz around Empirical Analysis of Repositories
AUTHOR: Eugene Wallingford
DATE: March 04, 2011 5:25 PM
DESC:
-----
BODY:
This has turned into a recurring theme, due to a
hopeful trend out in industry.
Last semester, I wrote a bit about
studying program repositories
as a way to understand how programmers work. Then
last month, I wrote about
simple empirical analysis of code,
referring to Michael Feathers's article on how we
can learn a lot about our program's design by
looking at our commit log.
Feathers went on to write a short note about
getting empirical about refactoring,
in which he expanded on the idea of looking at our
code to understand its design better.
Now we have
Turbulence,
a package for pulling useful metrics about our code
out of a git repository. The package began its life
when Feathers and Corey Haines wrote a script to
plot code churn versus its complexity. Haines has
written a bit about the Turbulence project.
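
To make the churn-versus-complexity idea concrete, here is a minimal
Python sketch. It is not Turbulence's own code. It assumes churn means
lines added plus lines deleted per file, as reported by
`git log --numstat`, and it uses summed indentation as a crude stand-in
for a real complexity metric. The function names are hypothetical,
chosen only for illustration.

```python
import subprocess
from collections import Counter


def churn_from_numstat(log_text):
    """Sum lines added plus lines deleted per path from `git log --numstat` output."""
    churn = Counter()
    for line in log_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # blank separators or anything that is not a numstat row
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # git reports binary files with "-" in place of counts
        churn[path] += int(added) + int(deleted)
    return churn


def complexity_proxy(source):
    """Assumed stand-in for complexity: total leading whitespace of non-blank lines."""
    total = 0
    for line in source.splitlines():
        stripped = line.lstrip(" \t")
        if stripped:
            total += len(line) - len(stripped)
    return total


def read_git_numstat(repo_dir="."):
    """Run git in repo_dir and return the raw numstat text for its history."""
    result = subprocess.run(
        ["git", "log", "--numstat", "--format="],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `churn_from_numstat(read_git_numstat())` inside a repository
gives one churn number per file. Pairing each number with
`complexity_proxy` applied to that file's current contents yields the
points for a scatter plot. Files that land high on both axes are the
ones worth a closer look.
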
It doesn't end there. Developers are using Turbulence
and adding to its code base. Feathers has called
for a
renewed focus on design in the wild
using the data we have at our fingertips. The
physicians have begun to heal themselves, and they
are leading the way for the rest of us.
One nice side effect of this trend is making available
to a wider audience some of the academic research that
has been done in this vein, such as Nagappan and Ball's
paper on
code churn and defect density.
(I had the pleasure of meeting Ball when we served on
a panel at OOPSLA several years ago.)
As many people are saying,
we swim in data.
We just have to find ways to use it well. I remain
ever amazed at what our tools enable us to do.
All this talk about git has me resolved to go all the
way and make a full switch to it. I've dabbled with
git a bit and consumed a lot of software off GitHub,
but now it's time to do all my development in it.
Fortunately, there are a few excellent resources to
help me, including the often-lauded
Git Immersion
guided tour by Jim Weirich and crew and Scott
Chacon's visually engaging
Getting Git
slide deck. My trip to SIGCSE and the spring break
that follows immediately after can't come too soon!
-----