TITLE: The Growing Buzz around Empirical Analysis of Repositories
AUTHOR: Eugene Wallingford
DATE: March 04, 2011 5:25 PM
DESC:
-----
BODY:
This has turned into a recurring theme, due to a hopeful trend out in industry. Last semester, I wrote a bit about studying program repositories as a way to understand how programmers work. Then last month, I wrote about simple empirical analysis of code, referring to Michael Feathers's article on how we can learn a lot about our program's design by looking at our commit log. Feathers went on to write a short note about getting empirical about refactoring, in which he expanded on the idea of looking at our code to understand its design better.

Now we have Turbulence, a package for pulling useful metrics about our code out of a git repository. The package began its life when Feathers and Corey Haines wrote a script to plot code churn against complexity. Haines has written a bit about the Turbulence project.

It doesn't end there. Developers are using Turbulence and adding to its code base. Feathers has called for a renewed focus on design in the wild using the data we have at our fingertips. The physicians have begun to heal themselves, and they are leading the way for the rest of us.

One nice side effect of this trend is that it makes available to a wider audience some of the academic research that has been done in this vein, such as Nagappan and Ball's paper on code churn and defect density. (I had the pleasure of meeting Ball when we served on a panel at OOPSLA several years ago.)

As many people are saying, we swim in data. We just have to find ways to use it well. I remain ever amazed at what our tools enable us to do.

All this talk about git has me resolved to go all the way and make a full switch to it. I've dabbled with git a bit and consumed a lot of software off GitHub, but now it's time to do all my development in it.
Fortunately, there are a few excellent resources to help me, including the often-lauded Git Immersion guided tour by Jim Weirich and crew and Scott Chacon's visually engaging Getting Git slide deck. My trip to SIGCSE and the spring break that follows immediately after can't come too soon!
-----