TITLE: Simplicity, Complexity, and Good Enough Chess Ratings
AUTHOR: Eugene Wallingford
DATE: August 20, 2010 3:30 PM
DESC:
-----
BODY:
Andrew Gelman writes about a competition offered by Kaggle to find a better rating system for chess. The Elo system has been used for the last 40+ years with reasonable success. In the era of big data, powerful ubiquitous computers, and advanced statistical methods, it turns out that we can create a rating system that more accurately predicts how players will perform in games in the near future. Very cool. I'm still enough of a chess geek that I want to know just when Capablanca surpassed Lasker and how much better Fischer was than his competition in the 1972 challenger's matches. I've always had an irrational preference for ridiculously precise values.

Even as we find systems that perform better, I find myself still attached to Elo. I'm sure part of it is that I grew up with Elo ratings as a player, and read Elo's The Rating of Chess Players, Past and Present as a teen. But there's more.

I've also written programs to implement the rating system, including the first program I ever wrote out of passion. Writing the code to assign initial ratings to a pool of players based on the outcomes of games played among them required me to do something I didn't even know was possible at the time: start a process that wasn't guaranteed to stop. I learned about the idea of successive approximations and how my program would have to settle for values that fit the data well enough. This was my first encounter with epsilon, and my first non-trivial use of recursion. Yes, I could have written a loop, but the algorithm seemed so clear written recursively. Such experiences stick with a person.

There is still more, though, beyond my personal preferences and experiences. Compared to most of the alternatives that do a better job objectively, the Elo system is simple.
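To make "simple" concrete, here is a minimal sketch in Python of the pieces described in this post: the expected-score curve, the arithmetic update, and a recursive successive-approximation routine that assigns initial ratings to a pool of unrated players and stops once no rating moves by more than epsilon. It assumes the commonly published logistic form of the curve with a 400-point scale, a K-factor of 32, and a starting guess of 1500 for everyone. The function names, the 400-point step in the fitting routine, and the depth limit are illustrative choices, not a reconstruction of the program described above.

```python
from collections import defaultdict


def expected_score(rating, opponent):
    """Expected score (between 0 and 1) under the logistic Elo curve."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def update_rating(rating, expected, actual, k=32):
    """Basic-arithmetic update: move the rating toward the actual result."""
    return rating + k * (actual - expected)


def fit_initial_ratings(games, ratings=None, epsilon=0.01, max_depth=200):
    """Successively approximate ratings for a pool of unrated players.

    games is a list of (player, opponent, score) tuples, where score is
    1, 0.5, or 0 from the first-named player's point of view. The process
    is not guaranteed to settle (for example, when someone wins every
    game), so max_depth is an assumed safety limit on the recursion.
    """
    if ratings is None:
        players = {p for a, b, _ in games for p in (a, b)}
        ratings = {p: 1500.0 for p in players}  # assumed starting guess

    actual = defaultdict(float)
    expected = defaultdict(float)
    played = defaultdict(int)
    for a, b, score in games:
        e = expected_score(ratings[a], ratings[b])
        actual[a] += score
        actual[b] += 1.0 - score
        expected[a] += e
        expected[b] += 1.0 - e
        played[a] += 1
        played[b] += 1

    # Nudge each rating by how far actual results outran expectations.
    new = {
        p: r + 400.0 * (actual[p] - expected[p]) / played[p]
        for p, r in ratings.items()
    }

    # Settle for "well enough": stop when no rating moved more than epsilon.
    biggest_change = max(abs(new[p] - ratings[p]) for p in ratings)
    if biggest_change < epsilon or max_depth == 0:
        return new
    return fit_initial_ratings(games, new, epsilon, max_depth - 1)


# A made-up pool of three players, two games per pairing.
games = [
    ("A", "B", 1.0), ("A", "B", 0.5),
    ("B", "C", 1.0), ("B", "C", 0.5),
    ("A", "C", 1.0), ("C", "A", 1.0),
]
ratings = fit_initial_ratings(games)
for player in sorted(ratings, key=ratings.get, reverse=True):
    print(player, round(ratings[player]))
```

Only the differences between the fitted ratings carry meaning here; the 1500 starting point merely anchors the scale. A loop would do the same job as the recursion, at the cost of hiding the "try again with better guesses" shape of the idea.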
The probability curve is simple enough for anyone to understand, and the update process is basic arithmetic. Even better, there is a simple linear approximation of the curve that made it possible for a bunch of high school kids with no interest in math to update ratings based on games played at the club. We posted a small table of expected values based on rating differences at the front of the room and maintained the ratings on index cards. (This is a different sort of index-card computing than I wrote about long ago.)

There may have been more accurate systems we could have run, but the math behind this one was so simple, and the ratings were more than good enough for our purposes. I am guessing that the Elo system is more than good enough for most people's purposes. Simple and good enough is a strong combination.

Perhaps the Elo system will turn out to be the Newtonian physics of ratings. We know there are better, more accurate models, and we use them whenever we need something very accurate. Otherwise, we stick to the old model and get along just fine almost all the time.
-----