TITLE: Simplicity, Complexity, and Good Enough Chess Ratings
AUTHOR: Eugene Wallingford
DATE: August 20, 2010 3:30 PM
DESC:
-----
BODY:
Andrew Gelman
writes
about a competition offered by Kaggle to
find a better rating system
for chess. The Elo system has been used for the last
40+ years with reasonable success. In the era of big
data, powerful ubiquitous computers, and advanced
statistical methods, it turns out that we can create
a rating system that predicts more accurately the
performance of players in games in the near future.
Very cool. I'm still
enough of a chess geek
that I want to know just when Capablanca surpassed
Lasker and how much better Fischer was than his
competition in the 1972 challenger's matches. I've
always had an irrational preference for ridiculously
precise values.
Even as we find systems that perform better, I find
myself still attached to Elo. I'm sure part of it
is that I grew up with Elo ratings as a player, and
read Elo's
The Rating of Chess Players, Past and Present
as a teen.
But there's more. I've also written programs to
implement the rating system, including the
first program I ever wrote out of passion.
Writing the code to assign initial ratings to a
pool of players based on the outcomes of games
played among them required me to do something
I didn't even know was possible at the time:
start a process that wasn't guaranteed to stop.
I learned about the idea of successive approximations
and how my program would have to settle for values
that fit the data well enough. This was my first
encounter with epsilon, and my first non-trivial
use of recursion. Yes, I could have written a
loop, but the algorithm seemed so clear written
recursively. Such experiences stick with a person.
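
To make the idea of successive approximation concrete, here is a minimal
sketch in Python. It is not the original program, only an illustration
under stated assumptions: every player starts at 1500, each pass moves a
player halfway toward a linear "performance rating" (average opponent
rating plus 400 times net wins per game), and the recursion stops when no
rating moves by more than epsilon. The names performance and settle, the
damping by half, and the max_depth guard are all hypothetical choices made
for this example.

```python
def performance(player, games, ratings):
    """Linear performance rating against the current estimates.

    games holds (white, black, white_score) tuples, with white_score
    equal to 1, 0.5, or 0. Assumes the player has at least one game.
    """
    opponents_total = 0.0
    net_wins = 0.0          # wins minus losses; draws count as zero
    played = 0
    for white, black, white_score in games:
        if player == white:
            opponents_total += ratings[black]
            net_wins += 2 * white_score - 1
            played += 1
        elif player == black:
            opponents_total += ratings[white]
            net_wins += 1 - 2 * white_score
            played += 1
    return (opponents_total + 400 * net_wins) / played


def settle(ratings, games, epsilon=0.5, depth=0, max_depth=500):
    """Recursively refine ratings until they change by less than epsilon."""
    # Move each rating halfway toward its performance rating. The damping
    # is an assumption added here to keep the estimates from oscillating.
    updated = {
        player: (rating + performance(player, games, ratings)) / 2
        for player, rating in ratings.items()
    }
    largest_change = max(abs(updated[p] - ratings[p]) for p in ratings)
    # Settle for "close enough", or give up rather than recurse forever.
    if largest_change < epsilon or depth >= max_depth:
        return updated
    return settle(updated, games, epsilon, depth + 1, max_depth)


# A made-up double round robin among three hypothetical club players.
games = [
    ("Ann", "Bob", 1), ("Bob", "Ann", 0.5),
    ("Ann", "Cy", 1), ("Cy", "Ann", 0),
    ("Bob", "Cy", 1), ("Cy", "Bob", 0.5),
]
initial = {"Ann": 1500.0, "Bob": 1500.0, "Cy": 1500.0}
final = settle(initial, games)
```

The loop version is only a few lines different, but the recursive form
reads like the definition: the settled ratings are either the ones we just
computed, or whatever settling those gives us. Without the max_depth guard
and the epsilon test, nothing in the code itself promises that the process
ever stops.
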
There is still more, though, beyond my personal
preferences and experiences. Compared to most of
the alternatives that do a better job objectively,
the Elo system is simple. The probability curve
is simple enough for anyone to understand, and the
update process is basic arithmetic. Even better,
there is a simple linear approximation of the
curve that made it possible for a bunch of high
school kids with no interest in math to update
ratings based on games played at the club. We
posted a small table of expected values based on
rating differences at the front of the room and
maintained the ratings on index cards. (This is
a different sort of
index-card computing
than I wrote about long ago.) There may have been
more accurate systems we could have run, but the
math behind this one was so simple, and the ratings
were more than good enough for our purposes. I am
guessing that the Elo system is more than good enough
for most people's purposes.
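
For readers who have not seen the arithmetic, here is a small Python
sketch of the pieces described above. The logistic curve with a scale of
400 is the standard Elo expected-score formula. The linear version shown
(half a point plus the rating difference over 800, clipped at 400 points)
is one common approximation and may not match the table a particular club
used, and the K-factor of 32 is an assumption; federations use different
values. The function names are hypothetical.

```python
def expected_score(rating, opponent):
    """Standard Elo expected score: a logistic curve in the rating difference."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))


def expected_linear(rating, opponent):
    """A straight-line stand-in for the curve, clipped at +/-400 points.

    Assumed form for illustration: 0.5 + difference / 800.
    """
    difference = max(-400, min(400, rating - opponent))
    return 0.5 + difference / 800


def update(rating, opponent, score, k=32, expected=expected_score):
    """New rating after one game; score is 1, 0.5, or 0.

    k=32 is an assumed K-factor, not a universal constant.
    """
    return rating + k * (score - expected(rating, opponent))


# Index-card arithmetic: a 1600 player beats a 1500 player.
on_the_card = update(1600, 1500, 1, expected=expected_linear)
with_the_curve = update(1600, 1500, 1)
```

With the linear form, the 1600 player is expected to score 0.625, so a win
is worth 32 times 0.375, or 12 points. That is the whole computation, which
is why a table taped to the wall and a stack of index cards are enough to
run it.
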
Simple and good enough is a strong combination.
Perhaps the Elo system will turn out to be the
Newtonian physics of ratings. We know there are
better, more accurate models, and we use them whenever
we need something very accurate. Otherwise, we stick
to the old model and get along just fine almost all
the time.
-----