Re: [computer-go] Statistical Significance
Someone said this:
Elo ranking is quite slow to converge. Furthermore, it has a tendency
to deflate. That is why it is not used in Go in general.
How rapidly ELO converges is not fixed; it is an adjustable parameter
of the ELO formula. It can be made to converge at any rate you wish.
In fact, making the choice is a problem that NO rating system is
immune to. If you converge rapidly, the system adjusts quickly to
rapidly improving (or declining) players. But it won't be very
accurate in the general case: if you lose a couple of games
because your blood sugar was low that day, the rating will punish you
unfairly. Players will tend to be way overrated on one day and way
underrated on another.
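To make the trade-off concrete, here is a minimal sketch of the usual
ELO update, assuming the standard logistic expected-score curve with a
400-point scale. The adjustable parameter is conventionally called K;
the function names and the example ratings are my own, chosen only for
illustration.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B on the usual 400-point logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, score_a, k):
    """Return the new ratings of A and B after one game.

    score_a is 1 for a win by A, 0.5 for a draw, 0 for a loss.
    k controls how fast ratings move (the convergence rate).
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two equally rated players; A has a bad day and loses twice in a row.
for k in (8, 32):
    a, b = 1500.0, 1500.0
    for _ in range(2):
        a, b = update(a, b, 0.0, k)
    print("K =", k, "-> A is now rated", round(a, 1))
```

With the larger K the two bad games cost A several times as many
points, which is exactly the "punished unfairly" case; with the smaller
K a genuinely improving player would take much longer to be recognized.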
Deflation and inflation are also something no rating system I know of
is immune to. The only way to prevent them is to prove and
quantify how much has happened, and then make artificial (presumably
very slow and gradual) adjustments to the rating system.
There are several reasons inflation and deflation happen. ELO assumes
there is a fixed number of points in the system once someone has an
established rating. If you start out as a beginner with a rating of
1200, then in 10 years become a master (at 2200 points), you have
taken 1000 rating points from the system. A fraction of a point was
taken from each rated player in order for you to get those 1000
points. So even if no one else has gotten weaker, you have lowered
the average rating of the other players just a bit. The United States
Chess Federation (which uses an ELO-based system too) used to make up
for this by awarding bonus points to players for various reasons, like
having an exceptionally good tournament. I have no idea what they do
now, but several years ago this caused a great INFLATION and had to be
changed.
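The point-draining effect can be seen in a toy simulation. This is
only a sketch under assumptions of my own: 100 other players whose true
strength and rating are both 1500, one player whose rating starts at
1200 but whose true strength is already 2200 (a shortcut for the
ten-year improvement), the same K for everyone, and no bonus points.
The pool size, K, and game count are arbitrary.

```python
import random


def expected_score(rating_a, rating_b):
    """Expected score of A against B on the usual 400-point logistic curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def simulate(n_others=100, games=2000, k=16, seed=0):
    """Let one underrated player play random opponents; return all ratings."""
    rng = random.Random(seed)
    true_strength = [1500.0] * n_others + [2200.0]
    rating = [1500.0] * n_others + [1200.0]
    improver = n_others  # index of the underrated player
    for _ in range(games):
        opponent = rng.randrange(n_others)
        # The result is decided by true strength ...
        p_win = expected_score(true_strength[improver], true_strength[opponent])
        score = 1.0 if rng.random() < p_win else 0.0
        # ... but the points exchanged are computed from current ratings.
        delta = k * (score - expected_score(rating[improver], rating[opponent]))
        rating[improver] += delta
        rating[opponent] -= delta
    return rating


ratings = simulate()
others = ratings[:-1]
print("improver:", round(ratings[-1]))
print("average of the others:", round(sum(others) / len(others), 1))
print("total points in the system:", round(sum(ratings)))
```

Every point the improver gains is subtracted from an opponent, so the
total stays fixed while the average of the other players drops, even
though none of them has gotten any weaker.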
But inflation and deflation happen constantly. If a player quits
playing the game and he happens to be overrated at the time (had some
"lucky" tournaments and then quits), then he has stolen some points
from the system. If a player has a couple of "unlucky" tournaments
and quits playing out of discouragement, he has left the excess rating
points as a gift to the rest of the players. The amount of inflation
in this case is the difference between his "true strength" and his
actual rating.
I can't imagine any other system that doesn't have these problems. I
think that ELO is such a good system that the problem is much easier
to see, because it is measured quite concisely by an ELO number.
One attempt to avoid these types of problems is to base playing ability
on some easily defined standard which is either self-adjusting or can
be easily and incrementally adjusted (artificially) over time. For
instance, you can define a rating system where the top N players must
have an average rating of some arbitrary value and make constant small
adjustments based on that. But that doesn't have anything to do with
inflation and deflation; it's just hiding the problem. In fact,
everyone's rating will be based on how strong the top N players happen
to be at any point in time and how rapidly the system has adjusted to
it.
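A rough sketch of such a top-N anchor, to show what it does and does
not do. The function name, the step parameter (how much of the
correction is applied at once), and the sample ratings are all
hypothetical; I am not describing any federation's actual procedure.

```python
def anchor_to_top_n(ratings, n, target, step=1.0):
    """Shift every rating equally so the top-n average moves toward target.

    step=1.0 applies the whole correction at once; a small step gives the
    slow, gradual adjustment described above.
    """
    top = sorted(ratings, reverse=True)[:n]
    shift = step * (target - sum(top) / len(top))
    return [r + shift for r in ratings]


pool = [2600.0, 2500.0, 2400.0, 1800.0, 1500.0]
adjusted = anchor_to_top_n(pool, n=3, target=2600.0)
print(adjusted)
```

Everyone moves by the same amount, so the differences between players
are untouched. A 1500 player's number changes only because of who
happens to be in the top N, not because his own play changed.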
I think the GO ranking system is basically like this, but I admit that
I don't understand it. I would appreciate it if someone went into
some detail on the Go ranking system and, if it's believed to be
immune to deflation and inflation, explained how that can be. My
understanding is that it is based more on tradition than mathematics.
But no matter what system you use, I don't believe there is an
objective way to measure or compare one great player of today to
another great player from hundreds of years ago. If there were, then
you could convince me that GO is immune to inflation/deflation. In
Chess an attempt was made to measure the old great players by
assigning ELO ratings to them. You can move backwards in time based
on players with known ratings to rate players with unknown ratings but
who played each other. Once you attach a rating to one of these
players, you can work your way backwards in time comparing them to
other players they had played and so on. But this is pretty dicey at
best. Of course you can look at the games and draw conclusions based
on this, but the conclusions are going to be mostly subjective and
likely very emotional. Unless there is a clear superiority, you won't
be able to say for sure and you certainly won't be able to say HOW
MUCH better with any accuracy.
I guess what I'm saying is that I don't believe the GO rating system
is immune to deflation/inflation any more than the ELO system is.
- Don
_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/