Re: [computer-go] Statistical Significance
I completely agree with much of what you write here, Don; for chess we can
certainly go into more detail about inflation and deflation of the rating lists.
Let's first take a look at what Elo is doing. Elo rests on a statistical
assumption that is quite advanced. As in all sports, the assumption may be
correct on average, but what does it actually measure?
Elo measures the score a person X can expect against a number of randomly
chosen opponents Y.
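To make that concrete, here is a minimal sketch of the expected-score
calculation. It assumes the logistic curve with a 400-point scale that many
Elo implementations use (Elo's original model was based on the normal
distribution); the function names are mine, purely for illustration.

```python
def expected_score(rating_x: float, rating_y: float) -> float:
    """Expected score of X against Y (win = 1, draw = 0.5, loss = 0).

    Assumption: logistic Elo curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_y - rating_x) / 400.0))


def expected_score_vs_field(rating_x: float, opponent_ratings: list[float]) -> float:
    """Average expected score of X over a set of opponents Y."""
    total = sum(expected_score(rating_x, r) for r in opponent_ratings)
    return total / len(opponent_ratings)
```

Under that assumption equal ratings give exactly 0.5, and a 400-point edge is
worth an expected score of about 0.91. The prediction only holds if the
opponents really are a random sample, which is exactly what fails online.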
Even though this system is more advanced than any other, it too has a
number of flaws. A big one shows up at the online chess clubs. People do not
play for money there, nor do they play against random opponents.
Additionally, many play only when they 'feel in shape', and many avoid
opponents they have trouble beating. Many online rating systems also use
very odd parameters and methods to estimate players' strengths, which come
out either too pessimistic or too optimistic.
The official international Elo rating, which only strong players active in
international events can get, is however very accurate at measuring the
strength of player X at this moment versus player Y at this moment.
The big flaw, if you want one, of the international rating system
is that it doesn't take progress into account. A player of today who is
rated 2700 plays at a level that would have been worth 3000 at the start of
the rating list, some 30 years ago. So any 'rating' for players from before
that date is just an uneducated guess.
Though it is not very nice to say so, any world top-10 player of today would
of course completely annihilate, in a simultaneous exhibition, all the world
champions from half a century ago.
That doesn't mean the world champions of those days 'sucked'. On the
contrary, they did their job so well that the players after them learned
from them, *started* with their knowledge, and improved even further.
Online play on the internet in particular ensures that professional players
of today no longer have many weak spots.
By contrast, a simple blunder check of grandmaster games from a century ago
already shows that the players in the 'big' tournaments of those days would
not stand a chance now, even against untitled players.
A deep study by Grandmaster Nunn has clearly shown this. He blunder-checked
a top tournament from around 1991 and compared it with a very important
tournament from the start of the century.
And, as I just mentioned, over the last few years the level of most
professional chess players has again risen by a lot of points, thanks to
their daily practice on the internet.
So when international chess ratings (FIDE ratings) slowly go up now, never
forget that they are deflated, not inflated.
In the amateur internet world, on the other hand, inflation is daily business.
At 14:04 23-9-2004 -0400, Don Dailey wrote:
>
>Someone said this:
> Elo ranking is quite slow to converge. Furthermore it has a tendency
> to deflate. That is why it is not generally used in Go.
>
>
>How rapidly ELO converges is undefined, it is an adjustable parameter
>to the ELO formula. It can be made to converge at any rate you wish.
>
>In fact, making the choice is a problem that NO rating system is
>immune to. If you converge rapidly the system adjusts quickly to
>rapidly improving (or declining) players. But it won't be very
>accurate in the general case since if you lose a couple of games
>because your blood sugar was low that day, the rating will punish you
>unfairly. Players will tend to be way overrated on one day, way
>underrated on another day.
>
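A minimal sketch of the trade-off Don describes here, assuming the usual
incremental update new = old + K * (score - expected) with the logistic
expected score. K is the adjustable parameter; the helper names and the
1500 starting rating are hypothetical.

```python
def expected_score(r_a: float, r_b: float) -> float:
    # Assumption: logistic Elo curve with a 400-point scale.
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def update(rating: float, opponent: float, score: float, k: float) -> float:
    """One Elo update; k controls how fast the rating moves."""
    return rating + k * (score - expected_score(rating, opponent))


def simulate(k: float, results: list[float],
             start: float = 1500.0, opponent: float = 1500.0) -> float:
    """Apply a sequence of results (1, 0.5 or 0) against a fixed-strength opponent."""
    rating = start
    for score in results:
        rating = update(rating, opponent, score, k)
    return rating


bad_day = [0.0, 0.0]                 # two losses, e.g. low blood sugar
slow = simulate(10.0, bad_day)       # small K: slow to converge, stable
fast = simulate(40.0, bad_day)       # large K: fast to converge, noisy
```

After two straight losses to an equal opponent, K=40 costs the player almost
39 points while K=10 costs just under 10. The fast setting tracks a rapidly
improving player better, but it also punishes one bad day much harder.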
>Deflation and inflation are also something no rating system I know of
>is immune to. The only way to prevent this is if you can prove and
>quantify how much of it has happened, and make artificial (presumably
>very slow and gradual) adjustments to the rating system.
>
>There are several reasons inflation and deflation happen. ELO assumes
>there is a fixed number of points in the system once someone has an
>established rating. If you start out as a beginner with a rating of
>1200, then in 10 years become a master (at 2200 points), you have
>taken 1000 rating points from the system. A fraction of a point was
>taken from each rated player in order for you to get those 1000
>points. So even if no one else has gotten weaker, you have lowered
>the average rating of the other players just a bit. The United States
>Chess Federation (which uses an ELO based system too) used to make up
>for this by awarding bonus points to players for various reasons, like
>having an exceptionally good tournament. I have no idea what they do
>now but several years ago this caused a great INFLATION and had to be
>changed.
>
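The point-conservation argument can be checked with a small simulation. This
is only a sketch under stated assumptions: every player uses the same K,
there are no bonus points, and the pool (one hypothetical "newcomer" at 1200
against a field of ten players at 1500) is invented for illustration.

```python
def expected_score(r_a: float, r_b: float) -> float:
    # Assumption: logistic Elo curve with a 400-point scale.
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def play(ratings: dict, a: str, b: str, score_a: float, k: float = 20.0) -> None:
    """Update both ratings in place; what a gains, b loses (same K for both)."""
    delta = k * (score_a - expected_score(ratings[a], ratings[b]))
    ratings[a] += delta
    ratings[b] -= delta


def rising_player_demo(rounds: int = 20, field_size: int = 10):
    """A newcomer who keeps winning drains points from an unchanged field."""
    ratings = {"newcomer": 1200.0}
    ratings.update({f"p{i}": 1500.0 for i in range(field_size)})
    total_before = sum(ratings.values())
    for _ in range(rounds):
        for i in range(field_size):
            play(ratings, "newcomer", f"p{i}", 1.0)  # newcomer wins every game
    field = [r for name, r in ratings.items() if name != "newcomer"]
    drift = sum(ratings.values()) - total_before
    return ratings["newcomer"], sum(field) / len(field), drift
```

The total number of points in the pool stays constant (up to floating-point
rounding), so every point the newcomer gains shows up as a lower average for
the field, even though nobody in the field has become weaker.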
>But inflation and deflation happen constantly. If a player quits
>playing the game and he happens to be overrated at the time (had some
>"lucky" tournaments and then quits) then he has stolen some points
>from the system. If a player has a couple of "unlucky" tournaments
>and quits playing out of discouragement, he has left the excess rating
>points as a gift to the rest of the players. The amount of inflation
>in this case is the difference between his "true strength" and his
>actual rating.
>
>I can't imagine any other system that doesn't have these problems. I
>think that ELO is such a good system that the problem is much easier
>to see because it is measured quite concisely by an ELO number.
>
>One attempt to avoid these types of problems is to base playing ability
>on some easily defined standard which is either self adjusting or can
>be easily and incrementally adjusted (artificially) over time. For
>instance you can define a rating system where the top N players must
>have an average rating of some arbitrary value and make constant small
>adjustments based on that. But that doesn't have anything to do with
>inflation and deflation, it's just hiding the problem. In fact,
>everyone's rating will be based on how strong the top N players happen
>to be at any point in time and how rapidly the system has adjusted to
>it.
>
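A rough sketch of such a top-N anchoring scheme, to show why it only hides
the problem. The function, its parameters, and the `rate` knob (the fraction
of the correction applied per adjustment) are hypothetical, not any
federation's actual procedure.

```python
def renormalize_top_n(ratings: dict, n: int, target_mean: float,
                      rate: float = 1.0) -> dict:
    """Shift every rating so the top-n average moves toward target_mean.

    rate=1.0 applies the whole correction at once; a small rate would
    model the 'constant small adjustments' variant.
    """
    top = sorted(ratings.values(), reverse=True)[:n]
    shift = rate * (target_mean - sum(top) / len(top))
    return {name: r + shift for name, r in ratings.items()}


pool = {"a": 2800.0, "b": 2700.0, "c": 2000.0}
anchored = renormalize_top_n(pool, n=2, target_mean=2700.0)
```

Every rating moves by the same amount, so the differences between players,
which are all Elo actually measures, are untouched. The absolute numbers now
simply track how strong the top N happen to be at that moment.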
>I think the GO ranking system is basically like this, but I admit that
>I don't understand it. I would appreciate it if someone went into
>some detail on the Go ranking system, and if it's believed that it is
>immune to deflation and inflation explain how that can be. My
>understanding is that it is based more on tradition than mathematics.
>
>But no matter what system you use, I don't believe there is an
>objective way to measure or compare one great player of today to
>another great player from hundreds of years ago. If there were, then
>you can convince me that GO is immune to inflation/deflation. In
>Chess an attempt was made to measure the old great players by
>assigning ELO ratings to them. You can move backwards in time based
>on players with known ratings to rate players with unknown ratings but
>who played each other. Once you attach a rating to one of these
>players, you can work your way backwards in time comparing them to
>other players they had played and so on. But this is pretty dicey at
>best. Of course you can look at the games and draw conclusions based
>on this, but the conclusions are going to be mostly subjective and
>likely very emotional. Unless there is a clear superiority, you won't
>be able to say for sure and you certainly won't be able to say HOW
>MUCH better with any accuracy.
>
>I guess what I'm saying is that I don't believe the GO rating system
>is immune to deflation/inflation any more than the ELO system is.
>
>- Don
>_______________________________________________
>computer-go mailing list
>computer-go@xxxxxxxxxxxxxxxxx
>http://www.computer-go.org/mailman/listinfo/computer-go/