
Re: [computer-go] Statistical Significance



I completely agree with much of what you write here, Don; for chess we can
certainly go into more detail about inflation and deflation in the rating lists.

Let's first take a look at what Elo is doing. Elo rests on a fairly
sophisticated statistical assumption. As in all sports, the assumption may be
correct on average, but what does it actually measure?

Elo measures the score a player X can expect against a randomly chosen set
of opponents Y.

Even though this system is more advanced than any other, it too has a number
of flaws. A big one shows up at the online chess clubs: people do not play
for money there, nor do they play against random opponents.

Additionally, many play only when they 'feel in shape', and many avoid
opponents they have trouble beating. On top of that, many online rating
systems use very odd parameters and methods to estimate playing strength,
which come out either too pessimistic or too optimistic.

The official international Elo rating, which only strong internationally
active players can get, is however very accurate at measuring the strength
of player X at this moment versus player Y at this moment.

The big flaw of the international rating system, if you want one, is that it
doesn't take progress into account. A player rated 2700 today plays at a
level that would have been worth 3000 at the start of the rating list, some
30 years ago. Note that the rating list started around that time, so any
'rating' for players before that date is just an uneducated guess.

Though it is not very nice to say so, any world top-10 player of today would
of course completely annihilate all the world champions from half a century
ago in a simultaneous exhibition.

That doesn't mean that the world champions of those days 'sucked'. On the
contrary, they did their job so well that the players after them learned
from them, *started* with their knowledge, and improved even further.

Online play on the internet in particular ensures that today's professional
players no longer have many weak spots.

For grandmaster games from a century ago, on the other hand, a simple blunder
check already teaches us that the players in the 'big' tournaments of those
days would not stand a chance today, even against untitled players.

A deep study by Grandmaster Nunn has clearly shown this. He blunder-checked
a top tournament from around 1991 and compared it with a very important
tournament from the start of the century.

And as I just mentioned, over the last few years the level of most
professional chess players has again risen by a lot of points, thanks to
their daily practice on the internet.

So when international chess ratings (FIDE ratings) slowly go up now, don't
ever forget that they are deflated. Not inflated.

In the amateur internet world, on the other hand, inflation is daily business.

At 14:04 23-9-2004 -0400, Don Dailey wrote:
>
>Someone said this:
>  Elo ranking is quite slow to converge. Furthermore it has a tendency
>  to deflate. That is why it is not used in Go in general.
>
>
>How rapidly ELO converges is  undefined, it is an adjustable parameter
>to the ELO formula.  It can be made to converge at any rate you wish.
>
>In  fact, making  the choice  is a  problem that  NO rating  system is
>immune  to.  If  you converge  rapidly the  system adjusts  quickly to
>rapidly  improving  (or declining)  players.   But  it  won't be  very
>accurate  in the  general case  since if  you lose  a couple  of games
>because your blood sugar was low  that day, the rating will punish you
>unfairly.   Players will  tend  to be  way  overrated on  one day,  way
>underrated on another day.
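
To make Don's trade-off concrete, here is a minimal sketch. It assumes the
common logistic form of the Elo curve on a 400-point scale; the function
names and the 1500/1700 example numbers are made up for illustration and are
not taken from any particular federation's rules. It counts how many games a
1500-rated player who really plays at 1700 needs before the rating gets
within 50 points of the truth, for several values of the factor K.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B (assumed logistic curve, 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update(rating, opponent, score, k):
    """One Elo update: move the rating by k * (actual score - expected score)."""
    return rating + k * (score - expected_score(rating, opponent))


def games_to_converge(k, start=1500.0, opponent=1500.0,
                      true_strength=1700.0, tolerance=50.0, max_games=100000):
    """Count games until the rating is within `tolerance` of the true strength.

    To keep the sketch deterministic, every game is scored with the player's
    long-run average result instead of a random win or loss.
    """
    average_score = expected_score(true_strength, opponent)
    rating = start
    for games in range(max_games + 1):
        if abs(rating - true_strength) < tolerance:
            return games
        rating = update(rating, opponent, average_score, k)
    return max_games


if __name__ == "__main__":
    for k in (8, 16, 32):
        print(k, games_to_converge(k))
    # The price of a large k: one loss to an equal opponent costs k / 2 points.
    print(update(1500.0, 1500.0, 0.0, 32))
```

A larger K gets there in fewer games, but the same K also makes a single bad
day more expensive, which is exactly the choice described above.
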
>
>Deflation and inflation  are also something no rating  system I know of
>is immune to.   The only way to  prevent this is if you  can prove and
>quantify how much  of it has happened, and  make artificial (presumably
>very slow and gradual) adjustments to the rating system.
>
>There are several reasons inflation and deflation happen.  ELO assumes
>there is  a fixed number of points  in the system once  someone has an
>established rating.  If  you start out as a beginner  with a rating of
>1200, then  in 10  years become  a master (at  2200 points),  you have
>taken 1000 rating  points from the system.  A fraction  of a point was
>taken  from each  rated player  in  order for  you to  get those  1000
>points.  So  even if no one  else has gotten weaker,  you have lowered
>the average rating of the other players just a bit.  The United States
>Chess Federation (which uses an ELO  based system too) used to make up
>for this by awarding bonus points to players for various reasons, like
>having an exceptionally good tournament.   I have no idea what they do
>now but several years ago this  caused a great INFLATION and had to be
>changed.
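
The "fixed number of points" argument can be checked with a small simulation.
This is only a sketch under simplifying assumptions: every player uses the
same K, there are no draws, the improving player jumps to 2200 strength at
once instead of over 10 years, and the logistic 400-point curve is assumed.
All names and numbers are hypothetical.

```python
import random


def expected_score(rating_a, rating_b):
    """Expected score of A against B (assumed logistic curve, 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def play(ratings, strengths, a, b, k, rng):
    """Play one game; the winner is drawn from the hidden true strengths."""
    a_wins = rng.random() < expected_score(strengths[a], strengths[b])
    score_a = 1.0 if a_wins else 0.0
    delta = k * (score_a - expected_score(ratings[a], ratings[b]))
    ratings[a] += delta  # whatever A gains ...
    ratings[b] -= delta  # ... B loses, so the pool total stays the same


def simulate(num_others=50, games=2000, k=32.0, seed=0):
    rng = random.Random(seed)
    # Index 0 is the improving player: rated 1200 but already playing at 2200.
    ratings = [1200.0] + [1500.0] * num_others
    strengths = [2200.0] + [1500.0] * num_others
    total_before = sum(ratings)
    for _ in range(games):
        opponent = rng.randrange(1, num_others + 1)
        play(ratings, strengths, 0, opponent, k, rng)
    others_mean = sum(ratings[1:]) / num_others
    return ratings[0], others_mean, total_before, sum(ratings)


if __name__ == "__main__":
    newcomer, others_mean, before, after = simulate()
    print("newcomer rating:", round(newcomer))
    print("mean of the others:", round(others_mean, 1))
    print("pool total before/after:", round(before, 6), round(after, 6))
```

The pool total does not move, so every point the improving player gains shows
up as a slightly lower average for players who have not become any weaker.
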
>
>But  inflation and deflation  happen constantly.   If a  player quits
>playing the game  and he happens to be overrated at  the time (had some
>"lucky" tournaments  and then  quits) then he  has stolen  some points
>from the  system.  If a player  has a couple  of "unlucky" tournaments
>and quits playing out of discouragement, he has left the excess rating
>points as a gift to the  rest of the players.  The amount of inflation
>in this  case is  the difference between  his "true strength"  and his
>actual rating.
>
>I can't imagine  any other system that doesn't  have these problems, I
>think that ELO  is such a good system that the  problem is much easier
>to see because it is measured quite concisely by an ELO number.
>
>One attempt to avoid these types of problems is to base playing ability
>on some easily defined standard  which is either self adjusting or can
>be easily  and incrementally  adjusted (artificially) over  time.  For
>instance you can  define a rating system where the  top N players must
>have an average rating of  some arbitrary value and make constant small
>adjustments based on that.  But that doesn't have anything to do with
>inflation  and deflation,  it's  just hiding  the  problem.  In  fact,
>everyone's rating will be based on  how strong the top N players happen
>to be at any point in time  and how rapidly the system has adjusted to
>it.
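
A rough sketch of such a top-N anchor, with a hypothetical function name and
made-up numbers: every rating is shifted by a fraction of the gap between the
chosen target value and the current average of the top N players.

```python
def anchor_to_top_n(ratings, n, target, step=0.1):
    """Shift all ratings by `step` times the gap to the target top-n average."""
    top = sorted(ratings, reverse=True)[:n]
    gap = target - sum(top) / len(top)
    return [rating + step * gap for rating in ratings]


if __name__ == "__main__":
    pool = [2600.0, 2500.0, 2400.0, 1500.0]
    # With step=1.0 the whole gap is closed in one go.
    print(anchor_to_top_n(pool, n=2, target=2600.0, step=1.0))
```

Because everyone is shifted by the same amount, the differences between
players stay the same and only the origin of the scale moves. The numbers
then follow whatever the top N happen to be, as the quoted paragraph says.
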
>
>I think the GO ranking system is basically like this, but I admit that
>I don't  understand it.   I would appreciate  it if someone  went into
>some detail on the Go ranking  system, and if it's believed that it is
>immune  to  deflation and  inflation  explain  how  that can  be.   My
>understanding is that it is based more on tradition than mathematics.
>
>But  no matter  what  system you  use,  I don't  believe  there is  an
>objective  way to  measure or  compare one  great player  of  today to
>another great player from hundreds  of years ago.  If there were, then
>you  can convince  me that  GO is  immune to  inflation/deflation.  In
>Chess  an  attempt  was made  to  measure  the  old great  players  by
>assigning ELO ratings  to them.  You can move  backwards in time based
>on players with known ratings to rate players with unknown ratings but
>who  played each  other.  Once  you attach  a rating  to one  of these
>players, you  can work  your way backwards  in time comparing  them to
>other players they had played and  so on.  But this is pretty dicey at
>best.  Of course you can look  at the games and draw conclusions based
>on this,  but the  conclusions are going  to be mostly  subjective and
>likely very emotional.  Unless there is a clear superiority, you won't
>be able  to say for sure  and you certainly  won't be able to  say HOW
>MUCH better with any accuracy.
>
>I guess what  I'm saying is that I don't believe  the GO rating system
>is immune to deflation/inflation any more than the ELO system is.
>
>- Don
>_______________________________________________
>computer-go mailing list
>computer-go@xxxxxxxxxxxxxxxxx
>http://www.computer-go.org/mailman/listinfo/computer-go/
>
>