
Re: [computer-go] Statistical Significance



Someone said this:

  Elo ranking is quite slow to converge. Furthermore it has a tendency
  to deflate. That is why it is not used in Go in general.


How rapidly ELO converges is not fixed by the system; it is an
adjustable parameter of the ELO formula (the K factor). It can be made
to converge at any rate you wish.
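
To make the point about the adjustable rate concrete, here is a
minimal sketch of the usual ELO update. It assumes the common logistic
expected-score curve on a 400-point scale; the function names and the
particular ratings (1200, 1500, 1600) are only for illustration. It
counts how many straight wins an underrated player needs to climb to a
target rating for different values of K.

```python
def expected_score(rating, opponent):
    """Expected score under the usual logistic curve (400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def update(rating, opponent, score, k):
    """One rating update; score is 1 for a win, 0.5 for a draw, 0 for a loss."""
    return rating + k * (score - expected_score(rating, opponent))


def games_to_reach(start, target, opponent, k):
    """Count the consecutive wins needed to climb from start to target."""
    rating, games = start, 0
    while rating < target:
        rating = update(rating, opponent, 1.0, k)
        games += 1
    return games


if __name__ == "__main__":
    # Illustrative numbers: a 1200-rated player who keeps beating
    # 1600-rated opposition, tracked until the rating reaches 1500.
    for k in (8, 16, 32, 64):
        print(k, games_to_reach(1200.0, 1500.0, 1600.0, k))
```

A larger K gets the player to the target in fewer games, which is all
"fast convergence" means here.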

In fact, making that choice is a problem that NO rating system is
immune to. If you converge rapidly, the system adjusts quickly to
rapidly improving (or declining) players. But it won't be very
accurate in the general case: if you lose a couple of games because
your blood sugar was low that day, the rating will punish you
unfairly. Players will tend to be way overrated on one day and way
underrated on another.
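
The cost of a fast rate can be sketched the same way. The snippet
below assumes the common logistic expected-score formula and an
illustrative 1500-rated player who drops two games to an equally rated
opponent on a bad day; the names and the K values are made up for the
example.

```python
def expected_score(rating, opponent):
    """Expected score under the usual logistic curve (400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def after_losses(rating, opponent, losses, k):
    """Rating after a run of straight losses to the same opponent."""
    for _ in range(losses):
        rating += k * (0.0 - expected_score(rating, opponent))
    return rating


if __name__ == "__main__":
    for k in (16, 64):
        drop = 1500.0 - after_losses(1500.0, 1500.0, 2, k)
        print(f"K={k}: two bad games cost {drop:.1f} points")
```

With K=16 the two losses cost a little under 16 points; with K=64 they
cost a little over 60, even though the player's real strength has not
changed at all.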

Deflation and inflation are also something no rating system I know of
is immune to. The only way to prevent them is to prove and quantify
how much has happened, and then make artificial (presumably very slow
and gradual) adjustments to the rating system.

There are several reasons inflation and deflation happen. ELO assumes
there is a fixed number of points in the system once someone has an
established rating. If you start out as a beginner with a rating of
1200, then in 10 years become a master (at 2200 points), you have
taken 1000 rating points from the system. A fraction of a point was
taken from each rated player in order for you to get those 1000
points. So even if no one else has gotten weaker, you have lowered
the average rating of the other players just a bit. The United States
Chess Federation (which uses an ELO-based system too) used to make up
for this by awarding bonus points to players for various reasons, like
having an exceptionally good tournament. I have no idea what they do
now, but several years ago this caused a great INFLATION and had to be
changed.
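
The "fixed number of points" argument can be checked with a small
simulation. This is only a sketch under stated assumptions: both
players in a game use the same K (so every update is zero-sum), the
expected score is the common logistic formula, and the pool size,
starting ratings, and the name "improver" are invented for the
example.

```python
def expected_score(rating, opponent):
    """Expected score under the usual logistic curve (400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))


def play(ratings, winner, loser, k=32.0):
    """Zero-sum update: the winner gains exactly what the loser gives up."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


def simulate(pool_size=50, rounds=20):
    """An improving beginner keeps beating an established pool."""
    ratings = {"improver": 1200.0}
    for i in range(pool_size):
        ratings[f"p{i}"] = 1500.0
    total_before = sum(ratings.values())
    for _ in range(rounds):
        for i in range(pool_size):
            play(ratings, "improver", f"p{i}")
    gain = ratings["improver"] - 1200.0
    pool_avg = (sum(ratings.values()) - ratings["improver"]) / pool_size
    return {
        "total_before": total_before,
        "total_after": sum(ratings.values()),
        "improver_gain": gain,
        "pool_avg_drop": 1500.0 - pool_avg,
    }


if __name__ == "__main__":
    print(simulate())
```

The total never changes, so every point the improving player gains
shows up as a small drop in the average of everyone else, although
none of them has gotten weaker.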

But inflation and deflation happen constantly. If a player quits
playing the game and he happens to be overrated at the time (he had
some "lucky" tournaments and then quit), then he has stolen some
points from the system. If a player has a couple of "unlucky"
tournaments and quits playing out of discouragement, he has left the
excess rating points as a gift to the rest of the players. The amount
of inflation in this case is the difference between his "true
strength" and his actual rating.

I can't imagine any other system that doesn't have these problems. I
think ELO is such a good system that the problem is simply much easier
to see, because it is measured quite concisely by an ELO number.

One attempt to avoid these types of problems is to base playing
ability on some easily defined standard which is either self-adjusting
or can be easily and incrementally adjusted (artificially) over time.
For instance, you can define a rating system where the top N players
must have an average rating of some arbitrary value and make constant
small adjustments based on that. But that doesn't have anything to do
with inflation and deflation; it's just hiding the problem. In fact,
everyone's rating will be based on how strong the top N players happen
to be at any point in time and how rapidly the system has adjusted to
it.
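
A top-N anchoring scheme of the kind described above might look
something like this. It is a hypothetical sketch, not any real
federation's method: the function name, the `step` parameter (the
fraction of the gap corrected per adjustment), and the sample ratings
are all assumptions made for illustration.

```python
def anchor_top_n(ratings, n, target, step=1.0):
    """Shift every rating so the top-n average moves toward target.

    step is the fraction of the gap closed in one adjustment; a small
    value gives the slow, gradual correction described in the text.
    """
    top = sorted(ratings.values(), reverse=True)[:n]
    shift = step * (target - sum(top) / len(top))
    return {name: rating + shift for name, rating in ratings.items()}


if __name__ == "__main__":
    pool = {"a": 2600.0, "b": 2500.0, "c": 1500.0}
    print(anchor_top_n(pool, n=2, target=2600.0))            # full correction
    print(anchor_top_n(pool, n=2, target=2600.0, step=0.1))  # gradual
```

Everyone moves by the same amount, so the differences between players
are untouched. The scale is simply pinned to however strong the top N
happen to be, which is why it hides drift rather than measuring it.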

I think the GO ranking system is basically like this, but I admit that
I don't understand it. I would appreciate it if someone went into
some detail on the Go ranking system and, if it's believed to be
immune to deflation and inflation, explained how that can be. My
understanding is that it is based more on tradition than mathematics.

But no matter what system you use, I don't believe there is an
objective way to measure or compare one great player of today to
another great player from hundreds of years ago. If there were, then
you could convince me that GO is immune to inflation/deflation. In
Chess an attempt was made to measure the old great players by
assigning ELO ratings to them. You start from players with known
ratings and use them to rate the players with unknown ratings whom
they played. Once you attach a rating to one of these players, you
can work your way backwards in time, comparing them to other players
they had played, and so on. But this is pretty dicey at best. Of
course you can look at the games and draw conclusions based on them,
but the conclusions are going to be mostly subjective and likely very
emotional. Unless there is a clear superiority, you won't be able to
say for sure, and you certainly won't be able to say HOW MUCH better
with any accuracy.

I guess what  I'm saying is that I don't believe  the GO rating system
is immune to deflation/inflation any more than the ELO system is.

- Don
_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/