
[computer-go] 9x9 Ratings



I gave my rating code a bit more thought and realized that it was
quite flawed. Since our programs most likely won't improve after each
game, there is no reason to update the rating after each game using
the old rating and some other information. I therefore tried an
alternative. You may skip this section if you don't care.

I store all results in a vector where each index corresponds to an
opponent rating, 9d to 30k. The results are split into wins and losses
for each rating. I then proceed to try out all possible ratings, 9d
to 30k, and calculate a "badness" value. All games that are lost to a
weaker opponent will be added to this badness value along with all
games won against a stronger opponent. I then simply pick the rating
with the least badness. I tried it for a number of players on kgs
(19x19 games mainly) and it matched the kgs-rating for all except some
9d's. This led me to believe it may very well be a better estimate
than the previous one.
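
A minimal sketch of the procedure described above, in Python. The
language, the names (RANKS, estimate_rank, badness) and the input
format are assumptions for illustration, not the original rating code.
Two details are not stated in the description and are guessed here:
games against an opponent of exactly the candidate rating add no
badness, and ties go to the weakest candidate.

```python
# Rank scale from the mail: 30k (weakest) up to 9d (strongest).
RANKS = [f"{k}k" for k in range(30, 0, -1)] + [f"{d}d" for d in range(1, 10)]


def estimate_rank(games):
    """Pick the rank with the least "badness".

    games: iterable of (opponent_rank, won) pairs, e.g. ("12k", True).
    This input format is hypothetical.
    """
    # One win counter and one loss counter per opponent rank.
    wins = [0] * len(RANKS)
    losses = [0] * len(RANKS)
    for opponent_rank, won in games:
        i = RANKS.index(opponent_rank)
        if won:
            wins[i] += 1
        else:
            losses[i] += 1

    def badness(candidate):
        # Losses to strictly weaker opponents plus wins against
        # strictly stronger ones (assumption: equal rank is neutral).
        lost_to_weaker = sum(losses[:candidate])
        won_vs_stronger = sum(wins[candidate + 1:])
        return lost_to_weaker + won_vs_stronger

    # min() returns the first minimum, i.e. the weakest rank on ties
    # (assumption: the mail does not say how ties are broken).
    best = min(range(len(RANKS)), key=badness)
    return RANKS[best]


if __name__ == "__main__":
    sample = [("13k", True), ("12k", True), ("12k", False), ("11k", False)]
    print(estimate_rank(sample))
```

Every game counts the same here regardless of how far apart the ranks
are, which is how I read the description; weighting by rank difference
would be a different method.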

Here's a new table with ratings based upon 9x9 games played this month
(March 2005). The first set is based solely on games against
players without [-], [?] and [n?]; the second set includes games
against [n?]. Some bots were excluded because of lack of games.


Handle		Rank 1	Games 1	Rank 2	Games 2
-----------------------------------------------
gnugo3pt6	11k	112	11k	199
viking5		14k	110	14k	213
go81		19k	259	20k	522
botnoid		22k	115	18k	223
tslbottest	23k	38	25k	80
fstoned		24k	50	30k	122


Now, some of these results look quite strange. For example, botnoid
goes from 22k to 18k, and tlsbottest from 23k to 25k. But I guess
there's a natural explanation somewhere. :)

This time botnoid proves stronger than tlsbottest, as Don Dailey
suggested. It's also stronger than go81 if unstable players are
included.

It also becomes apparent that viking5 is much stronger than its
predecessor viking4. Viking4 got 21k/22k (81/128 games) during all of
2004, but since some changes were made to the rating system on kgs,
this comparison may be less accurate.

I hope your bots will be playing more frequently on kgs so that my
estimates can be improved (more games -> better estimates). Also, if
you have a bot playing 9x9 that isn't in the table, please let me know
if you want it to be included (and the other way around). :P

I apologise for the lengthy mail.

/Christian
_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/