
Re: [computer-go] future KGS Computer Go Tournaments - two sections?



On 11, May 2005, at 11:48 AM, Jim O'Flaherty, Jr. wrote:

> How many games would you need to play A versus B in order to
> come to an acceptable (high?) degree of confidence about the
> percentage of time A wins over B?

"Acceptable" and "high" are relative to the person asking the question.
We will often run a set of games and calculate a 95% confidence
interval, then run another set of games and get a 95% confidence
interval that does not even overlap the first. Our solution is MORE GAMES.
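
For anyone who wants to reproduce that kind of interval, here is a minimal
sketch. It assumes each game is an independent win/loss trial with a fixed
win probability and uses the Wilson score interval; the function name
wilson_interval is only for illustration, and this is not necessarily the
calculation we used.

```python
import math

def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win rate.

    Assumes independent games with a constant win probability.
    z=1.96 corresponds to roughly 95% coverage.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    p_hat = wins / games
    z2 = z * z
    denom = 1 + z2 / games
    center = (p_hat + z2 / (2 * games)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / games
                         + z2 / (4 * games * games)) / denom
    return center - half, center + half

# Example: A wins 50 of 100 games against B.
low, high = wilson_interval(50, 100)
print(f"95% interval for A's win rate: {low:.3f} to {high:.3f}")
```

Even at 100 games the interval is nearly 20 percentage points wide, so two
small batches can easily give quite different pictures.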

My experience is that if you are trying to get this answer for
programs that differ by less than 2 stones, and you just do the simple
thing and have the programs play even, the number is around
200 games. Even then I think the results are somewhat difficult to
interpret.
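
For comparison, the textbook sample-size estimate can be sketched as below.
It assumes independent games and the normal approximation to the binomial,
and the function name games_needed is made up for this example. The 200
figure above comes from experience, not from this formula.

```python
import math

def games_needed(margin, p=0.5, z=1.96):
    """Games needed so the interval half-width is about `margin`.

    Uses n = z^2 * p * (1 - p) / margin^2 (normal approximation).
    p=0.5 is the worst case; z=1.96 is roughly 95% confidence.
    """
    if not 0 < margin < 1:
        raise ValueError("margin must be between 0 and 1")
    return math.ceil(z * z * p * (1 - p) / (margin * margin))

# Half-widths of 10, 7, and 5 percentage points.
for m in (0.10, 0.07, 0.05):
    print(f"+/- {m:.0%}: about {games_needed(m)} games")
```

Under these assumptions, about 200 even games pins the win rate down to
roughly plus or minus 7 percentage points, which is still a loose answer
when the programs are close.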

What I favor at this time is something suggested to us by Doug Ridgway.
In addition to playing a large number of even games, a number of games
should also be played at different handicaps in both directions. If the
programs are supposed to be close, I would suggest handicaps out to
at least 4 stones. If you plot all the scores against the handicap, the trend
crosses the handicap axis (zero score) at the relative strength. It will still
take about 200 games, but I think the result is better. You should expect the
scatter to be wide.

Here are links to two such plots:

http://dridgway.com/Go/sluggo_vs_MFG2.pdf
http://dridgway.com/Go/sluggo_vs_MFG3.pdf
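
The idea behind such a plot can be sketched in a few lines. This assumes a
straight-line trend fitted by ordinary least squares, which is my
simplification and not necessarily how the plots above were made. The sign
convention (positive handicap means stones given to A, score is A's margin)
and the sample numbers are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matching points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all handicaps are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def handicap_at_zero_score(handicaps, scores):
    """Handicap at which the fitted line predicts an even result."""
    slope, intercept = fit_line(handicaps, scores)
    if slope == 0:
        raise ValueError("score does not depend on handicap")
    return -intercept / slope

# Invented data: one (handicap, score) pair per game.
handicaps = [-4, -2, 0, 0, 2, 4]
scores = [-30.5, -8.5, 12.5, 20.5, 35.5, 52.5]
print(f"estimated crossing: {handicap_at_zero_score(handicaps, scores):.2f} stones")
```

With this convention a negative crossing means A is the stronger program
by about that many stones. With real games the scatter is wide, so the
crossing is only as good as the number of games behind it.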

Cheers,
David


_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/