Re: computer-go: GoeMate is now 1st in 13x13-CGoT
This is a little philosophical, but here is what I have observed when
people look at games to judge the strengths of the players.
Even when strong players give opinions about playing strengths based
on their own judgements, it seems to go awry. Humans are pretty good
at saying which moves are good and which are bad or what the
weaknesses and strengths of a player are, but they seem to be terribly
bad at actually judging the quality of the player. The fact of the
matter is that humans are just too biased to do this well.
I have seen this in computer chess all too often. A human is too
impressed by a great move, OR too critical of a weak move. I once
spent an evening (several years ago) with 2 masters who played games
against 2 popular chess computers, for several hours. They were
extremely impressed with one program, but did not like the other one,
believing that it was far weaker based on the way it played. When I
explained that they really had it backwards, they wouldn't believe me.
We ended up playing several games just between the 2 computers so I
could prove my point. The computer they didn't like was known to be
the best at the time and dominated all the other computers back then.
But it took a long time to convince them otherwise; they just had to
see several wins (the better computer rarely lost, but they kept
believing it was "lucky").
The weaker computer was well known for an active and interesting
playing style and was viewed at the time as the computer that played
most like a human. The stronger computer played relatively boring
chess and was much less likely to play the flashy kinds of moves which
impressed humans. It sometimes played awkward looking moves that
humans would not consider (but the moves were not that bad either.)
I learned a lot from that experience and others like it. Even in the
games I have personally played in tournaments I rarely observe much of
a difference in the quality of moves produced by my opponents, even
though I know there must be. In short, I don't think I could come
very close to guessing their strength based on judging the quality of
their moves. But the end result is the same if there is a significant
rating difference. I will almost always lose to a stronger player and
likewise will usually beat a weaker one. The actual results (which
the ratings are based on) are far more reliable than my judgement.
So I think you have to play tournaments. And Mark is right: if a
program is "clearly stronger", it will show up right away. However,
this is still tricky. A "clearly superior" opponent will very likely
win even a very short match, but that doesn't mean you have very much
empirical evidence of his superiority. Even equal opponents can
produce very lopsided results with a small number of games, so you
still need a few tens of games to demonstrate that a clearly superior
opponent really is the better player.
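
To put rough numbers on this, here is a minimal sketch that treats each
game as an independent coin flip with no draws. That is an assumption
made purely for illustration, and the function names are hypothetical.
It computes how often two exactly equal players would produce a score
at least as lopsided as the one observed.

```python
from math import comb


def prob_at_least(wins: int, games: int, p: float = 0.5) -> float:
    """Probability of scoring at least `wins` out of `games` when each
    game is won independently with probability p (draws ignored)."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins, games + 1)
    )


def prob_lopsided(wins: int, games: int) -> float:
    """Chance that two equal players (p = 0.5) produce a score at least
    as lopsided as wins : (games - wins), in either direction."""
    high = max(wins, games - wins)
    if 2 * high == games:
        # An even split is the least lopsided score possible.
        return 1.0
    return 2 * prob_at_least(high, games, 0.5)


if __name__ == "__main__":
    for wins, games in [(4, 4), (8, 10), (24, 30)]:
        print(f"{wins}-{games - wins}: {prob_lopsided(wins, games):.4f}")
```

Under these assumptions a 4-0 sweep happens between equal players
12.5% of the time and an 8-2 result about 11% of the time, while a
24-6 result happens only about 0.14% of the time. The winning
percentage is the same 80% in the last two cases; only the number of
games makes the evidence convincing.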
Don
From: "Mark Boon" <tesuji@xxxxxxxxxxxxxxxxx>
Date: Thu, 31 Jan 2002 11:07:09 +0100
> -----Original Message-----
> From: owner-computer-go@xxxxxxxxxxxxxxxxx
> [mailto:owner-computer-go@xxxxxxxxxxxxxxxxx]On Behalf Of Michael Reiss
> Sent: Tuesday, January 29, 2002 11:32 AM
> To: computer-go@xxxxxxxxxxxxxxxxx
> Subject: Re: computer-go: GoeMate is now 1st in 13x13-CGoT
>
>
> > Please have a look at the games and tell me your opinion:
> > Does a tournament like this show the real playing strength of the
> > programs or is there still too much of luck and chance?
>
> Playing more games certainly reduces the element of luck. If you look
> at how many different winners there have been in recent computer go
> tournaments in the past couple of years you could either conclude
> that there are five or six programs incredibly close in strength
> or that the contests have too few games to really find the strongest.
> I personally would conclude the latter.
>
The one possibility doesn't exclude the other. If one program had been
clearly stronger than the others, the results would have shown it, even if
there are only a few results.