
Re: [computer-go] future KGS Computer Go Tournaments - two sections?



On May 11, 2005, at 9:01 AM, Don Dailey wrote:


And yes, I would be very pleased to be part of your verification
process, but botnoid is probably too weak for you to mess with. In my
opinion you need to play a long public match against GNU Go, MFoG or
some other strong program on KGS or one of the other servers.
We do this in our own lab. I am wondering what the additional benefit
is for doing it on KGS.
I wasn't sure about WHY you built this thing, I assumed that you were
doing it as a scientist.
True. See:  http://currents.ucsc.edu/04-05/07-26/cluster.html

 Isn't there supposed to be some mechanism
for letting others experimentally verify your results?  This is an
extremely positive and easy way to deal with this.
Well, we are going to publish when I think we have enough data.
That is a little different than letting others verify our results. I am
open to the idea, but the exact mechanism is unclear. You are
right that KGS and other public tournaments are visible, but
there will not be the quantity of games played there that I run in
our lab. But when I have enough data to know SlugGo's strength
relative to MFoG, I will be very happy to find out how we compare
against others, and I would prefer that to take place in a way that
lets us play at least 100 games.

But even if you are not a research scientist there is enormous
positive value to actually doing the experiment publicly, versus
saying you did an experiment in private and "here is what we got."


...                 Games against GNU Go are particularly unfair.
SlugGo's lookahead scheme is too good at predicting what GNU Go
will do (we call it evil twin syndrome) and thus we crush GNU Go far
out of proportion to our actual difference in strength. For other
opponents the lookahead is not as effective because of a low hit rate
on their moves.
Maybe I remember this wrong, but didn't we determine many months ago
that we expected this effect, but it really didn't show up much?
SlugGo can give GNU Go 5 stones and win convincingly and consistently.
I don't have the statistics in front of me now, so I cannot be exact.

Please don't think I'm questioning your results in any way,
Healthy skepticism is never a problem.

 but I just
know I would feel better if I saw it reproduced independently or done
publicly in clear view.  In fact I WANT to see this done and I want to
be able to point to SlugGo when I'm in other discussions about global
search and such.  I need a real example so that I can prove
empirically what I already know to be true, that global search is
feasible in Go.  I won't point to SlugGo (yet) because it's sloppy to
point to unconfirmed results to make a point,
Smart Go has a similar overall scheme, as per my earlier email.

 it's asking to be
humiliated or embarrassed.  It's bad form.  I would have to stick my
neck out (at least a little) and completely trust my reputation to
you.  I won't do that.
And that is a good idea. As our mistake with timing quirks, which led
us to misjudge the strength of MFoG, shows, all of this is preliminary.
When we publish, you should be able to depend upon it; it will have
gone through the review process. But it is not bad form to say that
there are some preliminary results that you are aware of.

You probably think tournaments are good enough,
I think tournaments are interesting social events and are not the
proper measure of a program.

 but aside from the
issue of fairness, which we are discussing, winning a prize in a
tournament just isn't very impressive for a GNU Go clone.  No point
has been made.  And that's part of the trouble with being a GNU Go
derivative.  You really need to prove you are clearly better than
GNU Go.  Is that unreasonable?
Nothing unreasonable. But I still do not understand all of your reasons
for discounting derivative programs. It is my opinion that most advances
are slow and incremental, building upon previous knowledge. With
computer science being so young, experiments that build upon existing
code seem as natural to me as physics experiments at an accelerator
that build upon all the previous instrumentation. One tries to do some
things differently, but many or even most things stay the same.

With respect to fairness, the arguments about statistical fairness in
a tournament I fully accept. I just have some reservations about how
one "fairly" decides when a derivative program has become sufficiently
different from the base code. And at this time I have no idea what to
think about the similarity between SlugGo's play and that of GNU Go.
There are times when we simply take GNU Go's answer with no additional
computation at all, and there are times when our additional
computation, beyond that done by a normal single-threaded GNU Go, is
substantial.

My opinion at this time leans toward measuring how often SlugGo
chooses a move that is not GNU Go's top choice, but I am still not
sure that even that is the right metric ... it just seems like a good
place to start, and it is similar to the earlier suggestion that we
look at a set of problems and require some number of different
responses.
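To make the idea concrete, here is a minimal sketch of that divergence
metric: for each test position, compare the move SlugGo selected with GNU
Go's top-ranked move and report the fraction of positions where they
differ. The move representation (GTP-style coordinate strings) and the
paired-list input are assumptions made for illustration; this is not
taken from SlugGo's actual code.

```python
def divergence_rate(sluggo_moves, gnugo_top_moves):
    """Fraction of positions where SlugGo's move differs from GNU Go's
    top-ranked move. Both lists cover the same positions, in order."""
    if len(sluggo_moves) != len(gnugo_top_moves):
        raise ValueError("move lists must cover the same positions")
    if not sluggo_moves:
        return 0.0
    # Count positions where the two programs disagree.
    differing = sum(1 for s, g in zip(sluggo_moves, gnugo_top_moves)
                    if s != g)
    return differing / len(sluggo_moves)

# Example: SlugGo deviates on 1 of 4 positions -> rate 0.25
rate = divergence_rate(["D4", "Q16", "C3", "R4"],
                       ["D4", "Q16", "C4", "R4"])
print(rate)  # 0.25
```

A rate near zero would suggest the program is still essentially playing
GNU Go's moves; a substantially higher rate would be one piece of
evidence of meaningful divergence from the base code.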


Cheers,
David


_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/