
Re: computer-go: Discarding the rubbish moves



On Wed, Jan 15, 2003 at 11:27:21AM -0600, Richard Brown wrote:
> Darren Cook wrote:
> > Heikki Levanto wrote:
> > > Anyway, in any given go position, there are a small number of
> > > reasonable moves that can be considered, and a great number of
> > > totally stupid moves that no human would even consider.
> 
> > > Any go
> > > program must be able to discard the 90% of rubbish moves quickly.
> 
> Yes, this is extremely important.  Strong programs of the future will
> use such a razor in an early phase. I believe that the actual percentage
> will be higher.  I'll go out on a limb and predict 97%.

This would be extremely handy, but it may not be achievable within the
timeline in which programs based on other approaches succeed. In
computer Chess, the usual shining example of a
successful software technology for game playing, my understanding is
that many people have tried really hard over the years to do such
classification, but the successful programs have largely given up on it.

I guess if you wanted to, maybe you could claim that the move ordering
which results from heuristic evaluations and iterative deepening in
alpha-beta search is a reflection of this principle at work in
computer Chess. But even then, I think it'd be hard to come up with
anything like a 97% figure for top computer Chess programs. So if you
want to show that 97% is the right figure for computer Go, you should
probably have some argument why computer Go follows different
principles from computer Chess.
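To make the alpha-beta point concrete, here is a toy sketch (not a Go or
Chess engine): negamax alpha-beta over a Nim-like take-away game, with
candidate moves ordered by a trivial heuristic. The game, heuristic, and
function names are all invented for illustration. The point is that
ordering only *sorts* candidates; alpha-beta then cuts off the late ones
cheaply, which is quite different from discarding 97% of moves up front.

```python
def moves(pile):
    # In this toy game you may remove 1, 2, or 3 stones.
    return [m for m in (1, 2, 3) if m <= pile]

def negamax(pile, alpha=-1, beta=1):
    # Scores are from the point of view of the player to move:
    # +1 = win, -1 = loss.  The player who cannot move loses.
    if pile == 0:
        return -1
    # Heuristic ordering: try larger removals first (purely illustrative).
    for m in sorted(moves(pile), reverse=True):
        score = -negamax(pile - m, -beta, -alpha)
        if score > alpha:
            alpha = score
        if alpha >= beta:
            break  # beta cutoff: the remaining moves are never examined
    return alpha

# With 4 stones every reply leaves the opponent a win, so the side
# to move loses; with 5 stones the side to move wins.
```

Good ordering makes the cutoffs come earlier, so moves low in the
ordering are often never searched at all -- but they were never
*classified away*, and a different position can bring them right back.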

(I think it does follow different principles in several ways, but 
that doesn't mean we can't learn from the history of computer Chess.)

> > > Here a simple move evaluation could make sense. (especially since
> > > the same move may be totally irrelevant or the best possible,
> > > depending if there is anything more valuable around...)
> 
> There is a clear benefit from a razor which, for any potential move,
> correctly places it into one of two classes:
> 
>         o-  Moves that no reasonable player would ever even consider.
> 
>         o-  Other.
> 
> For some suitable definition of "reasonable".
> 
> The "Other" category comprises ~3%, says I.

In my (admittedly largely unsuccessful so far) work in computer Go,
the only large class of moves I've been able to classify automatically
and reliably as "never even consider" is opening moves which are far
from any other stones and not on the third or fourth line.
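That one razor is simple enough to write down. Here is a sketch, with
the distance threshold (Chebyshev distance 4) and all names invented for
illustration, on a 0-based 19x19 board:

```python
# Sketch of the only reliable razor described above: in the opening,
# discard moves that are far from every stone AND not on the third or
# fourth line.  The "far" threshold of 4 is an arbitrary choice.

def line(x, y, size=19):
    # Distance from the nearest edge, counted from 1 as Go players do.
    return min(x, y, size - 1 - x, size - 1 - y) + 1

def worth_considering(move, stones, size=19):
    x, y = move
    if line(x, y, size) in (3, 4):
        return True
    # Keep any move within Chebyshev distance 4 of an existing stone.
    return any(max(abs(x - sx), abs(y - sy)) <= 4 for sx, sy in stones)
```

On an empty board this keeps only the third- and fourth-line points;
once stones appear, their neighborhoods come back into play. Note how
little it actually discards once the board fills up.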

This is one of those places where my net-trained reflex is to admit
that YMMV. But your mileage would have to vary a *lot* to get to 3%.
As an AGA 3 dan looking at pro games, I don't feel sure I can quickly
predict that the next move will be in one of <10 places, which is what
I'd expect if I were automatically classifying all but 3% (of moves on
a 19x19 board) as not worth considering. I doubt computers are going
to do better than human 3 dans on this until computers play, overall,
much much better than 3 dans. (I appeal to the computer chess history,
and general engineering good sense, to justify this claim. We can
teach computers to do sophisticated classification, but usually what
they really shine at is exploring many simpler classifications very
fast.)

> Assuming that one has a suitable metric, then various techniques can
> be used (their success depends on the nature of the data).  Neural nets,
> "linear classifiers", "non-linear classifiers" [just Googling for these
> last two can be enlightening] and my current favorite, a little thing
> called the "k nearest neighbor" rule.  "Hopfield networks", "Boltzmann
> machines", "Voronoi diagrams", "convex hulls", and "Delaunay
> triangulations"
> are left as an exercise for the reader who hasn't fallen asleep yet.

As a human 3 dan, I can very quickly discard many moves even in the
most complicated middlegame situations. (Maybe 50% of moves?) A few of
them are locally pointless, like various moves on the first line, or
moves inside enclosed spaces which simply reduce one's own eyespace.
But most of them aren't locally pointless; it's just that whatever
they might accomplish locally is done better by another move, or what
they accomplish is irrelevant on a larger scale (like making another
eye for a group which has three eyes already). AFAIK that kind of
dominated-by-something-else criterion is rather difficult for the
neural-net-like classification systems you discuss, and the
recognition that a move's benefits are globally irrelevant is
extremely difficult.
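To see why, it helps to look at what a k-nearest-neighbor move
classifier of the kind quoted above actually consumes. The sketch below
uses invented features and invented training data purely for
illustration; the limitation is visible in the setup itself -- nothing
in the local feature vector can see the rest of the board, so
"dominated by a better move elsewhere" is invisible to it.

```python
from collections import Counter

def knn_classify(features, training, k=3):
    """training: list of (feature_vector, label) pairs.
    Returns the majority label among the k nearest examples."""
    def dist(a, b):
        # Squared Euclidean distance between feature vectors.
        return sum((u - v) ** 2 for u, v in zip(a, b))
    nearest = sorted(training, key=lambda ex: dist(ex[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Invented local features: (own stones nearby, enemy stones nearby, line).
training = [
    ((0, 0, 1), "rubbish"),   # lone first-line move
    ((2, 3, 3), "consider"),  # contact fight on the third line
    ((0, 0, 3), "consider"),
    ((3, 0, 2), "rubbish"),   # adding to an already-settled group
]
```

The classifier can learn "locally pointless" well enough, but a move
whose local features look fine and whose benefit is globally irrelevant
gets the same label either way.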

After that, I often discard all moves in entire regions, because
relatively shallow analysis shows that those regions are more settled
than the important battles elsewhere. By the time I do this, maybe I
have discarded 90% or more of the moves on the board. But note that
the process involves feeding a lot of high level global knowledge back
into the local classification. So if you're looking for algorithms
which might be as effective as this, I wouldn't look at just local
classification, but instead try algorithms like those used for speech
recognition, where there are various techniques to resolve ambiguities
in local pattern recognition by using higher level information, like
the way that the sentence structure is parsed.
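One very rough sketch of that feedback loop, with every number and name
invented: local pattern scores for candidate moves are re-weighted by a
global "urgency" estimate per region, so a locally plausible move in a
settled region is discarded while a mediocre-looking move in the hot
region survives.

```python
def prune(candidates, region_urgency, keep_fraction=0.1):
    """candidates: list of (move, region, local_score) triples.
    region_urgency: dict mapping region -> global urgency weight.
    Returns the surviving moves after global re-weighting."""
    scored = [(local * region_urgency[region], move)
              for move, region, local in candidates]
    scored.sort(reverse=True)
    keep = max(1, int(len(scored) * keep_fraction))
    return [move for _, move in scored[:keep]]
```

The hard part, of course, is the `region_urgency` estimate itself,
which is exactly the "relatively shallow analysis" plus high-level
judgment described above; the sketch only shows where such an estimate
would plug in.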
 
> I'm still trying to get my head around the fact that it is possible to
> extract an _infinite_ amount of information from a finite amount of
> data.
> 
> Ah, well, the nice thing about being an amateur mathematician is that
> I have nothing to prove.

Playing one's Go program against any opponent, or just feeding it
tsume-go problems for that matter, may not *prove* anything, but it
can be terribly eye-opening. :-|

-- 
William Harold Newman <william.newman@xxxxxxxxxxxxxxxxx>
"That's our advantage at Microsoft; we set the standards and we can
change them."
 -- Karen Hargrove, Microsoft (quoted in Feb 1993 Unix Review editorial)
PGP key fingerprint 85 CE 1C BA 79 8D 51 8C  B9 25 FB EE E0 C3 E5 7C