computer-go: Gozilla -- New Kid on the Block



Gozilla is not yet ready for prime-time, but the high quality of the
recent discussions here has prompted this attempt to articulate at
least an overview of our program.  We hope to supplement this with
additional detail as time permits, especially if our results continue
to be encouraging.

Our approach so far has been to concentrate on a "global move
suggester", a component which selects candidate moves for lookahead.
Perhaps even more important, it rejects those moves which it is able
to classify as garbage.

Tactical searches and lookahead itself are beyond the scope of this
article.

In fact, all go knowledge _per se_ is beyond its scope.  That's
because the global move suggester -- we call it the "strategist" --
works via applied pattern recognition, and the techniques we employ
may be used just as well on a very wide range of data types.  In our
case, the data happens to be the observed behavior of professional go
players, but it might equally well be any other collection of
numerical measurements, such as an image (e.g., a sequence of bits,
one per pixel), a vector of weather data, an electrocardiogram, a
fingerprint, or a signature on a check (suitably digitized, of
course).

Pattern recognition is all about guessing or predicting the unknown
nature of an observation:  some discrete quantity such as black or
white, one or zero, sick or healthy, real or fake.  Garbage, or not
garbage. [1]

For the sake of simplicity, let us be concerned for the time being
with only two classes:  Those moves which no reasonable player would
ever play, and -- others.  What we propose to do, then, is to
construct a classifier which, given an arbitrary potential go
behavior, heretofore unseen, will classify that behavior either as
GARBAGE, or as not-GARBAGE.  The point of doing this is, naturally,
to trim the search space.  [Oops!  Scope violation.]
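
To make the trimming concrete, here is a minimal sketch of how such a
filter sits in front of lookahead.  It is illustrative only:  the
names "extract_features" and "legal_moves", and the classifier's
"predict" interface, are placeholders invented for this example, not
a description of Gozilla's internals.

    GARBAGE, NOT_GARBAGE = 0, 1

    def candidate_moves(board, classifier, extract_features):
        """Trim the search space:  keep only those legal moves which
        the classifier does not reject as GARBAGE."""
        keep = []
        for move in board.legal_moves():        # placeholder API
            x = extract_features(board, move)   # d-dimensional vector
            if classifier.predict(x) != GARBAGE:
                keep.append(move)
        return keep   # lookahead sees only this (much shorter) list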

Exact details of the creation of the classifier are a state secret,
but it suffices for our immediate purposes here to describe it in a
more general way.

Essentially, what happens is this:  Professional games are observed,
and certain measurements are taken from the board state at each move.
[The most difficult part of this is, of course, deciding what to
measure!]  With each behavior indulged in by a pro -- that is, every
move of every game that is observed -- new measurements are compared
with old, and the program's "idea" of just what sort of behavior pros
do, and do _not_ indulge in, grows and changes.  The process of
constructing the classifier is what's known in the literature as
learning, supervised learning, or learning with a teacher.
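
Since the real construction is secret, a generic stand-in may help
readers unfamiliar with the vocabulary.  The textbook example of
supervised learning is the 1-nearest-neighbor rule:  each training
pair (x, y) arrives labeled by the "teacher", and a new observation
is classified like the most similar observation seen so far.  We
stress that this sketch is not Gozilla's method, and it leaves open
where the GARBAGE-labeled training examples would come from.

    import math

    class NearestNeighbor:
        """Textbook 1-nearest-neighbor classifier (illustration only)."""

        def __init__(self):
            self.examples = []          # (x, y) pairs seen so far

        def learn(self, x, y):
            """Supervised learning:  the teacher supplies the true
            class y along with each observed measurement vector x."""
            self.examples.append((x, y))

        def predict(self, x):
            """Guess the class of a new, heretofore unseen x to be
            the class of the nearest stored observation."""
            nearest_x, nearest_y = min(
                self.examples,
                key=lambda pair: math.dist(pair[0], x))
            return nearest_y

Calling learn() once per observed pro move is precisely the "grows
and changes" loop described above, if crudely.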

Now, even if the classifier is imperfect, that's o.k. -- a few good
moves will get classified as GARBAGE, and vice-versa.  But we can't
just dismiss it on those grounds.  As more and more games are
observed, the classifier becomes better.  That is to say -- and this
seems to be a very important, encouraging result -- the probability
that it will make such a misclassification error is decreasing.
Yay!
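
In the standard terminology (see footnote [1] for the notation), this
is the claim that the learned rule's error probability approaches the
best achievable error as the number n of training samples grows.  A
minimal formal statement -- this is the usual definition of a
consistent rule, not a property we have proved for Gozilla's
classifier:

    % g_n is the classifier learned from n labeled samples; L* is the
    % Bayes risk, the smallest error probability any classifier can
    % achieve on the problem.
    \Pr\{\, g_n(X) \neq Y \,\} \;\longrightarrow\; L^{*}
    \qquad \text{as } n \to \infty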

We don't generate millions of random moves, because we don't think
that the program can learn anything useful from that sort of
exercise.  Instead, we have the program observe good play, and let it
learn how to emulate that, mostly by _not_ indulging in playing moves
that it has learned are GARBAGE.

Attached is a short game (33 moves) that Gozilla (our program) played
against Viking on NNGS.  Gozilla lost on time (primarily because of
excruciatingly detailed debug logging).  Move number 25 at o5 struck
us as being very pretty -- it almost made us proud -- but move number
29 at e3 appears to be one of those misclassifications, as described
above.

Oh, well.  Hope springs eternal.

Rich

--
[1]  Formally, an _observation_ is a d-dimensional vector "x", and
the unknown nature of the observation is a _class_, denoted by "y".
The class "y" takes values in a finite set  { 1, 2, ..., M },  each
element of which represents one of the mutually exclusive discrete
natures that an observation might have.  [In each of the given
examples, M==2.]  In pattern recognition, one creates (by some means
or other) a function

                   g :  R^d  -->  { 1, 2, ..., M }

which represents one's guess of y, given x.  The mapping g (from the
reals in d dimensions, onto the set of discrete natures) is called a
classifier.  A classifier is said to "err on x" if g(x) does not, in
fact, correctly predict the true nature y.

-- 
Richard L. Brown       Office of Information Services
                       University of Wisconsin System Administration
                       780 Regent St., Rm. 246
rbrown@xxxxxxxxxxxxxxxxx        Madison, WI  53715

Attachment: Viking_v_Gozilla.sgf
Description: application/go-sgf