Gozilla is not yet ready for prime time, but the high quality of the recent discussions here has prompted this attempt to articulate at least an overview of our program. We hope to supplement this with additional detail as time permits, especially if our results continue to be encouraging.

Our approach so far has been to concentrate on a "global move suggester", a component which selects candidate moves for lookahead. Perhaps even more important, it rejects those moves which it is able to classify as garbage. Tactical searches and lookahead itself are beyond the scope of this article. In fact, all go knowledge _per se_ is beyond its scope. That's because the global move suggester -- we call it the "strategist" -- works via applied pattern recognition, and the techniques we employ may be used just as well on a very wide range of data types. In our case, the data happens to be the observed behavior of professional go players, but it might equally well be any other collection of numerical measurements, such as an image (i.e., a sequence of bits, one per pixel), a vector of weather data, an electrocardiogram, a fingerprint, or a signature on a check (suitably digitized, of course).

Pattern recognition is all about guessing or predicting the unknown nature of an observation: some discrete quantity such as black or white, one or zero, sick or healthy, real or fake. Garbage, or not garbage. [1] For the sake of simplicity, let us be concerned for the time being with only two classes: those moves which no reasonable player would ever play, and -- others. What we propose to do, then, is to construct a classifier which, given an arbitrary potential go behavior, heretofore unseen, will classify that behavior either as GARBAGE or as not-GARBAGE. The point of doing this is, naturally, to trim the search space. [Oops! Scope violation.]
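To make the two-class filtering idea concrete, here is a minimal sketch in Python. Everything in it is a toy stand-in: the "feature" (distance from the board edge) and the first-line-is-garbage rule are illustrative inventions, not Gozilla's actual measurements or classifier.

```python
# Minimal sketch of a two-class move filter: a classifier labels each
# candidate move GARBAGE or NOT_GARBAGE, and only the latter are passed
# on to lookahead.  The feature and the decision rule are toy stand-ins.

GARBAGE, NOT_GARBAGE = 0, 1

def toy_features(move, board_size=19):
    """Toy measurement: distance of the move from the nearest board edge."""
    col, row = move
    return min(col, row, board_size - 1 - col, board_size - 1 - row)

def toy_classifier(move):
    """Toy rule: first-line moves (distance 0 from the edge) are GARBAGE."""
    return GARBAGE if toy_features(move) == 0 else NOT_GARBAGE

def suggest(candidates):
    """Trim the search space: keep only moves not classified as GARBAGE."""
    return [m for m in candidates if toy_classifier(m) != GARBAGE]

candidates = [(0, 0), (3, 3), (15, 16), (18, 5)]
print(suggest(candidates))  # the two first-line moves are rejected
```

The point is only the shape of the interface: the search sees a shorter candidate list, and everything interesting hides inside the classifier.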
Exact details of the creation of the classifier are a state secret, but it suffices for our immediate purposes here to describe it in a more general way. Essentially, what happens is this: professional games are observed, and certain measurements are taken from the board state at each move. [The most difficult part of this is, of course, deciding what to measure!] With each behavior indulged in by a pro -- that is, with every move of every game that is observed -- new measurements are compared with old, and the program's "idea" of just what sort of behavior pros do, and do _not_, indulge in grows and changes. The process of constructing the classifier is what's known in the literature as learning, supervised learning, or learning with a teacher.

Now, even an imperfect classifier is o.k.: a few good moves will get classified as GARBAGE, and a few garbage moves will slip through as not-GARBAGE. But we can't dismiss the approach on those grounds, because as more and more games are observed, the classifier becomes better. That is to say -- and this seems to be a very important, encouraging result -- the probability that it will make such an error of misclassification is decreasing. Yay!

We don't generate millions of random moves, because we don't think that the program can learn anything useful from that sort of exercise. Instead, we have the program observe good play, and let it learn how to emulate that, mostly by _not_ indulging in playing moves that it has learned are GARBAGE.

Attached is a short game (33 moves) that Gozilla (our program) played against Viking on NNGS. Gozilla lost on time (primarily because of excruciatingly detailed debug logging). Move number 25 at o5 struck us as being very pretty -- it almost made us proud -- but move number 29 at e3 appears to be one of those misclassifications, as described above. Oh, well. Hope springs eternal.

Rich

--
[1] Formally, an _observation_ is a d-dimensional vector "x", and the unknown nature of the observation is a _class_, denoted by "y".
The class "y" takes values in a finite set { 1, 2, ..., M }, each element of which represents one of the mutually exclusive discrete natures that an observation might have. [In each of the given examples, M == 2.] In pattern recognition, one creates (by some means or other) a function

    g : R^d --> { 1, 2, ..., M }

which represents one's guess of y, given x. The mapping g (from the reals in d dimensions, onto the set of discrete natures) is called a classifier. A classifier is said to "err on x" if g(x) does not, in fact, correctly predict the true nature of y.

--
Richard L. Brown
Office of Information Services
University of Wisconsin System Administration
780 Regent St., Rm. 246
Madison, WI 53715
rbrown@xxxxxxxxxxxxxxxxx
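P.S. A toy instance of the footnote's formal setup, for the curious. The nearest-mean rule below is purely illustrative (our actual method remains, as noted, a state secret): observations x are d-dimensional vectors, classes y come from { 1, ..., M }, a "teacher" supplies the true class of each training observation, and the learned g maps R^d onto the class set.

```python
# Toy instance of the formal setup: supervised learning of a classifier
# g : R^d --> {1, ..., M}, here a nearest-mean rule with M == 2, d == 2.

def class_means(observations, labels):
    """Average the training observations belonging to each class."""
    sums, counts = {}, {}
    for x, y in zip(observations, labels):
        s = sums.setdefault(y, [0.0] * len(x))
        for i, xi in enumerate(x):
            s[i] += xi
        counts[y] = counts.get(y, 0) + 1
    return {y: [si / counts[y] for si in s] for y, s in sums.items()}

def make_g(means):
    """Return g, which guesses y for a given x by the nearest class mean."""
    def g(x):
        def dist2(m):
            return sum((xi - mi) ** 2 for xi, mi in zip(x, m))
        return min(means, key=lambda y: dist2(means[y]))
    return g

# Learning with a teacher: observations paired with their true classes.
xs = [(0.0, 0.1), (0.2, 0.0), (1.0, 0.9), (0.9, 1.1)]
ys = [1, 1, 2, 2]
g = make_g(class_means(xs, ys))

print(g((0.1, 0.1)))  # 1
print(g((1.0, 1.0)))  # 2
# g "errs on x" whenever g(x) differs from the true class y of x.
```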
Attachment:
Viking_v_Gozilla.sgf
Description: application/go-sgf