
[computer-go] An [open] question on game tree search theory



Dear All,

I wish to share a problem I have lately been thinking about.

Suppose T is the full game tree of go (with a known komi), and G(t) is
the game-theoretic value of every node t in T. Say, G(t) is either
"black wins", "white wins", or "draw" (consolidating draw, no result,
etc. under one value; the distinction is not important here). Clearly,
if G(t) were effectively computable, we could play go algorithmically
and never lose from a winning position. But this is not the case.

Instead, we have an "evaluation function" for every node t, say, E(t),
with values, say, from the set of rational numbers. Without loss of
generality, we can assume that higher values of E(t) are "better for black".

A standard program for go, chess---or any other zero-sum
full-information game---computes a min-max value of E(t) up to some
search depth (variable and nonuniform in general), and chooses the
continuation move that maximizes the min-maxified value---for black, of
course.
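
To fix notation, here is a minimal Python sketch of the scheme I mean.
The Position interface with a successors() method and the static
evaluation E() are hypothetical placeholders, not any particular
program's API:

def minimax(position, depth, black_to_move, E):
    """Min-maxified value of E at this node, searched to 'depth' plies."""
    children = position.successors()
    if depth == 0 or not children:
        return E(position)   # leaf of the *search* tree: static evaluation
    values = [minimax(c, depth - 1, not black_to_move, E) for c in children]
    # Higher E is better for black, so black maximizes and white minimizes.
    return max(values) if black_to_move else min(values)

def best_move_for_black(position, depth, E):
    """Choose the successor whose min-maxified value is highest."""
    return max(position.successors(),
               key=lambda c: minimax(c, depth - 1, False, E))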

If E is just a random number attached to every node in the tree, search is meaningless---any min-max value over the tree is just a min-max value of random numbers, and hence worthless.

On the other hand, a perfect evaluation function E would be one for which E(t) >= E(t') implies G(t) >= G(t'), with the ordering "black wins" > "draw" > "white wins". With this kind of function E no search would be necessary: we could simply evaluate all the successor positions and choose the one with the highest evaluation score.
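
In the same sketch notation (again with the hypothetical successors()
and E), the corresponding flat rule is just:

def flat_move_for_black(position, E):
    """No search: pick the successor with the highest static evaluation."""
    return max(position.successors(), key=E)

With a perfect E in the above sense, this one-liner would already play
perfectly.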

But in practice, evaluation functions are neither random nor perfect.

The question is: what structure must the evaluation function E have so that one can argue that the min-maxified value of E up to some search depth is a better guide for game play than the flat evaluation of the successor positions by E itself? Why min-maxify E?

I mean, min-maxifying the game-theoretic values of leaf positions leads to perfect game play. But E yields not the game-theoretic value of a node, only "a number". Can a link between E and G be established? Could it be verified algorithmically, up to some level of confidence?

It is common knowledge that increasing the search depth while keeping the evaluation function fixed raises playing strength. But what is the mathematical reason for that?

For example, in chess E usually reflects, among other things, the material balance. There is probably even a (conjecture-level) fact that positions where the material balance favors one side are statistically more likely to be game-theoretic wins for that side than positions where it favors the opponent. This would be a mathematical link between E and G: higher values of E would correspond to a game-theoretic win for black (under the convention above) with higher probability, in a strict statistical sense. Does min-maxifying preserve and strengthen this effect? Why? Someone probably knows this. And is there a way to measure the effect or verify it algorithmically?
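
For what it is worth, here is a sketch of one way the last question
could be attacked experimentally on synthetic trees: generate random
trees whose exact values G are known, define a noisy evaluation from
them, and count how often the flat rule and the min-maxified rule pick
a game-theoretically optimal move at the root. Everything below (the
uniform random tree, the "exact value plus Gaussian noise" evaluation,
and all parameters) is my own assumption for illustration, not a claim
about go or chess.

import random

def build_tree(depth, branching, p_black_win=0.5):
    """Uniform random tree; leaves get exact values +1 (black wins) or -1."""
    if depth == 0:
        return {'children': [], 'G': 1 if random.random() < p_black_win else -1}
    return {'children': [build_tree(depth - 1, branching, p_black_win)
                         for _ in range(branching)],
            'G': None}

def solve(node, black_to_move):
    """Fill in the exact game-theoretic value G of every node by full min-max."""
    if not node['children']:
        return node['G']
    values = [solve(c, not black_to_move) for c in node['children']]
    node['G'] = max(values) if black_to_move else min(values)
    return node['G']

def E(node, noise=1.0):
    """Noisy evaluation: exact value plus Gaussian noise, cached per node."""
    if 'E' not in node:
        node['E'] = node['G'] + random.gauss(0.0, noise)
    return node['E']

def minimax_E(node, depth, black_to_move):
    """Min-maxified value of the noisy evaluation up to the given depth."""
    if depth == 0 or not node['children']:
        return E(node)
    values = [minimax_E(c, depth - 1, not black_to_move)
              for c in node['children']]
    return max(values) if black_to_move else min(values)

def experiment(trials=200, depth=8, branching=3, search_depth=4):
    """How often does each rule pick an optimal move for black at the root?"""
    flat_ok = deep_ok = 0
    for _ in range(trials):
        root = build_tree(depth, branching)
        solve(root, black_to_move=True)
        best_G = max(c['G'] for c in root['children'])
        flat_pick = max(root['children'], key=E)
        deep_pick = max(root['children'],
                        key=lambda c: minimax_E(c, search_depth - 1, False))
        flat_ok += (flat_pick['G'] == best_G)
        deep_ok += (deep_pick['G'] == best_G)
    print(f"flat E optimal:       {flat_ok / trials:.2f}")
    print(f"min-maxified optimal: {deep_ok / trials:.2f}")

if __name__ == '__main__':
    experiment()

Varying search_depth and the noise level in this toy model would at
least make the question concrete; whether the toy model says anything
about real evaluation functions is of course another matter.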

Truly Yours,

--
Antti Huima (Mr.)
Director, Conformiq Tools
mobile: +358 40 528 8667
email: antti.huima@xxxxxxxxxxxxxxxxx

Conformiq Software Ltd.
Stella Terra, Lars Sonckin kaari 16
FIN-02600 Espoo, Finland
tel: +358 10 286 6300
fax: +358 10 286 6309
http://www.conformiq.com/
_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/