[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[computer-go] An [open] question on game tree search theory
Dear All,
I wish to share a problem I've been lately thinking of.
Suppose T is the full game tree of go (with a known komi), and G(t) is
the game theoretic value for every node t in T. Say, G(t) is either
"black wins", "white wins", or "draw" (consolidating draw, no result
etc. under this value; not a matter of importance). Clearly, if G(t)
would be effectively computable, we could play go algorithmically and
never lose from a winning position. But this is not the case.
Instead, we have an "evaluation function" for every node t, say, E(t),
with values, say, from the set of rational numbers. Without loss of
generality, we can assume that higher values of E(t) are "better for black".
A standard go, chess---or whatever program for zero-sum full-information
game---computes a min-max value for E(t) upto some variable, nonuniform
depth in general, and chooses the continuation move that maximizes the
min-maxified value---for black, of course.
If E is just a random number attached to every node in the tree, search
is meaningless---any min-max value over the tree is just a min-max value
of random numbers, and hence worthless.
On the other hand, a perfect evaluation function E would work so that
E(t) >= E(t') implied G(t) >= G(t'), with ordering "black wins" > "draw"
> "white wins". With this kind of function E, no search would be
necessary, we could just evaluate all the next positions and choose the
one with highest evaluation score.
But in practice, evaluation functions are neither random nor perfect.
The question is: what structure the evaluation function E must have so
that it is possible to argument that a min-maxified value of E upto a
search depth is a better guide for game play than the flat evaluation of
the successor positions by E itself? Why min-maxify E?
I mean, min-maxifying the game-theoretic values of leaf positions leads
to perfect game play. But E yields not the game-theoretic value for a
node, but just "a number". Can a link between E and G be established?
Could it be verified upto a level of confidence algorithmically?
It is common knowledge that increasing search depth when evaluation
function is kept fixed raises game play strength. But what is the
mathematical reason for that?
For example, in chess E reflects usually among other things material
balance. There is probably even a (conjencture level) fact that
positions with material balance favoring white are statistically more
probably game-theoretic wins for white than those positions where
material balance favors black. This would be a mathematical link between
E and G. Higher values of E would yield game-theoretic win for white
(in case of chess) with higher probability in a strict statistical
sense. Does min-maxifying keep and strengthen this effect? Why? Someone
probably knows this. And, is there a way to measure the effect or verify
it algorithmically?
Truly Yours,
--
Antti Huima (Mr.)
Director, Conformiq Tools
mobile: +358 40 528 8667
email: antti.huima@xxxxxxxxxxxxxxxxx
Conformiq Software Ltd.
Stella Terra, Lars Sonckin kaari 16
FIN-02600 Espoo, Finland
tel: +358 10 286 6300
fax: +358 10 286 6309
http://www.conformiq.com/
_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/