[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: computer-go: [Question] Alpha-Beta Pruning & Null Move
Don Dailey wrote:
>The tricky part is defining "happy." Fortunately, "happy" has a very
>precise definition. It simply means that given 2 moves in a row, can
>I achieve a score good enough to achieve at least my expectation? If
>you can, you are "happy." In a losing position, you can still be
>"happy" given this definition because you expectations are
>appropriately lowered! Your "expectation" is the "alpha" part of
>"alpha/beta pruning" or put another way your proven lower bound.
Regarding null-moves, I think it helps to distinguish between two kinds of
applications: (a) uniform search trees, and (b) non-uniform search trees.
(a) In Go, the global evaluation function produces a typical uniform search
tree with an average branching factor of, say, 250. As pointed out, "good"
and "bad" depends on the context (in alpha-beta terminology: the values of
alpha and beta). Regarding the global evaluation function, null-move pruning
only starts to be meaningful when this tree is searched to depth 5 plies or
more (remember that the null-moves are searched with a depth-reduction of
2-3 plies).
(b) However, for local tactical searches, non-uniform search trees are
abundant. The goal of killing a block or connecting two blocks produces a
typical non-uniform search tree, because after the block gets taken off the
board or the two blocks are connected, the local game ends. Such
non-uniformity is highly exploitable by null-move techniques.
For instance, in the context of a ladder, the attacker only plays moves so
that -- if the defender passes -- the attacker can follow up by taking the
stones off the board. That produces a search-tree of threat-order 1. What
I've been fumbling with is that this idea can be generalized to any
threat-order (so-called lambda-order). For instance, in a loose ladder
(threat-order 2), the attacker only plays moves so that -- if the defender
passes -- the attacker can follow up by killing the stones in a normal
ladder. And in a loose loose ladder (threat-order 3), the attacker only
plays moves so that -- if the defender passes -- the attacker can follow up
by killing the stones in a loose ladder. And so on. This saves an enormous
amount of work, but still guarantees exactness.
Note that (b) doesn't operate with alpha's and beta's. This is because
obtaining the goal is thought of as an irreversible feat: if a block can be
captured in 5 plies, there is no risk of it being de-captured when looking a
few plies deeper. Similarly, if a block is connected, it cannot be
de-connected deeper in the search tree. However, (a) is different. It may be
possible to achieve a very good evaluation at depth 5 plies, but no matter
how good the evaluation value is, nothing can guarantee that the result is
not reverted when looking a few plies deeper.
Best regards,
-Thomas Thomsen