[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[computer-go] Tactical search or not in computer go
There has been some interesting discussions here recently, and I
will try to add some of my experiences from my program Viking (you can currently
test it on NNGS where it plays as viking4 (14*kyu relative to gnugo- perhaps 18-
20*k when playing stronger humans). If you wish to play against it you have to do it
before Saturday this week - after that I will move away and I may not resume
normal computer go activity for some months).
The current version of Viking is extreme because it does not use any local tactical
search at all in its evaluation function. Instead it relies on a *huge* amount of
carefully designed patterns that is used to evaluate for example safe connections,
possible connections, eyevalues (0.5 1 1.5 2+), captured stones (such that all
liberties are contained within the pattern and all surrounding stones has enough
liberties to prevent a semeai).
The evaluation function is slow and can at most do 20 positions in a second and it is
doing a partial full board search similar to what Fotland described recently. My
approach is to compute local temperatures (crude approximations of the theoretical
temperature) which means that it make a 1-3 ply search locally for both sides. It
plays the move with highest temperature. Given enough time it will evaluate the
temperature of all possible moves (spending 1 to 30 evaluations on each possible
move) that is not inside secure territory where it does not generate moves. Possible
moves are generated with patterns with different priorities for move ordering. In
complex middle game situations it is only able to evaluate 10 possible moves within
20 seconds on my 500 Mhz processor, which currently causes a lot of mistakes
since too many candidates are pruned. I think the strength of my current program
will benefit a lot from about 10x more processing power but after that I need to use
the cpu on other things - more about that below.
There is one big + and one big - for using a slow pattern based evaluationfunction.
+
My program knows things that a "simple" tactical local search cannot see, and full
board search with a fast evaluation would never see. For eample it can see that a
huge group with 4 hard to read halfeyes is alive without search. A brute force search
that cannot break down this problem in four independent parts will have trouble
solving such a problem without using a lot of go knowledge. If my program plays a
bad move it is often easy to correct this by adding a pattern that it did not have yet.
Since all new patterns are very specific they do not mess up the evaluation of other
patterns (well, in theory at least - there are a lot of bugs in the patterns caused by
me and also a lot of bad older patterns that do not follow my current constraints for
"correct" patterns, there are also a lot of situations that are hard or impossible to
cover with the current design - I cannot make patterns conditional of ladders (I think
gnugo does this) for example). The patternmatching is using a treebased
representation so the pattern matching time do not grow linearly with the number of
patterns. Thus the need for many additional patterns will not cause the evaluation
function to be slower.
-
The approach is not robust, because every time a pattern is missing, the evaluation
function will make arbitrarily huge errors. A common catastrophic situation is that it
plays some random moves inside an enemy group and creates a life and death
position that it has never seen before (there are no matching patterns) and then it
invariably believes that the group it attacks is dead. Most of the times though the
groups is easily alive. A typical game against gnugo for example is that my program
gives the impression to play very well until complications arise where most of
gnugo's live groups are seen as dead and then my program quickly collapse all over
the board until the groups of gnugo is correctly estimated as alive in the late
endgame. Which showes that no program (or human by the way) is stronger than
the ability to judge the strength of groups correctly. I have tried to improve the
robustness of the evaluationfunction with computationally cheap heuristics, but that
approach suffers from bugs and incomplete coverage of positions as well. And the
worst thing is that it make the code extremly complex, and hard to maintain.
The conclusions:
My program needs tactical search to make it robust against lack of patterns.
Yet it plays with even winning chances with gnugo using 6-7 handicap stones on
19x19 and sometimes win on 9x9 without handicap. (But note that I have tuned the
pattern library using games against gnugo a lot while I do not think the
gnugoprogrammers spend as much time tuning for my program so it certainly much
weaker against humans than go programs).
Recently I have added life and death search to my program but I do not think it
account for much of the rating at NNGS. The life and death module is based on the
same evaluation function which is used for full board situations with some time
consuming features turned off, and has been developed in parallel with my current
program. It makes max 100 nodes a second on my computer. It uses best first
search and is of course much better than the static slow evaluation function when it
comes to detect which groups are alive and dead since it is based on the evaluation
function. In enclosed sitations where it has been tuned it can completely solve
problems faster than I can do. The reason is that search stops as soon as it can
match patterns for two eyes. This allows huge reductions in the tree size. In theory it
possible to add patterns such that any variation where the groups lives is found
within 2-ply. A killing move is simply the only move that was not refuted
immediately. This is the reason that my program manages quite well against people
who have very little knowledge about life&death without the life & deth module
because it uses the same knowledge for evaluation.
The problem is that the life & death module is too slow to to be included in a 1-ply
full board search. So currently my program has a split personality sort of. The life
and death module get some time from the full board time, perhaps 10 seconds at
most while it plays on NNGS and analyzes 1 or 2 groups that are believed to need
urgent attention. All results are cashed and reused in a very optimistic fashion. If a
group is unstable it will propose the best move for that group and calculates a
temperature that is used as a minimum value no matter what the full board
evaluation would think. There is a lot of problems with this because if there are no
unstable moves the result from the life&death module is not used. I do try to use it
indirectly by forbidding unsafe moves (stones that are not evaluated as alive but
nethertheless increases the value of the evaluationfunction) near tactically alive
opponent groups, and avoids adding stones to its own tactically dead groups.
Often the life and death module makes Viking play defencive moves that take no
territory because it has no time to search for the best move but picks the first move
that work at all. This means that the life and death module causes mistakes as well
as it avoids them. And it is very costly in time and it would need 10 minutes of
thinking time each move to reveal its full potential. Just using it causes the program
to prune away many good moves that would otherwise have been selected as the
best move several times in every game at normal time limits.
In the future I will be able to use more of my potentially strong life & death module
in the program. In fully enclosed text-book problems it is both reasonably accurate
and fast as long as it has most of the patterns for evaluation and movegeneration.
It also quite strong in open middlegame positions, but a lot of work remain to be
done. Since move generation is based on a very complex and information rich
evaluation function (all candidate moves are generated after each nodeevaluation
unless it is terminal) it is possible to be very accurate. Since move generation
reuses knowledge it is cheap to add code to it. In easy enclosed postions where
potential eyes are separate and there are no complex tactical interactions and it
know all relevant patterns it plays almost perfectly. Semeais are not implemented
well but it can handle simple ones at least. The problem in go is to handle open
middlegame fighting well though. So even if my life&death program would handle
some tactical positions as well as a 6-dan would, it would not make my current
program much stronger on 19x19.
So is it possible that a chessprogrammer that works fulltime would beat my program
within a year on 9x9? I think it is possible (I think Honte is proof that it is possible to
do well with local search - it is stronger than my program and I think it was written in
short time) but the program need to do what my patterns do with local search in the
evaluation function. But if I during that time spent fulltime on adding and tuning
patterns I think Viking might become a lot a stronger. Perhaps even as strong as
gnugo is right now, without adding more major features to the program.
Currently I belive that a very strong program will use a hybrid approach: it will learn
patterns by local tactical search and build a pattern library to use both during a
game and in subsequent games.
Summary: my point here is that local tactical searches in go can in principle be
replaced by patternmatching. Patterns are also useful for move ordering. Most
strong goprograms uses both patterns and local search, and my program do prove
that it is possible to reach moderate (computer) playing strength completely without
tactical search. But a pattern database will never be complete, so there is always a
need for search. Full board search with no or little knowledge on the other hand will
never work in go. But as many already pointed out local search creates knowledge,
and it is possible that a program using local search may be as strong as an amatuer
1 dan. But the kind of knowledge needed to play better than that will not be provided
at all by local tactical search I think. By the way I do not think there is much
difference between 9x9 and 19x19. My program is designed to play 19x19 but it is
as strong on 9x9 (perhaps stronger right now because it does not suffer as much
from cronic time trouble on 9x9 and the life & death module have a better impact)
By the way the previous version of my go program used fast tactical search for
group survival which was fast enough to use in the evaluationfunction.
Unfortunately the tactical search always returned "no result" for big groups with
many liberties so that was why I turned to a more slow and knowledge intensive
approach. My current version wins against this program by killing big groups,
although the older version is superior in simple capture tactics.
A future project is to add fast tactical search for simple goals as connecting, cutting,
single eyes, living with two eyes, captures. If it is fast enough it can be used as a
complement to patterns in the evaluation function and make it more robust, and for
move generation in cases where the candidates with highest priority cannot be
distinguished accurately by patterns alone.
That's my 10 cents.
Magnus Persson 2 Dan and author of Viking
--
Magnus Persson
Department of psychology, Uppsala University
Box 1225, SE-751 42, Sweden
018-471 2141 (work), 018-460264 (home)
070-2987879 (cellular)
MAILTO: magnus.persson@xxxxxxxxxxxxxxxxx
_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://computer-go.org/mailman/listinfo/computer-go