
Re: computer-go: Learning from existing games



On Mon, Jan 13, 2003 at 03:36:55PM -0800, Piotr Kaminski wrote:
> Nicol Schraudolph et al.:  Learning to Evaluate Go Positions Via Temporal
> Difference Methods (http://www.inf.ethz.ch/~schraudo/pubs/gochap.pdf)

Interesting paper. I would not have thought that a neural network that sees
only the raw board position could ever get that far.

My own theory, which I have not had the time to do much about, and probably
never will, is that feeding the raw board position to the network is not
sufficient; we need a higher level of abstraction.

Most current programs extract a lot of information from the position
(grouping stones into strings and strings into groups; counting liberties,
eyes, territory, and so on). I feel that this sort of information would be
a much better input for a network that estimates the winning probability
(or the score).

Actually, I believe it would be enough to compute a handful of key numbers
that summarize the position: the number of captured stones; the number of
strings (and stones) in atari or with 2-5 liberties; the number and total
size of living groups, weak groups, and dead groups; the number of points
under more or less strong black/white control; and so on. A small set of
(say) 100 numerical inputs ought to suffice. I believe TD-learning would
work better on such a network.
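To make the idea concrete, here is a minimal sketch (in Python, and nothing
like a real engine's analysis) of turning a raw board into a few of those
summary numbers: per-colour counts of strings in atari and strings with 2-5
liberties. The board encoding ('.', 'B', 'W') and the feature names are my
own illustrative choices.

```python
def strings_with_liberties(board):
    """Yield (colour, stones, liberties) for every string on the board."""
    size = len(board)
    seen = set()
    for r in range(size):
        for c in range(size):
            colour = board[r][c]
            if colour == '.' or (r, c) in seen:
                continue
            # Flood-fill the string of connected same-colour stones,
            # collecting its empty neighbours (liberties) as we go.
            stack, stones, libs = [(r, c)], set(), set()
            while stack:
                y, x = stack.pop()
                if (y, x) in stones:
                    continue
                stones.add((y, x))
                for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                    if 0 <= ny < size and 0 <= nx < size:
                        if board[ny][nx] == colour:
                            stack.append((ny, nx))
                        elif board[ny][nx] == '.':
                            libs.add((ny, nx))
            seen |= stones
            yield colour, stones, libs

def feature_vector(board):
    """Count strings in atari and strings with 2-5 liberties, per colour."""
    feats = {'B_atari': 0, 'B_2to5_libs': 0, 'W_atari': 0, 'W_2to5_libs': 0}
    for colour, stones, libs in strings_with_liberties(board):
        if len(libs) == 1:
            feats[colour + '_atari'] += 1
        elif 2 <= len(libs) <= 5:
            feats[colour + '_2to5_libs'] += 1
    return feats

board = ["....",
         ".BW.",
         ".BW.",
         "...."]
print(feature_vector(board))
# → {'B_atari': 0, 'B_2to5_libs': 1, 'W_atari': 0, 'W_2to5_libs': 1}
```

A real feature set would of course add capture counts, group status, and
territory estimates, but the shape of the output - a short dictionary of
numbers instead of 361 intersections - is the point.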

Has anyone tried this sort of thing?  It should not be an overly difficult
job to take (for example) GnuGo's engine, extract the first analysis phase,
calculate those numbers, feed them to a network, and train with TD(0).
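For what the training step would look like, here is a hedged sketch of
TD(0) over such a feature vector, using a plain linear value function
V(s) = w . x(s) in place of a network to keep it short. The learning rate,
discount, and the toy episode format are illustrative assumptions, not
anything from the paper.

```python
def td0_update(w, x, x_next, reward, alpha=0.1, gamma=1.0):
    """One TD(0) step: move V(x) toward reward + gamma * V(x_next)."""
    v = sum(wi * xi for wi, xi in zip(w, x))
    v_next = sum(wi * xi for wi, xi in zip(w, x_next))
    delta = reward + gamma * v_next - v          # the TD error
    return [wi + alpha * delta * xi for wi, xi in zip(w, x)]

def train(episodes, n_feats, alpha=0.1):
    """episodes: list of (feature-vector sequence, final result) pairs,
    where result is e.g. 1.0 for a win and 0.0 for a loss."""
    w = [0.0] * n_feats
    for states, result in episodes:
        # Intermediate moves carry no reward; only the final outcome does.
        for x, x_next in zip(states, states[1:]):
            w = td0_update(w, x, x_next, 0.0, alpha)
        # Terminal transition: V(terminal) = 0, reward = game result.
        terminal = [0.0] * n_feats
        w = td0_update(w, states[-1], terminal, result, alpha)
    return w
```

With a single always-on feature and a stream of won games, the weight
climbs toward 1.0, which is all the machinery TD(0) needs; replacing the
linear V with a small network only changes how the gradient is applied.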

Regards
	Heikki


-- 
Heikki Levanto  LSD - Levanto Software Development   <heikki@xxxxxxxxxxxxxxxxx>