RE: computer-go: Board evaluation by counting...
I'm doing something along those lines. I have a net that takes 9x9 areas
centred on a potential move and then outputs a value indicating how 'good'
the move is. The training data is taken from pro tournament games and seems
to produce a network that is fairly generalised but useful if used with
other things such as alpha-beta. I'm hoping to add some more networks that
specialise in the opening game and perhaps corner and edge moves to get
around the overgeneralisation that the current network suffers from.
Apart from this, if you're going to use a neural net to estimate territory,
in other words to be used as an evaluation function, why not use temporal
difference learning methods that worked so well for backgammon? I think
Nicol Schraudolph wrote a paper or two on this and I seem to remember the
results were very promising.
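
For anyone unfamiliar with the idea, here is a minimal sketch of a single temporal difference update. It uses TD(0) with a linear evaluation function purely to keep the example short; it is not the backgammon setup or the method from the papers mentioned above. The feature vectors, the step size alpha and the function name td0_update are assumptions for illustration.

```python
def td0_update(weights, features, next_features, reward,
               alpha=0.1, gamma=1.0, terminal=False):
    """One TD(0) step for a linear evaluation V(s) = weights . features.

    The estimate for the current position is moved towards
    reward + gamma * V(next position). At the end of a game
    (terminal=True) the target is just the final reward.
    """
    value = sum(w * f for w, f in zip(weights, features))
    if terminal:
        next_value = 0.0
    else:
        next_value = sum(w * f for w, f in zip(weights, next_features))
    td_error = reward + gamma * next_value - value
    # Gradient of a linear evaluation w.r.t. the weights is the feature vector.
    return [w + alpha * td_error * f for w, f in zip(weights, features)]

# Usage: the last position of a won game (reward 1) with two toy features.
new_weights = td0_update([0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                         reward=1.0, terminal=True)
```

The point is that the training signal comes from the program's own later evaluations and the final result, so no hand-labelled territory maps are needed.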
Julian Churchill
> -----Original Message-----
> From: owner-computer-go@xxxxxxxxxxxxxxxxx
> [mailto:owner-computer-go@xxxxxxxxxxxxxxxxx]On Behalf Of Mark Boon
> Sent: 19 September 2001 12:53
> To: computer-go@xxxxxxxxxxxxxxxxx
> Subject: RE: computer-go: Board evaluation by counting...
>
>
> Your proposition sounds like the one I made a while ago when people were
> discussing using Neural Nets for Go. However, your message rather suggests
> this is a tried and tested method, which as far as I know it's not. I think
> this would be an interesting experiment (although I would expect that such a
> NN would be too slow for practical use at this time) that would give a
> better idea about how NNs could be applied usefully to computer-go.
>
> To suggest this as an alternative to the poor guy who's trying to figure out
> a way to design a simple influence algorithm is a bit crass.
>
> Mark Boon
>
> >
> > Sounds like an ad hoc method. Why do you think it will give a better way
> > to calculate territory? Why do you think it will be more accurate? Why?
> > Asking these questions is definitely reasonable scientific practice.
> >
> > I think that methods for estimating territory (i.e. methods for evaluating
> > a board position) must be derived more from the game itself and less from
> > the ease of implementing or the smallness of an algorithm.
> >
> > One possibility would be this: Choose a big set of dan-level games that
> > have been played to the end. Choose a game G. Guess where the territories
> > were at the end. This can be done heuristically with good precision. It
> > doesn't matter if the judgements go wrong at times as long as most of the
> > time the territories are guessed correctly.
> >
> > Then walk backwards in the game G. Correlate by a computational learning
> > method the patterns of stones in the positions of G with the known sets of
> > final territories. In this way you can train a learning system to predict
> > where territories will appear, given a position that is not final. Of
> > course, the system will make errors because it doesn't understand the
> > full-board tactical aspects of go, but it should work better (and consume
> > more resources) than a basic influence-based method.
> >
> > To illustrate, we `know' based on our experience that
> >
> > . . . . . .
> > . X . . X .
> > . . . . . .
> > ._._._._._.
> >
> > is a relatively stable formation. This means that in `most' games where the
> > position above appears, the marked intersections will appear as territory
> > (there will be either a live stone or an empty intersection that counts as
> > area):
> >
> > . . . . . .
> > . X a a X .
> > . a a a a .
> > ._a_a_a_a_.
> >
> > To be more precise, there could be e.g. two games with the position above
> > and the final board position at the same point looking like (`b' denotes
> > black's area)
> >
> > X X O O O O
> > b X X O X O
> > X X b X X O
> > b_b_X_X_O_O
> >
> > and
> >
> > O X b b b b
> > X X b b X b
> > b b b b b b
> > b_b_b_b_b_b
> >
> > The pattern of `a's above would appear as an `average' of these, but the
> > definition of `average' of course depends on the computational learning
> > method used.
> >
> > A computational learning method could learn this. For example, a neural
> > network whose inputs are an N x N subgrid of the go board and whose
> > outputs denote ownership of territory could do it, learning from examples.
> > So could unsupervised vector quantization within a suitable context.
> >
> > Regards,
> >
> > --
> > Antti Huima
> >
> >
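
The `average' of final positions described in the quoted message can be sketched in its simplest form as a per-intersection frequency of black ownership. This is only one possible reading of `average', as the message itself notes; the string encoding of the two example boards (with the edge underscores dropped) and the name black_ownership_frequency are assumptions for this illustration.

```python
def black_ownership_frequency(final_boards):
    """For each intersection, return the fraction of final boards in
    which it belongs to black: either a black stone 'X' or a point
    of black's area 'b'. Every board must have the same shape."""
    rows = len(final_boards[0])
    cols = len(final_boards[0][0])
    counts = [[0] * cols for _ in range(rows)]
    for board in final_boards:
        for r in range(rows):
            for c in range(cols):
                if board[r][c] in ("X", "b"):
                    counts[r][c] += 1
    total = len(final_boards)
    return [[count / total for count in row] for row in counts]

# The two final positions from the quoted example, one string per row.
game_1 = ["XXOOOO",
          "bXXOXO",
          "XXbXXO",
          "bbXXOO"]
game_2 = ["OXbbbb",
          "XXbbXb",
          "bbbbbb",
          "bbbbbb"]

frequency = black_ownership_frequency([game_1, game_2])
```

Over a large set of games, values close to 1 or 0 would mark the points that the starting pattern reliably turns into one side's territory; a trained network or vector quantizer would generalise this beyond exact pattern matches.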