[computer-go] TD(lambda), Neural Networks and evaluation functions



I've had an interesting idea about implementing a Go program using neural
networks (I'm assuming a 9x9 board, although there's no reason it couldn't
be extended to 19x19). It basically runs along the following lines:

1) Create 81 neural networks (one associated with each intersection on the
board). Let's represent them by N(x, input-board) with x=1...81.
2) Use temporal difference learning to teach the neural networks, with the
rewards being +1/-1 depending on which side controls that intersection at
the end of that game.
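
To make step 2 concrete, here is a minimal sketch of one TD(lambda) pass
over a single game for one of the 81 networks. Everything in it is an
assumption made for illustration: the PointNet class (a single tanh unit
standing in for a real network), the board encoding (a flat list of 81
values, +1/-1/0 for the two colours and empty), and the alpha and lam
values. It only shows the shape of the update, not a tuned implementation.

```python
import math
import random

BOARD_POINTS = 81  # 9x9 board


class PointNet:
    """Hypothetical stand-in for one N(x, board): a single tanh unit."""

    def __init__(self, n_inputs=BOARD_POINTS, seed=0):
        rng = random.Random(seed)
        self.w = [rng.uniform(-0.01, 0.01) for _ in range(n_inputs)]
        self.b = 0.0

    def predict(self, board):
        # Output lies in (-1, +1), matching the +1/-1 ownership reward.
        s = self.b + sum(wi * xi for wi, xi in zip(self.w, board))
        return math.tanh(s)

    def gradient(self, board):
        # d tanh(s)/dw_i = (1 - tanh(s)^2) * x_i, and d/db = 1 - tanh(s)^2
        y = self.predict(board)
        d = 1.0 - y * y
        return [d * xi for xi in board], d


def td_lambda_episode(net, positions, final_owner, alpha=0.1, lam=0.7):
    """Update one network from the positions of one finished game.

    positions:   list of boards, each a list of 81 values in {-1, 0, +1}
    final_owner: +1 or -1, who controls this network's point at the end
    """
    trace_w = [0.0] * len(net.w)  # eligibility traces, reset per game
    trace_b = 0.0
    for t, board in enumerate(positions):
        grad_w, grad_b = net.gradient(board)
        trace_w = [lam * e + g for e, g in zip(trace_w, grad_w)]
        trace_b = lam * trace_b + grad_b
        # Target is the next prediction, or the real outcome at game end.
        if t + 1 < len(positions):
            target = net.predict(positions[t + 1])
        else:
            target = final_owner
        delta = target - net.predict(board)
        net.w = [w + alpha * delta * e for w, e in zip(net.w, trace_w)]
        net.b += alpha * delta * trace_b
```

In the full scheme there would be 81 such objects, each updated from the
same game record but with its own final_owner value.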

You'd play by having an evaluation function which, for a given board,
returns sum(N(x, input-board); x=1..81), i.e. it would predict the
final score margin.
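
As a rough sketch of that evaluation function: the code below treats each
N(x, .) as a plain callable taking a board and returning a value in
[-1, +1]. The names evaluate and best_move, the flat 81-element board
encoding, and the dummy "networks" (which simply report the stone
currently on their own point) are all assumptions for illustration, not
trained predictors.

```python
BOARD_POINTS = 81  # 9x9 board


def evaluate(nets, board):
    """Sum the per-point ownership predictions: the predicted score margin."""
    return sum(net(board) for net in nets)


def best_move(nets, successors):
    """Pick the move whose resulting board evaluates highest.

    successors: dict mapping a move label to the board after that move
    (generating legal moves is outside the scope of this sketch).
    """
    return max(successors, key=lambda move: evaluate(nets, successors[move]))


# Dummy predictors: network x just returns the stone on point x.
nets = [lambda board, x=x: board[x] for x in range(BOARD_POINTS)]

board = [0] * BOARD_POINTS
board[0] = board[1] = board[2] = 1   # three stones for the side to move
board[80] = -1                       # one opposing stone

print(evaluate(nets, board))  # 2, i.e. three points minus one
```

With trained networks in place of the dummy ones, the same sum would be
the predicted final margin rather than a simple stone count.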

This approach would have an advantage over other TD approaches, as the
functions that each NN has to approximate would be very smooth. Also, there
would be many linear features (for instance, stones surrounding a
point) that would allow the networks to "bootstrap" themselves a la
TD-Gammon.

Has anyone experimented with this kind of approach before?

Imran Ghory
-- 
http://bits.bris.ac.uk/imran

_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://computer-go.org/mailman/listinfo/computer-go