computer-go: NN GA was RE: Perl Module for next move.
Okay, thanks for all the help and suggestions. I've gone with making a
Perl wrapper that executes "gnugo --mode gtp --quiet", and I now have a
computer trying to teach itself to play Go. Maybe in 8 weeks it will have
learned enough to beat me =}. (That wouldn't take much, as I only know the
basics.)
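
For anyone wanting to try the same wrapper idea, here is a rough sketch of the text handling involved, in Python rather than Perl. It only relies on the general GTP convention that a command is one line of text and that a reply starts with "=" on success or "?" on failure, optionally followed by a numeric command id. The function names are made up for illustration and are not taken from my wrapper.

```python
def format_command(name, *args):
    """Build one GTP command line, terminated by a newline."""
    return " ".join([name, *map(str, args)]) + "\n"


def parse_response(raw):
    """Split a raw GTP reply into (ok, payload).

    A successful reply starts with '=', a failed one with '?'.
    """
    text = raw.strip()
    if not text or text[0] not in "=?":
        raise ValueError("not a GTP response: %r" % raw)
    ok = text[0] == "="
    # Drop the status character and the optional numeric command id.
    body = text[1:].lstrip("0123456789")
    return ok, body.strip()


if __name__ == "__main__":
    print(repr(format_command("boardsize", 9)))
    print(parse_response("= A1 B2 C3\n\n"))
```

A wrapper would write the formatted line to the engine's standard input and read its standard output up to the blank line that ends each reply before handing the text to `parse_response`.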
Here are the specifics:
It's a 9x9 board.
The NN:
A 4-layer net. The input layer is 81 nodes, one for each location on the
board: -1 means an opponent's piece, 0 means empty, and 1 means the computer's.
Layer 1 is 162 nodes, layer 2 is 324 nodes, layer 3 is 648 nodes, and layer 4 is
one node. All nodes have a tangent clamping function on them. Each weight
can be in the range of -127 to +127 (mainly because it makes the GA and NN
easier to code).
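
A minimal sketch of the forward pass through a net like this, in Python. Several things here are assumptions and not details from my C program: the "tangent clamping function" is taken to be tanh, there are no bias terms, each layer is stored as a list of rows with one row of incoming weights per node, and nothing is said about how the -127..+127 integer weights are scaled before use.

```python
import math

# Layer widths from the description: input, then layers 1 to 4.
LAYER_SIZES = [81, 162, 324, 648, 1]


def forward(board, layers):
    """Push a board through the net and return the single output value.

    board  -- sequence of -1 (opponent), 0 (empty), 1 (own stone)
    layers -- list of weight matrices; layers[k][j] holds the incoming
              weights of node j in layer k+1 (assumed layout)
    """
    activations = list(board)
    for matrix in layers:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)))
            for row in matrix
        ]
    return activations[0]


if __name__ == "__main__":
    # Tiny 2-2-1 net, just to show the call shape.
    tiny = [[[1, 0], [0, 1]], [[1, 1]]]
    print(forward([1, -1], tiny))
```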
For each move, the computer uses GTP to ask gnugo for all the legal moves.
For each legal move it sends the resulting board through
the net. It keeps track of the highest answer from the net, and if that
answer is better than the answer for the current position, it
moves. Otherwise it passes.
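
The move rule above can be sketched as follows. `evaluate` and `apply_move` are hypothetical callables standing in for the net and for whatever produces the board after a move, and ties between equally scored moves go to the first one seen, which is an arbitrary choice here.

```python
def choose_move(board, legal_moves, apply_move, evaluate):
    """Return the best-scoring legal move, or "pass" if no move
    scores strictly higher than the current position."""
    best_move = "pass"
    best_score = evaluate(board)  # score of standing pat
    for move in legal_moves:
        score = evaluate(apply_move(board, move))
        if score > best_score:
            best_move, best_score = move, score
    return best_move


if __name__ == "__main__":
    # Toy stand-ins: a 3-point "board" and a made-up evaluator.
    place = lambda b, m: b[:m] + (1,) + b[m + 1:]
    score = lambda b: 0.5 * b[0] + 0.9 * b[1] - b[2]
    print(choose_move((0, 0, 0), [0, 1, 2], place, score))
```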
How the net is being "taught" (i.e. the Genetic Algorithm):
On the first generation it creates 9 nets with random weights. To score
each network, each net plays as black vs. all the other nets. The scores are
tallied from each game, and the bottom 5 scores are sent down the hall to be
euthanized. No. 1 and no. 2 are crossbred with each other such that the
children get the other's layer 4 weights. (Randomizing that will be my next
test, as this is more of a bug-hunting run.) These children will then be
mutated at a 1% rate (roughly 1%, or 2,700, of the weights will be
randomized). The second-, third-, and fourth-place nets of the previous gen
will also be mutated at 1%. And for the 9th net slot, the first-place
net will be cloned and mutated at 2%. This new batch will then be
run again. I'll double the rates on the next run, as it's believed that
higher mutation tends to lead to better results. But where are the
diminishing returns?
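
The two genetic operators described above could look roughly like this. It is a sketch under assumptions: a net is a list of four flat weight lists (one per layer), and mutation gives every weight an independent chance of being re-randomized, which only approximates "roughly 1%" of the weights. How the nine slots of the next generation are filled is left out.

```python
import random


def mutate(net, rate, rng):
    """Return a copy of net with about `rate` of its weights
    replaced by fresh random values in -127..127."""
    return [
        [rng.randint(-127, 127) if rng.random() < rate else w for w in layer]
        for layer in net
    ]


def crossbreed(parent_a, parent_b):
    """Return two children, each keeping its own parent's earlier
    layers but taking the other parent's last layer (layer 4)."""
    child_a = [list(l) for l in parent_a[:-1]] + [list(parent_b[-1])]
    child_b = [list(l) for l in parent_b[:-1]] + [list(parent_a[-1])]
    return child_a, child_b


if __name__ == "__main__":
    rng = random.Random(0)
    a = [[1, 2], [3, 4], [5, 6], [7]]
    b = [[-1, -2], [-3, -4], [-5, -6], [-7]]
    kid_a, kid_b = crossbreed(a, b)
    print(kid_a, mutate(kid_a, 0.01, rng))
```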
I've been catching a couple of bugs, so I've been aborting the runs over the
last couple of hours. But that has given me a good rough time
estimate per generation.
Two neural nets play a game in an average of 100 seconds (the NN processing is
done in a little C program I wrote; otherwise it would take 90 minutes to
play. I used the same tricks I learned to interface with gnugo/GTP). For
each generation there are 72 games played, so that's roughly 2
hours per generation. Each network takes up 270K of space. I expected
there would be fewer moves per game at the start, but in most games
each side gets the opportunity to move an average of 40-60 times.
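
The 72 games and the 2-hour figure follow from every net playing black once against each of the other eight. A small sketch of that schedule and the arithmetic, with hypothetical function names:

```python
from itertools import permutations


def schedule(n_nets):
    """Every ordered (black, white) pairing of distinct nets."""
    return list(permutations(range(n_nets), 2))


def generation_hours(n_nets, seconds_per_game):
    """Wall-clock hours for one generation played one game at a time."""
    return len(schedule(n_nets)) * seconds_per_game / 3600


if __name__ == "__main__":
    print(len(schedule(9)), generation_hours(9, 100))
```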
If I went to a 19x19 board, each net would be 5.4 megs and each game
would take somewhere around 40-90 minutes on average, maybe longer.
These times are on my PIII 600 MHz computer.
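
The size figures can be checked with a little arithmetic. This sketch assumes the layer widths keep the same pattern on a bigger board (2x, 4x and 8x the number of board points, then one output node), that there are no bias terms, and that each weight is stored in one byte. None of that is spelled out above, but it lands close to the 270K and 5.4 meg numbers.

```python
def net_weights(board_size):
    """Total weight count for a net built on a square board,
    assuming layer widths of n, 2n, 4n, 8n and 1 for n points."""
    n = board_size * board_size
    sizes = [n, 2 * n, 4 * n, 8 * n, 1]
    return sum(a * b for a, b in zip(sizes, sizes[1:]))


if __name__ == "__main__":
    for size in (9, 19):
        print(size, net_weights(size))
```

On these assumptions a 9x9 net has 276,210 weights, so a 1% mutation touches about 2,760 of them, and a 19x19 net has 5,476,370.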
Whatever the results, I am enjoying myself. Now to write a Tk interface
to play against it/monitor it, and a Go Modem interface so I'll be able
to have it play against others.
Matthew Corey Brown bromoc@xxxxxxxxxxxxxxxxx
"Death can not stop true love. All it can do is delay it for awhile."