Re: computer-go: Programs learning to play Go
Heikki Levanto wrote:
> My plan is to work only on a part I call a global evaluator. Given a
> position, it will return the probability of winning. As its inputs it will
> have a huge number of data coming from GnuGo's analysis of the position.
> Nothing in the inputs will reflect the board size or location of stones.
> So, what will I feed the net? Key numbers, like
> - number of stones on board
> - number of captured stones
> - number of stones that are in strings that have only 1,2,3,4, or 5
> liberties
> - number of stones in groups that have life status of
> (dead/weak/uncertain...)
> - number of empty points GnuGo considers territory, influence, etc
>
> And probably a good number of other details. All numbers encoded somehow
> into inputs (possibly in a way that makes black-white differences stand
> out).
>
> And one output: Probability of winning from this position.
>
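One way such board-size-independent inputs might be encoded is sketched below, with the black-minus-white differences made explicit. The `SideStats` fields and the `encode` helper are hypothetical names chosen for illustration; nothing here is GnuGo's actual interface, and only some of the listed inputs are shown.

```python
from dataclasses import dataclass


@dataclass
class SideStats:
    """Per-colour counts; all field names are illustrative assumptions."""
    stones: int                 # stones on the board
    captured: int               # stones captured by this side
    stones_by_liberties: tuple  # stones in strings with 1, 2, 3, 4, 5 liberties
    territory: int              # empty points counted as this side's territory


def encode(black: SideStats, white: SideStats) -> list:
    """Return a flat input vector of black-minus-white differences."""
    features = [
        black.stones - white.stones,
        black.captured - white.captured,
        black.territory - white.territory,
    ]
    # One difference per liberty count, so weak strings stand out.
    features += [b - w for b, w in zip(black.stones_by_liberties,
                                       white.stones_by_liberties)]
    return [float(x) for x in features]
```

Feeding differences rather than raw counts is only one possible choice; a net could equally be given both sides' raw numbers and left to find the contrast itself.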
> I have a few ideas for training: Either study complete games, and assume
> that the probability starts from 50% and ends in 0% or 100%. Assume it goes
> linearly (for lack of better info). Thus you get a number of positions and
> percentages.
On a board of any size, what percentage of the total number of positions
do you estimate you would need to have output for this to be useful?
Try it on 2x2, 2x3, 3x3, 3x4 and 4x4 boards. What number do you estimate
for 7x7?
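For concreteness, the linear labelling scheme quoted above could be sketched roughly as follows. The function name and the choice to express every label as Black's winning probability are assumptions made for this example.

```python
def linear_targets(num_positions: int, black_won: bool) -> list:
    """Label each position of one game with a win probability for Black.

    The first position gets 0.5 and the last gets 1.0 (Black won) or
    0.0 (Black lost); positions in between are interpolated linearly.
    """
    if num_positions < 2:
        raise ValueError("need at least two positions")
    final = 1.0 if black_won else 0.0
    step = (final - 0.5) / (num_positions - 1)
    return [0.5 + i * step for i in range(num_positions)]
```

A five-position game won by Black would be labelled 0.5, 0.625, 0.75, 0.875, 1.0. Each complete game yields one such list of (position, target) pairs.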
> Probably a better way is to note that every reasonable move is played to
> improve the winning probability.
I'd rather say that a good move keeps it at 50%, while a mistake lowers it
for you and so raises the opponent's probability.
> Thus we can evaluate the position before
> and after a move, and correct the net if it believes the probability to
> go down. (This assumes that the sample games do not contain serious errors - on
> my level nearly any human-human game satisfies this condition, but to play
> safe, I could train on professional games)
How would it learn to punish the most obvious mistakes, then?
This is good for finding flaws in the evaluation function: if it thinks the
position becomes unbalanced in a pro game that ends with someone winning by
less than 5, there's probably an error.
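That check could be sketched as below. The `tolerance` threshold for what counts as "unbalanced" is an assumed parameter, not something fixed by the discussion, and `win_probs` stands for whatever the evaluator returned for each position of the game.

```python
def suspicious_positions(win_probs, final_margin,
                         max_margin=5.0, tolerance=0.25):
    """Return indices where the evaluator looks wrong for a close game.

    win_probs:    evaluator output (0.0 to 1.0) for each position in order
    final_margin: final score difference of the game
    max_margin:   games decided by less than this count as close
    tolerance:    assumed limit on how far from 0.5 a close game may stray
    """
    if abs(final_margin) >= max_margin:
        return []  # game was not close, so this check says nothing
    return [i for i, p in enumerate(win_probs) if abs(p - 0.5) > tolerance]
```

Positions flagged this way are candidates for inspection rather than proof of an error, since even a close game can pass through sharp moments.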
> Using the net should be easy, for example take the best 10 moves proposed by
> GnuGo, evaluate the positions, feed to the net, and choose the one that gets
> the highest score. Maybe, in a few years when computers get really fast, it
> might make sense to try and evaluate most of the legal moves, or even to do
> simple limited read-ahead.
That's not a few years away, I'm afraid. Better to strive for a good way to
prune the tree from the beginning.
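The one-ply selection described in the quote might look something like this. `play` and `evaluate` are hypothetical callables standing in for the move-making routine and the trained net, and `candidates` for the shortlist of moves (for example, the 10 best proposed by GnuGo).

```python
def pick_move(position, candidates, play, evaluate):
    """Return the candidate whose resulting position scores highest.

    play(position, move) -> new position          (hypothetical helper)
    evaluate(position)   -> win probability for the side that just moved
                            (hypothetical stand-in for the net)
    """
    return max(candidates, key=lambda move: evaluate(play(position, move)))
```

Pruning happens entirely in how `candidates` is produced; the evaluator is called only once per shortlisted move.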
> What do you think, would this be complete waste of time? Are there some
> obvious problems you can see? Any improvements, suggestions? Any hope to be
> seen?
The light at the end of the tunnel is a conflagration, not a bulb.
Joan
> - Heikki
>
> --
> Heikki Levanto LSD - Levanto Software Development <heikki@xxxxxxxxxxxxxxxxx>