Re: computer-go: Programs learning to play Go



Heikki Levanto wrote:

> My plan is to work only on a part I call a global evaluator. Given a
> position, it will return the probability of winning. As its inputs it will
> have a huge number of data coming from GnuGo's analysis of the position.
> Nothing in the inputs will reflect the board size or location of stones.
> So, what will I feed the net? Key numbers, like
>   - number of stones on board
>   - number of captured stones
>   - number of stones that are in strings that have only 1,2,3,4, or 5
>     liberties
>   - number of stones in groups that have life status of
>     (dead/weak/uncertain...)
>   - number of empty points GnuGo considers territory, influence, etc
> 
> And probably a good number of other details. All numbers encoded somehow
> into inputs (possibly in a way that makes black-white differences stand
> out).
> 
> And one output: Probability of winning from this position.
> 
> I have a few ideas for training: Either study complete games, and assume
> that the probability starts from 50% and ends in 0% or 100%. Assume it goes
> linearly (for lack of better info). Thus you get a number of positions and
> percentages.
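
The linear labelling proposed above can be sketched as follows. This is a minimal illustration, not code from either poster: the function name `linear_win_labels` and the choice to express every label as Black's winning probability are assumptions made for the example.

```python
def linear_win_labels(num_positions, black_won):
    """Return one target probability (for Black) per position of a finished game.

    The first position is labelled 0.5 and the last 1.0 or 0.0, with the
    labels in between interpolated linearly, as the quoted text suggests.
    """
    if num_positions < 1:
        raise ValueError("a game needs at least one position")
    final = 1.0 if black_won else 0.0
    if num_positions == 1:
        return [final]
    step = (final - 0.5) / (num_positions - 1)
    return [0.5 + i * step for i in range(num_positions)]


# Five positions from a game Black won.
print(linear_win_labels(5, black_won=True))
```

Each (position, label) pair would then be one training sample for the net.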

On a board of any size, what percentage of the total number of
positions do you estimate you need to have outputs for before the net
is useful? Try it on 2x2, 2x3, 3x3, 3x4 and 4x4 boards. What number do
you estimate for 7x7?
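
To get a feel for the totals involved, the sketch below counts positions on tiny boards by brute force. It is an illustration under stated assumptions: a position is counted as legal when every string of stones has at least one liberty, ko and move history are ignored, and the names `count_legal`, `is_legal` and `neighbors` are invented for the example.

```python
from itertools import product

EMPTY, BLACK, WHITE = 0, 1, 2


def neighbors(idx, rows, cols):
    """Yield the orthogonally adjacent points of a point on a rows x cols board."""
    r, c = divmod(idx, cols)
    if r > 0:
        yield idx - cols
    if r < rows - 1:
        yield idx + cols
    if c > 0:
        yield idx - 1
    if c < cols - 1:
        yield idx + 1


def is_legal(board, rows, cols):
    """True if every string of stones on the board has at least one liberty."""
    seen = set()
    for start, colour in enumerate(board):
        if colour == EMPTY or start in seen:
            continue
        # Flood-fill the string containing `start`, looking for an empty neighbour.
        stack = [start]
        seen.add(start)
        has_liberty = False
        while stack:
            p = stack.pop()
            for q in neighbors(p, rows, cols):
                if board[q] == EMPTY:
                    has_liberty = True
                elif board[q] == colour and q not in seen:
                    seen.add(q)
                    stack.append(q)
        if not has_liberty:
            return False
    return True


def count_legal(rows, cols):
    """Count legal positions by enumerating all 3**(rows*cols) colourings."""
    points = rows * cols
    return sum(is_legal(b, rows, cols)
               for b in product((EMPTY, BLACK, WHITE), repeat=points))


for rows, cols in [(2, 2), (2, 3), (3, 3)]:
    print(rows, cols, 3 ** (rows * cols), count_legal(rows, cols))
```

Enumeration is already slow at 3x4 and impractical in plain Python at 4x4. For 7x7 the simple upper bound 3^49 is about 2.4 x 10^23 colourings, so any training set covers a vanishing fraction of them.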

> Probably a better way is to note that every reasonable move is played to
> improve the winning probability.

I'd rather say that a good move keeps it at 50%; a mistake lowers it
for you, and so raises the opponent's probability.

> Thus we can evaluate the position before
> and after a move, and correct the net if it believes the probability to go
> down. (This assumes that the sample games do not contain serious errors - on
> my level nearly any human-human game satisfies this condition, but to play
> safe, I could train on professional games)
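
One way to read the "correct the net" idea is as a pairwise update: when the evaluator scores the position after a move lower than the position before it (both from the mover's point of view), nudge the weights to shrink that gap. The sketch below uses a single logistic unit in place of a real net; the feature vectors, the learning rate and the names `predict` and `correct_on_move` are all assumptions made for illustration.

```python
import math


def predict(weights, features):
    """Logistic evaluator: estimated winning probability for the player to move."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def correct_on_move(weights, before, after, lr=0.1):
    """Update weights in place if the evaluator thinks the move lost ground.

    `before` and `after` are feature vectors of the positions before and after
    the move, both seen from the mover's side. Returns True if an update was made.
    """
    p_before = predict(weights, before)
    p_after = predict(weights, after)
    if p_after >= p_before:
        return False  # the evaluator already agrees the move did not hurt
    # Gradient ascent on (p_after - p_before) with respect to the weights.
    g_after = p_after * (1.0 - p_after)
    g_before = p_before * (1.0 - p_before)
    for i in range(len(weights)):
        weights[i] += lr * (g_after * after[i] - g_before * before[i])
    return True


weights = [0.0, -1.0]
before, after = [1.0, 0.0], [1.0, 1.0]
print(correct_on_move(weights, before, after), weights)
```

Note that this update only ever sees moves assumed to be good, which is exactly the limitation the reply below points at.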

How would it learn to punish the most obvious mistakes, then?
This is a good way to find flaws in the evaluation function: if it
thinks the position becomes unbalanced in a pro game that ends with
someone winning by less than 5 points, there's probably an error.
 
> Using the net should be easy, for example take the best 10 moves proposed by
> GnuGo, evaluate the positions, feed to the net, and choose the one that gets
> the highest score. Maybe, in a few years when computers get really fast, it
> might make sense to try and evaluate most of legal moves, or even to do
> simple limited read-ahead.
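
The selection step described above amounts to an argmax over a short candidate list. In the sketch below, `candidates`, `play` and `evaluate` are hypothetical stand-ins supplied by the caller; nothing here is GnuGo's actual interface, and `evaluate` is assumed to score the resulting position from the mover's point of view.

```python
def choose_move(position, candidates, play, evaluate, top_k=10):
    """Return the candidate whose resulting position scores highest.

    Only the first `top_k` candidates are tried, mirroring the idea of
    evaluating just the best few moves proposed by a move generator.
    Returns None if there are no candidates.
    """
    best_move, best_score = None, float("-inf")
    for move in candidates[:top_k]:
        score = evaluate(play(position, move))
        if score > best_score:
            best_move, best_score = move, score
    return best_move


# Toy stand-ins: positions and moves are integers, and the "evaluator"
# prefers positions close to 7.
toy_play = lambda pos, move: pos + move
toy_eval = lambda pos: -abs(pos - 7)
print(choose_move(0, [3, 9, 7, 5], toy_play, toy_eval))
```

The cost is one evaluator call per candidate, so widening `top_k` to most legal moves, or adding read-ahead, multiplies the work accordingly.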

That's not a few years away, I'm afraid. Better to strive for a good
way to prune the tree from the beginning.

> What do you think, would this be complete waste of time? Are there some
> obvious problems you can see? Any improvements, suggestions? Any hope to be
> seen?

The light at the end of the tunnel is a conflagration, not a bulb.

Joan
 
> - Heikki
> 
> --
> Heikki Levanto  LSD - Levanto Software Development   <heikki@xxxxxxxxxxxxxxxxx>