
Re: some ideas



On Mon, 3 May 1999, Heikki Levanto wrote:

> Henrik Rydberg (rydberg@xxxxxxxxxxxxxxxxx) wrote in lsd.compgo:
> 
> : As opposed to for instance Backgammon [1], Go is strictly
> : deterministic, leading to some problems when applying
> : algorithms such as temporal difference learning (TD).
> 
> I think the problem with go is not so much the determinism, but the
> difficulty in evaluating positions. 
> 
> In Backgammon both players move closer to their goals on every move. A quick
> estimate of the score can be done just by calculating the sum of the
> distance each piece has to travel. Estimating risks is easy (counting
> blotted pieces).
> 
> But in go, things are considerably more complex. Even a rough estimate of the
> score is no easy thing to obtain, and group safety is indeed a hard nut to
> crack.

I would second Rydberg's opinion: the reason TD succeeds for Backgammon
is the stochastic nature of the game. The probabilistic component smooths
the state space, so that there really are very `similar' positions and
game experience can be generalized across them. Go, by contrast, is very
chaotic in the sense that a minor change in a board position can totally
change its evaluation. I think this is the main reason why neural-net
based approaches have failed: neural nets are good at learning continuous
functions, but not chaotic ones.
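
To make the smoothing argument concrete, here is a toy sketch in Python.
It models neither Go nor Backgammon: the "positions" are just integers,
and chaotic_eval, dice_smoothed_eval and roughness are names made up for
this illustration. The deterministic evaluation flips sign whenever the
number of set bits in the position changes parity, so neighbouring
positions often get opposite values. Averaging over a hypothetical
six-sided die roll before evaluating gives a function whose neighbouring
values can differ by at most 1/3.

```python
def chaotic_eval(s):
    """Toy deterministic evaluation (an assumption, not a real game):
    +1 or -1 depending on the parity of the set bits in s."""
    return 1.0 if bin(s).count("1") % 2 == 0 else -1.0


def dice_smoothed_eval(s, sides=6):
    """Expected evaluation after one uniformly random move of 1..sides steps."""
    return sum(chaotic_eval(s + r) for r in range(1, sides + 1)) / sides


def roughness(values):
    """Mean absolute difference between neighbouring states."""
    return sum(abs(a - b) for a, b in zip(values, values[1:])) / (len(values) - 1)


states = range(64)
raw = [chaotic_eval(s) for s in states]
smooth = [dice_smoothed_eval(s) for s in states]

# The smoothed function varies far less between neighbouring states.
print("raw roughness:     ", roughness(raw))
print("smoothed roughness:", roughness(smooth))
```

The smoothed values of two neighbouring positions share five of their six
terms, which is why they cannot differ much. That is the sense in which
chance moves make nearby positions `similar'; a deterministic game has no
such averaging step.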

-- 
Antti Huima
SSH Communications Security Oy