
Re: fundamental problems for reinforcement



On Sun, May 09, 1999 at 12:00:05AM +0200, Heikki Levanto wrote:

> I would think that for reinforcement (or other self-play) learning, it
> should be sufficient to say that
>   - the game is over when both players pass
>   - anything on the board is alive
>   - only fully surrounded territory counts
>     (that is, any point that can see both colors of stones
>     is considered nobody's territory)

This is the way I used for training NeuroGo:

- forbid playing into the interior of an unconditionally alive group
  (Benson's algorithm)
- do not allow a player to pass until no more moves are possible
- use Chinese scoring (but treat enemy stones within an alive group as dead)

This will not give correct results with seki, but seki is rare anyway.
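The scoring used here (every stone counts for its owner, an empty region counts as territory only when it touches a single color, and any point that can see both colors is nobody's) can be sketched as a flood fill. This is a minimal illustration under those simplified rules, not NeuroGo's actual code:

```python
def chinese_score(board):
    """board: dict mapping (row, col) -> 'B', 'W' or '.'.
    Returns {'B': points, 'W': points} under simplified Chinese rules."""
    score = {'B': 0, 'W': 0}
    seen = set()
    for point, color in board.items():
        if color in score:
            score[color] += 1            # stones count under Chinese rules
        elif point not in seen:
            # Flood-fill this empty region, recording which colors border it.
            region, borders, stack = [], set(), [point]
            seen.add(point)
            while stack:
                r, c = stack.pop()
                region.append((r, c))
                for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if nb not in board:
                        continue         # off the edge of the board
                    if board[nb] == '.':
                        if nb not in seen:
                            seen.add(nb)
                            stack.append(nb)
                    else:
                        borders.add(board[nb])
            # A region seeing both colors is nobody's territory.
            if len(borders) == 1:
                score[borders.pop()] += len(region)
    return score
```

As noted, this mis-scores seki (live enemy stones inside a shared region would need special handling), which is the trade-off accepted for training.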

I used this only for training. During normal play the program was allowed
to pass if no move led to a position with a better evaluation than
the current position.
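That pass rule amounts to a one-ply greedy move chooser. A minimal sketch, where `legal_moves`, `evaluate`, and `play` are assumed callbacks rather than NeuroGo's actual interface:

```python
def choose_move(position, legal_moves, evaluate, play):
    """Return the move whose resulting position evaluates best,
    or None (meaning: pass) when no move improves on the current
    position's evaluation.  All three callbacks are assumptions."""
    best_move = None
    best_value = evaluate(position)      # value of simply passing
    for move in legal_moves(position):
        value = evaluate(play(position, move))
        if value > best_value:
            best_move, best_value = move, value
    return best_move
```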

- Markus

-- 
Markus Enzenberger | http://home.t-online.de/home/markus.enzenberger