
TD(-leaf) for GO



What I'd like to see with GO is mimicking the stochastic nature of Backgammon
for the purposes of training the network using TD. I.e., choose randomly
from amongst a set of reasonable moves (say the top 10 moves of a program
like HandTalk or Many Faces) as the game goes along, or vary the opponent
between games. Naturally this is not the way Backgammon nets like Snowie
or JellyFish or TD-Gammon were trained (I don't think), but at least the
state space explored would be comparatively enlarged. I.e., don't keep
searching down the same rut pathways.
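
Just as a sketch of that selection rule (the best-first move ranking is an
assumed interface here, not something these engines necessarily expose):

    import random

    def pick_training_move(ranked_moves, n_candidates=10):
        """Sample uniformly among a reference engine's top moves."""
        # ranked_moves: candidate moves ordered best-first by a program
        # like HandTalk or Many Faces (that ranking is assumed here).
        return random.choice(ranked_moves[:n_candidates])

Uniform sampling is the simplest version; weighting by the engine's own
move scores would keep the games closer to reasonable play while still
varying them.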

This should let the net explore a larger search space than the usual "best of
N" selection method. It would also bias the network much less towards being
simply anti (or pro) the reference program's style, so the patterns trained
should be better and more useful. They could possibly be shared with GO
experts after a (long) training run for further refinement.

If necessary, take a bunch of automated opponents and, better, a bunch of
humans, say on IGS, and train the net against 'em. Choose a bunch of
generally useful Go features for training; if necessary, ask this list,
Fotland, or Zhixing for feature ideas. Use a full 19x19 board. Let it play
24 hours a day, 7 days a week, for months. Compare the resulting net with
hand-tuned programs like Many Faces and HandTalk, as well as with the
starting program.
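
I won't pin down the exact update rule here, but for concreteness, a rough
sketch of one TD(lambda) pass over a finished game with a linear evaluation
over those hand-chosen features. features(position) is a hypothetical
function returning the feature activations as a vector; the alpha and
lambda values are only illustrative:

    import numpy as np

    def td_lambda_game_update(positions, outcome, features, weights,
                              alpha=0.01, lam=0.7):
        """One TD(lambda) pass over a game, linear evaluation."""
        trace = np.zeros_like(weights)
        for t in range(len(positions) - 1):
            x_t = features(positions[t])        # feature vector
            v_t = weights @ x_t                 # current evaluation
            if t + 1 == len(positions) - 1:
                v_next = outcome                # terminal value = result
            else:
                v_next = weights @ features(positions[t + 1])
            delta = v_next - v_t                # TD error
            trace = lam * trace + x_t           # eligibility trace
            weights = weights + alpha * delta * trace
        return weights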

KnightCap's (and its predecessor's) TD-leaf for Chess resulted in something
like a 300-point rating jump over 3 days of play on the chess server,
mostly against humans.
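
For reference, TD-leaf (from Baxter, Tridgell, and Weaver's KnightCap work)
applies the TD update at the leaf of each search's principal variation
rather than at the root position. A rough sketch with a linear evaluation,
where search_pv_leaf is a stand-in for a real minimax search returning the
PV leaf:

    import numpy as np

    def td_leaf_update(roots, outcome, search_pv_leaf, features,
                       weights, alpha=0.01, lam=0.7):
        """TDLeaf(lambda)-style update over one game's root positions."""
        leaves = [search_pv_leaf(p) for p in roots]   # PV leaf per move
        values = [float(weights @ features(leaf)) for leaf in leaves]
        values.append(outcome)    # game result ends the value sequence
        for t in range(len(leaves)):
            # lambda-discounted sum of future one-step TD errors
            delta_sum = sum(lam ** (k - t) * (values[k + 1] - values[k])
                            for k in range(t, len(leaves)))
            weights = weights + alpha * delta_sum * features(leaves[t])
        return weights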

--Stuart