Re: computer-go: Learning from existing games
> Markus Enzenberger wrote:
> > The big problem is the training time. I found TD with self-played games
> > superior to other training methods and it takes at least 100000 games
> > for best results (several weeks or even months on a fast PC).
> > Algorithms for faster weight update are not necessarily helpful, because
> > they decrease the exploration that is done with a given set of weights.
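For reference, a minimal sketch of the kind of TD(0) self-play update
being described, assuming a linear evaluator V(s) = w . phi(s); the
function and variable names here are hypothetical, not Markus's actual
code:

    import numpy as np

    def td0_update(w, phi_s, phi_s_next, reward, alpha=0.01, gamma=1.0):
        # TD error: difference between the bootstrapped target and the
        # current estimate of the position value.
        delta = reward + gamma * np.dot(w, phi_s_next) - np.dot(w, phi_s)
        # Gradient step; for a linear evaluator the gradient of V(s)
        # with respect to w is just the feature vector phi(s).
        return w + alpha * delta * phi_s

Each self-played game supplies a sequence of positions; the update is
applied to every consecutive pair, with the game result as the final
reward.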
One possible solution is to allow a suboptimal theory. (Amateur) humans,
too, weigh things suboptimally. Training time is then significantly
shorter, and the resulting theory is probably not much worse for most
uses (e.g. candidate move generation / bad move elimination).
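As an illustration of the bad-move-elimination use: even a roughly
trained evaluator can prune the move list before a search. A sketch,
where position.play, legal_moves and evaluate are assumed interfaces,
not an existing library:

    def candidate_moves(position, legal_moves, evaluate, k=10):
        # Score every legal move by evaluating the resulting position,
        # then keep only the k best-scoring candidates.
        scored = [(evaluate(position.play(m)), m) for m in legal_moves]
        scored.sort(key=lambda t: t[0], reverse=True)
        return [m for _, m in scored[:k]]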
Also, a smaller theory (for neural nets, e.g. 1e4 weights instead of
1e5) can be both trained and evaluated faster. The faster evaluation
allows one to evaluate several different theories in the same amount of
time. Combining their results (by voting, stacking, or whatever
combination method you prefer) often gives results at least as good as
those of the single big theory.
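A minimal sketch of the voting variant, assuming each small theory is an
evaluation function and that moves are hashable; all names are
hypothetical:

    from collections import Counter

    def vote_move(position, legal_moves, evaluators):
        # Each evaluator casts one vote for the move leading to the
        # position it scores highest; the most-voted move wins.
        votes = Counter()
        for ev in evaluators:
            best = max(legal_moves, key=lambda m: ev(position.play(m)))
            votes[best] += 1
        return votes.most_common(1)[0][0]

Stacking would instead feed the individual evaluations into a second,
small combining model trained on the same data.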
Jan