Re: Learn Alternative Moves
On Sun, 26 Apr 1998, Jens Yllman wrote:
> some kind of paper on this. But first I need some useful results. I would
> not mind discussing this in more detail if anyone is interested. I hope to
> get some ideas from others who are working on this too.
I tried lots of things, but there was no way to get a useful result by
training on a fixed pattern set. How can a learning program learn that
leaving a big group in atari is a bad thing if such a situation never
occurs in the training set? That is why I let the net play against itself.
I would also like to use an actor/evaluator pair, because choosing a move
with an evaluator alone is time-consuming (you have to evaluate every legal
move). Unfortunately, all experiments in this direction failed. Basically, I
tried to train the evaluator by TD learning, while the actor was supposed to
predict, for each move, the difference between the evaluator's value after
that move and its value after passing. It never worked ...
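
To make the setup concrete, here is a rough sketch of the two training
signals described above. It is not the code I actually ran: the linear
evaluator stands in for a net, positions are assumed to be given as plain
feature vectors, and all names (LinearEvaluator, td_target, actor_targets)
are made up for illustration.

```python
from typing import Dict, List, Sequence

Features = Sequence[float]


class LinearEvaluator:
    """Hypothetical stand-in for the evaluator net: value(x) = w . x."""

    def __init__(self, n_features: int, lr: float = 0.1) -> None:
        self.w: List[float] = [0.0] * n_features
        self.lr = lr

    def value(self, x: Features) -> float:
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def td0_update(self, x: Features, target: float) -> float:
        """Move value(x) toward the TD target; return the TD error."""
        delta = target - self.value(x)
        for i, xi in enumerate(x):
            self.w[i] += self.lr * delta * xi
        return delta


def td_target(reward: float, next_value: float, gamma: float = 1.0) -> float:
    """One-step TD(0) target: reward plus discounted value of the next position."""
    return reward + gamma * next_value


def actor_targets(
    evaluator: LinearEvaluator,
    after_move: Dict[str, Features],
    after_pass: Features,
) -> Dict[str, float]:
    """Actor training target per move: V(after move) - V(after pass).

    Computing these targets still evaluates every legal move once; the
    hoped-for saving comes only later, when a trained actor is used alone.
    """
    baseline = evaluator.value(after_pass)
    return {m: evaluator.value(x) - baseline for m, x in after_move.items()}
```

At play time the evaluator-only approach would pick
max(targets, key=targets.get) after evaluating every legal move, whereas a
trained actor would be meant to produce all of those differences in a
single pass over the current position.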
- Markus