
Re: computer-go: Temporal Difference Learning

To: computer-go@xxxxxxxxxxxxxxxxx
Subject: Re: computer-go: Temporal Difference Learning
From: Markus Enzenberger <compgo@xxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 15 Jul 2003 12:24:19 -0600
In-reply-to: <47B7E3A2-B650-11D7-8A74-0003937E1CFC@xxxxxxxxxxxxxxxxx>
References: <47B7E3A2-B650-11D7-8A74-0003937E1CFC@xxxxxxxxxxxxxxxxx>
Reply-to: computer-go@xxxxxxxxxxxxxxxxx
Sender: owner-computer-go@xxxxxxxxxxxxxxxxx
User-agent: KMail/1.5.1

> > there is an algorithm called TDLeaf, but I am not
> > convinced that it is useful.
>
> A quick web search found a paper by Baxter, Tridgell, and
> Weaver.  Is this the canonical one?

yes.

> Also, can you say why you're not convinced this is
> useful?

It was used for training evaluation functions in chess that
took the material value of the position as input.
The disadvantage is that the material value can change at
every move during an exchange of pieces, which would give
you horrible training patterns.
TDLeaf avoids this by using search to get more appropriate
target positions for training (e.g. after the exchange has
happened).
But you pay a very high price for it, because every move
chosen during self-play now requires a search, whose cost
grows exponentially with depth.
IMHO it would have been better to do a quiescence search to
determine the material value of a position used as input
for the evaluation function, and to choose the moves during
self-play by 1-ply look-ahead.
However, I haven't performed any experiments, and the neural
network in NeuroGo is much too slow to use TDLeaf.
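
To make the two options concrete, here is a minimal sketch on a toy
position tree. It is not code from any chess program or from NeuroGo:
the tree, the material numbers, and the names pv_leaf and
quiescent_material are all made up for illustration. It compares the
raw material value along a game line with (a) the value at the leaf of
the principal variation, which is roughly what TDLeaf trains on, and
(b) the material value after a capture-only quiescence search.

```python
# Toy sketch: positions are nodes in a hand-written tree (all names hypothetical).
# White captures a knight (+3), then Black recaptures (material back to 0).
CHILDREN = {
    "start": ["capture", "quiet"],
    "capture": ["recapture"],
    "quiet": ["quiet_reply"],
    "recapture": [],
    "quiet_reply": [],
}
CAPTURES = {"start": ["capture"], "capture": ["recapture"]}  # capture moves only
MATERIAL = {"start": 0, "capture": 3, "recapture": 0, "quiet": 0, "quiet_reply": 0}


def evaluate(pos):
    """Static evaluation: material balance from White's point of view."""
    return float(MATERIAL[pos])


def pv_leaf(pos, depth, white_to_move):
    """Fixed-depth minimax; returns (value, leaf of the principal variation)."""
    kids = CHILDREN[pos]
    if depth == 0 or not kids:
        return evaluate(pos), pos
    results = [pv_leaf(k, depth - 1, not white_to_move) for k in kids]
    pick = max if white_to_move else min
    return pick(results, key=lambda r: r[0])


def quiescent_material(pos, white_to_move):
    """Material after resolving captures; the side to move may also stand pat."""
    best = evaluate(pos)
    pick = max if white_to_move else min
    for k in CAPTURES.get(pos, []):
        best = pick(best, quiescent_material(k, not white_to_move))
    return best


# The game line actually played: (position, white_to_move).
game = [("start", True), ("capture", False), ("recapture", True)]
raw = [evaluate(p) for p, _ in game]                  # swings during the exchange
leaf = [pv_leaf(p, 2, w)[0] for p, w in game]         # values at the PV leaves
quiet = [quiescent_material(p, w) for p, w in game]   # capture-only search
print(raw, leaf, quiet)
```

In this toy line the raw material value jumps to +3 in the middle of
the exchange, while both the PV-leaf values and the quiescent material
values stay at 0. The difference is in cost: pv_leaf searches every
move to a fixed depth, whereas quiescent_material only follows
captures.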

> > NeuroGo in its most recent version uses local
> > connectivity and single-point eyes as additional
> > outputs that are trained with TD. I will present a
> > paper about this at ACG2003 which takes place together
> > with the Computer Olympiad in Graz/Austria in November.
>
> So when and how do those of us stuck stateside get ahold
> of it?  :-)

I'll put the paper online when the final version is ready.

- Markus
