computer-go: Temporal Differences again
I'm having real trouble with my temporal difference learning. As far
as I can tell, the problem stems from the fact that, except at the end
of the game, the reinforcement signal is simply the system's own
estimate of the board value. This noise seems to overwhelm the real
signal that appears at the end of the game.
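To make the problem concrete, here is a minimal sketch of the bootstrapped TD(0) update I mean. All names here (`values`, `states`, `alpha`) are hypothetical, not from any particular Go program: each position's value is nudged toward the *estimated* value of the following position, so until good estimates propagate back, the target is mostly the system's own noise.

```python
def td0_update(values, states, reward, alpha=0.1):
    """Bootstrapped TD(0) over one game.

    values: dict mapping position -> estimated value
    states: sequence of positions encountered in the game
    reward: terminal result (e.g. +1 for a win, -1 for a loss)
    """
    for i in range(len(states) - 1):
        s, s_next = states[i], states[i + 1]
        # Bootstrap target: the current estimate of the next position,
        # not the true game outcome.
        target = values.get(s_next, 0.0)
        values[s] = values.get(s, 0.0) + alpha * (target - values.get(s, 0.0))
    # Only the final position is trained on the real signal.
    last = states[-1]
    values[last] = values.get(last, 0.0) + alpha * (reward - values.get(last, 0.0))
    return values
```

With untrained (zero) estimates, a single pass leaves every non-terminal position at zero: the win signal reaches only the last position, and takes many games to creep backward.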
Cursory experiments suggest it is better to play to the end of the
game, then go back and teach the system the final result as the target
for each board position encountered along the way.
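That play-to-the-end scheme amounts to a Monte Carlo style update: every position in the game is pulled directly toward the final result, so the true signal reaches all positions in one pass. A sketch under the same hypothetical names as before:

```python
def monte_carlo_update(values, states, reward, alpha=0.1):
    """Teach every position in the game the final result directly.

    values: dict mapping position -> estimated value
    states: sequence of positions encountered in the game
    reward: terminal result (e.g. +1 for a win, -1 for a loss)
    """
    for s in states:
        # Every position's target is the actual game outcome.
        values[s] = values.get(s, 0.0) + alpha * (reward - values.get(s, 0.0))
    return values
```

The trade-off, as I understand it, is higher variance per game (one result stands in for all positions, good moves and blunders alike) in exchange for an unbiased target, whereas TD bootstrapping has lower variance but is biased by the current estimates.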
Thoughts?
Peter Drake
Assistant Professor of Computer Science
Lewis & Clark College
http://www.lclark.edu/~drake/