Re: [computer-go] TD(lambda), Neural Networks and evaluation functions

To: computer-go <computer-go@xxxxxxxxxxxxxxx>
Subject: Re: [computer-go] TD(lambda), Neural Networks and evaluation functions
From: Erik van der Werf <E.vanderWerf@xxxxxxxxxxxxx>
Date: Wed, 17 Sep 2003 00:54:56 +0200

Imran Ghory wrote:

I've had an interesting idea about implementing a Go program using Neural
Networks (I'm assuming a 9x9 board, although there's no reason it couldn't be
extended to 19x19). It basically runs along the following lines:

1) Create 81 neural networks (one associated with each intersection on the
board). Let's represent them by N(x, input-board) with x=1...81.
2) Use temporal difference learning to teach the neural networks, with the
rewards being +1/-1 depending on which side controls that intersection at
the end of that game.

You'd play by having an evaluation function which, for a given board,
returns sum(N(x, input-board); x=1..81), i.e. it would predict the
final score.
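
As a rough illustration of the scheme described above, here is a minimal
sketch in Python. Everything in it is an assumption made for the example:
the names PointPredictor, evaluate and train_on_game are hypothetical, each
"network" is reduced to a single tanh unit, the board is a flat list of 81
values (+1 black, -1 white, 0 empty), and the update is plain TD(0) rather
than full TD(lambda).

```python
import math

BOARD = 9
POINTS = BOARD * BOARD  # 81 intersections


class PointPredictor:
    """Hypothetical stand-in for one network N(x, board): a single tanh unit."""

    def __init__(self, n_inputs):
        self.w = [0.0] * n_inputs
        self.b = 0.0

    def predict(self, board):
        # Output in (-1, +1): predicted final owner of this intersection.
        s = self.b + sum(wi * bi for wi, bi in zip(self.w, board))
        return math.tanh(s)

    def nudge(self, board, target, alpha):
        # One gradient step moving predict(board) toward target.
        out = self.predict(board)
        step = alpha * (target - out) * (1.0 - out * out)  # tanh derivative
        for i, bi in enumerate(board):
            self.w[i] += step * bi
        self.b += step


def evaluate(predictors, board):
    """Sum of the per-point predictions, i.e. the predicted final score."""
    return sum(p.predict(board) for p in predictors)


def train_on_game(predictors, positions, final_owner, alpha=0.1):
    """TD(0) pass over one game.

    positions   -- boards in the order they occurred
    final_owner -- 81 values, +1/-1, who controls each point at the end
    """
    for x, p in enumerate(predictors):
        for t, board in enumerate(positions):
            if t + 1 < len(positions):
                target = p.predict(positions[t + 1])  # bootstrap from next position
            else:
                target = final_owner[x]  # reward at the end of the game
            p.nudge(board, target, alpha)


predictors = [PointPredictor(POINTS) for _ in range(POINTS)]
```

With untrained predictors, evaluate(predictors, board) is 0 for any board;
after calls to train_on_game it moves toward the summed final ownership seen
in training. The raw-board input is used here only to keep the sketch short.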

This approach would have an advantage over other TD approaches, as the
functions that the NN has to approximate would be very smooth. Also, there
would be many linear features (for instance, stones surrounding a
point) that would allow the networks to "bootstrap" themselves a la
TD-Gammon.

Has anyone experimented with this kind of approach before?

Imran Ghory

This idea has been tried by several people. Whether it can work adequately depends mostly on your input features (a raw board representation will perform poorly).

PS: Because of symmetry you only need to train 15 networks for the 9x9 board :-)
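
The count of 15 can be checked by grouping intersections that map onto each
other under the eight rotations and reflections of the square board. A small
sketch follows; the function name symmetry_classes and the (row, column)
coordinates starting at 0 are assumptions made for this example.

```python
def symmetry_classes(size=9):
    """Group board points that are equivalent under rotation and reflection."""
    n = size - 1

    def images(r, c):
        # All 8 images of (r, c): four rotations, each with its mirror image.
        pts = []
        for _ in range(4):
            r, c = c, n - r          # rotate by 90 degrees
            pts.append((r, c))
            pts.append((r, n - c))   # mirror left-right
        return pts

    classes = {}
    for r in range(size):
        for c in range(size):
            rep = min(images(r, c))  # canonical representative of the class
            classes.setdefault(rep, []).append((r, c))
    return classes


classes_9x9 = symmetry_classes(9)
print(len(classes_9x9))  # 15
```

Sharing one network across a class presumably also means rotating or
mirroring the input board the same way before it is fed to that network.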

Regards,
Erik
