
fundamental problems for reinforcement

To: "cgo" <computer-go@xxxxxx>
Subject: fundamental problems for reinforcement
From: "Ives Steglich" <dalini@xxxxxxxxxxxxx>
Date: Fri, 7 May 1999 15:38:11 +0200
Delivered-to: computer-go@xxxxxxxxxxxxxxxxx
Sender: computer-go-owner@xxxxxx

so, i have thought about some implementation strategies
for reinforcement learning and ran into some basic
problems:

a) i need a way to detect the end of a game
b) i need a way to score it so that i can say which player has won

these two points are the most important ones to settle, because without them
implementing a client could get very difficult.
determining which groups are alive and which are dead would be useful if i want to know who won a
game

so maybe the first step should be to develop a system that can handle the
group problem
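
as a starting point for the group problem, here is a minimal sketch of how groups could be detected with a flood fill. the board encoding (a list of strings with ".", "X", "O") and the names find_groups and neighbors are assumptions made for illustration only. it only finds connected groups and their liberties; it does not decide whether a group is alive or dead.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[int, int]
EMPTY, BLACK, WHITE = ".", "X", "O"  # hypothetical board encoding


def neighbors(p: Point, size: int) -> List[Point]:
    """Return the on-board points directly adjacent to p."""
    r, c = p
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(y, x) for y, x in candidates if 0 <= y < size and 0 <= x < size]


def find_groups(board: List[str]) -> List[Dict]:
    """Collect every connected group of same-colored stones and its liberties."""
    size = len(board)  # assumes a square board
    seen: Set[Point] = set()
    groups: List[Dict] = []
    for r in range(size):
        for c in range(size):
            color = board[r][c]
            if color == EMPTY or (r, c) in seen:
                continue
            stones: Set[Point] = set()
            liberties: Set[Point] = set()
            stack = [(r, c)]
            seen.add((r, c))
            while stack:  # iterative flood fill over same-colored stones
                p = stack.pop()
                stones.add(p)
                for q in neighbors(p, size):
                    cell = board[q[0]][q[1]]
                    if cell == EMPTY:
                        liberties.add(q)
                    elif cell == color and q not in seen:
                        seen.add(q)
                        stack.append(q)
            groups.append({"color": color, "stones": stones, "liberties": liberties})
    return groups
```

a group whose liberty set is empty has been captured; anything beyond that (eyes, seki, life and death) would need more work on top of this.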

the difference between go and chess or backgammon is that you have no deterministic
end of the game, and you can't tell as easily who won it

chess: the king is checkmated
backgammon: all stones borne off
go: ????
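
one possible way to fill in the "????" is the convention used by Tromp-Taylor style rules: the game ends after two consecutive passes, and the position is scored by area with every stone still on the board counted as alive. this is not something proposed above, only a sketch of one option. the board encoding, the use of None for a pass, and the komi parameter are assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple

EMPTY, BLACK, WHITE = ".", "X", "O"  # hypothetical board encoding
Move = Optional[Tuple[int, int]]     # None stands for a pass (assumption)


def game_over(moves: List[Move]) -> bool:
    """The game ends once both players have passed in a row."""
    return len(moves) >= 2 and moves[-1] is None and moves[-2] is None


def area_score(board: List[str]) -> Dict[str, int]:
    """Count stones plus empty regions that touch only one color."""
    size = len(board)  # assumes a square board
    score = {BLACK: 0, WHITE: 0}
    seen = set()
    for r in range(size):
        for c in range(size):
            cell = board[r][c]
            if cell != EMPTY:
                score[cell] += 1
                continue
            if (r, c) in seen:
                continue
            region, borders = 0, set()
            stack = [(r, c)]
            seen.add((r, c))
            while stack:  # flood fill one empty region
                y, x = stack.pop()
                region += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if not (0 <= ny < size and 0 <= nx < size):
                        continue
                    if board[ny][nx] == EMPTY:
                        if (ny, nx) not in seen:
                            seen.add((ny, nx))
                            stack.append((ny, nx))
                    else:
                        borders.add(board[ny][nx])
            if len(borders) == 1:  # region belongs to exactly one color
                score[borders.pop()] += region
    return score


def winner(board: List[str], komi: float = 0.0) -> Optional[str]:
    """Return BLACK, WHITE, or None for a tie; komi is added to white."""
    s = area_score(board)
    black, white = s[BLACK], s[WHITE] + komi
    if black == white:
        return None
    return BLACK if black > white else WHITE
```

the catch is that dead stones have to be captured before both players pass, so this avoids the life and death question rather than solving it.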

because the structure of a reinforcement learning agent that should learn to play a game would be:
every move gets a reinforcement of r=0
if the game is won, r=1
if the game is lost, r=-1

the system then has to figure out for itself why it lost or won, and figure
out/approximate a policy for
how to play the game in a way that wins it.
you tell it nothing about how to play, just that winning is good and losing is bad
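
the reward scheme described here can be sketched roughly as follows. the function names, the handling of a draw, and the discount factor gamma are assumptions added for illustration; only the values r=0, r=1 and r=-1 come from the description above.

```python
from typing import List, Optional


def step_reward(game_finished: bool, agent_won: Optional[bool]) -> float:
    """r=0 for every ordinary move, r=1 for a win, r=-1 for a loss."""
    if not game_finished:
        return 0.0
    if agent_won is None:  # draw: treated as 0 here (assumption)
        return 0.0
    return 1.0 if agent_won else -1.0


def episode_rewards(num_moves: int, agent_won: Optional[bool]) -> List[float]:
    """Reward sequence for one game: zeros, then the final outcome."""
    rewards = [0.0] * num_moves
    if num_moves:
        rewards[-1] = step_reward(True, agent_won)
    return rewards


def discounted_returns(rewards: List[float], gamma: float = 1.0) -> List[float]:
    """Propagate the final outcome back to earlier moves."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns
```

because the only non-zero reward arrives at the very end, the agent cannot learn anything until a) and b) are answered and the final reward can actually be computed.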

of course the rules of the game have to be implemented ...

of course a second possible structure could be:
you have a multi-agent system like this:
- detect groups
(- rate groups)
- loss of a group ==> negative reinforcement, maybe r=-0.2
- capture of an opponent's group ==> positive reinforcement, maybe r=+0.1
- secure an area ...
and so on

and every agent only tries to maximize its own reinforcement

then there is a super-agent, one that manages which agent
is most important in which situation
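
a very rough sketch of this second structure: each sub-agent gets its own reinforcement from board events, and the super-agent weights them per situation. the event names, the dictionaries, and the idea of expressing importance as numeric weights are assumptions for illustration; only the values -0.2 and +0.1 come from the list above, and how the weights would be chosen is left open.

```python
from typing import Dict

# per-event reinforcement, using the tentative values from the list above
EVENT_REWARDS: Dict[str, float] = {
    "group_lost": -0.2,
    "group_captured": 0.1,
}


def sub_agent_rewards(events: Dict[str, int]) -> Dict[str, float]:
    """Reinforcement for each sub-agent, given how often its event occurred."""
    return {name: r * events.get(name, 0) for name, r in EVENT_REWARDS.items()}


def super_agent_reward(rewards: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine sub-agent reinforcements using situation-dependent weights."""
    return sum(weights.get(name, 0.0) * r for name, r in rewards.items())
```

for example, super_agent_reward(sub_agent_rewards({"group_lost": 1}), {"group_lost": 1.0}) would pass the -0.2 through unchanged, while a weight of 0.0 would ignore that sub-agent in the current situation.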

but that's all in the future if i don't get firm answers to a) and b)

ives


Follow-Ups:
- Re: fundamental problems for reinforcement
  - From: Hans F. Zschintzsch
- Re: fundamental problems for reinforcement
  - From: Weimin Xiao
- Re: fundamental problems for reinforcement
  - From: Mousheng Xu
