
Re: computer-go: Learning from existing games

To: computer-go@xxxxxxxxxxxxxxxxx
Subject: Re: computer-go: Learning from existing games
From: Don Dailey <drd@xxxxxxx>
Date: Mon, 13 Jan 2003 13:32:06 -0500
Cc: computer-go@xxxxxxxxxxxxxxxxx
In-reply-to: <003501c2bb22$145b38c0$416e8150@steinmann>(frank-steinmann@xxxxxxxxxxxxxxxxx)
References: <003501c2bb22$145b38c0$416e8150@steinmann>
Reply-to: computer-go@xxxxxxxxxxxxxxxxx
Sender: owner-computer-go@xxxxxxxxxxxxxxxxx

Hi Frank,

Your idea is perfectly valid and is used by some game programmers.
The fact that some moves may be bad is not a problem, since the whole
idea of "temporal difference learning", which is basically the same as
or at least very similar to your idea, is to gain feedback from the
success AND non-success of positional features.

In temporal difference learning, it is considered more effective to
gain feedback from the WHOLE game, not just the final result. In
practical terms, your program can treat the score a few moves later as
more accurate (on average) than the current score, and modify the
weights accordingly. But as I said, this is just a variation on the
same idea.
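
To make that concrete, here is a minimal sketch of such an update. It
assumes a linear evaluation over hand-picked features, which is only
one possible setup and not necessarily what any particular program
does. The names evaluate, td_update, alpha and step are made up for
illustration, as is the encoding of a position as a list of feature
values seen from one player's side.

```python
def evaluate(weights, features):
    """Linear evaluation: weighted sum of the feature values."""
    return sum(w * f for w, f in zip(weights, features))


def td_update(weights, positions, result, alpha=0.01, step=1):
    """One temporal-difference pass over a single recorded game.

    positions: feature vectors, one per position, all from the same
               player's point of view (hypothetical encoding).
    result:    final outcome from that view, e.g. +1.0 win, -1.0 loss.
    step:      how many positions ahead to look for the "more accurate"
               score; past the end of the game the result is used.
    """
    weights = list(weights)  # work on a copy, leave the input untouched
    for t, features in enumerate(positions):
        if t + step < len(positions):
            # Score a few moves later, taken as the better estimate.
            target = evaluate(weights, positions[t + step])
        else:
            # No later position left: fall back to the game result.
            target = result
        error = target - evaluate(weights, features)
        # Nudge each weight in proportion to its feature's value.
        for i, f in enumerate(features):
            weights[i] += alpha * error * f
    return weights
```

Running td_update repeatedly over many games lets the result at the end
of each game propagate back to earlier positions, so a move is not
simply labelled good or bad by who won.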

How you go about implementing this in Go is another question. But I
do actually use this technique in my chess program to tune evaluation
weights, and it produces a much stronger program than I am capable of
producing on my own. (I can't say for sure that someone with more
skill at tuning weights would not do better!)


Don



   From: frank-steinmann@xxxxxxxxxxxxxxxxx (Frank Steinmann)
   Date: Mon, 13 Jan 2003 17:37:35 +0100

   Hello,

   Realizing that my Go program doesn't make any good moves at all (and is
   also badly designed), I decided to start again with a completely different
   strategy. My program is now going to learn from existing games (and from
   the ones it has played itself).

   My question: To analyze a game, I'd like to evaluate the moves that were
   made in that game. The simplest way to do that is to give every move
   the value of the game result (positive for the winner's moves, negative
   for the loser's moves). But I don't think that is a very promising way,
   because it doesn't consider that there could be some good moves and some bad
   moves which together lead to the game result. Are there any better ways to do
   that (apart from getting a game analysis from a professional Go player
   ;-) )?

   Frank
