
Re: computer-go: Learning from existing games



This is an addendum to my previous post.   

I assumed  that you  were talking about  positions and not  moves.  As
Heikki alluded to, that is the best way to think about these things.

Don


   Date: Mon, 13 Jan 2003 13:32:06 -0500
   From: Don Dailey <drd@xxxxxxxxxxxxxxxxx>


   Hi Frank,

   Your idea  is perfectly  valid and is  done by some  game programmers.
   The fact that some moves may be bad is not a bad thing since the whole
   idea of "temporal difference learning", which is basically the same or
   at  least very  similar to  your idea,  is to  gain feedback  from the
   success AND non-success of positional features.

   In  temporal difference learning, it  is considered more  effective to
   gain feedback  from the  WHOLE game, not  just the final  results.  In
   practical terms, your program can view  the score a few moves later as
   more accurate (on the average)  than the current score, and modify the
   weights accordingly.  But  as I said, this is just  a variation on the
   same idea.
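
   A minimal sketch of the update described above, assuming a linear
   evaluation function (a weighted sum of positional features) and one
   feature vector per position, all scored from the same player's point
   of view.  The names evaluate, td_update, lookahead and learning_rate
   are illustrative assumptions, not taken from any actual program.

```python
from typing import List, Sequence


def evaluate(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear evaluation: weighted sum of positional features (an assumption)."""
    return sum(w * f for w, f in zip(weights, features))


def td_update(
    weights: Sequence[float],
    game_features: Sequence[Sequence[float]],
    final_result: float,
    lookahead: int = 2,
    learning_rate: float = 0.01,
) -> List[float]:
    """Make one pass over a game and return adjusted weights.

    Each position's score is nudged toward the score `lookahead` positions
    later, which is treated as the more accurate estimate. Positions too
    close to the end of the game are nudged toward the final result instead.
    """
    new_weights = list(weights)
    n = len(game_features)
    for t in range(n):
        current = evaluate(new_weights, game_features[t])
        if t + lookahead < n:
            # Later score stands in as the training target.
            target = evaluate(new_weights, game_features[t + lookahead])
        else:
            # Near the end, the actual game result is the target.
            target = final_result
        error = target - current
        # Gradient step for a linear evaluator: d(score)/d(w_i) = feature_i.
        for i, f in enumerate(game_features[t]):
            new_weights[i] += learning_rate * error * f
    return new_weights
```

   With lookahead=2 the target position has the same side to move as the
   current one, which avoids mixing scores taken at different points in
   the move cycle.  Repeating the pass over many games lets the final
   results propagate back toward earlier positions.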

   How you go  about implementing this in Go is  another question.  But I
   do actually use this technique  in my chess program to tune evaluation
   weights and it  produces a much stronger program than  I am capable of
   generating on  my own.  (I can't  say for sure that  someone with more
   skill at tuning weights would not do better!)


   Don



      From: frank-steinmann@xxxxxxxxxxxxxxxxx (Frank Steinmann)
      Date: Mon, 13 Jan 2003 17:37:35 +0100

      Hello,

      Realizing that my Go program doesn't make any good moves at all (and is
      also badly designed), I decided to start again with a completely different
      strategy. My program is now going to learn from existing games (and from
      the ones it has played itself).

      My question: To analyze a game, I'd like to evaluate the moves that have
      been made in that game. The simplest way to do that is to give every move
      the value of the game result (positive for the moves of the winner, negative
      for the moves of the loser). But I don't think it is a very promising way,
      because it doesn't take into account that a mix of good moves and bad
      moves finally led to the game result. Are there any better ways to do
      that (apart from getting a game analysis from a professional Go player
      ;-) )?
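
      For concreteness, the simple labelling scheme described above might be
      sketched as follows. The function name, the (player, move) tuple layout
      and the +1/-1 values are illustrative assumptions only.

```python
from typing import List, Tuple


def label_moves_by_result(
    moves: List[Tuple[str, str]], winner: str
) -> List[Tuple[str, str, float]]:
    """Give every move the value of the game result.

    `moves` is a list of (player, move) pairs in game order. Each move by
    the winner gets +1.0 and each move by the loser gets -1.0, regardless
    of how good the individual move actually was.
    """
    return [
        (player, move, 1.0 if player == winner else -1.0)
        for player, move in moves
    ]
```

      Every move by the same player receives an identical value, which is
      exactly the weakness raised in the question.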

      Frank