[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: computer-go: A problem with understanding lookahead



Lookahead  is a simple  concept  but very  few  people seem  to have a
significant grasp of what it is all about.

In a  nutshell, if I have a  clearly defined goal  and a function that
recognizes  it, then in  a 2 player  perfect information game like Go,
Chess, Checkers, tic-tac-toe etc,  you can  try all possibilities  and
discover that you  can either achieve your  goal or not.  There are no
exceptions,  this always works,  as long as the   goal can be measured
with 100% certainty.

But with     imperfect evaluation  functions,    things are different.
Whether you search  deep or shallow,  your end-node evaluation is just
as imperfect.   So why does  passing it  down  the  tree makes it  any
better?  Or does it?

The answer is  that there  is  some magic   in the fact  that a  score
actually survived the journey to the root node!  This single score had
to  compete with a vicious adversary  (during the min-max process) who
has  the opposite goal.  In  a nutshell, it  boils down  to this: It's
better to  be happy 20 moves  from now, that it  is  to be happy right
now!   That's the key to  mini-max  and why it  works.   It works with
humans too, consider:

Let's do a strange  thought experiment.  Suppose,  as a Go player, you
had the  ability  to know exactly  how  you would feel  20 moves later
about any move you make.  Let's say I like a certain move, but a voice
in my head magically tells me that  20 moves later  I will not like my
position.  Let's say this ability  is integrated in a nicely recursive
fashion into your  brain, so that I know  that the  move I'm currently
thinking of playing is the culprit, not one I  might play later (which
is exactly what a search does.)  Would this ability  make you a better
player?  In my  opinion it would make a  very good  player practically
unbeatable, and a weak player much better.  

The "magic" seems to  be that if  you can survive  many moves  and you
still  like your position (which  a search with an evaluation function
proves because it  "knows" it will like  it's position n moves later),
then it's probably a pretty good move.

People used to wonder, in Chess, why just  doing a single extra ply of
searching  was so powerful,  despite the fact that  it  SEEMED like we
needed a whole lot more just to get a little improvement.  But this is
why.  The "obvious" conclusions were completely wrong.  Even very weak
players  often realize  after  seeing  the opponents  reply that  they
"screwed up."  I do this  in Go myself.   If 1 ply error free searches
were  built into the  human brain, every chess  player would improve a
class.

I think it's true of Go too.  However we have to assume a "reasonable"
evaluation function and FULL WIDTH search for these things to be true.
Unfortunately, anything more than a 1 or 2 ply FULL WIDTH search seems
to be unreasonably difficult  in GO, if  you also require a reasonably
good evaluation function!
 

Don




   X-Sender: gl7@xxxxxxxxxxxxxxxxx
   X-Mailer: QUALCOMM Windows Eudora Version 5.0.2
   Date: Tue, 16 Jan 2001 21:03:22 +0100
   From: "GL7: David Elsdon" <gl7@xxxxxxxxxxxxxxxxx>
   References: <20010108235912Z4277-62332+78@xxxxxxxxxxxxxxxxx>
   Mime-Version: 1.0
   Sender: owner-computer-go@xxxxxxxxxxxxxxxxx
   Precedence: bulk
   Reply-To: computer-go@xxxxxxxxxxxxxxxxx
   Content-Type: text/plain; charset="us-ascii"; format=flowed
   Content-Length: 780

   Hello all,

   I have a serious problem with lookahead. I don't understand why it works. 
   Why should the backed up values of a, say, 7 ply full width alpha-beta 
   minimax search be any better than the values my evaluation function gives 
   me by simply evaluating the positions at ply 1. I can understand that if 
   the end of the game is in sight then lookahead is useful. I can 
   understand  that if I use a quiesence measure or some such and only 
   evaluate positions when I know that my evaluation function will work best 
   then lookahead is useful. But if neither of these is the case then does 
   lookahead help. Is there some way in which the backing up of values and 
   minimaxing somehow reduces the error in the evaluations.

   I really am seriously stuck with this one.

   Cheers

   David