
Re: computer-go: A problem with understanding lookahead



Oh yes, there is a very good experiment on this (Don Beal's, if I
remember well) using a RANDOM evaluation.

It appears that a random evaluation is basically going to favour
playing for space.

So a 10-ply search with a random evaluation easily beats the same
program doing a 1-ply search with a random evaluation.

This is also the case in Go, as the goal of the game is to optimize
your own space. In other words: I beat my opponent if I have more
space than he does.

So a deeper search is likely to increase your winning chances.

It is of course very QUESTIONABLE whether any depth of search from a
randomly evaluating program will EVER beat even a simple evaluation
in a 1-ply searching Go program.

That is, however, a completely different question, and one that still
gets asked about all kinds of game-playing programs.

What is very sure is that searching deeper, even with a random
evaluation, is going to improve your program, as long as you compare
it with the same program at a smaller search depth.
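
You can see the mechanism in a few lines of code. Here is a minimal
sketch (my own toy example for this mail, not anybody's real program;
the branching numbers 8 and 3 are arbitrary). It backs a purely
random leaf evaluation up a game tree in which one side always has
more moves, i.e. more space, than the other:

    import random

    def backed_up(depth, my_moves, opp_moves, maximizing=True):
        # Uniform game tree: the maximizing side always has `my_moves`
        # legal moves, the minimizing side `opp_moves`.  The leaf
        # "evaluation" is pure noise, uniform on [0, 1].
        if depth == 0:
            return random.random()
        width = my_moves if maximizing else opp_moves
        values = [backed_up(depth - 1, my_moves, opp_moves, not maximizing)
                  for _ in range(width)]
        return max(values) if maximizing else min(values)

    def estimate(depth, my_moves, opp_moves, trials=1000):
        # Average backed-up value over many random trees.
        return sum(backed_up(depth, my_moves, opp_moves)
                   for _ in range(trials)) / trials

    if __name__ == "__main__":
        for depth in (1, 2, 3, 4):
            roomy   = estimate(depth, my_moves=8, opp_moves=3)
            cramped = estimate(depth, my_moves=3, opp_moves=8)
            print("depth %d: roomy %.3f  cramped %.3f"
                  % (depth, roomy, cramped))

At every depth the roomy position gets the clearly higher backed-up
score, while a 0-ply evaluation of the two positions would be
identical noise. A program that prefers the higher backed-up random
score therefore steers toward space without ever being told to, and
only a search can see space that opens up several moves ahead.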

It took a few years, however, to discover that playing with a random
evaluation is, roughly speaking, a very vague way of playing to
optimize the open space for the side to move. Some clever
mathematician who, like Knuth, can handle alpha-beta mathematically
might write it down properly for you.
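
The first piece of that math is in fact simple enough to write down
here (my own back-of-the-envelope note, so check it yourself): if the
side to move has n legal moves and the leaf scores are independent
uniform draws on [0,1], then a 1-ply search backs up the maximum of n
such draws, and

    E[max(U_1, ..., U_n)] = integral_0^1 (1 - x^n) dx = n/(n+1)

which grows with n. So even one ply of minimax turns a random
evaluation into a noisy mobility counter, and deeper minimax keeps
compounding that preference for the side with the moves.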

At 06:08 PM 1/17/01 -0500, you wrote:
>
>Yes, if your evaluation is totally out of whack I agree with you.  The
>power of  the search is that  it helps you reach  your  goals, even if
>your goals are stupid and out of line.  You might regret reaching your
>goals, but you get better at it with increasing search depth.
>
>So  I think the  thought experiment is valid.   If  your evaluation is
>this far out of  whack, then your  program really just doesn't have  a
>clue about  what is good  and what isn't and  thus a 1 ply search does
>the least damage.  In fact, maybe  you should just  change the sign of
>the evaluation!
>
>Don
>
>
>   From: "Mark Boon" <tesuji@xxxxxxxxxxxxxxxxx>
>
>   Well, your thought experiment is a good one in the case of a
>   reliable evaluation, but it's not necessarily applicable in the
>   case of an unreliable evaluation. Most search strategies rely on
>   alpha-beta cut-offs or something similar to cut down the tree
>   size, which means you need to use evaluations to determine what to
>   cut off. If the evaluation is unreliable, the actual tree becomes
>   just as unreliable. I think David has written several times here
>   that he experienced that there's a point where more reading made
>   the program less reliable, because you give it more opportunity to
>   construct a false search tree.
>
>   This is not the case in Chess because it's so much easier to make
>   a reliable evaluation function.
>
>   Did anyone do any research on this? I mean, trying to quantify
>   theoretically how much the evaluation has to be wrong to get a
>   less reliable tree at a given depth. It might possibly be that
>   it's only the case for really very unreliable functions, which
>   will of course be the case for Go. That would be very useful
>   information.
>
>       Mark
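
A small note on the cut-offs Mark mentions above: in FULL WIDTH
alpha-beta the cut-offs are sound, so the value that comes back is
exactly the minimax value of the same evaluation, and the noise only
enters at the leaves; it is in selective schemes, where evaluations
decide which lines get searched at all, that a bad evaluation also
deforms the tree itself. For reference, a minimal textbook alpha-beta
(the Node class and the random evaluation are only scaffolding I made
up for this mail):

    import random

    class Node:
        def __init__(self, children=None):
            self.children = children or []

    def alphabeta(node, depth, alpha, beta, maximizing, evaluate):
        # Full-width alpha-beta.  Every cut-off below is justified by
        # values already backed up, so the result equals plain minimax
        # of `evaluate` -- but WHICH branches get examined depends
        # entirely on the numbers `evaluate` returns.
        if depth == 0 or not node.children:
            return evaluate(node)
        if maximizing:
            value = float("-inf")
            for child in node.children:
                value = max(value, alphabeta(child, depth - 1,
                                             alpha, beta, False, evaluate))
                alpha = max(alpha, value)
                if alpha >= beta:
                    break  # opponent already has a better alternative
            return value
        value = float("inf")
        for child in node.children:
            value = min(value, alphabeta(child, depth - 1,
                                         alpha, beta, True, evaluate))
            beta = min(beta, value)
            if alpha >= beta:
                break  # we already have a better alternative
        return value

    # depth-2 search of a tiny uniform tree with a noisy evaluation:
    root = Node([Node([Node(), Node()]) for _ in range(3)])
    print(alphabeta(root, 2, float("-inf"), float("inf"), True,
                    lambda node: random.random()))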
>
>   From: Don Dailey <drd@xxxxxxxxxxxxxxxxx>
>   To: <computer-go@xxxxxxxxxxxxxxxxx>
>   Sent: Wednesday, January 17, 2001 3:16 AM
>   Subject: Re: computer-go: A problem with understanding lookahead
>
>
>   >
>   > Lookahead  is a simple  concept  but very  few  people seem  to have a
>   > significant grasp of what it is all about.
>   >
>   > In a  nutshell, if I have a  clearly defined goal  and a function that
>   > recognizes  it, then in  a 2 player  perfect information game like Go,
>   > Chess, Checkers, tic-tac-toe etc,  you can  try all possibilities  and
>   > discover that you  can either achieve your  goal or not.  There are no
>   > exceptions,  this always works,  as long as the   goal can be measured
>   > with 100% certainty.
>   >
>   > But with imperfect evaluation functions, things are different.
>   > Whether you search deep or shallow, your end-node evaluation is
>   > just as imperfect.  So why does passing it down the tree make it
>   > any better?  Or does it?
>   >
>   > The answer is  that there  is  some magic   in the fact  that a  score
>   > actually survived the journey to the root node!  This single score had
>   > to  compete with a vicious adversary  (during the min-max process) who
>   > has  the opposite goal.  In  a nutshell, it  boils down  to this: It's
>   > better to be happy 20 moves from now than it is to be happy right
>   > now!   That's the key to  mini-max  and why it  works.   It works with
>   > humans too, consider:
>   >
>   > Let's do a strange thought experiment.  Suppose, as a Go player,
>   > you had the ability to know exactly how you would feel 20 moves
>   > later about any move you make.  Say you like a certain move, but
>   > a voice in your head magically tells you that 20 moves later you
>   > will not like your position.  Let's say this ability is
>   > integrated in a nicely recursive fashion into your brain, so that
>   > you know that the move you're currently thinking of playing is
>   > the culprit, not one you might play later (which is exactly what
>   > a search does).  Would this ability make you a better player?  In
>   > my opinion it would make a very good player practically
>   > unbeatable, and a weak player much better.
>   >
>   > The "magic" seems to  be that if  you can survive  many moves  and you
>   > still  like your position (which  a search with an evaluation function
>   > proves because it  "knows" it will like  it's position n moves later),
>   > then it's probably a pretty good move.
>   >
>   > People used to wonder, in Chess, why just  doing a single extra ply of
>   > searching  was so powerful,  despite the fact that  it  SEEMED like we
>   > needed a whole lot more just to get a little improvement.  But this is
>   > why.  The "obvious" conclusions were completely wrong.  Even very weak
>   > players often realize after seeing the opponent's reply that they
>   > "screwed up."  I do this  in Go myself.   If 1 ply error free searches
>   > were  built into the  human brain, every chess  player would improve a
>   > class.
>   >
>   > I think it's true of Go too.  However, we have to assume a "reasonable"
>   > evaluation function and FULL WIDTH search for these things to be true.
>   > Unfortunately, anything more than a 1 or 2 ply FULL WIDTH search seems
>   > to be unreasonably difficult  in GO, if  you also require a reasonably
>   > good evaluation function!
>   >
>   >
>   > Don
>   >
>   >
>   >
>   >
>   >    Date: Tue, 16 Jan 2001 21:03:22 +0100
>   >    From: "GL7: David Elsdon" <gl7@xxxxxxxxxxxxxxxxx>
>   >
>   >    Hello all,
>   >
>   >    I have a serious problem with lookahead. I don't understand
>   >    why it works. Why should the backed-up values of, say, a 7-ply
>   >    full-width alpha-beta minimax search be any better than the
>   >    values my evaluation function gives me by simply evaluating
>   >    the positions at ply 1? I can understand that if the end of
>   >    the game is in sight then lookahead is useful. I can
>   >    understand that if I use a quiescence measure or some such,
>   >    and only evaluate positions when I know that my evaluation
>   >    function will work best, then lookahead is useful. But if
>   >    neither of these is the case, then does lookahead help? Is
>   >    there some way in which the backing up of values and
>   >    minimaxing somehow reduces the error in the evaluations?
>   >
>   >    I really am seriously stuck with this one.
>   >
>   >    Cheers
>   >
>   >    David