
Re: computer-go: A problem with understanding lookahead



No offense, but this is stupid.  It is completely clear that
an n-ply search with random evaluation always plays a uniformly
random move.
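
It is easy to test this claim yourself.  Here is a minimal Python
sketch (mine, not from any program discussed here) that runs a
fixed-depth minimax with independent uniform random leaf values on a
uniform game tree and counts how often each root move gets the best
backed-up score; with a constant branching factor the counts come out
flat, by symmetry.  The interesting case, per the message below, is
what happens when the branching factor varies from node to node.

    import random

    def minimax(depth, branching, maximizing):
        # Leaves get an independent uniform random "evaluation".
        if depth == 0:
            return random.random()
        children = [minimax(depth - 1, branching, not maximizing)
                    for _ in range(branching)]
        return max(children) if maximizing else min(children)

    def root_choice_counts(depth, branching, trials):
        # How often does each root move get the best backed-up score?
        counts = [0] * branching
        for _ in range(trials):
            scores = [minimax(depth - 1, branching, False)
                      for _ in range(branching)]
            counts[scores.index(max(scores))] += 1
        return counts

    print(root_choice_counts(depth=4, branching=3, trials=3000))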

On Thu, 18 Jan 2001, Vincent Diepeveen wrote:

> Oh yes, there is a very good experiment on this,
> using a RANDOM evaluation.
>
> It appears that a random evaluation basically
> favours playing for space.
>
> So a 10-ply search with random evaluation easily beats
> the same program searching 1 ply with random evaluation.
>
> This is also the case in Go, as the goal of the game is
> to maximize your own space. In other words: I beat my
> opponent if I have more space than he does.
>
> So there is a good chance that a deeper search will increase
> your winning chances.
>
> It is of course very QUESTIONABLE whether any depth of search
> from a randomly evaluating program will EVER beat even a simple
> evaluation in a 1-ply searching Go program.
>
> That is however a completely different question, one which is
> still being asked about all kinds of different game-playing programs.
>
> What is certain is that searching deeper, even with a random
> evaluation, is going to improve your program, as long as you
> compare against the same program at a smaller search depth.
>
> It took a few years, however, to discover that playing with a
> random evaluation is in effect a very vague way of playing to
> maximize the open space for the side to move. Some clever
> mathematician who, like Knuth, can treat alpha-beta
> mathematically might write it down formally.
>
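
For what it's worth, there is a known way to make this "space"
intuition precise (my gloss, not Vincent's): if a node's backed-up
score is the maximum of n independent uniform random evaluations,
its expected value is n/(n+1), which grows with n. So positions
offering more legal moves, i.e. more space, tend to receive higher
backed-up scores, and deeper search can compound the effect. A quick
Monte Carlo check, in Python:

    import random

    # Expected max of n independent U(0,1) draws is n/(n+1):
    # more legal moves (more "space") means a higher backed-up maximum.
    for n in (2, 10, 50):
        est = sum(max(random.random() for _ in range(n))
                  for _ in range(10000)) / 10000
        print(n, round(est, 3), "vs", round(n / (n + 1), 3))
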
> At 06:08 PM 1/17/01 -0500, you wrote:
> >
> >Yes, if your evaluation is totally out of whack, I agree with you.  The
> >power of the search is that it helps you reach your goals, even if your
> >goals are stupid and out of line.  You might regret reaching your goals,
> >but you get better at reaching them with increasing search depth.
> >
> >So I think the thought experiment is valid.  If your evaluation is
> >that far out of whack, then your program really just doesn't have a
> >clue about what is good and what isn't, and thus a 1-ply search does
> >the least damage.  In fact, maybe you should just change the sign of
> >the evaluation!
> >
> >Don
> >
> >
> >   From: "Mark Boon" <tesuji@xxxxxxxxxxxxxxxxx>
> >
> >   Well, your thought experiment is a good one in the case of a
> >   reliable evaluation, but it's not necessarily applicable in the
> >   case of an unreliable evaluation. Most search strategies rely on
> >   alpha-beta cut-offs or something similar to cut down the tree
> >   size, which means you need to use evaluations to determine what
> >   to cut off. If the evaluation is unreliable, the actual tree
> >   becomes just as unreliable. I think David has written several
> >   times here that in his experience there's a point where more
> >   reading made the program less reliable, because you give it more
> >   opportunity to construct a false search tree.
> >
> >   This is not the case in Chess, because there it's so much easier
> >   to make a reliable evaluation function.
> >
> >   Did anyone do any research on this? I mean, trying to quantify
> >   theoretically how much the evaluation has to be wrong to get a
> >   less reliable tree at a given depth. It might well be that this
> >   only happens for really very unreliable functions, which will of
> >   course be the case for Go. That would be very useful information.
> >
> >       Mark
> >
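
As far as I know nobody has published exactly this, but the question
is easy to poke at numerically. Below is a toy Monte Carlo sketch in
Python (my own model and made-up parameters, not anything from the
programs discussed here): leaves of a complete game tree carry true
values, the "evaluation" of a frontier node is its true minimax value
plus Gaussian noise of standard deviation sigma, and we measure how
often a depth-d search picks the root move that is best under the
true values. Sweeping sigma and d gives a crude feel for how wrong
the evaluation can be before deeper search stops paying off.

    import random

    B, H = 2, 8   # branching factor and total game height (toy choices)

    def build(h):
        # Complete game tree; leaves carry "true" values in [0, 1].
        if h == 0:
            return random.random()
        return [build(h - 1) for _ in range(B)]

    def true_val(t, maximizing):
        # Exact minimax value from the first player's viewpoint.
        if isinstance(t, float):
            return t
        vals = [true_val(c, not maximizing) for c in t]
        return max(vals) if maximizing else min(vals)

    def search(t, depth, maximizing, sigma):
        # Depth-limited minimax with a noisy frontier evaluation.
        if isinstance(t, float) or depth == 0:
            return true_val(t, maximizing) + random.gauss(0.0, sigma)
        vals = [search(c, depth - 1, not maximizing, sigma) for c in t]
        return max(vals) if maximizing else min(vals)

    def accuracy(depth, sigma, trials=300):
        # Fraction of trials where the noisy search finds the
        # root move that is best under the true values.
        hits = 0
        for _ in range(trials):
            t = build(H)
            best = max(range(B), key=lambda i: true_val(t[i], False))
            pick = max(range(B),
                       key=lambda i: search(t[i], depth - 1, False, sigma))
            hits += (pick == best)
        return hits / trials

    for sigma in (0.05, 0.3):
        print(sigma, [round(accuracy(d, sigma), 2) for d in (1, 3, 5, 7)])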
> >   From: Don Dailey <drd@xxxxxxxxxxxxxxxxx>
> >   To: <computer-go@xxxxxxxxxxxxxxxxx>
> >   Sent: Wednesday, January 17, 2001 3:16 AM
> >   Subject: Re: computer-go: A problem with understanding lookahead
> >
> >
> >   >
> >   > Lookahead is a simple concept, but very few people seem to have a
> >   > significant grasp of what it is all about.
> >   >
> >   > In a nutshell, if I have a clearly defined goal and a function that
> >   > recognizes it, then in a 2-player perfect-information game like Go,
> >   > Chess, Checkers, tic-tac-toe, etc., you can try all possibilities and
> >   > discover whether you can achieve your goal or not.  There are no
> >   > exceptions; this always works, as long as the goal can be measured
> >   > with 100% certainty.
> >   >
> >   > But with imperfect evaluation functions, things are different.
> >   > Whether you search deep or shallow, your end-node evaluation is just
> >   > as imperfect.  So why does passing it down the tree make it any
> >   > better?  Or does it?
> >   >
> >   > The answer is that there is some magic in the fact that a score
> >   > actually survived the journey to the root node!  This single score
> >   > had to compete with a vicious adversary (during the min-max process)
> >   > who has the opposite goal.  In a nutshell, it boils down to this:
> >   > it's better to be happy 20 moves from now than it is to be happy
> >   > right now!  That's the key to mini-max and why it works.  It works
> >   > with humans too; consider:
> >   >
> >   > Let's do a strange thought experiment.  Suppose, as a Go player, you
> >   > had the ability to know exactly how you would feel 20 moves later
> >   > about any move you make.  Say you like a certain move, but a voice in
> >   > your head magically tells you that 20 moves later you will not like
> >   > your position.  Let's say this ability is integrated in a nicely
> >   > recursive fashion into your brain, so that you know that the move
> >   > you're currently thinking of playing is the culprit, not one you
> >   > might play later (which is exactly what a search does).  Would this
> >   > ability make you a better player?  In my opinion it would make a very
> >   > good player practically unbeatable, and a weak player much better.
> >   >
> >   > The "magic" seems to  be that if  you can survive  many moves  and you
> >   > still  like your position (which  a search with an evaluation function
> >   > proves because it  "knows" it will like  it's position n moves later),
> >   > then it's probably a pretty good move.
> >   >
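
In code, the mechanism Don describes is just this (a generic sketch;
evaluate, moves and play are placeholder functions, not anyone's
actual program, and evaluate must score the position for the side to
move):

    def negamax(pos, depth, evaluate, moves, play):
        # An end-node score only reaches the root if it survives
        # best play by both sides -- the "vicious adversary" above.
        if depth == 0 or not moves(pos):
            return evaluate(pos)
        return max(-negamax(play(pos, m), depth - 1, evaluate, moves, play)
                   for m in moves(pos))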
> >   > People used to wonder, in Chess, why just doing a single extra ply
> >   > of searching was so powerful, despite the fact that it SEEMED like we
> >   > needed a whole lot more just to get a little improvement.  But this
> >   > is why.  The "obvious" conclusions were completely wrong.  Even very
> >   > weak players often realize after seeing the opponent's reply that
> >   > they "screwed up."  I do this in Go myself.  If 1-ply error-free
> >   > searches were built into the human brain, every chess player would
> >   > improve a class.
> >   >
> >   > I think it's true of Go too.  However, we have to assume a
> >   > "reasonable" evaluation function and FULL WIDTH search for these
> >   > things to be true.  Unfortunately, anything more than a 1 or 2 ply
> >   > FULL WIDTH search seems to be unreasonably difficult in Go, if you
> >   > also require a reasonably good evaluation function!
> >   >
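
The arithmetic behind that last point is brutal: on a 19x19 board
there are up to 361 legal moves, so a full-width 2-ply search already
needs roughly 361 * 360, about 130,000, leaf evaluations, and 3 ply
pushes it past 46 million. If one full-board Go evaluation costs even
a millisecond, a 3-ply search takes more than half a day per move.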
> >   >
> >   > Don
> >   >
> >   >
> >   >
> >   >
> >   >    Date: Tue, 16 Jan 2001 21:03:22 +0100
> >   >    From: "GL7: David Elsdon" <gl7@xxxxxxxxxxxxxxxxx>
> >   >
> >   >    Hello all,
> >   >
> >   >    I have a serious problem with lookahead.  I don't understand why
> >   >    it works.  Why should the backed-up values of a, say, 7-ply
> >   >    full-width alpha-beta minimax search be any better than the
> >   >    values my evaluation function gives me by simply evaluating the
> >   >    positions at ply 1?  I can understand that if the end of the game
> >   >    is in sight then lookahead is useful.  I can understand that if I
> >   >    use a quiescence measure or some such, and only evaluate
> >   >    positions when I know that my evaluation function will work best,
> >   >    then lookahead is useful.  But if neither of these is the case,
> >   >    then does lookahead help?  Is there some way in which the backing
> >   >    up of values and minimaxing somehow reduces the error in the
> >   >    evaluations?
> >   >
> >   >    I really am seriously stuck with this one.
> >   >
> >   >    Cheers
> >   >
> >   >    David