Re: computer-go: A problem with understanding lookahead
No offense, but this is stupid. It is completely clear that
n-ply search with random evaluation always plays a uniform
random move.
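For anyone who wants to check this for themselves, here is a rough sketch of the experiment (plain Python; the game interface -- legal_moves(), play(), undo(), game_over() -- is a made-up placeholder, not anybody's actual program): a fixed-depth negamax whose leaf evaluation is a fresh random number. Playing it against itself at different depths is exactly the test Vincent describes below.

    import random

    # Hypothetical toy-game interface: a state must provide legal_moves(),
    # play(move), undo(move) and game_over() -> (finished, score_to_move).

    def random_eval(state):
        # Purely random leaf evaluation, as in the thought experiment.
        return random.random()

    def negamax(state, depth):
        finished, score = state.game_over()
        if finished:
            return score
        if depth == 0:
            return random_eval(state)
        best = float("-inf")
        for move in state.legal_moves():
            state.play(move)
            value = -negamax(state, depth - 1)
            state.undo(move)
            best = max(best, value)
        return best

    def choose_move(state, depth):
        # Pick the move whose backed-up value is largest.
        best_move, best_value = None, float("-inf")
        for move in state.legal_moves():
            state.play(move)
            value = -negamax(state, depth - 1)
            state.undo(move)
            if value > best_value:
                best_move, best_value = move, value
        return best_move

If the "uniform random move" view is right, a depth-10 player built this way should score about 50% against the depth-1 player over many games; if the space-seeking effect is real it should score noticeably more.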
On Thu, 18 Jan 2001, Vincent Diepeveen wrote:
> Oh yes, there is a very good experiment on this,
> using a RANDOM evaluation.
>
> It appears that a random evaluation basically favours
> playing for space.
>
> So a 10-ply search with a random evaluation easily beats
> a program doing a 1-ply search with a random evaluation.
>
> This is also the case in Go, as the goal of the game is
> to maximize your own space. In other words: I beat my
> opponent if I have more space than he does.
>
> So the chances are good that a deeper search will increase your
> winning chances.
>
> It is of course very QUESTIONABLE whether any depth of search
> from a random-evaluating program will EVER beat even a simple
> evaluation in a 1-ply searching Go program.
>
> That is, however, a completely different question, and one that
> still gets asked about all kinds of game-playing programs.
>
> What is very sure is that searching deeper, even with a random
> evaluation, is going to improve your program, as long as you compare
> it with the same program at a smaller search depth.
>
> It took a few years, however, to discover that playing with a random
> evaluation is, very roughly, a vague way of playing to maximize the
> open space for the side to move; some clever mathematicians, who like
> Knuth can also handle alpha-beta mathematically, might write it down
> for you.
>
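For what it's worth, the space bias Vincent describes has a simple back-of-the-envelope explanation at one ply: the value of a 1-ply search is the maximum of one random number per legal move, and the expected maximum of k independent uniform(0,1) draws is k/(k+1), so positions that offer more moves look better on average. A tiny check in plain Python (nothing here is from anyone's program):

    import random

    def expected_max(k, trials=100000):
        # Monte-Carlo estimate of E[max of k independent U(0,1) draws].
        total = 0.0
        for _ in range(trials):
            total += max(random.random() for _ in range(k))
        return total / trials

    for k in (2, 5, 10, 40):
        print(k, round(expected_max(k), 3), "exact:", round(k / (k + 1.0), 3))

Whether that one-ply bias survives a deep min-max against an opponent who is minimizing the very same random numbers is the part that would need the mathematicians, so take it only as intuition.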
> At 06:08 PM 1/17/01 -0500, you wrote:
> >
> >Yes, if your evaluation is totally out of whack I agree with you. The
> >power of the search is that it helps you reach your goals, even if
> >your goals are stupid and out of line. You might regret reaching your
> >goals, but you get better at it with increasing search depth.
> >
> >So I think the thought experiment is valid. If your evaluation is
> >this far out of whack, then your program really just doesn't have a
> >clue about what is good and what isn't and thus a 1 ply search does
> >the least damage. In fact, maybe you should just change the sign of
> >the evaluation!
> >
> >Don
> >
> >
> > From: "Mark Boon" <tesuji@xxxxxxxxxxxxxxxxx>
> >
> > Well, your thought experiment is a good one in the case of a reliable
> > evaluation, but it's not necessarily applicable in the case of an
> > unreliable evaluation. Most search strategies rely on alpha-beta
> > cut-offs or something similar to cut down the tree size, which means
> > you need to use evaluations to decide what to cut off. If the
> > evaluation is unreliable, the resulting tree becomes just as
> > unreliable. I think David has written several times here that he
> > found there is a point where more reading made the program less
> > reliable, because you give it more opportunity to construct a false
> > search tree.
> >
> > This is not the case in Chess, because there it is so much easier to
> > make a reliable evaluation function.
> >
> > Has anyone done any research on this? I mean, trying to quantify
> > theoretically how wrong the evaluation has to be to get a less
> > reliable tree at a given depth. It may be that this only happens for
> > really very unreliable functions, which will of course be the case
> > for Go. That would be very useful information.
> >
> > Mark
> >
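Mark's point about where the evaluation enters the cut-off decision is easy to see in code. Below is a bare-bones negamax with alpha-beta (plain Python again, against the same made-up state interface as in the first sketch above; noisy_eval and true_value are stand-ins, not anyone's actual engine): every cut-off is triggered by a comparison between backed-up leaf evaluations, so with an unreliable evaluation the line the search finally settles on rests on unreliable comparisons.

    import random

    def noisy_eval(state):
        # Stand-in for an unreliable evaluation: the exact value of the
        # position plus a large error term (true_value is hypothetical).
        return state.true_value() + random.gauss(0.0, 10.0)

    def alphabeta(state, depth, alpha, beta):
        finished, score = state.game_over()
        if finished:
            return score
        if depth == 0:
            return noisy_eval(state)
        for move in state.legal_moves():
            state.play(move)
            value = -alphabeta(state, depth - 1, -beta, -alpha)
            state.undo(move)
            if value > alpha:
                alpha = value
            if alpha >= beta:
                # Cut-off: the rest of this node's subtree is discarded on
                # the strength of values that came out of noisy_eval().
                break
        return alpha

Strictly speaking alpha-beta only skips subtrees that cannot change the min-max value of the noisy-leaf tree, so the damage is really done by min-maxing noise rather than by the pruning itself; forward-pruning schemes are where an unreliable evaluation can throw away lines that mattered.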
> > From: Don Dailey <drd@xxxxxxxxxxxxxxxxx>
> > To: <computer-go@xxxxxxxxxxxxxxxxx>
> > Sent: Wednesday, January 17, 2001 3:16 AM
> > Subject: Re: computer-go: A problem with understanding lookahead
> >
> >
> > >
> > > Lookahead is a simple concept but very few people seem to have a
> > > significant grasp of what it is all about.
> > >
> > > In a nutshell, if I have a clearly defined goal and a function that
> > > recognizes it, then in a 2 player perfect information game like Go,
> > > Chess, Checkers, tic-tac-toe etc, you can try all possibilities and
> > > discover that you can either achieve your goal or not. There are no
> > > exceptions; this always works, as long as the goal can be measured
> > > with 100% certainty.
> > >
> > > But with imperfect evaluation functions, things are different.
> > > Whether you search deep or shallow, your end-node evaluation is just
> > > as imperfect. So why does passing it down the tree make it any
> > > better? Or does it?
> > >
> > > The answer is that there is some magic in the fact that a score
> > > actually survived the journey to the root node! This single score had
> > > to compete with a vicious adversary (during the min-max process) who
> > > has the opposite goal. In a nutshell, it boils down to this: It's
> > > better to be happy 20 moves from now, than it is to be happy right
> > > now! That's the key to mini-max and why it works. It works with
> > > humans too, consider:
> > >
> > > Let's do a strange thought experiment. Suppose, as a Go player, you
> > > had the ability to know exactly how you would feel 20 moves later
> > > about any move you make. Say you like a certain move, but a voice
> > > in your head magically tells you that 20 moves later you will not
> > > like your position. Let's say this ability is integrated in a nicely
> > > recursive fashion into your brain, so that you know the move you're
> > > currently thinking of playing is the culprit, not one you might play
> > > later (which is exactly what a search does). Would this ability make
> > > you a better player? In my opinion it would make a very good player
> > > practically unbeatable, and a weak player much better.
> > >
> > > The "magic" seems to be that if you can survive many moves and you
> > > still like your position (which a search with an evaluation function
> > > proves because it "knows" it will like its position n moves later),
> > > then it's probably a pretty good move.
> > >
> > > People used to wonder, in Chess, why just doing a single extra ply of
> > > searching was so powerful, despite the fact that it SEEMED like we
> > > needed a whole lot more just to get a little improvement. But this is
> > > why. The "obvious" conclusions were completely wrong. Even very weak
> > > players often realize after seeing the opponent's reply that they
> > > "screwed up." I do this in Go myself. If 1-ply error-free searches
> > > were built into the human brain, every chess player would improve a
> > > class.
> > >
> > > I think it's true of Go too. However, we have to assume a "reasonable"
> > > evaluation function and FULL WIDTH search for these things to be true.
> > > Unfortunately, anything more than a 1 or 2 ply FULL WIDTH search seems
> > > to be unreasonably difficult in Go, if you also require a reasonably
> > > good evaluation function!
> > >
> > >
> > > Don
> > >
> > >
> > >
> > >
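Don's first point above -- that with a goal you can recognise with certainty, you can simply try all possibilities -- fits in a few lines. Here is a sketch for tic-tac-toe-sized games (plain Python; goal_reached(), opponent_goal_reached(), legal_moves(), play() and undo() are placeholders, not anyone's engine): no evaluation function at all, just "goal reached or not", backed up through the tree.

    def can_force_goal(state, my_turn):
        # Exhaustive lookahead with a goal recognised with 100% certainty
        # (e.g. "three in a row").  Returns True iff the goal can be forced.
        if state.goal_reached():
            return True
        if state.opponent_goal_reached():
            return False
        moves = state.legal_moves()
        if not moves:
            return False            # game over (e.g. a draw): goal not reached
        results = []
        for move in moves:
            state.play(move)
            results.append(can_force_goal(state, not my_turn))
            state.undo(move)
        # On my turn one good continuation is enough; on the opponent's turn
        # every reply must still leave the goal reachable.
        return any(results) if my_turn else all(results)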
> > > Date: Tue, 16 Jan 2001 21:03:22 +0100
> > > From: "GL7: David Elsdon" <gl7@xxxxxxxxxxxxxxxxx>
> > >
> > > Hello all,
> > >
> > > I have a serious problem with lookahead. I don't understand why it
> > > works. Why should the backed-up values of, say, a 7-ply full-width
> > > alpha-beta minimax search be any better than the values my
> > > evaluation function gives me by simply evaluating the positions at
> > > ply 1? I can understand that if the end of the game is in sight
> > > then lookahead is useful. I can understand that if I use a
> > > quiescence measure or some such, and only evaluate positions when I
> > > know that my evaluation function will work best, then lookahead is
> > > useful. But if neither of these is the case, does lookahead help?
> > > Is there some way in which the backing up of values and minimaxing
> > > somehow reduces the error in the evaluations?
> > >
> > > I really am seriously stuck with this one.
> > >
> > > Cheers
> > >
> > > David
> > >
> > >
> > >
> >
> >
> >
> >
>
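Finally, David's actual question -- does backing values up through min-max reduce the error of a fixed, imperfect evaluation? -- can at least be put to a crude numerical test. The sketch below (plain Python, all numbers arbitrary) builds random game trees with known exact values, evaluates horizon nodes as the exact value plus Gaussian noise, and measures how often a depth-d search picks the same root move as the error-free search. It is an illustration of the experiment Mark asked about, not a claim about Go.

    import random

    BRANCH, FULL_DEPTH, NOISE = 3, 8, 0.3    # arbitrary toy parameters

    def make_tree(depth):
        # A node is either a leaf value in [0, 1] or a list of child nodes.
        if depth == 0:
            return random.random()
        return [make_tree(depth - 1) for _ in range(BRANCH)]

    def exact(node):
        # Exact negamax value (the side to move maximises).
        if not isinstance(node, list):
            return node
        return max(-exact(child) for child in node)

    def search(node, depth):
        # Depth-limited negamax with a noisy evaluation at the horizon.
        if not isinstance(node, list):
            return node
        if depth == 0:
            return exact(node) + random.gauss(0.0, NOISE)
        return max(-search(child, depth - 1) for child in node)

    def best_root_move(node, depth):
        values = [-search(child, depth - 1) for child in node]
        return values.index(max(values))

    def agreement(search_depth, trials=200):
        hits = 0
        for _ in range(trials):
            tree = make_tree(FULL_DEPTH)
            if best_root_move(tree, search_depth) == best_root_move(tree, FULL_DEPTH):
                hits += 1
        return hits / float(trials)

    for d in (1, 2, 4, 6):
        print("depth", d, "-> agreement with exact search:", agreement(d))

Whatever numbers come out, the model bakes in strong assumptions (independently generated sibling subtrees, independent horizon errors centred on the exact values); change those and the curve can change, which is presumably why the question keeps coming back.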