[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: computer-go: A problem with understanding lookahead
Lookahead is a simple concept but very few people seem to have a
significant grasp of what it is all about.
In a nutshell, if I have a clearly defined goal and a function that
recognizes it, then in a 2 player perfect information game like Go,
Chess, Checkers, tic-tac-toe etc, you can try all possibilities and
discover that you can either achieve your goal or not. There are no
exceptions, this always works, as long as the goal can be measured
with 100% certainty.
But with imperfect evaluation functions, things are different.
Whether you search deep or shallow, your end-node evaluation is just
as imperfect. So why does passing it down the tree makes it any
better? Or does it?
The answer is that there is some magic in the fact that a score
actually survived the journey to the root node! This single score had
to compete with a vicious adversary (during the min-max process) who
has the opposite goal. In a nutshell, it boils down to this: It's
better to be happy 20 moves from now, that it is to be happy right
now! That's the key to mini-max and why it works. It works with
humans too, consider:
Let's do a strange thought experiment. Suppose, as a Go player, you
had the ability to know exactly how you would feel 20 moves later
about any move you make. Let's say I like a certain move, but a voice
in my head magically tells me that 20 moves later I will not like my
position. Let's say this ability is integrated in a nicely recursive
fashion into your brain, so that I know that the move I'm currently
thinking of playing is the culprit, not one I might play later (which
is exactly what a search does.) Would this ability make you a better
player? In my opinion it would make a very good player practically
unbeatable, and a weak player much better.
The "magic" seems to be that if you can survive many moves and you
still like your position (which a search with an evaluation function
proves because it "knows" it will like it's position n moves later),
then it's probably a pretty good move.
People used to wonder, in Chess, why just doing a single extra ply of
searching was so powerful, despite the fact that it SEEMED like we
needed a whole lot more just to get a little improvement. But this is
why. The "obvious" conclusions were completely wrong. Even very weak
players often realize after seeing the opponents reply that they
"screwed up." I do this in Go myself. If 1 ply error free searches
were built into the human brain, every chess player would improve a
class.
I think it's true of Go too. However we have to assume a "reasonable"
evaluation function and FULL WIDTH search for these things to be true.
Unfortunately, anything more than a 1 or 2 ply FULL WIDTH search seems
to be unreasonably difficult in GO, if you also require a reasonably
good evaluation function!
Don
X-Sender: gl7@xxxxxxxxxxxxxxxxx
X-Mailer: QUALCOMM Windows Eudora Version 5.0.2
Date: Tue, 16 Jan 2001 21:03:22 +0100
From: "GL7: David Elsdon" <gl7@xxxxxxxxxxxxxxxxx>
References: <20010108235912Z4277-62332+78@xxxxxxxxxxxxxxxxx>
Mime-Version: 1.0
Sender: owner-computer-go@xxxxxxxxxxxxxxxxx
Precedence: bulk
Reply-To: computer-go@xxxxxxxxxxxxxxxxx
Content-Type: text/plain; charset="us-ascii"; format=flowed
Content-Length: 780
Hello all,
I have a serious problem with lookahead. I don't understand why it works.
Why should the backed up values of a, say, 7 ply full width alpha-beta
minimax search be any better than the values my evaluation function gives
me by simply evaluating the positions at ply 1. I can understand that if
the end of the game is in sight then lookahead is useful. I can
understand that if I use a quiesence measure or some such and only
evaluate positions when I know that my evaluation function will work best
then lookahead is useful. But if neither of these is the case then does
lookahead help. Is there some way in which the backing up of values and
minimaxing somehow reduces the error in the evaluations.
I really am seriously stuck with this one.
Cheers
David