Re: [computer-go] Modern brute force search
At 22:17 5-11-2004 -0500, Don Dailey wrote:
>Yes, I know computer chess has advanced.
Most importantly, the balance in computer chess has shifted from quantity to quality.
Anything that can possibly give you quality problems, or has some worst-case
behaviour: dang, remove it.
>I gave up lazy eval before I
>left computer chess because it wasn't safe unless I used big margins,
>at which point it didn't help much at all.
I do use lazy eval in my draughts program (10x10 international checkers). My
draughts program is stupid; even though it once was 2nd in the world and the
first winner of the human-computer event, I'm sure by now it is outdated big time.
Its evaluation function can differ by about 100 points (1 disk) at most from the
material situation on the board when there are no queens on the board.
So lazy eval is a pretty safe assumption there.
Even then, lazy eval is bad for the move ordering (I'm using fail soft), so
there is *some* negative effect still there.
However, the DIEP chess program has always been sophisticated.
The evaluation score can differ from the material on the board by 40 pawns.
That's 40,000 points, whereas lazy eval could only possibly work with at most
a 3,000-point (3 pawn) difference.
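A minimal sketch of this kind of lazy eval, assuming millipawn scores (pawn = 1000) and a fail-soft return; `lazy_evaluate` and `demo_full_eval` are hypothetical names for illustration, not Diep's actual code.

```c
/* Lazy evaluation sketch, assuming millipawn scores (pawn = 1000).
   If the cheap material score lies so far outside the (alpha, beta)
   window that no positional term of size <= LAZY_MARGIN can bring it
   back, the expensive full evaluation is skipped.  Fail-soft: the
   bound-adjusted estimate is returned, which is exactly what hurts
   move ordering, as noted above. */
#define LAZY_MARGIN 3000  /* 3 pawns, the margin discussed in the text */

typedef int (*eval_fn)(void);

int lazy_evaluate(int material, int alpha, int beta, eval_fn full_eval)
{
    if (material - LAZY_MARGIN >= beta)            /* sure fail high */
        return material - LAZY_MARGIN;
    if (material + LAZY_MARGIN <= alpha)           /* sure fail low */
        return material + LAZY_MARGIN;
    return full_eval();  /* only pay for the full evaluation when needed */
}

/* Illustrative stand-in for a real (expensive) evaluation function. */
static int demo_full_eval(void) { return 400; }
```

With a 40-pawn spread between material and full evaluation, the margin would have to be absurdly wide, which is the point being made above.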
Some years ago, after months of work, I wrote a quick evaluation function.
It can predict the outcome in 99% of the cases when using a margin of 3 pawns.
The problem is that you cannot return the quick evaluation itself but MUST
return either alpha, beta, or, if you have guts, quickeval() +/- 3 pawns.
However:
a) it was tactically weaker on test sets;
b) in many positions, because of the fail-soft effects, it didn't search
deeper;
c) in games it just played worse.
Nowadays it would be even harder to use lazy eval for DIEP. It takes a more
aggressive look at the world nowadays, just like Chrilly already explained in
his old document.
>I predicted before anyone
>that the quality of evaluation functions would be the future of
>computer chess more than speed. This was when others were touting
>speed. I proved this to myself via simulations, better evaluations
>improved more rapidly with deeper searches. I think Berliner pretty
>much demonstrated this too.
Yes, we were amongst the very, very few who said evaluation functions would
dominate once programs got above 10 ply.
By the way, that experiment you did has been redone by Ernst A. Heinz, but now
with 3000 games and Fritz 6. It is amazing that your publicly published test
data wasn't enough to convince him.
>But why MTD no longer used? Is it because MTD is so hard on the hash
>tables and memory latency is a bigger issue? MTD is much more
>dependant on the fact that the search tree remains mostly stored in
>memory via the hash tables so I can believe this is possible.
Let's sum up the huge list of problems with MTD.
First of all, a pawn in DIEP is 1000 points, and DIEP has the largest
evaluation function in the world, arguably also the best. (I am of the opinion
that my search sucks, as I haven't worked on search issues, other than parallel
ones, for many years.)
Yet even pawn = 1000 would not be a problem if I got the same score back each ply.
However, we must take a few aspects into consideration before weighing what
there is to win against the losses.
Problem I
In all games, of which chess and go are no exception (I personally feel chess
and go are the best comparable games; checkers, for example, is completely
different because it is a completely asymmetric game, whereas chess and go are
symmetric games, more in balance for the mind), what is going to score points
and what is going to lose points?
Suppose you have a 100 move game.
Search A :
95 moves you search 20 ply.
5 moves you search 12 ply.
Average search depth : 19.6
Search B :
100 moves 19 ply.
Average search depth : 19.0 ply
Which search will win you more games?
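Just to check the arithmetic above, the averages are node-weighted means; a tiny illustrative helper (not engine code):

```c
/* Weighted average search depth over a game: d1 ply on m1 moves and
   d2 ply on m2 moves.  Reproduces the 19.6 and 19.0 figures above. */
double avg_depth(int d1, int m1, int d2, int m2)
{
    return (double)(d1 * m1 + d2 * m2) / (double)(m1 + m2);
}
```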
MTD of course has the WORST-CASE BEHAVIOUR described in search A.
Add to that, that when the program has to make an important choice, in general
the score first drops a lot. Then it finds another move; one ply deeper it
doubts that one again, the score keeps changing, and so on.
You can see this behaviour very clearly in SOS. Though it can easily search
16-20 ply in the middlegame, sometimes it doesn't get further than 12 ply.
Of course this worst-case behaviour of MTD is already enough to flush it down
the toilet.
Problem II
If a pawn in your program is worth 1000, as for example in DIEP, instead of the
more common 100, then the number of re-searches MTD needs is up to 10 times
higher.
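For reference, the MTD(f) driver (after Plaat) looks roughly like this; `zwsearch` stands for a fail-soft zero-window alpha-beta search and is an assumed interface, not Diep's code. Each loop iteration is a full tree re-search, which is why a finer-grained score can multiply the cost.

```c
#include <limits.h>

typedef int (*zw_fn)(int alpha, int beta, int depth);

/* MTD(f): converge on the minimax value with repeated zero-window
   probes.  Every iteration of the loop is one full re-search, which is
   why a score granularity of pawn = 1000 can need far more re-searches
   than pawn = 100. */
int mtdf(int first_guess, int depth, zw_fn zwsearch)
{
    int g = first_guess;
    int lower = INT_MIN, upper = INT_MAX;
    while (lower < upper) {
        int beta = (g == lower) ? g + 1 : g;
        g = zwsearch(beta - 1, beta, depth);  /* zero-window probe */
        if (g < beta) upper = g;              /* fail low: value < beta   */
        else          lower = g;              /* fail high: value >= beta */
    }
    return g;
}

/* Illustrative stand-in: a fail-soft search whose exact value is 137. */
static int demo_zwsearch(int alpha, int beta, int depth)
{
    (void)alpha; (void)beta; (void)depth;
    return 137;
}
```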
Problem III
All go programmers really should read this section, as I feel certain go
programs can use this vital conclusion from computer chess in their software
too.
A few years ago I could beat every engine very easily, and I'm not a
professional player. The reason I could beat them was that they played so
passively, not aggressively. Against passive play my positional knowledge could
be used well; the programs that just innocently shuffled their pieces and pawns
around could of course fall for positional problems sooner.
Nowadays they are more aggressive. Something which would have been given a
bonus of 0.1 pawn a few years ago can easily get a half-pawn bonus now.
The best example is a king in the center. Some programs (like Junior) in
general just give it something like a 3-pawn penalty; a few years ago this was
perhaps 0.3 pawns for the 'amateurs', if they had that knowledge at all at the
time.
Those huge bonuses and penalties cause more aggressive play. That trivially
also means the scores fluctuate more: such huge bonuses and penalties of
several pawns, for just a few positional aspects, mean there are big score
swings from ply to ply.
Score swings are suicidal for MTD.
Problem IV
Rudolf Huber has very clearly shown that MTD does not work efficiently if you
do not use a hashtable in the quiescence search. It is vital for the
re-searches to get cheap cutoffs there, to avoid doing the same work twice.
Now that is a serious problem nowadays. For example, SOS scaled well on an
8-processor 1.3GHz Itanium2 (Altix 3700), at about 4 million nodes a second or
so. However, with 16 processors it just DIED.
Even on a single processor this is a serious problem: SOS would search 2 times
faster if it did not store transpositions in the qsearch.
1.3GHz is pretty outdated by now; there are 2.4GHz Opterons now.
Crafty gets 2.2 million nodes a second on a single-CPU Opteron, Tiger perhaps
6 million nodes a second or so.
Imagine if they had to store quiescence-search nodes on an Opteron?
1 node = 1000 cycles or so.
A single hashtable lookup on an Opteron costs nearly 300 cycles (assuming 133
nanoseconds for a quad; note that my dual K7 has a latency of 400ns, and all
high-end chipset-driven memory controllers are at 280ns or above to local
memory).
On clusters like Chrilly's, a single hashtable lookup at a remote processor
costs about 10,000-20,000 cycles. The SGI hardware at 512 processors really
behaves like a cluster; at just 16 processors it is still doing very well, at
just 700-1200ns or so for a single lookup to remote memory.
Problem V
What is the advantage of MTD over PVS? Let's first look at PVS. In DIEP I get
the best line out of the hashtable, so I always have the mainline from the
hashtable when the next iteration starts. Evaluation functions nowadays are so
much better than a few years ago that it directly gets a good line from the
hashtable. The first 'alpha' I get back at depthleft == 1 is already very close
to the alpha I am going to get at the root.
Add one move to the mainline and you have a new alpha. I generate statistics
with DIEP regularly; the total number of nodes which do not directly search
with the right mainline score is a few thousand at most.
So the MAXIMUM savings MTD can give me, on a search of say 100 million nodes,
is a few thousand nodes.
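The PVS scheme being compared here can be sketched as follows; the toy `Node` tree is only scaffolding so the routine is runnable, where a real engine would make and unmake moves and take the first move from the hashtable mainline.

```c
/* Toy game tree: leaves carry a score from the side-to-move's view.
   Scaffolding only -- a real engine makes/unmakes moves instead. */
typedef struct Node {
    int score;                  /* used at leaves (nchildren == 0) */
    int nchildren;
    const struct Node *child[4];
} Node;

/* Fail-soft PVS/negamax: the first child (ideally the hashtable
   mainline move) gets the full window; the rest get zero-window
   probes, re-searched only on a fail-high inside the window. */
int pvs(const Node *n, int alpha, int beta)
{
    if (n->nchildren == 0) return n->score;
    int best = -1000000;
    for (int i = 0; i < n->nchildren; i++) {
        int score;
        if (i == 0) {
            score = -pvs(n->child[i], -beta, -alpha);
        } else {
            score = -pvs(n->child[i], -alpha - 1, -alpha); /* zero window */
            if (score > alpha && score < beta)
                score = -pvs(n->child[i], -beta, -alpha);  /* re-search */
        }
        if (score > best) best = score;
        if (best > alpha) alpha = best;
        if (alpha >= beta) break;                          /* cutoff */
    }
    return best;
}

/* Two-ply demo tree; its minimax value from the root is 3. */
static const Node l1 = {3, 0, {0}}, l2 = {5, 0, {0}},
                  l3 = {2, 0, {0}}, l4 = {9, 0, {0}};
static const Node a = {0, 2, {&l1, &l2}}, b = {0, 2, {&l3, &l4}};
static const Node demo_root = {0, 2, {&a, &b}};
```

With a good mainline from the hashtable, almost every non-first move fails low on its zero-window probe, which is why the savings MTD could add on top of this are so small.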
Problem VI
Though I will research any new algorithm, and I will be the last to say that
Aske Plaat has not invented a new algorithm: MTD is the invention of Aske
Plaat. It is not easy to invent something new, especially if you realize he did
so in the 90s, when it was becoming increasingly difficult to find a COMPLETELY
new algorithm. He gets credit for that.
However, his 'proof' that it works is just completely laughable; even a
16-year-old teenager can do better.
He investigated random search trees without quiescence search.
Game-playing programs do not search random trees; they use sophisticated
ordering mechanisms, and they do use quiescence searches.
Even my primitive go program did so, and in the first tournament I played with
my international checkers program I found out, to my amazement, that there too
you need a quiescence search (extending, for example, hung pieces).
A 16-year-old teenager would implement it in a program and see whether it works.
Because of problems 1-4 it is not possible for me to use MTD in DIEP.
Cilkchess could use MTD because:
a) it had a very flat evaluation function. Something my 1996 DIEP already
scored as +5.0 positionally (knowing the very moderate penalties and bonuses it
gave at the time), Cilkchess still scored as 0.01 pawn. Basically the Cilkchess
search was just material; of course everything that captures a pawn then gives
a cutoff. Problem III, 'aggressive evaluation', therefore doesn't apply.
b) it was slowed down 40 times: instead of getting 100 million nodes a second,
it got just a few million. Problem IV, 'hashtable latency', therefore doesn't
apply at 512 processors.
c) Don really deserves credit for testing his programs well, in this case
Cilkchess. Compare that to the first few DIEP versions. The first tournament I
ever joined with DIEP, a computer chess event in October 1994 (the Dutch open
championship): against the program ZZZZZ, in a position won big time for DIEP,
I entered an illegal move by accident. I clicked with the mouse on h1 instead
of g1; Kh1 was an illegal move. The program played Kh1. DIEP's search instantly
captured the king and claimed a mate in 0. I took back the move.
The king of course was already captured, so DIEP didn't care about anything
anymore; whatever move it played would of course still give back a mate-in-0
score. So DIEP played a random move, which directly lost a full piece for DIEP.
Other programs also suffered from bugs (yes, single CPU); I can't remember
having seen a program without bugs in time division at the time.
In computer chess we had time controls in many events like 2 hours for 40
moves. Many programs would, for example, guess they had to play 39 moves and
just forfeit at move 40, as they assumed the new time control had already
started, giving them plenty of time to search.
Amongst all those childish bugs that all those programs had, Don managed to
write a well-tested program.
So problem I, the worst-case scenario, didn't apply to Cilkchess in those days,
while all those other programs suffered from bugs left and right.
d) I don't know the worth of a pawn in Cilkchess, but it sure wasn't 1000. I'm
sure pawn = 32 would have worked just fine for Cilkchess too.
One more small note on Cilkchess. In the days when Cilkchess and Zugzwang and
all those other big programs ran on big hardware, one supercomputer processor
was a lot faster than the fastest PC processor.
So a 1% speedup out of 512 processors, times a 3-times-faster processor, was
effectively 15 times faster. And in the first so-many plies, as Don had clearly
proven already in those days, it is important to get another ply deeper.
Today a supercomputer processor is perhaps impressive at floating point, but
for chess it is factors slower than a fast PC processor (A64, Opteron) is.
So a 5x speedup on processors that are 6x slower = no speedup.
So today things are quite a bit different.
>> Nullmove nowadays gets used at R=3, usually not near the leafs.
>
>Interesting. My program used null move, but near the leaf nodes. I
Oh, my bad English. As you know I'm a Dutch speaker; I speak German nearly as
well as Dutch, and oceans better than English.
I meant to say: "R=3 everywhere except the last few plies, where R=2 gets
used".
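In code, that adaptive reduction could look like the sketch below; the cutover depth of 7 plies is my illustrative assumption, not Diep's actual value, and the surrounding search skeleton is shown only as comments.

```c
/* Depth-dependent null-move reduction: R = 3 everywhere except the
   last few plies, where R = 2 is used.  The cutover depth here is an
   assumption for illustration, not Diep's actual value. */
int null_move_reduction(int depth_left)
{
    return (depth_left > 6) ? 3 : 2;
}

/* Inside the search it would be used roughly like this (sketch only):
 *
 *   if (!in_check && depth_left >= 2 && has_pieces_left()) {
 *       make_null_move();
 *       int score = -search(-beta, -beta + 1,
 *                           depth_left - 1 - null_move_reduction(depth_left));
 *       undo_null_move();
 *       if (score >= beta) return score;   // null-move cutoff
 *   }
 */
```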
Of course you are correct that many also use a bit of pruning in the last few
plies.
I am not a big fan of forward pruning in the entire tree (other than nullmove,
that is), as nullmove already cuts away so much. However, in the last few plies
it makes very little sense to keep trying a lot of nonsense moves.
Here I use the KISS principle in testing: I just play games with and without
forward pruning. If it scores better I keep the forward pruning; otherwise I
drop it.
So far forward pruning scored, based upon 1000 games (about 6 to 8 hours a
game), 20% worse than without forward pruning.
At blitz, forward pruning clearly scores better.
I feel there is a big difference here between go and chess. In go they are
where chess was in 1989 on PC hardware: any form of pruning that can give you a
deeper search through aggressive pruning is great. The first 10 plies
(excluding extensions, ladder extensions, hung-group extensions and qsearch)
are so, so important to get in a superselective way.
Go software really should make an effort there to never miss anything, but
just 'reduce' the lines that are less rewarding.
Perhaps the Junior principle (if I may call it that, and if they indeed do it
like that) might be a GREAT idea for go.
>used a static threat detector instead. This was a very clear
>improvement over pure null move and I considered it one of my tricks.
>I don't know if others knew about this or figured it out for
>themselves, but I think it was one of the little goodies in my program
>over many others and I didn't share this secret with others until much
>later. The static threat routine picked up some things null move
>could not. Interestingly, it also missed some things that null move
>would get, but the overall effect was a clearly better program.
>When you say, "usually not near the leafs", what do you mean? Are you
>using R=2 near leaf or something else entirely or nothing at all?
As usual you guessed through my bad English.
I plan to experiment again in DIEP with last-few-plies pruning, this time not
on a tactical basis but on a positional basis.
In go I am sure that for the first coming few years only tactics matter there,
not positional depth.
In chess I already played 1000 games searching on average 2-3 ply deeper, and
it simply scored 20-25% less against many different engines.
This 1000-game test, as you might have guessed, was not done so quickly. It was
done by Jan Louwman, who regrettably died 2 years ago (at age 76) after a
30-year struggle with his health. He had 30 computers, and I cannot remember
anyone who could test software that accurately. It is incredible what Jan
Louwman achieved in his life: first surviving the Nazi death camps, then
setting up big infrastructure works in the Netherlands, effectively building up
the Netherlands.
Later, when his health nearly killed him and forced him to stop working, he
started computer chess as a hobby, and it soon became his passion and his life.
In the 80s he set up many tournaments, and at home he tested all the computers,
and later the software programs, against each other.
The tests he did, up to about a year before his death, are still invaluable
for my understanding of how to search once you get through the tactical
barrier.
Simply setting up tests and experiments and very accurately keeping track of
the results on 30 computers at the same time: it's really incredible.
I am missing this type of experiment from the go world.
It seems all those statistically significant experiments are carried out ONLY
in computer chess.
>> I just saw it was 40 times slower than it could be.
> 1. Extremely poor internet connectivity.
> I don't remember Aske being given extra time to compensate for
> the poor connectivity, but I can believe they did. The
> connectivity was so poor that we probably could not have won
> otherwise, expecially since the time control was fast.
There was much resistance to this, and correct resistance. Connections are at
your own risk, of course. I play nearly every tournament over a remote
connection, and I can't say I have been very lucky there.
> Nevertheless, if you are saying this was unfair to the others, I
> agree completely. I believe remote machines should play by the
> same rules and they should have to deal with the extra problems,
> not the rest of the participants. The internet lag doesn't
> affect the playing strength of the program, only it's chance of
> losing on time. So if we were given a few extra minutes I see
> this as clearly wrong. But the tournament director was very
> inexperienced, as you will see from point 3 below.
>
> 2. Bugs. The user interface was incredibly buggy and really hurt
> our program. I think the SGI machine crashed once or twice to
> make matters worse. We ran with assertions turned on, and we
> tripped one of the assertion failures which of course crashed
> the program.
At the world champs 2003 the SGI supercomputer I ran on also crashed several
times. It was effectively down one full playing day for me.
The bugs are always file-system related, when one of the users happily streams
a file system too full.
It is pathetic to me to see that such expensive supercomputers have I/O sizes
far below what money can easily buy for 5000 dollars. The 512-processor TERAS
supercomputer has a scratch size of 1 terabyte. That might look great, but with
a few cheapo disk arrays, which effectively stream just as fast as that scratch
space, you can easily have 10 terabytes of scratch for a few thousand dollars.
Jonathan Schaeffer also told me about major horrors he encountered with SGI
machines.
Basically the only real problem I had was that I didn't get enough testing
time, despite having asked for it in the application already.
I was only given time to run at the tournament.
So I could brew software at home, and then it suddenly had to work fine on 512
processors at the tournament.
I ran on outdated 500MHz MIPS processors against 2.2GHz A64s and quad 2.8GHz
Xeon MPs.
EGTBs I could not use, as the file system would crash in that case.
Of course, in the Cilkchess days EGTBs were not a hot topic yet.
> 3. In one of the games, Cilkchess chose one move, but Aske Plaat
> mistakenly played a different move on the board. In every
> computer chess tournament I have ever entered, the operating
> principle was to never penalize a program for operator error.
> For some reason, the tournament director ruled against us, and
> we had to force Cilkchess to play a move it never chose. I was
> extremely upset over this as I believed it would cost us the
> tournament and the move of course was a horrible error. As it
> turned out, we somehow won this game despite the non-standard
> ruling and Aske's mistake.
> The tournament director was inexperienced as he made 2 errors,
> he gave us extra time unfairly, but he made this grossly unfair
> ruling. If I had a choice of giving my opponent 15 extra
> minutes, or being forced to make 1 random move in 1 of the games
> of an 11 round tournament, I am not sure which I would choose,
> as the 1 random move should almost certainly be lethal.
As a titled chess player and co-organizer of some big events, I can assure you
that in all sports the tournament officials always make horrible decisions.
In computer chess we are so lucky to have professor Jaap van den Herik from the
ICGA (International Computer Games Association) as tournament director at the
world champs. Jaap is unpaid, but invaluable to the computer chess world
champs; in fact he uses up his own holiday days to be director at such
tournaments.
In a commercial environment (as the strong programs are always the commercial
ones which have a chance to win), making the right decisions is really hard.
I can assure you that in that sense computer chess is very blessed with wise
persons.
We were told that something horrible went wrong in computer go a few years ago.
Can anyone explain this to me? I heard something about a Korean program which
illegally used a copied knowledge base from some other top go program and
became world champion with stolen evaluation knowledge?
>In my later experiences at the Dutch open, I found all the
>participants and the directors to be extremely nice, friendly and
>helpful. In fact, one of the participants let me and one of the UROP
>students who traveled with me stay in his home for a few days. Let's
>see, who was that guy? I think his name was Vincent Diepeveen. Do
>you know him? Please tell your family that I remember this kindness
>and still think of it from time to time.
Thanks for your kind words, Don. I really found it great that you came all the
way across the ocean to play some tournaments here, and I regret that after
Deep Blue no American sponsors were interested anymore in organizing a computer
chess world champs in the States. I would happily have paid the ticket.
I know you no longer work at MIT nowadays, but if there were some big event in
the USA or Taiwan, would Cilkchess somehow show up in some improved form on MIT
hardware?
>- Don
> Date: Sat, 06 Nov 2004 01:35:28 +0100
> From: Vincent Diepeveen <diep@xxxxxxxxxxxxxxxxx>
>
> Don, a few things have changed since you left computerchess.
>
> MTD no longer can get used. Lazy eval gets used by less and less
engines in
> the top. If i read the document well Chrilly also can't use it and chrilly
> gives a clear explanation for it. The same big scores from evaluation
> function will let the evaluation fluctuate from move to move.
>
> Another thing that happened at supercomputers is that in contradiction to
> the 80s, the latency to get a remote byte from a cpu relative to the cpu
> speed has become 100 times bigger.
>
> So that already ends all plans to use cilk at supercomputers.
>
> Additional in todays software, a shared hashtable is very important to
have.
>
> Todays software, because of the improved evaluation function suffers from
> more effects.
>
> Deepfritz6 at a quad xeon 500Mhz as it was in 1999 running, searched 17
ply
> every move.
>
> Todays deepfritz8 in world champs 2003 and 2004 searched 14 ply at a quad
> opteron 2.2Ghz.
>
> Chrilly basically already answerred why.
>
> Nullmove nowadays gets used at R=3, usually not near the leafs.
>
> No one uses cilk in game tree search of course.
>
> At 19:13 5-11-2004 -0500, Don Dailey wrote:
> >
> >A little more about Cilk, since Vincent grossly mispresented it.
> >
> >Cilk is really C with a few simple constructs added; spawn, sync,
> >inlet and abort. An inlet can catch the result of a computation and
> >can abort work inside it. This mechanism makes it perfect for alpha
> >beta searching.
> >
> >Any function that begins with the keyword "cilk" can be spawned in
> >parallel. The scheduler is based on work stealing. In cilk you don't
> >think about the number of processors, this is abstracted away. Any
> >time a "sync" keyword is invoked, the processor doesn't idle, it
> >randomly grabs work from somewhere else.
> >
> >The beauty of cilk is that it rarely invokes overhead for these
> >parallel constructs. The cilk pre-processor constructs 2 versions of
> >each "cilk" function, one that is purely serial and knows nothing
> >about the cilk keywords and associated overheads and the other "slow"
> >version has all the good stuff for managing the scheduler. The vast
> >majority of the time your program is invoking pure C code without any
> >cilk overhead.
> >
> >Here is an example of the fibonacci function taken from the cilk web
> >page:
> >
> >
> >cilk int fib (int n)
> >{
> > if (n < 2) return n;
> > else
> > {
> > int x, y;
> >
> > x = spawn fib (n-1);
> > y = spawn fib (n-2);
> >
> > sync;
> >
> > return (x+y);
> > }
> >}
> >
> >
> >
> >
> >For alpha/beta applications like chess, there is an abort keyword that
> >can be invoked to stop work that is happening, such as is the case
> >when you get a beta cutoff. An "inlet" can catch this case and abort
> >all work spawned inside the parent stack frame. Basically, you call
> >the inlet (which is a function inside a cilk function GCC style) with
> >a spawn statement as an argument to the inlet function. Inside the
> >inlet, you can abort work.
> >
> >This sounds a lot more complicated than it actually is. It is quite
> >trivial to write an alpha/beta search routine using cilk without the
> >normal pain and agony and debugging using a threading library (like
> >pthreads) for instance.
> >
> >The scheduler is efficient and your code is effecient, you are still
> >programming in C. You tell cilk how many processes you want to use
> >and cilk will use than many (even if you are on a serial machine, you
> >can write, debug and test parallel programs specifying any number of
> >proceses.)
> >
> >If you write a cilk program and run it as a 1 processor program, it
> >will run nearly as fast as the same program written without cilk,
> >usually within 1 or 2 percent slowdown.
> >
> >It's really cool for many parallel programming tasks.
> >
> >Check it out at http://supertech.lcs.mit.edu/cilk/home/intro.html
> >
> >- Don
> >
> >_______________________________________________
> >computer-go mailing list
> >computer-go@xxxxxxxxxxxxxxxxx
> >http://www.computer-go.org/mailman/listinfo/computer-go/
> >
> >