[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
computer-go: don't use deep blue for proofs about GO
At 12:11 PM 11/16/99 -0800, you wrote:
>At 06:25 AM 11/16/99 -0800, you wrote:
>> >> Maybe current
>> >>chess programs and current go programs are equally strong but we just
don't
>> >>realize it ;-).
>> >
>> >no way, the average go player can beat all programs, only masters can beat
>> >chess programs.
>>
>>Both of the above statements are false.
>>
>>...
>>For example, I drew a game with Deep Thought and am certainly no master. ...
>
>is that the ibm program that uses special hardware? perhaps i should have
>specified that.
I get this kind of questions a lot.
What was Deep Thought, and what was Deep Blue?
Read very accurately what i read: "was". So definitely
not: "is".
Deep Thought was a system based upon hardware chessprocessors.
First what was the goal of Hsu writing Deep Thought?
"just search as fast as possible and thereby outsearch every
other program/person in the world in a tactical way".
Word tactical can be defined also: if i search a bit deeper with
a slightly worse evaluation then my depth might make up for it.
Of course in hardware you are a lot faster than in software,
so at a time where hashtables were hardly used and in a time
where nullmove was only used in commercial programs, but not
yet written about in official publications, it was obvious that
a big speed was needed to get a ply deeper.
Note that most chessprograms at that time played like GO programs
nowadays. Getting a 4 ply search was considered as normal.
Then there comes someone and because of having made hardware processors
that play chess, he gets a search speed of many million nodes a second.
I can't remember exactly what number of nodes a second Deep Thought
gets, as it was before my time,
but let's round it off to 10 million nodes a second.
Then Deep Thought starting playing some grandmasters, and made some
very silly mistakes. A big mistake which was a SEARCH mistake
was made against Karpov, at that time the one best player of the world.
Deep Thought missed there a move which would have led to a draw
in an endgame. Obviously Deep Thought didn't evaluate that endgame
very well. That wasn't needed though. The draw is for computers a pure
search thing. That's why i found this position so interesting.
The move it has to find is not only the move h4-h3, but also
the right score with it. It needs to see it's about a draw.
How does my own program (DIEP)
do on this position without EGTB, as i consider using those unfair here.
Allocated 79999984 bytes hashtables
black timeleft=27:46.40.00
- = - = - k - = ... 1 ...
= - = - = - = R ... 2 ...
- = - = O = - o ... 3 ...
= - = - = K = - ... 4 ...
o = r = - O - o ... 5 ...
= - = - = - = - ... 6 ...
- = - = - = - = ... 7 ...
= - = - = - = - ... 8 ...
white timeleft=27:46.40.00
black to move
00:00 5 (0) 1 -5.66 Rc4xf4 Kf5xf4
00:00 11 (0) 1 0.30 h6-h5 Rh7xh5
00:00 29 (0) 1 0.77 a4-a3 Rh7xh6
00:00 80 (0) 2 0.45 a4-a3 Rh7-a7
00:00 379 (0) 3 0.38 a4-a3 Rh7-f7 Kf8-g8 Rf7-a7
00:00 1176 (0) 4 -0.12 a4-a3 Rh7-a7 Rc4-c3 Kf5-e4
++ h6-h5
00:00 2488 (0) 4 0.36 h6-h5 Kf5-e5 a4-a3 Rh7-a7
00:00 5618 (0) 5 0.00 h6-h5 Rh7-h8 Kf8-e7 Rh8-h7 Ke7-d6 Rh7-d7 Kd6-c6 Rd7-d3
++ c4-b4
00:00 10648 (0) 5 0.05 Rc4-b4 Rh7-h8 Kf8-e7 Rh8-h7 Ke7-d6 Rh7-d7 Kd6-c6 Rd7-d3
++ h4-h3
00:00 12378 (0) 5 0.32 h4-h3 Rh7xh6 Rc4-c3 Kf5-f6 Kf8-g8
00:00 15289 (0) 6 -0.23 h4-h3 Rh7xh6 Rc4-c3 Kf5-f6 Kf8-g8 e6-e7 Rc3-e3
e7-e8Q Re3xe8 Rh6xh3
++ h6-h5
00:00 22306 (0) 6 0.13 h6-h5 Rh7-d7 h4-h3 Rd7-d2 Rc4-c3 Kf5-g5
00:01 43726 (0) 7 0.00 h6-h5 Rh7-h8 Kf8-e7 Rh8-h7 Ke7-e8 Rh7-h8 Ke8-e7
00:01 59895 (0) 8 0.00 h6-h5 Rh7-h8 Kf8-e7 Rh8-h7 Ke7-e8 Rh7-h8 Ke8-e7
00:05 209513 (0) 9 -0.14 h6-h5 Kf5-e5 h4-h3 f4-f5 h3-h2 Rh7-h8 Kf8-g7
Rh8xh5 Rc4-c5 Ke5-d4 Rc5-c2
00:09 328703 (0) 10 -0.18 h6-h5 Kf5-e5 h4-h3 f4-f5 h3-h2 Rh7-h8 Kf8-g7
Rh8xh5 Rc4-c5 Ke5-d4 Rc5-c2 Rh5-h4
00:17 650786 (0) 11 -0.21 h6-h5 Kf5-g5 h4-h3 Rh7xh5 Kf8-e7 f4-f5 Rc4-c3
Rh5-h7 Ke7-d6 Rh7-d7 Kd6-e5 Rd7-d2 Rc3-g3 Kg5-h4 Rg3-g7
00:27 1060547 (0) 12 -0.39 h6-h5 Kf5-g5 h4-h3 Rh7xh5 Kf8-e7 f4-f5 Rc4-c3
Rh5-h7
Ke7-d6 Kg5-g4 a4-a3 Rh7-d7 Kd6-e5 Rd7-d2
01:19 2894526 (0) 13 -0.68 h6-h5 Kf5-g5 Rc4-e4 f4-f5 Kf8-g8 Rh7-d7 h4-h3
Rd7-d8
Kg8-h7 Rd8-d2 Kh7-g8 Rd2-a2 Re4-g4 Kg5-f6 Kg8-f8
++ h4-h3
02:18 4821305 (0) 13 -0.04 h4-h3 Rh7xh6 a4-a3 Rh6xh3 Rc4-a4 Rh3-h8 Kf8-e7
Rh8-h7 Ke7-e8 Rh7-h1 Ke8-e7 Rh1-h7
03:04 6354176 (0) 14 -0.16 h4-h3 Rh7xh6 a4-a3 Rh6-h8 Kf8-e7 Rh8-h7 Ke7-f8
Rh7xh3 Rc4-a4 Rh3-h8 Kf8-e7 Rh8-h7 Ke7-e8 Rh7-h1 Ra4-a5 Kf5-f6 Ke8-d8
So DIEP at a PII450 getting 34.5k nodes a second
sees after 2:18 that black can draw here. Considering that
in this game the level was 3 minutes a move, DIEP would have made
it even at a single cpu of today. At the speed of Deep Thought
which got 10M nodes a second, let's say about a second?
Now some of you that remember what i wrote in the above sentences will say:
"this is not fair, your program uses nullmove and Deep Thought didn't, so
saying that Deep Thought was very inefficiently searching is comparing 2
different things with each other".
As nullmove was described very clearly in 1989 which is before the
deep thought chips were designed, i think you have a good point here.
So now i do a search with DIEP that's fullwidth, so in my case without
nullmove as after all my selective search experiments i got so sick of
all other forms of pruning other than nullmove that the only form of
pruning selectivity in DIEP is nullmove.
Here the output of my DIEP without nullmove (and not a single other
form of pruning. So completely fullwidth searching not skipping a single
move, only applying the correct alfabeta pruning):
00:00 5 (0) 1 -5.66 Rc4xf4 Kf5xf4
00:00 11 (0) 1 0.30 h6-h5 Rh7xh5
00:00 29 (0) 1 0.77 a4-a3 Rh7xh6
00:00 77 (0) 2 0.45 a4-a3 Rh7-a7
00:00 355 (0) 3 0.38 a4-a3 Rh7-f7 Kf8-g8 Rf7-a7
00:00 1378 (0) 4 -0.12 a4-a3 Rh7-a7 Rc4-c3 Kf5-e4
++ h6-h5
00:00 2638 (0) 4 0.36 h6-h5 Kf5-e5 a4-a3 Rh7-a7
00:00 6131 (0) 5 0.00 h6-h5 Rh7-h8 Kf8-e7 Rh8-h7 Ke7-d6 Rh7-d7 Kd6-c6 Rd7-d3
++ c4-b4
00:00 12458 (0) 5 0.05 Rc4-b4 Rh7-h8 Kf8-e7 Rh8-h7 Ke7-d6 Rh7-d7 Kd6-c6 Rd7-d3
++ h4-h3
00:00 15725 (0) 5 0.32 h4-h3 Rh7xh6 Rc4-c3 Kf5-f6 Kf8-g8
00:00 23672 (0) 6 -0.23 h4-h3 Rh7xh6 Rc4-c3 Kf5-f6 Kf8-g8 e6-e7 Rc3-e3
e7-e8Q Re3xe8 Rh6xh3
++ h6-h5
00:01 33831 (0) 6 0.14 h6-h5 Rh7-h8 Kf8-e7 Rh8-h7 Ke7-d6 Rh7-d7 Kd6-c6
Rd7-d3 Rc4-c1
00:01 59980 (0) 7 0.00 h6-h5 Rh7-h8 Kf8-e7 Rh8-h7 Ke7-e8 Rh7-h8 Ke8-e7
00:05 183327 (0) 8 -0.18 h6-h5 Kf5-e5 Kf8-g8 e6-e7 Rc4-c8 Rh7xh5 Kg8-f7
Rh5-h7 Kf7-e8
00:15 483036 (0) 9 -0.14 h6-h5 Kf5-e5 h4-h3 f4-f5 h3-h2 Rh7-h8 Kf8-g7
Rh8xh5 Rc4-c5 Ke5-d4 Rc5-c2
00:32 1052658 (0) 10 -0.18 h6-h5 Kf5-e5 h4-h3 f4-f5 h3-h2 Rh7-h8 Kf8-g7
Rh8xh5 Rc4-c5 Ke5-d4 Rc5-c2 Rh5-h4
01:51 3639774 (0) 11 -0.21 h6-h5 Kf5-g5 h4-h3 Rh7xh5 Kf8-e7 f4-f5 Rc4-c3
Rh5-h7
Ke7-d6 Rh7-d7 Kd6-e5 Rd7-d2 Rc3-g3 Kg5-h4 Rg3-g7
03:34 6942261 (0) 12 -0.39 h6-h5 Kf5-g5 h4-h3 Rh7xh5 Kf8-e7 f4-f5 Rc4-c3
Rh5-h7
Ke7-d6 Kg5-g4 a4-a3 Rh7-d7 Kd6-e5 Rd7-d2
++ h4-h3
07:39 14043873 (0) 12 -0.07 h4-h3 Rh7xh6 a4-a3 Rh6xh3 Rc4-a4 Rh3-h8 Kf8-e7
Rh8-h7 Ke7-f8 Kf5-f6 Ra4xf4 Kf6-e5 Rf4-f2 Rh7-a7 a3-a2
So we see now 2 things
a) it needs a ply less to find it
b) it needs still a lot more nodes and more time.
c) at 10M nodes a second this would be still under 2 seconds to
see the draw.
It's very important to realize that this h4-h3 move has nothing
to do with evaluation. It's just pure tactics. Even a bad evaluation
will see h3 without problems at these depths.
To answer a,b,c:
a) this trick has to do with a zugzwang that's something nullmove
has problems with. It needs another ply, but it does find it in the end
b) it needs more nodes because it doesn't use nullmove
So c) my conclusion here is that Deep Thought design was very inefficient.
It missed a trick that even a slow searching program like mine sees within
a few minutes. Not a single misevaluation can make up for this.
Deep Thought searched very inefficient whatever side you look from.
Note that most chessprograms with a lot less knowledge than DIEP
get hundreds of kilobytes nodes a second and see this tactical trick a lot
faster than my program does. Some even within seconds see that h3
leads to a draw.
Apart from this tactical problem, many games from deep thought gave
the world the obvious insight that the evaluation function it used
was far from good. It played to use a big understatement a very non
active form of chess that allowed it to get directly in a position where
there was no chance for escape at all. It was just defending,
in go terms this is very easily called: it was connecting all it
groups together *directly*, without caring for the rest
of the space on the board. I guess that's a *very* good way to
describe it. In chess this is called 'mobility or activity'.
IBM goes sponsor the project, so deep thought now was called deep blue.
Now of course the errors of deep thought gave the Deep Blue programmer(s)
obviously
the insight that 2 things sucked at their program.
a) search depth
b) chessknowledge (that's the general term for it of course,
in specific: deep thought didn't play not active enough it played
a too defensive form of chess allowing the opponent to build
up a bigtime won position)
Then Deep Blue goes play kasparov. It also joined one time the
open world computer chess championships. It definitely showed it
improved a lot, but it wasn't convincing. In the 1995 world championship
it lost from a micro computer running at a P5-90Mhz
tomorrow i'll finish the story, as deep blue obviously is a closed book.
tomorrow why...
>Ray (will hack java for food) http://home.pacbell.net/rtayek/
>hate spam? http://www.blighty.com/products/spade/
>
>
Vincent Diepeveen
diep@xxxxxxxxxxxxxxxxx
---
... en verder ben ik van ben ik van mening dat Dap het gesticht in moet