
[computer-go] Learning : was Chess programs versus go programs



Thanks for your posting Matt.

A small note.

In history nothing has ever beaten human learning capacity. The human way
of learning is without compare.

My last opening book author for Diep, Arturo Ochoa, has successfully
achieved victories with Diep in several tournaments.

Especially the win in the last international tournament was very good.

In the world championship he also did a great job. However, without 'human'
learning we would have been hopelessly lost, as one program tried to repeat
a game another program had played against us in game 2. Of course that
'repeating' was based on human knowledge: the book author of Fritz,
Alexander Kure, wanted to repeat a certain line.

Arturo had already guessed that. No learning system ever would have.
With automatic learning instead of Arturo, it would not have been possible
to respond in real time and change this line directly.

Just randomly changing a line in the book is not possible. You can only
play well-prepared lines.

The new opening book author is Erdogan Guenez (for some reason the Sheikh
of the Hydra team fired him, no idea why; he'll regret it big time within
2 years, I bet, as that's probably the time I need to adjust Diep to his book).

I asked my new book author why he didn't want to turn on learning during
testing. Answer: "human learning is better". In other words, every few
games he already changes the book himself.

Now this is perhaps not illustrating the main point I want to make to you,
but it sure is indicative.

If you optimize all those thousands of patterns by hand, nothing beats
human learning there. From a single game, I'm sure all the fantastic go
players here who also program a go program are capable of knowing exactly
where the bug is and in which pattern, which an automatic system never
figures out in its entire life.

Apart from that, such a human fix is a matter of a few minutes; the
automatic learning system needs to play a zillion games and needs
unavailable amounts of system time on solar-driven fusion supercomputers
in order to determine where the bug was.

In short, it has no good way to figure out which of the, say, 5000
patterns had the bug.

I am sure SlugGo has the potential to gain 5-6 stones in strength.

However, SmartGo, which currently uses a 1-ply global search depth, has
the potential of raising its level to professional dan grades by moving to
a 10-ply global search, in combination with a few bug fixes in its eval
(which will happen for sure).

That it already plays positionally so strongly with only a 1-ply search
trivially means that a 10-ply search will completely kill many
professional players.

Please remember, a 1-ply search in chess is rated LOWER than a 1-ply
search in go. According to what I read here, SmartGo is 3-4 kyu when
players do not know how to beat computers.

A chess program searching 6 ply is rated on the internet at 1800 on
average, which compares to roughly 3-dan grade; a 1-ply search, when
carried out by a program like Crafty for example, is rated roughly 1200.

Though some argue 1965 is the more comparable year for go versus chess,
1979 was one of the last years in which there were chess programs that
just selected a few moves from the root position and continued selectively
from there. I refer to Paradise here: a LISP program that tried to solve
tactics by selectively guessing a few moves at the root, and at each ply
guessing a few and pruning the rest.

By then it was very clear that brute-force search would kick butt.

So after that, all software searched 1 ply, then 2 ply, then 3 ply.
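That 1-ply, 2-ply, 3-ply progression is iterative deepening. A minimal sketch on a toy game tree (the tree, its values, and the `moves`/`evaluate` interface are all invented for illustration, not taken from any real engine):

```python
# Toy iterative-deepening negamax.  Each node is (static_value, children);
# values are from the perspective of the side to move at that node.
def negamax(node, depth, moves, evaluate):
    if depth == 0 or not moves(node):
        return evaluate(node)
    best = float("-inf")
    for child in moves(node):
        best = max(best, -negamax(child, depth - 1, moves, evaluate))
    return best

def iterative_deepening(node, max_depth, moves, evaluate):
    """Search 1 ply, then 2 ply, then 3 ply... exactly as brute-force
    chess programs did once the selective era ended."""
    return [negamax(node, d, moves, evaluate) for d in range(1, max_depth + 1)]

# invented toy tree: a 2-ply lookahead reveals the second move is worse
tree = (0, [(3, []), (-2, [(5, []), (1, [])])])
moves = lambda n: n[1]
evaluate = lambda n: n[0]

print(iterative_deepening(tree, 3, moves, evaluate))   # [2, 1, 1]
```

Note how the deeper iterations revise the shallow result: at 1 ply the second move looks best (score 2), but from 2 ply onward the tactical continuation corrects that to 1.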

The reason why it's not yet so obvious to everyone here why global search
is so very important is the huge effort made by strong go
player-programmers who created strong evaluation functions.

Diep with a 1-ply search plays at 1600 or so, probably as the only program
on the planet.

My first Diep version, which suffered from bugs around 1994, was getting a
5-ply search.

Current Diep with a 2-ply search beats the 5-ply Diep from 1994 for sure.

However, the combination of a good evaluation function with a reasonably
deep search, like 10 ply, is really going to kick major butt in go.

Now that's not so easy to get. It took 15 years in computer chess before
they figured out how to search very deep efficiently.

Having already optimized for a hybrid approach, where the focus is on
selective search (a local tactical search carried out after the global
search), it is not easy to rewrite that to just one global search plus a
very thinly searched selective part.

It's very difficult to write a selective search that does not let the
branching factor of the main search explode completely.

Because I know how weak the weakest chain of a 1-ply search is, missing
even an obvious direct move from the opponent, I am very impressed by how
much the evaluation functions in computer go have progressed in past years.

As my chess program has the biggest chess evaluation function on the
planet, I know how difficult it is to write one.

I also know what kind of complete nonsense any automatic approach is for
tuning, or even creating, such an evaluation.

The more complexity something has, the more difficult automatic tuning
gets, as the optimizers don't even know where to start optimizing.

Let's give a very simple example of that.

Suppose that we have an optimally tuned evaluation function. Really
*optimal* for the knowledge it has.

Only 2 parameters are tuned wrongly:

the bonus for a group having 1 eye, and the bonus for a group having 2
eyes.

Now suppose a game shows that the bonus for 1 eye must get raised.
A human will *directly* understand he has to raise the bonus for 2 eyes
too, as 1 eye is never better than 2 eyes.

An automatic learner will not even conclude that it has to raise the bonus
for 1 eye, because when it experiments with automatically raising the
bonus for 1 eye, it will not conclude that it has to increase the bonus
for the 2-eye pattern too.

So it will conclude, with respect to the 1-eye pattern, that it is wrong
to make it worth more. The reason is that when it played a few games,
making 1 eye, it didn't care to make 2 eyes; meanwhile it also created 1
eye in some other group, losing that group as it has just the 1 eye.
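This failure mode is easy to reproduce in a toy tuner. In the sketch below (all numbers invented for illustration, not Diep's actual tuner), both eye bonuses are assumed to be 10 points too low, and getting the *relation* between them wrong is assumed to hurt far more than getting their absolute sizes wrong. A blind one-weight-at-a-time hill climber then gets stuck immediately, while the human's joint raise of both bonuses is the real fix:

```python
# Toy illustration of two coupled evaluation weights: the bonus for
# 1 eye (w1) and for 2 eyes (w2).  Assume the true optimum is
# w1=30, w2=60, and that a wrong gap between them is disastrous.
def playing_strength(w1, w2):
    size_error = (w1 + w2 - 90) ** 2           # both bonuses too low/high
    gap_error = 100 * (w2 - w1 - 30) ** 2      # wrong relation: disastrous
    return -(size_error + gap_error)

def tune_one_weight_at_a_time(w1, w2, step=10, rounds=50):
    """Blind coordinate-wise hill climbing: change one weight at a
    time, keep the change only if measured strength improves."""
    for _ in range(rounds):
        improved = False
        for dw1, dw2 in ((step, 0), (-step, 0), (0, step), (0, -step)):
            if playing_strength(w1 + dw1, w2 + dw2) > playing_strength(w1, w2):
                w1, w2 = w1 + dw1, w2 + dw2
                improved = True
        if not improved:          # every single-weight change looks bad
            return w1, w2
    return w1, w2

print(tune_one_weight_at_a_time(20, 50))    # (20, 50): stuck at the start
print(playing_strength(20, 50))             # -400: still 10 short on both
print(playing_strength(30, 60))             # 0: the human's joint raise
```

Raising either bonus alone breaks the gap between them and is rejected, so the tuner never moves, even though raising both together reaches the optimum in one step.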

Now this is a dead-simple example, of course. The more complex your
evaluation function gets, the more difficult it is for a human to tune it.
Yet it's still relatively easy for him:

a) he still remembers the majority of the knowledge he entered into the program
b) he has domain knowledge
c) he has human learning to figure out what to modify

Whereas the automatic learning + tuning system:
a) doesn't remember which knowledge was entered for what reason
b) has no domain knowledge
c) has no clue, other than a random generator, which patterns/values to modify
d) cannot tune fine-grained, which is where a badly tuned huge evaluation
   function is effectively going to underperform by at least 6 stones
e) will try out random patterns first, which gives it a worst-case
   behaviour of a 60-kyu player. Just suppose it runs an experiment where
   it considers a block good: like a1 a2 b1 b2 all filled with black
   stones being good for black.


It will be busy testing for light-years before it figures out that laying
a block of stones all over the board is really bad. And it needs a full
cycle of experiments just to figure out that 1 tiny thing sucks. If that
tiny thing is related to other patterns, then that trivially shows you why
automatic learning will explode in difficulty.

A good example is engineering processors. Right now Intel and AMD are
fighting hard over whose CPU will be better for the average PC owner.

A single CPU can have 1 billion transistors. 1 billion. All hand-designed.

The expectation now is that in the future, with more complex CPUs,
hand-designed processors will get an even bigger edge over automatically
compiled processors.

Today the design is still done by hand, but the actual transistor layout
is done by Verilog-type compilers.

Verilog compilers currently cannot optimize such designs very well as the
number of transistors increases.

Just *imagine* that. 

Those processors have a number of transistors which *no* human can fully
understand. Yet hand-implementing the design in transistors is BETTER, and
is going to be relatively better still in the future, than automatically
compiling your design to transistors.

Computer go and computer chess do not have so many patterns yet, and
probably no hand-written program will ever have 1 billion patterns.

Yet in engineering, even at such sizes, especially at such sizes, hand
design beats automatic design by a lot.

For those Verilog compilers they don't use an average PC, by the way.

They use machines with at least 256 processors, or more. They really throw
insane amounts of system time at those designs.

Still believe in automatic learners in go?

Just wait until go programs search 7 ply with global search...

Vincent

At 21:18 7-12-2004 -0600, Matt Gokey wrote: 

Hello all,

It has been a while since I have participated on this list - it's been fun
reading the posts lately though - at least there has been some lively
discussion. ;-)

I suggested this type of multi-level search technique where the global
search works on higher level "moves" in a post to this list back in May of
1998 soon after learning about go, studying it and computer-go including
all the computer-go archives at the time and other research papers.  I had
started writing a go program, but lacked the time and motivation to
continue putting any effort into it.  It's hard to believe that was six
years ago.  Since then, I have kept up with the discussions on the
computer-go list, submitting a few posts and playing around with a few
related algorithms/code from time to time but not doing any serious work.

It's also interesting that in those six years, computer-go playing
strength and the generally accepted/common techniques for programming go
have made little, if any, significant progress.  I hope all of you
working diligently on your go programs don't take offense at that
characterization, but that's how it seems to me.  That is, until recently, IMHO.

I think the SlugGo experiment, which uses a parallel multi-level search
technique (correct me if I'm wrong), showing a potential improvement of 5-6
stones over its "root" program and measurable improvement against another
program is significant.  I still believe this type of technique is most
definitely one key to computers playing high-level go.  I think the same
idea can be taken to much further extremes than SlugGo and with less
complexity and overhead by implementing a search specifically designed
towards this end.   This is not to discount what the SlugGo team has
accomplished.  I'm excited to see how their experiment and its results
progress.  Keep up the good work!

Also, Frank de Groot's work on his pattern system that harvests patterns
from game records, despite all the doubts espoused on this list about it,
appears to me to be very promising - as _one_ component of many in a go
program, just as Frank has been saying over and over.  I've posted doubts
about hand-coded pattern-based systems and hand-coded heuristics due to
the fact that as the number of patterns and heuristics grows, the harder
it is to keep the system consistent and the rule integrity high.  It
sounds like Frank's system could be of great benefit as a move generator
base for a local and/or global search engine.  Also, Frank's Moyo Go
Studio looks to be quite impressive from the web site screen shots.  If
it's not too expensive, I'll definitely be buying a copy when it's ready.

There is little question in my mind that searching is a key to computer-go,
just as it was in chess.  It may not be the same kind of search
(multi-level, for example), and the evaluations (including life and death)
may not be as easy, but I think it's the most promising direction to
pursue, in combination with pattern-based concepts for move generation and
help in evaluation.
Matt

Heikki Levanto wrote:

  On Fri, Dec 03, 2004 at 01:17:16PM -0500, Don Dailey wrote:

    I thought you did a wonderful job of articulating the problem.  This
    is the thing that kills a global search and needs to be dealt with
    properly to proceed with global searching programs (if that's what we
    want to pursue.)

  I think some work has been done towards solving this, although I don't
  have any clear references.  It sounds reasonable to assume that one can
  do the local searches and simplify them into straight sequences, or at
  least into small trees.  While reading, one should be able to mark
  possible tenukis (ignoring the local situation and playing elsewhere).
  Now, having a pile of such trees, a higher-level search can try to
  decide which local battle to fight at each "move" (which can represent
  a longer forced sequence).  If all local moves are graded in terms of
  their size and urgency (temperature?), then we should have enough data
  to make meaningful move ordering and pruning in the higher-level search.
  I don't know if anyone does this sort of multi-level search, or what the
  practical limitations of such an approach are.  As with any search, it
  needs a decent evaluation function, but that is another problem...

  Regards,
    Heikki
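Heikki's multi-level idea can be sketched with invented data structures (there is no real engine behind this; the moves, sizes, and urgency numbers are made up): reduce each local battle to a few graded candidate sequences, then let the higher-level search order whole battles by temperature.

```python
# Sketch of the multi-level idea described above: each local battle is
# reduced to a few candidate sequences, each graded by size (points at
# stake) and urgency.  All structures and numbers are hypothetical.
from dataclasses import dataclass

@dataclass
class LocalSequence:
    moves: list     # a (possibly forced) local line; one higher-level "move"
    size: int       # points at stake in this battle
    urgency: int    # how badly it demands an answer right now

def temperature(seq):
    # crude grading: hot = big and urgent (the weights are arbitrary)
    return seq.size + 2 * seq.urgency

def higher_level_moves(battles):
    """The higher-level search considers whole local sequences, not
    single stones, hottest battle first -- which gives it the move
    ordering needed for meaningful pruning."""
    options = [(name, seq) for name, seqs in battles for seq in seqs]
    return sorted(options, key=lambda o: temperature(o[1]), reverse=True)

battles = [
    ("lower-right corner",
     [LocalSequence(["S3", "R2", "T2"], size=12, urgency=5)]),
    ("left side",
     [LocalSequence(["C8"], size=6, urgency=1)]),
]
print(higher_level_moves(battles)[0][0])   # the hot corner fight comes first
```

The pruning then falls out naturally: once the hottest battles are searched first, the tail of lukewarm options can be cut the same way a chess program cuts late moves.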
_______________________________________________ 
computer-go mailing list 
computer-go@xxxxxxxxxxxxxxxxx 
http://www.computer-go.org/mailman/listinfo/computer-go/


