
Re: [computer-go] Pattern matching - rectification & update



>  As long as the system performs quite a bit worse than a pro, you can
>  verify the performance of the system by measuring its "pro
>  prediction". At least this is the method I use and it seems to work
>  out well. Higher "pro-prediction" always meant an increase in
>  strength (tested in play against Go programs). Fixing bugs always
>  meant an increase in "pro-prediction".

I know some things about this subject from my many years of experience
with computer chess.   Let me make some comments.

First of  all, your system appears  to be quite  impressive.  What I'm
going  to say  is not  to take  anything away  from that.   In  fact I
eagerly look  forward to seeing your  progress and I hope  you keep us
updated.  Nice work so far!

In theory, it would be really nice if you could use pro-prediction as
a strength-measuring tool.  In practice, it tends to be a bit dicey!
I'll explain in a moment, but before I do that I want to make sure you
understood my original comments.

If you ARE going to use pro-prediction (which I don't think is all
bad) then it makes sense to make it work as well as possible.  My
comments were based on the observation that there is a lot of "slop"
in the samples.  Your results could vary significantly based simply on
which players you choose to predict!  Ideally, it would be nice if you
could eliminate at least some of the uncertainty, and I proposed a
method to do this.  Of course what I proposed is a pain in the neck to
set up, and it's far easier to do what you're doing now, which is
fine.  It's not even clear how useful what I proposed would be; it was
basically just an idea to identify with somewhat more certainty which
moves are among the best.


About pro-prediction.  I did an experiment with pro-prediction many
years ago in chess.  I wanted a way to quickly measure improvements to
my program.  I purchased a huge database of high-quality master games
and took a random sample of positions from those games.  I think I
ended up with about 10,000 positions.  I made sure each position was
unique, since many of those positions came from the openings.

I ran a 1-ply search, a 2-ply search, a 3-ply search, and so on ....

Each additional ply greatly improved the prediction of the best move.
I was excited.  Finally, a way to easily measure relatively small
improvements in my program!
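
For the curious, the harness amounted to something like the Python
sketch below.  All of the names here (the Engine interface, the
position keys, the test-set format) are illustrative stand-ins, not
my actual program:

    import random

    def build_test_set(labeled_positions, sample_size):
        # Deduplicate first, since opening positions repeat across games.
        seen = set()
        unique = []
        for position_key, master_move in labeled_positions:
            if position_key not in seen:
                seen.add(position_key)
                unique.append((position_key, master_move))
        return random.sample(unique, min(sample_size, len(unique)))

    def prediction_rate(engine, test_set, depth):
        # Fraction of positions where the engine's choice matches the
        # master's move at the given search depth.
        hits = sum(1 for position, master_move in test_set
                   if engine.search(position, depth) == master_move)
        return hits / len(test_set)

    # for depth in (1, 2, 3, 4):
    #     print(depth, prediction_rate(engine, test_set, depth))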

I decided to put this to a thorough test, since it was to be a
powerful new tool in my arsenal.  The first test was that I turned
off all pawn structure evaluation to see how much strength
degradation there would be.  Now, pawn structure was a big deal in my
program.  In self-play tests, it amounted to well over 150 rating
points.  In testing against commercial and public domain chess
programs, pawn structure proved incredibly important.  Games played
without pawn structure were lost very quickly due to pawns getting
easily picked off and other bad effects.

To my surprise, the program predicted the master moves significantly
better when pawn structure was turned off.  Even though increasing
the depth of the search clearly improved the prediction ratio, it
wasn't the same with other kinds of improvements, even ones that
should have shown a significant gain.

Over the years I gradually  came to the conclusion that doing anything
involving human games (other than testing directly against humans) was
dicey  at best.   I  got some  very  good results  at  one point  with
automated learning, but could never get good results based on any kind
of analysis of human games.

>  Well, the purpose of the pattern system is mainly to be an expert system for
>  Fuseki, Joseki and "Good Shape".

I hope it works well for that.  At any rate, I find what you are doing
to  be extremely  interesting  and  I've been  following  all of  your
emails.

- Don





   From: "Frank de Groot" <frank@xxxxxxxxxxxxxxxxx>

   > I  don't really  believe  in  pro-move prediction  systems,  but I  do
   > believe a strong go program would probably play the same move as a pro
   > more often than a weak program would.

   Well, the purpose of the pattern system is mainly to be an expert system for
   Fuseki, Joseki and "Good Shape".

   In that case it works very well; this can be verified because it
   "agrees" with most of the pro moves at those stages of the game, and
   its alternative moves are usually pretty good as well.


   > It's too bad that there is not  an easy way to quantify the quality of
   > a move.   For instance,  there may be  many cases where  the predicted
   > move is perfectly valid and move choice is more a matter of style than
   > anything else.


   As long as the system performs quite a bit worse than a pro, you can
   verify the performance of the system by measuring its "pro
   prediction". At least this is the method I use and it seems to work
   out well. Higher "pro-prediction" always meant an increase in
   strength (tested in play against Go programs). Fixing bugs always
   meant an increase in "pro-prediction".

   In fact I would argue that using "pro-prediction" is very important during
   Go software development exactly because it is a "scientific" measuring stick
   and not some "gut feeling" of the programmer who plays a few games and sees
   that it is "better".

   You need a test set of positions, and you measure your average
   progress or the progress at particular stages of the game.
   I developed this standard because I don't play Go, but I think it is
   foolish to think that a 4d amateur is better at judging his program's
   performance than, say, a team of a thousand professionals.
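
   As a rough sketch of what I mean, in Python (the record format here
   is just illustrative, not my actual system):

       from collections import defaultdict

       def prediction_by_stage(records, bin_size=50):
           # records: (move_number, predicted_move, pro_move) triples.
           hits = defaultdict(int)
           totals = defaultdict(int)
           for move_number, predicted, pro_move in records:
               stage = move_number // bin_size   # bin 0 is roughly Fuseki
               totals[stage] += 1
               hits[stage] += (predicted == pro_move)
           # Average pro-prediction rate per stage of the game.
           return {stage: hits[stage] / totals[stage] for stage in totals}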


   > What you would like to have (but  probably can't have) is a way to say
   > that  the  program's  top  choice  was  a GREAT  move  or  a  move  of
   > "professional  quality",   whether  it  happened  to  be   the  one  a
   > professional chose or not.


   That's exactly what I am doing with pro prediction.

   I put 250 never-seen-before positions into the system, and 110 times
   the system says it would play the same move the pro did. That is the
   current state of my pattern system. I know that it finds more than
   100 "great" moves per pro game.

   Of course that does not mean that it plays like a pro. When 56% of
   its moves are terrible or sub-optimal, it will play like a wet
   dishrag.
   And when you play very "badly" against it, it might not see many
   patterns to work with.

   But the performance of a Go program can still be measured with
   pro-prediction.

   I am sure that 44% is way below optimal; many of its moves are still
   obvious mistakes.
   So when you add a tactics module, you would expect the pro-prediction
   to rise.


   > It might  also be the case  that an occasional  prediction is actually
   > BETTER than the move the professional chose in a particular situation,
   > in which case your prediction statistics get hurt unfairly.


   Yes, but you can verify that with statistical data.
   My pattern system explains why it deems one move better than another.
   It says: "100 pros chose this move and 60 won, whilst 200 chose that
   move and 115 won".

   This only works with Fuseki of course, and to a certain extent with
   Joseki. But that is exactly where a pattern system is very useful
   (and with very obvious moves like defending a cut, etc.).
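
   As a rough sketch with the example numbers above (a real comparison
   would of course want much larger samples before trusting small
   differences in win rate):

       candidates = {
           "this move": {"chosen": 100, "won": 60},    # 60.0% won
           "that move": {"chosen": 200, "won": 115},   # 57.5% won
       }

       for move, s in sorted(candidates.items(),
                             key=lambda kv: kv[1]["won"] / kv[1]["chosen"],
                             reverse=True):
           print(move, "win rate:", s["won"] / s["chosen"])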


   > One subjective way around this is to  get a real pro to rate the top N
   > choices in  a few sample  games, asking him  to "put a check  mark" on
   > move choices that are reasonable  pro candidates, in other words could
   > this  be a move  that a  pro is  reasonably likely  to play?

   That would only be useful for rating the system's non-#1 moves.
   The pro game records can rate the #1 move.

   There is a correlation between the quality of move #1 and the others,
   so your system is not necessary.
   In fact, I can't think of any use for a pro.
   If I had Go Seigen in a box, I would ask him to bring me coffee every
   few hours; that's about the only use a pro has for me at this stage.

   No disrespect to pros, but it simply is like that.
   I see no way I could *use* the info that "Move #2 is not so good
   because.. [insert very complex pro explanation, impossible to capture
   in software]".

   We all know the rules of Go.
   That is enough to make a Go program.
   If you think that your program needs patterns, don't go and make them up.
   Harvest them from pro records.

   Don't go and think you are so clever that you know better.
   You neither know better which patterns to use than the ones that
   occur in half a million games, nor can you judge the quality of the
   moves better than 1000 pros.
   I really don't understand why people spend ages typing in patterns
   and then end up with a slow, sub-par pattern system that hangs
   together like loose sand, because how do you decide what "value" new
   patterns have? Based on what, "gut feeling" or your limited
   understanding of Go?



   >  I still
   > wouldn't trust this measurement  unless you got verification from more
   > than one pro, then you could actually compare their opinions.

   Exactly.
   So I use a thousand pros :)
   I don't even have to pay them :)
   The only problem is that I have to make my own coffee.

   > In my  opinion, the problem  with pro-prediction schemes is  that it's
   > only  the occasional  move that  makes the  biggest difference  in the
   > strength of  good players.  At least  it's this way in  chess.  A weak
   > master plays chess very much like a grandmaster, it might only be 2 or
   > 3 moves in the whole game that "separates the men from the boys."


   Yes, you're right of course.
   This kind of pattern system is limited to Fuseki, Joseki and "good shape".
   It needs to be integrated with a tactics module.

_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/