
Re: [computer-go] Statistical significance (was: SlugGo vs Many Faces,newest data)



Stuart,

I know this is a popular idea, but I have some reservations about it.
I know this flies in the face of common computer-go wisdom, but I
would like to explain.  Please don't flame me; I'm not being critical,
just presenting another viewpoint:

Let me explain the problem as I see it.  It's something I learned from
chess  programming.  In  a  nutshell, strong  players  tend to  create
positions that are not particularly representative of what the program
needs to understand.

Let me further  explain.  The way I discovered  this in computer chess
was that I wanted to grab  random positions from games in order to get
"natural" positions, ones that  were truly representative of positions
the program would encounter and need to understand in "real" games.

But when  I wanted to  specifically work on  King and Pawn  endings, I
found very  few (almost  none) examples of  king and pawn  endings.  I
looked in  huge samples of  hundreds of thousands of  published games.
It turns out that grandmasters rarely get into these endings.  AND YET
YOU CANNOT PLAY A CONSISTENTLY STRONG game without having a deep
understanding  of these  endings  and  how to  play  them.  There  are
complete books  written about handling and understanding  this kind of
ending!

So it seems to be the case that a lot of the knowledge necessary for
strong play is not explicitly represented in Grandmaster games (but is
of course hidden somewhere inside them.)

I  also  learned that  Grandmaster  games  contain  lots of  stylistic
biases.  Human players  do not have a "universal  style", they tend to
be strongly  influenced by current  fads and fashions.  Of  course the
play is incredibly strong within this framework.

Humans are also victims of their own powerful pattern-recognition
facilities.  I say "victims", but it's what makes us so good compared to
programs.  Nevertheless,  psychologists (I'm not  one of them  :-) can
tell you how  it pushes you to certain conclusions  and can trick you.
The bottom  line is  that humans reformulate  problems to  an internal
representation that is peculiar to our way of doing things and solving
problems.  I  have always believed  that trying to force  computers to
emulate us can  be taken way too far.  Do we  force runners to emulate
automobiles because automobiles are far faster?  No, we cooperate with
nature.   We should  cooperate with  computer nature  too and  let the
computer  be natural instead  of arrogantly  forcing it  to be  in our
image (which we all do including me!)

I don't know GO as played by  masters but my fear is that you wouldn't
get a  fair representation of what  the program really  needs to know.
Another way of putting this is  that the program would not know how to
handle any of the positions that humans avoid.

This problem really shows up  when doing automated training with human
games.  For instance  in chess, if you are trying  to make the program
learn  the value  of  the pieces,  you  will find  that humans  always
gravitate towards  positions where there is compensation.   All of the
examples where a rook was sacrificed for a knight in human games will
not be representative of how good a rook really is compared to a
knight.  It might even appear that a knight and a rook were worth
pretty much the same!

In chess, being a pawn down is probably a loss, all other things being
equal.  But in master games, after factoring in the times where a pawn
was sacrificed on purpose or a pawn-down ending was purposely reached
because it was known to be a draw, we might not easily deduce that
being a pawn up is actually a pretty good thing.
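To make the selection bias concrete, here is a tiny Python sketch.  The
win rates are invented for illustration, not measured from anything.
It estimates how good an extra pawn looks from two kinds of samples:
one where compensation is rare, as in randomly generated positions, and
one where compensation is almost always present, as in master games
where the pawn was usually given up on purpose:

    import random

    random.seed(0)

    # Hypothetical win rates for the side that is a pawn up (invented numbers):
    P_WIN_CLEAN_PAWN = 0.70    # extra pawn, no compensation for the opponent
    P_WIN_COMPENSATED = 0.52   # the opponent got real play for the pawn

    def estimated_win_rate(n_games, p_compensated):
        """Simulate n_games pawn-up positions; p_compensated is how often
        the pawn-down side has compensation in this particular sample."""
        wins = 0
        for _ in range(n_games):
            compensated = random.random() < p_compensated
            p_win = P_WIN_COMPENSATED if compensated else P_WIN_CLEAN_PAWN
            wins += random.random() < p_win
        return wins / n_games

    # Random positions: compensation is the exception.
    print("random positions:", estimated_win_rate(100_000, p_compensated=0.10))
    # Master games: a pawn is rarely lost without something in return.
    print("master games:   ", estimated_win_rate(100_000, p_compensated=0.95))

The underlying value of the pawn is identical in both runs; only the
sample changes, and the master-game sample makes the extra pawn look
like it is worth almost nothing.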

I don't have  a clue whether what I  do is any better, but  my hope is
that I'm erring on the side  of making the program a bit more "robust"
with regard to handling a variety of situations.

My thinking,  right or wrong, is that  even if a few  random moves are
completely silly, after this stage the programs are presented with
the problem of having to make the best of the situation.  I don't
really know  if this  makes the  program more robust  but that  is the
intent.
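In case it helps, here is roughly the scheme I mean, as a minimal
Python sketch.  It assumes the python-chess package for legal-move
generation, and the two "engines" are just stand-in functions so the
example runs on its own; in a real test you would plug in the programs
being compared:

    import random
    import chess   # python-chess, used here only for legal-move generation

    def play_test_game(choose_move_a, choose_move_b, n_random_moves=4):
        """Play a few random legal moves from the normal opening setup,
        then let the two move-selection functions finish the game from
        whatever situation that leaves them in."""
        board = chess.Board()
        for _ in range(n_random_moves):
            if board.is_game_over():
                break
            board.push(random.choice(list(board.legal_moves)))
        players = [choose_move_a, choose_move_b]
        to_move = 0
        while not board.is_game_over():
            board.push(players[to_move](board))
            to_move = 1 - to_move
        return board.result()   # "1-0", "0-1" or "1/2-1/2"

    # Stand-in players so the sketch is self-contained.
    random_player = lambda b: random.choice(list(b.legal_moves))
    print(play_test_game(random_player, random_player))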

Along the  same line ...  It's  been several years since  I authored a
chess program,  but most evaluation functions for  chess programs back
then  were strongly  biased towards  the opening  setup  (including my
own.)   Although  no programmer  did  this  consciously,  most of  the
evaluation parameters were based on lots of assumptions that were tied
to the opening setup and that was often the cause for games being lost
due to irrelevant  moves.  What comes to mind  are evaluation features
like  castling bonuses  and putting  a rook  on certain  magic squares
(like the 7th rank, with no other consideration).  These are general
principles that usually apply because of how the pieces are originally
placed on the board, not because the program actually understands the
game.
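To show the kind of term I mean, here is a caricature in Python (the
bonus values are invented).  Both bonuses tend to correlate with good
play when the game starts from the normal opening setup, but neither
asks whether the king actually needs shelter or whether the 7th rank
actually contains anything worth attacking:

    CASTLED_BONUS = 30       # centipawns, invented for illustration
    ROOK_ON_7TH_BONUS = 25   # centipawns, invented for illustration

    def opening_setup_bonuses(has_castled, rook_ranks):
        """has_castled: whether the side has castled.
        rook_ranks: ranks (1-8, from that side's point of view) of its rooks."""
        score = CASTLED_BONUS if has_castled else 0
        score += ROOK_ON_7TH_BONUS * sum(1 for r in rook_ranks if r == 7)
        return score

    print(opening_setup_bonuses(has_castled=True, rook_ranks=[1, 7]))   # 55

On a position with the pieces thrown onto random squares, terms like
these are close to meaningless, which is the point of the next
paragraph.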

If evaluation  functions had been  designed based on throwing  all the
pieces  on squares at  random, I  have a  feeling that  chess programs
would have become  stronger more quickly because they  would have been
forced  to  have  robust  evaluation  functions  that  were  based  on
understanding more situations.

Another aspect of this same phenomenon is the principle discovered by
many  go programmers.   You can  predict a  high percentage  of strong
human  moves with a  few rules  of thumb,  but being  able to  do this
doesn't make the program better!  A  well known rule is to place close
to where the  last move was played.  It turns out  that most moves fit
this pattern, but you can't actually use that rule to measure how good
a move is.  If go programs were written like the early chess programs,
there would be a big bonus for placing a stone close to the last move.
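Here is a minimal sketch of the kind of measurement I mean.  The
distance threshold and the toy move sequence are made up; in practice
you would feed in moves parsed from professional game records:

    def fraction_near_previous(moves, max_dist=3):
        """moves: list of (col, row) board coordinates in the order played.
        Returns the fraction of moves played within max_dist (Chebyshev
        distance) of the move that preceded them."""
        near = 0
        for prev, cur in zip(moves, moves[1:]):
            if max(abs(cur[0] - prev[0]), abs(cur[1] - prev[1])) <= max_dist:
                near += 1
        return near / max(1, len(moves) - 1)

    # Invented toy sequence, just to show the call.
    toy_game = [(3, 3), (15, 3), (16, 5), (15, 15), (3, 15), (5, 16)]
    print(fraction_near_previous(toy_game))

Getting a high number out of a measurement like this tells you the
rule predicts human moves well; it does not tell you that rewarding
proximity will make the program play better.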

There are many  cases of people who have  trained their programs based
on  human  game analysis  with  "good  results."   I think  Deep  Blue
developed  their  chess evaluation  function  based  on a  statistical
analysis of thousands of master games.  I have read papers on go
patterns that have been extracted from human games.  Typically,
someone will report that they  had some program, then they implemented
the learning on master games and  the program got stronger.  I am sure
this  did help,  but  I can't  help  but feel  that  the technique  is
fundamentally  flawed.  It  probably works  well until  something more
sound is discovered.  (In chess,  "piece square tables" worked well in
practice, but the concept is fundamentally flawed, and today most
programs have either dropped it or replaced much of it with more
dynamic evaluation features.)
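For anyone who hasn't seen them, a piece square table is just a fixed
per-square bonus that gets added with no regard for anything else on
the board.  A minimal sketch (values invented for illustration):

    # Per-square bonus for a white knight, in centipawns (invented numbers).
    # Row 0 is White's back rank.
    KNIGHT_TABLE = [
        [-40, -30, -20, -20, -20, -20, -30, -40],
        [-30, -10,   0,   5,   5,   0, -10, -30],
        [-20,   0,  10,  15,  15,  10,   0, -20],
        [-20,   5,  15,  20,  20,  15,   5, -20],
        [-20,   5,  15,  20,  20,  15,   5, -20],
        [-20,   0,  10,  15,  15,  10,   0, -20],
        [-30, -10,   0,   5,   5,   0, -10, -30],
        [-40, -30, -20, -20, -20, -20, -30, -40],
    ]

    def knight_bonus(file, rank):
        """Same bonus whether the knight dominates the center or is about
        to be trapped there -- the table never looks at the rest of the
        board."""
        return KNIGHT_TABLE[rank][file]

    print(knight_bonus(3, 3))   # a central knight always gets +20 here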


- Don





>   There would seem to be an even better way of solving this problem:
> 
>   Select positions a random N steps into an opening book, allowing the
>   database to reflect "real" playing. The opening book could be a standard 
>   one built from the games of high-level players or selected from games of 
>   players approximately the same strength as the go playing program.
> 
>   cheers
>   stuart

_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/