[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [computer-go] Hardware-Instruction.

To: computer-go@xxxxxxxxxxxxxxx
Subject: Re: [computer-go] Hardware-Instruction.
From: Don Dailey <drd@xxxxxxx>
Date: Fri, 5 Nov 2004 22:17:40 -0500
Cc: computer-go@xxxxxxxxxxxxxxx
Delivered-to: computer-go@xxxxxxxxxxxxxxxxx
In-reply-to: <3.0.32.20041106013527.01013ad8@xxxxxxxxxxxxxxxxx> (message fromVincent Diepeveen on Sat, 06 Nov 2004 01:35:28 +0100)
List-archive: <http://computer-go.org/pipermail/computer-go>
List-help: <mailto:computer-go-request@xxxxxxxxxxxxxxxxx?subject=help>
List-id: computer-go <computer-go.computer-go.org>
List-post: <mailto:computer-go@xxxxxxxxxxxxxxxxx>
List-subscribe: <http://hosting.midvalleyhosting.com/mailman/listinfo/computer-go>,<mailto:computer-go-request@xxxxxxxxxxxxxxxxx?subject=subscribe>
List-unsubscribe: <http://hosting.midvalleyhosting.com/mailman/listinfo/computer-go>,<mailto:computer-go-request@xxxxxxxxxxxxxxxxx?subject=unsubscribe>
References: <3.0.32.20041106013527.01013ad8@xxxxxxxxxxxxxxxxx>
Reply-to: drd@xxxxxxx,computer-go <computer-go@xxxxxxxxxxxxxxx>
Sender: computer-go-bounces@xxxxxxxxxxxxxxx

Yes, I know computer chess has advanced.  I gave up lazy eval before I
left computer chess because it  wasn't safe unless I used big margins,
at which point it didn't help  much at all.  I predicted before anyone
that  the quality  of  evaluation  functions would  be  the future  of
computer chess  more than  speed.  This was  when others  were touting
speed.  I  proved this to  myself via simulations,  better evaluations
improved more  rapidly with deeper searches.  I  think Berliner pretty
much demonstrated this too.

But why MTD no longer used?  Is  it because MTD is so hard on the hash
tables  and  memory latency  is  a bigger  issue?   MTD  is much  more
dependant on  the fact that the  search tree remains  mostly stored in
memory via the hash tables so I can believe this is possible.   

> Nullmove nowadays gets used at R=3, usually not near the leafs.

Interesting.  My  program used null move,  but near the  leaf nodes. I
used  a  static  threat  detector  instead.  This  was  a  very  clear
improvement over pure null move and  I considered it one of my tricks.
I  don't  know  if others  knew  about  this  or  figured it  out  for
themselves, but I think it was one of the little goodies in my program
over many others and I didn't share this secret with others until much
later.   The static  threat routine  picked up  some things  null move
could not.  Interestingly,  it also missed some things  that null move
would get, but the overall effect was a clearly better program.

When you say, "usually not near the leafs", what do you mean?  Are you
using R=2 near leaf or something else entirely or nothing at all?

> I just saw it was 40 times slower than it could be.

In 1996 the program was running on an outdated 32 processor SGI, I say
"outdated" because  Pentiums were much  faster (and so were  newer SGI
machines.)   So it may  have been  like running  on 16  Pentiums which
still gave us a big advantage.  But if the parallel implementation was
even 20X slower, we would have been running slower than everyone else,
yet we won the tournament with an impressive score.  I don't know what
you believe you saw, but our program easily outsearched the others and
I don't believe  we won due to better evaluation, do  you?  Or are you
giving me a compliment?

In  1997  we only  had  a  4 processor  machine,  and  almost won  the
tournament.   I don't  believe this  program  was very  good, but  the
parallel implementation in both cases (1996 and 1997) was excellent.

But there were a number of  big problems we had during this tournament
which I think your comments are based on:

   1. Extremely poor internet connectivity.

      I don't remember  Aske being given extra time  to compensate for
      the  poor  connectivity,  but  I  can  believe  they  did.   The
      connectivity was  so poor  that we probably  could not  have won
      otherwise,  expecially  since the  time  control was fast.

      Nevertheless, if you are saying this was unfair to the others, I
      agree completely.  I believe  remote machines should play by the
      same rules and they should have to deal with the extra problems,
      not  the rest  of the  participants.  The  internet  lag doesn't
      affect the playing strength of  the program, only it's chance of
      losing on time.   So if we were given a few  extra minutes I see
      this  as clearly wrong.   But the  tournament director  was very
      inexperienced,  as  you  will  see  from point  3  below.

   2. Bugs.  The  user interface was incredibly buggy  and really hurt
      our program.  I  think the SGI machine crashed  once or twice to
      make matters  worse.  We ran  with assertions turned on,  and we
      tripped one  of the assertion  failures which of  course crashed
      the program.

   3. In one  of the games, Cilkchess  chose one move,  but Aske Plaat
      mistakenly  played a  different  move on  the  board.  In  every
      computer  chess tournament  I have  ever entered,  the operating
      principle was  to never penalize  a program for  operator error.
      For some  reason, the tournament director ruled  against us, and
      we had to force Cilkchess to  play a move it never chose.  I was
      extremely upset  over this  as I believed  it would cost  us the
      tournament and the  move of course was a  horrible error.  As it
      turned out,  we somehow won  this game despite  the non-standard
      ruling and Aske's mistake.    

      The tournament  director was inexperienced as he  made 2 errors,
      he gave us extra time  unfairly, but he made this grossly unfair
      ruling.   If I  had  a choice  of  giving my  opponent 15  extra
      minutes, or being forced to make 1 random move in 1 of the games
      of an 11  round tournament, I am not sure  which I would choose,
      as the 1 random move should almost certainly be lethal.

In  my  later  experiences  at   the  Dutch  open,  I  found  all  the
participants  and the  directors to  be extremely  nice,  friendly and
helpful.  In fact, one of the  participants let me and one of the UROP
students who traveled with me stay  in his home for a few days.  Let's
see, who  was that guy?  I  think his name was  Vincent Diepeveen.  Do
you know him?   Please tell your family that  I remember this kindness
and still think of it from time to time.

- Don

   X-Original-To: computer-go@xxxxxxxxxxxxxxxxx
   X-Sender: diep@xxxxxxxxxxxxxxxxx
   Date: Sat, 06 Nov 2004 01:35:28 +0100
   From: Vincent Diepeveen <diep@xxxxxxxxxxxxxxxxx>
   X-Virus-Scanned: by XS4ALL Virus Scanner
   Reply-To: computer-go <computer-go@xxxxxxxxxxxxxxxxx>
   Sender: computer-go-bounces@xxxxxxxxxxxxxxxxx
   X-Scanned-By: MIMEDefang 2.42

   Don, a few things have changed since you left computerchess.

   MTD no longer can get used. Lazy eval gets used by less and less engines in
   the top. If i read the document well Chrilly also can't use it and chrilly
   gives a clear explanation for it. The same big scores from evaluation
   function will let the evaluation fluctuate from move to move.

   Another thing that happened at supercomputers is that in contradiction to
   the 80s, the latency to get a remote byte from a cpu relative to the cpu
   speed has become 100 times bigger.

   So that already ends all plans to use cilk at supercomputers.

   Additional in todays software, a shared hashtable is very important to have. 

   Todays software, because of the improved evaluation function suffers from
   more effects.

   Deepfritz6 at a quad xeon 500Mhz as it was in 1999 running, searched 17 ply
   every move. 

   Todays deepfritz8 in world champs 2003 and 2004 searched 14 ply at a quad
   opteron 2.2Ghz.

   Chrilly basically already answerred why.

   Nullmove nowadays gets used at R=3, usually not near the leafs.

   No one uses cilk in game tree search of course.

   At 19:13 5-11-2004 -0500, Don Dailey wrote:
   >
   >A little more about Cilk, since Vincent grossly mispresented it.
   >
   >Cilk  is really C  with a  few simple  constructs added;  spawn, sync,
   >inlet and abort.   An inlet can catch the result  of a computation and
   >can abort work  inside it.  This mechanism makes  it perfect for alpha
   >beta searching.
   >
   >Any function  that begins  with the keyword  "cilk" can be  spawned in
   >parallel.  The scheduler is based on work stealing.  In cilk you don't
   >think about  the number of  processors, this is abstracted  away.  Any
   >time  a "sync"  keyword is  invoked,  the processor  doesn't idle,  it
   >randomly grabs work from somewhere else.
   >
   >The  beauty of  cilk  is that  it  rarely invokes  overhead for  these
   >parallel constructs.  The cilk  pre-processor constructs 2 versions of
   >each  "cilk" function,  one that  is purely  serial and  knows nothing
   >about the cilk keywords and  associated overheads and the other "slow"
   >version has all  the good stuff for managing  the scheduler.  The vast
   >majority of the time your program  is invoking pure C code without any
   >cilk overhead.
   >
   >Here is an  example of the fibonacci function taken  from the cilk web
   >page:
   >
   >
   >cilk int fib (int n)
   >{
   >    if (n < 2) return n;
   >    else
   >    {
   >        int x, y;
   >
   >        x = spawn fib (n-1);
   >        y = spawn fib (n-2);
   >
   >        sync;
   >
   >        return (x+y);
   >    }
   >}
   >
   >
   >
   >
   >For alpha/beta applications like chess, there is an abort keyword that
   >can be  invoked to stop  work that is  happening, such as is  the case
   >when you get a beta cutoff.   An "inlet" can catch this case and abort
   >all work spawned  inside the parent stack frame.   Basically, you call
   >the inlet (which is a function  inside a cilk function GCC style) with
   >a spawn  statement as an argument  to the inlet  function.  Inside the
   >inlet, you can abort work.
   >
   >This sounds a  lot more complicated than it actually  is.  It is quite
   >trivial to write  an alpha/beta search routine using  cilk without the
   >normal pain  and agony and  debugging using a threading  library (like
   >pthreads) for instance.
   >
   >The scheduler is  efficient and your code is  effecient, you are still
   >programming in  C.  You tell cilk  how many processes you  want to use
   >and cilk will use than many (even  if you are on a serial machine, you
   >can write, debug  and test parallel programs specifying  any number of
   >proceses.)    
   >
   >If you write  a cilk program and  run it as a 1  processor program, it
   >will  run nearly as  fast as  the same  program written  without cilk,
   >usually within 1 or 2 percent slowdown.
   >
   >It's really cool for many parallel programming tasks.
   >
   >Check it out at  http://supertech.lcs.mit.edu/cilk/home/intro.html
   >
   >- Don
   >
   >_______________________________________________
   >computer-go mailing list
   >computer-go@xxxxxxxxxxxxxxxxx
   >http://www.computer-go.org/mailman/listinfo/computer-go/
   >
   >
   _______________________________________________
   computer-go mailing list
   computer-go@xxxxxxxxxxxxxxxxx
   http://www.computer-go.org/mailman/listinfo/computer-go/

_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/

Follow-Ups:
- Re: [computer-go] Hardware-Instruction.
  - From: Gian-Carlo Pascutto

References:
- Re: [computer-go] Hardware-Instruction.
  - From: Vincent Diepeveen

Prev by Date: Re: [computer-go] Hardware-Instruction.
Next by Date: Re: [computer-go] Hardware-Instruction.
Previous by thread: Re: [computer-go] Hardware-Instruction.
Next by thread: Re: [computer-go] Hardware-Instruction.
Index(es):
- Date
- Thread