[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [computer-go] Hardware-Instruction.



> Cilkchess  use  the "younger  brothers  wait"  algorithm,  which is  a
> programming choice,  you might do  it differently.  Basically,  at any
> node you search  the leftmost branch SERIALLY, (no  spawn and no wait)
> and if that doesn't give you a beta cutoff, you search all the sibling
> nodes using  spawn calls  inside an "inlet."   If ANY of  the children
> return with a beta cutoff, it  is caught by the inlet which aborts the
> work being done in the  parent stack frame (all those parallel spawned
> siblings  at the  current node.)   It is  very simple,  recursive, and
> efficient.


Whenever  a cilk  program encounters  a sync  command, the  process of
course doesn't just sit idle waiting.  It immediately "steals" a piece
of work.  It  randomly steals the oldest piece of  work from a process
chosen  at random  which  MIT  proved is  efficient.   Every time  you
"spawn" work it is put on a  queue to be scheduled.  MOST of the time,
unless you have  trillions of processors, the work  gets stolen by the
same process that scheduled it and  so the program acts the same as if
it was just a serial program.  When this happens, which is most of the
time, the code is exactly the same  as a low level C program with none
of the parallel overhead.

The vast majority of the time, your program is operating in full speed
serial mode.

It would  be very easy to  use Cilk in  a GO program.  You  can choose
which functions to spawn in  parallel, you can make decisions to abort
work if it  is determined to be uneeded, etc.   You might use separate
processes to do local life/death searches.

Parallel alpha/beta  searching involves speculation and  you can never
get full  utilization of more  than 1 processor because  alpha/beta is
inherently a  serial algorithm.   If you knew  in advance  which nodes
would return  beta cutoffs,  you wouldn't need  to search them  in the
first place.   But there are  some applications which can  take almost
complete advantage  of extra processes.  In such  applications, a cilk
program shows  almost zero overhead.   100 processors will give  you a
100X speedup  over a pure  serial program.  That's  why I say  Cilk is
efficient, there is  not a lot of high level  overhead of managing the
scheduling and parallelism.  This is true even when the computation is
fine grained.  Whatever limits there are to this also apply when using
pthreads, MPI or whatever.

- Don


_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/