[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [computer-go] Hardware-Instruction.
> Cilkchess use the "younger brothers wait" algorithm, which is a
> programming choice, you might do it differently. Basically, at any
> node you search the leftmost branch SERIALLY, (no spawn and no wait)
> and if that doesn't give you a beta cutoff, you search all the sibling
> nodes using spawn calls inside an "inlet." If ANY of the children
> return with a beta cutoff, it is caught by the inlet which aborts the
> work being done in the parent stack frame (all those parallel spawned
> siblings at the current node.) It is very simple, recursive, and
> efficient.
Whenever a cilk program encounters a sync command, the process of
course doesn't just sit idle waiting. It immediately "steals" a piece
of work. It randomly steals the oldest piece of work from a process
chosen at random which MIT proved is efficient. Every time you
"spawn" work it is put on a queue to be scheduled. MOST of the time,
unless you have trillions of processors, the work gets stolen by the
same process that scheduled it and so the program acts the same as if
it was just a serial program. When this happens, which is most of the
time, the code is exactly the same as a low level C program with none
of the parallel overhead.
The vast majority of the time, your program is operating in full speed
serial mode.
It would be very easy to use Cilk in a GO program. You can choose
which functions to spawn in parallel, you can make decisions to abort
work if it is determined to be uneeded, etc. You might use separate
processes to do local life/death searches.
Parallel alpha/beta searching involves speculation and you can never
get full utilization of more than 1 processor because alpha/beta is
inherently a serial algorithm. If you knew in advance which nodes
would return beta cutoffs, you wouldn't need to search them in the
first place. But there are some applications which can take almost
complete advantage of extra processes. In such applications, a cilk
program shows almost zero overhead. 100 processors will give you a
100X speedup over a pure serial program. That's why I say Cilk is
efficient, there is not a lot of high level overhead of managing the
scheduling and parallelism. This is true even when the computation is
fine grained. Whatever limits there are to this also apply when using
pthreads, MPI or whatever.
- Don
_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/