Cluster Supercomputing
>Does anyone care to make some comment about today's (and in next 10 years)
>state of the art super computers? How do they compare with a Pentium 450
>MHz?
I've been working on using a cluster system for go. This means a number of
normal PCs, connected over a LAN, each doing part of the calculation. This
is much cheaper than a parallel machine. Parallel machines are expensive
and complex because they are able to share memory. This is also where
programming gets complex.
But for most tasks you don't need this. My plan is to hand out one
top-level move to each machine that is on the LAN, along with a maxply
and a max branching factor to search to. Each will return the score
estimate for that move, and the program then plays the move with the best
score. While there is still time left for the current move, it will hand
out each of the top-level moves it wants to look at, and once they have all
been done, it will hand them out again but with higher maxply/maxbranch. [1]
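The hand-out loop described above could be sketched roughly as follows.
This is only an illustration, not the actual program: choose_move, the
search_fn callback, the starting limits (maxply 1, maxbranch 2) and the
step of +1 per round are all assumptions, and the searches run one after
another here where the real system would send each one to a different
machine. The clock is also only checked between rounds, which is a
simplification.

```python
import time


def choose_move(moves, search_fn, time_budget, max_rounds=None,
                clock=time.monotonic):
    """Search every top-level move, then repeat with deeper/wider limits.

    search_fn(move, maxply, maxbranch) is a hypothetical callback that
    returns a score estimate; in a cluster it would be a remote request.
    Returns (best_move, best_score, rounds_completed).
    """
    deadline = clock() + time_budget
    maxply, maxbranch = 1, 2      # assumed starting limits
    rounds = 0
    while True:
        # One round: every top-level move searched at the current limits.
        scores = {move: search_fn(move, maxply, maxbranch) for move in moves}
        rounds += 1
        out_of_time = clock() >= deadline
        hit_round_cap = max_rounds is not None and rounds >= max_rounds
        if out_of_time or hit_round_cap:
            break
        # Time is left, so hand the moves out again, deeper and wider.
        maxply += 1
        maxbranch += 1
    best = max(scores, key=scores.get)
    return best, scores[best], rounds
```

Only the scores from the last completed round are used to pick the move,
on the assumption that the deepest search gives the most trustworthy
estimate.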
Each search is 0.1 seconds to 2 seconds [2], so this is coarse-grained
parallelism. There is no memory sharing between machines - they each keep
their own caches. The only communication is the central control machine
sending out search requests and the actual move played, and the score
estimates coming back.
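To give an idea of how little data that is, here is one possible encoding
of the messages. The wire format is purely an assumption (one JSON object
per line); the message type names "search", "score" and "played" and the
field names are made up for the example.

```python
import json


def encode(msg_type, **fields):
    """Encode one message as a single newline-terminated JSON line."""
    payload = {"type": msg_type, **fields}
    return (json.dumps(payload, sort_keys=True) + "\n").encode("utf-8")


def decode(line):
    """Decode one line produced by encode() back into a dict."""
    return json.loads(line.decode("utf-8"))


# Controller -> worker: search this top-level move to these limits.
search_request = encode("search", move="D4", maxply=3, maxbranch=5)
# Worker -> controller: the score estimate for that move.
score_reply = encode("score", move="D4", score=12.5)
# Controller -> all workers: the move that was actually played.
move_played = encode("played", move="D4")
```

Each message is a few dozen bytes, against 0.1 to 2 seconds of
calculation per search.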
Because communication is so small compared to the size of each calculation,
I can be lazy. I can use cheap ethernet cards to connect machines, and I'm
currently using Python to handle all the client/server stuff. Python is
good because the sockets are built-in and easy to use, and because it is
portable between Linux and NT, so I can use a mix of machines on the LAN.
The plan has been to prototype with Python, then replace it with C++ once
the design becomes stable. However, a number of cluster supercomputers are
using Python, including the one that recently set a record [4]. So maybe
I'll just leave it that way.
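For a feel of how small the client/server part can be with Python's
standard socket and socketserver modules, here is a minimal sketch. It is
not the actual code: evaluate is a placeholder for the real search, the
JSON-line format and the names SearchHandler, start_worker and
request_score are assumptions, and it opens a new connection per request
where a real controller would probably keep one open per machine.

```python
import json
import socket
import socketserver
import threading


def evaluate(move, maxply, maxbranch):
    """Hypothetical placeholder for the real go search on a worker."""
    return float(len(move) + maxply + maxbranch)


class SearchHandler(socketserver.StreamRequestHandler):
    """Worker side: read one JSON request per line, reply with a score."""

    def handle(self):
        for line in self.rfile:
            req = json.loads(line)
            score = evaluate(req["move"], req["maxply"], req["maxbranch"])
            reply = {"move": req["move"], "score": score}
            self.wfile.write((json.dumps(reply) + "\n").encode("utf-8"))


def start_worker(host="127.0.0.1", port=0):
    """Start a worker in a background thread; port 0 picks a free port."""
    server = socketserver.ThreadingTCPServer((host, port), SearchHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_score(address, move, maxply, maxbranch):
    """Controller side: send one search request and wait for the score."""
    with socket.create_connection(address, timeout=5) as sock:
        with sock.makefile("rwb") as stream:
            req = {"move": move, "maxply": maxply, "maxbranch": maxbranch}
            stream.write((json.dumps(req) + "\n").encode("utf-8"))
            stream.flush()
            return json.loads(stream.readline())["score"]
```

On one machine this can be tried with server = start_worker() followed by
request_score(server.server_address, "D4", 3, 5); on a LAN each worker
would bind to its own address and a fixed port.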
I can have 8 x PII-350MHz machines for less than a million yen (8300
dollars). In effect, a PII-2800MHz. :-) [3]
You might ask why bother? An 8x speedup is not that significant for go.
Partly I'm after the experience as this will be useful in other projects.
But also, having that speed allows me to try out ideas that previously got
dismissed as too slow. I can also develop quicker as I don't have to spend
60% of my time working on speed optimizing before I dare run an algorithm.
Darren
[1]: This iterative deepening/widening algorithm was the one I used at the
recent FOST cup, on a single machine, and I was quite pleased with it.
[2]: From FOST'98 version timings on a P-200. This could change an order of
magnitude in either direction as I work on the code :-).
[3]: Actually I'm thinking of grabbing old Pentiums (166s and 200s) as
people upgrade.
[4]: http://cnls.lanl.gov/avalon/avalon_bell98/
(Sustained 10Gflops for $150,000)