
computer-go: FPGA



Recently I tried to write code for local tactical search. To improve the 
search results, the code has become more and more complicated and takes more 
and more time. It can now search only about 20 end nodes per second on a 
233 MHz machine, and it still needs more code to produce the desired results. 
It seems to me that a hardware improvement is needed to produce a good program.
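
To put the current speed in perspective, here is a rough back-of-envelope 
sketch. Only the 233 MHz clock and the 20 end nodes per second come from the 
figures above; the function names and the 1000-node search size are 
hypothetical, chosen only for illustration.

```python
# Rough budget for the software-only search.
CPU_HZ = 233e6      # host clock quoted above
NODES_PER_S = 20    # measured search speed quoted above


def cycles_per_node(clock_hz, nodes_per_s):
    """Clock cycles spent on each end node at the given search rate."""
    return clock_hz / nodes_per_s


def seconds_for_search(end_nodes, nodes_per_s):
    """Wall-clock time for a search that visits `end_nodes` end nodes."""
    return end_nodes / nodes_per_s


print(cycles_per_node(CPU_HZ, NODES_PER_S))    # 11650000.0
print(seconds_for_search(1000, NODES_PER_S))   # 50.0 (1000 nodes is a made-up size)
```

So each end node costs on the order of ten million clock cycles, and even a 
small local search takes the better part of a minute.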

The options for better hardware are the following:

1. Access to a supercomputer. Today's supercomputers use a parallel 
structure. Unfortunately, adapting the search algorithm to a parallel 
computer is not a straightforward matter. 

2. Use a PC cluster. This is a very realistic possibility, but it has the 
same problem as option 1 plus more. One problem that probably goes unnoticed 
is the electricity bill: ten PCs could draw 2 kW.

3. Specialized hardware. What is accessible to us is the FPGA. I did some 
reading on this subject and will make some comments here.

Specialized hardware based on an FPGA is quite possible. For us it will 
realistically take the form of a PCI (or ISA) card. Because of this, the 
speed gained from the FPGA must compensate for the speed lost in the 
communication between the computer and the card. Furthermore, because the 
search program is so complicated, the FPGA can only be used to do part of the 
calculation. Using this approach, I estimate that the search speed can 
probably be increased to 10,000 end nodes per second using a 250 MHz FPGA. To 
achieve this speed, one must take advantage of parallel processing in the 
FPGA. I did a simple layout in VHDL, and it seems to work out OK.

The resources it requires are comparable to those of the high-end FPGAs on 
the market. However, whether this layout can be realized in a commercial FPGA 
is still unknown, even though the number of gates available is sufficient. 
For example, to use parallel processing one needs a RAM that is only one word 
deep. For a 1 KB memory this is a 1 x 8000 RAM, that is, a single word about 
8000 bits wide. One way to create such a RAM in an FPGA is to use latches, 
which would take 8000 latches. In Altera's PLDs one latch takes one logic 
unit, and only devices with more than 1 million gates provide that many logic 
units. Even if the number of logic units is sufficient, it is still an open 
question whether the 8000 latches can be routed in the device.

There are also financial obstacles. One needs to purchase 
manufacturer-specific software to do the layout. For Xilinx such software 
costs $3500 per year. Altera is more customer-friendly in this respect, but 
one still needs to purchase software to do layout for million-gate PLDs. Then 
there is the cost of implementing the FPGA on a PCI card. Many companies will 
do this for you, but for a big charge (probably more than $10,000). The total 
cost of implementing such a PCI card with an FPGA could run as high as 
$20,000 to $30,000. Unless one can find some funding, this is difficult for 
an individual to do. 
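
The trade-off between the FPGA's speed gain and the communication loss, and 
the latch count for the one-word RAM, can be sketched with a simple model. 
This is only an illustration under stated assumptions, not the VHDL layout 
itself: the 20 end nodes per second is the measured figure, while the offload 
fraction, the FPGA speedup factor, and the per-node transfer time are 
hypothetical parameters with made-up values.

```python
# Simple timing model for moving part of the per-node work onto a card.
T_NODE_S = 1.0 / 20   # software-only time per end node (20 nodes/s, measured)


def offloaded_nodes_per_s(t_node_s, offload_fraction, fpga_speedup, transfer_s):
    """End nodes per second when part of the work runs on the FPGA.

    t_node_s:         software-only time per end node, in seconds
    offload_fraction: share of that time moved to the FPGA (hypothetical)
    fpga_speedup:     how much faster the FPGA runs that share (hypothetical)
    transfer_s:       host <-> card communication time per node (hypothetical)
    """
    host_time = (1.0 - offload_fraction) * t_node_s
    fpga_time = offload_fraction * t_node_s / fpga_speedup
    return 1.0 / (host_time + fpga_time + transfer_s)


def latches_for_one_word_ram(size_bytes):
    """Latches needed when the whole memory is one wide word (one per bit)."""
    return size_bytes * 8


if __name__ == "__main__":
    # Illustrative values only: 1000x speedup on the offloaded part, 10 us transfer.
    for fraction in (0.9, 0.99, 0.999):
        rate = offloaded_nodes_per_s(T_NODE_S, fraction, 1000.0, 10e-6)
        print(f"offload {fraction:.3f}: about {rate:.0f} end nodes/s")
    # 1 KB counted as 1000 bytes and as 1024 bytes
    print(latches_for_one_word_ram(1000), latches_for_one_word_ram(1024))  # 8000 8192
```

In this model the part left on the host dominates: with 90% of the work 
offloaded, the rate stays below 200 end nodes per second however fast the 
FPGA is. Reaching 10,000 end nodes per second means about 100 microseconds 
per node in total, so the host share plus the transfer time must shrink to 
roughly 0.2% of the present 50 ms. Counting 1 KB as 1024 bytes gives 8192 
latches rather than the rounded 8000.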


Dan Liu