
Re: [computer-go] Pattern Matcher



Vincent Diepeveen wrote:
> This connect4 has the most idiotic and slow hashtable implementation i have
> ever seen in my life.
It was optimized for space, so that it may benefit from L2 cache.
It's using only 5 bytes per entry, including perfect collision detection.
How many bytes per entry did yours use?
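
To illustrate what 5 bytes per entry with perfect collision detection can look like, here is a minimal sketch. It is not the code under discussion: the class name CompactTable, the table size, the 49-bit key width and the single-slot replacement policy are all assumptions made for the example. The idea is that index = key % SIZE and lock = key / SIZE together determine the key exactly, so a matching 4-byte lock means the full key matches, and 1 more byte holds the score.

```java
// Hypothetical sketch: 4-byte lock + 1-byte score = 5 bytes per entry.
// Assumes keys are non-negative and below 2^49, and SIZE is large enough
// that key / SIZE always fits in 32 bits.
final class CompactTable {
    static final int SIZE = 1_000_003;               // assumed table size (a prime)
    static final long MAX_KEY = (1L << 49) - 1;      // assumed key range

    private final int[] lock = new int[SIZE];        // quotient key / SIZE
    private final byte[] score = new byte[SIZE];     // 0 means "no entry"

    // Store a nonzero score for key, overwriting whatever was in the slot.
    void store(long key, byte value) {
        int i = (int) (key % SIZE);
        lock[i] = (int) (key / SIZE);                // lossless as 32 unsigned bits
        score[i] = value;
    }

    // Return the stored score, or 0 if this exact key is not present.
    byte lookup(long key) {
        int i = (int) (key % SIZE);
        // (i, lock[i]) identifies the key uniquely, so no false hits occur.
        return lock[i] == (int) (key / SIZE) ? score[i] : 0;
    }
}
```

Two keys that land in the same slot always differ in their quotient, so a lookup can miss but never returns the score of a different position.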

> He's using the SLOWEST possible integer instructions on the processor for
> what is it 5 times or so just for 1 hashtable lookup or so?
It uses 2 modulo operations per lookup.

> modulo and divide are like a 46+ cycles at opteron, and like 200 cycles or
> so at a P4?
What can I say, P4s suck:-(
Alpha CPUs could implement % constant with a 64x64->high64 multiplication
of only a few cycles.
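
For readers who have not seen the trick, here is a hedged sketch of a remainder by a fixed divisor computed with multiplications only, the last one taking the high 64 bits of a 64x64 product. It follows the well-known precomputed-reciprocal scheme for 32-bit operands. The class name FastMod is hypothetical, and this is an illustration of the technique, not what any particular compiler or CPU emits. It relies on Math.multiplyHigh (Java 9 or later) and Long.divideUnsigned.

```java
// Hypothetical sketch: n % d for a fixed positive 32-bit divisor d,
// using a precomputed reciprocal instead of a divide instruction.
final class FastMod {
    private final long d;   // the divisor
    private final long m;   // ceil(2^64 / d), kept modulo 2^64

    FastMod(int divisor) {
        if (divisor <= 0) throw new IllegalArgumentException("divisor must be positive");
        this.d = divisor;
        this.m = Long.divideUnsigned(-1L, divisor) + 1;   // floor((2^64-1)/d) + 1
    }

    // High 64 bits of the unsigned 128-bit product a * b.
    // Math.multiplyHigh is signed, so add the standard correction terms.
    static long unsignedMultiplyHigh(long a, long b) {
        return Math.multiplyHigh(a, b) + ((a >> 63) & b) + ((b >> 63) & a);
    }

    // Remainder of n (interpreted as unsigned 32-bit) divided by d.
    int mod(int n) {
        long lowBits = m * (n & 0xFFFFFFFFL);             // wraps modulo 2^64
        return (int) unsignedMultiplyHigh(lowBits, d);    // the 64x64->high64 step
    }
}
```

Usage would be along the lines of `new FastMod(1_000_003).mod(n)` in place of `n % 1_000_003`; whether it actually beats the hardware divide depends on the CPU and on what the JIT or compiler already does for constant divisors.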

> Additionally why not use a more clever hashtable probing system?
> My connect4 had a more clever one back in 1995 already or so.
Are you all talk and no show?

Present me your code and I'll be happy to compare the two.

> If you can't optimize for such SIMPLISTIC details, you sure must code the
> rest of your life JAVA.
I'd rather leave the compiler to do the optimizations for me, instead of
rewriting my code for every new CPU that comes out.

> Note that this c4 program proofs really nothing. It's just cache trashing
> and main memory trashing.
> 99% of the system time goes to idiotic slow idiv instructions and memory
> lookups. Usually modulo is in hardware casted to idiv (well it is at P4
> where it is casted to floating point unit if i remember well).

Note that Vincent proves really nothing. It's just trash talking.
I can provide not only full source, but profiling data as well:

rank   self  accum   count trace method
   1 56.11% 56.11% 7321056   140 SearchGame.ab
   2 16.33% 72.45% 3779991   166 TransGame.transpose
   3  8.66% 81.11% 3779998   126 TransGame.hash
   4  6.52% 87.63% 2855890   173 TransGame.hash
   5  6.00% 93.62% 2608860    99 Game.makemove
   6  5.79% 99.41% 2608860    79 Game.backmove

regards,
-John

PS: it does seem C compilers have improved lately:
    gcc -O gets
      95994066 pos / 46619 msec = 2059.1 Kpos/sec
    while IBMJava2-142 gets
      95994066 pos / 72105 msec = 1331.3 Kpos/sec
    on solving the position 4443333 on my AMD Athlon(tm) XP 2700+ machine.
    So, C is almost 55% faster. More than the 25% gap I noticed a few years
    back, but nothing like the "more than 200% gap" Vincent would have you
    believe.
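
    The arithmetic behind those figures can be checked with a few lines.
    This is only a sketch for verifying the numbers quoted above; the
    class and method names are made up for the example.

```java
// Hypothetical helper that recomputes the throughput figures in the PS.
final class Throughput {
    // Positions per millisecond is the same number as Kpos per second.
    static double kposPerSec(long positions, long msec) {
        return positions / (double) msec;
    }

    // How many times faster run A is than run B on the same workload.
    static double speedup(long msecA, long msecB) {
        return msecB / (double) msecA;
    }

    public static void main(String[] args) {
        long positions = 95_994_066L;
        long gccMsec = 46_619L;     // gcc -O run
        long javaMsec = 72_105L;    // IBMJava2-142 run
        System.out.printf("gcc:  %.1f Kpos/sec%n", kposPerSec(positions, gccMsec));
        System.out.printf("java: %.1f Kpos/sec%n", kposPerSec(positions, javaMsec));
        System.out.printf("C is %.1f%% faster%n", (speedup(gccMsec, javaMsec) - 1) * 100);
    }
}
```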

_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/