I have done some empirical tests to determine the speed of
memcpy() on modern processors.
My PC is a 2.8 GHz Pentium with 533 MHz memory.
I have simulated a full-width 10-ply alpha-beta search (the
program does nothing useful; its only purpose is to test the speed of
memcpy()).
The data structure is 968 bytes large (no special meaning,
just an arbitrary number).
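
The test program itself is not shown here, so the following is only a minimal sketch of what such a copy-per-node walk could look like. The names Board, ply_board, walk() and count_copies() are hypothetical, the tree is uniform with a fixed number of moves per node, and there is no evaluation or pruning (a plain full-width walk rather than a real alpha-beta).

```c
#include <string.h>

#define BOARD_BYTES 968   /* structure size quoted in the post */
#define MAX_PLY     10    /* search depth quoted in the post */

/* Hypothetical board type: just an opaque block of bytes. */
typedef struct { unsigned char data[BOARD_BYTES]; } Board;

static Board ply_board[MAX_PLY + 1];   /* one board per ply, root at index 0 */
static unsigned long copy_count;       /* number of memcpy() calls made */

/* Visit every node of a uniform tree with `width` moves per node,
   copying the parent's board before each dummy move. */
static void walk(int ply, int depth, int width)
{
    if (ply == depth)
        return;                                   /* leaf: nothing to do */
    for (int move = 0; move < width; move++) {
        memcpy(&ply_board[ply + 1], &ply_board[ply], sizeof(Board));
        ply_board[ply + 1].data[move % BOARD_BYTES] ^= 1;  /* dummy update */
        copy_count++;
        walk(ply + 1, depth, width);
    }
}

/* Returns the number of copies made for the given depth and width,
   or 0 if the arguments are out of range. */
unsigned long count_copies(int depth, int width)
{
    if (depth < 0 || depth > MAX_PLY || width < 0)
        return 0;
    copy_count = 0;
    memset(ply_board, 0, sizeof ply_board);
    walk(0, depth, width);
    return copy_count;
}
```

Dividing the elapsed time of such a walk by the value returned from count_copies() gives an average time per copy that includes the call and update overhead.
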
One copy takes 0.38 microseconds. These 0.38 microseconds
also include the overhead of the alpha-beta call and a simple update of the
board, but this should be negligible. One can
therefore make > 2.5 million copies/sec. If one wants to write a program with
100 kNodes/sec, copying takes about 38 milliseconds of each second
(assuming one copy per node), so the time to copy
is only a minor fraction of the processing
time.
These 0.38 microseconds correspond to approximately 1 cycle/byte
(at 2.8 GHz, 0.38 microseconds is about 1064 cycles for 968 bytes).
This 1 cycle/byte measure seems fairly independent of the structure
size.
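
As a quick check of the arithmetic, the figures quoted in this post can be recomputed with a few helper functions. This is only a sketch: the function names are made up for illustration, and one copy per node is assumed.

```c
/* Copies per second if one copy takes `copy_us` microseconds. */
double copies_per_second(double copy_us)
{
    return 1e6 / copy_us;
}

/* Fraction of each second spent copying at `nodes_per_sec`,
   assuming exactly one copy per node. */
double copy_time_fraction(double copy_us, double nodes_per_sec)
{
    return copy_us * 1e-6 * nodes_per_sec;
}

/* CPU cycles spent per copied byte at a clock rate of `clock_ghz`. */
double cycles_per_byte(double copy_us, double clock_ghz, double bytes)
{
    return copy_us * 1e-6 * clock_ghz * 1e9 / bytes;
}
```

With 0.38 microseconds per copy, a 2.8 GHz clock and 968 bytes, these give roughly 2.6 million copies/sec, a copy share of about 3.8% at 100 kNodes/sec, and about 1.1 cycles/byte.
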
If one always copies the same structure, the time goes down to
0.20 microseconds. This seems to be the time to copy between two entries in the
first-level cache.
In my setting I assume the copy is mainly a second-level to first-level cache
operation (or the other way round).
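
One way to look at this cache effect is to time memcpy() while cycling through a pool of structures of varying size. The sketch below is an assumption about how such a measurement could be done, not the program used for the numbers above. avg_copy_us() and Board are hypothetical names, and clock() is coarse, so many copies are needed for a meaningful average.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define BOARD_BYTES 968   /* structure size quoted in the post */

/* Hypothetical board type: just an opaque block of bytes. */
typedef struct { unsigned char data[BOARD_BYTES]; } Board;

/* Average microseconds per memcpy() when cycling through `pool_size`
   boards. A pool of 2 keeps source and destination in first-level
   cache; a large pool forces traffic from slower levels.
   Returns -1.0 on bad arguments or allocation failure. */
double avg_copy_us(size_t pool_size, size_t copies)
{
    if (pool_size < 2 || copies == 0)
        return -1.0;
    Board *pool = calloc(pool_size, sizeof *pool);
    if (pool == NULL)
        return -1.0;

    volatile unsigned char sink = 0;   /* keeps the copies observable */
    size_t src = 0;
    clock_t start = clock();
    for (size_t i = 0; i < copies; i++) {
        size_t dst = (src + 1) % pool_size;
        pool[src].data[0] = (unsigned char)i;   /* new data for each copy */
        memcpy(&pool[dst], &pool[src], sizeof(Board));
        sink = pool[dst].data[0];
        src = dst;                     /* next copy starts from this board */
    }
    clock_t end = clock();
    (void)sink;

    free(pool);
    return 1e6 * (double)(end - start) / CLOCKS_PER_SEC / (double)copies;
}
```

Comparing, say, avg_copy_us(2, 10000000) with a pool that is larger than the first-level cache should show the difference between the two cases, although the exact numbers depend on the machine and compiler settings.
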
The times are in good agreement with the numbers given in the
book by R. Booth: "Inner Loops: A Sourcebook for Fast 32-bit Software
Development".
For the test I used the standard library memcpy() function.
One can write a somewhat faster memcpy() with MMX (or floating-point)
assembler statements. But in my experience this only
pays off for first-level to first-level cache operations. In any case it changes
the result by only a few percent.
Best Regards
Chrilly Donninger