[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: computer-go: Huima's hash function



Jeffrey,

You  are  describing Zobrist  hashing.   This  is  both efficient  and
produces extremely well distributed  keys.  This of course assumes the
random number table is composed of high quality random numbers.

What makes this hashing function extremely appealing is the ability to
incrementally update the current key (or position hash.)  When a stone
is  placed on  an intersection,  they NEW  key is  computed  by simply
applying a  single xor operation  against the PREVIOUS key  to produce
the new key.  There is no need to pass over each intersection to build
a new key from scratch.

When captures are executed,  it's slightly more complicated, but still
very fast compared to hashing the  whole board.  Since XOR is it's own
inverse, applying xor  the second times always cancels  the first.  So
the  operation is  to simply  xor all  the intersections  that changed
state with  the coresponding table  value.  To put it  simply, whether
you remove a stone, or add a stone, you do the same operation, xor the
current  hash key  with the  table value  indexed by  intersection and
stone color.    

To make it crystal clear, the c code might look like this:

   new_key = old_key ^ random_table[color][intersection];



As far  as the  random number  table is concerned,  you can  use truly
random bits or you can use  a good pseudo random number generator like
ones that  comes with the  c library.  There  is a way to  prepare the
table  to ensure  that it  produce good  key distributions  which I'll
describe now for anyone interested:

What you  want to do  is maximize the  hamming distance between  any 2
numbers in  the table.  The  hamming distance between 2  numbers (with
respect to bits) is simply a count  of how many bits are in common, or
have the  same value whether 0  or 1.  One  way to improve this  is to
make many  passes over the  table, setting some hamming  distance goal
that you want to equal or  better.  For each element in the table, see
if there is another element  with a hamming distance smaller than your
goal.   If there  is,  generate a  new  random number  to replace  the
offender, and start all over  again (with the same hamming goal) until
you  have built  a  table with  no  2 elements  having  less than  the
hamming distance goal.

Once you have this table,  then increment your "hamming distance goal"
and try again to generate an  even better table.  You can give up when
you run  out of patience because  it's taking too long  to improve the
table.  It will of course take  longer and longer to get better tables
as the hamming distance goal becomes tougher.

In principle  doing the hamming thing improves  your hashing function,
but in practice, I have never actually been able to really see it make
any real difference.  The difference you would notice, of course, is a
reduction in  the number  of hash table  collisiions.  If  your random
number generator  is excellent,  you probably have  little to  gain in
practice (except to have a little fun) by doing the hamming excercise!

Don











   From: "Jeffrey Rainy" <jrainy@xxxxxxxxxxxxxxxxx>
   Date: Thu, 12 Sep 2002 20:16:22 -0400
   Content-Type: text/plain;
	   charset="iso-8859-1"
   X-Priority: 3
   X-MSMail-Priority: Normal
   X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000
   Sender: owner-computer-go@xxxxxxxxxxxxxxxxx
   Precedence: bulk
   Reply-To: computer-go@xxxxxxxxxxxxxxxxx

   I didn't read the article you mention, and the only link I found to it was
   broken, so, please take the following with a grain of salt. I might not know
   what I'm talking about. ;-)

   What are the desirable properties of your hash function ?

   If you want efficiency(as in easy to calculate) and minimal overlap of board
   position to hash values, I would suggest :

   Having two tables of (19x19) values in the range that pleases you ( say 32
   bits ). One table will be labeled Black and the other White.  Initialise
   both tables with random values.

   To evaluate the hash value, start with a value of 0, and for each point on
   the board where there is a stone, XOR your current value, with the
   corresponding value of the Black or White table according to the stone
   color.  By corresponding value, I mean to index into the table with the
   stone position.

   I think this hash would spread evenly the board position into the hash
   space, and that you can't do much simpler.  It requires 19x19x2 values to be
   stored, but you need only one copy of that.

   You might consider orienting/flipping the board so that identical rotated
   position map to the same hash, but I don't think this would be very useful.

   Does this hash suffer from any defects I don't know about ?

   Jeffrey.

   ----- Original Message -----
   From: "Antoine de Maricourt" <antoine.de-maricourt@xxxxxxxxxxxxxxxxx>
   To: <computer-go@xxxxxxxxxxxxxxxxx>
   Sent: Thursday, September 12, 2002 16:36
   Subject: computer-go: Huima's hash function


   > Hi,
   >
   > did somebody already used Huima's hash function [1] in pratical
   application?
   > or study in more depth the statistical properties of this function?
   >
   > It seems to me that this function has some very undesirable properties,
   > that is: some different positions map to the same hash value in a
   > predictible way that does not depend from the random generator.
   >
   > Huima points it out at the very end of his paper, and suggests the use
   > of addition modulo 256 to combine values instead of xor, but
   > unfortunately this do not solve the problem.
   >
   > [1] A Group Theoretic Zobrist Hash Function
   >