[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [computer-go] Re: Sharing Secrets



Jeffrey A Raven wrote:
On Thu, Oct 21, 2004 at 02:17:41PM +0200, John Tromp wrote:

Most programmers conveniently ignore the issue. And they might as well.
Even if they store 2^24 positions in a hashtable, and do a search of 2^24
positions, the collision probability is at most 2^(24+24-64) = 1/65536,
which can be considered negligable.

They will start thinking about the issue in 10 years when they store 2^30
positions and search 2^30, bumping the collision probability to 1/16.
At that time, moving to 96-bit hashes seems the easy solution...

Just a little mathematical warning -- hash clashes are a lot more
likely than you might naively expect, even if your hashes are perfectly
randomly distributed. The classic example of this is the 'Birthday
Problem' that comes up in intro combinatorics classes.

Now this is just a hashing problem wrapped up in everyday terms --
we have a hash space with 365 values, and we're assigning a hash value
(their birthday) to each of the 50 students. The same phenomena occurs
when using a hash space with 2**64 values; inserting 2**24 positions
would have a very low chance of clash, but not nearly as low as you
might first guess (according to a little program I threw together,
the chance of a clash on 2**24 positions is around 7.6 in a million)
-- and adding in another 2**24 positions from a search makes a clash
virtually a certainty (though I haven't set my computer to calculate
the actual percentage, it should be much higher that 1 in 65536).
No Jeff, it really is less than 2^{-16}, under the mere assumption that
the 2^24 zobrist hashes appearing in the search behave randomly.
The birthday paradox is already accounted for.
Just one note: by collision I mean the probability that a hash
in the search matches a stored hash. Each new zobrist hash has a probability of
exactly 2^{-40} of colliding with the store. The probability of any one of these 2^24 bad events occuring is less than 2^24 * 2^{-40} = 2^{-16}

In an actual case, you will start with an empty store, and search 2^24 positions that each get stored. Now the probability of a collision is slightly lower than before, because the probability of occuring in the store rises linearly from 0 to 2^{-40}.

regards,
-John


_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/