
Re: Sharing Secrets (was: [computer-go] Computer Go hardware)



Frank wrote:
> The n00bies simply have never heard of Hamming distance

Unless they have studied coding theory, I see no particular reason why
they should have. Actually, someone will have to enlighten me as to
why anyone would care much about Hamming distance in the context of
Zobrist hashing at all.


The Hamming distance is the number of bits differing between two
binary strings of the same length. For example, 10110010 and 10010011
have a Hamming distance of 2. This is a concept of fundamental
importance in coding theory, where one wants to be resilient to
transmission errors corrupting single or multiple bits. Assuming, for
example, that we have a set of codewords with a minimum pairwise
Hamming distance of 5 and that at most two bits of any single codeword
will be corrupted, we can correct the errors by finding the codeword
with the smallest Hamming distance to the received word.
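In code terms (a minimal Python sketch; the function names are my own,
not from any particular library):

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

# The example above: 10110010 and 10010011 differ in exactly two bits.
assert hamming(0b10110010, 0b10010011) == 2

# Nearest-codeword decoding: with a minimum pairwise distance of 5,
# a word with at most two corrupted bits is still strictly closest
# to its original codeword.
def decode(word: int, codewords: list[int]) -> int:
    return min(codewords, key=lambda c: hamming(word, c))
```

For instance, with the codewords 00000 and 11111 (distance 5),
corrupting two bits of 00000 into 00011 still decodes back to 00000.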


The collision problem in Zobrist hashing is essentially a question of
whether there is a set of Zobrist numbers Z1, Z2, ..., Zn, such that

(1) Z1^Z2^...^Zn=0,

i.e. they XOR together to 0, and

(2) the set corresponds to stones which are likely to occur "together"
    in the positions we are storing.

To save some typing, let's call a set with property (1) a vanishing
set and a set with property (2) a significant set.
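To make (1) and (2) concrete, here is a small Python sketch (toy board
coordinates of my own invention, with a vanishing set planted by hand)
showing how a vanishing set that is also significant produces a hash
collision:

```python
import random

random.seed(1)  # fixed seed, purely so the sketch is reproducible

# One 64-bit Zobrist number per (point, color) pair on a toy 9-point board.
Z = {(p, c): random.getrandbits(64) for p in range(9) for c in "BW"}

def zobrist(position):
    """Hash a position (a collection of stones) by XORing its numbers."""
    h = 0
    for stone in position:
        h ^= Z[stone]
    return h

# Plant a vanishing set of size 3:
# now Z[(0,"B")] ^ Z[(1,"B")] ^ Z[(2,"B")] = 0.
Z[(0, "B")] = Z[(1, "B")] ^ Z[(2, "B")]

# Two different positions whose stone sets differ exactly by the
# vanishing set hash to the same value -- a collision.
pos_a = [(0, "B")]
pos_b = [(1, "B"), (2, "B")]
assert zobrist(pos_a) == zobrist(pos_b)
```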

Obviously a small vanishing set tends to be more likely to also be a
significant set than a large vanishing set, so the first priority
should be to avoid small vanishing sets, unless you have very specific
knowledge about the distribution of significant sets.

Some elementary observations:

(a) A vanishing set of size 1 means that we have some Zk=0, which is
almost certainly disastrous: adding or removing that stone would not
change the hash at all.

(b) A vanishing set of size 2 means that we have some Zi=Zj or in
other words two Zobrist numbers of Hamming distance zero.

(c) A large minimum Hamming distance between a set of words does not
imply that there is no small vanishing subset among them. E.g. the
hexadecimal numbers FFFF0000, 0000FFFF and FFFFFFFF have an internal
minimum Hamming distance of 16 (out of 32 bits) and can easily be
complemented by a fair number of other numbers keeping a reasonably
high minimum Hamming distance. Still the first three numbers form a
vanishing set.
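The claim in (c) is easy to check numerically (a quick Python sketch):

```python
from itertools import combinations

def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

words = [0xFFFF0000, 0x0000FFFF, 0xFFFFFFFF]

# Minimum pairwise Hamming distance is 16 out of 32 bits...
min_d = min(hamming(a, b) for a, b in combinations(words, 2))
assert min_d == 16

# ...and yet the three words XOR to zero, i.e. they form a vanishing set.
x = 0
for w in words:
    x ^= w
assert x == 0
```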


It's clear that we don't want to have a Hamming distance 0 between any
two Zobrist numbers. But why should it make a difference whether the
minimum Hamming distance is 1 or 7? As far as I can tell the idea of
random bit errors in coding theory does not have very much in common
with collisions in Zobrist hashing. Is there some subtle correlation
between increasing the minimum Hamming distance and increasing the
minimum size of vanishing sets that I fail to see? Or have I missed
something more important?

/Gunnar
_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/