[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
RE: computer-go: Authenticating the identity of a remote go-playing computer program
David Stafford wrote:
>I'm afraid I'm stepping into something I'll regret but I don't
>want to let this one pass:
Your knowledge of information theory is commendable. I can't tell you how
many cranks there are out there who think they've developed a compression
algorithm that compresses no matter what the input data is. That being
said, you are stepping into something you perhaps haven't fully thought
through.
>Any signature of N bits can only represent, uniquely, a message
>of N bits. For each additional bit added to the message the
>signature will represent two more possible messages. For
>example, a 32-bit signature would match a random message with a
>probability of one in 4 billion. Longer signatures result in
>smaller probability of a mismatch. The 128-bit MD5 signature
>will match one in 2^128 messages. That is a very low
>probability, for sure, but it doesn't uniquely represent one and
>only one message.
The rub is in that low probability.
David, I am not a cryptographer, but I am probably more of an expert on both
compression and cryptography than anybody on this list. Its my job to work
with both. See http://www.hifn.com. While everything you say about the
counting argument is true, you underestimate its applicability to the
cryptographically HARD process of coming up with an identical MD5 digital
signature for two files.
First of all, I am using the phrase "MD5 digital signature", because the
word "checksum" isn't quite applicable to MD5, even though it is in common
parlance. MD5 is what cryptographers call a one way hash, not a checksum.
The result of all bits in the hash result is dependent on every bit of the
input data in such a way that it is almost impossible to "engineer" input
data to have the same digital signature as another set of input data.
Yes, an MD5 digital signature is of limited size length, so by the counting
argument, there are other inputs that will generate the same MD5 digital
signature.
However, finding one/spoofing one is next to impossible. An MD5 digital
signature is 16 bytes long. That is, 128 bits. Most people have no concept
of how big a number 2^128 is. Let's assume that you took a brute force
approach to finding inout data that matches a given digital signature by
generating random data, MD5 hashing it, and repeating till you found a
match. If you could do this operation 1 billion times times per second on
any given computer, and you used 1 billion computers on each of 1 billion
planets, it'd take on average ~5000 years to find input data with the same
digital signature purely using a brute force approach.
This is true, even given the fact that there are approximately 2^128
possible 32 byte blocks (on average) that match any given MD5 signature.
The counting argument proves that lots of possible blocks exist, however,
finding even one of those 2^128 32 byte blocks in the much larger 2^256
possibility space of a 32 byte block is computationally infeasible.
Using MD5 signatures to uniquely identify and fingerprint programs is
perfectly normal and acceptable cryptographic practice, and if there is
indeed some computationally feasible way to create two different programs
that have the same MD5 signature, almost all modern day crypto-systems would
be broken.
-Scott Dossey
sdossey@xxxxxxxxxxxxxxxxx