[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: computer-go: Authenticating the identity of a remote go-playing computer program



On Mon, Dec 11, 2000 at 01:43:39AM -0500, Don Dailey wrote:
> 
>    From: William Harold Newman <william.newman@xxxxxxxxxxxxxxxxx>
> 
>    A modern microprocessor has such a complicated state (especially
>    including the state of its cache) that unless you go to considerable
>    trouble to initialize it to a known state, the time spent in a
>    calculation is not deterministic. And using any typical OS (including
>    MS Windows, almost any Unix, or the Mac OS) makes the calculation time
>    even less deterministic, because of the way that the OS implements
>    multitasking, virtual memory, and other things.
> 
>    If a program checks the wall clock to determine how much calculation
>    to do, it will still be deterministic in the sense you refer to above.
>    That kind of determinism is useful for some purposes (like debugging).
>    However, it's not enough to prevent cheating. For example, I can write
>    a program which cycles through the most plausible moves, changing its
>    mind every millisecond or so. When a move comes up which I like, I
>    make a fake record of the clock running out. Voila.
> 
>    If you have a proposal for preventing this without making it
>    impossible for programs to use real-time clocks for time management,
>    bring it on. Or you can try to forbid the use of real-time clocks for
>    time management, but I'm afraid that many programmers would be very
>    unhappy with this. (I would certainly be unhappy with it.)
> 
> I never claimed it was impossible to incorporate non-randomness into a
> program,  only  that I  don't  think it  "limits"  a  program.  It was
> earlier implied that   determinism  was a limiting    requirement that
> prevented sophisticated AI techniques from being used.  
> 
> In earlier emails I argued that it  was worth the  time to design your
> program to be deterministic.  The whole  thrust of my argument is that
> you  don't have  to  weaken your  program in  any   significant way to
> accomplish good solid deterministic behavior.
> 
> If  it  turns out that  I  am wrong about  this,  it  won't be because
> someone could not  use a real time clock.   I'm not forced to use YOUR
> exact time management  technique any more than  I  am forced to  use a
> true random  number   generator  instead  of a psuedo   random  number
> generator just because you say so.
 
I'm not claiming that a "deterministic" program can't do simulated
annealing and machine learning and so forth. (I vaguely remember that
someone else may have claimed that, but I definitely remember that it
wasn't me.:-) I did my Ph. D. thesis on Monte Carlo simulations of
quantum mechanics, and I learned a thing or two about PRNGs along the
way. I even went to some trouble to make my Monte Carlo programs as
deterministic as possible, in part for debugging. My argument here is
only about the difficulty of making a program's time consumption
deterministic.

Is it fair to read into your three paragraphs above that you still
propose to forbid the use of real-time clocks for time management?

I don't really understand your third paragraph: "I'm not forced to use
YOUR exact time management technique", etc. I understand your point
that for the kinds of programs you write, time management is trivial.
The point I was trying to make is that if you force US to use your
time management techniques, WE'll be unhappy. I hope to convince you
that while ignoring the real time clock may work well for your
programs, it does not necessary work well for other people's programs.

> For any useful time management scheme you can  dream up, I can provide
> a close deterministic  approximation.   

It's not clear to me how you can expect to do this for everyone's
program. (Perhaps you aren't claiming to do this for everyone's
program, only for your own program. In that case, you can ignore the
next few paragraphs.)

To pick one of many issues, my Go program is written in Common Lisp.
The Lisp implementation that I use (http://sbcl.sourceforge.net/) uses
a conservative garbage collector. It's possible to construct
pathological cases where the conservative nature of the GC causes
arbitrarily large amounts of extra memory to be allocated without
being freed, causing performance to drop. In extreme cases, the
performance drop could be quite large as swapping occurred. It would
be extremely difficult to try to predict all the input values which
could cause this to happen.

(Related problems can occur in many other systems. If a program uses
standard dynamic memory allocation (C malloc(), C++ operator new,
etc.) its theoretical worst case performance is usually considerably
worse than most people realize, due e.g. to fragmentation. There's a
reason why hard-real-time programmers avoid dynamic memory
allocation..)

My time management scheme is basically always to have in mind a best
move so far, and then to terminate calculation and use the best move
so far when the wall clock time gets close to the time limit.

Your first impulse might be to tell me to run the program on typical
test data and use the average time spent as an estimate of the time
spent per X (e.g. per ply of search, or per pattern checked) in future
searches, then terminate the search when the count of Xs gets large
enough. In that case, let me remind you that computer Go tournaments
are a great environment for exposing your program to new bizarre
patterns of play you never tested against, so your estimated time per
X might not be nearly as close an approximation as you expect.

This seems to me to be a design choice between open loop and closed
loop, and it seems to me that the closed loop design (checking the
clock, instead of estimating time consumption by dead reckoning) is
more accurate and straightforward.

> By the way, even though your email is mostly correct (but not relevant
> to my points)  you made a  big error and then  used it to justify your
> point which was  not fair.  You talked  about "making a fake record of
> the clock running out", but you are not allowed to  wait until AFTER a
> move is played and then  make up arbitrary data  and claim it was part
> of the  input state.  If your program  is deterministic,  which is our
> premise, then the output is defined soley  from the input state, which
> cannot be contrived after the output happens.  Your program is allowed
> to publish it's input  state as long as  it is published before it has
> knowledge   of  the opponents move  (actually   even this  is a little
> dangerous if the move is easy to anticipate.)

(If you are still seriously proposing to forbid input from the real
time clock, then the next two paragraphs are irrelevant.)

I'm not talking about waiting until the move is played. I'm talking
about watching the state of the program as it's thinking (before the
move is played), and intervening when it gets to a state that I like.
I don't even need to write a fake record; I can e.g. suspend the
process which is running my program, so that its best move so far is
frozen until time runs out. If I have thirty seconds per move, and I
suspend the process within 0.5 second of the true stop time, it will
be rather difficult to distinguish the game record from what you'd see
with typical random fluctuations.

If this is impossible under your anti-cheating scheme, or if your
scheme is intended for something other than preventing this kind of
cheating, then perhaps you can describe your scheme more carefully, or
describe the problem you're trying to solve more carefully, or refer
me to the message ID of the message where I should go back and reread
the description more carefully.

>    Third, the "there's an engineering solution to every problem" approach:
>      If you want to do something which might be in some sense
>      straightforward but which is probably nontrivial enough to be
>      well worth publishing, then
>      (a) Write a Go program which plays a good enough game of Go to 
> 	 do respectably well in a serious tournament, despite (i)
> 	 being strongly deterministic (i.e. no real-time input) and
> 	 (ii) being specified in a language with sufficiently nice formal
> 	 properties that you can do the zero-knowledge proof below.
>      (b) Before each tournament game, submit a cryptographically strong
> 	 hash of the program and its starting state.
>      (c) After the tournament, participate in a zero-knowledge proof
> 	 that your program has the given hash and produces the given
> 	 output.
>      But until you've demonstrated the practicality of doing this, no
>      stampede of others doing it is likely to start. (And even once
>      you've demonstrated that this can be done, don't hold your breath
>      waiting for the stampede to start!)
> 
> Which is what every one of us has already said.

I thought it was worth restating my view of the problem. Sometimes
people end up talking past each other because they're not talking
about the same thing. Also, as far as I had noticed, my condition
(a)/(ii) hadn't appeared anywhere earlier in the discussion. It may be
so obvious to you that it just didn't occur to you to point it out,
but I doubt that every single person reading this discussion realizes
how difficult it is to prove things about a program written in a
normal programming language. So while this was completely redundant
for you, I'd thought it might not be completely redundant for every
one of us.

> Essentially, you described  the engineering approach  accurately.

Thank you.

> A few people seem to  feel threatened somehow  by  this idea, as  I've
> seen some critical,  negative and  dismissive  emails on this.    It's
> certainly  not anything worth   taking seriously, it's  just a talking
> point.

I think it's a neat idea actually, and worth taking seriously. Whether
or not it's practical for computer Go tournaments, something like this
is probably very useful somewhere. However, I think it's much more
difficult than some people let on, so much so that it's not realistic
for Go tournaments in the next few years.

(And now that I think about it, it occurs to me that there might even
be some devious way of generalizing this scheme to allow limited real
time input without enabling cheating, perhaps starting by making the
program expose its current choice of move at all times. But my guess
is that even if that could be done, this idea is still unrealistic in
the near term.)

-- 
William Harold Newman <william.newman@xxxxxxxxxxxxxxxxx>
software consultant
PGP key fingerprint 85 CE 1C BA 79 8D 51 8C  B9 25 FB EE E0 C3 E5 7C