
Re: [computer-go] SlugGo vs Many Faces, newest data



Don, at 22:16 08/09/2004, you wrote:



> Don, I think your observations about new ideas giving odd results at
> first must be due to some psychological factor.  Perhaps you tend to
> remember the surprising results, or perhaps you test a lot of ideas that
> don't lead to improved performance, but quickly dismiss those that don't
> get off to a good start in the tests.

You are of course implying that I have a gambler's mentality.  I think
you may not be a very good psychologist :-)
I don't mean to imply that at all, but you have made a very accurate assessment of my psychological abilities ;). A lot of statistical results are counterintuitive, and most untrained people don't run well-designed experiments. I was speculating that you might be one of those; however, the following description does suggest that your procedure is good.

Here is how I do my
testing.  Before I start a test I determine in advance what my
stopping rule is and what result I must achieve to keep a change.  The
stopping rule is usually a fixed number of round robin matches between
2 or more versions of the program.  I never stop early unless I find a
bug while testing, in which case I restart after fixing.
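
A minimal sketch of what fixing the acceptance threshold in advance could look like, assuming a head-to-head match with no draws and a null hypothesis that the changed version wins each game with probability 0.5. The function names and the 5% significance level are illustrative assumptions; the message above does not say how the threshold is actually chosen.

```python
from math import comb

def binom_tail(games, min_wins, p_win=0.5):
    """P(at least min_wins wins in `games` independent games)."""
    return sum(comb(games, k) * p_win**k * (1 - p_win)**(games - k)
               for k in range(min_wins, games + 1))

def min_wins_to_keep(games, alpha=0.05):
    """Smallest win count whose chance under 'no improvement' is <= alpha.

    Decided before the match starts, so the result cannot influence it.
    Returns None if even a clean sweep is not rare enough.
    """
    for wins in range(games + 1):
        if binom_tail(games, wins) <= alpha:
            return wins
    return None

threshold = min_wins_to_keep(10)  # hypothetical 10-game match
print(threshold)
```

The point of the design is only that `games` and `alpha` are chosen before any game is played and the match always runs to completion.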



> ... or perhaps you test a lot of ideas that don't lead to improved
> performance, ...

Why should I be exempt?  This is what engineering is all about.
I think your editing of my sentence is a bit unfortunate here: you chopped out the important part. That, together with your comments, makes it look as though I was belittling your coding, which I wasn't. In general, the better a programme performs, the harder it is to improve.

You have ruled out most of my ideas for why you could say
'So if I do something to my program and it tests 9-1, for instance, I just laugh.'
If you haven't succeeded in improving your code, then the probability of it testing 9-1 or 10-0 is at most 11/1024, which is about 1/93. Of course, this does not mean that, given these results, the probability that you have improved the programme is 92/93. But to laugh at such a result, I would think you would have to put the prior chance of improvement at maybe only 1/200. Could this be the explanation? Is my psychological ability letting me down when I interpret your laughter? Maybe I am misinterpreting something here.
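
The arithmetic can be checked with a short sketch. It computes the chance of 9-1 or better when each game is an independent coin flip, and then shows why that figure is not the same as the probability of improvement. The 1/200 prior is the guess from the paragraph above; the 0.7 win rate for a genuinely improved version is a purely hypothetical assumption, as are the function names.

```python
from math import comb

def binom_tail(games, min_wins, p_win):
    """P(at least min_wins wins in `games` independent games)."""
    return sum(comb(games, k) * p_win**k * (1 - p_win)**(games - k)
               for k in range(min_wins, games + 1))

def posterior_improved(prior, p_improved, games=10, min_wins=9):
    """Bayes' rule: P(improved | result at least this good).

    Assumes only two cases: unimproved (wins with probability 0.5)
    or improved (wins with the hypothetical probability p_improved).
    """
    like_improved = binom_tail(games, min_wins, p_improved)
    like_null = binom_tail(games, min_wins, 0.5)
    evidence = prior * like_improved + (1 - prior) * like_null
    return prior * like_improved / evidence

p_null = binom_tail(10, 9, 0.5)            # 11/1024, about 1/93
post = posterior_improved(1 / 200, 0.7)    # hypothetical improved win rate
print(p_null, 1 / p_null, post)
```

With so sceptical a prior, the posterior stays small even after a 9-1 result, which is the only way I can see that laughing at it would be reasonable.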

Re-reading your post, I think perhaps that you are aware that the rules of probability say that a 9-1 result for an unimproved programme is very unlikely, but that your observations show that probability is wrong in this case. If this is what you mean, then I absolutely disagree with you. The rules of probability are not wrong.

Doug, it is the one-sided test that is relevant to these discussions. We are interested in testing things like 'Is SlugGo at least 2 stones stronger than MF?' rather than 'Is one or the other stronger in a two-stone game?', which is much less interesting and almost certainly true.
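
To make the distinction concrete, here is a small sketch comparing the two p-values for a 9-1 result, assuming independent games with no draws and a null win probability of 0.5. The function names are illustrative, and the 9-1 score is simply the example used earlier in this message, not SlugGo data.

```python
from math import comb

def binom_pmf(games, wins, p_win=0.5):
    """P(exactly `wins` wins in `games` independent games)."""
    return comb(games, wins) * p_win**wins * (1 - p_win)**(games - wins)

def one_sided_p(games, wins):
    """P(at least `wins` wins): asks 'is A stronger than B?'."""
    return sum(binom_pmf(games, k) for k in range(wins, games + 1))

def two_sided_p(games, wins):
    """P(a result at least this lopsided in either direction):
    asks 'is either side stronger?'. Uses the symmetry of p = 0.5."""
    extreme = max(wins, games - wins)
    if 2 * extreme == games:
        return 1.0  # a dead-even score is the least extreme outcome
    upper = sum(binom_pmf(games, k) for k in range(extreme, games + 1))
    return 2 * upper

one = one_sided_p(10, 9)
two = two_sided_p(10, 9)
print(one, two)
```

At a null of 0.5 the two-sided value is just double the one-sided one, so the choice matters most for results that sit near the acceptance threshold.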

Best wishes
Tom.

