
computer-go: Tsume-Go database.




Hi,

Many applications of machine learning in Go start with the subdomain of 
Tsume-go.  Life-and-death problems are easy to evaluate: a move 'works' or
it does not work.  Also, they concern an area where search is typically
problematic and heuristic (learned) knowledge is welcome.

To evaluate an approach, the algorithm learns on a training set and then
tries to solve problems from a test set.  Moreover, for a comparison of
several approaches it is useful that all compared approaches use the same
training set and test set.  This makes it possible to spot weaknesses and
strengths of individual approaches.
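
A minimal sketch of how such a shared split could be fixed once and reused
by everyone, assuming each problem has a stable identifier.  The function
names, the example IDs and the 20% test fraction are illustrative
assumptions only, not part of any existing database:

```python
import hashlib

def assign_split(problem_id, test_fraction=0.2):
    """Return 'test' or 'train' for a problem, based only on its ID."""
    digest = hashlib.sha256(problem_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if bucket < test_fraction else "train"

def split_problems(problem_ids, test_fraction=0.2):
    """Partition problem IDs into (train, test) lists."""
    train, test = [], []
    for pid in problem_ids:
        if assign_split(pid, test_fraction) == "test":
            test.append(pid)
        else:
            train.append(pid)
    return train, test

# Hypothetical IDs; a real database would supply its own.
ids = ["problem-%04d" % i for i in range(1000)]
train_ids, test_ids = split_problems(ids)
```

Because the assignment depends only on the problem ID, adding new problems
later does not move existing ones between the two sets, and every approach
that uses the same function gets the same split.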

Also, systems that intend to solve tsume-go problems can be evaluated
using a database of test problems.

Therefore, there is a need for publicly available, high-quality benchmarks
in the tsume-go domain.

1) A well-known dataset is the database generated by gotools (by T. Wolf).
However, this dataset also has disadvantages:
- Gotools produced it.  Gotools can be expected to perform better on
problems it generated itself than on others, which makes it difficult to
compare other systems with gotools using gotools problems.
- The problems are somewhat artificial.  They contain only bounded
positions, and they are of basic level (i.e. easy for dan players to solve).
2) Some approaches (e.g. by Kojima Takuya) use problems from books by
Ishida and others.  However, these problems are copyrighted, and hence
evaluating other systems on the same datasets is difficult.  Even if
I were able to obtain the same books (in Japanese) and read them, and
typed the problems in, I could not separate them into several levels
(basic/intermediate/...) in the same way the authors did to obtain
their published results.

So I am currently looking for a publicly available, human-generated database
of tsume-go problems in electronic format which is big enough to train a
program and evaluate it (let's say at least 1000 problems).

Does anyone have ideas or suggestions?

If not, would anybody find it interesting if such a database were
created?  If so, would anybody be willing to help with that huge task?


Jan