[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [computer-go] Spiral pattern matching and data structures



Mark Boon wrote:
> > -----Original Message-----
> > From: computer-go-bounces@xxxxxxxxxxxxxxxxx
> > [mailto:computer-go-bounces@xxxxxxxxxxxxxxxxx]On Behalf Of Paul Pogonyshev
> > Sent: Thursday, January 20, 2005 23:03
> > To: computer-go
> > Subject: Re: [computer-go] Spiral pattern matching and data structures
> >
> > Anders Kierulf wrote:
> > > Question: When checking the spiral of points around an anchor,
> >
> > you probably
> >
> > > repeat that for all symmetries? Or do you include the symmetric
> >
> > patterns in
> >
> > > the DFA?
> >
> > We have a single DFA, which is matched 8 times, once for each symmetry
> > transformation.  We tried to build pre-rotated DFAs, but they turned out
> > to be so huge, that they were not worth it.
>
> This surprises me. I think the number of leafs in the DFA tree is equal to
> the number of patterns you try to store. The number of internal nodes is
> smaller than that. So the DFA becomes more or less 8 times as big, but you
> don't need to copy the actual patterns 8 times. So the total memory cost of
> the pattern-matcher when building the DFA 8 ways is actually smaller than 8
> times. Lookup speed increases significantly though so to me that definitely
> justifies using a bit more memory.

Well, you started with "I think," but then fell into definitive "is actually"
here...

In GNU Go patterns can use wildcards, for instance "o" stands for "stone
of player who is to play or no stones."  Effectively, a pattern with several
wildcards can easily stand for 20--100 "plain" patterns.

The number of internal nodes should be smaller than the number of patterns
_multiplied by average pattern length_.  However, the mode "dense" DFA
becomes, the more close the number of internal nodes is to that limit.

So, increasing the number of patterns almost 8 times can blow the database up
hundreds or thousands times!

To make things worse, prerotated DFAs inhibit usage of relatively recent
optimization in GNU Go DFA builder (of which I'm the author :)  The
optimization capitalizes on the fact that if you match every transformation
of a DFA, it does not matter which particular transformation of each pattern
you store.  This allows to pack large DFAs down 3--4 times compared to semi-
random transformations used in human-readable database and 1.3--1.6 times
(as far as I remember) to DFA build by principle "choose the shortest
transformation."

That said, we have these words in GNU Go documentation:

     It is possible to construct a special pattern database that
  generates an "explosive" automaton: the size of the DFA is in the worst
  case exponential in the number of patterns it recognizes.  But it
  doesn't occur in pratical situations: the dfa size tends to be
  "stable".  By "stable" we mean that if we add a pattern which greatly
  increases the size of the dfa it also increases the chance that the
  next added pattern does not increase its size at all.

Frankly speaking, I cannot imagine exponential growth, but presumably
whoever wrote this knew what he was doing, or what?

Paul
_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/