Re: [computer-go] SGF parsers
Mark Boon wrote:
> > --- Paul Pogonyshev <pogonyshev@xxxxxxxxxxxxxxxxx> wrote:
> > > Mark Boon wrote:
> > > > There are several advantages to such an approach. First, the SGF
> > > > parser really needs to be made only once and will then be
> > > > universally useful. Second, once the XML definitions are agreed
> > > > upon, the laziest solution to parse them is to use an existing XML
> > > > library like Xerces to create the DOM-tree from XML. That way SGF
> > > > can be used when space is an issue, as it's more compact, and XML
> > > > can be used for other purposes.
> > >
> > > Another thing you are missing is speed. My SGF parser (although I
> > > admit it is _extremely_ complicated) can crunch a 100 MB file in some
> > > 5--6 seconds. I seriously doubt any XML parser out there can come
> > > anywhere close.
>
> Probably not; all XML parsers I've seen are slow. I wonder if my OS can
> *read* 100 MB in 5-6 seconds.
It can, or at least it can fetch it from the disk cache. Actually, when I saw
your reply I got somewhat sceptical myself, as I had never actually fed my
parser 100 MB (only 50 ;). So I generated a random file with a branching
factor of 1--4 that contains only moves and comments.
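
A minimal sketch of how a comparable test file could be generated. This is
not the script used for real-monster.sgf; the names `random_sgf',
`emit_node' and `emit_children' and the `seed' and `depth' parameters are
assumptions made for illustration. Every node gets one move and one comment
of random lowercase letters, and every node has 1 to 4 children until the
depth limit is reached.

```python
import random
import string

LETTERS = string.ascii_lowercase


def emit_node(out, rng, size=19):
    """Append one node holding a random move and a random comment."""
    color = rng.choice("BW")
    point = rng.choice(LETTERS[:size]) + rng.choice(LETTERS[:size])
    # Lowercase letters only, so the comment never needs SGF escaping.
    comment = "".join(rng.choice(LETTERS) for _ in range(rng.randint(5, 80)))
    out.append(f";{color}[{point}]C[{comment}]")


def emit_children(out, rng, depth):
    """Append 1 to 4 child subtrees; a single child continues the sequence."""
    if depth == 0:
        return
    branches = rng.randint(1, 4)
    for _ in range(branches):
        if branches > 1:
            out.append("(")  # variations are wrapped in their own game tree
        emit_node(out, rng)
        emit_children(out, rng, depth - 1)
        if branches > 1:
            out.append(")")


def random_sgf(seed, depth):
    """Return a random SGF game tree as text (hypothetical helper)."""
    rng = random.Random(seed)
    out = ["(;GM[1]FF[4]CA[UTF-8]SZ[19]"]
    emit_children(out, rng, depth)
    out.append(")")
    return "".join(out)


if __name__ == "__main__":
    text = random_sgf(seed=1, depth=8)
    print(len(text.encode("utf-8")), "bytes")
```

The size grows roughly exponentially with `depth', so a file in the
100 MB range would more sensibly be written to disk incrementally than
built in memory like this.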
[paul@localhost sgf]$ head ~/real-monster.sgf
(;GM[1]FF[4]
CA[UTF-8]
AP[Quarry:0.1.10]
SZ[19]
(;W[dd]C[bmqbhcdarzowkkyhiddqscdxrj];W[ro]C[tyqjtmuqinntqmihn]
(;B[mq]C[qutmszfqjnmtaeqtmykcbrzkjuhltznluiyokfhvstouzgqmeaogrqsdmzohyd\
tuotjyysttl]
(;B[ke]C[yflrugawcba]
(;W[na]C[wxyuycpoxewzgiqtxz];B[hr]C[wydagicanorwladiilxsmhfwedytenocltc\
sdfusvnognrrvfoqrxvpdyowedmgoijilqeel]
[paul@localhost sgf]$ ll ~/real-monster.sgf
-rw-rw-r-- 1 paul paul 109943150 Dec 14 04:48 /home/paul/real-monster.sgf
[paul@localhost sgf]$ time ./sgf-test ~/real-monster.sgf
real 0m5.656s
user 0m4.885s
sys 0m0.520s
[paul@localhost sgf]$
So the file is 104.8 MB, and `sgf-test' basically reads the file and discards
all data (i.e. it does nothing else but read it). My box has an Athlon XP
2600+ and 512 MB of RAM, and the program is compiled without optimizations
(easier to debug). The standard `-O2' chops a little over 25% off the
runtime. If the file were not in UTF-8, quite a lot of time would be spent in
iconv() converting characters.
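
For reference, 109943150 bytes is about 104.85 MiB, so 5.656 seconds
works out to roughly 18.5 MiB/s. The "read and discard" idea can be
sketched as below. This is not the `sgf-test' source: `scan_sgf' and
`mib_per_second' are hypothetical names, and the scanner only counts nodes
and property values while honouring backslash escapes inside values. A
pure-Python loop like this will be far slower than the figures above; it
only illustrates the scanning logic.

```python
import sys
import time


def scan_sgf(data: bytes):
    """Walk SGF bytes, discard all values, return (nodes, values)."""
    nodes = values = depth = 0
    i, n = 0, len(data)
    while i < n:
        c = data[i]
        if c == 0x5B:  # '[' opens a property value
            i += 1
            while i < n and data[i] != 0x5D:  # scan to the closing ']'
                if data[i] == 0x5C:  # backslash escapes the next byte
                    i += 1
                i += 1
            if i >= n:
                raise ValueError("unterminated property value")
            values += 1
        elif c == 0x3B:  # ';' starts a node
            nodes += 1
        elif c == 0x28:  # '(' opens a game tree
            depth += 1
        elif c == 0x29:  # ')' closes a game tree
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced ')'")
        i += 1
    if depth != 0:
        raise ValueError("unbalanced '('")
    return nodes, values


def mib_per_second(nbytes, seconds):
    """Throughput in MiB/s for nbytes processed in the given time."""
    return nbytes / 2**20 / seconds


if __name__ == "__main__":
    with open(sys.argv[1], "rb") as f:
        data = f.read()
    start = time.perf_counter()
    nodes, values = scan_sgf(data)
    elapsed = time.perf_counter() - start
    print(nodes, "nodes,", values, "values,",
          round(mib_per_second(len(data), elapsed), 2), "MiB/s")
```

Working on raw bytes avoids any character-set conversion, which matches
the UTF-8 case described above where no iconv() pass is needed.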
Paul