Re: [computer-go] SGF parsers
On Dec 14, 2004, at 4:00 AM, Paul Pogonyshev wrote:
Another thing you are missing is speed. My SGF parser (although I admit
it is _extremely_ complicated) can crunch a 100 MB file in some 5--6
seconds. I seriously doubt any XML parser out there can come anywhere
close.
Probably not, all XML parsers I've seen are slow. I wonder if my OS can
*read* 100 MB in 5-6 seconds.
It can, or at least fetch it from disk cache. Actually, when I saw your
reply, I got somewhat sceptical myself, as I had never actually fed my
parser 100 MB (only 50 ;) So I generated a random file with a branching
factor of 1--4 and only moves and comments.
real 0m5.656s
user 0m4.885s
sys 0m0.520s
[paul@localhost sgf]$
So the file is 104.8 MB, `sgf-test' basically reads the file and discards
all data (i.e. does nothing else but reading), and my box has an Athlon XP
2600+ and 512 MB RAM. The program is compiled without optimizations
(easier to debug). Standard `-O2' chops a little over 25% off the runtime.
If the file were not in UTF-8, quite a lot of time would have been spent
in iconv() converting characters.
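For anyone who wants to try a similar test, here is a rough Python sketch of how such a random file could be produced. It is not the generator used above. The function names, the 20% comment rate, and the depth limit are all made up for illustration. Each node is a move, sometimes with a comment, and each node is followed by 1 to 4 child variations.

```python
import random

COORDS = "abcdefghijklmnopqrs"  # SGF letters for a 19x19 board


def random_node(rng, color):
    """Return one SGF node: a move, sometimes followed by a comment."""
    move = rng.choice(COORDS) + rng.choice(COORDS)
    node = ";%s[%s]" % (color, move)
    if rng.random() < 0.2:  # assumed comment rate, purely illustrative
        # The comment text has no ']' or '\', so no escaping is needed.
        node += "C[comment %d]" % rng.randrange(1000)
    return node


def random_subtree(rng, depth, color="B"):
    """Return SGF text for one node followed by 1 to 4 child variations."""
    text = random_node(rng, color)
    if depth == 0:
        return text
    next_color = "W" if color == "B" else "B"
    children = [random_subtree(rng, depth - 1, next_color)
                for _ in range(rng.randint(1, 4))]
    if len(children) == 1:
        # A single child simply continues the current sequence.
        return text + children[0]
    # Several children become parenthesized variations.
    return text + "".join("(" + child + ")" for child in children)


def random_sgf(seed=0, depth=6):
    """Return one complete game tree as an SGF string."""
    rng = random.Random(seed)
    return "(;FF[4]GM[1]SZ[19]" + random_subtree(rng, depth) + ")"
```

Writing many such trees one after another into a file, until it reaches the desired size, gives a collection of arbitrary length to time a parser against.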
Sen:te Goban's parser is based on an optimized version of sgf.c by
Antti Huima. It takes 36 seconds to read the 20300+ files (~40 MB) of a
GoGoD distribution, build the trees in memory, and create game
record references that include the number of moves and the game
signature. That's on a 1 GHz PowerPC.
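For a rough sense of scale, the figures quoted in this thread can be turned into throughput numbers. This is only back-of-the-envelope arithmetic. It assumes the `-O2' saving is exactly 25% and the GoGoD set is exactly 40 MB. The two runs are also not directly comparable: different hardware, one large file versus 20300+ small ones, and discarding data versus building trees and game references.

```python
def throughput_mb_per_s(megabytes, seconds):
    """Return throughput in MB per second."""
    return megabytes / seconds


# 104.8 MB in 5.656 s wall-clock ("real") time, unoptimized build.
unoptimized = throughput_mb_per_s(104.8, 5.656)

# Same file, assuming -O2 removes exactly 25% of the runtime.
optimized = throughput_mb_per_s(104.8, 5.656 * 0.75)

# About 40 MB of GoGoD files in 36 s, including tree building.
gogod = throughput_mb_per_s(40.0, 36.0)

print("unoptimized: %.1f MB/s" % unoptimized)  # 18.5 MB/s
print("with -O2:    %.1f MB/s" % optimized)    # 24.7 MB/s
print("GoGoD run:   %.1f MB/s" % gogod)        # 1.1 MB/s
```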
I agree with you that using XML or building an intermediate DOM tree
seems like a waste of time and memory. Of course, the parser I used
came with its own data structure and tree representation, which was not
necessarily what I wanted, but wrapping my tree structure around it was
not a big deal (but then, I wasn't using Java).
Marco Scheurer
Sen:te, Lausanne, Switzerland http://www.sente.ch