[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [computer-go] alpha-beta + graph history question
Evan,
There is another issue here too. In your code you have
the call: check_transposition_table();
But there is no clue about what happens as a result of the call.
Presumably, you may be able to stop the search at this node, but your
code sample doesn't indicate this.
But the check_transposition_table() function must also have some
tricky logic to it based on what kind of entry is stored in the
table. For instance if there is a potentially usable score in the table
whether you can actually use it or not depends on what alpha and beta
are when the call is made and the type of record stored in the entry.
If the entry is stored as an exact score you can use it as is and
you are finished at this node. If the score stored in the table is
a lower bound, and is greater than alpha, you can treat this as a
beta cutoff.
But you don't show any of this logic, so it's not easy to figure out
exactly what you are doing.
In my program, I do all the work inside the get_transposition and
put_transposition call and I do not worry about this inside the tree
search code itself. I simply pass the current alpha and beta values
to put and get. This makes the tree search stuff simpler to
understand and isolates all the crufty logic to the transposition
table routines.
Here is some actual code from my checkers program. Instead of
cleaning it up, I'll show it to you in all its glory (or ugliness.)
A couple of notes first to make it a bit easier to follow:
1. There are actually 2 tables, one is always overwrite, the other
is depth precendent. Even though this means you can use only
have as many entries for a given amount of memory, it is clearly
superior than a single table and makes you program search
better.
2. This code is from a palm OS program, so unfortunately you must
wade through some memory management issues like MemHandleLock.
You can pretty much ignore that at pick it up at the point *ht,
the actual entry, has been established.
3. The code that looks similar to this: if ( sc > (WIN - 1000) ) t->hr[i].sc += ply;
deals with scoring wins correctly. Since in checkers you can
see a won position at various depths due to transpositions,
funky things happen when you store scores that represent wins at
various depths. This is likely to be a non issue in a Go
program, so you can basically ignore or delete that code.
4. The program ages records, so that it is not neccessary to clear the table
before each search. That's a big benefit because you can use old entries.
After each search, a global age variable is incremented and an entry that
doesn't have the current age can always be overwritten, in either table.
I use 5 bits for age, but even 2 or 3 works well if you only have this many
of extra bits laying around!
5. The program stores and returns a "best move" which is very handy for improving
the search efficiency, even if the entry cannot be used.
6. get_hash() returns 1 if the entry can be used directly and puts the score in the
*sc pointer.
7. 'ctx' is a big global state structure where I keep global variables. I probably
wouldn't use it in a non-palm program but it makes it convenient for keeping
the state of the application which must be saved and restored whenever someone
enters and leaves an application.
8. Even with all of this, the routines do not require a whole lot of code.
I apologize for not simplifying the code, but if I did, I would
probably make errors. Besides, the way it is now is a complete
example and demonstrates aging, double entry tables, correct depth
based win scoring, best move storage (and simple palm memory
management if anyone cares.)
typedef struct
{
UInt32 key;
Int16 sc;
int depth:8;
int bound:3;
UInt8 keep_forever;
UInt16 age:5;
move_t bm;
} sub_rec;
typedef struct
{
sub_rec hr[2];
} transposition;
void put_hash( position *p, int sc, int alpha, int beta, int depth, move_t bm, int ply )
{
int i;
transposition *t;
unsigned long cix = (p->akey >> 16) & (HT_CHUNKS-1);
unsigned long add = p->akey & (TT_SIZE-1);
transposition *ht;
ht = MemHandleLock( htmh[cix] );
add |= 1; // set bit 0 for black to move positions
if (p->ctm & 1) add ^= 1;
t = &(ht[add]);
for (i=0; i<2; i++)
{
if (ply >= 0)
{
if ( i==0 && t->hr[i].age == ctx->age && depth < t->hr[i].depth) continue;
if (t->hr[i].keep_forever)
{
MemHandleUnlock( htmh[cix] );
return;
}
}
if (ply == -1)
t->hr[i].keep_forever = 1;
else
t->hr[i].keep_forever = 0;
t->hr[i].age = ctx->age;
t->hr[i].bm = bm;
t->hr[i].key = p->pkey;
t->hr[i].sc = sc;
t->hr[i].depth = depth;
if (sc >= beta)
{
t->hr[i].bound = LB;
if ( sc > (WIN - 1000) ) t->hr[i].sc += ply;
if ( sc < -(WIN - 1000) ) t->hr[i].sc -= ply;
}
else if (sc <= alpha)
{
t->hr[i].bound = UB;
if ( sc > (WIN - 1000) ) t->hr[i].sc += ply;
if ( sc < -(WIN - 1000) ) t->hr[i].sc -= ply;
}
else
{
if ( sc > (WIN - 1000) ) t->hr[i].sc += ply;
if ( sc < -(WIN - 1000) ) t->hr[i].sc -= ply;
t->hr[i].bound = EX;
}
MemHandleUnlock( htmh[cix] );
return;
}
}
int get_hash( position *p, int *sc, int alpha, int beta, int depth, move_t *bm, int ply )
{
int i;
transposition *t;
unsigned long cix = (p->akey >> 16) & (HT_CHUNKS-1);
unsigned long add = p->akey & (TT_SIZE-1);
int rval = 0;
transposition *ht;
ht = MemHandleLock( htmh[cix] );
add |= 1; // set bit 0 for black to move positions
if (p->ctm & 1) add ^= 1;
t = &(ht[add]);
for (i=0; i<2; i++)
{
if ( t->hr[i].key != p->pkey ) continue;
*bm = t->hr[i].bm;
if ( t->hr[i].depth < depth ) continue;
*sc = t->hr[i].sc;
if ( *sc > (WIN - 1000) ) *sc -= ply;
if ( *sc < -(WIN - 1000) ) *sc += ply;
switch ( t->hr[i].bound )
{
case LB :
if (*sc >= beta) rval = 1;
break;
case UB :
if (*sc <= alpha) rval = 1;
break;
case EX :
rval = 1;
break;
}
if (rval)
{
MemHandleUnlock( htmh[cix] );
return(rval);
}
}
MemHandleUnlock( htmh[cix] );
return(0);
}
- Don
In my go program (GoFigure), I use a global alpha-beta search with
transpostion table. The transposition table is rotationally
invariant, and maintained across invocations of the program, which
results in walking of the game tree in ways that seem good at
demonstrating subtle problems with my code and my understanding of
alpha-beta search.
My alpha-beta function (simplified) looks something like this:
alphabeta(int alpha, int beta, int depth) {
check_transposition_table();
get_moves();
score_type = ALPHA_CUTOFF;
for (int i = 0; i < children.length; i++) {
children[i].alphabeta(beta, alpha, depth + 1);
if (child.score > score) {
score = child.score;
score_type = EXACT_SCORE;
}
if (score > beta) {
store_transposition_table();
score_type = BETA_CUTOFF;
return score;
}
if (score > alpha) {
assert child.score_type == EXACT_SCORE;
}
}
store_transposition_table();
return score;
(It's a bit more complicated, but I think that's the relevant bit.)
Now, the question is: is the assert at the bottom of the loop a valid
assertion to make? On very rare occasion, it triggers, but I haven't
been able to track down why or under what conditions exactly. So does
that represent a bug buried in my code somewhere, or are there real
cases where it isn't true in properly coded alpha-beta searches?
Thanks in advance,
Evan
_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/
_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://www.computer-go.org/mailman/listinfo/computer-go/