Re: fundamental problems for reinforcement



Ives Steglich wrote:

> ...
> so i have thought about some implementation strategies
> in relation to reinforcement learning and detected some basic
> problems:
> 
> a) i need a way to detect the end of a game
> b) to rate it in such a way that i can say which player has won
> 
> these two points are the most important to settle, because without
> them an implementation of a client could get very difficult,

If you or your program think that it is not yet "game over", just
continue to play. The opponent must be able to handle this.
Eventually you will run out of legal moves and have to pass.
That will then be the end of the game.
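
A minimal sketch of this idea, assuming the usual convention that the
game is over once both players pass in succession. The callables
legal_moves, apply_move and choose are hypothetical placeholders for
whatever the client provides; they are not taken from any existing
program.

```python
from typing import Callable, List, Optional, Tuple

Point = Tuple[int, int]
Move = Optional[Point]  # None stands for "pass"


def play_until_both_pass(
    legal_moves: Callable[[int], List[Point]],   # hypothetical: legal moves for a player
    apply_move: Callable[[int, Move], None],     # hypothetical: updates the board
    choose: Callable[[List[Point]], Point] = lambda moves: moves[0],
    max_moves: int = 1000,
) -> int:
    """Alternate turns until two consecutive passes; return the number of turns taken."""
    player = 0
    consecutive_passes = 0
    for turn in range(1, max_moves + 1):
        moves = legal_moves(player)
        # A player with no legal move left is forced to pass.
        move = choose(moves) if moves else None
        apply_move(player, move)
        consecutive_passes = consecutive_passes + 1 if move is None else 0
        if consecutive_passes == 2:
            return turn  # both players passed in a row: game over
        player = 1 - player
    return max_moves  # safety cap, an assumption to guarantee termination
```

In a real client legal_moves would also have to leave out moves such
as filling your own eyes, otherwise the "no legal moves left" state
is reached much later than necessary.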

To shorten the phase after the end of the game, you might pass before
all dead stones have been removed. However, by doing so you or your
program risk that your opponent disagrees about what is dead.
Therefore you need what is called an "agreement phase" (but how is
this done when two programs play each other?).
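
One possible way for two programs, purely as an assumption and not
an established protocol: each side reports the set of stones it
considers dead, and the game is only scored if both sets are
identical. Otherwise play resumes over the disputed stones. The
function name below is made up for illustration.

```python
from typing import FrozenSet, Set, Tuple

Point = Tuple[int, int]


def agree_on_dead_stones(
    claim_a: Set[Point], claim_b: Set[Point]
) -> Tuple[bool, FrozenSet[Point]]:
    """Compare both players' dead-stone claims.

    Returns (agreed, disputed), where disputed holds every stone that
    exactly one of the two players considers dead.
    """
    disputed = frozenset(claim_a ^ claim_b)  # symmetric difference
    return (not disputed, disputed)
```

If agreed is False, both programs simply continue to play until the
disputed stones are either captured or clearly alive.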

If I'm not mistaken, computer Go requires a program to demonstrate
that it is able to capture.

> so maybe the first step should be to develop a system that can manage the
> group problem

Very, very difficult, but not a problem for determining the end of the game.

> the difference between go and chess or backgammon is that you have no
> deterministic end of game and you don't know so easily who won it

As I said, this is not true in my opinion. Go players agree (easily)
on the end of the game, and all computer Go programs have solved the
"game end" problem as well.
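
For the rating part (who has won), here is a minimal sketch assuming
area scoring on a position where all dead stones have already been
captured: every stone counts one point, and an empty region counts
for a colour only if it touches stones of that colour alone. The
board encoding ('B', 'W', '.'), the function names and the komi
parameter are assumptions for illustration.

```python
from typing import List, Tuple


def area_score(board: List[str]) -> Tuple[int, int]:
    """Return (black, white) area scores for a board of 'B', 'W' and '.' cells."""
    rows, cols = len(board), len(board[0])
    score = {"B": 0, "W": 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1  # each stone is worth one point
            elif (r, c) not in seen:
                # Flood-fill this empty region and note which colours border it.
                size, borders, stack = 0, set(), [(r, c)]
                seen.add((r, c))
                while stack:
                    y, x = stack.pop()
                    size += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols:
                            neighbour = board[ny][nx]
                            if neighbour in score:
                                borders.add(neighbour)
                            elif (ny, nx) not in seen:
                                seen.add((ny, nx))
                                stack.append((ny, nx))
                if len(borders) == 1:
                    score[borders.pop()] += size  # region touches one colour only
    return score["B"], score["W"]


def reward_for_black(board: List[str], komi: float = 0.0) -> int:
    """+1 if Black wins, -1 if White wins, 0 for a tie (komi is an assumption)."""
    black, white = area_score(board)
    margin = black - white - komi
    return (margin > 0) - (margin < 0)
```

The value returned by reward_for_black could serve as the terminal
reward for a reinforcement learner, with the sign flipped for White.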

Also: it is not forbidden to try a "late" invasion. The worst that
can happen is that the opponent demonstrates how to refute it.
Or is there a rule that you must trust the opponent to have the
minimum Go skill needed to capture "impossible" invaders?

From a human perspective it is of course annoying to have to respond
to obviously non-working moves, but in my view being able to pass
at the right time (or to resign at the right time) is part of the Go
knowledge of a good Go program.

Hans