Re: fundamental problems for reinforcement
Ives Steglich wrote:
> ...
> so I have thought about some implementation strategies
> for reinforcement learning and found some basic
> problems:
>
> a) I need a way to detect the end of a game
> b) I need to rate it so that I can say which player has won
>
> These two points are the most important to settle, because without
> them an implementation of a client could get very difficult,
If you or your program think that it is not yet "game over", just
continue to play. The opponent must be able to handle this.
Eventually you will run out of legal moves and have to pass.
That will then be the end of the game.
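As a rough illustration of point a), here is a minimal sketch of a
game loop that stops once both sides pass in a row (the usual
convention, as far as I know). Everything named here is hypothetical:
a "player" is just a function that takes the move history and returns
a point or None for a pass, and no board or legality check is
modelled.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A move is a board point (row, col), or None for a pass.
Move = Optional[Tuple[int, int]]
Player = Callable[[List[Move]], Move]  # hypothetical player interface


def play_until_two_passes(players: Sequence[Player],
                          max_moves: int = 1000) -> List[Move]:
    """Alternate between two players until both pass in a row.

    max_moves is a safety limit (an assumption of this sketch) so that
    two programs that never pass cannot loop forever.
    """
    history: List[Move] = []
    consecutive_passes = 0
    to_move = 0  # index into players; 0 moves first
    while consecutive_passes < 2 and len(history) < max_moves:
        move = players[to_move](history)
        history.append(move)
        # A pass extends the run of passes, any real move resets it.
        consecutive_passes = consecutive_passes + 1 if move is None else 0
        to_move = 1 - to_move
    return history
```

A program that is unsure whether the game is over simply keeps
returning moves; the loop ends only when neither side wants to
continue.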
To shorten the phase after the real end of the game, you might pass
before all dead stones have been removed. However, by doing so you or
your program risk that your opponent disagrees about what is dead.
Therefore you need what is called an "agreement phase" (but how is
this done when two programs play each other?).
If I'm not mistaken, computer Go requires a program to demonstrate
that it is able to capture.
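If play really does continue until every dead stone has been captured,
point b) comes down to counting. Below is a minimal sketch, assuming
area scoring (stones on the board plus empty regions that touch only
one colour). The board format is my own assumption for illustration:
a list of equal-length strings containing only 'B', 'W' and '.'.
Komi and the agreement phase are left out.

```python
from typing import List, Tuple


def area_score(board: List[str]) -> Tuple[int, int]:
    """Return (black, white) area counts for a finished position.

    Assumes all dead stones have already been removed from the board.
    """
    rows, cols = len(board), len(board[0])
    score = {"B": 0, "W": 0}
    seen = set()  # empty points already assigned to a region
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1  # every stone counts one point
            elif (r, c) not in seen:
                # Flood-fill the empty region that contains (r, c).
                size, borders, stack = 0, set(), [(r, c)]
                seen.add((r, c))
                while stack:
                    y, x = stack.pop()
                    size += 1
                    for ny, nx in ((y - 1, x), (y + 1, x),
                                   (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols:
                            neighbour = board[ny][nx]
                            if neighbour in score:
                                borders.add(neighbour)
                            elif (ny, nx) not in seen:
                                seen.add((ny, nx))
                                stack.append((ny, nx))
                # The region is territory only if it touches one colour.
                if len(borders) == 1:
                    score[borders.pop()] += size
    return score["B"], score["W"]
```

Comparing the two numbers then tells you who has won, which is the
kind of end-of-game signal a reinforcement learner would need.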
> so maybe the first step should be to develop a system that can manage the
> group problem
Very, very difficult, but not a problem for determining the game end.
> the difference between Go and chess or backgammon is that you have no
> deterministic end of the game and you cannot tell so easily who won it
As I said, this is not true in my opinion. Go players agree (easily)
on the game end, and all computer Go programs have solved the
"game end" problem.
Also, it is not forbidden to try a "late" invasion. The worst that
can happen is that the opponent demonstrates how to refute it.
Or is there a rule that you must trust the opponent to have the
minimum Go skill needed to capture "impossible" invaders?
From a human perspective it is of course annoying to have to respond
to moves that obviously do not work, but in my view being able to pass
at the right time (or to resign at the right time) is part of the Go
knowledge of a good Go program.
Hans