
RE: Ideas



Heikki,

	Sounds great and familiar. :-)
	
	Could you educate us a bit on "TD-learning"? Is it Training with
Dataset? (I am embarrassing myself)

	Before you go too far too big, just try a little thing -- teach
it to learn how to play joseki. To be more specific, teach it to learn how
to play "star joseki" (Chinese term, joseki with the starting stone at a
corner star position). See how fast it could learn. Or another small
task -- teach it how to fight on a 4 x 4 corner.
	
	Any of the above tasks involves a lot of other critical go
concepts. A trap of previous ANN programs might have been being too
ambitious -- working on the whole 19 x 19 board.

	If you make any progress, please let us know.

	Thanks.

-- Mousheng Xu 

-----Original Message-----
From: Heikki Levanto [mailto:heikki@xxxxxxxxxxxxxxxxx]
Sent: Wednesday, November 10, 1999 2:45 PM
To: computer-go@xxxxxxxxxxxxxxxxx
Subject: Ideas


heikki@xxxxxxxxxxxxxxxxx:
> I have some ideas of my own, and as soon as I get the time (heh) I will
> try some of them out in practice.

Xu, Mousheng <moushengxu@xxxxxxxxxxxxxxxxx> replied:
> * Heikki, you might succeed, and you might not. :) Two heads are better
> than one: could you share your thoughts with us? All credit for the
> value of the thoughts goes to you. But if you feel pretty confident, and
> don't think discussions help, then keep it secret.

I am not afraid of sharing my ideas; it is more that most of them are in
such a raw state that I have trouble even formulating them to myself. The
one that I find most interesting (at the moment) is using a neural network
and TD-learning for whole-board evaluation. My idea is not to feed the
board image to the network at all, but to extract various key numbers from
it, and hope these will enable it to evaluate the position.

For example, the number of stones in atari is an interesting figure. If the
net learns to minimize this, it will tend to pull its stones out of atari.
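
To make the "key number" idea concrete, here is a minimal sketch of how
the stones-in-atari count could be computed. The board representation (a
list of strings with "." for empty points and "X"/"O" for stones) and the
function name stones_in_atari are assumptions made for illustration only;
nothing like this has been written or tested as part of the idea above.

```python
def stones_in_atari(board, color):
    """Count stones of `color` that belong to groups with exactly one liberty.

    `board` is a list of equal-length strings: "." is empty, "X" and "O"
    are stones. This representation is a hypothetical one for the sketch.
    """
    rows, cols = len(board), len(board[0])
    seen = set()
    total = 0
    for r in range(rows):
        for c in range(cols):
            if board[r][c] != color or (r, c) in seen:
                continue
            # Flood-fill the group containing (r, c), collecting liberties.
            group, liberties, stack = set(), set(), [(r, c)]
            while stack:
                y, x = stack.pop()
                if (y, x) in group:
                    continue
                group.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols:
                        if board[ny][nx] == ".":
                            liberties.add((ny, nx))
                        elif board[ny][nx] == color:
                            stack.append((ny, nx))
            seen |= group
            # A group with a single liberty is in atari; count all its stones.
            if len(liberties) == 1:
                total += len(group)
    return total


example_board = [
    ".OX..",
    "XOX..",
    ".X...",
]
atari_o = stones_in_atari(example_board, "O")
atari_x = stones_in_atari(example_board, "X")
```

In the example position the two O stones share a single liberty in the
corner, so they are the ones counted; such a number would be one input
among several to the network.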

This is to be an evaluation function: to find the best move, evaluate all
possibilities and choose the best. The TD-learning is supposed to code some
look-ahead-like effects into the learning, so I hope to get some
not-totally-hopeless results from just one-move lookahead. Of course,
standard search techniques can be used if we have the capacity to evaluate
more than a few hundred positions...
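
As a rough illustration of the scheme, the sketch below shows one-move
lookahead and a single TD(0) update. It swaps the neural network for a
plain linear evaluator to keep the example short, and every name in it
(evaluate, td0_update, pick_move, alpha, gamma) is an assumption made for
the example, not part of any existing program.

```python
def evaluate(weights, features):
    """Linear stand-in for the network: a weighted sum of the key numbers."""
    return sum(w * f for w, f in zip(weights, features))


def td0_update(weights, features, next_value, reward=0.0, alpha=0.01, gamma=1.0):
    """One TD(0) step: nudge the value of the current position toward
    reward + gamma * (value of the next position).

    For a linear evaluator the gradient with respect to the weights is just
    the feature vector, so each weight moves by alpha * delta * feature.
    At the end of a game, pass next_value=0.0 and the game result as reward.
    """
    delta = reward + gamma * next_value - evaluate(weights, features)
    return [w + alpha * delta * f for w, f in zip(weights, features)]


def pick_move(weights, candidates):
    """One-move lookahead: `candidates` maps each legal move to the feature
    vector of the position it leads to; return the best-scoring move."""
    return max(candidates, key=lambda move: evaluate(weights, candidates[move]))


# Hypothetical two-feature example: [own stones in atari, own liberties].
weights = [-1.0, 0.5]
candidates = {"a": [2.0, 0.0], "b": [0.0, 1.0]}
best = pick_move(weights, candidates)

# One learning step from all-zero weights toward a successor valued at 1.0.
updated = td0_update([0.0, 0.0], [1.0, 2.0], next_value=1.0, alpha=0.1)
```

Because each update pulls a position's value toward the value of the
position that followed it, information from the end of the game gradually
propagates back to earlier positions, which is where the
look-ahead-like effect would come from.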

As I said, this is a raw idea, not tested, and probably has a fatal flaw in
it. If I have the time and energy, I will put it to a test some day...
Until then, I would appreciate it if someone could shoot it down, so I can
save the trouble of trying. Or any other comments...

- Heikki

-- 
Heikki Levanto     LSD Levanto Software Development   heikki@xxxxxxxxxxxxxxxxx
               "In Murphy we Turst"