RE: An AI program that doesn't learn

To: Peter.Smith@xxxxxxxxxxxxxxxx
Subject: RE: An AI program that doesn't learn
From: Mika Kojo <mkojo@xxxxxx>
Date: Wed, 30 Jun 1999 17:13:54 +0300 (EET DST)
Cc: computer-go@xxxxxx
Delivered-to: computer-go@xxxxxxxxxxxxxxxxx
In-reply-to: <6FDAD424E81FD211BFAB00A0C9DB2DDA69302E@xxxxxxxxxxxxxxxxx>
Organization: SSH Communications Security, Finland
References: <6FDAD424E81FD211BFAB00A0C9DB2DDA69302E@xxxxxxxxxxxxxxxxx>
Sender: computer-go-owner@xxxxxx

Peter.Smith@xxxxxxxxxxxxxxxxx writes:

> If the program is tuning itself against a set of problems then again it's
> not interesting for the program to keep history information for a board
> position (it won't have any when it is being trained.)

Yes, in supervised learning it is not necessary to store the previous
states. Many reinforcement learning methods, on the other hand, do work
with history information.

> Where this would apply is if the program is itself learning by playing
> matches against humans or better still other programs.  I have the gut
> feeling this approach is simply going to take too long to be viable. 

Approaches based on temporal difference (TD) learning use previous
positions, and they support your point: it takes a long time to learn
to play Go.

I believe that TD methods work nicely for recorded expert games, and
might perform better than the usual "strict" supervised learning in
many situations. In the TD approach storing the previous positions is
often necessary, and knowing them is always necessary.
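
To make the role of the previous position concrete, here is a minimal
sketch of a TD(0) pass over one recorded game. It is only an
illustration under my own assumptions: a linear evaluation over
hand-made feature vectors, a game result coded as 1.0 for a win and
0.0 for a loss, and a function name (td0_update_from_game) that is
hypothetical, not taken from any existing program.

```python
def td0_update_from_game(weights, positions, outcome, alpha=0.1):
    """One TD(0) pass over a recorded game (illustrative sketch).

    weights   -- list of floats for a linear evaluation (assumed model)
    positions -- feature vectors of the game's positions, in move order
    outcome   -- final result, e.g. 1.0 for a win, 0.0 for a loss
    alpha     -- learning rate (assumed value)
    """
    weights = list(weights)  # work on a copy, leave the caller's list alone

    def value(features):
        # Linear evaluation: dot product of weights and features.
        return sum(w * f for w, f in zip(weights, features))

    for t, prev in enumerate(positions):
        # The target for the earlier position is the evaluation of its
        # successor, or the actual game result at the last position.
        if t + 1 < len(positions):
            target = value(positions[t + 1])
        else:
            target = outcome
        error = target - value(prev)
        # The update is applied to the *previous* position's features,
        # which is why that position has to be known (or stored).
        for i, f in enumerate(prev):
            weights[i] += alpha * error * f
    return weights


game = [[1.0, 0.0], [0.0, 1.0]]  # two toy positions
w = td0_update_from_game([0.0, 0.0], game, outcome=1.0, alpha=0.5)
w = td0_update_from_game(w, game, outcome=1.0, alpha=0.5)
```

After the first pass only the last position has moved toward the
result; the second pass is what carries that information back to the
earlier position.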

However, there is still a place for supervised learning, as TD methods
demand some care with the input data.

Mika Kojo
