Ah yes, this sounds similar to the kind of network you get with
cascade correlation. In my experience such an architecture easily
overfits the training data, leading to poor generalization.
For a moment I thought you were doing something pseudo-recurrent,
with time-delayed inputs (which could be derived from the internal
states of the net at previous board positions). :-)
Maybe you can compare some different architectures for your next draft?
Best,
Erik
Peter Drake wrote:
On Monday, October 27, 2003, at 01:28 AM, Erik van der Werf wrote:
Interesting. What do you mean by: "each hidden unit also receives
information from all previous hidden units".
Exactly that. Suppose the network has three input units A, B, and C,
three hidden units D, E, and F, and three output units G, H, and I.
Each unit has incoming connections as follows:
D: ABC
E: ABCD
F: ABCDE
G: DEF
H: DEF
I: DEF
The intent was to avoid any decisions about how many hidden layers to
have, how big to make them, etc. Any arrangement of hidden layers is
a special case of this architecture: zeroing out the right connections
recovers any layered topology.
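In Python, the forward pass would look something like this (a rough
sketch only, not our actual code; I'm assuming tanh units and a
trailing bias weight on each unit, which the paper doesn't specify):

import numpy as np

def forward(x, hidden_weights, output_weights):
    # Activations accumulate in order: A, B, C, then D, E, F as computed.
    activations = list(x)
    for w in hidden_weights:                      # hidden units D, E, F in turn
        # Each hidden unit sees all inputs AND all earlier hidden units.
        z = np.dot(w[:-1], activations) + w[-1]   # last weight is the bias
        activations.append(np.tanh(z))
    hidden = np.array(activations[len(x):])       # D, E, F only
    # Output units G, H, I see only the hidden units.
    return np.tanh(output_weights[:, :-1] @ hidden + output_weights[:, -1])

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 3, 3
# Hidden unit k has n_in + k incoming connections, plus a bias.
hidden_weights = [rng.normal(size=n_in + k + 1) for k in range(n_hidden)]
output_weights = rng.normal(size=(n_out, n_hidden + 1))
print(forward(np.array([1.0, 0.0, -1.0]), hidden_weights, output_weights))

Note that the hidden units must be evaluated strictly in order, since
each one depends on the outputs of all its predecessors.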
In retrospect, the philosophy of making the architecture as general
as possible and leaving the details up to the backpropagation
algorithm may not have been wise. Our next draft will have far more
structure.
Peter Drake
Assistant Professor of Computer Science
Lewis & Clark College
http://www.lclark.edu/~drake/
_______________________________________________
computer-go mailing list
computer-go@xxxxxxxxxxxxxxxxx
http://computer-go.org/mailman/listinfo/computer-go