AN EXTENDED CASCADE CORRELATION
NEURAL NETWORK
YULEI BAI
Bachelor of Science
Northwest University
Xi'an, P. R. China
1985
Master of Science
Research Institute of
Petroleum Exploration and Development
Beijing, P. R. China
1989
Submitted to the Faculty of the
Graduate College of the
Oklahoma State University
in partial fulfillment of
the requirement for
the Degree of
MASTER OF SCIENCE
May, 2002
AN EXTENDED CASCADE CORRELATION
NEURAL NETWORK
Thesis Approved:
Thesis Adviser
ACKNOWLEDGEMENTS
I wish to express my sincere gratitude to my thesis advisor, Dr. John P. Chandler,
for his constructive guidance in choosing the topic of this thesis, constant inspiration, and
valuable time throughout my thesis work. My sincere appreciation also extends to other
committee members, Dr. George E. Hedrick and Dr. Debao Chen, for their helpful
advisement, suggestions and valuable time.
In addition, I would like to give my special appreciation to my wife, Qiaoling Li,
for her strong encouragement, support and understanding throughout my study in
Oklahoma State University. Thanks also go to my parents for their love and
encouragement.
Finally, I would like to thank the Department of Computer Science for supporting
me during the time of my study.
TABLE OF CONTENTS
Chapter
1. INTRODUCTION
1.1 Multi-Layer Feedforward Neural Networks
1.2 Optimization of Network Architecture
1.3 The Purpose of the Paper
2. AN EXTENDED CASCOR NETWORK
2.1 The Cascade-Correlation Architecture (CasCor)
2.2 Extension of CasCor Network Architecture
2.2.1 Addition of Nodes
2.2.2 Connections
2.3 Learning Algorithm and Implementation
2.3.1 Objective Functions
2.3.2 Learning Algorithm
2.3.3 Implementation
3. TEST RESULTS ON REGRESSION PROBLEMS
3.1 Setup
3.2 Regression Problems
3.3 Test Results and Comparisons
3.3.1 Test Results on Group-I Problems
3.3.2 Test Results on Group-II Problems
4. CONCLUSIONS AND FUTURE WORK
4.1 Conclusions
4.2 Recommendation for Future Work
REFERENCES
APPENDICES
Appendix A. Error Measures
Appendix B. Tables of Test Results
Appendix C. Program Source Code
LIST OF TABLES
Table
2-1: Growth rates of architectural parameters with number of hidden nodes
3-1: Random seeds used in initialization of weights
B-1: Test results on Group-I for the CasCor network
B-2: Test results on Group-I for the XCAS network
B-3: Test results on Group-II for the CasCor network
B-4: Test results on Group-II for the XCAS network
LIST OF FIGURES
Figure
1.1: An artificial neuron (a) and a multilayer feedforward neural network (b)
2.1: Diagram showing all connections in a CasCor network
2.2: The CasCor network architecture
2.3: Strictly layered cascaded architecture
2.4: The proposed network architecture (XCAS)
3.1: Average number of hidden nodes (Group-I)
3.2: Average total number of weights (Group-I)
3.3: Average squared error percentage on the test set (Group-I)
3.4: Total number of weights for the best-run network (Group-I)
3.5: Squared error percentage on the test set for the best-run network (Group-I)
3.6: Average number of hidden nodes (Group-II)
3.7: Average total number of weights (Group-II)
3.8: Average squared error percentage on the test set (Group-II)
3.9: Total number of weights for the best-run network (Group-II)
3.10: Squared error percentage on the test set for the best-run network (Group-II)
Chapter 1. INTRODUCTION
1.1 Multi-Layer Feedforward Neural Networks
Multilayer feedforward networks, usually called Multi-Layer Perceptrons or MLPs,
are the most common form of artificial neural networks (ANNs). Typically an MLP
consists of several hidden layers of fully connected artificial neurons (Figure 1.1). Each
neuron (or node) has a bias input and an activation function associated with it.
The connections between neurons are represented by weights. The output of each
neuron is the output of the activation function, which takes as its input the weighted sum
of all inputs to the neuron plus a weighted bias. The outputs of the neurons in one layer
are all linked to each of the neurons in the next layer, except for the output layer of the
network.
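As a minimal sketch of the computation just described, the following Python fragment evaluates one neuron and one fully connected layer. The function names and the default choice of tanh as activation are illustrative assumptions, not part of the thesis implementation.

```python
import math


def neuron_output(inputs, weights, bias_weight, activation=math.tanh):
    """Output of one neuron: activation(weighted sum of inputs + weighted bias)."""
    # The bias input is a constant +1, so it contributes exactly bias_weight.
    net = sum(w * x for w, x in zip(weights, inputs)) + bias_weight
    return activation(net)


def layer_output(inputs, weight_rows, bias_weights, activation=math.tanh):
    """Outputs of a fully connected layer: every neuron sees every input."""
    return [neuron_output(inputs, w, b, activation)
            for w, b in zip(weight_rows, bias_weights)]
```

Feeding the list returned by `layer_output` into the next layer gives the layer-by-layer forward pass.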
An MLP is capable of learning from examples. When input signals are fed in, the
computation of the network is carried out on a layer-by-layer basis until the outputs of the
network have been produced (forward pass). The result from this forward pass is
compared with the desired output and error estimates are computed for the output nodes,
which are then used to adjust the weights.
This process repeats until the network goes through all the training examples. After this
stage, the network is used to compute output for other unseen example inputs based on its
generalization of the training examples.
ANNs are especially useful for solving problems whose underlying rules are unknown or
difficult to represent explicitly. Another important feature of ANNs is that a
network with the same architecture can be used to solve different kinds of problems.
ANNs are very popular in, but not limited to, the domains of pattern classification and
function approximation.
[Figure 1.1 labels, panel (a): Input, Weight, Weighted Sum, Activation Function, Output]
Figure 1.1: An artificial neuron (a) and a multilayer feedforward neural network (b)
1.2 Optimization of Network Architecture
Apart fiom the problems of choosing the training algorithm, the first major decision is
to determine the optimal architecture fbr the given problem befbre training begins. The
architectural factors include the numbers of hidden layers and the numbers of hidden
nodes in these layers.
It is known that a two-layer MLP network (one hidden layer) with enough
hidden units can approximate any continuous function to any degree of accuracy [4]
[7] [9]. However, there is no theoretical solution for determining how many hidden units are
sufficient for a given problem. If the number of nodes is too small, the network may fail to
generalize well between different inputs (underfitting). On the other hand, if the number
of nodes and layers is too large, the network may be very slow in training and may
approximate the training data too closely (overfitting, i.e. fitting the noise exactly).
Poor generalization is partly due to the insufficient representational capacity of the
network (because of the finite size of the network used) and partly due to insufficient
information about the target function because of a finite number of examples [16]. In the
case of underfitting, the network cannot learn the correct representation for the given
problem. In overfitting, the network tends to memorize the details of all examples seen
and is unlikely to perform well when novel or noisy examples are presented. Both
underfitting and overfitting reflect a mismatch in complexity between the problem
being solved and the network used [11] [22].
It is reasonable to expect that the performance of an MLP network with a fixed
architecture will vary from problem to problem. In other words, an optimal architecture
found for one problem may not be optimal for another problem. It is desirable for a
network to have the ability to adjust its architecture towards an optimal structure during
learning.
There are various approaches to network architecture selection [18].
Ad hoc or trial-and-error methods are based on past experience and knowledge of the
problem. Several networks of different architectures are experimented with and the
results are examined. Only the network that gives the best results is selected. This
approach is useful when a priori knowledge of the problem is available, but it is not so
useful in most cases where neural networks are used. Moreover, the architecture
of the net cannot be modified during training. An alternative is to find the optimal
structure of a neural network by using Genetic Algorithms [23] or other methods [3].
These methods tend to be computationally intensive before the optimal architecture is found.
Another popular approach is to use Dynamic Learning Algorithms for optimization of the
network architecture.
Dynamic Learning Algorithms are characterized by growing (or constructive,
additive) and/or pruning (or destructive, subtractive) processes that automatically modify
network topology during learning from a set of examples.
In constructive algorithms [2] [10], an initial network is developed with a small
number of hidden nodes, and more nodes can be added during training in order to
produce more accurate results (growing the network to minimize the error). The final
topology and size of the network are dynamically determined by the algorithm and are a
function of the set of examples and of the learning parameters. Constructive algorithms
have the inherent advantage of rapid training in addition to finding both the architecture
and weights. The potential disadvantage is that they may create overcomplex networks.
Although only a few constructive algorithms have actually been used in real applications,
the Cascade-Correlation network (CasCor) proposed by Fahlman [6] and its variants have
been successfully used in many applications and are implemented in most large neural
network simulator programs [20].
Pruning algorithms take the opposite strategy to growing algorithms. They start with
a more complicated network and eliminate weights and nodes based on the analysis of
relevance between weights [8] [12] [15] [17] [21] in order to reduce the complexity of the
network.
The constructive approaches have several advantages over pruning algorithms [10]:
1) It is straightforward to specify an initial network of small size. For
pruning, one does not know in practice how big the initial network should be.
2) Constructive algorithms always search for small network solutions first, and
thus tend to be computationally economical and find a smaller network than a
pruning algorithm in which the majority of training time is spent on larger
networks than necessary.
3) The pruning process may introduce larger errors, especially when many weights are to
be pruned. Smaller weights may have an important impact on generalization [1].
In this paper, constructive algorithms are used to deal with the modification of the
network architecture during training.
1.3 The Purpose of the Paper
The aim of this paper is to propose an extended Cascade-Correlation network
(CasCor) that can be trained by constructive algorithms. The Cascade-Correlation
network builds the net by adding nodes in one dimension. The proposed network
architecture (XCAS) is based on the CasCor network but allows addition of nodes in two
dimensions.
The goal of the new XCAS network is to reduce the number of weights needed to solve
the given problems compared with the CasCor network. The network topology and
weights will be automatically determined during training. The CasCor network and the
proposed network, as well as the learning algorithm, will be introduced and described in
Chapter 2. Simulation results on regression problems, and comparisons between CasCor
and XCAS networks will be presented in Chapter 3.
Chapter 2. AN EXTENDED CASCOR NETWORK
In this chapter, we first describe the features of the CasCor network, then propose an
extended CasCor network architecture (XCAS), and finally introduce the training algorithm
used in both the CasCor and XCAS networks.
2.1 The Cascade-Correlation Architecture (CasCor)
The Cascade-Correlation network [6] is characterized by the cascade architecture in
which hidden nodes are added to the net one at a time and each node added becomes a
one-unit layer of the net (Figure 2.1). Another feature is weight freezing: once a new
hidden node has been added to the net, its incoming connection weights are frozen and do
not change in later training. The training algorithm that creates and installs the new hidden
nodes will be described in a later section.
Every output unit receives connections from a bias unit, all original inputs and all
hidden nodes with corresponding adjustable weights. The bias unit provides a constant
value of +1. Every hidden unit receives connections from the bias unit, all original inputs
and all previously existing hidden nodes (Figure 2.1). Output units may employ linear or
nonlinear activation functions depending on the problem.
As illustrated in Figure 2.2, CasCor begins with no hidden nodes. After the output
connections have been trained, new hidden nodes can be added to the network one by one.
The hidden node's input weights are frozen at the time the node is added to the net;
only the output connections are trained repeatedly. The cycle of adding a node repeats until
certain stop criteria are met. Thus the network topology and weights are determined
automatically during the training.
[Figure 2.1 labels: Bias (+1), In1, In2, weights w1.0, w1.1, w1.2, Hidden Node 2, Out1, Out2]
Figure 2.1: Diagram showing all connections in a CasCor network with two input units, two output
units and two hidden units. For clarity of showing addition of hidden nodes (see Figure
2.2), several connections were lumped into one in the diagram drawn by Fahlman
(1990).
[Figure 2.2 panels: Initial State, No Hidden Units; Add Unit 1; Add Unit 2. Each panel shows Inputs and Outputs.]
Figure 2.2: The CasCor network architecture. The vertical lines sum all incoming activation. Boxed
connections are frozen, output connections (filled circles) are trained repeatedly (after
Fahlman, 1990).
The CasCor network has two advantages over classic MLPs:
1) No need to guess about network topology in advance. Network size (number of
layers and number of nodes in each layer) is automatically determined in the
course of training.
2) No need to backpropagate error signals through the connections of the
network. This means faster training.
However, CasCor may produce very deep networks and lead to high fan-in to the
hidden nodes. The total number of weights $N_w$ in the resultant network will be large in
this case. Weights (free or independent parameters) include all adjustable parameters that
are associated with connections.
Let N be the number of hidden nodes in the net, and let p and q be the numbers of original
inputs and outputs, which are constant for the given problem. The ith hidden node has (i - 1)
connections received from preexisting hidden nodes and (1 + p) connections from
original inputs and the bias node. Each output node has (N + p + 1) connections, so we
have:
$$N_w = \sum_{i=1}^{N} (i - 1 + 1 + p) + q(N + p + 1) = \frac{N(N-1)}{2} + N(p + 1) + q(N + p + 1)$$
This indicates that the total number of weights is $\Theta(N^2)$, where N is the number of hidden
nodes ultimately needed to solve the problem. It is obvious that both the depth of the net
(or propagation delay) and the maximum fan-in of the hidden nodes are $\Theta(N)$.
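The count can be checked numerically. The sketch below enumerates the incoming connections exactly as described in the text (i - 1 from earlier hidden nodes, p + 1 from inputs and bias, N + p + 1 per output node) and compares the result with the closed form N(N-1)/2 + N(p+1) + q(N+p+1) that follows from summing them. The function names are illustrative.

```python
def cascor_weight_count(n_hidden, p, q):
    """Count CasCor weights by enumerating incoming connections."""
    # i-th hidden node: (i - 1) earlier hidden nodes + p inputs + 1 bias
    hidden = sum((i - 1) + (p + 1) for i in range(1, n_hidden + 1))
    # each of the q output nodes: all hidden nodes + p inputs + 1 bias
    output = q * (n_hidden + p + 1)
    return hidden + output


def cascor_weight_closed_form(n_hidden, p, q):
    """Closed form of the same count; N(N-1) is always even."""
    return (n_hidden * (n_hidden - 1) // 2
            + n_hidden * (p + 1)
            + q * (n_hidden + p + 1))
```

For the network of Figure 2.1 (two inputs, two outputs, two hidden units) both functions give 17 weights.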
In order to minimize the network depth and the fan-in of the hidden nodes, one
approach proposed in [19] is to generate a strictly layered structure in which each layer
has the same fixed number of connections (Figure 2.3). However, their test results
showed that the number of weights needed is close to or larger than that of the CasCor network
for the same problem. Another problem with their modified architecture is that the
number of nodes in each layer must be set before training and can only be evaluated via
heuristics and trial and error on the problem at hand. This problem should not appear in
the CasCor network family.
[Figure 2.3 labels: Input Units, First Hidden Layer, Second Hidden Layer, Output Units]
Figure 2.3: Strictly layered cascaded architecture. Each hidden layer has a fixed number of nodes.
Old output units (dotted ellipses) are collapsed into the next layer (after [19]).
We take a different approach to achieve the same purpose. Our network architecture,
described in the next section, is designed to reduce not only the total number of weights but also
the network depth and the fan-in of the hidden nodes.
2.2 Extension of CasCor Network Architecture
An important feature of the CasCor network is the way it adds the new node to the
net. In effect, each node added forms a new layer in the net and the network grows in one
dimension, so the depth of the net keeps increasing as more nodes are added.
Our idea is to allow the network to grow in two dimensions in the course of adding
nodes (Figure 2.4). This kind of architecture is an extension of the CasCor network so we
name it the XCAS network.
[Figure 2.4 labels: Input Units, Hidden Layers, Output Units]
Figure 2.4: The proposed network architecture (XCAS). There are shortcut connections from input
units to output units (shown by the arrowed line on the left side) and from hidden units to
hidden units (curved and arrowed line). Every node in the hidden layers also receives
connections from all input units (block arrow). Every output unit also receives
connections from all the hidden nodes in the network (block arrow). The number in the
circle represents the order for addition of that node. Dashed circles represent nodes to be
added later.
The network topology can be taken as an m×n matrix, where m and n are network layout
parameters defined by the user: m and n are the maximum number of layers (or rows) and the
maximum number of columns permitted, respectively. The CasCor is a special case of the
XCAS for n = 1. Addition of nodes fills an empty m×n matrix according to a certain
rule.
2.2.1 Addition of Nodes
Like the CasCor network, nodes are added to the net one at a time and their
connection weights are frozen once added. For simplicity, we denote by node(i, j) the
node in the ith row and jth column of the network layout matrix.
1) Symmetric addition:
In an n×n network layout, nodes in the ith row and ith column of the network
layout matrix are added symmetrically relative to diagonal nodes. When
node(i, k) is added, node(k, i) is the next node to be added. This process repeats
from k = 1 until k = i. After all positions in the ith row and ith column of the
matrix are occupied, nodes are added to the next row and column in the same
way when necessary.
The network will extend outwards along the diagonal of the matrix and keep as
square a shape as possible (Figure 2.4). When N nodes have been added to the net, the
depth is roughly the square root of N.
2) Row-wise or column-wise addition:
In an m×n network layout, nodes are added symmetrically first on an n'×n' submatrix
where n' = min(m, n), then added row after row if m > n, or column after
column if n > m.
The following procedure determines the order in which node(i, j) is added
given an m×n network layout. The results are stored in the two-dimensional integer arrays
neuron and neuseq.
Init_NetLayout(m, n, neuron, neuseq)
  INTEGER m, n, neuron(m, n), neuseq(m*n, 2)
  k = 1
  min_mn = MIN(m, n)
  ! Symmetric addition on the min_mn x min_mn submatrix;
  ! j runs from 1 to i, as in the rule for symmetric addition.
  DO i = 1, min_mn
    DO j = 1, i
      neuron(i, j) = k
      neuseq(k, 1) = i
      neuseq(k, 2) = j
      k = k + 1
      IF (i ≠ j) THEN
        neuron(j, i) = k
        neuseq(k, 1) = j
        neuseq(k, 2) = i
        k = k + 1
      END IF
    END DO
  END DO
  ! Column-wise addition when n > m
  IF (n > min_mn) THEN
    DO j = (min_mn + 1), n
      DO i = 1, m
        neuron(i, j) = k
        neuseq(k, 1) = i
        neuseq(k, 2) = j
        k = k + 1
      END DO
    END DO
  END IF
  ! Row-wise addition when m > n
  IF (m > min_mn) THEN
    DO i = (min_mn + 1), m
      DO j = 1, n
        neuron(i, j) = k
        neuseq(k, 1) = i
        neuseq(k, 2) = j
        k = k + 1
      END DO
    END DO
  END IF
END Init_NetLayout
The row index and column index for the kth node to be added are given by neuseq( k,
1) and neuseq( k, 2) respectively. The node( i, j) will be the kth node to be added for k =
neuron( i, j).
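The ordering rule can also be sketched in Python. This is an illustrative translation, not the thesis code: it keeps the 1-based (row, column) indices, stores `neuron` as a dictionary and `neuseq` as a list, and assumes that the inner loop of the symmetric phase runs over k = 1, ..., i, as stated in the description of symmetric addition.

```python
def init_net_layout(m, n):
    """Return (neuron, neuseq) for an m x n layout.

    neuron[(i, j)] = k  means node(i, j) is the k-th node added (k starts at 1)
    neuseq[k - 1] = (i, j) gives the position of the k-th node
    """
    neuron, neuseq = {}, []

    def add(i, j):
        neuseq.append((i, j))
        neuron[(i, j)] = len(neuseq)

    s = min(m, n)
    # Symmetric addition on the s x s submatrix.
    for i in range(1, s + 1):
        for j in range(1, i + 1):
            add(i, j)
            if i != j:
                add(j, i)
    # Column after column if n > m.
    for j in range(s + 1, n + 1):
        for i in range(1, m + 1):
            add(i, j)
    # Row after row if m > n.
    for i in range(s + 1, m + 1):
        for j in range(1, n + 1):
            add(i, j)
    return neuron, neuseq
```

For a 3×3 layout this yields the order (1,1), (2,1), (1,2), (2,2), (3,1), (1,3), (3,2), (2,3), (3,3), so the occupied region stays as square as possible, and with n = 1 it reduces to the CasCor order of one node per layer.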
There may be many strategies to explore for addition of hidden nodes in the XCAS
network. We only discuss the schemes described above in this paper.
2.2.2 Connections
Each node in the net can receive forward pass connections from the previous adjacent
layer, shortcut connections from hidden nodes and from original inputs. Nodes in the first
layer receive connections only from original inputs.
In the XCAS network, node(i, j) receives connections from:
1) node(i-1, k) for k = 1 to j;
These are nodes from the previous adjacent layer but with column index ≤ j;
number of forward pass connections = j; i > 1;
2) node(k, j) for k = 1 to i-2;
These are nodes from previous layers but in the same column as node(i, j);
number of shortcut connections from hidden nodes = i-2;
3) shortcut connections from original inputs;
number of shortcut connections from original inputs = number of inputs p;
Each node in the same column has the same pattern of connections as that of the
CasCor network, but receives more connections from adjacent layers (constant 1 for
CasCor). In the actual implementation, shortcut connections from hidden nodes and/or
original inputs can be enabled or disabled by the user, so XCAS is flexible for
experimenting with different connection schemes with the same network layout.
For node(i, j), the number of weights $N_w(i, j)$ can be expressed as:
$N_w(i, j) = p + 1$ for i = 1;
$N_w(i, j) = j + p + 1$ for i = 2;
$N_w(i, j) = i + j + p - 1$ for i > 2;
where p is the number of original inputs.
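Now we can calculate the total number of weights $N_w$, assuming that the final network
layout is n×n with all connections enabled as described above.
The total number of hidden nodes is $N = n^2$ in this case. Let $M_1$, $M_2$, $M_3$ be the total
numbers of forward pass connections, shortcut connections from hidden nodes and from
original inputs (including the bias) respectively. We have:
$$M_1 = \sum_{i=2}^{n} \sum_{j=1}^{n} j = \frac{1}{2} n (n^2 - 1) \quad \text{for } i > 1$$
$$M_2 = \sum_{i=3}^{n} \sum_{j=1}^{n} (i - 2) = \frac{1}{2} n (n - 1)(n - 2) \quad \text{for } i > 2$$
$$M_3 = n^2 (p + 1)$$
The total number of weights for output nodes = $q(1 + p + n^2)$. So
$$N_w = M_1 + M_2 + M_3 + q(1 + p + n^2) = n^3 + \left(p + q - \tfrac{1}{2}\right) n^2 + \tfrac{1}{2} n + q(p + 1)$$
This indicates that the total number of weights is $\Theta(N^{3/2})$.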
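As a rough check on this count, the sketch below sums the per-node fan-in over a full n×n layout and compares it with a closed form built from the three connection types. It assumes that every hidden node has p + 1 connections from the original inputs and the bias, which is what the per-node formula for i = 1 implies. The function names are illustrative.

```python
def xcas_node_fan_in(i, j, p):
    """Incoming weights of node(i, j) with all connections enabled."""
    forward = j if i > 1 else 0          # node(i-1, 1..j)
    shortcut_hidden = max(i - 2, 0)      # node(1..i-2, j)
    return forward + shortcut_hidden + p + 1   # plus p inputs and the bias


def xcas_weight_count(n, p, q):
    """Total weights of a full n x n XCAS network with q output nodes."""
    hidden = sum(xcas_node_fan_in(i, j, p)
                 for i in range(1, n + 1) for j in range(1, n + 1))
    return hidden + q * (1 + p + n * n)


def xcas_weight_closed_form(n, p, q):
    m1 = n * (n * n - 1) // 2            # forward pass connections
    m2 = n * (n - 1) * (n - 2) // 2      # shortcuts from hidden nodes
    m3 = n * n * (p + 1)                 # inputs and bias
    return m1 + m2 + m3 + q * (1 + p + n * n)
```

With p = 2 and q = 1, a 4×4 XCAS layout (16 hidden nodes) has 109 weights, whereas the CasCor count N(N-1)/2 + N(p+1) + q(N+p+1) gives 187 for the same number of hidden nodes.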
In comparison with the CasCor network, as seen in Table 2-1, the total number of
weights, depth of the net and maximum fan-in of the hidden nodes are reduced
significantly and also grow much more slowly when the number of hidden nodes becomes large.
Table 2-1: Growth rates of architectural parameters with number of hidden nodes
PARAMETERS                            CasCor           XCAS
Depth of the network                  $\Theta(N)$      $\Theta(N^{1/2})$
Maximum fan-in of the hidden nodes    $\Theta(N)$      $\Theta(N^{1/2})$
Total number of weights               $\Theta(N^2)$    $\Theta(N^{3/2})$
2.3 Learning Algorithm and Implementation
The learning algorithm used in the CasCor network is also suitable for the XCAS network,
although the XCAS network has a different architecture from CasCor. Moreover,
better and unbiased comparisons can be made between CasCor and XCAS if we employ
the same algorithm for the experiments. For clarity, we describe here the main features of
the CasCor learning algorithm in pseudocode.
2.3.1 Objective Functions
In the CasCor learning algorithm, the training cycle is divided into two phases: input
training and output training.
Input training trains the input weights of the candidate node by maximizing S, which is
the covariance of the candidate's activation and the residual error developed before the
candidate is added to the net:
$$S = \sum_{o} \left| \sum_{p} (C_p - \bar{C})(E_{p,o} - \bar{E}_o) \right|$$
where $C_p$ is the activation of the candidate for pattern p, $\bar{C}$ is the activation of the
candidate averaged over the set of all training patterns, $E_{p,o}$ is the residual error observed
for pattern p at output unit o, and $\bar{E}_o$ is the average linear deviation at output unit o.
The partial derivative of S with respect to $w_j$ is given by
$$\frac{\partial S}{\partial w_j} = \sum_{o} \sum_{p} \sigma_o (E_{p,o} - \bar{E}_o) f'_p \, in_{j,p}$$
where $\sigma_o$ is the sign of the covariance for the candidate at output unit o, $f'_p$ is the
derivative for pattern p of the candidate's activation function with respect to its sum of
inputs, and $in_{j,p}$ is the input to the candidate node for pattern p that is associated with
$w_j$.
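A minimal pure-Python sketch of the two quantities follows, assuming the standard Cascade-Correlation form of S (the sum over output units of the magnitude of the covariance between candidate activation and residual error). The function names and the list-of-lists data layout are illustrative assumptions.

```python
def covariance_score(cand_act, residual):
    """Return (S, signs): S = sum_o |sum_p (C_p - mean C)(E_po - mean E_o)|.

    cand_act : candidate activations C_p, one per pattern
    residual : residual[p][o] = E_po
    signs[o] : sign of the covariance at output unit o
    """
    n_pat, n_out = len(cand_act), len(residual[0])
    c_mean = sum(cand_act) / n_pat
    s_total, signs = 0.0, []
    for o in range(n_out):
        e_mean = sum(row[o] for row in residual) / n_pat
        cov = sum((c - c_mean) * (row[o] - e_mean)
                  for c, row in zip(cand_act, residual))
        signs.append(1.0 if cov >= 0 else -1.0)
        s_total += abs(cov)
    return s_total, signs


def covariance_gradient(inputs, fprime, residual, signs):
    """dS/dw_j = sum_p sum_o sign_o (E_po - mean E_o) f'_p in_jp.

    inputs[p][j] : input j of the candidate for pattern p
    fprime[p]    : derivative of the candidate activation for pattern p
    """
    n_pat, n_out, n_in = len(inputs), len(signs), len(inputs[0])
    e_mean = [sum(row[o] for row in residual) / n_pat for o in range(n_out)]
    grad = [0.0] * n_in
    for p in range(n_pat):
        common = fprime[p] * sum(signs[o] * (residual[p][o] - e_mean[o])
                                 for o in range(n_out))
        for j in range(n_in):
            grad[j] += common * inputs[p][j]
    return grad
```

The mean activation does not appear in the gradient because the centered residuals sum to zero over the patterns, so its contribution cancels.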
During the input training, there are no real connections from the candidate to output
nodes because the candidate does not pass its activation to output nodes at this time. For
this reason, a pool of several candidates can be trained independently at the same time
and only the best one will be selected into the network. The covariance developed by a
candidate during training depends on the random initialization of its input weights. The
use of a pool of candidates thus greatly reduces the chance that a bad candidate caused by
bad weight initialization will be added to the network.
Output training trains the weights of the output nodes by minimizing the squared error
function E. E and its derivatives with respect to the weights of output nodes are
$$E = \sum_{p} \sum_{o} (y_{p,o} - t_{p,o})^2$$
$$\frac{\partial E}{\partial w_{i,o}} = 2 \sum_{p} (y_{p,o} - t_{p,o}) f'_{p,o} \, in_{i,p}$$
where $y_{p,o}$ is the actual net output at output unit o for pattern p, $t_{p,o}$ is the corresponding
target value, $f'_{p,o}$ is the derivative for pattern p of the output node's activation function
with respect to its sum of inputs, and $in_{i,p}$ is the input to the output node for pattern p
that is associated with $w_{i,o}$. For regression problems, output nodes usually use the identity
activation function such that $f'_{p,o} = 1$, so computation of the derivatives is simpler.
2.3.2 Learning Algorithm
The training starts with no hidden nodes, so output training is performed first. If the
stopping criteria are met, training ends with a network without any hidden nodes; otherwise,
hidden nodes are trained and added to the network until the criteria are satisfied.
The main training procedure works as follows:
Train_Net
  FirstTime = TRUE
  REPEAT
    IF (NOT FirstTime) THEN
      Input_Training
    END IF
    FirstTime = FALSE
    Output_Training
  UNTIL (END_TRAIN_NET)
The termination criteria END_TRAIN_NET can be:
(i) Error measure value ≤ Error tolerance ε
(ii) Number of hidden nodes > Permitted number of hidden nodes
The error tolerance ε and the permitted number of hidden nodes are set by the user.
The training stops when either of conditions (i) and (ii) is met.
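The control flow of the main procedure can be sketched as follows. The three callables are hypothetical hooks standing in for the training phases and the error measure, and the sketch assumes that the permitted number of hidden nodes acts as an inclusive cap, which is one possible reading of condition (ii).

```python
def train_net(output_training, input_training, error_measure,
              error_tolerance, max_hidden):
    """Alternate output and input training until a stop criterion holds.

    output_training() : trains the output weights (hypothetical hook)
    input_training()  : trains a candidate pool and installs the best one
    error_measure()   : returns the current error measure value
    Returns the number of hidden nodes installed.
    """
    n_hidden = 0
    output_training()                      # first pass: no hidden nodes yet
    while error_measure() > error_tolerance and n_hidden < max_hidden:
        input_training()                   # adds exactly one hidden node
        n_hidden += 1
        output_training()                  # retrain the output weights
    return n_hidden
```

If the first output training already reaches the tolerance, the loop body never runs and the result is a network without hidden nodes.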
The error measure we used here can be squared error (SQE), mean squared error
(MSE), square root of mean squared error (RMSE), normalized mean squared error
(NMSE), squared error percentage (SQEP) or error index (EIDX). All those measures are
based on squared error. Their definitions are listed in Appendix A.
The input training works as follows:
Input_Training
  Initialize_Candidates
  Evaluate_CovarianceS
  REPEAT
    Compute_Derivatives
    Update_Candidate_Weights
    Evaluate_CovarianceS
  UNTIL (END_INPUT_TRAINING)
The termination criteria END_INPUT_TRAINING include:
(i) Epochs trained > Permitted maximum epochs
(ii) Change rate in covariance values ≤ Change threshold ε
Input training stops when either of the above conditions is satisfied.
An epoch is defined as one pass through the entire set of training examples. Condition
(ii) means stagnation of input training, where the highest covariance value
produced by any of the candidates has not changed by more than a threshold value ε in
the last k epochs. So there are three parameters selected by the user for the termination of
input training: permitted maximum epochs, input change threshold ε and patience
parameter k.
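One possible way to test for stagnation is sketched below. It assumes that the change is measured relative to the value recorded just before the patience window; the text does not specify whether the change is relative or absolute, so this is only one reading, and the function name is illustrative.

```python
def stagnated(history, patience, threshold):
    """True if the score stayed within `threshold` (relative) for `patience` epochs.

    history   : best covariance (or error) value recorded after each epoch
    patience  : number of most recent epochs to inspect (k in the text)
    threshold : maximum relative change that still counts as no progress
    """
    if len(history) <= patience:
        return False                       # not enough epochs to judge
    reference = history[-patience - 1]     # value just before the window
    scale = abs(reference) if reference != 0 else 1.0
    return all(abs(v - reference) <= threshold * scale
               for v in history[-patience:])
```

The same check can serve output training by passing the error history and the output change threshold instead.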
Initialize_Candidates: randomly initializes the weights of all candidates. The
number of candidates used is set by the user.
Evaluate_CovarianceS: computes values of the objective function S, i.e.
covariance values for each candidate, and the signs of the covariance
values that will be used in computation of derivatives.
Compute_Derivatives: computes derivatives with respect to the weights of
candidates.
Update_Candidate_Weights: updates the weights of the candidates in order to
maximize covariance.
After the termination of input training, the candidate with the highest covariance
value is installed into the network and connected to the output nodes. The rest of the
candidates are discarded. The weights of the output nodes associated with the newly
installed node are also initialized with small values, the sign of which is the inverse of the
sign of the covariance with the respective output unit.
The output training is similar to input training. In this phase, all the weights of hidden
nodes currently in the network are frozen. A backward propagation of error through the
hidden nodes is unnecessary, so output training is just like training a network without any
hidden nodes but with additional input units.
Output_Training
  Compute_Error_Derivatives
  REPEAT
    Update_Output_Weights
    Compute_Error_Derivatives
  UNTIL (END_OUTPUT_TRAINING)
The termination criteria END_OUTPUT_TRAINING are similar to those in input
training:
(i) Error measure value < Error tolerance ε
(ii) Epochs trained > Permitted maximum epochs
(iii) Change rate in error ≤ Change threshold ε'
Three user-selectable parameters for output training termination are: permitted
maximum epochs, patience parameter k and output change threshold ε'.
Compute_Error_Derivatives: computes residual error and the derivatives with
respect to the weights of the output nodes.
Update_Output_Weights: updates the weights of the output nodes in order to
minimize squared error.
2.3.3 Implementation
The learning algorithm for the XCAS network, as described in the last section, has
been implemented in FORTRAN 77.
When dealing with the learning algorithm, we did not mention a specific algorithm
for updating weights. In fact, any existing algorithms for updating weights should work
in our network. What most concerns us is whether the XCAS network can learn or not. In
actual implementation, we used the quickprop algorithm proposed by Fahlman [24],
which was also used in CasCor network.
The activation functions used and the corresponding derivatives are
1) Identity Function
f(x) = x;
f'(x) = 1.0;
2) Asymmetrical Sigmoid Function
f(x) = 1.0/(1.0 + exp(-x));
f'(x) = f(1.0 - f);
3) Symmetrical Sigmoid Function
f(x) = 1.0/(1.0 + exp(-x)) - 0.5;
f'(x) = 0.25 - f^2;
4) Hyperbolic Tangent Sigmoid Function
f(x) = (1.0 - exp(-x))/(1.0 + exp(-x)); -1.0 < f(x) < 1.0
f'(x) = 1.0 - f^2;
All of these functions can be used for output units. The identity function is usually not
used for hidden units.
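The four activation functions can be written compactly in Python, with each derivative expressed through the function value f as in the list above. This sketch assumes the minus signs lost in this copy are restored as exp(-x). Under that assumption the fourth function equals tanh(x/2), whose exact derivative is 0.5(1 - f^2); the listed form 1 - f^2 would be exact for tanh(x) and differs here by a constant factor of 2. All function names are hypothetical.

```python
import math

def identity(x):
    return x

def identity_prime(f):
    return 1.0

def asym_sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))           # range (0, 1)

def asym_sigmoid_prime(f):
    return f * (1.0 - f)

def sym_sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x)) - 0.5     # range (-0.5, 0.5)

def sym_sigmoid_prime(f):
    return 0.25 - f * f

def tanh_sigmoid(x):
    return (1.0 - math.exp(-x)) / (1.0 + math.exp(-x))   # equals tanh(x/2)

def tanh_sigmoid_prime(f):
    # exact derivative of the expression above (assumption: exp(-x) form)
    return 0.5 * (1.0 - f * f)

def numeric_derivative(func, x, h=1e-6):
    """Central difference, used only to check the closed forms."""
    return (func(x + h) - func(x - h)) / (2.0 * h)
```

Writing the derivative in terms of f avoids a second call to exp during the backward computation, since f is already available from the forward pass.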
Chapter 3. TEST RESULTS ON REGRESSION PROBLEMS
In this chapter we describe experiments with the XCAS network and present test
results and comparisons between the CasCor and XCAS networks.
3.1 Setup
In order to get consistent test results, the common user-selectable parameters in the
training algorithm were set to identical values for the CasCor and the XCAS
network:
Permitted maximum epochs for input and output training = 100;
Patience parameter for input and output training = 8;
Number of candidates = 8;
Learning rate for input training = 0.75;
Learning rate for output training = 0.35;
Maximum growth factor for learning rate = 1.75;
Input change threshold = 0.03;
Output change threshold = 0.01;
We used the symmetrical sigmoid function with output range between -0.5 and +0.5
for all the hidden units, and the identity function for all the output units. The weights of the
candidate nodes were randomly initialized between -0.5 and +0.5. The network layout
for the XCAS was chosen as n by n (square layout) for the test, although other layouts can
be explored.
3.2 Regression Problems
We carried out experiments with the CasCor and XCAS networks on the following
five nonlinear functions, which had been used in [25]:
(1) Simple Interaction Function
f(x1, x2) = 10.391 (0.36 + (x1 - 0.4)(x2 - 0.6))
(2) Radial Function
(3) Harmonic Function
f(x1, x2) = 42.659 (0.1 + y1 (0.05 + y1^4 - 10 y1^2 y2^2 + 5 y2^4));
y1 = (x1 - 0.5); y2 = (x2 - 0.5);
(4) Additive Function
(5) Complicated Interaction Function
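The two functions whose formulas survive in this copy, the simple interaction function (1) and the harmonic function (3), can be evaluated with the short Python sketch below. The names `f1` and `f3` follow the labels F1 and F3 used in the experiments; the remaining three functions are defined in [25] and are not reproduced here.

```python
def f1(x1, x2):
    """Simple interaction function (1)."""
    return 10.391 * (0.36 + (x1 - 0.4) * (x2 - 0.6))

def f3(x1, x2):
    """Harmonic function (3), written with y1 = x1 - 0.5 and y2 = x2 - 0.5."""
    y1 = x1 - 0.5
    y2 = x2 - 0.5
    return 42.659 * (0.1 + y1 * (0.05 + y1**4 - 10.0 * y1**2 * y2**2 + 5.0 * y2**4))
```

Both functions take the two inputs on [0, 1] and return a single output, so a network for either problem has two input units and one output unit.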
For simplicity, we refer to the above functions as F1, F2, F3, F4 and F5 respectively
in simulation. All five functions have two inputs and a single output. Experiments were
also made on combinations of the five functions for the purpose of testing the networks
on problems with multiple outputs and increased complexity. The two test groups used are:

Group-I: each of the five functions is treated as an independent problem;
Group-II: the first 2, 3, 4, and 5 of the five functions are combined into four
problems with different numbers of outputs:
CF2: F1 and F2 as the outputs of the network;
CF3: F1, F2, and F3 as the outputs of the network;
CF4: F1, F2, F3 and F4 as the outputs of the network;
CF5: F1, F2, F3, F4 and F5 as the outputs of the network;
Group-II is also used to test the learning ability of the networks when several
independent problems are put together for training. If the independent problems are
similar to a great degree, we expect that the networks should be able
to take advantage of those similarities, so that the total numbers of hidden nodes and weights
used are less than the sums of those needed for independent learning.
The training and test data sets are generated on a regular grid with the same range
[0, 1] for the two inputs of the five functions. The abscissa values of the two independent
variables for the training data set are sampled as
for i = 1, 2; j = 1, 2, ..., n;
Similarly, the two inputs for the test data set are sampled as
So, the number of examples produced is n^2 for the training data set and n(n-1) for the
test set. The test set is independent of the training data set. We assumed n = 15 in our
simulation, and thus obtained 225 examples for the training data set and 210 examples for the
test set. We used the same set of input data pairs {(x1j, x2j)} for the
experiments with all five functions.
3.3 Test Results and Comparisons
The goal of our testing is to explore the learning ability of the XCAS network and
assess its performance compared with the CasCor network. We carried out 10 runs on
each problem with different random seeds for initialization of the candidates.
Table 3-1: Random seeds used in initialization of weights
Trial No.   1    2    3    4    5    6    7    8    9    10
Seed        7    48   77   173  231  378  455  571  601  737
We used the error index (EIDX) defined in Appendix A as the error measure in the
training, and the threshold value 0.1 for stopping the training for all experiments. The
final error is reported in squared error percentage (SQEP) over the set of all training
examples.
3.3.1 Test Results on Group-I Problems
The average number of hidden nodes used by the CasCor and XCAS networks
(Figure 3.1) shows that the relative complexity of the problems remains the same for both
networks; in other words, a problem that is difficult for the CasCor network to learn is
still difficult for the XCAS network. Generally, XCAS needs a few more nodes on each
problem than CasCor, but needs many more nodes to learn the harmonic function (F3),
which is the most difficult problem for both networks. The number of hidden nodes in the
final network is a good indicator of the complexity of the problem, but not a good
measure for comparisons between networks of different architectures and connections.
Figure 3.1: Average number of hidden nodes (Group-I).
[Chart titled "Test Group I"; x-axis: Regression Problems F1 to F5.]
As we have seen, the XCAS uses more nodes than the CasCor; however, it uses fewer
weights than CasCor (Figure 3.2). The reduction in the total number of
weights achieved by XCAS relative to CasCor ranges from 25.6% to 54.4%, and is at least 31% for F1, F3,
F4 and F5. In other words, XCAS needs only about 45% - 75% of the total number of
weights used by CasCor on the same problem.
The results obtained here indicate that XCAS is able to learn all of the problems tested
and is also more efficient in its usage of weights than the CasCor.
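The relation between the quoted reduction percentages and the share of weights that XCAS still needs is simple arithmetic, sketched below. The function names and the weight counts used in the usage note are hypothetical and serve only to illustrate the calculation; they are not results from the thesis.

```python
def weight_reduction_percent(w_cascor, w_xcas):
    """Percentage of weights saved by XCAS relative to CasCor."""
    return 100.0 * (w_cascor - w_xcas) / w_cascor

def remaining_percent(reduction_percent):
    """Share of the CasCor weight count that XCAS still uses."""
    return 100.0 - reduction_percent
```

For example, a hypothetical pair of 1000 CasCor weights and 744 XCAS weights gives a reduction of 25.6%, and reductions of 25.6% and 54.4% correspond to remaining shares of 74.4% and 45.6%, which is the "about 45% - 75%" range stated above.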
Figure 3.2: Average total number of weights (Group-I).
[Chart titled "Test Group I"; x-axis: Regression Problems.]
Figure 3.3: Average squared error percentage on the test set (Group-I).
[Chart titled "Test Group I"; x-axis: Regression Problems.]
The average errors yielded by CasCor on the test set (Figure 3.3) are 1.3 - 2.3 times
as large as those produced by XCAS. In other words, the generalization ability of XCAS,
measured by the performance on the test set, is 1.3 - 2.3 times better than that of
the CasCor. The results here suggest that the network with fewer weights tends
to generalize better. Both the CasCor and XCAS networks produced larger
errors on problems F3 and F5 than on other problems (see Figure 3.3).
Figure 3.4: Total number of weights for the best-run network (Group-I).
[Chart titled "Test Group I"; x-axis: Regression Problems F1 to F5.]
The total numbers of weights in the networks from the best run (Figure 3.4), i.e. the run that
gives the minimum error over the training data set, show the same trends indicated by the
averages (see Figure 3.2). This suggests that the best-run network topology is close to the
average network obtained over all runs. The testing error yielded by the best-run network
(Figure 3.5) is generally scaled down in magnitude compared with the average value (see
Figure 3.3) for both networks, but it is scaled up about 3 times on F3 for the
CasCor and 2 times on F5 for the XCAS. The XCAS performs much better than the
CasCor on the most difficult problem, F3. The best-run networks do not necessarily
produce lower testing error than the average testing error.
Figure 3.5: Squared error percentage on the test set for the best-run network (Group-I).
[Chart titled "Test Group I"; x-axis: Regression Problems F1 to F5.]
The simulation results on the test Group-I demonstrate that XCAS is able to learn all
of the problems tested, uses fewer weights than CasCor, and also has
better performance on the test set in general.
3.3.2 Test Results on Group-II Problems
The problems in Group-II have numbers of outputs varying from 2 to 5, although
they all have the same number of inputs. The average number of hidden nodes (Figure 3.6)
keeps increasing with the number of outputs. XCAS still needs a few more
nodes on each problem compared with CasCor.
It is interesting to note that the maximum number of hidden nodes used for Group-II
problems is less than 70, about 10 more nodes than that for Group-I problems (Figure
3.1). This implies that both networks are able to take advantage of the similarities
between those problems, so that the number of nodes needed for learning them together is
less than the sum of those needed for learning them independently.
Figure 3.6: Average number of hidden nodes (Group-II).
[x-axis: Regression Problems CF2 to CF5.]
As with Group-I problems, the XCAS network still needs a smaller number of weights
to learn Group-II problems (Figure 3.7). The reduction in the average number of
weights achieved by XCAS relative to CasCor ranges from 39.4% to 52.8%. In other words,
XCAS needs only about 47% - 61% of the total weights used by CasCor for the Group-II
problems. The results obtained so far confirm that XCAS is able to learn complex
problems as CasCor does, but uses many fewer weights.
Figure 3.7: Average total number of weights (Group-II).
[Chart titled "Group II"; x-axis: Regression Problems.]
Figure 3.8: Average squared error percentage on the test set (Group-II).
[Chart titled "Group II"; x-axis: Regression Problems.]
The testing errors produced by CasCor are 1.2 - 3.4 times as large as those of XCAS
(Figure 3.8), indicating that XCAS generally performs better on the test set
although it uses fewer weights.
The total numbers of weights in the best-run networks (Figure 3.9) show a similar
trend to that indicated by the averages (see Figure 3.7). The reduction in the total
number of weights achieved by XCAS relative to CasCor is 31.0% - 42.8% for the first two
problems (CF2, CF3), and at least 56.0% for the last two (CF4, CF5), indicating that the
XCAS network uses many fewer weights than the CasCor network when the complexity
of the problem increases.
Figure 3.9: Total number of weights for the best-run network (Group-II).
[Chart titled "Group II"; x-axis: Regression Problems.]
As for the best-run networks, the XCAS network still shows better performance on the test
set than CasCor in general (Figure 3.10), especially for the last two problems (CF4, CF5),
which are more difficult than the first two problems.
Figure 3.10: Squared error percentage on the test set for the best-run network (Group-II).
[x-axis: Regression Problems.]
The test results on the Group-I and Group-II problems demonstrate that the XCAS
network is able to learn all of the regression problems tested, and has better performance on
the test set than CasCor in general while using fewer weights.
Chapter 4. CONCLUSIONS AND FUTURE WORK
4.1 Conclusions
This study proposes a new network architecture (XCAS) based on an extension of the
Cascade-Correlation network (CasCor). Theoretically, the total number of weights in the
final network grows with the number of hidden nodes N as N√N for the proposed
network and as N^2 for the CasCor network. The test results on regression problems
confirm that the proposed network is able to learn all of the problems tested, and that it not only
uses fewer weights but also exhibits better performance on the test set in general
than the CasCor network. On average, the reduction in the total number of
weights achieved by XCAS relative to CasCor ranges from 25% to 55% for the corresponding
problems tested, and increases with the number of hidden nodes used.
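The two growth rates can be made concrete with a small counting sketch. The CasCor count follows directly from its architecture: each hidden node receives a bias, all inputs and all earlier hidden nodes. The XCAS count is only an upper bound under a loudly stated assumption, namely that in a square layout each hidden node receives at most a bias, all inputs, and about 2√N connections from other hidden nodes. That assumption is chosen to match the N√N growth stated above; it is not a description of the exact XCAS wiring, and both function names are hypothetical.

```python
import math

def cascor_weights(n_in, n_out, n_hidden):
    """Total weights in a CasCor net: hidden node k sees bias, inputs, k earlier nodes."""
    hidden = sum(1 + n_in + k for k in range(n_hidden))
    output = n_out * (1 + n_in + n_hidden)
    return hidden + output

def xcas_weights_bound(n_in, n_out, n_hidden):
    """Assumed upper bound for a square XCAS layout (about 2*sqrt(N) hidden inputs per node)."""
    side = math.isqrt(n_hidden)
    if side * side < n_hidden:
        side += 1                      # round the layout side up to a whole number
    hidden = n_hidden * (1 + n_in + 2 * side)
    output = n_out * (1 + n_in + n_hidden)
    return hidden + output
```

With two inputs and one output, 64 hidden nodes give 2275 weights for CasCor against an assumed bound of 1283 for the square layout; for very small networks the bound can exceed the CasCor count, which is consistent with the savings growing with the number of hidden nodes.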
4.2 Recommendation for Future Work
Further investigation could proceed in the following directions:
(1) Use direct error minimization instead of covariance maximization in training
the hidden nodes so that the network is more suitable for regression problems.
(2) Allow hidden nodes to have different types of activation functions. A network
with variable types of hidden nodes may be more flexible for varied and
complicated problems.
(3) Use arctan(x) instead of a logistic or tanh(x) activation function.
(4) Extend the two-dimensional XCAS architecture to three or more dimensions.
REFERENCES
[1] Bartlett, P. L., "For Valid Generalization the Size of the Weights is More
Important than the Size of the Network", in Advances in Neural Information
Processing Systems, vol. 9, p. 134, The MIT Press, 1997.
[2] Amaldi, E. and E. Guenin, "Two Constructive Methods for Designing Compact
Feedforward Networks of Threshold Units", International Journal of Neural
Systems, vol. 8, Nos. 5 & 6, pp. 629-645, 1997.
[3] Chen, K., Liping Yang, Xiang Yu and Huisheng Chi, "A Self-Generating Modular
Neural Network Architecture for Supervised Learning", Neurocomputing - An
International Journal, vol. 16, No. 1, pp. 33-48, 1997.
[4] Cybenko, G., "Approximation by Superposition of a Sigmoidal Function",
Mathematics of Control, Signals, and Systems, vol. 2, No. 4, pp. 303-314, 1989.
[5] Duch, W. and N. Jankowski, "Survey of Neural Transfer Functions", Neural
Computing Surveys, vol. 2, pp. 163-213, 1999.
[6] Fahlman, S.E. and C. Lebiere, "The Cascade-Correlation Learning Algorithm",
Technical Report CMU-CS-90-100, Carnegie Mellon University, 1990.
[7] Funahashi, K., "On the Approximate Realization of Continuous Mappings by
Neural Networks", Neural Networks, vol. 2, No. 3, pp. 183-192, 1989.
[8] Hassibi, B. and D.G. Stork, "Second Order Derivatives for Network Pruning:
Optimal Brain Surgeon", in Advances in Neural Information Processing Systems,
vol. 5, pp. 164-171, Morgan Kaufmann, San Mateo, CA, 1993.
[9] Hornik, K., Stinchcombe, M., White, H., "Multilayer Feedforward Networks are
Universal Approximators", Neural Networks, vol. 2, No. 5, pp. 359-366, 1989.
[10] Kwok, T.Y. and D.Y. Yeung, "Constructive Algorithms for Structure Learning in
Feedforward Neural Networks for Regression Problems", IEEE Transactions on
Neural Networks, vol. 8, No. 3, pp. 630-645, 1997.
[11] Lawrence, S., C. L. Giles and A. C. Tsoi, "What Size Neural Network Gives
Optimal Generalization? Convergence Properties of Backpropagation", Technical
Report UMIACS-TR-96-22 and CS-TR-3617, Institute for Advanced Computer
Studies, University of Maryland, 1996.
[12] LeCun, Y., J.S. Denker, and S.A. Solla, "Optimal Brain Damage", in Advances in
Neural Information Processing Systems, vol. 2, pp. 598-605, Morgan Kaufmann,
San Mateo, CA, 1999.
[13] Littmann, E. and H. Ritter, "Cascade Network Architectures", in Proceedings of the
International Joint Conference on Neural Networks, Baltimore, MD, USA, vol. 2,
pp. 398-404, June 1992.
[14] Littmann, E. and H. Ritter, "Cascade LLM Networks", in Artificial Neural
Networks, vol. 2, pp. 253-257, Elsevier Science Publishers B.V., 1992.
[15] Mozer, M.C., "Skeletonization: A Technique for Trimming the Fat from a Network
via Relevance Assessment", in Advances in Neural Information Processing
Systems, vol. 1, pp. 107-115, Morgan Kaufmann, San Mateo, CA, 1989.
[16] Niyogi, P. and Girosi, F., "On the Relationship between Generalization Error,
Hypothesis Complexity, and Sample Complexity for Radial Basis Functions",
Neural Computation, vol. 8, No. 4, pp. 819-842, 1996.
[17] Van de Laar, P. and Heskes, T., "Pruning Using Parameter and Neuronal Metrics",
Neural Computation, vol. 11, No. 4, pp. 977-993, 1999.
[18] Park, Y.R., Murray, T. J., and Chen, C., "Predicting Sun Spots Using a Layered
Perceptron Neural Network", IEEE Transactions on Neural Networks, vol. 7, No. 2,
pp. 501-505, 1996.
[19] Phatak, D. S. and Koren, I., "Connectivity and Performance Tradeoffs in the
Cascade Correlation Learning Architecture", IEEE Transactions on Neural
Networks, vol. 5, No. 6, pp. 930-935, 1994.
[20] Prechelt, L., "Investigation of the CasCor Family of Learning Algorithms", Neural
Networks, vol. 10, No. 5, pp. 885-896, 1997.
[21] Reed, R., "Pruning Algorithms - A Survey", IEEE Transactions on Neural
Networks, vol. 4, No. 5, pp. 741-747, 1993.
[22] Weigend, A., "On Overfitting and the Effective Number of Hidden Units", in
Proceedings of the 1993 Connectionist Models Summer School, pp. 335-342,
Lawrence Erlbaum Associates, Hillsdale, NJ, 1993.
[23] Yao, X., "Evolving Artificial Neural Networks", Proceedings of the IEEE, vol. 87,
No. 9, pp. 1423-1447, 1999.
[24] Fahlman, S. E., "An Empirical Study of Learning Speed in Back-Propagation
Networks", Technical Report CMU-CS-88-162, Carnegie Mellon University, 1988.
[25] Hwang, J.N., S.R. Lay, M. Maechler, D. Martin, and J. Schimert, "Regression
Modeling in Back-Propagation and Projection Pursuit Learning", IEEE
Transactions on Neural Networks, vol. 5, No. 3, pp. 342-353, 1994.
APPENDICES
Appendix A - Error Measures
SQE: squared error;
MSE: mean squared error;
RMSE: square root of mean squared error;
NMSE: normalized mean squared error;
SQEP: squared error percentage;
EIDX: error index;
m: number of training examples or patterns;
n: number of outputs or dimensions of the output vector;
y_{i,j}: actual output of the network for the pattern i at output unit j;
y_max: maximum value of the actual outputs of the network;
y_min: minimum value of the actual outputs of the network;
t_{i,j}: target or desired output for the pattern i at output unit j;
\bar{t}: average target or desired output over the set of all training patterns;
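(1) SQE = \sum_{i=1}^{m} \sum_{j=1}^{n} (y_{i,j} - t_{i,j})^2
(2) MSE = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} (y_{i,j} - t_{i,j})^2 = SQE / (mn)
(3) RMSE = \sqrt{\frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} (y_{i,j} - t_{i,j})^2} = \sqrt{MSE}
(4) NMSE = \frac{y_{max} - y_{min}}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} (y_{i,j} - t_{i,j})^2 = (y_{max} - y_{min}) \cdot MSE
(5) SQEP = 100 \cdot \frac{y_{max} - y_{min}}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} (y_{i,j} - t_{i,j})^2 = 100 (y_{max} - y_{min}) \cdot MSE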
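(6) EIDX = \sqrt{\frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} (y_{i,j} - t_{i,j})^2} \Big/ \sqrt{\frac{1}{mn-1} \sum_{i=1}^{m} \sum_{j=1}^{n} (t_{i,j} - \bar{t})^2}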
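As a cross-check of the definitions, MSE, RMSE, NMSE and SQEP can be computed in a few lines of Python. This sketch assumes that y and t are given as m-by-n nested lists (patterns by output units) and that y_max and y_min are taken over all actual outputs, as the symbol list states. The function name `error_measures` is hypothetical, and the error index is left out because its formula is the least legible in this copy.

```python
import math

def error_measures(y, t):
    """Return SQE, MSE, RMSE, NMSE and SQEP for outputs y and targets t."""
    m = len(y)          # number of patterns
    n = len(y[0])       # number of output units
    sqe = sum((y[i][j] - t[i][j]) ** 2 for i in range(m) for j in range(n))
    mse = sqe / (m * n)
    rmse = math.sqrt(mse)
    flat = [value for row in y for value in row]
    spread = max(flat) - min(flat)      # y_max - y_min
    return {
        "SQE": sqe,
        "MSE": mse,
        "RMSE": rmse,
        "NMSE": spread * mse,
        "SQEP": 100.0 * spread * mse,
    }
```

Scaling by the output range makes SQEP comparable across problems whose outputs have different magnitudes, which is presumably why it is used to report the final errors in Chapter 3.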
Appendix B - Tables of Test Results
Simulations were performed on two test groups (see Chapter 3.2) for the CasCor and
XCAS networks. Group-I includes five problems labeled as F1, F2, F3, F4 and F5.
Group-II includes four problems labeled as CF2, CF3, CF4 and CF5, which are
combinations of problems from Group-I.
The numbers of training examples and testing examples are 225 and 210 respectively,
and are held constant for all the problems in Group-I and Group-II. Each problem was
executed ten times on the CasCor and XCAS. The averages are taken over the ten runs.
Table A1: Test results on Group-I for the CasCor network
Testing Error  SQEP  217.26  83.84   544.27  460.43  13.109  83.84
               RMSE  0.4580  0.2981  0.7511  0.4531  0.1254  0.2981
Table A2: Test results on Group-I for the XCAS network
Table A3: Test results on Group-II for the CasCor network
Table A4: Test results on Group-II for the XCAS network
Appendix C - Program Source Code
The XCAS network was implemented in FORTRAN 77. Five files in plain text format
must be created before the program is executed.
1) Network Configuration File: storing setup parameters and weights.
FIRST LINE
REM NETSTA, NETLAY, NETCOL, HNUTYP, ONUTYP, ISHORT, HSHORT
0 16 16 2 0 1 1
REM HMXEPC, HBONUS, HCANDS, OMXEPC, OBONUS
100 8 8 100 8
REM HLRNRT, HMXLRN, HTHRES, WRANGE
0.75 1.75 0.03 0.5
REM OLRNRT, OMXLRN, OTHRES, ODECAY, ETOLER
0.35 1.75 0.01 0.01 0.1
REM NUMLAY, NUMNEU, ERNORM, PERCNT , ERRMSR
0 0 1 0 5
REM INPUTS, OUTPTS, VLDMOD, VLDVAL, NSEEDS
0 0 1 0 3
REM RANDOM SEEDS
77
139
354
The first line will be replaced by the program with the training data file name after
training is finished. Lines starting with REM provide names for the values appearing below
them and should not be removed. Values in boldface here are set by the user, and must be
separated by at least one space. Weights of the trained network will be appended to this
file.
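The REM-line convention described above lends itself to a very small reader. The Python sketch below is only an illustration of the file layout, not part of the FORTRAN program: `parse_config` is a hypothetical name, it handles only the setup part of the file (an untrained network, with no weights appended), and it assumes that a value containing a decimal point is real and any other value is an integer.

```python
def parse_config(text):
    """Parse the setup part of a network configuration file.

    Returns (params, seeds): a dict of named values and the list of random seeds.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    params, seeds = {}, []
    i = 1                                   # skip the first line (placeholder)
    while i < len(lines):
        line = lines[i]
        if not line.startswith("REM"):
            i += 1
            continue
        header = line[3:].strip()
        if header == "RANDOM SEEDS":
            # every following line up to the next REM holds one seed
            i += 1
            while i < len(lines) and not lines[i].startswith("REM"):
                seeds.append(int(lines[i]))
                i += 1
            continue
        if i + 1 >= len(lines):
            break                           # REM header without a value line
        names = [name.strip() for name in header.split(",")]
        values = lines[i + 1].split()
        for name, value in zip(names, values):
            params[name] = float(value) if "." in value else int(value)
        i += 2
    return params, seeds
```

Pairing each REM header with the line that follows it is what makes the REM lines mandatory: removing one would shift every later value onto the wrong name.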
NETSTA: should be 0 when the network is going to be trained and nonzero when trained.
NETLAY: maximum number of layers permitted by user.
NETCOL: maximum number of columns permitted by user.
HNUTYP: types of hidden nodes represented by activation functions. Valid values: 1 - 3.
ONUTYP: types of output nodes. Valid values: 0 - 3.
ISHORT: enable (set to 1) or disable (set to 0) shortcut connections to original inputs.
HSHORT: enable (set to 1) or disable (set to 0) shortcut connections to hidden nodes.
HMXEPC: maximum epochs permitted for training a hidden node.
HBONUS: patience parameter for training a hidden node.
HCANDS: number of candidates used for training a hidden node.
OMXEPC: maximum epochs permitted for training an output node.
OBONUS: patience parameter for training an output node.
HLRNRT: learning rate for training hidden nodes, usually 0.5 - 2.5.
HMXLRN: maximum learning factor for training hidden nodes, usually 1.0 - 2.5.
HTHRES: threshold of change rate for training hidden nodes, usually 0.02 - 0.05.
WRANGE: weights will be initialized randomly between -WRANGE and +WRANGE.
OLRNRT: learning rate for training output nodes, usually 0.5 - 2.5.
OMXLRN: maximum learning factor for training output nodes, usually 1.0 - 2.5.
OTHRES: threshold of change rate for training output nodes, usually 0.01 - 0.05.
ODECAY: decay factor for updating weights of output nodes, usually < 0.05.
ETOLER: error tolerance for training the network. The value depends on the error measure used.
NUMLAY: number of layers in the final network trained.
NUMNEU: number of hidden nodes in the final network trained.
ERNORM: specifies whether the error is expressed in normalized form (1) or not (0).
PERCNT: specifies whether the error is expressed in percentages (1) or not (0).
ERRMSR: specifies what kind of error measure will be used.
= 1, mean squared error (MSE).
= 2, square root of mean squared error (RMSE).
= 3, error index (EIDX).
The definitions for the above measures are listed in Appendix A.
INPUTS: number of inputs used for the network, determined by training data file.
OUTPTS: number of outputs used for the network, determined by training data file.
VLDMOD: mode for selecting validation set from training data file.
VLDVAL: value associated with VLDMOD. Valid values depend on VLDMOD.
VLDMOD = 1, create validation set from whole training data, starting from
VLDVAL+1 to N, where N is the number of training examples.
VLDMOD = 2 - 4, example j is in validation set if (j mod VLDMOD = VLDVAL).
NSEEDS: number of seeds (positive integers) for random number generator.
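The two validation-set selection modes described for VLDMOD and VLDVAL can be sketched in Python as follows. The function name `validation_indices` is hypothetical, examples are numbered from 1 as in the description, and the code mirrors only the stated rules; how the FORTRAN program treats the remaining examples is not shown here.

```python
def validation_indices(n_examples, vldmod, vldval):
    """Return the 1-based indices of the examples placed in the validation set."""
    if vldmod == 1:
        # examples VLDVAL+1 .. N form the validation set
        return list(range(vldval + 1, n_examples + 1))
    if 2 <= vldmod <= 4:
        # example j is selected when j mod VLDMOD equals VLDVAL
        return [j for j in range(1, n_examples + 1) if j % vldmod == vldval]
    raise ValueError("VLDMOD must be between 1 and 4")
```

For instance, with 10 examples, mode 1 with VLDVAL = 7 selects examples 8 to 10, while mode 3 with VLDVAL = 1 selects examples 1, 4, 7 and 10, so the modular modes spread the validation set evenly through the file.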
2) Training Data File: training data (training set + validation set ) and test set.
Example:
The header (7 lines in total) includes information about attributes of input(s) and
output(s), sizes of training set, validation set and test set. There must be at least one
space between equal signs and values assigned. Each line in the data section includes
input(s) and desired output(s) (shown in boldface in the above example). Data items are
separated by at least one space. In data section, the training set comes first, then the
validation set and test set. There should be no blank lines in data section.
3) Report file: an empty file used for storing statistical results about the network
training. Information includes:
HNUS: number of hidden nodes.
HWTS: number of weights for hidden nodes.
TWTS: total number of weights (for hidden and output nodes).
IEPC: epochs used for input training.
OEPC: epochs used for output training.
TEPC: total number of epochs used in network training.
MSE: mean squared error.
RMS: square root of mean squared error.
VAR: variance of actual network outputs.
STD: standard deviation of actual network outputs.
IEW: EPCW for input training.
OEW: EPCW for output training.
EPCW = (Σj Kj · Nj)/1000
Kj = weightupdating epochs for node j;
Nj = number of weights for node j;
EPCW is a measure of times spent on training the nodes in the network.
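The EPCW measure is a weighted sum and is easy to reproduce. The Python sketch below assumes that the lost operator between Kj and Nj is a product, so that each node contributes its weight-updating epochs times its number of weights; the function name `epcw` and the numbers in the usage note are illustrative only.

```python
def epcw(epochs_per_node, weights_per_node):
    """EPCW = (sum over nodes j of Kj * Nj) / 1000."""
    total = sum(k * n for k, n in zip(epochs_per_node, weights_per_node))
    return total / 1000.0
```

For example, two nodes trained for 100 and 50 epochs with 4 and 6 weights give (400 + 300)/1000 = 0.7, so a node with many weights counts more heavily than one trained for the same number of epochs with few weights.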
4) Input Data File: inputs to a trained network.
Example :
The first line gives number of inputs and number of data lines.
5) Output File: an empty file used for storing network outputs
CCCCCCCCCCCCC XCAS NEURAL NETWORK CCCCCCCCCCCCCCCCCCCCCCCCCC
C PROGRAMMED IN FORTRAN77, COMPILABLE UNDER G77 IN UNIX SYSTEM
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C NAMING CONVENTION:
C IMPLICIT DOUBLE PRECISION (A-H,O-Z), INTEGER (I-N)
C
CCCCCCCCCCCCCCCCCCCC CONSTANTS CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C USED IN ALLOCATION OF STORAGE, MUST BE SET BEFORE COMPILED
C ALSO COPIED TO COMMON /MAXCON/ BLOCKS
C MXI: MAXIMUM NUMBER OF INPUTS
C MXO: MAXIMUM NUMBER OF OUTPUTS
C MXE: MAXIMUM NUMBER OF TRAINING EXAMPLES
C MXT: MAXIMUM NUMBER OF RANDOM SEEDS ( ONE SEED USED IN EACH TRIAL)
C MXD: MAXIMUM NUMBER OF CANDIDATES USED IN INPUT TRAINING
C MXL: MAXIMUM NUMBER OF LAYERS
C MXC: MAXIMUM NUMBER OF COLUMNS
C
CCCCCCCCCCCCCCCCCCCCC VARIABLES CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C NUMINP, NUMOUT, NUMEXM, NTSTEX, NVLDEX
C = NUMBERS OF INPUTS, OUTPUTS, TRAINING EXAMPLES,TEST EXAMPLES,
C VALIDATION EXAMPLES RESPECTIVELY
C
C
C USER SELECTABLE PARAMETERS:
C NTRIAL= NUMBER OF TRIALS = NUMBER OF RANDOM SEEDS
C WRANGE= THE RANGE FOR RANDOM INITIALIZATION OF WEIGHTS
C NETLAY= MAXIMUM NUMBER OF LAYERS PERMITTED <= MXL
C NETCOL= MAXIMUM NUMBER OF COLUMNS PERMITTED <= MXC
C MODVAL= MODE FOR SELECTING VALIDATION SET FROM TRAINING EXAMPLES
C MODREM= REMAINDER VALUE ASSOCIATED WITH MODVAL
C NOXINP= ENABLE OR DISABLE SHORTCUT CONNECTIONS TO INPUTS
C NOVERT= ENABLE OR DISABLE SHORTCUT CONNECTIONS TO HIDDEN NODES
C IDXMSR= INDEX FOR CHOOSING AN ERROR MEASURE FOR OUTPUT TRAINING
C
C FOR INPUT TRAINING
C MXEPOC= MAXIMUM NUMBER OF TRAINING EPOCHS PERMITTED
C NBONUS= PATIENCE PARAMETER FOR INPUT TRAINING
C NTRANS= TYPE OF ACTIVATION FUNCTION FOR HIDDEN NODES
C ALPHA = LEARNING RATE
C BETA = MAXIMUM LEARNING FACTOR
C GAMMA = MOMENTUM
C SHRINK= SHRINK FACTOR = BETA/(BETA+1.0)
C THRESH= THRESHOLD OF INPUT CHANGE RATE
C NUMCND= NUMBER OF CANDIDATES USED
C
C FOR OUTPUT TRAINING
C MXEPCO= MAXIMUM NUMBER OF TRAINING EPOCHS PERMITTED
C NEUO = TYPE OF ACTIVATION FUNCTION FOR OUTPUT UNITS
C NEPCO = OUTPUT TRAINING EPOCHS USED
C NBONO = PATIENCE PARAMETER FOR OUTPUT TRAINING
C ALPHO, BETO, GAMMO, SHNKO, THREO
C = LEARNING RATE, MAXIMUM LEARNING FACTOR, MOMENTUM,
C SHRINK FACTOR, THRESHOLD OF INPUT CHANGE RATE FOR OUTPUT
C TRAINING
C DECAYO= DECAY FACTOR (USED IN QUICKPROP)
C ERRTHR= THRESHOLD VALUE FOR STOPPING TRAINING THE NET
C
C OTHER VARIABLES:
C NBESTC= THE INDEX OF THE BEST CANDIDATE
C NEPOCH= INPUT TRAINING EPOCHS USED
C NEPCO = OUTPUT TRAINING EPOCHS USED
C BSTSCR= THE BEST SCORE OF CANDIDATES
C
C FILE NAMES:
C NETFNM= NETWORK CONFIGURATION FILE, STORING THE SETUP PARAMETERS AND
C WEIGHTS OF TRAINED NETWORK
C RPTFNM= REPORT FILE THAT GIVES STATISTICAL RESULTS ABOUT TRAINING
C TRNFNM= TRAINING DATA FILE (ALL TRAINING EXAMPLES)
C RUNFNM= INPUT DATA FILE FOR RUNNING A TRAINED NETWORK
C OUTFNM= OUTPUT FILE FOR A TRAINED NETWORK
C
CCCCCCCCCCCCCCCCCCCCCCCCC ARRAYS CCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C NEUR(I,J): STORING THE SEQUENTIAL ORDER FOR ADDING THE NODE IN THE
C ITH LAYER, JTH COLUMN
C
C NEUS(K,2): IF K=NEUR(I,J), THEN I=NEUS(K,1), J=NEUS(K,2)
C XINP(*,K): INPUT VECTOR OF THE KTH EXAMPLE
C DOUT(*,K): TARGET OUTPUT VECTOR FOR THE KTH INPUT EXAMPLE
C YOUT(*,K): ACTUAL NETWORK OUTPUT FOR THE KTH INPUT EXAMPLE
C HOUT(I,J,K): THE OUTPUT OF HIDDEN NODE (I,J) FOR THE KTH EXAMPLE
C HWTS(*,I,J): WEIGHTS OF THE NODE (I,J)
C OWTS(*,J): WEIGHTS OF THE OUTPUT NODE J
C OSLP(*,J): DERIVATIVES W.R.T. WEIGHTS OF THE OUTPUT NODE J
C OPSL(*,J): THE PREVIOUS OSLP(*,J)
C ODWT(*,J): CHANGES IN WEIGHTS OF OUTPUT NODE J
C ERRO(J,K): RESIDUAL ERROR PRODUCED AT OUTPUT NODE J FOR EXAMPLE K
C CDOU(J): AVERAGE OUTPUTS OF THE CANDIDATE J OVER ALL THE EXAMPLES
C CCOR(J,I,2): COVARIANCE VALUES OF CANDIDATE I AT OUTPUT NODE J
C CWTS(*,J): WEIGHTS OF THE CANDIDATE J
C SLOP(*,J): DERIVATIVES W.R.T. WEIGHTS OF THE CANDIDATE J
C PSLP(*,J): PREVIOUS SLOP(*,J)
C DWTS(*,J): CHANGES IN WEIGHTS OF THE CANDIDATE J
C TSTH(I,J): TEMPORARY ARRAY STORING THE OUTPUTS OF NODE (I,J) FOR
C SINGLE INPUT EXAMPLE
C
C MRSEED(*): STORING RANDOM SEEDS
C NETFIG(N,*): INFORMATION ABOUT FINAL NETWORK ARCHITECTURE OBTAINED
C AT NTH RUN
C TRNERR(N,*): INFORMATION ABOUT TRAINING ERROR FOR THE NTH RUN
C TSTERR(N,*): INFORMATION ABOUT TESTING ERROR FOR THE NTH RUN
C VLDERR(N,*): INFORMATION ABOUT ERROR ON VALIDATION SET FOR THE NTH
C RUN
C
CCCCCCCCCCCCCCCCCC SUBROUTINES AND FUNCTIONS CCCCCCCCCCCCCCCCCCCCCCCCCC
C NAME OF SUBROUTINE OR FUNCTION
C IF FOLLOWED BY { ... } :
C SUBROUTINES CALLED (PREFIXED WITH CALL) AND/OR FUNCTIONS CALLED
C ELSE : NOT CALL OTHER SUBROUTINES/FUNCTIONS
C
C
C MAIN {CALL ININET, CALL SETNET, CALL RDPROB, CALL SETNUR, CALL
C TRAIN, CALL GETSTA, CALL WRTNET, CALL TEST, CALL SHOWER, CALL
C REPORT, CALL RUNNET}
C SUBROUTINE ADJCOR
C SUBROUTINE CNDSLP{FPRIME,OUTHNU,NUMHWT}
C SUBROUTINE COMERR{MARKOP,OPRIME}
C SUBROUTINE COREPC{CALL ADJCOR,OUTHNU}
C SUBROUTINE ERRSTA
C SUBROUTINE GETERV{CALL GETSTD}
C SUBROUTINE GETSTA{CALL GETERV,NCONEX}
C SUBROUTINE GETSTD{MARKOP}
C SUBROUTINE HNUPAS{OUTHNU}
C SUBROUTINE ININET
C SUBROUTINE INITNN
C SUBROUTINE OUTPAS{FTRANS}
C SUBROUTINE OWTNEW
C SUBROUTINE QKPROP
C SUBROUTINE REPORT{CALL ERRSTA,CALL SHOWER}
C SUBROUTINE RDHEAD
C SUBROUTINE RDPROB{CALL GETSTD}
C SUBROUTINE RUNNET{CALL HNUPAS,CALL OUTPAS}
C SUBROUTINE SETNUR
C SUBROUTINE SETANN{CALL SETNUR,CALL INITNN}
C SUBROUTINE SETNET{CALL RDHEAD,CALL SETANN,NUMHWT}
C SUBROUTINE SHOWER
C SUBROUTINE TEST{CALL HNUPAS,CALL OUTPAS,CALL GETERV}
C SUBROUTINE TRAIN{CALL OWTNEW,MTROUT,MTRINP,RANDOM}
C SUBROUTINE TRNOUT{CALL OUTPAS,CALL COMERR,CALL GETERV,CALL UPOWTS}
C SUBROUTINE UPHWTS{CALL QKPROP}
C SUBROUTINE UPOWTS{CALL QKPROP}
C SUBROUTINE WRTNET{CALL WTHEAD,NUMHWT}
C SUBROUTINE WTHEAD
C
C FUNCTION FPRIME
C FUNCTION FTRANS
C FUNCTION MARKOP
C FUNCTION MTRINP{CALL CNDSLP,CALL UPHWTS,CALL COREPC,CALL ADJCOR,
C NUMHWT,RANDOM,OUTHNU}
C FUNCTION MTROUT{CALL TRNOUT}
C FUNCTION NCONEX
C FUNCTION NUMHWT
C FUNCTION OPRIME
C FUNCTION OUTHNU{NUMHWT,FTRANS}
C FUNCTION RANDOM
C
CCCCCCCCCCCCCCCC MAIN PROGRAM CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
CCCCCC
      PROGRAM XCAS
C
      IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
      PARAMETER (MXI=36, MXO=5, MXE=4400, MXT=20, MXD=8, MXS=10)
      PARAMETER (MXL=16, MXC=16)
C
      COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NTSTEX,NVLDEX,NTRIAL
      COMMON /NCANDP/ MXEPOC,NBONUS,IDXMSR,NETSTA,METHOD,NTRANS,NRSEED
      COMMON /PTRAIN/ ALPHA,BETA,GAMMA,SHRINK,BSTSCR,THRESH,EPCWTS
      COMMON /NTRAIN/ NRETUR,NUMCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
      COMMON /STATIS/ TSQE,SQER,VSQE,VEIX,VDSTD,ERRVAL,ERRTHR
      COMMON /WTSVAR/ WRANGE,BSTERR,TVAR,TDSTD,TSDSTD,VCOE,TCOE
      COMMON /NTRNOU/ MXEPCO,NEPCO,NBONO,NORMER,NEUO,NCENT,NRESO
      COMMON /PTROUT/ ALPHO,BETO,GAMMO,SHNKO,THREO,DECAYO,WTSCRO
      COMMON /NVAMOD/ MODVAL,MODREM
      COMMON /MODNEX/ NOXINP,NOVERT
      COMMON /NETTOP/ NETLAY,NETCOL
      CHARACTER*30 NETFNM,TRNFNM,RUNFNM,OUTFNM,RPTFNM
      DIMENSION NEUR(MXL,1+MXC)
      DIMENSION NEUS(MXL*MXC,2)
      DIMENSION XINP(MXI,MXE)
      DIMENSION DOUT(MXO,MXE)
      DIMENSION YOUT(MXO,MXE)
      DIMENSION HOUT(MXL,MXC,MXE)
      DIMENSION HWTS(1+MXI+MXL+MXC,MXL,MXC)
      DIMENSION OWTS(1+MXI+MXL*MXC,MXO)
      DIMENSION OSLP(1+MXI+MXL*MXC,MXO)
      DIMENSION OPSL(1+MXI+MXL*MXC,MXO)
      DIMENSION ODWT(1+MXI+MXL*MXC,MXO)
      DIMENSION ERRO(MXO,1+MXE)
      DIMENSION CDOU(MXD)
      DIMENSION CCOR(MXO,MXD,2)
      DIMENSION CWTS(1+MXI+MXL+MXC,MXD)
      DIMENSION SLOP(1+MXI+MXL+MXC,MXD)
      DIMENSION PSLP(1+MXI+MXL+MXC,MXD)
      DIMENSION DWTS(1+MXI+MXL+MXC,MXD)
      DIMENSION TSTH(MXL,MXC)
      DIMENSION MRSEED(1+MXT)
      DIMENSION NETFIG(MXT,6)
      DIMENSION TRNERR(MXT,6)
      DIMENSION TSTERR(MXT,6)
      DIMENSION VLDERR(MXT,6)
      NZ=6
      NS=MXT
C     NET CONFIGURATION FILE NAME: UNIT=30
      NETFNM=' '
C     TRAINING DATA FILE NAME: UNIT=31
      TRNFNM=' '
C     RUNNING SET FILE NAME: UNIT=32
      RUNFNM=' '
C     NET OUTPUT FILE NAME: UNIT=33
      OUTFNM=' '
C     TRAINING AND TESTING REPORT FILE NAME: UNIT=34
      RPTFNM=' '
      NRPTFL=34
      WRITE(*,*) 'Storage Limits For Network Layout:'
      WRITE(*,*) 'Max Layer= ',MXL,' Max Column= ',MXC
      WRITE(*,*)
      CALL ININET
      WRITE(*,*) 'Enter configration file name:'
      READ(*,*) NETFNM
      CALL SETNET (HWTS,OWTS,MRSEED,NS,NEUR,NEUS,NETFNM,
     &             MXI,MXL,MXC,MXO,MXD)
C
CCCCCC
      IF (NETSTA .EQ. 0) THEN
        WRITE(*,*) 'You Are Going To Train The Net!'
        WRITE(*,*)
        WRITE(*,*) 'Enter training data, report file name:'
        READ(*,*) TRNFNM,RPTFNM
        CALL RDPROB (XINP,DOUT,TRNFNM,MXI,MXO,MXE)
        BSTERR=1.0D30
        DO 400 JSEED=1,NTRIAL
          CALL SETNUR (NEUR,MXL,MXC)
          NRSEED=MRSEED(JSEED+1)
          WRITE(*,*) 'TRIAL ',JSEED,' SEED= ',NRSEED
CCCCCC
          CALL TRAIN (XINP,HOUT,YOUT,DOUT,ERRO,HWTS,OWTS,ODWT,OSLP,
     &                OPSL,CWTS,DWTS,SLOP,PSLP,CCOR,CDOU,NEUS,NEUR,
     &                MXI,MXL,MXC,MXO,MXE,MXD)
          CALL GETSTA (NETFIG,NEUR,YOUT,TRNERR,VLDERR,NS,NZ,JSEED
     &                 ,MXL,MXC,MXO,MXE)
C         KEEP THE NETWORK WITH THE SMALLEST VSQE SEEN SO FAR
          IF (VSQE .LT. BSTERR) THEN
            MRSEED(1)=JSEED
            CALL WRTNET (HWTS,OWTS,MRSEED,NS,NEUR,NETFNM,TRNFNM,
     &                   MXI,MXL,MXC,MXO)
            BSTERR=VSQE
          END IF
          CALL TEST (XINP,DOUT,YOUT,OWTS,TSTH,HWTS,NEUR,
     &               TSTERR,NS,NZ,JSEED,MXI,MXL,MXC,MXO,MXE)
          WRITE(*,*)
          CALL SHOWER (NETFIG,TRNERR,VLDERR,TSTERR,NS,NZ,JSEED,0)
          WRITE(*,*)
  400   CONTINUE
C
        OPEN (UNIT=NRPTFL,FILE=RPTFNM,STATUS='OLD')
C
        CALL REPORT (NETFIG,TRNERR,VLDERR,TSTERR,MRSEED,NS,NZ,
     &               NRPTFL,TRNFNM)
C
        CLOSE (NRPTFL)
        WRITE(*,*) 'Configration file: ',NETFNM
        WRITE(*,*) 'Training data file: ',TRNFNM
C       WRITE(*,*) 'Testing set file: ',TSTFNM
        WRITE(*,*) 'Trn & Tst report: ',RPTFNM
CCCCCC
      ELSE
        WRITE(*,*) 'You Are Going To Run On The Trained Net!'
        WRITE(*,*)
NEUR (NL, 0) =1
ELSE
READ(NFILE,*) NEUR(NL,0)
END IF
MAXCOL=MAX(MAXCOL,NEUR(NL,0))
CONTINUE
IF (MAXCOL .GT. NETCOL) THEN
WRITE(*,*) 'Number of columns mismatched with network layout!'
NFLAG=1
GO TO 600
END IF
C
C READ WEIGHTS OF HIDDEN UNITS
READ(NFILE,*)
DO 300 NL=1,NUMLAY
DO 200 NC=1,NEUR(NL,0)
NWTS=NUMHWT(NL,NC,NUMINP,NBX,NBC,NBV)+1
READ(NFILE,*)
DO 160 NW=1,NWTS
READ(NFILE,700) HW(NW,NL,NC)
160 CONTINUE
200 CONTINUE
300 CONTINUE
C
END IF
C
C READ WEIGHTS OF OUTPUT UNITS
READ(NFILE,*)
NWTS=1+NUMINP+NUMNEU
DO 500 NOUT=1,NUMOUT
READ(NFILE,*)
DO 400 NW=1,NWTS
READ(NFILE,700) OW(NW,NOUT)
400 CONTINUE
500 CONTINUE
C
END IF
C
600 CLOSE (NFILE)
IF (NFLAG .GT. 0) STOP
700 FORMAT (1X, D24.15)
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C READ HEADER OF CONFIGURATION FILE
C GET SETUP PARAMETERS
CCCCCC
SUBROUTINE RDHEAD (NFILE, NCODE, MXI, MXL, MXC, MXO, MXCND)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NRUNEX,NTRIAL
COMMON /NCANDP/ MXEP,NBON,IDXMSR,NETSTA,METHOD,NUTYPH,NRSEED
COMMON /STATIS/ SSQE,STDD,VMSE,VDEV,VSTD,ERRVAL,ERRTHR
COMMON /WTSVAR/ WRANGE,BSTERR,TRMSE,TRDEV,TRSTD,VLDCOE,TRNCOE
COMMON /NTRAIN/ NRETUR,NCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
COMMON /PTRAIN/ ALPHA,BETA,GAMMA,SHRINK,BSTSCR,THRESH,EPTH
COMMON /NTRNOU/ MXEPCO,NEPCO,NBONO,NORM,NUTYPO,NCENT,NRESO
COMMON /PTROUT/ ALPHO,BETO,GAMMO,SHNKO,THREO,DECAYO,EPCWTO
COMMON /NVAMOD/ MODV,MODR
COMMON /MODNEX/ NOX,NOV
COMMON /NETTOP/ NETLAY,NETCOL
CHARACTER*30 NETF, TRNFNM, CH*80
READ(NFILE,*)
READ(NFILE,*)
READ(NFILE,*) NETSTA, NETLAY, NETCOL, NUTYPH, NUTYPO, NOX, NOV
READ(NFILE,*)
READ(NFILE,*) MXEP, NBON, NCND, MXEPCO, NBONO
READ(NFILE,*)
READ(NFILE,*) ALPHA, BETA, THRESH, WRANGE
READ(NFILE,*)
READ(NFILE,*) ALPHO, BETO, THREO, DECAYO, ERRTHR
READ(NFILE,*)
READ(NFILE,*) NUMLAY, NUMNEU, NORM, NCENT, IDXMSR
READ(NFILE,*)
READ(NFILE,*) NUMINP, NUMOUT, MODV, MODR, NTRIAL
IF ((NETCOL .GT. MXC) .OR. (NETLAY .GT. MXL)) THEN
WRITE(*,*) 'Storage (', MXL, ' BY ', MXC, ') not enough for ',
& NETLAY, ' by ', NETCOL, ' network topology!'
NCODE=1
RETURN
END IF
C
DECAYO=DECAYO/100.0
IF (NBONO .GT. MXEPCO) NBONO=8
IF (NBONO .LT. 2) NBONO=4
C
IF ((IDXMSR .LT. 0) .OR. (IDXMSR .GT. 6)) THEN
IDXMSR=5
ERRTHR=0.2
WRITE(*,*) 'Invalid range of IDXMSR (1-6), '
WRITE(*,*) 'Default IDXMSR=5 and ERRTHR=0.2 used.'
WRITE(*,*)
END IF
C
IF (NOX .NE. 0) NOX=1
IF (NOV .NE. 0) NOV=1
C CHECK CONFIG. FILE
C
IF (NETSTA .NE. 0) THEN
NCODE=1
IF (NUMLAY .GT. NETLAY) THEN
WRITE(*,*) 'Number of layers mismatched with network layout!'
ELSE IF (NUMINP .GT. MXI) THEN
WRITE(*,*) 'Number of inputs over storage limit!'
ELSE IF (NUMOUT .GT. MXO) THEN
WRITE(*,*) 'Number of outputs over storage limit!'
ELSE IF (NCND .GT. MXCND) THEN
WRITE(*,*) 'Number of candidates over storage limit!'
ELSE
NCODE=0
END IF
ELSE
NUMLAY= 0
END IF
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C SET UP NETWORK ARCHITECTURE
C SETUP OF CONNECTIONS
C SEQUENCE FOR ADDING NODES
CCCCCC
SUBROUTINE SETANN(NEURON,NEUSEQ,MXL,MXC)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NETTOP/ NETLAY,NETCOL
C
DIMENSION NEURON(MXL,0:MXC),NEUSEQ(MXL*MXC,2)
CCCCCC
CALL SETNUR (NEURON,MXL,MXC)
C
IF (NETCOL .EQ. 1) THEN
C FOR CASCOR NETWORK
DO 100 NL=1,MXL
NEURON(NL,1)=NL
NEUSEQ(NL,1)=NL
NEUSEQ(NL,2)=1
100 CONTINUE
ELSE
C FOR XCAS NETWORK
CALL INITNN (NEURON, NEUSEQ, MXL, MXC)
END IF
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C ZERO THE ARRAY OF NETWORK LAYOUT
CCCCCC
SUBROUTINE SETNUR (NEURON, MXL,MXC)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
DIMENSION NEURON(MXL,0:MXC)
C
DO 100 NL=1,MXL
NEURON(NL,0)=0
100 CONTINUE
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C INITIALIZE ARRAY NEUR AND NEUS
C SET SEQUENCE ACCORDING TO WHICH THE NEURON WILL BE ADDED
C SYMMETRIC ADDITION OF NODES RELATIVE TO DIAGONAL OF THE MATRIX
CCCCCC
SUBROUTINE INITNN (NEURON, NEUSEQ, MXL, MXC)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NETTOP/ NETLAY,NETCOL
C
DIMENSION NEURON(MXL,0:MXC),NEUSEQ(MXL*MXC,2)
C
MINSQR=MIN(NETLAY,NETCOL)
K=1
DO 200 I=1,MINSQR
DO 100 J=1,I
NEURON(I,J)=K
NEUSEQ(K,1)=I
NEUSEQ(K,2)=J
K=K+1
IF (I .NE. J) THEN
NEURON(J,I)=K
NEUSEQ(K,1)=J
NEUSEQ(K,2)=I
K=K+1
END IF
100 CONTINUE
200 CONTINUE
C
IF (NETCOL .GT. MINSQR) THEN
DO 400 J=(MINSQR+1),NETCOL
DO 300 I=1,NETLAY
NEURON(I,J)=K
NEUSEQ(K,1)=I
NEUSEQ(K,2)=J
K=K+1
300 CONTINUE
400 CONTINUE
END IF
C
IF (NETLAY .GT. MINSQR) THEN
DO 600 I=(MINSQR+1),NETLAY
DO 500 J=1,NETCOL
NEURON(I,J)=K
NEUSEQ(K,1)=I
NEUSEQ(K,2)=J
K=K+1
500 CONTINUE
600 CONTINUE
END IF
C
END
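CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C ILLUSTRATIVE SKETCH, NOT PART OF THE ORIGINAL PROGRAM.
C SHOWSQ IS A HYPOTHETICAL HELPER THAT PRINTS THE ORDER IN WHICH
C INITNN SCHEDULES HIDDEN NODES. AS A HAND TRACE, ASSUMING
C NETLAY=NETCOL=3, THE LOOPS IN INITNN SHOULD GIVE
C K:(LAYER,COLUMN) AS
C   1:(1,1)  2:(2,1)  3:(1,2)  4:(2,2)  5:(3,1)
C   6:(1,3)  7:(3,2)  8:(2,3)  9:(3,3)
C SO EACH OFF-DIAGONAL NODE IS FOLLOWED BY ITS MIRROR IMAGE.
C NNODES IS ASSUMED TO BE NETLAY*NETCOL.
CCCCCC
      SUBROUTINE SHOWSQ (NEUSEQ, MXL, MXC, NNODES)
      INTEGER MXL, MXC, NNODES, K
      INTEGER NEUSEQ(MXL*MXC, 2)
C PRINT SEQUENCE NUMBER, LAYER (ROW) AND COLUMN OF EACH NODE
      DO 10 K=1, NNODES
         WRITE(*,'(3I6)') K, NEUSEQ(K,1), NEUSEQ(K,2)
   10 CONTINUE
      END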
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCcCCCCCCCcccccccccccccccccccccccccccccccc
C
C COMPUTE NUMBER OF WTS+BIAS FOR A FINAL NETWORK
C NFIG=ARRAY STORING NUMBER OF NODES IN EACH LAYER
C NDIM=DIM. OF NFIG
C NX=NUMBER OF INPUTS
C MCX=1, SHORTCUT CONNECTIONS TO INPUTS ENABLED, OTHERWISE = 0
C MCV=1, VERTICAL SHORTCUT CONNECTIONS TO PREVIOUS NODES, OTHERWISE = 0
CCCCCC
FUNCTION NCONEX (NFIG, NDIM, NX, MCX, MCV)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
DIMENSION NFIG(NDIM)
C
NCONEX=0
DO 500 NL=1,NDIM
NUMC=NFIG(NL)
IF (NUMC .LE. 0) RETURN
MCCX=MCX
MCCC=1
MCCV=0
IF (NL .EQ. 1) THEN
MCCX=1
MCCC=0
ELSE
IF (NL .GT. 2) MCCV=MCV
END IF
C
DO 400 NC=1,NUMC
NODEWS=NX*MCCX+NC*MCCC+(NL-2)*MCCV+1
NCONEX=NCONEX+NODEWS
400 CONTINUE
C
500 CONTINUE
C
END
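CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C ILLUSTRATIVE SKETCH, NOT PART OF THE ORIGINAL PROGRAM.
C HAND-WORKED EXAMPLE FOR NCONEX, ASSUMING NX=2 INPUTS, NFIG=(2,2)
C AND MCX=MCV=1:
C   LAYER 1: EACH NODE HAS 2 INPUT WEIGHTS + BIAS = 3, TOTAL 6
C   LAYER 2: NODE 1 HAS 2+1+1 = 4, NODE 2 HAS 2+2+1 = 5, TOTAL 9
C GIVING NCONEX=15. CHKCNX IS A HYPOTHETICAL DRIVER THAT REPEATS
C THIS COUNT BY CALLING NCONEX.
CCCCCC
      SUBROUTINE CHKCNX
      INTEGER NFIG(2), NCONEX, NTOTAL
C TWO LAYERS WITH TWO NODES EACH
      NFIG(1)=2
      NFIG(2)=2
C ARGUMENTS: NFIG, NDIM=2, NX=2, MCX=1, MCV=1
      NTOTAL=NCONEX(NFIG, 2, 2, 1, 1)
      WRITE(*,*) 'Weights incl. biases: ', NTOTAL
      END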
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C SUBROUTINE FOR TRAINING THE NETWORK
CCCCCC
SUBROUTINE TRAIN (X,H,Y,D,ER,HW,OW,ODW,OS,OPS,CW,DW,S,PS,
& CC,CD,NEUS,NEUR,MXI,MXL,MXC,MXO,MXE,MXD)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NVLDEX,NTRIAL
COMMON /NCANDP/ MXEPOC,NBONUS,IDXMSR,NETSTA,METHOD,NTRANS,NRSEED
COMMON /PTRAIN/ ALPHA,BETA,GAMMA,SHRINK,BSTSCR,THRESH,EPCWTS
COMMON /NTRAIN/ NRETUR,NUMCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
COMMON /NTRNOU/ MXEPCO,NEPCO,NBONO,MSRO,NEUO,METHO,NRETO
COMMON /PTROUT/ ALPHO,BETO,GAMMO,SHNKO,THREO,DECAYO,WTSCRO
COMMON /STATIS/ TSQE,SQER,VSQE,VVAR,VDSTD,ERRVAL,ERRTHR
COMMON /WTSVAR/ WRANGE,BSTERR,TVAR,TDSTD,TSDSTD,VCOE,TCOE
COMMON /NETTOP/ NETLAY,NETCOL
C
DIMENSION X(MXI,MXE),H(MXL,MXC,MXE),Y(MXO,MXE),D(MXO,MXE)
DIMENSION ER(MXO,1+MXE),HW(1+MXI+MXL+MXC,MXL,MXC),CD(MXD)
DIMENSION OW(1+MXI+MXL*MXC,MXO),ODW(1+MXI+MXL*MXC,MXO)
DIMENSION OS(1+MXI+MXL*MXC,MXO),OPS(1+MXI+MXL*MXC,MXO)
DIMENSION CW(1+MXI+MXL+MXC,MXD),DW(1+MXI+MXL+MXC,MXD)
DIMENSION S(1+MXI+MXL+MXC,MXD),PS(1+MXI+MXL+MXC,MXD)
DIMENSION NEUS(MXL*MXC,2),NEUR(MXL,1+MXC),CC(MXO,MXD,2)
C
NUMNEU=0
NUMLAY=0
NEPCO=0
NEPOCH=0
EPCWTS=0.0
WTSCRO=0.0
NCODE=0
WSPAN=2.0*WRANGE
NOWTS=1+MXI+MXL*MXC
MAXNOD=NETLAY*NETCOL
C
DO 140 JOU=1,NUMOUT
DO 110 JCND=1,NUMCND
CC(JOU,JCND,1)=0.0
CC(JOU,JCND,2)=0.0
110 CONTINUE
C
DO 120 JW=1,NOWTS
OS(JW,JOU)=0.0
OPS(JW,JOU)=0.0
ODW(JW,JOU)=0.0
OW(JW,JOU)=0.0
IF (JW .LE. (1+NUMINP)) THEN
OW(JW,JOU)=WSPAN*RANDOM(NRSEED)
END IF
120 CONTINUE
140 CONTINUE
CCCCCC
200 IF (NUMNEU .LT. MAXNOD) THEN
C
C OUTPUT TRAINING
NCODE=MTROUT (X,H,Y,D,ER,OW,ODW,OS,OPS,NEUS,NEUR,
& MXI,MXL,MXC,MXO,MXE)
CCCCCC
IF (NCODE .EQ. 1) THEN
WRITE(*,*) 'WIN!!!'
GO TO 600
END IF
WRITE(*,700) 'T SQE: ',TSQE,'V SQE: ',VSQE,'ErrVal: ',ERRVAL
WRITE(*,800) 'H Layers: ',NUMLAY,'H Nodes: ',NUMNEU
WRITE(*,*)
C
C INPUT TRAINING
NCODE=MTRINP (X,H,HW,CC,CD,CW,DW,S,PS,ER,NEUS,NEUR,
& MXI,MXL,MXC,MXO,MXE,MXD)
CCCCCC
CALL OWTNEW (OW,CC,MXI,MXL,MXC,MXO,MXD)
C
GO TO 200
END IF
C
NCODE=MTROUT (X,H,Y,D,ER,OW,ODW,OS,OPS,NEUS,NEUR,
& MXI,MXL,MXC,MXO,MXE)
C
WRITE(*,*) 'OUT OF HIDDEN NODES!'
600 WRITE(*,700) 'T SQE: ',TSQE,'V SQE: ',VSQE,'ErrVal: ',ERRVAL
WRITE(*,800) 'H Layers: ',NUMLAY,'H Nodes: ',NUMNEU
WRITE(*,*)
700 FORMAT(A,3X,F12.4,5X,A,F12.4,3X,A,F12.4)
800 FORMAT(A,I12,3X,A,I12)
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C MAIN FUNCTION FOR OUTPUT TRAINING
CCCCCC
FUNCTION MTROUT (X,H,Y,D,ER,OW,ODW,OS,OPS,NEUS,NEUR,
& MXI,MXL,MXC,MXO,MXE)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NTRNOU/ MXEPCO,NEPCO,NBONO,MSRO,NEUO,METHO,NRETO
COMMON /PTROUT/ ALPHO,BETO,GAMMO,SHNKO,THREO,DECAYO,EPWTSO
COMMON /STATIS/ TSQE,SQER,VSQE,VVAR,VDSTD,ERRVAL,ERRTHR
COMMON /WTSVAR/ WRANGE,BSTERR,TVAR,TDSTD,TSDSTD,VCOE,TCOE
COMMON /NTRAIN/ NRETUR,NUMCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NVLDEX,NTRIAL
C
DIMENSION ER(MXO,0:MXE),NEUR(MXL,1+MXC),NEUS(MXL*MXC,2)
DIMENSION X(MXI,MXE),Y(MXO,MXE),D(MXO,MXE),H(MXL,MXC,MXE)
DIMENSION OW(1+MXI+MXL*MXC,MXO),OS(1+MXI+MXL*MXC,MXO)
DIMENSION ODW(1+MXI+MXL*MXC,MXO),OPS(1+MXI+MXL*MXC,MXO)
C
MTROUT=3
NEPC=0
NFIRST=1
PREERR=0.0
NQUIT=MXEPCO
MEWTS=0
NWTS=1+NUMINP+NUMNEU
DO 500 NEPC=1,MXEPCO
CALL TRNOUT (X,H,Y,D,ER,OW,ODW,OS,OPS,NEUS,NEUR,
& MXI,MXL,MXC,MXO,MXE)
C
NEPCO=NEPCO+1
MEWTS=MEWTS+1
IF (ERRVAL .LE. ERRTHR) THEN
MTROUT=1
GO TO 600
ELSE
IF (NFIRST .EQ. 1) THEN
NFIRST=0
PREERR=TSQE
ELSE IF (ABS(TSQE-PREERR) .GT. (PREERR*THREO)) THEN
NQUIT=NEPC+NBONO
PREERR=TSQE
ELSE
IF (NQUIT .LT. NEPC) THEN
MTROUT=2
GO TO 600
END IF
END IF
END IF
500 CONTINUE
600 EPWTSO=EPWTSO+DBLE(MEWTS*NWTS)/1000.0
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C VECTOR OUTPUT OF THE NET FOR THE KTH EXAMPLE
C AFTER ALL OUTPUTS OF HIDDEN UNITS ARE OBTAINED
C X=XINP
C H=HOUT(1,1,K)
C OW=OWTS
CCCCCC
SUBROUTINE OUTPAS (KTH,X,H,Y,OW,NEUR,MXI,MXL,MXC,MXO,MXE)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NRUNEX,NTRIAL
COMMON /NTRAIN/ NRETUR,NUMCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
COMMON /NTRNOU/ MXEPCO,NEPCO,NBONO,MSRO,NEUO,METHO,NRETO
COMMON /NETTOP/ NETLAY,NETCOL
C
DIMENSION NEUR(MXL,0:MXC),X(MXI,MXE),H(MXL,MXC),Y(MXO,MXE)
DIMENSION OW(0:(MXI+MXL*MXC),MXO)
C
DO 600 JO=1,NUMOUT
SUM=OW(0,JO)
DO 100 JW=1,NUMINP
SUM=SUM+X(JW,KTH)*OW(JW,JO)
100 CONTINUE
C
IF (NUMNEU .GT. 0) THEN
DO 500 M=1,NETLAY
NCOL=NEUR(M,0)
IF (NCOL .GT. 0) THEN
DO 400 N=1,NCOL
JW=NUMINP+NEUR(M,N)
SUM=SUM+H(M,N)*OW(JW,JO)
400 CONTINUE
END IF
500 CONTINUE
END IF
550 Y (JO, KTH) =FTRANS (SUM, NEUO)
600 CONTINUE
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C COMPUTE VALUE OF STOPPING CONDITION
CCCCCC
SUBROUTINE GETERV (Y, KX1, KX2, ERRV, NERR, MODSTD, MODTRN,
& MODV, MODR, MXO, MXE)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NVLDEX,NTRIAL
COMMON /NCANDP/ MXEPOC,NBONUS,IDXMSR,NETSTA,METHOD,NTRANS,NRSEED
COMMON /STATIS/ TSQE,SQER,VSQE,VVAR,VDSTD,ERRVAL,ERRTHR
COMMON /WTSVAR/ WRANGE,BSTERR,TVAR,TDSTD,TSDSTD,VCOE,TCOE
COMMON /NTRNOU/ MXEPCO,NEPCO,NBONO,NORMER,NEUO,NCENT,NRESO
C
DIMENSION Y(MXO,MXE),ERRV(NERR),R(16)
C
NV=0
NB=6
NR=16
PERCNT=1.0
COEV=1.0
COET=1.0
VMSE=VSQE/(NVLDEX*NUMOUT)
TMSE=TSQE/(NUMEXM*NUMOUT)
IF (NCENT .EQ. 1) PERCNT=100.0
C
ERRV(2)=SQRT(TMSE)
ERRV(NB+2)=SQRT(VMSE)
ERRV(5)=SQRT(TMSE)/VDSTD
ERRV(NB+5)=SQRT(VMSE)/VDSTD
C
IF (MODSTD .EQ. 1) THEN
CALL GETSTD (Y,MXO,MXE,1,NUMOUT,KX1,KX2,MODV,MODR,R,NR,NV)
VLDCOE=(R(NB+5)-R(NB+6))
COEV=1.0+NORMER*(VLDCOE-1.0)
ERRV(NB+3)=R(NB+1)
ERRV(NB+4)=SQRT(ERRV(NB+3))
ERRV(NB+6)=VMSE/R(NB+2)
C
IF (MODTRN .EQ. 1) THEN
TRNCOE=(R(5)-R(6))
COET=1.0+NORMER*(TRNCOE-1.0)
ERRV(3)=R(1)
ERRV(4)=SQRT(ERRV(3))
ERRV(6)=VMSE/R(NB+2)
END IF
END IF
C
VMSE=VMSE*COEV
ERRV(NB+1)=VMSE*PERCNT
IF (MODTRN .EQ. 1) THEN
TMSE=TMSE*COET
ERRV(1)=TMSE*PERCNT
END IF
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C OUTPUT TRAINING FOR ONE EPOCH
CCCCCC
SUBROUTINE TRNOUT (X,H,Y,D,ER,OW,ODW,OS,OPS,NEUS,NEUR,
& MXI,MXL,MXC,MXO,MXE)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NVLDEX,NTRIAL
COMMON /STATIS/ TSQE,SQER,VSQE,VVAR,VDSTD,ERRVAL,ERRTHR
COMMON /NTRNOU/ MXEPCO,NEPCO,NBONO,NORMER,NEUO,NCENT,NRESO
COMMON /NCANDP/ MXEPC,NBONUS,IDXMSR,NETSTA,METHOD,NTRANS,NRSEED
COMMON /NVAMOD/ MODV,MODR
DIMENSION ER(MXO,0:MXE),NEUR(MXL,1+MXC),NEUS(MXL*MXC,2),ERRV(16)
DIMENSION X(MXI,MXE),H(MXL,MXC,MXE),Y(MXO,MXE),D(MXO,MXE)
DIMENSION OW(0:(MXI+MXL*MXC),MXO),OS(0:(MXI+MXL*MXC),MXO)
DIMENSION ODW(1+MXI+MXL*MXC,MXO),OPS(1+MXI+MXL*MXC,MXO)
C
TSQE=0.0
SQER=0.0
VSQE=0.0
MODSTD=1
NB=6
NERR=16
C
IF ((IDXMSR .EQ. 5) .OR. (IDXMSR .EQ. 2)) THEN
MODSTD=0
ELSE
IF (IDXMSR .EQ. 1) MODSTD=NORMER
END IF
C
DO 200 JOU=1,NUMOUT
ER(JOU,0)=0.0
DO 100 JW=0,MXI+MXL*MXC
OS(JW,JOU)=0.0
100 CONTINUE
200 CONTINUE
C
DO 500 KTH=1,NUMEXM
CALL OUTPAS (KTH,X,H(1,1,KTH),Y,OW,NEUR,MXI,MXL,MXC,MXO,MXE)
CALL COMERR (KTH,X,H,Y,D,OS,ER,NEUS,MXI,MXL,MXC,MXO,MXE)
500 CONTINUE
C
KX1=1
KX2=NUMEXM
CALL GETERV (Y,KX1,KX2,ERRV,NERR,MODSTD,0,MODV,MODR,MXO,MXE)
ERRVAL=ERRV(NB+IDXMSR)
IF (ERRVAL .GT. ERRTHR) THEN
CALL UPOWTS (OW, ODW, OS, OPS,MXI,MXL,MXC,MXO)
END IF
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C COMPUTE ERRORS AT KTH EXAMPLE
CCCCCCC
SUBROUTINE COMERR(K,X, H, Y, D, OS,ER,NEUS,MXI ,MXL,MXC,MXO,MXE)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NVLDEX,NTRIAL
COMMON /NTRAIN/ NRETUR,NUMCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
COMMON /STATIS/ TSQE,SQER,VSQE,VVAR,VDSTD,ERRVAL,ERRTHR
COMMON /WTSVAR/ WRANGE,BSTERR,TVAR,TDSTD,TSDSTD,VCOE,TCOE
COMMON /NTRNOU/ MXEPCO,NEPCO,NBONO,MSRO,NEUO,METHO,NRETO
COMMON /NVAMOD/ MODV,MODR
C
DIMENSION X(MXI,MXE),Y(MXO,MXE),D(MXO,MXE),H(MXL,MXC,MXE)
DIMENSION OS(0:(MXI+MXL*MXC),MXO),NEUS(MXL*MXC,2),ER(MXO,0:MXE)
CCCCCC
MARKV=MARKOP(K,MODV,MODR)
SQE=0.0
C
DO 400 JOU=1,NUMOUT
DIF=Y(JOU,K)-D(JOU,K)
ER(JOU,K)=DIF
EFP=DIF*OPRIME(Y(JOU,K),NEUO)
ER(JOU,0)=ER(JOU,0)+EFP
SQE=SQE+(DIF*DIF)
SQER=SQER+(EFP*EFP)
OS(0,JOU)=OS(0,JOU)+EFP
C
DO 200 JW=1,NUMINP
VAL=(X(JW,K)*EFP)
OS(JW,JOU)=OS(JW,JOU)+VAL
200 CONTINUE
C
IF (NUMNEU .GT. 0) THEN
DO 300 NNODES=1,NUMNEU
JW=NUMINP+NNODES
IROW=NEUS(NNODES,1)
JCOL=NEUS(NNODES,2)
OS(JW,JOU)=OS(JW,JOU)+H(IROW,JCOL,K)*EFP
300 CONTINUE
END IF
400 CONTINUE
C
TSQE=TSQE+SQE
IF (MARKV .EQ. 1) VSQE=VSQE+SQE
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C UPDATING WEIGHTS OF OUTPUT UNITS
CCCCCC
SUBROUTINE UPOWTS (OW, ODW, OS, OPS,MXI,MXL,MXC,MXO)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /PTROUT/ A,B,GAMMO,SF,THREO,DECAYO,WTSCRO
COMMON /NTRAIN/ NRETUR,NUMCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NRUNEX,NTRIAL
C
DIMENSION OW(1+MXI+MXL*MXC,MXO),OS(1+MXI+MXL*MXC,MXO)
DIMENSION ODW(1+MXI+MXL*MXC,MXO),OPS(1+MXI+MXL*MXC,MXO)
C
MAXWTS=1+MXI+MXL*MXC
NWTS=1+NUMINP+NUMNEU
EPS=A/NUMEXM
SF=B/(1.0+B)
DO 400 JOU=1,NUMOUT
DO 300 JTHW=1,NWTS
C UPDATE WTS USING QKPROP
CALL QKPROP (JTHW,MAXWTS,OW(1,JOU),ODW(1,JOU),
& OPS(1,JOU),OS(1,JOU),EPS,B,SF,DECAYO)
300 CONTINUE
400 CONTINUE
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C MAIN FUNCTION FOR INPUT TRAINING
CCCCCC
FUNCTION MTRINP (X,H,HW,CC,CD,CW,DW,S,PS,ER,NEUS,NEUR,
& MXI,MXL,MXC,MXO,MXE,MXD)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NTRAIN/ NRETUR,NUMCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NVLDEX,NTRIAL
COMMON /PTRAIN/ A,B,GAMMA,SF,BSTSCR,THRESH,EPCWTS
COMMON /NCANDP/ MXEPOC,NBONUS,IDXMSR,NETSTA,METHOD,NTRANS,NRSEED
COMMON /WTSVAR/ WRANGE,BSTERR,TRMSE,TRDEV,TRSTD,VLDCOE,TRNCOE
C
DIMENSION ER(MXO,0:MXE),NEUS(MXL*MXC,2),NEUR(MXL,0:MXC)
DIMENSION X(MXI,MXE),H(MXL,MXC,MXE),CD(MXD),CC(MXO,MXD,2)
DIMENSION CW(1+MXI+MXL+MXC,MXD),DW(1+MXI+MXL+MXC,MXD)
DIMENSION S(1+MXI+MXL+MXC,MXD),PS(1+MXI+MXL+MXC,MXD)
DIMENSION HW(1+MXI+MXL+MXC,MXL,MXC)
C
MTRINP=3
PSCORE=0.0
NQUIT=MXEPOC
NFIRST=1
NBX=0
NBC=0
NBV=0
NW=1+MXI+MXL+MXC
IROW=NEUS(NUMNEU+1,1)
JCOL=NEUS(NUMNEU+1,2)
NWTS=NUMHWT(IROW,JCOL,NUMINP,NBX,NBC,NBV)+1
WSCALE=2.0*WRANGE
C
DO 160 JCND=1,NUMCND
CD(JCND)=0.0
DO 140 JW=1,NW
CW(JW,JCND)=WSCALE*RANDOM(NRSEED)
IF (JW .GT. NWTS) CW(JW,JCND)=0.0
DW(JW,JCND)=0.0
S(JW,JCND)=0.0
PS(JW,JCND)=0.0
140 CONTINUE
C
DO 150 JOU=1,NUMOUT
CC(JOU,JCND,1)=0.0
CC(JOU,JCND,2)=0.0
150 CONTINUE
160 CONTINUE
C
DO 170 JOU=1,NUMOUT
ER(JOU,0)=ER(JOU,0)/NUMEXM
170 CONTINUE
C
CALL COREPC (X,H,CD,CC,CW,ER,NEUS,MXI,MXL,MXC,MXO,MXE,MXD)
C
NODEPC=0
DO 400 NEPC=1,MXEPOC
CALL CNDSLP (X,H,CC,CD,CW,S,ER,NEUS,
& MXI,MXL,MXC,MXO,MXE,MXD)
C
CALL UPHWTS (X,H,CC,CD,CW,DW,S,PS,ER,NEUS,NWTS,
& MXI,MXL,MXC,MXO,MXE,MXD)
C
CALL ADJCOR (ER,CD,CC,MXO,MXE,MXD)
C
NODEPC=NODEPC+1
NEPOCH=NEPOCH+1
IF (NFIRST .EQ. 1) THEN
NFIRST=0
PSCORE=BSTSCR
ELSE IF (ABS(BSTSCR-PSCORE) .GT. (PSCORE*THRESH)) THEN
PSCORE=BSTSCR
NQUIT=NEPC+NBONUS
ELSE
IF (NQUIT .LT. NEPC) THEN
DO 200 JW=1,NWTS
HW(JW,IROW,JCOL)=CW(JW,NBESTC)
200 CONTINUE
C
DO 300 K=1,NUMEXM
VAL=OUTHNU(X(1,K),NUMINP,H(1,1,K),CW(1,NBESTC),
& IROW,JCOL,MXI,MXL,MXC)
C
H(IROW,JCOL,K)=VAL
300 CONTINUE
MTRINP=2
GO TO 500
END IF
END IF
C
400 CONTINUE
C
500 NEUR(IROW,0)=NEUR(IROW,0)+1
NUMLAY=MAX(NUMLAY,IROW)
EPCWTS=EPCWTS+DBLE(NODEPC*NWTS)/1000.0D0
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C OBTAIN INITIAL VALUES OF NEW WEIGHTS OF OUTPUT UNITS
C USING COVARIANCE VALUES
CCCCCC
SUBROUTINE OWTNEW (OW,CC,MXI,MXL,MXC,MXO,MXD)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NTRAIN/ NRETUR,NUMCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NVLDEX,NTRIAL
C
DIMENSION CC(MXO,MXD,2),OW(0:(MXI+MXL*MXC),MXO)
C
WM=1.0/(1+NUMINP+NUMNEU)
NUMNEU=NUMNEU+1
JW=NUMINP+NUMNEU
DO 300 NOUT=1,NUMOUT
OW(JW,NOUT)=CC(NOUT,NBESTC,1)*WM
300 CONTINUE
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C GET NUMBER OF WEIGHTS (NOT INCLUDING BIAS) FOR NEURON (IROW,JCOL)
C NBX,NBC,NBV: BASE VALUES FOR INDEXING WEIGHTS W.R.T. ORIGINAL
C INPUTS, HIDDEN NODES IN THE (IROW-1) LAYER AND IN THE JCOL-TH
C COLUMN; A NEGATIVE BASE VALUE MEANS NO SUCH CONNECTIONS
CCCCCC
FUNCTION NUMHWT (IROW,JCOL,NUMINP,NBX,NBC,NBV)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /MODNEX/ NOX,NOV
C
NW=NUMINP*NOX+JCOL
NBX=0
NBC=NUMINP*NOX
NBV=-1
IF (IROW .GT. 2) THEN
NW=NW+(IROW-2)*NOV
IF (NOX .EQ. 0) NBX=-1
IF (NOV .EQ. 0) NBV=-1
IF (NOV .EQ. 1) NBV=NUMINP*NOX+JCOL
ELSE IF (IROW .EQ. 2) THEN
IF (NOX .EQ. 0) NBX=-1
ELSE
NW=NUMINP
NBC=-1
END IF
NUMHWT=NW
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C OUTPUT OF THE NEURON(IROW,JCOL) FOR THE KTH EXAMPLE
C X=XINP(1,K)
C NIN=NUMINP
C H=HOUT(1,1,K)
C HW=HWTS(1,IROW,JCOL)/CWTS(1,ICAND)
CCCCCC
FUNCTION OUTHNU (X,NIN,H,HW,IROW,JCOL,MXI,MXL,MXC)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NCANDP/ MXEPOC,NBONUS,IDXMSR,NETSTA,METHOD,NTRANS,NRSEED
C
DIMENSION X(MXI),H(MXL,MXC),HW(0:(MXI+MXL+MXC))
C
NBC=0
NBX=0
NBV=0
NWTS=NUMHWT(IROW,JCOL,NIN,NBX,NBC,NBV)
S=HW(0)
C SUM OF WEIGHTED INPUTS
IF (NBX .GE. 0) THEN
DO 100 I=1,NIN
S=S+X(I)*HW(NBX+I)
100 CONTINUE
END IF
C
C SUM OF WEIGHTED OUTPUTS OF NODES IN THE PREVIOUS LAYER
IF (NBC .GE. 0) THEN
DO 300 J=1,JCOL
S=S+H(IROW-1,J)*HW(NBC+J)
300 CONTINUE
END IF
C
C SUM OF WEIGHTED OUTPUTS OF NODES IN THE JCOL COLUMN
IF (NBV .GE. 0) THEN
DO 200 I=1,IROW-2
S=S+H(I,JCOL)*HW(NBV+I)
200 CONTINUE
END IF
C
OUTHNU=FTRANS(S,NTRANS)
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C COVARIANCE VALUES OF CANDIDATES AFTER ONE EPOCH OF INPUT TRAINING
CCCCCC
SUBROUTINE COREPC(X,H,CD,CC,CW,ER,NEUS,MXI,MXL,MXC,MXO,MXE,MXD)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
DIMENSION ER(MXO,0:MXE),NEUS(MXL*MXC,2)
DIMENSION X(MXI,MXE),H(MXL,MXC,MXE)
DIMENSION CD(MXD),CC(MXO,MXD,2),CW(1+MXI+MXL+MXC,MXD)
C
COMMON /NTRAIN/ NRETUR,NUMCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NVLDEX,NTRIAL
C
IR=NEUS(1+NUMNEU,1)
JC=NEUS(1+NUMNEU,2)
DO 300 K=1,NUMEXM
DO 200 JCND=1,NUMCND
VAL=OUTHNU(X(1,K),NUMINP,H(1,1,K),CW(1,JCND),
& IR,JC,MXI,MXL,MXC)
C
C SUM OF CANDIDATE'S OUTPUT, USED IN COMPUTING COVARIANCE VALUE
CD(JCND)=CD(JCND)+VAL
C
DO 180 JOU=1,NUMOUT
CC(JOU,JCND,2)=CC(JOU,JCND,2)+VAL*ER(JOU,K)
180 CONTINUE
C
200 CONTINUE
C
300 CONTINUE
C
CALL ADJCOR (ER, CD, CC,MXO, MXE,MXD)
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C ADJUST COVARIANCE VALUES OF CANDIDATES
C CD=CDOU(1,0)
C ER=ERRO(1,0)
CCCCCC
SUBROUTINE ADJCOR (ER, CD, CC, MXO, MXE, MXD)
C
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NTRAIN/ NRETUR,NUMCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
COMMON /PTRAIN/ ALPHA,BETA,GAMMA,SHRINK,BSTSCR,THRESH,EPCWTS
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NVLDEX,NTRIAL
COMMON /STATIS/ SSQE,SQER,SUMY,SMYY,VSTD,ERRVAL,ERRTHR
C
DIMENSION CC(MXO,MXD,2),CD(MXD),ER(MXO,0:MXE)
CCCCCC
NBESTC=0
BSTSCR=0.0
DO 200 JCND=1,NUMCND
CBAR=CD(JCND)/NUMEXM
COR=0.0
SCORE=0.0
C
DO 100 JOU=1,NUMOUT
C NORMALIZE COVARIANCE VALUES
COR=(CC(JOU,JCND,2)-ER(JOU,0)*CBAR)/SQER
CC(JOU,JCND,1)=COR
CC(JOU,JCND,2)=0.0
SCORE=SCORE+ABS(COR)
100 CONTINUE
CD(JCND)=0.0
IF (SCORE .GT. BSTSCR) THEN
BSTSCR = SCORE
NBESTC=JCND
END IF
200 CONTINUE
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C UPDATE WEIGHTS OF HIDDEN NODES
CCCCCC
SUBROUTINE UPHWTS (X,H, CC, CD, CW, DW, S, PS, ER,NEUS,NWTS,
& MXI,MXL,MXC,MXO,MXE,MXD)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NTRAIN/ NRETUR,NUMCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NVLDEX,NTRIAL
COMMON /PTRAIN/ A,B,GAMMA,SF,BSTSCR,THRESH,EPCWTS
C
DIMENSION ER(MXO,0:MXE),NEUS(MXL*MXC,2)
DIMENSION X(MXI,MXE),H(MXL,MXC,MXE),CC(MXO,MXD,2)
DIMENSION CW(1+MXI+MXL+MXC,MXD),DW(1+MXI+MXL+MXC,MXD)
DIMENSION S(1+MXI+MXL+MXC,MXD),PS(1+MXI+MXL+MXC,MXD)
C
NW=1+MXI+MXL+MXC
C
EPS=A/(NUMEXM*NWTS)
DEC=0.0
SF=B/(1.0+B)
DO 500 JCND=1,NUMCND
DO 400 JW=1,NWTS
C
CALL QKPROP (JW,NW,CW(1,JCND),DW(1,JCND),
& PS(1,JCND),S(1,JCND),EPS,B,SF,DEC)
C
400 CONTINUE
500 CONTINUE
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C COMPUTE PARTIAL DERIVATIVES OF COVARIANCE W.R.T. WEIGHTS
C FOR EACH CANDIDATE
CCCCCC
SUBROUTINE CNDSLP (X,H,CC,CD,CW,S,ER,NEUS,
& MXI,MXL,MXC,MXO,MXE,MXD)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NTRAIN/ NRETUR,NUMCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NVLDEX,NTRIAL
COMMON /NCANDP/ MXEPOC,NBONUS,IDXMSR,NETSTA,METHOD,NTRANS,NRSEED
COMMON /STATIS/ SSQE,SQER,SUMY,SMYY,VSTD,ERRVAL,ERRTHR
C
DIMENSION ER(MXO,0:MXE),NEUS(MXL*MXC,2),CD(MXD)
DIMENSION X(MXI,MXE),H(MXL,MXC,MXE),CC(MXO,MXD,2)
DIMENSION CW(1+MXI+MXL+MXC,MXD),S(0:(MXI+MXL+MXC),MXD)
C
NBC=0
NBX=0
NBV=0
IROW=NEUS(1+NUMNEU,1)
JCOL=NEUS(1+NUMNEU,2)
NWTS=NUMHWT(IROW,JCOL,NUMINP,NBX,NBC,NBV)
C
DO 400 K=1,NUMEXM
DO 300 JCND=1,NUMCND
CHANGE=0.0
DELTA=0.0
VAL=OUTHNU(X(1,K),NUMINP,H(1,1,K),CW(1,JCND),
& IROW,JCOL,MXI,MXL,MXC)
C
CD(JCND)=CD(JCND)+VAL
FP=FPRIME(VAL,NTRANS)
C
DO 100 JOU=1,NUMOUT
DIR=1.0
IF (CC(JOU,JCND,1) .LT. 0.0) DIR=-1.0
C NORMALIZED DELTA: DELTA=DIR*FP*(ER(JOU,K)-ER(JOU,0))/SQER
DELTA=DIR*FP*(ER(JOU,K)-ER(JOU,0))
CHANGE=CHANGE+DELTA
CC(JOU,JCND,2)=CC(JOU,JCND,2)+VAL*ER(JOU,K)
100 CONTINUE
C
C W.R.T. THE BIAS NODE
S(0,JCND)=S(0,JCND)+CHANGE
C
C W.R.T. THE NODES IN THE PREVIOUS LAYER
IF (NBC .GE. 0) THEN
DO 200 JW=1,JCOL
S(JW+NBC,JCND)=S(JW+NBC,JCND)+CHANGE*H(IROW-1,JW,K)
200 CONTINUE
END IF
C
C W.R.T. INPUTS
IF (NBX .GE. 0) THEN
DO 220 JW=1,NUMINP
S(NBX+JW,JCND)=S(NBX+JW,JCND)+CHANGE*X(JW,K)
220 CONTINUE
END IF
C
C W.R.T. THE NODES IN THE SAME COLUMN
IF (NBV .GT. 0) THEN
DO 240 JW=1,IROW-2
S(NBV+JW,JCND)=S(NBV+JW,JCND)+CHANGE*H(JW,JCOL,K)
240 CONTINUE
END IF
C
300 CONTINUE
400 CONTINUE
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C TEST TRAINED NET ON TEST SET
CCCCCC
SUBROUTINE TEST (X,D,Y,OW,H,HW,NEUR,TSER,NS,NZ,JS,
& MXI,MXL,MXC,MXO,MXE)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NRUNEX,NTRIAL
COMMON /WTSVAR/ WRANGE,BSTERR,TVAR,TDSTD,TSDSTD,VCOE,TCOE
COMMON /NCANDP/ MXEPOC,NBONUS,IDXMSR,NETSTA,METHOD,NTRANS,NRSEED
COMMON /STATIS/ SSQE,SQER,SUMY,SMYY,VSTD,ERRVAL,ERRTHR
C
DIMENSION NEUR(MXL,1+MXC),X(MXI,MXE),Y(MXO,MXE),R(16)
DIMENSION D(MXO,MXE),H(MXL,MXC),OW(1+MXI+MXL*MXC,MXO)
DIMENSION HW(1+MXI+MXL+MXC,MXL,MXC),TSER(NS,NZ),ERRV(16)
CCCCCC
SSQE=0.0
C COMPUTE OUTPUTS OF NEURON FOR THE KTH EXAMPLE
DO 200 K=NUMEXM+1,NUMEXM+NTSTEX
CALL HNUPAS (X(1,K),H,HW,NEUR,MXI,MXL,MXC)
CALL OUTPAS (K,X,H,Y,OW,NEUR,MXI,MXL,MXC,MXO,MXE)
C
DO 100 I=1,NUMOUT
DXY=Y(I,K)-D(I,K)
SSQE=SSQE+(DXY*DXY)
100 CONTINUE
C
200 CONTINUE
C
KX1=NUMEXM+1
KX2=NUMEXM+NTSTEX
NEXM=NUMEXM
STDD=TDSTD
NUMEXM=NTSTEX
TDSTD=TSDSTD
NR=16
CALL GETERV (Y,KX1,KX2,ERRV,NR,1,1,1,0,MXO,MXE)
DO 300 K=1,6
TSER (JS, K) =ERRV (K)
300 CONTINUE
C
NUMEXM=NEXM
TDSTD=STDD
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C INITIALIZE PARAMETERS OF THE NET
CCCCCC
SUBROUTINE ININET
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NRUNEX,NTRIAL
COMMON /NCANDP/ MXEPOC,NBONUS,IDXMSR,NUMCOD,METHOD,NTRANS,NRSEED
COMMON /PTRAIN/ ALPHA,BETA,GAMMA,SHRINK,BSTSCR,THRESH,TOLER
COMMON /NTRAIN/ NRETUR,NUMCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
COMMON /STATIS/ SSQE,STDD,VMSE,VDEV,VSTD,ERRVAL,ERRTHR
COMMON /WTSVAR/ WRANGE,BSTERR,TRMSE,TRDEV,TRSTD,VLDCOE,TRNCOE
C
C /NUMVAR/
NUMINP=0
NUMOUT=0
NUMEXM=0
NTSTEX=0
NRUNEX=0
NTRIAL=0
C /NCANDP/
MXEPOC=100
NBONUS=8
IDXMSR=1
CCCCCC NETSTA=0, TRAIN, 1, RUN,
NETSTA=0
METHOD=0
NTRANS=2
NRSEED=13
C /PTRAIN/
ALPHA=0.75
BETA=1.75
GAMMA=0.95
SHRINK=BETA/(1.0+BETA)
BSTSCR=0.0
THRESH=0.03
DTOLER=0.0
C
C /NTRAIN/
NRETUR=0
NUMCND=8
NEPOCH=0
NBESTC=0
NUMLAY=0
NUMNEU=0
C /STATIS/
SSQE=0.0
STDD=0.0
VMSE=0.0
VDEV=0.0
ERRVAL=0.0
VSTD=0.0
ERRTHR=0.0
C /WTSVAR/ WRANGE,BSTERR
WRANGE=0.5
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C WRITE HEADER OF CONFIGURATION FILE
CCCCCC
SUBROUTINE WTHEAD (NFILE,TRNFNM)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NRUNEX,NTRIAL
COMMON /NCANDP/ MXEP,NBON,IDXMSR,NETSTA,METHOD,NUTYPH,NRSEED
COMMON /STATIS/ SSQE,STDD,VMSE,VDEV,VSTD,ERRVAL,ERRTHR
COMMON /WTSVAR/ WRANGE,BSTERR,TRMSE,TRDEV,TRSTD,VLDCOE,TRNCOE
COMMON /NTRAIN/ NRETUR,NCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
COMMON /PTRAIN/ ALPHA,BETA,GAMMA,SHRINK,BSTSCR,THRESH,EPTH
COMMON /NTRNOU/ MXEPCO,NEPCO,NBONO,NORM,NUTYPO,NCENT,NRESO
COMMON /PTROUT/ ALPHO,BETO,GAMMO,SHNKO,THREO,DECAYO,EPCWTO
COMMON /NVAMOD/ MODV,MODR
COMMON /MODNEX/ NOX,NOV
COMMON /NETTOP/ NLAY,NCOL
CHARACTER*30 TRNFNM,CH*80
C
NETSTA=1
WRITE(NFILE,'(A)') TRNFNM
CH='REM NETSTA, NETLAY, NETCOL, HNUTYP, ONUTYP, ISHORT, HSHORT'
WRITE(NFILE,'(A)') CH
WRITE(NFILE,900) NETSTA,NLAY,NCOL,NUTYPH,NUTYPO,NOX,NOV
CCCCCC
CH='REM HMXEPC, HBONUS, HCANDS, OMXEPC, OBONUS'
WRITE(NFILE,'(A)') CH
WRITE(NFILE,900) MXEP,NBON,NCND,MXEPCO,NBONO
CH='REM HLRNRT, HMXLRN, HTHRES, WRANGE'
WRITE(NFILE,'(A)') CH
WRITE(NFILE,850) ALPHA,BETA,THRESH,WRANGE
CH='REM OLRNRT, OMXLRN, OTHRES, ODECAY, ETOLER'
WRITE(NFILE,'(A)') CH
WRITE(NFILE,850) ALPHO,BETO,THREO,DECAYO*100.0,ERRTHR
CH='REM NUMLAY, NUMNEU, ERNORM, PERCNT, ERRMSR'
WRITE(NFILE,'(A)') CH
WRITE(NFILE,900) NUMLAY,NUMNEU,NORM,NCENT,IDXMSR
CH='REM INPUTS, OUTPTS, VLDMOD, VLDVAL, NSEEDS'
WRITE(NFILE,'(A)') CH
WRITE(NFILE,900) NUMINP,NUMOUT,MODV,MODR,NTRIAL
C
850 FORMAT (4X,10(F7.4,1X))
900 FORMAT (4X,10(I7,1X))
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C UPDATE NET CONFIGURATION FILE
C SETUP PARAMETERS
C RANDOM SEEDS FOR TRAINING THE NET
C SAVE WEIGHTS OF THE BEST-RUN NETWORK
CCCCCC
SUBROUTINE WRTNET (HW,OW,NSEED,NS,NEUR,NETFNM,TRNFNM,
& MXI,MXL,MXC,MXO)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NCANDP/ MXEP,NBON,IDXMSR,NETSTA,METHOD,NTRANS,NRSEED
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NRUNEX,NTRIAL
COMMON /NTRAIN/ NRETUR,NCND,NEPOCH,NBESTC,NUMLAY,NUMNEU
COMMON /NETTOP/ NETLAY,NETCOL
C
DIMENSION NEUR(MXL,0:MXC),HW(1+MXI+MXL+MXC,MXL,MXC)
DIMENSION OW(1+MXI+MXL*MXC,MXO),NSEED(0:NS)
CHARACTER*30 NETFNM,TRNFNM,CH*80
C
NBC=0
NBX=0
NBV=0
NFILE=30
DERR=0.0
NETSTA=1
OPEN (UNIT=NFILE,FILE=NETFNM,STATUS='OLD')
CALL WTHEAD (NFILE,TRNFNM)
CH='REM RANDOM SEEDS'
WRITE(NFILE,'(A)') CH
C
DO 120 NT=1,NTRIAL
WRITE(NFILE,*) NSEED(NT)
120 CONTINUE
C
CH='REM BEST SEED NO. AND SEED VALUE'
NRSEED=NSEED(0)
WRITE(NFILE,'(A)') CH
WRITE(NFILE,'(2I10)') NRSEED,NSEED(NRSEED)
C
IF (NUMLAY .GT. 0) THEN
CH='REM NUMBERS OF HIDDEN NODES'
WRITE(NFILE,'(A)') CH
IF (NETCOL .EQ. 1) THEN
WRITE(NFILE,*) NUMNEU
ELSE
DO 100 NL=1,NUMLAY
WRITE(NFILE,*) NEUR(NL,0)
100 CONTINUE
END IF
C
CH='REM WEIGHTS OF HIDDEN NODES'
WRITE(NFILE,'(A)') CH
DO 300 NL=1,NUMLAY
DO 200 NC=1,NEUR(NL,0)
NWTS=NUMHWT(NL,NC,NUMINP,NBX,NBC,NBV)+1
WRITE(NFILE,800) NL,NC
DO 160 NW=1,NWTS
WRITE(NFILE,700) HW(NW,NL,NC)
160 CONTINUE
200 CONTINUE
300 CONTINUE
CCCCCC
END IF
C
CH='REM WEIGHTS OF OUTPUT NODES'
WRITE(NFILE,'(A)') CH
NWTS=1+NUMINP+NUMNEU
C
DO 500 NOUT=1,NUMOUT
WRITE(NFILE,800) NOUT
DO 400 NW=1,NWTS
WRITE(NFILE,700) OW(NW,NOUT)
400 CONTINUE
500 CONTINUE
C
CLOSE (NFILE)
700 FORMAT (1X,D24.15)
800 FORMAT (I5,2X,I5)
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C OPERATION CODE SELECTING VALIDATION SET FROM TRAINING DATA
CCCCCC
FUNCTION MARKOP (K,MODV,MODR)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
IF ((MODV .GT. 1) .AND. (MODR .GE. MODV)) MODR=0
MARKOP=0
IF (MODV .EQ. 1) THEN
IF (K .GT. MODR) MARKOP=1
ELSE
IF (MOD(K,MODV) .EQ. MODR) MARKOP=1
END IF
C
END
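CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C ILLUSTRATIVE SKETCH, NOT PART OF THE ORIGINAL PROGRAM.
C NVALID IS A HYPOTHETICAL HELPER THAT COUNTS HOW MANY OF THE FIRST
C NEX EXAMPLES MARKOP ASSIGNS TO THE VALIDATION SET. FOR INSTANCE,
C WITH MODV=5 AND MODR=0 EVERY FIFTH EXAMPLE (K=5,10,15,...) IS
C MARKED, SO NEX=20 GIVES 4; WITH MODV=1 AND MODR=80 EVERY EXAMPLE
C AFTER THE 80TH IS MARKED. MODV IS ASSUMED TO BE AT LEAST 1.
C MARKOP MAY RESET MODR, SO THE CALLER SHOULD PASS A VARIABLE,
C NOT A CONSTANT, FOR MODR.
CCCCCC
      FUNCTION NVALID (NEX, MODV, MODR)
      INTEGER NVALID, NEX, MODV, MODR, K, MARKOP
      NVALID=0
C MARKOP RETURNS 1 FOR A VALIDATION EXAMPLE AND 0 OTHERWISE
      DO 10 K=1, NEX
         NVALID=NVALID+MARKOP(K, MODV, MODR)
   10 CONTINUE
      END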
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C COMPUTE VARIANCE AND STANDARD DEVIATION FOR A GIVEN SET OF EXAMPLE
C R(I): I=1, VARIANCE
C 2, STANDARD DEVIATION
C 3, SUM(X^2)
C 4, SUM(X)
C 5, MAX(X)
C 6, MIN(X)
C FOR ALL TRAINING EXAMPLES (I=1-6) AND VALIDATION SET (I=7-12)
C NT=COUNT OF TOTAL SAMPLES
C NV=COUNT OF VALIDATION SAMPLES
CCCCCC
SUBROUTINE GETSTD (X,MDY,MDX,NY1,NY2,NX1,NX2,MDV,MDR,R,NR,NV)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
DIMENSION X (MDY,MDX), R (NR)
C
NBV=6
DO 100 I=1,NR
R(I)=0.0
100 CONTINUE
TMX=-1.0D30
TMN=-TMX
VMX=TMX
VMN=TMN
NV=0
NT=0
NY=NY2-NY1+1
TCOE=0.0
VCOE=0.0
C
DO 500 K=NX1,NX2
NCODE=MARKOP (K, MDV, MDR)
SMX=0.0
SQX=0.0
DO 400 J=NY1, NY2
SQX=SQX+X(J,K)*X(J,K)
SMX=SMX+X(J,K)
TMX=MAX (TMX, X(J,K))
TMN=MIN (TMN, X(J,K))
IF (NCODE .EQ. 1) THEN
VMX=MAX (VMX, X(J,K))
VMN=MIN (VMN, X(J,K))
END IF
400 CONTINUE
NT=NT+1
TCOE=DBLE(NT-1)/DBLE(NT)
R(3)=R(3)*TCOE+SQX/NT
R(4)=R(4)*TCOE+SMX/NT
IF(NCODE .EQ. 1) THEN
NV=NV+1
VCOE=DBLE(NV-1)/DBLE(NV)
R(NBV+3)=R(NBV+3)*VCOE+SQX/NV
R(NBV+4)=R(NBV+4)*VCOE+SMX/NV
END IF
C
500 CONTINUE
C
IF(NT .GT. 1) THEN
TCOE=DBLE(NT)/DBLE(NY*NT-1)
R(1)=(R(3)-R(4)*(R(4)/NY))*TCOE
R(2)=SQRT(R(1))
R(5)=TMX
R(6)=TMN
END IF
C
IF(NV .GT. 1) THEN
VCOE=DBLE(NV)/DBLE(NY*NV-1)
R(NBV+1)=(R(NBV+3)-R(NBV+4)*(R(NBV+4)/NY))*VCOE
R(NBV+2)=SQRT(R(NBV+1))
R(NBV+5)=VMX
R(NBV+6)=VMN
END IF
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C READ TRAINING DATA FROM INPUT FILE
CCCCCC
SUBROUTINE RDPROB (X, D, FNM, MXI, MXO, MXE)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NVLDEX,NTRIAL
COMMON /STATIS/ TSQE, SQER, VSQE, VVAR, VDSTD, ERRVAL,ERRTHR
COMMON /WTSVAR/ WRANGE,BSTERR,TVAR,TDSTD,TSDSTD,VCOE,TCOE
COMMON /NVAMOD/ MODV, MODR
C
DIMENSION X(MXI,MXE),D(MXO,MXE),TEMP(MXI+MXO),R(16)
CHARACTER*30 FNM, CH*256
CCCCCC
NFILE=31
NUM=0
NFLAG=0
NLEN=0
NUMEXM=0
NTSTEX=0
NVLDEX=0
NTRNEX=0
C
OPEN (UNIT=NFILE, FILE=FNM, STATUS='OLD')
C
READ (NFILE, *) CH, NUMINP
READ (NFILE, *) CH, NUM
NUMINP=NUMINP+NUM
READ (NFILE, *) CH, NUMOUT
READ (NFILE, *) CH, NUM
NUMOUT=NUMOUT+NUM
READ (NFILE, *) CH, NTRNEX
READ (NFILE, *) CH, NVLDEX
READ (NFILE, *) CH, NTSTEX
NUMEXM=NTRNEX+NVLDEX
NUM=NUMEXM+NTSTEX
C
IF(MODV .EQ. 1) THEN
MODR=NTRNEX
IF(MODR .GE. NUMEXM) THEN
MODR=0
NVLDEX=NUMEXM
END IF
ELSE
IF((MODV .GT. 4) .OR. (MODV .LT. 2) .OR. (MODR .LT. 0)
& .OR. (MODR .GT. MODV)) THEN
WRITE(*,*) 'Invalid value for VLDMOD or VLDVAL;'
WRITE(*,*) 'Default VLDMOD=3, VLDVAL=2 assumed.'
MODV=3
MODR=2
END IF
END IF
C
IF((NUM .GT. MXE) .OR. (NUMINP .GT. MXI)
& .OR. (NUMOUT .GT. MXO)) THEN
WRITE(*,*) 'Storage not enough for training data;'
NFLAG=1
END IF
C
IF(NFLAG .GT. 0) THEN
CLOSE (NFILE)
STOP 'Reading data terminated!'
END IF
NLEN=NUMINP+NUMOUT
CCCCCC
DO 500 K=1,NUM
READ (NFILE, *) (TEMP(J), J=1, NLEN)
C
DO 100 NIN=1,NUMINP
X (NIN, K) =TEMP (NIN)
100 CONTINUE
C
DO 200 NOUT=1,NUMOUT
D (NOUT, K) =TEMP (NUMINP+NOUT)
200 CONTINUE
C
500 CONTINUE
C
CLOSE (NFILE)
C
NR=16
NB=6
CALL GETSTD (D,MXO,MXE, 1, NUMOUT, 1, NUMEXM,MODV,MODR, R, NR, NV)
NVLDEX=NV
TDSTD=R(2)
VDSTD=R(NB+2)
C
NX1=NUMEXM+1
NX2=NUMEXM+NTSTEX
CALL GETSTD (D,MXO,MXE, 1,NUMOUT, NX1,NX2,1,NX2+1, R,NR,NV)
TSDSTD=R(2)
700 FORMAT (40 (1X, D24.15) )
END
C
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C RUN TRAINED NET ON NEW INPUT DATA
C X=XINP
C Y=YOUT
CCCCCC
SUBROUTINE RUNNET (X, Y, H, HW, OW, NEUR, FNMIN, FNMOUT,
& MXI,MXL,MXC,MXO,MXE)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NRUNEX,NTRIAL
C
DIMENSION NEUR(MXL,1+MXC),X(MXI,MXE),Y(MXO,MXE),H(MXL,MXC)
DIMENSION OW (1+MXI+MXL*MXC,MXO), HW (1+MXI+MXL+MXC, MXL,MXC)
CHARACTER*30 FNMIN, FNMOUT
CCCCCC
NFX=32
NFY=33
NINP=0
OPEN (UNIT=NFX, FILE=FNMIN, STATUS='OLD')
READ (NFX, *) NINP, NRUNEX
C
IF(NINP .NE. NUMINP) THEN
WRITE(*,*)'Mismatched number of inputs!'
CLOSE (NFX)
STOP
END IF
C
C FORWARD PASS: GET OUTPUTS OF THE HIDDEN AND OUTPUT UNITS
C
OPEN (UNIT=NFY, FILE=FNMOUT, STATUS='OLD')
WRITE (NFY, *) NINP, ' ', NRUNEX
C
NX=1
DO 100 K=1,NRUNEX
READ (NFX, *) (X(I,NX), I=1,NUMINP)
CALL HNUPAS (X(1,NX),H,HW,NEUR,MXI,MXL,MXC)
CALL OUTPAS (NX, X, H, Y, OW, NEUR, MXI, MXL, MXC, MXO, MXE)
WRITE (NFY, 700) (Y(J,NX), J=1,NUMOUT)
100 CONTINUE
C
C
CLOSE (NFX)
CLOSE (NFY)
WRITE(*,*)'Outputs stored in file '//FNMOUT
700 FORMAT(40(1X,D24.15))
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C COMPUTE OUTPUTS OF THE HIDDEN UNITS FOR SINGLE INPUT VECTOR
CCCCCC
SUBROUTINE HNUPAS (TX, TH, HW, NEUR,MXI,MXL,MXC)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
DIMENSION TX (MXI), TH (MXL, MXC)
DIMENSION NEUR (MXL, 0:MXC), HW (1+MXI+MXL+MXC, MXL, MXC)
CCCCCC
DO 120 I=1,NETLAY
NCOL=NEUR (I, 0)
IF (NCOL .GT. 0) THEN
DO 110 J=1, NCOL
TH(I,J)=OUTHNU(TX,NUMINP,TH,HW(1,I,J),I,J)
110 CONTINUE
ELSE
RETURN
END IF
C
120 CONTINUE
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C SHOW STATISTICAL RESULTS OF TRAINING
CCCCCC
SUBROUTINE SHOWER (NTS, TR, VR, SR, NS, NZ, JS, NF)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
DIMENSION NTS(NS,NZ),TR(NS,NZ),VR(NS,NZ),SR(NS,NZ)
CHARACTER*5 CHU(6),CHT(6),CHV(6)
CHARACTER CH1*17,CH2*17,CH3*17,C1*18,C2*11
C
CHT(1)='MSE: '
CHT(2)='VAR: '
CHT(3)='IEW: '
CHT(4)='OEW: '
CHT(5)='EIX: '
CHT(6)='FW: '
C
CHV(1)='MSE: '
CHV(2)='RMS: '
CHV(3)='VAR: '
CHV(4)='STD: '
CHV(5)='EIX: '
CHV(6)='FW: '
C
CHU(1)='HNUS: '
CHU(2)='HWTS: '
CHU(3)='TWTS: '
CHU(4)='IEPC: '
CHU(5)='OEPC: '
CHU(6)='TEPC: '
C
C1=' actual error'
C2=' '
IF(NORMER .EQ. 1) C1=' normalized error '
IF(NCENT .EQ. 1) C2=' percentage'
CH1='On Training Data'
CH2='On Validation Set'
CH3='On Test Set'
C
IF(NF .EQ. 0) THEN
WRITE(*,*) 'TRIAL NO: ',JS
WRITE(*,*) 'SQUARED ERROR: '//C1//C2
C WRITE(*,*)
WRITE(*,700) CH1,CH2,CH3
ELSE
WRITE (NF,*) 'TRIAL NO: ',JS
WRITE (NF,*) 'SQUARED ERROR: '//C1//C2
C WRITE (NF,*)
WRITE (NF,700) CH1,CH2,CH3
END IF
C
DO 200 J=1,6
IF(NF .EQ. 0) THEN
WRITE(*,800)CHU(J),NTS(JS,J),CHT(J),TR(JS,J),CHV(J),
& VR(JS,J),CHV(J),SR(JS,J)
ELSE
WRITE(NF,800)CHU(J),NTS(JS,J),CHT(J),TR(JS,J),CHV(J),
& VR(JS,J),CHV(J),SR(JS,J)
END IF
C
200 CONTINUE
C
700 FORMAT (18X,A,2(2X,A))
800 FORMAT(A,1X,I10,3(3X,A,F11.4))
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C GATHER THE STATISTICAL PARAMETERS ABOUT THE NET TRAINED
C NUMBER OF HIDDENS, WEIGHTS, EPOCHS
C ERRORS ON TRAINING DATA AND VALIDATION SET
CCCCCC
SUBROUTINE GETSTA (NTST, NEUR, Y, TNER, VAER, NS, NZ, JS,
& MXL, MXC, MXO, MXE)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NVLDEX,NTRIAL
COMMON /PTRAIN/ ALPHA, BETA, GAMMA, SHRINK, BSTSCR, THRESH, EWTSIN
COMMON /NTRAIN/ NRETUR,NUMCND,NEPCIN,NBESTC,NUMLAY,NUMNEU
COMMON /PTROUT/ ALPHO, BETO, GAMMO, SHNKO, THREO, DECAYO, EWTSOU
COMMON /NTFWOU/ MXEPCO, NEPCOU, NBONO, MSRO, NEUO, METHO, NRESO
COMMON /MODNEX/ NOX, NOV
COMMON /NVAMOD/ MODV, MODR
C
DIMENSION NEUR(MXL,0:MXC),NTST(NS,NZ),TNER(NS,NZ),VAER(NS,NZ)
DIMENSION R(16),ERRV(16),Y(MXO,MXE)
C
NB=6
NR=16
NHWTS=NCONEX (NEUR(1,0), MXL, NUMINP, NOX, NOV)
CALL GETERV (Y, KX1, KX2, ERRV, NR, 1, 1, MODV, MODR, MXO, MXE)
C
DO 100 K=1,6
VAER (JS, K) =ERRV (NB+K)
TNER (JS, K) =ERRV (K)
100 CONTINUE
C
TNER (JS, 3) =EWTSIN
TNER (JS, 4) =EWTSOU
NTST (JS, 1) =NUMNEU
NTST (JS, 2) =NHWTS
NTST (JS, 3) =NHWTS+NYWTS
NTST (JS, 4) =NEPCIN
NTST (JS, 5) =NEPCOU
NTST (JS, 6) =NEPCIN+NEPCOU
C
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C OBTAIN STATISTICAL RESULTS OF TRAINING AND TESTING
CCCCCC
SUBROUTINE ERRSTA (NETTST, AERR, NS, NZ, MS, MODE, IDXMIN, NBST, NF)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
DIMENSION AERR (NS, NZ), NETTST (NS, NZ)
CHARACTER*4 TITLE (6), CH*80
CCCCCC
AVP=0.0
TITLE(1)='MSE '
TITLE(2)='RMS '
TITLE(3)='VAR '
TITLE(4)='STD '
TITLE(5)='EIX '
TITLE(6)='FW '
C
IF (MODE .EQ. 0) THEN
WRITE (NF, *) ' STATS ON VALIDATION SET: '
ELSE IF(MODE .EQ. 1) THEN
WRITE (NF, *) ' STATS ON TEST SET: '
ELSE
WRITE (NF, *) ' STATS ON TRAINING DATA '
TITLE(2)='VAR '
TITLE(3)='IEW '
TITLE(4)='OEW '
END IF
C
CH=' AVERAGE MIN MAX'
& //' MX-MN AVPOS STD DEV'
WRITE (NF, '(A)') CH
CCCCCC
NBST=1
VALUE=0.0
C
DO 550 JE=1,NZ
SUM=0.0
DEV=0.0
VMIN=AERR (1, JE)
VMAX=AERR (1, JE)
DO 500 JS=1,MS
VALUE=AERR (JS, JE)
SUM=SUM+VALUE
DEV=DEV+VALUE*VALUE
IF (VALUE .LT. VMIN) THEN
VMIN=VALUE
IF(IDXMIN .EQ. JE) NBST=JS
END IF
IF (AERR (JS, JE) .GT. VMAX) VMAX=AERR (JS, JE)
500 CONTINUE
C
SUM=SUM/MS
DEV=(DEV-SUM*SUM*MS)/(MS-1)
DEV=SQRT (DEV)
AVP=(SUM-VMIN)/(VMAX-VMIN)
WRITE (NF, 900) TITLE (JE), SUM, VMIN, VMAX, VMAX-VMIN, AVP, DEV
550 CONTINUE
CCCCCC
C
WRITE (NF, *)
IF(MODE .EQ. 0) THEN
WRITE (NF, *) ' BEST NET BY '//TITLE (IDXMIN)//' ON TRIAL No : ', NBST
WRITE (NF, *)
END IF
900 FORMAT(A,1X,4F12.4,1X,F6.2,F14.4)
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C SUMMARY REPORT FOR TRAINING AND TESTING
C THE STAT. RESULTS WRITTEN INTO FILE
CCCCCC
SUBROUTINE REPORT (NTST, TNER, VAER, TSER, MSED, NS, NZ, NF, TRNFNM)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
COMMON /NUMVAR/ NUMINP,NUMOUT,NUMEXM,NTSTEX,NRUNEX,NTRIAL
COMMON /NCANDP/ MXEP,NBON,IDXMSR,NUMCOD,METHOD,NTRANS,NRSEED
COMMON /STATIS/ TSQE, SQER, VSQE, VVAR, VDSTD, ERRVAL, ERRTHR
COMMON /WTSVAR/ WRANGE, BSTERR, TVAR, TDSTD, TSDSTD, VCOE, TCOE
C
DIMENSION NTST(NS,NZ),TNER(NS,NZ),TSER(NS,NZ)
DIMENSION VAER (NS, NZ), MSED (0:NS)
CHARACTER*4 TA(6),CH*80,TRNFNM*30
CCCCCC
TA(1)='HNUS'
TA(2)='HWTS'
TA(3)='TWTS'
TA(4)='IEPC'
TA(5)='OEPC'
TA(6)='TEPC'
C
CH='XCAS NETWORK'
IF(MXC .EQ. 1) CH='CASCOR NETWORK'
C
WRITE (NF, *) CH
CALL WTHEAD (NF, TRNFNM)
WRITE (NF, *)
WRITE (NF, *) 'TARGET STD FOR TRAINING DATA ', TDSTD
WRITE (NF, *) 'TARGET STD FOR VALIDATION SET ', VDSTD
WRITE (NF, *) 'TARGET STD FOR TEST SET ', TSDSTD
WRITE (NF, *)
C
AVP=0.0
WRITE (NF, *)
CH=' AVERAGE MIN MAX'
& //' MX-MN AVPOS STD DEV'
WRITE (NF, '(A)') CH
CCCCCC
DO 200 K=1,NZ
SUM=0.0
MINV=NTST (1, K)
MAXV=NTST (1, K)
DEV=0.0
VAL=0.0
DO 100 JS=1, NTRIAL
VAL=DBLE (NTST (JS, K))
SUM=SUM+VAL/NTRIAL
DEV=DEV+VAL*(VAL/(NTRIAL-1))
MAXV=MAX (MAXV, NTST(JS,K))
MINV=MIN (MINV, NTST(JS,K))
100 CONTINUE
AVP=(SUM-MINV)/DBLE(MAXV-MINV)
DEV=DEV-NTRIAL*SUM*(SUM/(NTRIAL-1))
DEV=SQRT (DEV)
WRITE (NF, 900) TA (K), SUM, MINV, MAXV, MAXV-MINV, AVP, DEV
200 CONTINUE
NETBST=0
WRITE (NF, *)
CALL ERRSTA (NTST, VAER, NS, NZ, NTRIAL, 0, 1, NBST, NF)
NETBST=NBST
CALL ERRSTA (NTST, TSER, NS, NZ, NTRIAL, 1, IDXMSR, NBST, NF)
CALL ERRSTA (NTST, TNER, NS, NZ, NTRIAL, 2, IDXMSR, NBST, NF)
WRITE (NF, *)
WRITE(NF,*)'===========STATS ON BEST NET====================='
CALL SHOWER (NTST, TNER, VAER, TSER, NS, NZ, NETBST, NF)
WRITE(NF,*)'================================================='
WRITE (NF, *)
DO 300 NT=1,NTRIAL
CALL SHOWER (NTST, TNER, VAER, TSER, NS, NZ, NT, NF)
WRITE (NF, *)
300 CONTINUE
900 FORMAT(A,1X,F12.1,3I12,1X,F6.2,F14.1)
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C RETURN A RANDOM NUMBER BETWEEN -0.5 AND 0.5
C REFERENCE: "A PORTABLE RANDOM NUMBER GENERATOR FOR USE IN SIGNAL
C PROCESSING", SANDIA NATIONAL LABORATORIES TECHNICAL
C REPORT, BY S. D. STEARNS.
CCCCCC
FUNCTION RANDOM (N)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
N=2045*N+1
N=N-(N/1048576)*1048576
RANDOM=(N+1)/1048577.0-0.5D0
C
END
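CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C ILLUSTRATIVE SKETCH OF ONE WAY RANDOM MIGHT BE USED TO DRAW INITIAL
C WEIGHTS. THE NAMES DEMORN, W, NW, NSD AND RNG, AND THE SEED AND
C RANGE VALUES, ARE ASSUMPTIONS MADE FOR ILLUSTRATION ONLY.
C RANDOM UPDATES ITS SEED ARGUMENT IN PLACE AS MOD(2045*N+1,1048576)
C AND RETURNS A VALUE STRICTLY INSIDE (-0.5,0.5), SO SCALING BY RNG
C GIVES VALUES INSIDE (-RNG/2,RNG/2). ASSUMING 32-BIT DEFAULT
C INTEGERS, THE SEED SHOULD BE NON-NEGATIVE AND BELOW 1048576 SO
C THAT 2045*N+1 DOES NOT OVERFLOW.
CCCCCC
SUBROUTINE DEMORN
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
PARAMETER (NW=5)
DIMENSION W(NW)
C
C THE SEED MUST BE A VARIABLE BECAUSE RANDOM MODIFIES IT
NSD=12345
RNG=2.0D0
DO 10 I=1,NW
W(I)=RNG*RANDOM(NSD)
10 CONTINUE
WRITE(*,*) (W(I), I=1,NW)
C
END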
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C
C COMPUTE OUTPUT OF HIDDEN NEURON
C NTRANS:
C 1: LOGISTIC SIGMOID, 0<Y<1.0
C Y(X)=1.0/(1.0+EXP(-X))
C 2: SYMMETRIC LOG-SIGMOID, -0.5<Y<0.5
C Y(X)=1.0/(1.0+EXP(-X))-0.5
C 3: HYPERBOLIC TANGENT SIGMOID, -1.0<Y<1.0
C Y(X)=(EXP(X)-EXP(-X))/(EXP(X)+EXP(-X))
CCCCCC
FUNCTION FTRANS (S, NTRANS)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
Y=0.0
IF(NTRANS .EQ. 0) THEN
Y=S
ELSE IF(NTRANS .LT. 3) THEN
CCCCCC LOGISTIC SIGMOID: 0<Y<1.0
IF(S .LT. -15.0) THEN
Y=0.0
ELSE IF(S .GT. 15.0) THEN
Y=1.0
ELSE
Y=1.0/(1+EXP(-S))
END IF
CCCCCC SYMMETRIC LOG-SIGMOID: -0.5<Y<0.5
IF(NTRANS .EQ. 2) Y=Y-0.5
C
ELSE
CCCCCC HYPERBOLIC TANGENT SIGMOID, -1.0<Y<1.0
IF(S .LT. -8.0) THEN
Y=-1.0
ELSE IF (S .GT. 8.0) THEN
Y=1.0
ELSE
Y=EXP(-2.0*S)
Y=(1.0-Y)/(1.0+Y)
END IF
END IF
C
FTRANS=Y
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C COMPUTE DERIVATIVE OF ACTIVATION FUNCTION FOR A HIDDEN NODE
CCCCCC
FUNCTION FPRIME(F,NTRANS)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
FP=0.0
IF(NTRANS .EQ. 0) THEN
FP=1.0
ELSE IF(NTRANS .EQ. 1) THEN
FP=F*(1.0-F)
ELSE IF(NTRANS .EQ. 2) THEN
FP=0.25-F*F
ELSE
FP=1.0-F*F
END IF
C
FPRIME=FP
END
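CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C ILLUSTRATIVE SKETCH CHECKING FPRIME AGAINST A CENTRAL DIFFERENCE OF
C FTRANS. THE NAMES DEMOFP, DS, FD, NTR AND THE TEST POINT S=0.3 ARE
C ASSUMPTIONS MADE FOR ILLUSTRATION ONLY.
C FPRIME EXPECTS THE UNIT OUTPUT F=FTRANS(S,NTRANS), NOT THE NET
C INPUT S, BECAUSE EACH DERIVATIVE CAN BE WRITTEN IN TERMS OF F:
C 1: F*(1-F), 2: 0.25-F*F, 3: 1-F*F, AND 1 FOR THE LINEAR CASE.
C THE TWO PRINTED VALUES SHOULD AGREE CLOSELY FOR EACH NTR.
CCCCCC
SUBROUTINE DEMOFP
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
S=0.3D0
DS=1.0D-5
DO 10 NTR=0,3
F=FTRANS(S,NTR)
C CENTRAL-DIFFERENCE ESTIMATE OF DY/DS AT S
FD=(FTRANS(S+DS,NTR)-FTRANS(S-DS,NTR))/(2.0D0*DS)
WRITE(*,*) NTR, FPRIME(F,NTR), FD
10 CONTINUE
C
END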
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C COMPUTE DERIVATIVE OF ACTIVATION FUNCTION FOR OUTPUT NODE
C
CCCCCC
FUNCTION OPRIME (F, NTRANS)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
FP=0.0
IF(NTRANS .EQ. 0) THEN
FP=1.0
ELSE IF(NTRANS .EQ. 1) THEN
FP=F*(1.0-F)+0.1
ELSE IF(NTRANS .EQ. 2) THEN
FP=0.25-F*F+0.1
ELSE
FP=1.0-F*F+0.1
END IF
C
OPRIME=FP
END
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
C QUICKPROP (MINIMIZING) FOR UPDATING WEIGHTS. BY SCOTT FAHLMAN, 1990
C I: THE ITH WEIGHT
C HW: WEIGHTS OF HIDDEN UNITS
C PS: PREVIOUS SLOPE
C S: CURRENT SLOPE
C DW: PREVIOUS DELTA W
C A: LEARNING RATE (0.1-1.0, 0.6 IS OK)
C B: MAX LEARNING RATE (0.8-2.0)
C SF: SHRINK FACTOR =B/(1.0+B)
C DEC: DECAY FACTOR (0001 FOR OUTPUT UPDATING)
CCCCCC
SUBROUTINE QKPROP (I, MXW, CW, DW, PS, S, A, B, SF, DEC)
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
DIMENSION CW (MXW), PS (MXW), S (MXW), DW (MXW)
CCCCCC
D=DW (I)
SL=S (I) +D*DEC
PSL=PS (I)
DX=0.0
CCCCCC
IF(D .LT. 0.0) THEN
C LAST STEP WAS NEGATIVE
C
IF(SL .GT. 0.0) DX=-A*SL
C
IF(SL .GE. (PSL*SF)) THEN
DX=DX+B*D
ELSE
DX=DX + D*SL/(PSL - SL)
END IF
CCCCCC
ELSE IF(D .GT. 0.0) THEN
C LAST STEP WAS POSITIVE
IF(SL .LT. 0.0) DX=-A*SL
C
IF(SL .LE. (PSL*SF)) THEN
DX=DX+B*D
ELSE
DX=DX + D*SL/(PSL - SL)
END IF
ELSE
C NO PREVIOUS STEP: PLAIN GRADIENT DESCENT
DX=-A*SL
END IF
C
DW(I) = DX
CW(I) = CW(I) + DX
PS(I) = SL
S(I) = 0.0
C
END
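CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C
C ILLUSTRATIVE SKETCH OF THE QKPROP CALLING CONVENTION ON A ONE-WEIGHT
C PROBLEM. THE NAMES DEMOQK AND NEP, THE ERROR E(W)=0.5*(W-3)**2 AND
C THE PARAMETER VALUES ARE ASSUMPTIONS MADE FOR ILLUSTRATION ONLY.
C BEFORE EACH CALL THE CALLER STORES THE CURRENT SLOPE DE/DW IN S;
C QKPROP THEN UPDATES CW, DW AND PS AND RESETS S TO ZERO. IF QKPROP
C MINIMIZES AS ITS HEADER STATES, CW(1) SHOULD APPROACH 3.
CCCCCC
SUBROUTINE DEMOQK
C
IMPLICIT DOUBLE PRECISION (A-H,O-Z),INTEGER(I-N)
C
PARAMETER (MXW=1)
DIMENSION CW(MXW),DW(MXW),PS(MXW),S(MXW)
C
A=0.6D0
B=1.75D0
SF=B/(1.0D0+B)
DEC=0.0D0
CW(1)=0.0D0
DW(1)=0.0D0
PS(1)=0.0D0
S(1)=0.0D0
C
DO 10 NEP=1,10
C SLOPE OF E(W)=0.5*(W-3)**2 AT THE CURRENT WEIGHT
S(1)=CW(1)-3.0D0
CALL QKPROP (1,MXW,CW,DW,PS,S,A,B,SF,DEC)
WRITE(*,*) NEP, CW(1)
10 CONTINUE
C
END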
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCccccccccccccccccccccccccccccccc
VITA
YULEI BAI
Candidate for the Degree of
Master of Science
Thesis: AN EXTENDED CASCADE CORRELATION NEURAL NETWORK
Major Field: Computer Science
Biographical:
Education: Received the Bachelor of Science in Petroleum Geology from
Northwest University, Xi'an, China, in July 1985; received the Master's degree
in Petroleum Geology from Research Institute of Petroleum Exploration
and Development, Beijing, China, in July 1989. Completed the
requirements for Master of Science at Oklahoma State University in May
2002.
Professional Experience: Employed by Research Institute of Petroleum
Exploration and Development, Beijing, China, as a Research Geologist,
August 1989 to September 1997;