The Neural Net Algorithm

Creating a neural net solution to a problem involves the following steps: defining the problem input, defining the topology of the neural net (the layers of neurons and the interneuronal connections), running recognition trials, and training the neural net.

These steps are detailed in turn below:

The Problem Input

The problem input to the neural net consists of a series of numbers. This input can be raw data from the problem domain (for example, the pixel values of an image or the samples of a sound) or data that has been preprocessed before being presented to the net.
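As an illustrative sketch (the image and the scaling are hypothetical, not from the text), such a series of input numbers might be produced like this:

```python
# Hypothetical example: encode a small grayscale image (pixel values
# 0-255) as the flat series of numbers the neural net takes as input.
image = [
    [0, 128, 255],
    [64, 192, 32],
    [255, 0, 16],
]

# Flatten row by row and scale each pixel value into the range 0..1.
problem_input = [pixel / 255.0 for row in image for pixel in row]
```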

Defining the Topology

To set up the neural net, define the architecture of each neuron, then set up the layers of neurons.

The architecture of each neuron consists of: multiple inputs, each of which is 'connected' either to the output of another neuron or to one of the problem input numbers; a synaptic strength (i.e., a weight) associated with each input; a firing threshold; and a single output.

Set up the first layer of neurons: the inputs of each neuron in layer_0 are connected to the problem input numbers.

Set up the additional layers of neurons: the inputs of each neuron in layer_i are connected to the outputs of the neurons in layer_(i-1).

Set up a total of M layers of neurons in this way. For each layer, set up the neurons in that layer as described above for layer_i.
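As a minimal sketch of the topology setup (the dictionary representation, the function names, and the random initial weights are hypothetical choices, not prescribed by the text):

```python
import random

# Each neuron holds one synaptic strength (weight) per input and a
# firing threshold; its output starts at 0.
def make_neuron(num_inputs, threshold=0.5):
    return {
        "weights": [random.uniform(0.0, 1.0) for _ in range(num_inputs)],
        "threshold": threshold,
        "output": 0,
    }

def make_net(num_problem_inputs, layer_sizes):
    """layer_sizes[i] is the number of neurons in layer i (M layers in all)."""
    net = []
    prev = num_problem_inputs          # layer 0 reads the problem input
    for size in layer_sizes:
        net.append([make_neuron(prev) for _ in range(size)])
        prev = size                    # layer i reads layer i-1's outputs
    return net

# Three layers of 4, 3, and 1 neurons over a 9-number problem input.
net = make_net(num_problem_inputs=9, layer_sizes=[4, 3, 1])
```

Note that this sketch gives every neuron one input per neuron in the layer below; as the Key Design Decisions section observes, the number of inputs can also vary from neuron to neuron.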

The Recognition Trials

How each neuron works: once the neuron is set up, it does the following for each recognition trial.

For each recognition trial, for each layer from layer_0 to layer_M, and for each neuron in each layer:

(i) Each weighted input to the neuron is computed by multiplying the output of the neuron (or the problem input number) to which this input is connected by the synaptic strength of that connection.

(ii) All of the weighted inputs to the neuron are summed.

(iii) If this sum is greater than the firing threshold of the neuron, the neuron fires and its output is 1; otherwise, its output is 0.
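A recognition trial can be sketched in a few lines (the representation is hypothetical; the tiny hand-wired two-layer net exists only for illustration):

```python
def neuron_output(neuron, inputs):
    # Sum of weighted inputs compared against the firing threshold:
    # the neuron fires (output 1) if the threshold is exceeded, else 0.
    total = sum(w * x for w, x in zip(neuron["weights"], inputs))
    return 1 if total > neuron["threshold"] else 0

def recognition_trial(net, problem_input):
    signals = problem_input
    for layer in net:                  # layer 0 through layer M, in order
        signals = [neuron_output(n, signals) for n in layer]
    return signals                     # outputs of the final layer

# Tiny two-layer example with hand-chosen weights and thresholds.
net = [
    [{"weights": [1.0, 1.0], "threshold": 0.5},
     {"weights": [-1.0, 1.0], "threshold": 0.5}],
    [{"weights": [1.0, 1.0], "threshold": 1.5}],
]
```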

To Train the Neural Net

Run repeated recognition trials on sample problems. After each trial, adjust the synaptic strengths of all the interneuronal connections so that the neural net's performance on that trial improves (the possible adjustment methods are discussed under Key Design Decisions below). Continue training until the accuracy of the neural net is satisfactory.

Key Design Decisions

In the simple schema above, the designer of this neural net algorithm needs to determine at the outset:

What the input numbers represent.

The number of layers of neurons.

The number of neurons in each layer (each layer does not necessarily need to have the same number of neurons).

The number of inputs to each neuron, in each layer. The number of inputs (i.e., interneuronal connections) can also vary from neuron to neuron, and from layer to layer.

The actual 'wiring' (i.e., the connections). For each neuron, in each layer, this consists of a list of other neurons, the outputs of which constitute the inputs to this neuron. This represents a key design area. There are a number of possible ways to do this:

(i) wire the neural net randomly; or

(ii) use an evolutionary algorithm (see next section of this Appendix) to determine an optimal wiring; or

(iii) use the system designer's best judgment in determining the wiring.
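Option (i), random wiring, can be sketched as follows (the function name and the flat indexing of sources are hypothetical conveniences):

```python
import random

def random_wiring(num_problem_inputs, layer_sizes, inputs_per_neuron):
    """For each neuron in each layer, randomly choose which outputs of
    the level below (problem inputs for layer 0, otherwise the previous
    layer's neurons) feed its inputs."""
    wiring = []
    num_sources = num_problem_inputs
    for size in layer_sizes:
        layer = []
        for _ in range(size):
            k = min(inputs_per_neuron, num_sources)
            layer.append(random.sample(range(num_sources), k))
        wiring.append(layer)
        num_sources = size
    return wiring

wiring = random_wiring(num_problem_inputs=9, layer_sizes=[4, 2],
                       inputs_per_neuron=3)
```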

The set of initial synaptic strengths (i.e., weights). There are a number of possible ways to set these:

(i) set the synaptic strengths to the same value; or

(ii) set the synaptic strengths to different random values; or

(iii) use an evolutionary algorithm to determine an optimal set of initial values; or

(iv) use the system designer's best judgment in determining the initial values.
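Options (i) and (ii) are one-liners; a sketch (function names hypothetical):

```python
import random

def uniform_strengths(num_inputs, value=0.5):
    # Option (i): every synapse starts at the same value.
    return [value] * num_inputs

def random_strengths(num_inputs, lo=-1.0, hi=1.0):
    # Option (ii): each synapse starts at a different random value.
    return [random.uniform(lo, hi) for _ in range(num_inputs)]
```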

The output of the neural net. This can be:

(i) the outputs of layer_M of neurons; or

(ii) the output of a single output neuron, whose inputs are the outputs of the neurons in layer_M; or

(iii) a function (e.g., a sum) of the outputs of the neurons in layer_M; or

(iv) another function of the neuron outputs in multiple layers.

The method for adjusting the synaptic strengths during training. There are a number of approaches:

(i) For each recognition trial, increment or decrement each synaptic strength by a (generally small) fixed amount so that the neural net's output more closely matches the correct answer. One way to do this is to try both incrementing and decrementing and see which has the more desirable effect. This can be time-consuming, so other methods exist for making local decisions on whether to increment or decrement each synaptic strength.

(ii) Other statistical methods exist for modifying the synaptic strengths after each recognition trial so that the performance of the neural net on that trial more closely matches the correct answer.
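Option (i) can be sketched on a toy one-neuron "net" (everything here is a hypothetical illustration; the graded output is borrowed from the Variations section so that a small nudge to a weight measurably changes the error):

```python
import math

DELTA = 0.1  # the small fixed adjustment amount

def run_net(weights, x):
    # Toy one-neuron net with a graded (sigmoid) output rather than
    # all-or-nothing firing, using a fixed threshold of 0.5.
    total = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-(total - 0.5)))

def error(weights, trials):
    return sum(abs(run_net(weights, x) - answer) for x, answer in trials)

def train_step(weights, trials):
    # For each synaptic strength, try both incrementing and decrementing
    # by DELTA; keep whichever change reduces the error on the trials.
    base = error(weights, trials)
    for i in range(len(weights)):
        for delta in (DELTA, -DELTA):
            candidate = list(weights)
            candidate[i] += delta
            if error(candidate, trials) < base:
                weights, base = candidate, error(candidate, trials)
                break
    return weights

# Training trials with correct answers: teach the neuron logical OR.
trials = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights = [0.0, 0.0]
for _ in range(200):
    weights = train_step(weights, trials)
```

After training, the net's output exceeds 0.5 exactly when either input is 1, matching the correct answers.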



Note that neural net training will work even if the answers to the training trials are not all correct. This allows the use of real-world training data that may have an inherent error rate. One key to the success of a neural net-based recognition system is the amount of data used for training; usually a very substantial amount is needed to obtain satisfactory results. As with human students, the amount of time a neural net spends learning its lessons is a key factor in its performance.

Variations

Many variations of the above are feasible. Some variations include:

There are different ways of determining the topology, as described above. In particular, the interneuronal wiring can be set either randomly or using an evolutionary algorithm.

There are different ways of setting the initial synaptic strengths, as described above.

The inputs to the neurons in layer_i do not necessarily need to come from the outputs of the neurons in layer_(i-1). Alternatively, the inputs to the neurons in each layer can come from any earlier layer, or from any layer at all.

There are different ways to determine the final output, as described above.

For each neuron, the method described above compares the sum of the weighted inputs to the threshold for that neuron. If the threshold is exceeded, the neuron fires and its output is 1; otherwise, its output is 0. This 'all or nothing' firing is called a nonlinearity. There are other nonlinear functions that can be used: commonly, a function is used that goes from 0 to 1 in a rapid but more gradual fashion than all-or-nothing. The outputs can also be numbers other than 0 and 1.
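The two kinds of nonlinearity can be compared side by side; a sketch (the logistic function is one common choice of graded nonlinearity, and the steepness parameter is a hypothetical knob):

```python
import math

def step(total, threshold):
    # All-or-nothing firing: output jumps from 0 to 1 at the threshold.
    return 1 if total > threshold else 0

def sigmoid(total, threshold, steepness=4.0):
    # Graded firing: goes from 0 to 1 rapidly but smoothly around the
    # threshold; larger steepness approaches the step function.
    return 1.0 / (1.0 + math.exp(-steepness * (total - threshold)))
```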

The different methods for adjusting the synaptic strengths during training, briefly described above, represent a key design decision.

The above schema describes a 'synchronous' neural net, in which each recognition trial proceeds by computing the outputs of each layer in turn, from layer_0 through layer_M. In a true parallel system, in which each neuron operates independently of the others, the neurons can operate asynchronously (i.e., independently). In the asynchronous approach, each neuron constantly scans its inputs and fires (i.e., changes its output from 0 to 1) whenever the sum of its weighted inputs exceeds its threshold (or, alternatively, according to another nonlinear output function).
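A single-threaded approximation of the asynchronous style can be sketched as follows (the representation and indexing scheme are hypothetical; each neuron repeatedly re-scans its inputs and updates its output immediately rather than in layer lockstep, and a net with feedback loops might never settle):

```python
def async_settle(neurons, inputs):
    outputs = [0] * len(neurons)
    changed = True
    while changed:                     # keep scanning until nothing changes
        changed = False
        for i, n in enumerate(neurons):
            # A source index below len(inputs) reads the problem input;
            # higher indices read another neuron's *current* output.
            total = sum(
                w * (inputs[s] if s < len(inputs) else outputs[s - len(inputs)])
                for w, s in zip(n["weights"], n["sources"])
            )
            fired = 1 if total > n["threshold"] else 0
            if fired != outputs[i]:
                outputs[i] = fired
                changed = True
    return outputs

# Two neurons: neuron 0 reads both problem inputs; neuron 1 reads
# neuron 0's output (source index 2 = first neuron after the 2 inputs).
neurons = [
    {"weights": [1.0, 1.0], "sources": [0, 1], "threshold": 0.5},
    {"weights": [1.0], "sources": [2], "threshold": 0.5},
]
```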