Backpropagation Artificial Neural Network in C++
By Chesnokov Yuriy, 20 May 2008
Download demo - 95.7 KB
Download source - 19.5 KB
Introduction
I'd like to present a console-based implementation of the backpropagation neural network C++ library I developed and used during my research in medical data classification, and in the CV library for face detection: Face Detection C++ library with Skin and Motion analysis. There are some good articles already present at The Code Project, and you may consult them for the theory.

In my code, I provide the necessary features, such as input data preprocessing in the input layer with Minmax, Zscore, Sigmoidal, and Energy normalization. These parameters are obtained from the training set, and then used for preprocessing every incoming vector for classification. The console supports random separation of the data into train, validation, and test sets before backpropagation training. Random separation allows you to obtain a representative train set by comparing performance on the validation and test parts. A validation set is useful for preventing over-fitting by estimating the performance on that set. At the end of the backpropagation session, I save both network configurations: the one with the best performance on the validation set, and the one from the last training epoch.

For performance estimation, I use the sensitivity, specificity, positive predictivity, negative predictivity, and accuracy metrics. For validation performance estimation in the case of a biased data distribution (for example, in my Face Detection article, you may find that there are a lot more non-faces than faces, a 19:1 ratio), I provide geometric mean and F-measure metrics to support that scenario.

To support a large amount of data vectors, I provide File Mapping based data loading. That allows you to map hundreds of megabytes of training data to memory in 1 msec and start your training session immediately. For relatively small amounts of data, you may use a text file format. Finally, the console implementation is easier to use: you avoid a lot of mouse clicking in GUI applications, and may automate the process with batch files for choosing the right network topology, the best performance on the validation and test sets, and so on.
Background
Have a look at the CodeProject neural network articles. I also used tutorials from the generation5 site. For biased data distribution problems, I used: Evaluation of classifiers for an uneven class distribution problem. Have a look there to understand the geometric mean and F-measure metrics.
Using the code
The help line of the console is shown below:
argv[1] t-train
argv[2] network conf file
argv[3] cls1 files [0.9]
argv[4] cls2 files [0.1]
argv[5] epochs num
argv[6] [validation class]
argv[7] [test class]
argv[8] [validation TH 0.5]
argv[9] [vld metric mse]
argv[10] [norm]: [0-no], 1-minmax, 2-zscore, 3-softmax, 4-energy
argv[11] [error tolerance cls] +- 0.05 default
argv[1] r-run
argv[2] network conf file
argv[3] cls files
argv[4] [validation TH 0.5]
argv[5] [norm]: [0-no], 1-minmax, 2-zscore, 3-softmax, 4-energy
ann1dn.exe t net.nn cls1 cls2 3000 [tst.txt][val.txt]
[TH [0.5]][val type [mse]] [norm [0]] [err [0.05]]
ann1dn.exe r net.nn testcls [TH [0.5]] [norm [0]]
metrics: [0 - mse]
1 - AC
2 - sqrt(SE*SP)
3 - sqrt(SE*PP)
4 - sqrt(SE*SP*AC)
5 - sqrt(SE*SP*PP*NP*AC)
6 - F-measure b=1
7 - F-measure b=1.5
8 - F-measure b=3
The minimal number of parameters to start a training session:
>ann1dn.exe t network.nn data1_file data2_file 1000
It will use the network.nn file as the neural network, load data from data1_file and data2_file, which represent data vectors from the positive and negative classes, and train it for 1000 epochs.
The neural network file format is described in my Face Detection article. To start with randomly initialized weights before the training session, you need to provide only the number of layers and the number of neurons per layer in that file.
For example, in the demo zip, you will find an iris.nn file:
3
4 8 1
Three layers, and 4, 8, and 1 neurons per layer
It supports two data file formats. The text one is:
vector_name class_mark
x1 x2 x3 xN
vector_name is the name of your particular vector; it should not start with a numeric element, only with a letter. class_mark is the non-zero number corresponding to the class attribute: 1, 2, etc. In the console application, I use only 1 as the positive class, with a 0.9 desired output, and 2 as the negative class, with 0.1 as the desired output. The next line contains the vector entries in integer or floating point format.
In the demo zip, the IRIS data is organized using a text format with four-dimensional entries:
virgi1 2
64 28 56 22
virgi2 2
67 31 56 24
virgi3 2
63 28 51 15
virgi4 2
69 31 51 23
virgi5 2
65 30 52 20
virgi6 2
65 30 55 18
setosa1 1
50 33 14 2
setosa2 1
46 34 14 3
setosa3 1
46 36 10 2
setosa4 1
51 33 17 5
setosa5 1
55 35 13 2
The binary floating point file format is expedient when you have a large amount of data. The data is saved in a separate file as a sequence of floating point numbers in binary format, using 4 bytes per floating point number:
file1.dat (2x3 matrix)
[x11] [x12] [x13] [x21] [x22] [x23]
And, the dimensions of the data matrix are saved in the file with the same name but with a hea extension.
file1.hea
2 3
In the previous example, the file file1.dat contains two three-dimensional vectors.
In that case, your data1_file or data2_file may contain entries with the full path to the files and the class marks:
fullpath\file1.dat 1
fullpath\file2.dat 1
The remaining parameters to the console application for backprop training are optional. You may use them for validation and testing of your network, for input data normalization, and for setting error limits during the training process.
>ann1dn t network.nn data1_file data2_file 1000 vld_file tst_file 0.5 2 2
This command line demonstrates that you use your validation and test sets in the vld_file and tst_file files, in text or binary format as described above, with a validation threshold of 0.5 (that is, a network output greater than 0.5 attributes a data vector to the positive class), with the geometric mean of sensitivity and specificity as the performance metric for validation stopping, and with Zscore normalization. The allowed validation metrics are specified at the end of the console help line. If your vld_file or tst_file are empty files, then the corresponding data set will be composed of randomly selected entries from your training set, with 25% of the records from each class.
The last, eleventh argument to the console is the error tolerance. If the difference between the desired output and the mean network output for the positive and negative classes is less than the error for 10 consecutive epochs, the training stops. During backpropagation training, if the difference between the desired output and the network output for a particular vector is less than the error you specify, the network weights are not adjusted. This allows correcting the network weight connections only for the data vectors which are not yet 'memorized'.
The next example demonstrates a sample training session for the IRIS data:
>ann1dn t iris.nn setosa_versi.dat virgi.dat 200 void void 0.5 2 3
loading data
cls1: 100 cls2: 50 files loaded size: 4 samples
validation size: 25 12
validation size: 26 13
normalizing minmax
training
epoch: 1 out: 0.555723 0.478843 max acur: 0.92 (epoch 1) se:84.00 sp:100.00 ac:89.19
epoch: 2 out: 0.582674 0.400396 max acur: 0.92 (epoch 1) se:84.00 sp:100.00 ac:89.19
epoch: 3 out: 0.626480 0.359573 max acur: 0.92 (epoch 3) se:84.00 sp:100.00 ac:89.19
epoch: 4 out: 0.655483 0.326918 max acur: 0.94 (epoch 4) se:96.00 sp:91.67 ac:94.59
epoch: 5 out: 0.699125 0.323879 max acur: 0.94 (epoch 5) se:88.00 sp:100.00 ac:91.89
epoch: 6 out: 0.715539 0.299085 max acur: 0.94 (epoch 6) se:88.00 sp:100.00 ac:91.89
epoch: 7 out: 0.733927 0.292526 max acur: 0.96 (epoch 7) se:92.00 sp:100.00 ac:94.59
epoch: 8 out: 0.750638 0.278721 max acur: 0.98 (epoch 8) se:96.00 sp:100.00 ac:97.30
epoch: 9 out: 0.774599 0.277550 max acur: 0.98 (epoch 8) se:96.00 sp:100.00 ac:97.30
epoch: 10 out: 0.774196 0.256110 max acur: 0.98 (epoch 8) se:96.00 sp:100.00 ac:97.30
epoch: 11 out: 0.793877 0.260753 max acur: 0.98 (epoch 8) se:96.00 sp:100.00 ac:97.30
epoch: 12 out: 0.806802 0.245758 max acur: 0.98 (epoch 8) se:96.00 sp:100.00 ac:97.30
epoch: 13 out: 0.804381 0.228810 max acur: 0.98 (epoch 13) se:96.00 sp:100.00 ac:97.30
epoch: 14 out: 0.814079 0.218740 max acur: 0.98 (epoch 13) se:96.00 sp:100.00 ac:97.30
epoch: 15 out: 0.827635 0.223827 max acur: 0.98 (epoch 13) se:96.00 sp:100.00 ac:97.30
epoch: 16 out: 0.832102 0.210360 max acur: 0.98 (epoch 13) se:96.00 sp:100.00 ac:97.30
epoch: 17 out: 0.840352 0.213165 max acur: 0.98 (epoch 17) se:96.00 sp:100.00 ac:97.30
epoch: 18 out: 0.848957 0.201766 max acur: 0.98 (epoch 18) se:96.00 sp:100.00 ac:97.30
epoch: 19 out: 0.844319 0.188338 max acur: 0.98 (epoch 19) se:96.00 sp:100.00 ac:97.30
epoch: 20 out: 0.856258 0.184954 max acur: 0.98 (epoch 19) se:96.00 sp:100.00 ac:97.30
epoch: 21 out: 0.853244 0.178349 max acur: 0.98 (epoch 19) se:96.00 sp:100.00 ac:97.30
epoch: 22 out: 0.867145 0.185852 max acur: 0.98 (epoch 22) se:96.00 sp:100.00 ac:97.30
epoch: 23 out: 0.863079 0.171684 max acur: 0.98 (epoch 23) se:96.00 sp:100.00 ac:97.30
epoch: 24 out: 0.870108 0.170253 max acur: 0.98 (epoch 24) se:96.00 sp:100.00 ac:97.30
epoch: 25 out: 0.873538 0.164185 max acur: 0.98 (epoch 25) se:96.00 sp:100.00 ac:97.30
epoch: 26 out: 0.871584 0.150496 max acur: 1.00 (epoch 26) se:100.00 sp:100.00 ac:100.00
epoch: 27 out: 0.879310 0.161155 max acur: 1.00 (epoch 26) se:100.00 sp:100.00 ac:100.00
epoch: 28 out: 0.879986 0.154784 max acur: 1.00 (epoch 26) se:100.00 sp:100.00 ac:100.00
epoch: 29 out: 0.880308 0.139083 max acur: 1.00 (epoch 26) se:100.00 sp:100.00 ac:100.00
epoch: 30 out: 0.890360 0.149518 max acur: 1.00 (epoch 26) se:100.00 sp:100.00 ac:100.00
epoch: 31 out: 0.888561 0.145144 max acur: 1.00 (epoch 26) se:100.00 sp:100.00 ac:100.00
epoch: 32 out: 0.880072 0.129197 max acur: 1.00 (epoch 32) se:100.00 sp:100.00 ac:100.00
epoch: 33 out: 0.896553 0.139937 max acur: 1.00 (epoch 32) se:100.00 sp:100.00 ac:100.00
epoch: 34 out: 0.893467 0.137607 max acur: 1.00 (epoch 32) se:100.00 sp:100.00 ac:100.00
epoch: 35 out: 0.893400 0.125793 max acur: 1.00 (epoch 32) se:100.00 sp:100.00 ac:100.00
epoch: 36 out: 0.905036 0.139306 max acur: 1.00 (epoch 32) se:100.00 sp:100.00 ac:100.00
epoch: 37 out: 0.900872 0.118167 max acur: 1.00 (epoch 32) se:100.00 sp:100.00 ac:100.00
epoch: 38 out: 0.909384 0.134014 max acur: 1.00 (epoch 32) se:100.00 sp:100.00 ac:100.00
training done.
training time: 00:00:00:031
classification results: maxacur.nn
train set: 49 25
sensitivity: 100.00
specificity: 100.00
+predictive: 100.00
-predictive: 100.00
accuracy: 100.00
validation set: 25 12
sensitivity: 100.00
specificity: 100.00
+predictive: 100.00
-predictive: 100.00
accuracy: 100.00
test set: 26 13
sensitivity: 88.46
specificity: 92.31
+predictive: 95.83
-predictive: 80.00
accuracy: 89.74
classification results: iris.nn
train set: 49 25
sensitivity: 97.96
specificity: 100.00
+predictive: 100.00
-predictive: 96.15
accuracy: 98.65
validation set: 25 12
sensitivity: 96.00
specificity: 100.00
+predictive: 100.00
-predictive: 92.31
accuracy: 97.30
test set: 26 13
sensitivity: 88.46
specificity: 100.00
+predictive: 100.00
-predictive: 81.25
accuracy: 92.31
The network configuration corresponding to the best performance on the validation set is saved to maxacur.nn, and the last epoch configuration is saved to iris.nn. At the end, you may compare the results for both.
To use your trained networks for testing, just run the console with these parameters:
>ann1dn r iris.nn test_data
The test_data file is in text or binary format. As the fourth argument, you may provide the threshold (default is 0.5); for example, by varying it between 0.0 and 1.0, you can obtain the ROC curve on your test set.
Neural Network Classes
The neural network is composed of the following classes:
ANNetwork
ANNLayer
ANeuron
ANLink
The ANNetwork class contains the implementation of the neural network for users of the library. To avoid protected interface programming for the rest of the classes, I used friends. I'll describe the library structure first, and then provide the functions you need to use from the ANNetwork class to build your own implementations.
The ANNetwork contains an array of ANNLayer layers. Every layer contains an array of ANeuron neuron objects, and every neuron contains arrays of ANLink input and output connections. With that design, you may arrange any desired network structure; however, in my implementation, I provide only a feed-forward, fully connected structure.
The basic unit of the neural network is the neuron class, ANeuron. You may add a bias or an input connection to it, represented as an ANLink object:
void ANeuron::add_bias()
void ANeuron::add_input(ANeuron *poutn)
The bias connection always takes 1.0f as an input value. With add_input(), you add a connection to the neuron, supplying as its argument the neuron from the previous layer to which it connects:
void ANeuron::add_input(ANeuron *poutn)
{
//poutn - Neuron from previous layer
ANLink *plnk = new ANLink( this , poutn);
inputs.push_back(plnk);
if (poutn)
poutn->outputs.push_back(plnk);
}
So, every neuron 'knows' which neurons from the next layer connect to its output. The ANLink is like an 'arrow', pointing from the neuron in the previous layer, ANLink::poutput_neuron, to the neuron in the next layer, ANLink::pinput_neuron.
I organize a fully connected neural network structure in this way:
void ANNetwork::init_links(const float *avec, const float *mvec, int ifunc, int hfunc)
{
    ANNLayer *plr;      //current layer
    ANNLayer *pprevlr;  //previous layer
    ANeuron *pnrn;      //neuron pointer
    int l = 0;

    /////////////////////////input layer/////////////////////////////
    plr = layers[l++];
    swprintf(plr->layer_name, L"input layer");
    for (int n = 0; n < plr->get_neurons_number(); n++) {
        pnrn = plr->neurons[n];
        pnrn->function = ifunc;
        pnrn->add_input();
        //one input link for every "input layer" neuron
        if (avec)
            pnrn->inputs[0]->iadd = avec[n];
        if (mvec)
            pnrn->inputs[0]->w = mvec[n];
        else
            pnrn->inputs[0]->w = 1.0f;
    }

    ////////////////////////hidden layers (1 bias each)//////////////
    for (int i = 0; i < m_layers_number - 2; i++) {   //1 input, [L-2] hidden, 1 output
        pprevlr = plr;
        plr = layers[l++];
        swprintf(plr->layer_name, L"hidden layer %d", i + 1);
        for (int n = 0; n < plr->get_neurons_number(); n++) {
            pnrn = plr->neurons[n];
            pnrn->function = hfunc;
            pnrn->add_bias();
            for (int m = 0; m < pprevlr->get_neurons_number(); m++)
                pnrn->add_input(pprevlr->neurons[m]);
        }
    }

    ////////////////////////output layer (1 bias)////////////////////
    pprevlr = plr;
    plr = layers[l++];
    swprintf(plr->layer_name, L"output layer");
    for (int n = 0; n < plr->get_neurons_number(); n++) {
        pnrn = plr->neurons[n];
        pnrn->function = hfunc;
        pnrn->add_bias();
        for (int m = 0; m < pprevlr->get_neurons_number(); m++)
            pnrn->add_input(pprevlr->neurons[m]);
    }
}
The ANeuron functions that induce the neuron to 'fire', that is, to take the data from its inputs and process it to its outputs, are:
void ANeuron::input_fire()
void ANeuron::fire()
The first one is used for input layer neurons only; its ANLink contains an additional term, iadd, used in normalization. The second one is used for hidden and output layer neurons:
inline void ANeuron::input_fire()
{
    //input layer normalization
    oval = (inputs[0]->ival + inputs[0]->iadd) * inputs[0]->w;
    //single input for input layer neuron
    switch (function) {
    default:
    case LINEAR:
        break;
    case SIGMOID:
        oval = 1.0f / (1.0f + exp(float((-1.0f) * oval)));
        break;
    }
    //transfer my output to links connected to my output
    for (int i = 0; i < get_output_links_number(); i++)
        outputs[i]->ival = oval;
}

inline void ANeuron::fire()
{
    //oval = SUM(in[]*w[])
    oval = 0.0f;
    //compute output for neuron
    for (int i = 0; i < get_input_links_number(); i++)
        oval += inputs[i]->ival * inputs[i]->w;
    switch (function) {
    default:
    case LINEAR:
        break;
    case SIGMOID:
        oval = 1.0f / (1.0f + exp(float((-1.0f) * oval)));
        break;
    }
    //transfer my output to links connected to my output
    for (int i = 0; i < get_output_links_number(); i++)
        outputs[i]->ival = oval;
}
Now you have some idea of the library internals; further on, I'll describe the ANNetwork class, which you can use for your own implementations.
You may load the neural network from a file, or arrange its structure by specifying the number of layers and the neurons per layer:
ANNetwork::ANNetwork(const wchar_t *fname);
ANNetwork::ANNetwork(int layers_number, int *neurons_per_layer);
int neurons_per_layer[ 4 ] = { 128 , 64 , 32 , 10 };
ANNetwork *ann = new ANNetwork( 4 , neurons_per_layer);
ann->init_links(); //feed-forward full connectionist structure
ann->randomize_weights();
If you want a custom neural network configuration with recurrent or any other connections, you have to provide your own functions.
The ANNetwork::status() function returns the class status after construction. Negative values indicate an error, 0 means the network was loaded from the file successfully, and 1 means the network was initialized with random weights.
To train, classify, and save your network, the following functions are provided:
bool ANNetwork::train(const float *ivec, float *ovec, const float *dsrdvec, float error = 0.05);
void ANNetwork::classify(const float *ivec, float *ovec);
bool ANNetwork::save(const wchar_t *fname) const;
ivec and ovec represent the input vector fed to the neural network and the output vector where it stores its results. Their dimensions should match the number of input and output neurons in the network structure. dsrdvec is the desired output vector, to which the network adjusts its connections until it matches within the error tolerance. The ANNetwork::train() function returns true if backpropagation took place, or false if the network output was already within error of the desired vector.
The backpropagation function uses this code:
bool ANNetwork::train(const float *ivec, float *ovec, const float *dsrdvec, float error)
// 0.0 - 1.0 learning
{
    float dst = 0.0f;
    //run network, computation of inputs to output
    classify(ivec, ovec);
    for (int n = 0; n < layers[m_layers_number - 1]->get_neurons_number(); n++) {
        dst = fabs(ovec[n] - dsrdvec[n]);
        if (dst > error) break;
    }
    if (dst > error) {
        backprop_run(dsrdvec);  //it was trained
        return true;
    } else                      //it wasn't trained
        return false;
}
void ANNetwork::backprop_run(const float *dsrdvec)
{
    float nrule = m_nrule;  //learning rule
    float alpha = m_alpha;  //momentum
    float delta, dw, oval;

    //get deltas for "output layer"
    for (int n = 0; n < layers[m_layers_number - 1]->get_neurons_number(); n++) {
        oval = layers[m_layers_number - 1]->neurons[n]->oval;
        layers[m_layers_number - 1]->neurons[n]->delta =
            oval * (1.0f - oval) * (dsrdvec[n] - oval);
    }

    //get deltas for hidden layers
    for (int l = m_layers_number - 2; l > 0; l--) {
        for (int n = 0; n < layers[l]->get_neurons_number(); n++) {
            delta = 0.0f;
            for (int i = 0; i < layers[l]->neurons[n]->get_output_links_number(); i++)
                delta += layers[l]->neurons[n]->outputs[i]->w *
                         layers[l]->neurons[n]->outputs[i]->pinput_neuron->delta;
            oval = layers[l]->neurons[n]->oval;
            layers[l]->neurons[n]->delta = oval * (1.0f - oval) * delta;
        }
    }

    ////////correct weights for every layer///////////////////////////
    for (int l = 1; l < m_layers_number; l++) {
        for (int n = 0; n < layers[l]->get_neurons_number(); n++) {
            for (int i = 0; i < layers[l]->neurons[n]->get_input_links_number(); i++) {
                //dw = rule*Xin*delta + moment*dWprv
                dw = nrule * layers[l]->neurons[n]->inputs[i]->ival *
                     layers[l]->neurons[n]->delta;
                dw += alpha * layers[l]->neurons[n]->inputs[i]->dwprv;
                layers[l]->neurons[n]->inputs[i]->dwprv = dw;
                //correct weight
                layers[l]->neurons[n]->inputs[i]->w += dw;
            }
        }
    }
}
Now, you may compose your own networks and proceed to typical classification tasks used in OCR, computer vision, and so on.
License
This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3).

About the Author

Chesnokov Yuriy
Engineer
Russian Federation
Article Copyright 2007 by Chesnokov Yuriy
Former Cambridge University postdoc (http://www-ucc-old.ch.cam.ac.uk/research/yc274-research.html), Department of Chemistry, Unilever Centre for Molecular Informatics, where I worked on the problem of complexity analysis of cardiac data.
As a subsidiary result, we achieved 1st place in the annual PhysioNet/Computers in Cardiology Challenge 2006: QT Interval Measurement (http://physionet.org/challenge/2006/).
My research interests are: digital signal processing in medicine, image and video processing, pattern recognition, AI, and computer vision.
My recent publications are:

Complexity and spectral analysis of the heart rate variability dynamics for distant prediction of paroxysmal atrial fibrillation with artificial intelligence methods. Artificial Intelligence in Medicine, 2008, V43/2, pp. 151-165 (http://dx.doi.org/10.1016/j.artmed.2008.03.009)

Face Detection C++ Library with Skin and Motion Analysis. Biometrics AIA 2007 TTS, 22 November 2007, Moscow, Russia (http://www.dancom.ru/rus/AIA/2007TTS/ProgramAIA2007TTS.html)

Screening Patients with Paroxysmal Atrial Fibrillation (PAF) from Non-PAF Heart Rhythm Using HRV Data Analysis. Computers in Cardiology 2007, V 34, pp. 459-463 (http://www.cinc.org/archives/2007/pdf/0459.pdf)

Distant Prediction of Paroxysmal Atrial Fibrillation Using HRV Data Analysis. Computers in Cardiology 2007, V 34, pp. 455-459 (http://www.cinc.org/archives/2007/pdf/0455.pdf)

Individually Adaptable Automatic QT Detector. Computers in Cardiology 2006, V 33, pp. 337-341 (http://www.cinc.org/archives/2006/pdf/0337.pdf)