Large pattern recognition system using multi neural networks - CodeProject

By Vietdungiitb, 31 May 2012

Downloads:
- Download drawing samples - 52 KB
- Download handwriting_recognition_using_multi_neural_networks.flv - 8.9 MB
- Download source - 1.3 MB
- Download demo - 146.6 KB
- Download lower_case_letter_v2.zip - 5.6 MB
- Download digit_v2.zip - 3.3 MB
- Download capital_letter_v2.zip - 5.6 MB

Introduction

Artificial neural networks are now widely applied in many fields. However, building an efficient network for a large classifier such as a handwriting recognition system is still a big challenge. In my last article, "Library for online handwriting recognition system using UNIPEN database", I presented a library for handwriting recognition systems that makes it simple to create and change a neural network. The demo program showed good recognition results on the digit set (97%) and the alphabet sets (93%). In this article I continue with a solution for large pattern classification in general and handwriting recognition in particular.

The recognition rate increases significantly when an additional spell checker module is used.

Neural network for a recognition system

In the traditional model of pattern recognition, a hand-designed feature extractor gathers relevant information from the input and eliminates irrelevant variability. A trainable classifier (normally a standard, fully connected multi-layer neural network) then categorizes the resulting feature vectors into classes. However, this approach has some problems that affect the recognition results. The convolutional neural network (CNN) addresses these shortcomings of the traditional model and achieves very high performance on pattern recognition tasks. A CNN is a special form of multi-layer neural network. Like other networks, CNNs are trained with the back-propagation algorithm.
The difference lies in their architecture. The convolutional network combines three architectural ideas to ensure some degree of shift, scale, and distortion invariance: local receptive fields, shared weights (or weight replication), and spatial or temporal sub-sampling. CNNs are designed specifically to recognize patterns directly from digital images with a minimum of pre-processing. The architecture of CNNs is described comprehensively in the articles of Dr. Yann LeCun and Dr. Patrice Simard (see my previous articles).

Figure 1: The architecture of LeNet-5

Figure 2: An input image followed by a feature map performing a 5 × 5 convolution and a 2 × 2 sub-sampling map

The recognition rates of the above networks are very high for small pattern collections such as digits, capital letters, or lower-case letters. However, when we want to create a larger neural network that can recognize a bigger collection, for example digits plus English letters (62 characters), problems begin to appear. Finding an optimized and sufficiently large network becomes more difficult, and training the network on a large pattern set takes much longer. The network converges more slowly and, in particular, the accuracy decreases significantly because there are more badly written characters, many similar and easily confused characters, and so on. Furthermore, even if we could create a network good enough to recognize English characters accurately, it certainly could not properly recognize a character outside its output set (a Russian or Chinese character, for example), because it has no capacity for expansion. Therefore, creating a single network for a very large pattern classifier is very difficult and may be impossible.

The proposed solution to these problems is to replace the single big network with multiple smaller networks, each of which has a very high recognition rate on its own output set.
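
A minimal sketch of this arrangement in C#, assuming each trained component network is reduced to a function that returns either a character from its own output set or a '?' marker for a pattern it does not recognize (the unknown output is described just below). The names Recognizer and NetworkCascade are hypothetical and are not types from the article's library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a trained component network: it returns a
// character from its own output set, or NetworkCascade.Unknown when the
// pattern is not one of its outputs.
public delegate char Recognizer(double[] pattern);

public static class NetworkCascade
{
    public const char Unknown = '?';

    // Try each network in order and return the first answer that is not
    // Unknown; if every network rejects the pattern, return Unknown.
    public static char Recognize(IEnumerable<Recognizer> networks, double[] pattern)
    {
        foreach (Recognizer network in networks)
        {
            char result = network(pattern);
            if (result != Unknown)
            {
                return result;
            }
        }
        return Unknown;
    }
}
```

With this shape, supporting a new character set amounts to appending one more recognizer to the list; the existing networks are left untouched.
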
Besides the official output set (digits, letters, ...), each of these networks has an additional "unknown" output (unknown character). If the input pattern is not recognized as one of a network's official outputs, it is classified as an unknown character and passed on to the next network, until the system recognizes it.

Figure 3: Convolutional neural network with unknown output

Figure 4: Recognition system using multi neural networks

This solution overcomes most of the limits of the traditional model. The new system consists of several small networks, each of which is simple to optimize for the best recognition results. Training these small networks takes less time than training one huge network. Above all, the new model is flexible and expandable. Depending on the requirement we can load one or more networks; we can also add new networks to the system to recognize new patterns without changing or rebuilding the model. All of these small networks can be reused in other multi neural network systems.

Experiment

The demo program is built to show all stages of a recognition system: creating a component network, training a network, testing networks on the UNIPEN dataset, and testing networks on a mouse drawing control. It serves as a tutorial that can help anyone understand a recognition system. All functions are available from the program GUI, so you can create, train, and test your network at runtime without changing any code or restarting the program.

Figure 5: Handwriting recognition system interface

Creating new neural network

Figure 6: Creating new neural network interface

Creating a new neural network is done entirely through the GUI. The network depends on the input pattern size, the number of layers, the data set, and so on. On the output layer we can tick the unknown output checkbox to add an unknown output to the network, or leave it unticked to create a normal network.
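
The sizes chosen when creating a network are not independent of each other. In the CreateNetwork code of this section, a 29 × 29 input feeds 13 × 13 feature maps and then 5 × 5 feature maps, and the last constructor argument of each convolution layer is 5. Those numbers are consistent with a 5 × 5 kernel that moves 2 pixels at a time (convolution and sub-sampling merged into one layer); both the kernel reading and the stride of 2 are assumptions here, not something the article states. A small helper with hypothetical names for checking such sizes:

```csharp
using System;

public static class FeatureMapSize
{
    // Output width (or height) of a convolution without padding: the number
    // of positions where a kernel of the given size fits inside the input
    // when it moves by stride pixels at a time.
    public static int Compute(int inputSize, int kernelSize, int stride)
    {
        if (inputSize < kernelSize || stride <= 0)
        {
            throw new ArgumentException("kernel must fit in the input and stride must be positive");
        }
        return (inputSize - kernelSize) / stride + 1;
    }
}
```

Under these assumptions, FeatureMapSize.Compute(29, 5, 2) gives 13 and FeatureMapSize.Compute(13, 5, 2) gives 5, matching the two feature map sizes in the code.
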
Of course, we can also create a network in code:

    void CreateNetwork()
    {
        network = new ConvolutionNetwork();
        network.Layers = new Layer[6];
        network.LayerCount = 6;

        // layer 0: input layer for 29 x 29 patterns
        InputLayer inputlayer = new InputLayer("00-Layer Input", new Size(29, 29));
        network.InputDesignedPatternSize = new Size(29, 29);
        inputlayer.Initialize();
        network.Layers[0] = inputlayer;

        // layer 1: convolution/sub-sampling layer with 13 x 13 feature maps
        ConvolutionLayer convlayer = new ConvolutionLayer("01-Layer ConvolutionalSubsampling", inputlayer, new Size(13, 13), 10, 5);
        convlayer.Initialize();
        network.Layers[1] = convlayer;

        // layer 2: convolution/sub-sampling layer with 5 x 5 feature maps
        convlayer = new ConvolutionLayer("02-Layer ConvolutionalSubsampling", convlayer, new Size(5, 5), 60, 5);
        convlayer.Initialize();
        network.Layers[2] = convlayer;

        // layers 3 and 4: fully connected layers (200 and 100)
        FullConnectedLayer fulllayer = new FullConnectedLayer("03-Layer FullConnected", convlayer, 200);
        fulllayer.Initialize();
        network.Layers[3] = fulllayer;
        fulllayer = new FullConnectedLayer("04-Layer FullConnected", fulllayer, 100);
        fulllayer.Initialize();
        network.Layers[4] = fulllayer;

        // layer 5: output layer sized by the Letters3 character set
        OutputLayer outputlayer = new OutputLayer("05-Layer Output", fulllayer, Letters3.Count, true);
        outputlayer.Initialize();
        network.Layers[5] = outputlayer;

        // target character set and the character used for the unknown output
        network.TagetOutputs = Letters3;
        network.UnknownOuput = '?';
    }

Training a network

After a neural network has been created with the "Create network" function, it is trained on the UNIPEN database.

Figure 7: Training network interface

Depending on the network size, we can choose 1a, 1b or 1c in the UNIPENdata folder as the training set. The training statistics show useful information such as the epoch number, MSE, training time per epoch, success rate, and so on.

UNIPEN data browser and recognition testing

The UNIPEN data browser control in the demo program can show all the UNIPEN data files. We can also test a trained neural network on these files by loading the trained network parameter files.
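
The article does not say how the demo program computes the MSE and success rate it reports during training and testing. The sketch below uses common definitions as an assumption: MSE averaged over every output of every pattern, and success rate as the fraction of patterns whose strongest output is the expected class. The class name EpochStatistics is hypothetical.

```csharp
using System;

public static class EpochStatistics
{
    // Mean squared error between network outputs and target vectors,
    // averaged over every output of every pattern.
    public static double MeanSquaredError(double[][] outputs, double[][] targets)
    {
        double sum = 0.0;
        int count = 0;
        for (int i = 0; i < outputs.Length; i++)
        {
            for (int j = 0; j < outputs[i].Length; j++)
            {
                double diff = outputs[i][j] - targets[i][j];
                sum += diff * diff;
                count++;
            }
        }
        return count == 0 ? 0.0 : sum / count;
    }

    // Fraction of patterns whose strongest output is the expected class.
    public static double SuccessRate(double[][] outputs, int[] expectedClass)
    {
        if (outputs.Length == 0)
        {
            return 0.0;
        }
        int correct = 0;
        for (int i = 0; i < outputs.Length; i++)
        {
            if (ArgMax(outputs[i]) == expectedClass[i])
            {
                correct++;
            }
        }
        return (double)correct / outputs.Length;
    }

    // Index of the largest value; the first one wins on ties.
    private static int ArgMax(double[] values)
    {
        int best = 0;
        for (int j = 1; j < values.Length; j++)
        {
            if (values[j] > values[best])
            {
                best = j;
            }
        }
        return best;
    }
}
```

The same two functions can be applied to a test file as well as to a training epoch, which makes the figures comparable between the two.
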
Figure 8: UNIPEN data browser and recognition interface

Mouse Drawing test

Figure 9: Mouse drawing recognition interface

The mouse drawing control is based on the excellent article "DrawTools" by Alex Fr. I only changed some code to fit my requirements. The cursive text in the image is divided into lines, words and isolated characters by the same algorithm, as follows:

    private void btRecognition_Click(object sender, EventArgs e)
    {
        // Recognize all characters in the drawArea.
        if (bitmap != null)
        {
            bitmap.Dispose();
        }
        bitmap = new Bitmap(drawArea.Width, drawArea.Height);
        drawArea.DrawToBitmap(bitmap, new Rectangle(0, 0, bitmap.Width, bitmap.Height));
        drawBitmap = (Bitmap)bitmap.Clone();

        lbRecognizedText.Items.Clear();
        // Start from an empty list so the count below is valid even when no line is found.
        characterList = new List<InputPattern>();

        // Split the whole bitmap into lines, each line into words, each word into characters.
        InputPattern parentPt = new InputPattern(bitmap, 255, new Rectangle(0, 0, bitmap.Width, bitmap.Height));
        List<InputPattern> lineList = GetPatternsFromBitmap(parentPt, 500, 1, true, 10, 10);
        if (lineList != null)
        {
            foreach (var line in lineList)
            {
                String text = "";
                List<InputPattern> wordList = GetPatternsFromBitmap(line, 50, 10, false, 10, 10);
                if (wordList != null)
                {
                    foreach (var word in wordList)
                    {
                        List<InputPattern> charList = GetPatternsFromBitmap(word, 5, 5, false, 10, 10);
                        if (charList != null && charList.Count > 0)
                        {
                            panelNavigation.Visible = true;
                            foreach (var c in charList)
                            {
                                characterList.Add(c);
                                c.GetPatternBoundaries(5, 5, false, 10, 10);
                                Char accChar;
                                PatternRecognition(c.OriginalBmp, out accChar);
                                // '\0' is treated as "nothing recognized" and skipped.
                                if (accChar != '\0')
                                {
                                    text = String.Format("{0}{1}", text, accChar);
                                    drawBitmap = c.DrawChildPatternBoundaries(drawBitmap);
                                }
                            }
                        }
                        // Separate words with a space.
                        text = String.Format("{0} ", text);
                    }
                }
                lbRecognizedText.Items.Add(text);
            }
        }

        pbPreview.Image = drawBitmap;
        lblNavigation.Text = characterList.Count.ToString();
        index = 0;
    }

Figure 10: Loading trained network parameters files

To activate the recognition function, I simply load the trained network parameter files. Depending on my recognition requirements I can load one, two or all files. The recognition results are very good (higher than 90%) if I load only one network to recognize its own output characters. However, when I load multiple networks, the system's accuracy becomes lower. The main reasons are the many confusable characters in cursive text, training sets that are not large enough, and so on.

For a large pattern collection like handwritten characters, there are many similar characters that can confuse not only machines but also humans in some cases, such as O, 0 and o, or 9, 4, g and q. These characters can make the networks misrecognize. The solution has therefore been upgraded with an additional spellchecker/voting module at the output of the system, which increases the recognition rate significantly. The input pattern is recognized by all component networks. Their outputs (except unknown outputs) then become the inputs of the spellchecker/voting module. The module relies on previously recognized characters, an internal dictionary and other factors to decide which candidate is the most likely character.

Figure 11: The new recognition system using spell checker/voting module (internal dictionary)

The spellchecker module makes the system recognize much better.

Conclusion

The proposed recognition model has solved most of the problems of a large recognition system: the capacity to recognize a large pattern collection, flexible design... the system can also be made easier by increasing the recognition rate of the component networks, using the spell checker/voting module, etc. The demo program also proved the capacity of the library, which could be used in many other applications such as prediction applications, face recognition...

Future work and upgrade

Some features will be updated in the library:
- Convolution and sampling layer of the LeNet model
- Spell checker / voting module
- Character segmentation

At the moment, the project takes too much of my free time. It will slow down or stop temporarily until I can rearrange everything and/or find a good new sponsor. However, the vote/comment ... especially to the model, spell checker module and character segmentation algorithm.

History

version 1.0: initial code
version 1.1: the spell checker/voting module has been added to the system, which increases the recognition rate significantly. It really surprised me and made me happy. I will publish it when I complete the code rearrangement.

License

This article, along with any associated source code and files, ...

Last updated 1 Jun 2012. Article copyright 2012 by Vietdungiitb.
Original article: http://www.codeproject.com/Articles/376798/Large-pattern-recognition-system-using-multi-neura
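
The spellchecker/voting module itself is not published with this version of the article, so the following is only a rough illustration of how previously recognized characters and an internal dictionary could settle a choice between confusable candidates such as 'O', '0' and 'o'. The class CandidateVoting, its prefix rule and the fallback to the first candidate are all assumptions made for this sketch, not the author's algorithm.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CandidateVoting
{
    // Pick one character from the candidates proposed by the component
    // networks (unknown outputs already removed). A candidate is preferred
    // when the text recognized so far in the current word, followed by that
    // candidate, begins a dictionary word. If no candidate fits, the first
    // candidate in the list is kept.
    public static char Choose(string wordSoFar, IList<char> candidates, ICollection<string> dictionary)
    {
        if (candidates.Count == 0)
        {
            return '?'; // every network reported "unknown"
        }
        foreach (char candidate in candidates)
        {
            string prefix = wordSoFar + candidate;
            if (dictionary.Any(w => w.StartsWith(prefix, StringComparison.Ordinal)))
            {
                return candidate;
            }
        }
        return candidates[0];
    }
}
```

For example, with a dictionary containing "dog" and the text "d" already recognized, the candidates '0', 'o' and 'O' resolve to 'o', because "do" begins a dictionary word while "d0" does not.
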

