REPORT: "HANDWRITTEN NUMBER RECOGNITION AND ITS APPLICATION AT DANANG UNIVERSITY OF TECHNOLOGY"

Proceedings of the 8th Student Scientific Research Conference, University of Da Nang, 2012

HANDWRITTEN NUMBER RECOGNITION AND ITS APPLICATION AT DANANG UNIVERSITY OF TECHNOLOGY

Authors: Duong Thi Kim Cuc, Dinh Quang Huy, Tran Hoang An, Nguyen Van Trong
Da Nang University of Technology, Center of Excellence, ECE08
Advisors: Hoang Le Uyen Thuc, M.S., Pham Van Tuan, Ph.D.
Da Nang University of Technology, Department of Electronics and Telecommunications

ABSTRACT

This paper presents the results of handwritten digit recognition on well-known image databases using state-of-the-art feature extraction and classification techniques. The test data come from MNIST [1] and from samples of digits handwritten by teachers at Da Nang University of Technology. For feature extraction, two features are chosen: Hu's seven moments and image averaging (reducing each image to one with fewer pixels for easier comparison). These features are paired with two classifiers: a Neural Network classifier and Euclidean Distance. So far, with the matching dictionary collected at Da Nang University of Technology, the combination of the image averaging feature and the Euclidean Distance gives the best accuracy (more than 93%), and it can be improved further with a more comprehensive database.

1. Introduction

One of the most troublesome and tedious tasks that teachers at Da Nang University of Technology face is entering exam grades into computers manually. This project aims to spare them from copying the grades by hand by presenting a method for importing grades into computers automatically. The technique employs a well-known procedure in pattern recognition called OCR (optical character recognition). The performance of character recognition largely depends on the feature extraction approach and the classification/learning scheme.
For feature extraction in character recognition, various approaches have been proposed. Hu's seven moments have been extensively employed as invariant global features of images in pattern recognition. Averaging is a rather simple process that represents a square of pixels by a single pixel, so that an image is expressed by a smaller image. An artificial neural network (ANN) consists of an interconnected group of artificial neurons and processes information using a connectionist approach to computation. When the network structure is appropriately designed and the training sample size is large, neural networks can give high classification accuracy on unseen test data. OCR by template matching recognizes a character by comparing its image with stored reference images. We implement a template-matching technique that minimizes the Euclidean Distance between the pattern to be recognized and the sample patterns provided by the dictionary.

2. Main process

2.1. Proposed System

The scanned image is first preprocessed to give the normalized image. To ease the classification process, the normalized image is represented by a set of features for comparison. Finally, the result is converted from the JPEG input to an xls file.

Figure 1. Proposed System Overview.

2.2. Preprocessing

There are four main steps in this stage. Firstly, the scanned RGB image is converted to a gray-scale image using the formula Intensity = 0.2989*red + 0.5870*green + 0.1140*blue [2].

Secondly, the image is thresholded to obtain a binary image. The thresholding level, which is chosen to be 70 in this case, depends on the quality of the scanned image and the background noise. The output binary image has a value of 1 (white) for all pixels in the input image with luminance greater than 70 and 0 (black) for all other pixels.
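The first two preprocessing steps can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `to_gray` and `to_binary` and the tiny 2x2 test image are hypothetical, and images are assumed to be plain nested lists of 0-255 values rather than any particular image library's type. Only the luminance weights and the threshold of 70 come from the text above.

```python
# Luminance weights quoted in Section 2.2 (attributed there to reference [2]).
RED_W, GREEN_W, BLUE_W = 0.2989, 0.5870, 0.1140
THRESHOLD = 70  # level reported in the text for the scanned grade sheets


def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples (0-255) to rows of intensities."""
    return [
        [RED_W * r + GREEN_W * g + BLUE_W * b for (r, g, b) in row]
        for row in rgb_image
    ]


def to_binary(gray_image, level=THRESHOLD):
    """Map pixels brighter than `level` to 1 (white), all others to 0 (black)."""
    return [[1 if value > level else 0 for value in row] for row in gray_image]


# Hypothetical 2x2 scan: white paper, black ink, dark-blue ink, light-gray smudge.
scan = [
    [(255, 255, 255), (0, 0, 0)],
    [(0, 0, 128), (200, 200, 200)],
]
binary = to_binary(to_gray(scan))
print(binary)  # → [[1, 0], [0, 1]]
```

Note that the comparison is strict, so a pixel with luminance exactly 70 maps to 0, matching the "greater than 70" wording. A different scanner or paper quality would likely call for a different `level`.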
Thirdly, the "Opening" morphological operation [3] is applied to smooth the digit and eliminate small noise regions.

Finally, normalization is used to regulate the size, position and shape of the image so that the differences between samples in one class are reduced. The key idea behind normalization involves bilinear interpolation [4]. All of these steps are depicted in Figure 2.

Figure 2. Preprocessing Steps. (a) RGB Image. (b) Gray-scale Image. (c) Binary Image. (d) Image after Morphology. (e) Normalized Image.

2.3. Feature extraction

The features used in our experiment are Hu's seven moments and image averaging.

2.3.1. Hu's seven moments (SM)

An essential issue in the field of pattern analysis is the recognition of objects and characters regardless of their position, size and orientation, as illustrated in Figure 1. The idea of using moments in shape recognition gained prominence when Hu (1962) [5] derived a set of invariants using algebraic invariants. The two-dimensional (p + q)th order moments of an image with density function f(x, y) are defined in terms of Riemann integrals:

$$ m_{pq} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^p y^q f(x, y) \, dx \, dy, \quad p, q = 0, 1, 2, \ldots \tag{1} $$

The central moments are defined as:

$$ \mu_{pq} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \bar{x})^p (y - \bar{y})^q f(x, y) \, dx \, dy \tag{2} $$

$$ \bar{x} = \frac{m_{10}}{m_{00}}, \quad \bar{y} = \frac{m_{01}}{m_{00}} \tag{3} $$

In particular, Hu (1962) defines seven values, computed by normalizing central moments through order three, that are invariant to object scale, position, and orientation.

2.3.2. Image Averaging

Since the pixel matrices of the ten digits from 0 to 9 are very different, it is reasonable to recognize them by checking each 'pseudo pixel', which represents a 4x4 block of a particular digit image. Each 4x4 block has its own average value and can be considered a single 'pixel'. By choosing 4x4 blocks, we reduce the complexity of the recognition process while still maintaining the shape of the image. Figure 3 shows an image of the digit zero before and after averaging.

Figure 3. Example of Image Averaging.
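The 4x4 block averaging of Section 2.3.2 can be illustrated with the following minimal Python sketch. It is not the authors' code: the function name `block_average`, the nested-list image representation, and the 8x8 example are assumptions made for illustration, and the image dimensions are assumed to be multiples of the block size (which the normalization step would be expected to guarantee).

```python
BLOCK = 4  # block side length used in Section 2.3.2


def block_average(image, block=BLOCK):
    """Replace each block x block tile of `image` with its mean value.

    `image` is a list of equal-length rows whose height and width are
    multiples of `block` (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("image dimensions must be multiples of the block size")
    averaged = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            # Sum every pixel inside the current tile.
            total = sum(
                image[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            )
            row.append(total / (block * block))
        averaged.append(row)
    return averaged


# Hypothetical 8x8 binary image: left half white (1), right half black (0),
# plus a single white pixel in the bottom-right tile.
image = [[1] * 4 + [0] * 4 for _ in range(8)]
image[7][7] = 1
features = block_average(image)
print(features)  # → [[1.0, 0.0], [1.0, 0.0625]]
```

Each value in `features` plays the role of one 'pseudo pixel', so an 8x8 image is reduced to a 2x2 feature matrix that still reflects where the ink is.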
(a) Initial Image. (b) Averaged Image.

2.4. Classification algorithm

2.4.1. Artificial neural network

Artificial neural networks [6], which are inspired by studies of biological nervous systems, are composed of many simple nonlinear computational elements (neurons or nodes) connected by links with variable weights. The inherent parallelism of these networks allows rapid pursuit of many hypotheses in parallel, resulting in high computation rates. Moreover, they provide a greater degree of robustness or fault tolerance than conventional computers because of the many processing nodes, each of which is responsible for a small portion of the task. Damage to a few nodes or links thus does not impair overall performance significantly.

Neural networks can perform different tasks, one of which is supervised classification. This is a decision-making process that requires the net to identify the class or category that best represents an input pattern. It is assumed that the net has already adapted to the classes it is expected to recognize through a learning process using labeled training prototypes from each category.

Figure 4. General structure of a neural network [6].

2.4.2. Template matching using Euclidean Distance

The Euclidean Distance classifier [7] selects the smallest 'distance', or error, between the testing sample and the entries of a dictionary that is built up in advance:

$$ d = \min_{1 \le j \le N} \sqrt{\sum_{n=1}^{k} \big( feature(n) - dictionary(j, n) \big)^2} \tag{4} $$

Here n indexes the feature values of a sample and j indexes the dictionary entries.

3. Experimental results

Two different features and two classifiers result in four experiments: Hu's SM and Neural Network, SM and Euclidean Distance, Image Averaging and Neural Network, and Image Averaging and Euclidean Distance.

Table 1: Error rates for data from MNIST (over 1000 samples)
Table 2: Error rates for data from DUT (over 90 samples)

Figure 6 shows the actual result from the Graphic User Interface (GUI).
Table 1 (MNIST)
                      Hu's seven moments    Image averaging
Neural Network        48.18%                14.3%
Euclidean Distance    7.2%                  8.4%

Table 2 (DUT)
                      Hu's seven moments    Image averaging
Neural Network        83.6%                 4.2%
Euclidean Distance    8.2%                  10%

Figure 6. The experimental results of DUT and MNIST databases. (a) DUT database.

Figure 7 introduces the GUI. The user types the name of the JPEG file and the corresponding number of students, and then clicks the Convert button to get an xls file containing the extracted scores. To view the xls file, the user clicks the "Open to view xls file" button.

Figure 7. Graphic User Interface.

4. Conclusion

The experimental results indicate that Image Averaging and Euclidean Distance give more stable and smaller errors than the combination of Neural Network and SM, while the best performance is obtained using the Neural Network classifier. From these statistics, we decided to implement Image Averaging and Euclidean Distance in the final Number Recognition Product. In the future, this program will be upgraded to recognize scores written as decimal numbers (such as 9.5 or 10). In addition, a system for recognizing scores written in words will be added for checking the result.

REFERENCES

[1] Yann LeCun, Corinna Cortes, The MNIST Database of Handwritten Digits, http://yann.lecun.com/exdb/mnist/
[2] http://www.mathworks.com/help/toolbox/images/ref/rgb2gray.html
[3] Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, second edition (2001), pp. 528-532.
[4] http://en.wikipedia.org/wiki/Bilinear_interpolation
[5] Ming-Kuei Hu, "Visual Pattern Recognition by Moment Invariants" (1962).
[6] V. Venugopal, W. Baets, Neural Networks and Statistical Techniques in Marketing Research: A Conceptual Comparison (1994), Vol. 12 Iss: 7, pp. 30-38.
[7] Cheng-Lin Liu, "Handwritten Digit Recognition: Benchmarking of state-of-the-art techniques", Elsevier Ltd (2002).

Posted: 06/03/2014, 02:20
