VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY

NGUYEN DUY KHUONG

BARCODE IDENTIFICATION IN BLURRED IMAGES

Major: Computer Science
Code: 60 48 01

MASTER THESIS

Hanoi – 2010

VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY

NGUYỄN DUY KHƯƠNG

STRUCTURAL PATTERN RECOGNITION FOR HEAVILY DISTORTED IMAGES

Major: Computer Science
Code: 60 48 01

MASTER THESIS SUMMARY

Hanoi – 2010

Table of Contents

Chapter I: Introduction
Chapter II: Background
  1 Input data
  2 Pre-processing
    2.1 Image restoration
    2.2 Input standardization
  3 Feature extraction
  4 Techniques for pattern recognition
Chapter III: Our algorithm for structural pattern recognition
  1 Direct and indirect approaches
  2 Proposed solution
  3 Our algorithm
  4 Candidate evaluation
  5 Techniques to improve the speed and accuracy of our algorithm
Chapter IV: Applying our algorithm for barcode recognition
  1 The structure of bar code
  2 The signals of barcode
  3 Previous algorithms for recognizing barcode
  4 Applying our algorithm for recognizing barcode
Chapter V: Experiments and results
Chapter VI: Conclusion
REFERENCES

Table of figures

Figure 1 Model of pattern recognition system
Figure 2 Example of grammar representation
Figure 3 Example of graph representation
Figure 4 Model of image degradation
Figure 5 Oblique handwriting characters
Figure 6 Oblique bar code
Figure 7 The flow chart of the indirect pattern recognition
Figure 8 The flow chart of the direct pattern recognition
Figure 9 An example of the indirect pattern recognition
Figure 10 An example of the direct pattern recognition
Figure 11 The flow chart of structural pattern recognition in heavily distorted images
Figure 12 An example of the flow chart of motorbike plate pattern recognition
Figure 13 Model of candidate evaluation
Figure 14 Several kinds of linear barcodes
Figure 15 Barcode construction
Figure 16 A clear barcode image
Figure 17 Original signals of barcode
Figure 18 Blurred and noisy signals of barcode
Figure 19 Blurred barcode image
Figure 20 Comparing candidates
Figure 21 The blurred image with noise
Figure 22 The image restored by blind deconvolution in Matlab
Figure 23 Barcodes recognized correctly
Figure 24 Barcodes recognized incorrectly

Chapter I: Introduction

Pattern recognition in images is an extremely hard problem, mostly because of the high dimensionality of the data and the limited visual information that can be extracted from the images. The problem becomes much more serious when the images are heavily distorted and cannot be restored with common image restoration techniques, and when the capacity for heavy computation is restricted, as is the case for the cameras of handheld devices or robots.

Based on their structural characteristics, we divide patterns into two categories: non-structural and structural. Non-structural patterns often have no specific shape, while structural patterns contain several elements that are related to each other by rules or syntaxes. Utilizing this structural information to reduce the number of classes and the number of data dimensions can increase the precision and performance of structural pattern recognition (Horst B. and Alberto S., 1990).

Many factors affect the image acquisition process, such as defocusing, lens aberration, internal reflections and scattering, moving patterns, and transmission errors (Jähne, Bernd, 2004). Among them, defocusing and transmission errors (noise) occur most frequently, especially for the cameras of handheld devices or robots. Hence, some papers assume that the other factors can be ignored (Selim Esedoglu, 2006).

Pattern recognition in distorted images can follow two approaches: direct recognition, which skips image restoration, and indirect recognition, which restores the image first. In the first approach, patterns are recognized
directly in degraded images (Wang K. et al., 2007; Ender T. and James C., 2009). In this approach, it may be immensely difficult to recognize patterns correctly, because the features of the patterns cannot be extracted from images of such poor quality. Therefore, this approach is only suitable for slightly distorted images (Ender T. and James C., 2009). In the second approach, patterns are recognized indirectly, after a step in which the degraded images are restored (Selim Esedoglu, 2003). Image restoration is considered a pre-processing step, but it plays an important role in the success of the pattern recognition. There are several ways of restoring images, namely inverse filtering, Wiener filtering, regularization methods (Selim Esedoglu, 2003), and statistical methods (Bertero, Mario, 1998). Among these, regularization and statistical methods achieve better results because they cope well with noise in the images, which leads to disastrous results in the other methods. In addition, the indirect approach can be highly successful if the restored images differ only slightly from the original images and the image degradation is known exactly. In practice, however, image restoration cannot always be performed well, because the degradation process is often an unknown compound transform whose components are very difficult to enumerate completely. As a result, the quality of the restored images may not be good enough for pattern recognition.

Overall, both approaches to recognizing structural patterns may face many difficulties if only features extracted from the images are used. Hence, prior knowledge about the patterns needs to be utilized in order to improve recognition accuracy.

In this thesis, we propose an effective algorithm that improves pattern recognition accuracy by utilizing the structural information of patterns. Candidates for patterns are generated and
evaluated using prior knowledge of the patterns' structure and information extracted about the image degradation. In addition, techniques such as divide and conquer and local search are offered in order to increase the accuracy and performance of structural pattern recognition systems. Moreover, this approach avoids expensive but highly uncertain computations such as image restoration and feature extraction from heavily distorted images. To support this claim, the algorithm is applied to recognize a kind of linear bar code, a sensitive structural pattern, in heavily distorted images.

The thesis is structured as follows. Chapter II gives a more detailed review of the background related to structural pattern recognition systems. Chapter III focuses on our general algorithm for structural pattern recognition. Chapter IV presents an approach to bar code recognition based on this general algorithm, and Chapter V illustrates the experiments and results of this approach. Finally, some conclusions are drawn in Chapter VI.

Chapter II: Background

The principal function of a pattern recognition system is to locate the position and identify the class of patterns. It contains several components: input, pre-processor, feature extractor and recognizer (Rafael et al., 1978).

Figure 1 Model of pattern recognition system

The input consists of signals that a measurement device transforms into a form suitable for machine manipulation. Although a pattern recognition system can operate directly on the input data from the device, additional components such as a pre-processor and a feature extractor are commonly placed before the recognizer. For pattern recognition in images, pre-processing can enhance the quality of the images. Subsequently, the feature extractor obtains the information necessary for pattern recognition, which is the input to the recognizer. In this chapter, therefore, we mainly discuss four
issues: input data, pre-processing, feature extraction, and techniques for pattern recognition.

1 Input data

The input for pattern recognition is represented as signals stored in the system, such as images and audio, in which the patterns are contained. Features extracted from this input must be suitable for the recognizer. Hence, pattern recognition is generally categorized as statistical or structural (syntactic) (Rafael et al., 1978), corresponding to two kinds of features: vector-based and structural. Concisely, a vector-based feature pattern is represented as a numeric vector whose elements are only weakly related to each other. Patterns with such features are called non-structural patterns. A structural pattern, on the other hand, contains several components that are strongly related to each other according to a given set of rules and syntaxes, expressed as relational descriptions and formal grammars (Rafael et al., 1978). These rules and syntaxes are additional information about patterns, based on prior knowledge of the underlying structure of the pattern classes. They are structural constraints, modeled as a graph model of the pattern or as a grammar over a restricted set of symbols, which the patterns have to follow, as in character recognition and bar codes. Syntactic pattern recognition employs this information in innovative ways in order to develop pattern recognition approaches.

The following two examples show two types of pattern constraints. In the first example, a bar code follows a syntax-like constraint: a limited number of pre-designed patterns encode the digits, and there are guard bars for error checking. In the second example, structural constraints are mainly represented using relational descriptions, e.g. graphs. These constraints can be very helpful for pattern recognition.

Figure 2 Example of grammar representation
Figure 3 Example of graph representation

2 Pre-processing

The main aim of the pre-processing step is to enhance the quality of the input and to standardize it. For pattern recognition in images, enhancement covers many different manipulations such as brightness and contrast enhancement (William, 2007), super-resolution (S. C. Park et al., 2003; S. Farsiu, 2004), and image restoration (Katsaggelos, 2003). In this section, however, we focus only on image restoration, because it is the most important technique for enhancing the quality of distorted images. The second sub-section then discusses several image standardization techniques.

2.1 Image restoration

Image restoration is still a hard problem, although it has been tackled by numerous researchers. The aim of this process is to reconstruct or recover an image that has been degraded by factors such as defocusing, lens aberration, moving patterns, and vibration (Bernd Jähne, 2004). Several techniques have been proposed to solve this problem: the inverse filter (Wiener, 1949), the Wiener filter (M. Rothenberg, 1972), regularization methods and statistical methods (M. Bertero and P. Boccacci, 1998). In these techniques, a linear space-invariant degradation process is commonly modeled as the convolution of the original image with a degradation function, plus noise:

$$g(x, y) = f(x, y) * h(x, y) + n(x, y)$$

where:
• $f(x, y)$ is the original (desired) image,
• $g(x, y)$ is the degraded image,
• $h(x, y)$ is the degradation function, most commonly the point spread function,
• $n(x, y)$ is a noise function.

Figure 4 Model of image degradation

The following sub-sections discuss the techniques for image restoration in more detail, especially models for blurred images with noise, because this kind of degradation is common for the devices used in this research, handheld devices.

a. Estimating the degradation function

To recover the distorted image, estimating
the degradation function is the first step. A commonly used approach derives it from a mathematical model of the blur. In this representation, the blurred image is modeled by the following equation (Milan S., 1998):

$$g(x, y) = \int_0^T f[x - x_0(t),\, y - y_0(t)]\, h[x_0(t), y_0(t)]\, dt + n(x, y)$$

or, equivalently,

$$g(x, y) = f(x, y) * h(x, y) + n(x, y)$$

where $f(x, y)$ is the unblurred image and $T$ is the exposure time.

b. Inverse filter

This technique assumes that the degradation model contains little noise. As discussed in the previous sub-section, we have:

$$g(x, y) = f(x, y) * h(x, y) + n(x, y)$$

The Fourier transform gives (Milan S., 1998):

$$G(u, v) = F(u, v)\, H(u, v) + N(u, v)$$

If the effect of noise is negligible, we have

$$\hat{F}(u, v) = \frac{G(u, v)}{H(u, v)}$$

where $\hat{F}(u, v)$ is an estimate of $F(u, v)$. Hence, the error is:

$$\hat{F}(u, v) - F(u, v) = \frac{N(u, v)}{H(u, v)}$$

This model has some problems. Firstly, where the values of $H(u, v)$ are small, the division can overflow. Secondly, if there is some noise, the term $N(u, v) / H(u, v)$ can dominate the estimate. Therefore, this model is very sensitive to noise, in which case the result is unreliable.

c. Wiener filtering

This technique is also known as the minimum mean-square error (MMSE) filter. It assumes that images and noise can be modeled as random processes. An estimate $\hat{f}$ of the uncorrupted image $f$ is sought such that the mean square error is minimized (Wiener, 1942; M. Rothenberg, 1972; M. Bertero and P. Boccacci, 1998):

$$\mathrm{MSE} = E\{(f(x, y) - \hat{f}(x, y))^2\}$$

In the frequency domain, we have:

$$\mathrm{MSE} = E\{|\hat{F}(u, v) - F(u, v)|^2\}$$

To recover the original image with this technique, we need to assume that the original signal and the noise are independent.

d. Regularization methods

To recover the degraded image, a common approach is to solve the regularized least squares (RLS) minimization problem (M. Bertero and P. Boccacci, 1998):

$$\mathrm{LSR} = \min_f \{\|Af - b\|^2 + \lambda R(f)\}$$

where:
• $\|Af - b\|^2$ is a least squares term that measures the noise,
• $R(\cdot)$ is a convex regularizer used to stabilize the solution,
• $\lambda > 0$ is a regularization parameter providing the tradeoff between fidelity to the measurements and noise sensitivity.

In practice, some choices of $R(\cdot)$ of interest are as follows:

• Tikhonov regularization (Gene H. G. et al., 1999; M. Bertero and P. Boccacci, 1998): by setting $R(f) = \|Lf\|^2$, we obtain the standard Tikhonov regularization problem:

$$\mathrm{LSR} = \min_f \{\|Af - b\|^2 + \lambda \|Lf\|^2\}$$

• $l_1$ regularization: by setting $R(f) = \|f\|_1$, we obtain the $l_1$ regularization problem:

$$\mathrm{LSR} = \min_f \{\|Af - b\|^2 + \lambda \|f\|_1\}$$

• Wavelet-based regularization: by setting $R(f) = \|Wf\|_1$, in which $W$ is a wavelet transform matrix, we obtain the wavelet-based regularization problem:

$$\mathrm{LSR} = \min_f \{\|Af - b\|^2 + \lambda \|Wf\|_1\}$$

• Total variation-based (TV-based) regularization (Chan T. F. et al., 1998): by setting $R(f) = TV(f) = \|\nabla f\| = \sum_{i=1}^{m} \sum_{j=1}^{n} \|(\nabla f)_{i,j}\|$, we obtain the TV-based regularization problem:

$$\mathrm{LSR} = \min_f \{\|Af - b\|^2 + \lambda \|\nabla f\|\}$$

Although total variation-based regularization has been the focus of research in recent years, its results are still not completely precise. In other words, the restored images may not be accurate enough for pattern recognition, because the image degradation may not be known exactly and additional factors may affect the degradation process. The energy function LSR used in the TV-based technique is very sensitive to how well the original image matches the blurred image. Therefore, in the next chapter, this energy function can be used as an evaluation function in our pattern recognition system.

e. Statistical methods

This technique assumes that captured images are realizations of random processes. There are two primary classes of statistical methods: maximum likelihood methods and Bayesian methods. In maximum likelihood methods (Timothy, 1988; M. Bertero and
P. Boccacci, 1998), the object (the desired image) is assumed to be deterministic. It plays the role of a set of parameters of the probability distribution of the captured image. In Bayesian methods (M. Bertero and P. Boccacci, 1998), by contrast, the object is also assumed to be a realization of a random process with a given probability distribution.

f. Evaluation of image restoration methods

Among these techniques, inverse filtering responds very badly to noise, which tends to be high frequency, since the inverse filter is a form of high-pass filter (Tinku A. and Ajoy K. R., 2005). Wiener filtering performs better, since it makes an optimal tradeoff between inverse filtering and noise smoothing. It can remove the additive noise and invert the blurring simultaneously, though it is very sensitive to additive noise. In addition, it is optimal in the sense that it minimizes the overall mean square error in the process of inverse filtering and noise smoothing (Tinku A. and Ajoy K. R., 2005). In practice, this method is not rated highly, because it requires the assumption of a linear estimate of the original image. Hence, regularization methods and statistical methods are considered the most effective. They give more precise results because they can cope with noise (M. Bertero and P. Boccacci, 1998).

2.2 Input standardization

Input standardization is a preparatory step that normalizes the input data for the pattern recognition system. It suits feature extractors and recognizers better, because it simplifies the subsequent processing steps and also improves recognition accuracy. There are two standardization techniques: scale normalization and direction normalization of patterns.

Scale normalization is simply the process of bringing pattern signals to the same scale. This step may not be important for scale-invariant feature extraction and recognition, but for scale-variant ones such as shape
matching, it is particularly significant. Orientation normalization, meanwhile, changes the angle or direction of a pattern in order to enhance the quality of feature extraction and improve recognition accuracy, because patterns appearing in different directions can make feature extraction and recognition methods more complex. Although there are rotation-invariant methods such as the wavelet or Fourier transform for extracting feature vectors of patterns, a slight change of direction can cause great difficulties for some other methods. For example, the localization of characters or bar codes becomes a serious matter if these patterns are skewed.

Figure 5 Oblique handwriting characters

Figure 6 Oblique bar code

3 Feature extraction

Feature extraction is an important step in pattern recognition in images: it locates significant feature regions and reduces the number of data dimensions by keeping only the important extracted information. The features should depend on the characteristics of the patterns or objects, and they need to be suitable for the recognizer. These regions can be determined with global or local operators on the image. Global operators are applied to the whole image in order to enhance its quality and contrast, while local operators are usually applied to find characteristic details of patterns such as edges, corners and shapes.

Based on the level of abstraction of the information, features can be divided into two kinds: low-level and high-level (M. S. Nixon and A. S. Aguado, 2008). Low-level features, such as edges and corners, are extracted automatically from an image without any shape information. In other words, these features do not have spatial relationships with each other. High-level features, meanwhile, contain more abstract information such as shape and object descriptions. They are frequently represented by mathematical models such as graph models and template matching.

To detect features of patterns, edge and corner
detection are the two most significant operators. They are insensitive to changes in overall illumination but sensitive to the contrast level of the image. Contrast, represented by the difference in intensity within local regions, reveals the boundaries of features within an image (M. S. Nixon and A. S. Aguado, 2008). Hence, this difference can be found with the following basic operators:

$$\frac{d}{dx} f(x, y) = |f(x, y) - f(x + 1, y)|$$

$$\frac{d}{dy} f(x, y) = |f(x, y) - f(x, y + 1)|$$

where $f(x, y)$ is the image.

Based on this idea, first-order edge detection operators such as Prewitt (Prewitt and Mendelsohn, 1966), Sobel (Sobel, 1970), and Canny (Canny, 1986) are employed, along with second-order edge detection operators such as the Laplacian (Vliet and Young, 1989) and Marr–Hildreth (Marr and Hildreth, 1980).

Besides edges, corners are frequently exploited in pattern recognition. Corner detection approaches can be based on two factors, namely boundary and gray level. Eigenvalues of the covariance matrix, template and gradient-based techniques, and gradient direction are boundary-based approaches, employed by Tsai et al. (1999), Singh and Shneier (1990), and Zheng et al. (1999) respectively. Among gray-level-based approaches, Gao et al. (2007) used the log-Gabor wavelet transform, Arnow and Bovik (2007) utilized foveated visual search and automated fixation selection to search for the corners of patterns, and Ando (2000) detected corners and edges via gradient covariance.

High-level features are extracted from images on the basis of the low-level features. Shape or template matching extracts significant component features such as the eyes, the ears and the nose in a face (M. S. Nixon and A. S. Aguado, 2008). The main idea of this technique is to find the best match, with the maximum count, between the detected components and the templates in a database. The basic requirement of this approach is that the techniques be size- or orientation-invariant; hence the Hough transform (Hough, 1962)
and the generalized Hough transform (Ballard, 1981) are widely employed. At a higher level of complexity, flexible shape extraction is used to describe the components of patterns with sufficient accuracy and spatial information (M. S. Nixon and A. S. Aguado, 2008). Among the techniques of this approach, active shape modeling (Lanitis et al., 1997; Cootes et al., 1994; Hill et al., 1994) can be considered a major new approach. Finally, mathematical models are employed to describe objects, such as Fourier descriptors (Cosgriff, 1960), regional shape descriptors (Rosin and Zunic, 2005) and graph models (D. Conte et al., 2004). The major advantages of these methods are their scale and orientation invariance and their ability to represent the relational information between the components of patterns in a recognition system.

4 Techniques for pattern recognition

Pattern recognition is a fundamental topic in machine learning, whose major task is to identify the class of patterns. Based on the characteristics of the techniques used, pattern recognition can be divided into two main approaches: statistical pattern recognition and structural pattern recognition.

Statistical pattern recognition is primarily based on statistical learning techniques, in which the statistical distribution of pattern feature vectors is the basis for determining the class of patterns. Many unsupervised and supervised learning methods are known in this approach, such as clustering techniques (P. Berkhin, 2006), Bayes classifiers (Barber, D., 2011), support vector machines (Shigeo Abe, 2005) and artificial neural networks (K. Gurney, 1995). The accuracy of statistical pattern recognition largely depends on the quality of the pattern database and on how well the learning method suits the feature extraction approach.

Structural pattern recognition, meanwhile, utilizes structural features of the patterns to identify their class (R. C. Gonzalez, 1978). In this approach, the interrelationships between the primitive
components of patterns are emphasized when comparing patterns. Hence, besides the feature extraction techniques employed, the representation of the interrelationships between the component features of patterns plays an important role in the success of this approach. These interrelationships are frequently represented as a formal grammar or a graph model. In addition, these constraints need to be established on the basis of prior knowledge of the structure of the patterns. The main advantages of this approach are that it does not require a pattern database and that it can classify a large number of pattern classes.

Chapter III: Our algorithm for structural pattern recognition

Pattern recognition in heavily distorted images is still considered a hard problem because several difficulties are faced simultaneously, especially image restoration and pattern recognition in images. Image restoration is not only computationally expensive but also does not always give correct answers. In particular, image restoration requires complex computation such as statistical inference (Aristidis C. L. and Nikolas P. G., 2004) or solving partial differential equations (PDEs) (Chan T. F. and Chiu-Kwong W., 1998; Selim Esedoglu, 2004). Moreover, its result is only approximate, not completely accurate, if the degradation process of the image is unknown or complex, which is often the case. In addition, pattern recognition in distorted images is quite hard because of the high dimensionality of the data and because features may be impossible to extract from the images, especially heavily distorted ones. As a result, recognition accuracy in distorted images is regularly lower than in other environments; for example, online handwritten character recognition often has higher accuracy than offline handwritten character recognition (Réjean P. and Sargur N. S., 2000).

The main aim of this chapter is to present our algorithm for recognizing structural patterns in heavily distorted images more effectively. More specifically, within heavily
distorted images, structural pattern recognition may have more advantages than non-structural pattern recognition because, besides the features extracted from the data, our algorithm utilizes prior structural knowledge of the patterns in order to improve recognition accuracy.

1 Direct and indirect approaches

In pattern recognition, there are two different approaches, depending on the characteristics of the patterns: a pattern can be recognized directly or indirectly (see Figure 7 and Figure 8). In indirect pattern recognition, patterns are identified after an image restoration process (Selim Esedoglu, 2003). In direct pattern recognition, patterns are recognized directly from the image (Ohbuchi E. et al., 2004).
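
The contrast between the direct and indirect approaches can be illustrated on a one-dimensional scanline. The following Python sketch is not the thesis's method but a minimal illustration under stated assumptions: it applies the degradation model $g = f * h + n$ from Chapter II with a Gaussian point spread function and circular convolution, treats plain thresholding of $g$ as a stand-in for direct recognition, and uses the common parametric Wiener form $\hat{F} = H^{*} G / (|H|^2 + k)$ followed by thresholding as a stand-in for indirect recognition. All function names, the bit pattern, the four samples per module, and the constants `sigma`, `noise_std` and `k` are hypothetical choices made for the example.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; adequate for short scanlines."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def gaussian_kernel(n, sigma):
    """Circular Gaussian point spread function h of length n, normalized to sum 1."""
    h = [math.exp(-0.5 * (min(t, n - t) / sigma) ** 2) for t in range(n)]
    total = sum(h)
    return [v / total for v in h]


def degrade(f, h, noise_std, rng):
    """Degradation model g = f * h + n (circular convolution, Gaussian noise)."""
    n = len(f)
    blurred = [sum(f[(t - s) % n] * h[s] for s in range(n)) for t in range(n)]
    return [b + rng.gauss(0.0, noise_std) for b in blurred]


def wiener_restore(g, h, k):
    """Parametric Wiener estimate F_hat = conj(H) G / (|H|^2 + k).

    k is an assumed constant standing in for the noise-to-signal power ratio.
    """
    G, H = dft(g), dft(h)
    F_hat = [Hc.conjugate() * Gc / (abs(Hc) ** 2 + k) for Gc, Hc in zip(G, H)]
    return [v.real for v in dft(F_hat, inverse=True)]


def threshold(signal, level=0.5):
    """Binarize a scanline: 1 where the sample exceeds the level, else 0."""
    return [1 if v > level else 0 for v in signal]


# Hypothetical scanline: 16 modules, 4 samples per module.
rng = random.Random(0)  # fixed seed so the run is repeatable
modules = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0]
f = [float(m) for m in modules for _ in range(4)]
h = gaussian_kernel(len(f), sigma=1.0)
g = degrade(f, h, noise_std=0.01, rng=rng)

direct = threshold(g)                               # direct: no restoration
indirect = threshold(wiener_restore(g, h, k=1e-2))  # indirect: restore, then decide

print("direct errors:  ", sum(a != int(b) for a, b in zip(direct, f)))
print("indirect errors:", sum(a != int(b) for a, b in zip(indirect, f)))
```

Note that the indirect branch is given the true kernel `h`, which is exactly the knowledge that is usually missing in practice; raising `sigma` or `noise_std`, or restoring with a mismatched kernel, is one way to explore how both branches behave as the distortion grows.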