COMBINING DISCRETE ORTHOGONAL MOMENTS AND DHMMS FOR OFF-LINE HANDWRITTEN CHINESE CHARACTER RECOGNITION

Xianmei Wang, Yang Yang, and Kang Huang
School of Information and Engineering
University of Science and Technology Beijing
No. 30, Xueyuan Road, Beijing, China, 100083
plum-wang@tom.com

Proc. 5th IEEE Int. Conf. on Cognitive Informatics (ICCI'06), Y.Y. Yao, Z.Z. Shi, Y. Wang, and W. Kinsner (Eds.), 1-4244-0475-4/06/$20.00 ©2006 IEEE

Abstract

Discrete orthogonal moments are a recent class of moment-based feature descriptors for image analysis. Tchebichef moments and Krawtchouk moments are the two representatives of this class. This paper studies the performance of these two discrete orthogonal moments in the recognition of off-line handwritten Chinese amount in words under the Discrete-time Hidden Markov Models (DHMMs) framework. The lower order moments are employed as features. A series of experiments is carried out to compare their performance with that of continuous orthogonal moments such as Zernike and Legendre moments. Experimental results suggest that the recognition performance of the two discrete orthogonal moments is higher than that of the continuous moments. In addition, different values of the number of zones, observation symbols and states are used to find a better model structure for the new approach.

Keywords: Discrete orthogonal moments; DHMMs; Off-line handwritten character recognition.

1. INTRODUCTION

Moments with orthogonal basis functions, introduced by Teague [1], have minimal information redundancy in a moment set. In this class, the two most important orthogonal moments, which have been extensively researched in the pattern recognition field, are Zernike moments and Legendre moments [2-4]. They are now widely used as fundamental features for character recognition. However, both Zernike and Legendre moments belong to the class of continuous moments. For digital images, their computation requires a numerical approximation of continuous integrals. This process can cause error, especially when the order of the moments increases. Additionally, the use of Zernike and Legendre moments requires a coordinate transformation, which may also cause precision loss.

Some researchers, such as Mukundan et al., have suggested the use of discrete orthogonal moments to overcome the problems associated with the continuous orthogonal moments. In the past few years, several discrete orthogonal moments have been proposed, such as Tchebichef moments [5-7] and Krawtchouk moments [8] [9]. These discrete orthogonal moments are directly defined in the image coordinate space [(0, 0), (N-1, M-1)]. Their implementation involves neither numerical approximation nor coordinate transformation. These properties make discrete orthogonal moments much more suitable as pattern features for 2-D discrete images. It has been shown that these discrete orthogonal moments perform better than the conventional continuous orthogonal moments for image reconstruction [5] [8] [9].

For off-line handwritten character recognition, most of the obstacles lie in the strong variability of handwriting styles. Hidden Markov models (HMMs) are stochastic models which can deal with the dynamic properties and variations of human handwriting. Over the last decade, HMMs have been widely used for off-line handwritten character recognition. There are basically two classes of HMMs depending on the type of observation sequence, i.e. discrete-time HMMs (DHMMs) and continuous-time HMMs (CHMMs). Both have been successfully applied to the recognition of off-line handwritten characters. However, DHMMs are more attractive because of their low computational cost.

In this paper, we propose a method that extracts discrete orthogonal moment features for DHMMs-based character recognition. We then study and compare the discrete orthogonal moments for the recognition of unconstrained off-line handwritten Chinese amount in words with 12 Chinese words. In the image processing literature, it is well established that the important and perceptually significant information of an image lies in the lower order coefficients. The higher order coefficients can be discarded without significantly affecting the quality of the image. Moments of lower order are therefore extracted in our method as the feature descriptors.

The organization of this paper is as follows. Section 2 presents the definition of discrete orthogonal moments, including Tchebichef moments and Krawtchouk moments. Our HMMs-based recognition approach, which combines discrete orthogonal moments for off-line handwritten character recognition, is described in Section 3. Section 4 presents the experimental results of our method. Section 5 concludes the paper.

2. DISCRETE ORTHOGONAL MOMENTS

The discrete orthogonal moment functions are based on a set of discrete polynomials which is orthogonal in the discrete domain of the image coordinate space. In this section, we first introduce the general theory of discrete orthogonal polynomials and discrete orthogonal moments, followed by Tchebichef moments and Krawtchouk moments.

2.1 The Generalized Discrete Orthogonal Polynomials and Moments

Suppose $\{f_n(x)\}$ is a set of discrete orthogonal polynomials. Then it satisfies the following condition [5]:

\[ \sum_{x=0}^{N-1} \omega(x)\, f_n(x)\, f_m(x) = \rho(n; a, b)\, \delta_{mn}, \qquad a \le x \le b \tag{1} \]

where $\omega(x)$ is called the weight function and $\rho(n)$ the squared norm.

The generalized discrete orthogonal moments of order $(n + m)$ in terms of $f_n(x)$ for an image with intensity function $I(x, y)$ ($0 \le x \le N-1$, $0 \le y \le M-1$) are

\[ F_{nm} = \frac{1}{\rho(n, N)\,\rho(m, M)} \sum_{x=0}^{N-1} \sum_{y=0}^{M-1} \omega(x)\,\omega(y)\, f_n(x)\, f_m(y)\, I(x, y) \tag{2} \]

\[ n = 0, 1, 2, \ldots, N-1, \qquad m = 0, 1, 2, \ldots, M-1. \]

For a given moment function $F_{nm}$, $\rho(n, N)$ is independent of $x$, so (2) can also be written as

\[ F_{nm} = \sum_{x=0}^{N-1} \sum_{y=0}^{M-1} \frac{\omega(x)\,\omega(y)\, f_n(x)\, f_m(y)}{\rho(n, N)\,\rho(m, M)}\, I(x, y) \tag{3} \]

The moment definition given in (2) or (3) completely eliminates the need for any approximation of continuous integrals, and does not require coordinate space transformations.

2.2 Tchebichef Moments

Tchebichef moments are a set of moments formed by using Tchebichef polynomials as the basis function set [5].

The definition of the nth order Tchebichef polynomial is

\[ t_n(x) = (1 - N)_n \; {}_3F_2(-n, -x, 1 + n;\; 1, 1 - N;\; 1), \qquad x, n = 0, 1, 2, \ldots, N-1 \tag{4} \]

and the function ${}_pF_q(\cdot)$ is defined as

\[ {}_pF_q(a_1, \ldots, a_p;\; b_1, \ldots, b_q;\; z) = \sum_{k=0}^{\infty} \frac{(a_1)_k (a_2)_k \cdots (a_p)_k}{(b_1)_k (b_2)_k \cdots (b_q)_k} \frac{z^k}{k!} \tag{5} \]

where the symbol $(a)_k$ is given by

\[ (a)_k = a(a + 1)(a + 2) \cdots (a + k - 1) \tag{6} \]

For Tchebichef polynomials, the weight function $\omega(x)$ and squared norm $\rho(n)$ are given by

\[ \omega(x) = 1 \tag{7} \]

\[ \rho(n, N) = (2n)! \binom{N + n}{2n + 1}, \qquad 0 \le n \le N-1 \tag{8} \]

Because the value of the Tchebichef polynomial grows rapidly as $N^n$ [5], a set of weighted Tchebichef polynomials was introduced by R. Mukundan, S. H. Ong and P. A. Lee to ensure the stability of Tchebichef polynomials.

The weighted Tchebichef polynomial $\{\tilde{t}_n(x)\}$ is defined as

\[ \tilde{t}_n(x) = \frac{t_n(x)}{\beta(n, N)} \tag{9} \]

where $\beta(n, N)$ is a suitable constant that keeps the value of the weighted Tchebichef polynomial within the range $[-1, 1]$. Under (9), $\rho(n, N)$ is also modified to

\[ \tilde{\rho}(n, N) = \frac{\rho(n, N)}{\beta(n, N)^2} \tag{10} \]

There are several choices for the function $\beta(n, N)$ [8-9]. The simplest one is

\[ \beta(n, N) = \ldots \tag{11} \]

The Tchebichef moments of order $(n + m)$ in terms of $\tilde{t}_n(x)$ for an image with intensity function $I(x, y)$ ($0 \le x \le N-1$, $0 \le y \le M-1$) are defined as

\[ T_{nm} = \frac{1}{\tilde{\rho}(n, N)\,\tilde{\rho}(m, M)} \sum_{x=0}^{N-1} \sum_{y=0}^{M-1} \tilde{t}_n(x)\, \tilde{t}_m(y)\, I(x, y) = \sum_{x=0}^{N-1} \sum_{y=0}^{M-1} \frac{t_n(x)\, t_m(y)\, \beta(n, N)\, \beta(m, M)}{\rho(n, N)\,\rho(m, M)}\, I(x, y) \tag{12} \]

\[ n = 0, 1, 2, \ldots, N-1, \qquad m = 0, 1, 2, \ldots, M-1. \]
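The definitions of the Tchebichef polynomials, their squared norm and the Tchebichef moments in Section 2.2 can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this paper (which was written in Matlab): the function names are hypothetical, exact rational arithmetic is used for clarity rather than speed, and the scaling constant `beta` is assumed to be N**n purely for illustration, since any non-zero choice of the scaling constant leaves the orthogonality relations intact.

```python
from fractions import Fraction
from math import comb, factorial


def pochhammer(a, k):
    """Rising factorial (a)_k = a (a+1) ... (a+k-1)."""
    result = Fraction(1)
    for i in range(k):
        result *= a + i
    return result


def tchebichef(n, x, N):
    """t_n(x) = (1-N)_n * 3F2(-n, -x, 1+n; 1, 1-N; 1); the series stops at k = n."""
    total = Fraction(0)
    for k in range(n + 1):
        num = pochhammer(-n, k) * pochhammer(-x, k) * pochhammer(1 + n, k)
        den = pochhammer(1, k) * pochhammer(1 - N, k) * factorial(k)
        total += num / den
    return pochhammer(1 - N, n) * total


def rho(n, N):
    """Squared norm (2n)! * C(N + n, 2n + 1)."""
    return factorial(2 * n) * comb(N + n, 2 * n + 1)


def beta(n, N):
    """ASSUMED scaling constant (illustrative choice, not taken from the paper)."""
    return N ** n


def weighted_tchebichef(n, x, N):
    """Weighted polynomial t_n(x) / beta(n, N)."""
    return tchebichef(n, x, N) / beta(n, N)


def weighted_rho(n, N):
    """Modified squared norm rho(n, N) / beta(n, N)^2."""
    return Fraction(rho(n, N), beta(n, N) ** 2)


def tchebichef_moment(image, n, m):
    """Moment T_nm of image[x][y], with x in 0..N-1 and y in 0..M-1."""
    N, M = len(image), len(image[0])
    total = sum(
        weighted_tchebichef(n, x, N) * weighted_tchebichef(m, y, M) * image[x][y]
        for x in range(N)
        for y in range(M)
    )
    return total / (weighted_rho(n, N) * weighted_rho(m, M))


# Toy 4 x 4 binary image; print the moments of order n + m <= 3.
image = [[0, 1, 1, 0],
         [1, 0, 0, 1],
         [1, 1, 1, 1],
         [1, 0, 0, 1]]
for n in range(4):
    for m in range(4 - n):
        print(n, m, float(tchebichef_moment(image, n, m)))
```

Because the arithmetic is exact, the orthogonality condition and the reconstruction of the image from the full moment set can be checked with equality rather than a tolerance. A practical implementation would use floating point and precomputed polynomial tables instead.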
2.3 Krawtchouk Moments

The set of Krawtchouk moments was first introduced by Yap et al. [8] [9]. The kernel of Krawtchouk moments consists of the set of Krawtchouk polynomials.

The definition of the nth order Krawtchouk polynomial is [8]

\[ K_n(x; p, N) = \sum_{k=0}^{N} a_{k,n,p}\, x^k = {}_2F_1\!\left(-n, -x;\; -N;\; \frac{1}{p}\right) \tag{13} \]

where $x, n = 0, 1, 2, \ldots, N$, $N > 0$, $p \in (0, 1)$. The function ${}_pF_q$ and the symbol $(a)_k$ are given by (5) and (6) respectively.

For Krawtchouk polynomials, the weight function $\omega(x)$ and squared norm $\rho(n)$ are given by

\[ \omega(x; p, N) = \binom{N}{x} p^x (1 - p)^{N - x} \tag{14} \]

\[ \rho(n; p, N) = (-1)^n \left(\frac{1 - p}{p}\right)^n \frac{n!}{(-N)_n}, \qquad 0 \le n \le N-1 \tag{15} \]

As with the Tchebichef polynomials, instability also exists among Krawtchouk polynomials. The weighted Krawtchouk polynomial $\{\bar{K}_n(x; p, N)\}$ is defined as

\[ \bar{K}_n(x; p, N) = K_n(x; p, N) \sqrt{\frac{\omega(x; p, N)}{\rho(n; p, N)}} \tag{16} \]

The Krawtchouk moments of order $(n + m)$ in terms of weighted Krawtchouk polynomials for an image with intensity function $I(x, y)$ and $N \times M$ pixels are defined as

\[ K_{nm} = \sum_{x=0}^{N-1} \sum_{y=0}^{M-1} \bar{K}_n(x; p_1, N-1)\, \bar{K}_m(y; p_2, M-1)\, I(x, y) \tag{17} \]

3. HMMS FOR CHARACTER RECOGNITION

HMMs have been widely used in the field of pattern recognition. They are stochastic models for time sequences, which were introduced and studied in the late 1960s and early 1970s [10] [11]. Their original application was in speech recognition. During the last decade, HMMs have become popular in off-line character recognition systems. As with all other classification methods, feature extraction is key for an HMMs-based classifier. In this paper, we study discrete orthogonal moments as features under the HMMs framework for off-line handwritten Chinese character recognition with a limited lexicon.

3.1 The Introduction of DHMMs

Here the HMMs are referred to as DHMMs. A DHMM is a probabilistic model that describes a random sequence $O = O_1, O_2, \ldots, O_T$ as the indirect observation of a hidden random sequence $Q = Q_1, Q_2, \ldots, Q_T$, where this hidden process is Markovian. A DHMM $\lambda = \{A, B, \pi\}$ is characterized by the following elements:

N: the number of states in a model;
M: the total number of observation symbols;
$A = (a_{ij})_{N \times N}$: the state transition probability matrix;
$B = (b_{jk})_{N \times M}$: the observation symbol probability matrix in each state;
$\pi = (\pi_i)_N$: the initial state probability matrix.

3.2 System Overview

At the top level, a traditional DHMMs-based character recognition system can be divided into two basic functional components: training and recognition. Both training and recognition share common pre-processing, frame generation and feature extraction stages.

In the training phase, after applying pre-processing steps including binarization, noise removal, boundary obtainment, and size normalization, an image $I(x, y)$ ($0 \le x, y < L$) is segmented into T frames frame(i) ($1 \le i \le T$) using a sliding window technique to suit the DHMMs recognition engine. The feature extraction module then transforms a frame image frame(i) into a feature vector fv(i), which is then translated into a symbol $O(i)$ ($1 \le i \le T$) by a clustering algorithm (VQ). The codebook with M code words output by VQ is kept for later use in quantizing feature vectors into symbols. By the above steps, a character pattern is translated into a sequence of symbols $O = O(1), O(2), \ldots, O(T)$. For each pattern class, the observation sequences $O$ are used to adjust the model parameters $\{A, B, \pi\}$ so that the probability of the training observation sequences $P(O \mid \lambda)$ is maximized.

In the recognition phase, after pre-processing, frame generation and feature extraction, T sequential feature vectors are obtained. These feature vectors are then quantized to the nearest codebook vectors. Thus, each word image is represented by a sequence of corresponding observation symbols. The resulting observation sequence is used to calculate the log-likelihood for each model. The word associated with the HMM of the highest log-likelihood is declared to be the recognized word.

In our classifier, the size of a normalized image was 64 x 64. The length of an observation symbol sequence was set to T = 8. The K-Means algorithm was employed for VQ because of its simplicity. Model estimation and matching were performed by the Baum-Welch algorithm and the Viterbi algorithm respectively [12].

3.3 The Selection of Model Topology

The transition topology is an important part of the recognition system, and a suitable topology can significantly increase recognition accuracy. In character recognition, left-right models are widely used because they suit people's writing habits well. In this paper, we choose the left-right topology with 1 skip (shown in Fig. 1) to model the image. The number of states and the number of observation symbols are set to different values to find the better ones in later experiments (see Section 4).

Fig. 1. The left-right topology with 1 skip

3.4 Extracting Discrete Orthogonal Moments Feature under DHMMs Framework

In order to capture the detail of the image, a zoning sliding window technique is used to extract the sequential feature vectors. Each frame is divided into Z equal-sized zones from top to bottom, as shown in Fig. 2. The width W and the height H of a zone are then given by

\[ W = L / T, \qquad H = L / Z \tag{18} \]

[Fig. 2: a frame frame(i) divided from top to bottom into zones Zone 1, ..., Zone Z]

According to (12) and (17), if the highest order of moments is d, the dimension of a local feature vector for a zone is

\[ D' = (d + 1) \times (d + 2) / 2. \tag{19} \]

Then the total dimension of a feature vector for a frame is

\[ D = Z \times D'. \tag{20} \]

For example, d = 3 and Z = 4 give D' = 10 and D = 40.

In our method, only discrete orthogonal moments with lower orders from 0 to 3 are extracted in a zone. For a given zone Zone(j) ($1 \le j \le Z$) of a frame frame(i) ($1 \le i \le 8$), (12) and (17) can be written as

\[ T_{nm}(i, j) = \frac{1}{\tilde{\rho}(n, W)\,\tilde{\rho}(m, H)} \sum_{x'=0}^{W-1} \sum_{y'=0}^{H-1} \tilde{t}_n(x')\, \tilde{t}_m(y')\, I(x, y) \tag{21} \]

\[ K_{nm}(i, j) = \sum_{x'=0}^{W-1} \sum_{y'=0}^{H-1} \bar{K}_n(x'; p_1, W-1)\, \bar{K}_m(y'; p_2, H-1)\, I(x, y) \tag{22} \]

where $x' = x - (i - 1) \times W$ and $y' = y - (j - 1) \times H$.

Because the maximum order $(n + m)$ is 3, a local Tchebichef feature vector $T_Z(i, j)$ or a local Krawtchouk feature vector $K_Z(i, j)$ extracted from a zone can be written as

\[ T_Z(i, j) = \{T_{00}(i, j), T_{01}(i, j), T_{10}(i, j), T_{11}(i, j), T_{02}(i, j), T_{20}(i, j), T_{30}(i, j), T_{03}(i, j), T_{12}(i, j), T_{21}(i, j)\} \tag{23} \]

or

\[ K_Z(i, j) = \{K_{00}(i, j), K_{01}(i, j), K_{10}(i, j), K_{11}(i, j), K_{02}(i, j), K_{20}(i, j), K_{30}(i, j), K_{03}(i, j), K_{12}(i, j), K_{21}(i, j)\} \tag{24} \]

The feature vector for a frame frame(i) is formed by concatenating the Z local feature vectors. That is to say, a frame vector $fv_T(i)$ or $fv_K(i)$ can be written as

\[ fv_T(i) = \{T_Z(i, 1), T_Z(i, 2), \ldots, T_Z(i, Z)\} \tag{25} \]

\[ fv_K(i) = \{K_Z(i, 1), K_Z(i, 2), \ldots, K_Z(i, Z)\} \tag{26} \]

3.5 The Acceleration of Feature Extraction

According to (12), the computation of $T_{nm}$ involves the computation of the functions $\tilde{t}_n(x)$ and $\tilde{\rho}(n, N)$. Each function consists of several polynomial terms that must be multiplied, which requires much computation time. In order to speed up feature extraction, the coefficients $\tilde{t}_n(x)$ and $\tilde{\rho}(n, N)$ for Tchebichef moments are computed and saved in advance. In the feature extraction stage, all coefficients are then loaded into memory from the hard disk and used as constants in the real-time calculation of Tchebichef moments.

The same acceleration method is also used in the extraction of Krawtchouk moments.

4. EXPERIMENTS AND ANALYSIS

The work detailed in this paper deals with the recognition of isolated unconstrained off-line handwritten Chinese amount in words, including Chinese characters from V to X and JU. All the handwritten character samples used in the following experiments were collected by our laboratory. They were written by numerous writers in various writing styles. There are 11,966 binary digital images in total. We used 8,366 images for training and 3,600 images for testing.

All experiments were performed on a COMPAQ Evo N610C notebook PC with 768M memory and a 1.8GHz CPU. All programs were written in Matlab 6.0.

Because of its simple form, (11) was selected for the function $\beta(n, N)$ to maintain the equal weight of different Tchebichef moments. In (17), the condition $p_1 = p_2 = 0.5$ was used to extract Krawtchouk moments.

4.1 Comparison of Recognition Performance

In order to compare the recognition performance of different moments, the following conditions are used for the experiments in this sub-section: (1) each frame is divided into Z = 4 zones, (2) the number of observation symbols is set to M = 64.

The recognition accuracy q and the recognition speed per minute s are defined as

\[ q = \frac{\text{Number of correctly classified images}}{\text{The total number of images used in the test}} \times 100\% \tag{27} \]

\[ s = \frac{\text{The total number of images used in the test}}{\text{The total time used in recognition test}} \times 60 \tag{28} \]

In an HMMs-based recognition system, the recognition rate is affected by the number of states. The performance of the different moments with 6, 9 and 12 states was tested. The results are shown in Table 1 and Table 2. It should be pointed out that the recognition speed reported in this paper does not include the time used for pre-processing and for saving the final recognition results.

Table 1. Recognition accuracy (%) of different orthogonal moments with M=64, Z=4

Moment        State number N
              6        9        12
Tchebichef    91.42    92.69    92.36
Krawtchouk    89.89    90.89    90.81
Zernike       86.47    89.55    89.39
Legendre      86.19    88.72    88.03

Table 1 indicates that the recognition accuracy of Tchebichef and Krawtchouk moments is higher than that of the two continuous moments, whether the number of states is set to 6, 9 or 12. It can also be seen that the highest recognition accuracy is achieved by setting N to 9. However, the differences in accuracy between N = 9 and N = 12 are small. Additionally, Table 1 indicates that the recognition accuracy of Legendre moments is the lowest.

Table 2. Recognition speed (images per minute) of different orthogonal moments with M=64, Z=4

Moment        State number N
              6        9        12
Tchebichef    513      477      442
Krawtchouk    515      478      450
Zernike       239      235      223
Legendre      511      476      439

Table 2 shows the recognition speed of the different moments. Note that all coefficients were computed in advance (see Section 3.5). From this table, we can see that the recognition speed of Tchebichef and Krawtchouk moments is comparable to that of Legendre moments, and roughly twice that of Zernike moments.

Considering both recognition accuracy and recognition speed, we can conclude that the recognition performance of discrete orthogonal moments is better than that of continuous orthogonal moments.

4.2 Effect of the Number of Zones Z

The number of zones determines the size of a feature vector. We tested the effect of the number of zones on the recognition accuracy with 9 states and 64 observation symbols. Table 3 gives the results.

Table 3. Recognition accuracy (%) for different numbers of zones with M=64 and N=9

Moment        Number of zones Z
              1        2        4        8
Tchebichef    90.88    92.36    92.69    93.19
Krawtchouk    88.81    89.42    90.89    93.00

Table 3 shows that the recognition accuracy increases as the number of zones increases. Among the numbers of zones given in Table 3, the highest accuracy is obtained by setting Z to 8. In the next sub-section, Z = 8 is used to evaluate the number of observation symbols M.

4.3 Evaluation of the Number of Observation Symbols M

The number of observation symbols (namely the codebook size in the VQ stage) was experimentally optimized. We evaluated 64, 96, 128 and 256 observation symbols. Table 4 gives the recognition accuracy of the two discrete orthogonal moments with 9 states and 8 zones. The best result is obtained by using M = 256.

Table 4. Recognition accuracy (%) of discrete orthogonal moments with different values of M

Moment        Number of observation symbols M
              64       96       128      256
Tchebichef    93.19    94.11    94.78    95.58
Krawtchouk    93.00    93.97    94.58    95.36

5. CONCLUSIONS

In this paper, we have studied discrete-orthogonal-moment-based feature descriptors in the recognition of unconstrained off-line handwritten Chinese amount in words under the DHMMs framework. Two main pieces of work are done in this paper. First, we discussed the theory of Tchebichef and Krawtchouk moments. Second, we combined the discrete orthogonal moments and DHMMs for character recognition. A series of experiments was carried out to test the performance of our approach in the recognition of off-line handwritten Chinese amount in words. Within the parametric range discussed in this paper, the preliminary results seem to indicate that:

(1) The performance of discrete orthogonal moments is better than that of continuous orthogonal moments because discrete orthogonal moments are directly defined [...] discrete orthogonal moments.

References

[1] M. R. Teague, "Image analysis via the general theory of moments," J. Opt. Soc. Amer., vol. 70, no. 8, pp. 920-930, 1980.
[2] Khotanzad, "Invariant image recognition by Zernike moments," IEEE Trans. Pattern Anal. Mach. Intell., vol. 12, no. 5, pp. 489-497, 1990.
[3] Rutvik Desai and H. D. Cheng, "Pattern recognition by local radial moments," Proceedings of the 12th International Conference on Pattern Recognition, vol. 2, pp. 168-172, 1994.
[4] Mehdi Dehghan and Karim Faez, "Farsi Handwritten Character Recognition with Moment invariants," Proceedings of 13th International Conference on Digital Signal Processing, vol. 2, pp. 507-510, 1997.
[5] R. Mukundan, S. H. Ong, and P. A. Lee, "Image Analysis by Tchebichef Moments," IEEE Transactions on Image Processing, vol. 10, no. 9, pp. 1357-1364, 2001.
[6] R. Mukundan, "Some Computational Aspects of Discrete Orthonormal Moments," IEEE Transactions on Image Processing, vol. 13, no. 8, pp. 1055-1059, 2004.
[7] P. T. Yap and P. Raveendran, "Image focus measure based on Chebyshev moments," IEE Proc.-Vis. Image Signal Process., vol. 151, no. 2, pp. 128-136, 2004.
[8] P. T. Yap, P. Raveendran, and S. H. Ong, "Krawtchouk Moments as a new set of discrete orthogonal moments for image reconstruction," International Joint Conference on Neural Networks, Washington DC, pp. 908-912, 2001.
[9] Pew-Thian Yap, Raveendran Paramesran, and Seng-Huat Ong, "Image Analysis by Krawtchouk Moments," IEEE Transactions on Image Processing, vol. 12, no. 11, pp. 1367-1377, 2003.
[10] L. E. Baum and T. Petrie, "Statistical Inference for Probabilistic Functions of Finite State Markov Chains," Annals of Math. Statistics, vol. 37, pp. 1554-1563, 1966.
[11] L. E. Baum, T. Petrie, G. Soules, and N. Weiss, "A Maximization Technique Occurring in Statistical Analysis of Probabilistic Functions of Markov Chains," Annals of Math.
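Supplementary sketch. The recognition phase described in Section 3.2 (quantizing frame feature vectors against a VQ codebook, scoring the resulting symbol sequence with one DHMM per class, and choosing the class with the highest log-likelihood) might be sketched as below. This is a minimal illustration under stated assumptions, not the Matlab implementation used in the experiments: all function names are hypothetical, the codebook and model parameters are toy values rather than trained ones, and the left-right topology with 1 skip is initialized with uniform probabilities only as a placeholder for Baum-Welch estimates.

```python
import math

NEG_INF = float("-inf")


def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF


def quantize(feature_vectors, codebook):
    """Map each frame feature vector to the index of its nearest code word."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return [min(range(len(codebook)), key=lambda k: sq_dist(fv, codebook[k]))
            for fv in feature_vectors]


def left_right_transitions(num_states):
    """Left-right topology with 1 skip: state i may move to i, i+1 or i+2.

    Uniform probabilities over the allowed moves are placeholders,
    not trained values.
    """
    A = [[0.0] * num_states for _ in range(num_states)]
    for i in range(num_states):
        targets = [j for j in (i, i + 1, i + 2) if j < num_states]
        for j in targets:
            A[i][j] = 1.0 / len(targets)
    return A


def viterbi_log_likelihood(symbols, A, B, pi):
    """Log-probability of the best state path for one symbol sequence."""
    n = len(pi)
    delta = [safe_log(pi[i]) + safe_log(B[i][symbols[0]]) for i in range(n)]
    for o in symbols[1:]:
        delta = [max(delta[i] + safe_log(A[i][j]) for i in range(n))
                 + safe_log(B[j][o])
                 for j in range(n)]
    return max(delta)


def recognize(symbols, models):
    """Return the label whose model (A, B, pi) gives the highest score."""
    return max(models,
               key=lambda label: viterbi_log_likelihood(symbols, *models[label]))


# Toy example: 3 states, 2 observation symbols, sequence length T = 8.
A = left_right_transitions(3)
pi = [1.0, 0.0, 0.0]
models = {
    "class_a": (A, [[0.9, 0.1]] * 3, pi),  # mostly emits symbol 0
    "class_b": (A, [[0.1, 0.9]] * 3, pi),  # mostly emits symbol 1
}
codebook = [[0.0, 0.0], [10.0, 10.0]]
frames = [[1.0, 0.5]] * 6 + [[9.0, 8.0]] * 2
symbols = quantize(frames, codebook)
print(symbols, recognize(symbols, models))
```

Working in the log domain avoids numerical underflow for longer sequences, and forbidden transitions of the left-right topology simply contribute a score of minus infinity.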