Lecture Notes in Computer Science - P74

354 Z. Hu, H. Leung, and Y. Xu

of the template is already known in the database. Hence, the stroke sequence can be easily verified based on the stroke correspondence.

Computational complexity. We apply our pruning strategy to the graph matching and compare the computation time with that of the existing method in [6]. The method in [6] also uses graph matching to find the differences between the template and the sample, but without a pruning strategy. Figure 9(a) shows that the methods with and without pruning perform similarly in finding stroke production errors. However, our proposed method is faster than the one in [6], as illustrated in Figure 9(b).

The results show that our method can handle more kinds of handwriting errors than existing methods, including stroke production errors, spatial relationship errors and stroke sequence errors. Our method locates more kinds of handwriting errors in less computation time, and its accuracy also exceeds that of existing methods.

4 Conclusion

In this paper we have used the attributed relational graph to represent a handwritten Chinese character, incorporating the spatial relationship information between strokes. A refined interval relationship with more granular levels is proposed to model Chinese characters. A novel interval neighborhood graph is also proposed to compute the distances among the refined interval relationships. A pruning strategy is adopted to assist the A* algorithm in searching for the optimal matching and to lower the computational complexity. The experiments show that our proposal can handle more kinds of handwriting errors than existing methods in less computation time.
To further improve the performance of our method, future work may focus on improving the current definition of relationship by combining the relationships along the x-axis and the y-axis into a new relationship, and on finding a more computationally efficient pruning strategy based on the new relationship.

Acknowledgments

The work described in this paper was fully supported by a grant from City University of Hong Kong (Project No. 7001711).

References

1. Law, N.N., Ki, W.W., Chung, A.L.S., Ko, P.Y., Lam, H.C.: Children’s stroke sequence errors in writing Chinese characters. Reading and Writing: An Interdisciplinary Journal, 267–292 (1998)
2. Lam, H.C., Pun, K.H., Leung, S.T., Tse, S.K., Ki, W.W.: Computer-Assisted-Learning for Learning Chinese Characters. Communications of COLIPS, an Intl. Journal of the Chinese and Oriental Languages Processing Society 3(1), 31–44 (1993)
3. Lam, H.C., Ki, W.W., Law, N., Chung, A.L.S., Ko, P.Y., Ho, A.H.S., Pun, S.W.: Designing CALL for learning Chinese characters. Journal of Computer Assisted Learning 17(1), 115–128 (2001)

Automated Chinese Handwriting Error Detection 355

4. Tsay, Y.T., Tsai, W.H.: Attributed String Matching by Split-and-Merge for On-line Chinese Character Recognition. IEEE Trans. PAMI 5(2), 180–185 (1993)
5. Tonouchi, Y., Kawamura, A.: An On-line Japanese character recognition method using length-based stroke correspondence algorithm. In: Proceedings of the Fourth Intl. Conf. on Analysis and Recognition, vol. 2, pp. 633–636 (1997)
6. Hu, Z.H., Leung, H., Xu, Y.: Stroke Correspondence Based on Graph Matching for Detecting Stroke Production Errors in Chinese Character Handwriting. In: Pacific-Rim Conference on Multimedia (PCM), pp. 734–743 (2007)
7. Tang, K.T., Leung, H.: Reconstructing the correct writing sequence from a set of Chinese character strokes. In: Matsumoto, Y., Sproat, R.W., Wong, K.-F., Zhang, M. (eds.) ICCPOL 2006. LNCS (LNAI), vol. 4285, pp. 333–344. Springer, Heidelberg (2006)
8. Tan, C.K.: An algorithm for online strokes verification of Chinese characters using discrete features. In: 8th Intl. Workshop on Frontiers in Handwriting Recognition, pp. 339–344 (2002)
9. Tang, K.T., Li, K.K., Leung, H.: A Web-Based Chinese Handwriting Education System with Automatic Feedback and Analysis. In: Liu, W., Li, Q., Lau, R. (eds.) ICWL 2006. LNCS, vol. 4181, pp. 176–188. Springer, Heidelberg (2006)
10. Tsai, W.H., Fu, K.S.: Error-correcting isomorphisms of attributed relational graphs for pattern analysis. IEEE Trans. SMC 9, 757–768 (1979)
11. Liu, J., Cham, W.K., Chang, M.M.Y.: Online Chinese character recognition using attributed graph matching. IEE Proc. Vis. Image Signal Process. 143, 125–131 (1996)
12. Ambauen, R., Fischer, S., Bunke, H.: Graph edit distance with node splitting and merging and its application to diatom identification. In: Hancock, E.R., Vento, M. (eds.) GbRPR 2003. LNCS, vol. 2726, pp. 95–106. Springer, Heidelberg (2003)
13. Bunke, H.: Error correcting graph matching: on the influence of the underlying cost function. IEEE Trans. Pattern Analysis and Machine Intelligence 21, 917–922 (1999)
14. Messmer, B.T., Bunke, H.: A new algorithm for error-tolerant subgraph isomorphism detection. IEEE Trans. Pattern Analysis and Machine Intelligence 20, 493–504 (1998)
15. Allen, J.F.: Maintaining knowledge about temporal intervals. Communications of the ACM 26(11), 832–843 (1983)
16. Nabil, M., Ngu, A.H.H., Shepherd, J.: Picture similarity retrieval using the 2D projection interval representation. IEEE Trans. Knowledge and Data Engineering 8(4), 533–539 (1996)
17. Freksa, C.: Temporal reasoning based on semi-intervals. Artificial Intelligence 54(1-2), 199–227 (1992)

F. Li et al. (Eds.): ICWL 2008, LNCS 5145, pp. 356–365, 2008.
© Springer-Verlag Berlin Heidelberg 2008

A New Chinese Speech Synthesis Method Apply in Chinese Poetry Learning

Chengsong Zhu and Yaoting Zhu

College of Information Technical Science, Nankai University, No. 94, Weijin Road, Tianjin 300071, China
zhuchs@nankai.edu.cn, zhuyt@nankai.edu.cn

Abstract. This paper describes a new Chinese speech synthesis method applied to Chinese poetry teaching and learning, which focuses on the prosody of syllables and words in Chinese poetry. As the word is a key semantic unit in Chinese poetry, we concentrate on Chinese word prosody and propose a speech synthesis method that considers both the appearance and the essence of the human voice and uses homomorphism analysis over time-domain analysis, so that the synthesized speech between syllables in a Chinese word sounds more natural. The prosody model of the method complies with the reading rules of Chinese poetry, which makes learning Chinese poetry more convenient at the level of acoustic perception.

Keywords: Chinese poetry, Speech Synthesis, PSOLA, homomorphism.

1 Introduction

China is a country of poetry, which reached its golden age in the Tang and Song dynasties. Chinese poetry is a quintessence of Chinese culture. In modern education, poetry teaching and learning still play an important role. In learning Chinese poetry, reciting is the first step, followed by comprehension. Generally we memorize things through our eyes or our ears. Though we often acquire knowledge through visual perception, auditory perception cannot be ignored. Sometimes we read aloud when we learn poetry; in effect, the poem then reaches our brain through our ears. Chinese Text-To-Speech (hereafter TTS) technology can convert Chinese text stored in a computer into speech that can be played through sound devices. Thus Chinese TTS technology can help users learn Chinese poetry more conveniently and efficiently, and can make Chinese poetry teaching more vivid.
In the following, we describe the prosody features of Chinese poetry in Section 2, present the synthesized speech prosody model in Section 3 and the new speech synthesis method in Section 4, and give the experiment and conclusion in Section 5.

2 The Prosody Feature of the Chinese Poetry

Take the Tang poem “床前明月光，疑是地上霜。举头望明月，低头思故乡。” as an example. It is segmented into “床前/明月光/，疑是/地上霜/。举头/望/明月/，低头/思/故乡/。” and its Mandarin Chinese phonetic spelling is “chuang2qian2/ ming2yue4guang1/, yi2shi4/ di4shang4shuang1/, ju3tou2/ wang4/ ming2yue4/, di1tou2/ si1/ gu4xiang1/ .”, in which each number represents the tone of the syllable. Here we call one Chinese character a syllable.

The format of Chinese poetry is strict and regular: a poem usually consists of four or eight sentences, each containing five or seven syllables. A poem conveys abundant information in so few words, so a character or a word can carry several associated meanings. A poem is therefore usually read with one syllable or one word as the reading unit. Poetry is also read somewhat differently from an ordinary news broadcast: it is read more slowly. Thus the rhythm of a word, including a monosyllabic word, becomes more prominent, and word prosody becomes the key prosody of Chinese poetry. We concentrate on the prosody between adjacent syllables within one word in Chinese poetry and propose a prosody model in Sections 3.2 and 3.3.

3 The Chinese Poetry Prosody Model

The prosody of Chinese poetry mainly includes three levels: syllable prosody, word prosody and sentence prosody.

3.1 The Syllable Prosody

Pitch and duration are two important features of syllable prosody. It is well known that Chinese is a tonal language; physically, the tone corresponds to pitch. A tone exists over a certain duration. Mandarin Chinese has four lexical tones: tone1, tone2, tone3 and tone4.
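The tone-numbered spelling used in Section 2 lends itself to simple machine processing. The following is a minimal sketch, not the authors' implementation: it splits a spelled line at the "/" word boundaries and separates each syllable from its tone digit. The function name `parse_line` and the acceptance of tone digit 0 (the neutral tone, which appears as tone0 in the paper's tone-changing table) are assumptions made for illustration.

```python
import re

# One syllable = lowercase letters followed by one tone digit.
# Digits 1-4 are the lexical tones; 0 is assumed here to mark the neutral tone.
SYLLABLE = re.compile(r"([a-z]+)([0-4])")


def parse_line(spelling):
    """Split a tone-numbered spelling into words of (syllable, tone) pairs.

    Words are separated by '/'; punctuation and spaces are ignored.
    """
    words = []
    for chunk in spelling.split("/"):
        pairs = [(syl, int(tone)) for syl, tone in SYLLABLE.findall(chunk)]
        if pairs:  # skip chunks that hold only punctuation or spaces
            words.append(pairs)
    return words


words = parse_line("chuang2qian2/ ming2yue4guang1/, yi2shi4/ di4shang4shuang1/")
for word in words:
    print(word)
```

Each inner list is one reading unit (a word), so the tone combination of adjacent syllables within a word can be read off directly from consecutive pairs.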
The Chinese phonetician Yuanren Zhao introduced a method called five-grade tone-marking [1] to describe the Chinese tones. It gives each lexical tone a corresponding tone type, which conveys the trend of pitch change over the duration of one syllable. As Figure 1 shows, it divides the vertical pitch axis into five levels. The ratio between adjacent levels is 2^(1/4). The five levels are relative values: if we set the value of level 3 to 1, the value of level 2 is 0.841, and so on. The tone type of tone1 is defined as “555”, which gives the relative pitch level at the start, middle and end time points of the syllable duration; adjacent pitch points are connected by a straight line. In the same way, the tone type of tone2 is defined as “345”, tone3 as “214” and tone4 as “531”. Once an absolute average pitch value and a duration are given for a syllable, the model yields its pitch value at any time point within the duration.

[Fig. 1. Five-grade tone-marking. Vertical axis: relative pitch, levels 1 to 5; horizontal axis: duration; one contour each for tone1 to tone4. Relative ratios of levels 1 to 5: 0.707, 0.841, 1, 1.189, 1.414.]

3.2 The Word Prosody

The word prosody mainly includes tone changing and coarticulation [2]. The tone prosody of a Chinese word is based on the tones of its characters. In continuous speech, the tone type of a syllable may change and no longer equal its basic tone type. For example, the word “慷慨” is spelled “kang1kai3”; its tone combination is “1-3” and its original tone type combination is “555-214”, but the tone type combination changes to “555-21”, which sounds more natural according to Mandarin Chinese pronunciation habits. We have experimented on all kinds of Chinese tone combinations and give the changed tone type combinations in Table 1.

Table 1. Tone changing

tone i-tone j    1st tone type    2nd tone type
tone1-tone1      555              444
tone1-tone2      555              234
tone1-tone3      555              21
tone1-tone4      555              531
tone1-tone0      555              21
tone2-tone1      234              444
tone2-tone2      345              345
tone2-tone3      345              21
tone2-tone4      234              531
tone2-tone0      345              31
tone3-tone1      311              444
tone3-tone2      311              124
tone3-tone3      234              21
tone3-tone4      211              531
tone3-tone0      311              31
tone4-tone1      531              333
tone4-tone2      531              234
tone4-tone3      531              21
tone4-tone4      531              531
tone4-tone0      531              31

The second feature of word prosody is coarticulation. Coarticulation is a phenomenon that occurs within a word when one syllable is ending and the next syllable is about to start: the way the latter syllable is pronounced affects the pronunciation of the former.

How is the human voice produced? In principle, airflow comes from the lungs and passes through the glottis; if the vocal cords vibrate, a vowel is formed, otherwise a consonant is formed. The airflow then continues through the vocal tract, which is ...
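The pitch model of Section 3.1 and the tone changes of Table 1 can be sketched together in code. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the points of a tone type are assumed to be evenly spaced over the syllable duration (so a two-digit type such as "21" is read as start and end points), and the reference pitch passed in is assumed to be the pitch assigned to level 3, since the text does not say how the average pitch maps onto the five levels.

```python
# Changed tone types copied from Table 1, keyed by (tone i, tone j).
TONE_CHANGE = {
    (1, 1): ("555", "444"), (1, 2): ("555", "234"), (1, 3): ("555", "21"),
    (1, 4): ("555", "531"), (1, 0): ("555", "21"),
    (2, 1): ("234", "444"), (2, 2): ("345", "345"), (2, 3): ("345", "21"),
    (2, 4): ("234", "531"), (2, 0): ("345", "31"),
    (3, 1): ("311", "444"), (3, 2): ("311", "124"), (3, 3): ("234", "21"),
    (3, 4): ("211", "531"), (3, 0): ("311", "31"),
    (4, 1): ("531", "333"), (4, 2): ("531", "234"), (4, 3): ("531", "21"),
    (4, 4): ("531", "531"), (4, 0): ("531", "31"),
}


def level_ratio(level):
    """Relative pitch of a level: level 3 is 1, adjacent levels differ by 2^(1/4)."""
    return 2.0 ** ((level - 3) / 4.0)


def pitch_at(tone_type, t, duration, ref_pitch_hz):
    """Pitch at time t (0 <= t <= duration) for a tone type such as '214'.

    Assumptions: the points of the tone type are evenly spaced over the
    duration, adjacent points are joined by straight lines, and
    ref_pitch_hz is the pitch of level 3.
    """
    ratios = [level_ratio(int(c)) for c in tone_type]
    if len(ratios) == 1:
        return ref_pitch_hz * ratios[0]
    # Position along the contour, measured in segments between points.
    pos = min(max(t / duration, 0.0), 1.0) * (len(ratios) - 1)
    i = min(int(pos), len(ratios) - 2)
    frac = pos - i
    return ref_pitch_hz * (ratios[i] + (ratios[i + 1] - ratios[i]) * frac)


# Example: the word spelled "kang1kai3" has tone combination 1-3.
first, second = TONE_CHANGE[(1, 3)]
print(first, second)
print(pitch_at(second, 0.0, 2.0, 200.0), pitch_at(second, 2.0, 2.0, 200.0))
```

Interpolating in the pitch domain follows the "connect adjacent points with a line" description literally; interpolating in the logarithmic (level) domain would be an equally plausible reading and would give slightly different values between the points.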

Date posted: 05/07/2014, 09:20