
Modeling the prosody of Vietnamese language for speech synthesis



MINISTRY OF EDUCATION AND TRAINING
HANOI UNIVERSITY OF TECHNOLOGY

Thesis for the degree of MASTER OF SCIENCE

Modeling the prosody of Vietnamese language for speech synthesis

Speciality: “Information processing and Communication”
Code: 23.04.3898

MẠC ĐĂNG KHOA

Supervisor: Prof. PHẠM THỊ NGỌC YẾN

Hanoi, 2007

Faculty of Information Technology
International research center of Multimedia Information, Communication and Application

Mạc Đăng Khoa - Master thesis

Acknowledgment

Many people gave me generous help and inspiration during my time as a master student. First, I would like to express my deep respect and gratitude to my supervisors, Dr. Eric Castelli and Prof. Phạm Thị Ngọc Yến. Thank you very much for orienting and guiding my research in the speech processing domain, and for all your useful advice, your honest criticism and your patience during my master research.

Special thanks also go to Mrs. Geneviève Caelen-Haumont, to the PhD students Trần Đỗ Đạt and Vũ Minh Quang, and to all members of MICA’s speech group. I could not have completed this thesis without your support. Thank you all for your suggestions and your sincere remarks on the whole of my research.

I would like to thank Ms. Đoàn Thị Ngọc Hiền, who guided me in recording the corpus. I would also like to thank the many MICA members who spent much of their time recording and testing for my research.

I am grateful to Prof. Nguyễn Trọng Giảng and MICA’s directorate for providing me with the best possible conditions while I worked at the International Research Center MICA.

Finally, I owe a great deal to my parents and my sister for their continued support. I also give very special thanks to my girlfriend for her constant encouragement, which gave me strength and motivation in my work and in my life.

Abstract

A Text-To-Speech (TTS) system is a computer system that produces speech from text. In a TTS system, the naturalness of the produced speech depends greatly on the variation of pitch, duration and energy during speaking. We call this the “prosody controlling ability”. A TTS system with good prosody controlling ability can simulate human speech prosody appropriate to the speaking context. In tonal languages such as Vietnamese, the prosody of an utterance results from the combination of two components: "micro-prosody", corresponding to the tone of each syllable in a sentence, and "macro-prosody", corresponding to the whole sentence.

The main goal of this thesis is to model the characteristics of Vietnamese prosody for speech synthesis. It focuses on the influence of macro-prosody on micro-prosody in three types of sentence: assertive, interrogative and imperative. The first task is to set up a “prosody corpus” and extract all possible prosody parameters. Based on the extracted data, we defined seventy-two simple prosody patterns for Vietnamese syllables in the three types of sentence. These patterns were then applied to synthesize some simple sentences. Finally, perception experiments were carried out to evaluate the synthesized sentences. The results show that the proposed patterns can be applied successfully to generate the prosody of simple sentences.

This is our preliminary work on Vietnamese prosody, and it considers only the sentence type and the position of the syllable in the sentence. In the future, we expect to continue this research with more factors of Vietnamese prosody, to improve our patterns and to apply them in a Vietnamese TTS system.

List of Figures

Figure 1-1: Category of methods for predicting syllable duration [6]
Figure 2-1: Example of the contours of six tones, as described in [21]
Figure 2-2: The shape of Tone with female and male voice [18]
Figure 2-3: The shape of Tone with female and male voice [18]
Figure 2-4: The shape of Tone with female and male voice [18]
Figure 2-5: The shape of Tone with female and male voice [18]
Figure 2-6: The shape of Tone with female and male voice [18]
Figure 2-7: The shape of Tone 5b with female and male voice [18]
Figure 2-8: The shape of Tone with female and male voice [18]
Figure 2-9: The shape of Tone 6b with female and male voice [18]
Figure 2-10: Sentence classification by structure [20]
Figure 2-11: The sentences “Lan thích ăn cơm không” in
Figure 2-12: The sentences “Bảo cố gắng tập đi” in
Figure 2-13: The sentences “Tân bỏ chứ” in
Figure 2-14: The differences of F0 contour between Assertive and Interrogative sentence [16]
Figure 3-1: A general function diagram of TTS system [13]
Figure 3-2: Fujisaki model
Figure 3-3: Fujisaki model for tonal language [19]
Figure 3-4: Function diagram of proposal TTS system
Figure 3-5: Prosody generation module
Figure 4-1: Key-syllable segmentation
Figure 4-2: Extracting F0 contour using PRAAT
Figure 4-3: An example of prosody pattern
Figure 5-1: An example of synthesized non-sense phrase
Figure 5-2: Perception test
Figure 5-3: An example of synthesized multi-type sentences
Figure 5-4: Interface for Perception test
Figure 5-5: Correct recognition rate with tones of last syllable
Figure 5-6: Correct recognition rate (%) with other types of sentences
Figure 5-7: Result comparison of three experiments

List of Tables

Table 1.1: Prosody functions
Table 1.2: Links between levels of representation of prosodic phenomena [13]
Table 1.3: Intonation model classification
Table 2.1: Vietnamese vowels
Table 2.2: Vietnamese consonants
Table 2.3: Arrangement of Vietnamese consonants
Table 2.4: The phonological hierarchy of Vietnamese syllables with total numbers of each phonetic unit [14]
Table 2.5: The six Vietnamese tones
Table 3.1: Comparison between direct pattern and model pattern
Table 4.1: Prosody corpus structure
Table 4.2: Prosody corpus text information
Table 4.3: Recording information of Prosody corpus
Table 5.1: Confusion matrix (in %) for tones with male voice
Table 5.2: Confusion matrix (in %) for tones with female voice
Table 5.3: Confusion matrix (%) of sentence types with male voice
Table 5.4: Confusion matrix (%) of sentence types with female voice
Table 5.5: Test data for Experiment
Table 5.6: Confusion matrix (in %) of sentence types (with male voice)
Table 5.7: Confusion matrix (in %) of sentence types (with female voice)
Table 5.8: Confusion matrix (in %) of sentence types (average of Male and Female)
Table 5.9: Correct recognition rate (%) with other types of sentences
Table 5.10: Result of three experiments

Table of contents

Acknowledgment
Abstract
List of Figures
List of Tables
Table of contents
INTRODUCTION
PROSODY AND PROSODIC MODEL
  1.1 Overview of prosody
    1.1.1 The concept of prosody
    1.1.2 Major components of prosody
    1.1.3 The functions of prosody
    1.1.4 Levels of representation of prosodic phenomena
  1.2 Prosody modeling
    1.2.1 Intonation models
    1.2.2 Duration modeling
    1.2.3 This thesis work approach
VIETNAMESE LANGUAGE AND PROSODY
  2.1 Vietnamese language
    2.1.1 Vietnamese characteristics
    2.1.2 Vietnamese phoneme system
    2.1.3 Syllable structure
  2.2 Vietnamese prosody
    2.2.1 Micro-prosody and tones system in Vietnamese
    2.2.2 Macro-prosody and sentence types in Vietnamese
    2.2.3 Some special phenomena in Vietnamese prosody
TTS SYSTEM AND PROSODY GENERATION
  3.1 An overview of TTS system
  3.2 Prosody generation
    3.2.1 Overview of prosody generation
    3.2.2 From text to prosody
  3.3 Other researches and our proposal
PROSODY PATTERNS EXTRACTION
  4.1 Prosody corpus

Overall, the results of this experiment are better than those of the previous one, which used nonsense sentences (50% correct). In all types of sentence,
the correct recognition rate in this experiment is approximately 10% higher than in the previous one. This can be explained as follows: in the first experiment we used the same tone for all syllables in the sentence but did not simulate co-articulation phenomena, so the synthesized sentences were very different from natural speech. In the test sentences of Experiment 2, every syllable except the final one was in the "non-tone". Co-articulation effects are therefore minimized, and the sentences sound more natural.

Compared with Experiment M, the result of this experiment is better for assertive sentences but worse for interrogative sentences. The test sentences in Experiment M were synthesized by simulating the prosody of actual sentences, which is why they could be more natural than the test sentences in Experiment 2, which used our proposed prosody patterns. The worse result in this experiment is therefore understandable and acceptable.

In this chapter, we have presented some experiments to evaluate our prosody patterns. These patterns are currently very simple: they are based only on the position of the syllable in the sentence and the type of the containing sentence. However, the experimental results show that the proposed patterns can be applied to predict the prosody of simple sentences.

Conclusion and Perspectives

The subject of this thesis is “Modeling the prosody of Vietnamese language for speech synthesis”. As we all know, finding a complete model of Vietnamese prosody is currently a large and complex problem, which requires much research in linguistics, acoustics and speech processing. Therefore, within the scope of a master thesis, we have studied some basic factors of Vietnamese prosody, characterized them and tried to apply them to speech synthesis.

In chapter 3, we proposed a method and a structure for the prosody generation module. The simplest way to generate the prosody of a whole sentence is to concatenate the direct prosody patterns of its syllables. Hence, in chapter 4, we set up a prosody corpus, analyzed it and proposed 72 prosody patterns for syllables, corresponding to the initial, middle and final positions of the syllable in three types of sentence (assertive, interrogative and imperative). Although these patterns are not enough to model every case of Vietnamese prosody, they can be applied to generate the prosody of simple sentences. This was shown by the results of the experiments in the preceding chapter. In the perception tests, the listeners correctly identified 64% of assertive sentences, 60% of interrogative sentences and 50% of imperative sentences.

These are our first prosody patterns for Vietnamese syllables. They were extracted from a small prosody corpus and consider only three factors: the tone, the position of the syllable and the sentence type. In future work, we expect to improve these patterns by setting up and analyzing a larger corpus. Additionally, we will study other factors, such as syntactic and lexical-meaning factors or glottalization and co-articulation phenomena, and integrate them into our patterns. Moreover, these prosody patterns can be represented not only as absolute values of F0, duration and intensity but also as a set of parameters of another prosody model (such as the Fujisaki model). In that way, we will be able to generate the prosody description in other types of prosodic model and apply it to other types of TTS system.

The following is a summary of the work done in this master thesis, its limitations and future approaches:

• The work we have done
  - Proposed a simple method of prosody generation and a structure for the prosody generation module in a TTS system.
  - Set up a corpus for researching prosody.
  - Proposed 72 prosody patterns for Vietnamese syllables.
  - Applied the proposed prosody patterns to synthesize some simple sentences and evaluated these sentences by perception tests.

• The limitations
  - The corpus is small.
  - The proposed patterns for syllables are simple and consider only three factors: the tone, the position of the syllable and the sentence type.
  - Intensity control in speech synthesis was done manually and was not very accurate.
  - There were few listeners in the perception tests (5 males and females).

• Future approach
  - Set up a larger corpus that covers other factors and phenomena in Vietnamese prosody.
  - Define prosody patterns from that corpus.
  - Study other prosodic models and convert these patterns into the parameters of those models.
  - Develop a complete prosody generation module and integrate it into a TTS system.

The last words

This work was carried out at the MICA center, and I expect that its results can be applied in the MICA speech synthesis system. It is also my preliminary work in the domain of speech processing. With the guidance of my supervisors and the instruction and support of the others in MICA’s speech processing group, my knowledge and skills in speech processing have steadily improved. That knowledge is the necessary background for my future studies and research. Once again, thank you all very much!
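The concatenation scheme summarised in the conclusion (one direct pattern per syllable, selected by tone, position and sentence type) can be illustrated with a minimal sketch. Everything below is hypothetical and is not the thesis implementation: the names `ProsodyPattern`, `position_of`, `generate_prosody` and `build_placeholder_table` are invented for illustration, the eight tone labels are taken from the corpus codes in Appendix A (1, 2, 3, 4, 5, 5b, 6, 6b), and the pattern values are flat placeholders rather than the measured data of Appendix B. Treating a one-syllable sentence as "initial" is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

# Labels assumed from the corpus codes in Appendix A; 8 x 3 x 3 = 72 patterns.
TONES = ("1", "2", "3", "4", "5", "5b", "6", "6b")
POSITIONS = ("initial", "middle", "final")
SENTENCE_TYPES = ("assertive", "interrogative", "imperative")

PatternKey = Tuple[str, str, str]  # (tone, position, sentence type)


@dataclass(frozen=True)
class ProsodyPattern:
    """Hypothetical container for one direct prosody pattern."""
    f0_hz: Tuple[float, ...]         # F0 contour samples
    intensity_db: Tuple[float, ...]  # intensity contour samples
    duration_ms: float               # syllable duration


def position_of(index: int, n_syllables: int) -> str:
    """Map a syllable index to initial / middle / final.

    Assumption: a one-syllable sentence is treated as 'initial'.
    """
    if index == 0:
        return "initial"
    if index == n_syllables - 1:
        return "final"
    return "middle"


def generate_prosody(
    tones: Sequence[str],
    sentence_type: str,
    patterns: Dict[PatternKey, ProsodyPattern],
) -> List[ProsodyPattern]:
    """Concatenate the stored pattern of each syllable, in sentence order."""
    if sentence_type not in SENTENCE_TYPES:
        raise ValueError(f"unknown sentence type: {sentence_type}")
    n = len(tones)
    return [
        patterns[(tone, position_of(i, n), sentence_type)]
        for i, tone in enumerate(tones)
    ]


def build_placeholder_table() -> Dict[PatternKey, ProsodyPattern]:
    """Fill all 72 slots with the same flat placeholder (not measured data)."""
    flat = ProsodyPattern(
        f0_hz=(150.0,) * 20,
        intensity_db=(70.0,) * 20,
        duration_ms=200.0,
    )
    return {
        (tone, pos, stype): flat
        for tone in TONES
        for pos in POSITIONS
        for stype in SENTENCE_TYPES
    }


if __name__ == "__main__":
    table = build_placeholder_table()
    sentence = generate_prosody(["1", "2", "5b"], "interrogative", table)
    print(len(table), len(sentence))
```

A lookup table keyed by the three factors keeps the sketch close to the "direct pattern" idea; factors the thesis lists as future work, such as co-articulation, would need extra keys or a smoothing step between neighbouring syllables.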
References

[1] Chilin Shih, Greg Kochanski, “Prosody and Prosodic Models”, www.prosodies.org
[2] Do T.D., Tran T.H., et al. (1998), “Intonation system - A survey of twenty languages”, chap. 22, Cambridge University Press.
[3] Dung Tien Nguyen, Hansjörg Mixdorff, Mai Chi Luong, Huy Hoang Ngo, Bang Kim Vu (2005), “Fujisaki Model based F0 contours in Vietnamese TTS”, Eurospeech proceedings.
[4] H. Fujisaki, S. Ohno, C. Wang (1974), “A command-response model for F0 contour generation in multilingual speech synthesis”, Journal of Phonetics, vol. 2, pp. 223-232.
[5] Mixdorff H. (1998), “Intonation patterns of German - Model-based quantitative analysis and synthesis of F0 contours”, PhD thesis, TU Dresden.
[6] Mixdorff H. (2001), “An Integrated Approach to Modeling German Prosody”, TU Dresden.
[7] Mixdorff H., Nguyen Hung Bach, Hiroya Fujisaki and Mai Chi Luong (2003), “Quantitative Analysis and Synthesis of Syllabic Tones in Vietnamese”, Eurospeech proceedings.
[8] Nguyen T.T.H. and Boulakia G. (1999), "Another look at Vietnamese intonation", ICPhS'99.
[9] Ninh Khanh Duy (2005), “Characterization of Vietnamese intonation for questions”, Master thesis, Hanoi University of Technology.
[10] Paul Alexander Taylor (1992), “A Phonetic Model of English Intonation”, PhD thesis, University of Edinburgh.
[11] Sami Lemmetty (1999), “Review of Speech Synthesis Technology”, MSc thesis, Helsinki University of Technology.
[12] Thierry Dutoit (1993), “High Quality Text-To-Speech Synthesis of the French Language”, PhD thesis, Faculté Polytechnique de Mons, TCTS Lab, Belgium.
[13] Thierry Dutoit (1997), “An Introduction to Text-to-Speech Synthesis”, Kluwer Academic Publishers.
[14] Tran D.D., Castelli E., et al. (2005), "Influence of F0 on Vietnamese syllable perception", Interspeech.
[15] Tran Do Dat (2003), “Building a large Vietnamese Speech Database”, Master thesis, Hanoi University of Technology.
[16] Vu M.Q., Tran D.D., Castelli E. (2006), “Prosody of Interrogative and Affirmative Sentences in Vietnamese Language: Analysis and Perceptive Results”.
[17] Vu M.Q., Tran D.D. & Castelli E. (2006), "Intonation des phrases interrogatives et affirmatives en langue vietnamienne", JEP2006, XXVIes Journées d’Etude sur la Parole, Manoir de la Vicomté, Dinard, France.
[18] Nguyen Quoc Cuong (2002), “Reconnaissance de la parole en langue Vietnamienne”, Thèse, INPG, UJF Grenoble, France, Juin 2002.
[19] Bạch Hưng Nguyên, Nguyễn Tiến Dũng (2005), "Mô hình Fujisaki áp dụng phân tích điệu tiếng Việt".
[20] Mai Ngọc Chừ, Vũ Đức Nghiệu, Hoàng Trọng Phiến (2005), “Cơ sở ngôn ngữ học tiếng Việt”, NXB Giáo dục.
[21] Nguyễn Hữu Quỳnh (2001), “Ngữ Pháp Tiếng Việt”, Nhà xuất bản Từ điển Bách Khoa, pp. 11-86, Hà Nội.

Appendix A Text for prosody corpus

Code Role Sentences 1_As_In_A 1_As_Mid_A 1_As_Fi_A A A A Bên ta theo địch đến tận Bên địch bị bên ta theo đến tận Bên địch bị đánh bật khỏi bên ta 1_Int_In_A 1_As_In_B Context A B Kết thúc trận đánh, thủ trưởng hỏi: Này cậu Bên ta theo địch đến tận à? Vâng Bên ta theo bên địch đến tận 1_Int_Mid_A 1_As_Mid_B A B Này cậu Bên địch bị bên ta theo đến tận à? Đúng Bên địch bị bên ta theo đến tận 1_Int_Fi_A 1_As_Fi_B A B Này cậu Bên địch bị đánh bật khỏi bên ta? Đúng Bên địch bị đánh bật khỏi bên ta 1_Int_In_B 1_Imp_In_A Context B A Trong trận đánh, cấp hỏi thủ trưởng Bên ta theo anh có cần bám theo địch không? Bên ta theo địch cho tôi! 1_Int_Mid_B 1_Imp_Mid_A B A Địch rút bên ta theo có không anh? À Thế bên ta theo cho tôi! 1_Imp_Fi_A Context A Trong trận đánh, thủ trưởng lệnh: Gọi quân cứu viện bên ta! Nhanh lên ! Code Role Sentence 2_As_In_A 2_As_Mid_A 2_As_Fi_A A A A Trên tà thêu hoa màu đỏ Áo chị tà thêu hoa Áo chị có thêu bông hoa tà 2_Int_In_A 2_As_In_B Context A B Một chị đặt may áo dài Thợ may hỏi: Trên tà thêu không chị? À Trên tà thêu hoa màu đỏ 2_Int_Mid_A 2_As_Mid_B A B Áo chị tà thêu không thế?
À Cái áo tà thêu hoa màu đỏ Mạc Đăng Khoa - 96 - Master thesis 2_Int_Fi_A 2_As_Fi_B A B Áo chị thêu tà ? À Cái áo thêu hoa đỏ tà 2_Int_In_B 2_Imp_In_A B A Trên tà thêu khơng chị? Giống kìa.Trên tà thêu y cho chị! 2_Int_Mid_B 2_Imp_Mid_A B A Áo chị đặt tà thêu khơng chị? Giống lần trước ý Cái áo tà thêu em! 2_Int_Fi_B 2_Imp Fi_A B A Áo chị đặt thêu tà? Giống lần trước Cứ thêu cho chị hoa tà! Code Role Sentence 3_As_In_A 3_As_Mid_A 3_As_Fi_A A A A Trên tã thêu hình bóng Chị nhìn thấy tã thêu hình bóng Chị nhìn thấy hình bóng thêu tã 3_Int_In_A 3_As_In_B Context A B Chồng vợ nói tã em bé: Trên tã thêu hình em ? À Trên tã thêu hình bóng anh 3_Int_Mid_A 3_As_Mid_B A B Em có thấy tã thêu hình khơng ? À Em thấy tã thêu hình bóng anh 3_Int_Fi_A 3_As_Fi_B A B Có hình bóng tã ? Vâng Em thấy có hình bóng tã 3_Int_In_B 3_Imp_In_A Context B A Vợ thêu tã cho con, hỏi chồng Trên tã thêu hình anh? Con thích bóng Trên tã thêu hình bóng đi! 3_Int_Mid_B 3_Imp_Mid_A B A Khơng biết tã thêu anh? Con thích bóng.Theo anh tã thêu bóng em! 3_Int_Fi_B 3_Imp Fi_A B A Anh ơi! Thế thêu tã? Con thích bóng.Thế em thêu cho bóng tã! Code Role Sentence 4_As_In_A 4_As_Mid_A A A Bên tả theo khuynh hướng bảo thủ Đảng trị bên tả theo khuynh hướng bảo thủ Mạc Đăng Khoa - 97 - Master thesis 4_As_Fi_A A Bảo thủ khuynh hướng đảng trị bên tả 4_Int_In_A 4_As_In_B Context A B Hai người bàn trị Bên tả theo khuynh hướng cậu nhỉ? À Bên tả theo khuynh hướng bảo thủ 4_Int_Mid_A 4_As_Mid_B A B Đảng trị bên tả theo khuynh hướng thế? À Đảng trị bên tả theo khuynh hướng bảo thủ 4_Int_Fi_A 4_As_Fi_B A B Anh ủng hộ đảng trị bên tả? À Tôi ủng hộ đảng bên tả 4_Imp_In_A Context A Trong buổi tổng duyệt điễu hành Người huy hô: Bên tả theo sau đội trống! Nhanh lên! 4_Imp_Mid_A A Chú ý Toàn đội bên tả theo sau đội trống! 4_Imp_Fi_A A Tất theo sau đội bên tả ! 
Nhanh lên Code Role Sentence 5_As_In_A A 5_As_Mid_A 5_As_Fi_A A A Đợt phong cấp lần này, lên tá theo anh chẳng khó khăn Anh ta thăng lên tá theo cách Cuối anh thăng lên tá Context 5_Int_In_A A 5_As_In_B B Trong họp bàn phong cấp đơn vị quân đội: Đợt phong cấp lần này, lên tá theo anh có khó khơng? À Lên tá theo tơi chẳng khó khăn 5_Int_Mid_A 5_As_Mid_B A B Thủ trưởng hỏi trường hợp vừa phong lên cấp tá Anh ta lên tá theo định thế? À Anh ta lên tá theo cách chẳng biết 5_Int_Fi_A 5_As_Fi_B A B Sao lại lên tá? Thưa anh, lại lên tá 5_Int_In_B 5_Imp_In_A Context B A Cấp hỏi thủ trưởng Cịn đồng chí Nam, lên tá theo anh có nên không? Nên Lên tá theo người ! 5_Int_Mid_B B 5_Imp_Mid_A A Đồng chí Nam mà phong lên tá theo người có khơng? Được Phong lên tá theo họ đi! Context Mạc Đăng Khoa - 98 - Master thesis 5_Int_Fi_B B 5_Imp_Fi_A A Anh khơng ủng hộ việc đồng chí phong lên tá ? Đúng Các anh đừng có phong lên tá ! Code Role Sentence 6_As_In_A 6_As_Mid_A 6_As_Fi_A A A A Lên tạ theo hướng dẫn cách tập luyện tốt Cách tốt tập lên tạ theo hướng dẫn Tất vận động viên phải tập lên tạ 6_Int_In_A 6_As_In_B Context A B Vận động viên hỏi huấn luyện viên tập lên tạ: Lên tạ theo theo anh có tốt khơng ? Có Lên tạ theo hướng dẫn cách tập luyện tốt 6_Int_Mid_A 6_As_Mid_B A B Theo anh tập lên tạ theo hướng dẫn có tốt khơng ? Có Cách tốt tập lên tạ theo hướng dẫn 6_Int_Fi_A 6_As_Fi_B A B Liệu em có phải tập lên tạ ? Có Mọi vận động viên phải tập lên tạ 6_Int_In_B 6_Imp_In_A B A Lên tạ theo cách anh ? À Lên tạ theo hướng dẫn cho tôi! 6_Int_Mid_B 6_Imp_Mid_A B A Thế em có phải tập lên tạ khơng? Có Cậu tập lên tạ theo ngay! 6_Int_Fi_B 6_Imp_Fi_A B A Thế em phải tập lên tạ? Chứ cịn Cứ theo hướng dẫn lên tạ! Code Role Sentence 5b_As_In_A A 5b_As_Mid_A 5b_As_Fi_A A A Trong thi nhà nông, bên tát theo cách giành thắng lợi Chiến thắng thuộc bên tát theo cách Cuối cùng, phần thắng thuộc bên tát Context 5b_Int_In_A 5b_As_In_B A B Trong thi nhà nông Hai khán giả hỏi nhau: Bên tát theo cách bên anh? 
À Bên tát theo cách bên mặc áo đỏ Mạc Đăng Khoa - 99 - Master thesis 5b_Int_Mid_A 5b_As_Mid_B A B Bên bên tát theo cách anh? À Bên đỏ bên tát theo cách 5b_Int_Fi_A 5b_As_Fi_B A B Trong lần thi này, liệu phần thắng có thuộc bên tát? Có Tơi nghĩ phần thắng thuộc bên tát 5b_Int_In_B Context B 5b_Imp_In_A A Hai anh em ruộng, nói chuyện với nhau: Ơ bố mẹ tát nước Lên tát theo bố mẹ khơng anh? Có Lên tát theo bố mẹ ! 5b_Int_Mid_B B 5b_Imp_Mid_A A 5b_Int_Fi_B B 5b_Imp_Fi_A A Code Role Sentence 6b_As_In_A A 6b_As_Mid_A 6b_As_Fi_A A A Trong thi điêu khắc, bên tạc theo cách hoàn thành tượng sớm Chiến thắng thuộc bên tạc theo cách Cuộc thi làm tượng nhanh diễn bên đúc bên tạc Ơ, bố mẹ tát nước Mình có lên tát theo bố mẹ khơng? Có, làm xong Mình lên tát theo bố mẹ ! Bố mẹ tát nước Hay anh em lên tát? Ừ, xong Nào anh em lên tát ! 6b_Int_In_A 6b_As_In_B A B Trong thi điêu khắc, hai khán giả trò chuyện với nhau: Bên tạc theo cách bên anh? À Bên tạc theo cách bên áo đỏ 6b_Int_Mid_A 6b_As_Mid_B A B Bên bên tạc theo cách thế? À Bên áo đỏ bên tạc theo cách 6b_Int_Fi_A 6b_As_Fi_B A B Theo anh, liệu phần thắng có thuộc bên tạc? Có Tôi nghĩ phần thắng thuộc bên tạc 6b_Int_In_B 6b_Imp_In_A Context B A Trong xưởng điêu khắc, thợ tạc hỏi người thợ cả: Có hai mẫu cũ Tạc theo mẫu anh ? À Tạc theo mẫu cho ! 6b_Int_Mid_B 6b_Imp_Mid_A B A Anh ơi! Lô tượng bên tạc theo mẫu anh ? À Lô bên tạc theo mẫu đi! Context Mạc Đăng Khoa - 100 - Master thesis 6b_Int_Fi_B B 6b_Imp_Fi_A A Vụ tạc tượng phật núi đủ thợ chưa anh? Liệu em có phải lên tạc? Có Cả cậu phải lên tạc! 
B: Datasheet of prosody patterns Female Assertive sentence Male Initial part F0 F0 (Hz) (Semitone) 162.65 8.23 163.03 8.27 162.96 8.25 162.76 8.23 162.37 8.19 162.07 8.15 161.80 8.13 161.84 8.14 161.78 8.13 161.42 8.09 161.55 8.11 161.49 8.10 161.27 8.08 161.07 8.06 160.69 8.03 160.42 8.00 160.02 7.96 159.48 7.91 158.63 7.82 157.04 7.65 Duration(ms) Syllable Voiced part 295.52 18.65 292.33 18.47 289.19 18.28 287.26 18.16 285.73 18.07 284.65 18.01 283.93 17.97 283.23 17.92 282.77 17.90 282.00 17.85 281.85 17.84 282.13 17.85 282.72 17.89 283.01 17.90 283.29 17.92 283.33 17.92 283.32 17.92 282.34 17.85 281.63 17.81 278.94 17.65 Middle part Intensity (dB) 70.99 71.53 71.82 71.92 71.87 71.70 71.48 71.25 71.04 70.84 70.62 70.33 69.94 69.40 68.65 67.65 66.34 64.73 62.83 60.75 149 99 69.11 69.54 69.77 69.89 69.93 69.92 69.85 69.72 69.51 69.22 68.85 68.48 68.13 67.75 67.25 66.54 65.53 64.11 62.21 59.87 Duration Syllable 168 F0 100 F0 F0 (Hz) (Semitone) 180.21 9.54 180.42 9.52 178.30 9.30 177.88 9.27 177.40 9.23 177.01 9.20 176.65 9.16 176.48 9.15 176.58 9.15 176.85 9.16 177.00 9.16 176.94 9.14 176.84 9.11 176.89 9.10 176.78 9.09 176.53 9.07 174.97 8.97 172.82 8.81 171.33 8.68 169.78 8.54 Duration(ms) Syllable Voiced part 264.90 16.55 261.16 16.32 258.42 16.14 257.24 16.07 256.46 16.02 255.35 15.95 254.52 15.89 253.64 15.83 253.43 15.82 253.37 15.82 253.47 15.82 254.04 15.86 254.42 15.89 254.88 15.92 255.35 15.94 255.78 15.97 255.10 15.90 254.54 15.85 252.85 15.74 249.09 15.47 Final part Intensity (dB) 69.90 70.80 71.26 71.43 71.35 71.07 70.70 70.34 70.04 69.77 69.46 69.05 68.53 67.91 67.15 66.18 65.04 63.64 61.75 59.26 177 121 70.87 71.71 72.10 72.21 72.14 71.98 71.78 71.53 71.24 70.96 70.71 70.43 70.02 69.41 68.53 67.29 65.62 63.47 60.98 58.81 Duration Syllable 196 F0 126 F0 F0 (Hz) (Semitone) 153.85 7.28 153.42 7.23 153.01 7.17 152.51 7.11 152.42 7.09 152.38 7.08 152.45 7.09 152.22 7.06 151.97 7.03 151.27 6.96 150.76 6.90 150.17 6.84 149.58 6.77 149.10 6.73 149.37 6.80 
147.78 6.62 146.89 6.52 145.75 6.40 146.23 6.45 146.34 6.47 Duration(ms) Syllable Voiced part 240.01 14.65 236.88 14.42 235.57 14.32 235.05 14.28 235.11 14.28 235.47 14.31 235.45 14.32 235.21 14.30 234.78 14.27 233.65 14.20 232.86 14.13 232.18 14.09 231.20 14.02 229.98 13.93 229.68 13.91 229.24 13.88 228.43 13.82 225.83 13.63 224.13 13.51 225.79 13.63 Intensity (dB) 71.28 71.94 72.14 72.12 71.99 71.84 71.66 71.37 71.01 70.61 70.14 69.45 68.36 66.81 65.00 63.27 61.54 59.80 57.75 55.54 320 210 70.26 72.06 72.57 72.53 72.32 72.10 71.83 71.37 70.83 70.27 69.62 68.81 67.77 66.65 65.43 64.07 62.26 60.17 57.80 55.54 Duration Syllable 361 F0 229 Mạc Đăng Khoa - 101 - Female Interrogative sentence Male Master thesis 181.86 10.18 180.69 10.06 179.54 9.93 179.11 9.89 178.00 9.78 177.18 9.69 176.87 9.65 176.28 9.59 175.96 9.55 175.81 9.53 175.40 9.48 175.06 9.45 174.67 9.41 174.38 9.38 174.17 9.36 173.86 9.33 173.25 9.26 171.56 9.10 170.71 9.00 169.21 8.85 Duration(ms) Syllable Voiced part 312.38 19.63 307.61 19.39 305.55 19.27 302.99 19.14 301.12 19.03 299.87 18.96 298.81 18.90 298.09 18.86 297.63 18.83 297.28 18.81 297.09 18.80 297.07 18.80 297.01 18.80 297.24 18.81 297.68 18.84 298.50 18.88 298.50 18.88 297.57 18.83 296.96 18.79 294.52 18.66 73.56 74.16 74.50 74.69 74.75 74.69 74.51 74.25 73.92 73.55 73.12 72.63 72.05 71.34 70.47 69.36 67.95 66.20 64.18 62.07 132 86 72.34 73.23 73.80 74.14 74.34 74.42 74.44 74.36 74.20 73.94 73.57 73.10 72.51 71.80 70.93 69.83 68.42 66.62 64.38 61.75 Duration Syllable 145 F0 87 172.43 9.23 172.60 9.26 171.89 9.16 170.61 9.00 170.23 8.96 169.71 8.91 169.35 8.87 169.06 8.85 168.96 8.84 168.82 8.82 168.66 8.80 168.54 8.79 168.47 8.78 168.31 8.77 167.87 8.72 167.13 8.66 166.43 8.59 166.45 8.59 165.36 8.47 163.67 8.29 Duration(ms) Syllable Voiced part 298.25 18.82 294.71 18.63 292.32 18.49 290.69 18.40 289.01 18.29 287.49 18.21 286.66 18.16 285.86 18.11 285.43 18.09 285.56 18.09 285.83 18.11 286.29 18.14 286.90 18.18 287.53 18.22 288.06 18.25 
288.71 18.29 289.32 18.33 289.54 18.34 288.86 18.30 286.81 18.18 72.50 73.09 73.25 73.25 73.17 73.00 72.76 72.52 72.29 72.07 71.82 71.52 71.08 70.46 69.66 68.66 67.33 65.51 63.02 59.91 185 127 70.57 71.83 72.73 73.36 73.78 74.01 74.09 74.06 73.93 73.72 73.42 73.04 72.59 72.07 71.40 70.46 69.14 67.35 65.07 62.32 Duration Syllable 168 F0 106 169.79 8.85 168.78 8.78 167.25 8.63 166.22 8.53 165.54 8.47 165.42 8.46 165.44 8.47 165.22 8.45 165.00 8.43 164.76 8.41 164.93 8.42 165.48 8.47 165.51 8.47 166.50 8.56 166.36 8.53 166.20 8.51 166.44 8.52 164.40 8.25 164.01 8.20 164.35 8.24 Duration(ms) Syllable Voiced part 291.53 18.42 286.30 18.12 283.53 17.95 281.86 17.86 281.04 17.81 280.33 17.77 280.58 17.79 280.23 17.77 280.17 17.76 280.29 17.77 281.01 17.80 282.34 17.88 283.69 17.96 285.85 18.08 288.17 18.21 288.25 18.22 291.93 18.42 291.64 18.40 292.02 18.41 290.68 18.33 71.48 72.34 72.38 71.95 71.32 70.92 70.64 70.34 70.14 69.99 69.74 69.22 68.35 67.23 66.13 65.13 64.03 62.55 60.67 58.35 281 187 70.55 72.38 73.19 73.44 73.45 73.27 73.00 72.61 72.13 71.63 71.08 70.35 69.49 68.61 67.63 66.31 64.50 62.22 59.60 57.14 Duration Syllable 281 F0 199 Mạc Đăng Khoa - 102 - Female Imperative sentence Male Master thesis 178.07 9.72 177.27 9.64 176.55 9.58 175.76 9.50 174.98 9.42 174.36 9.35 174.08 9.33 173.94 9.32 173.88 9.32 173.76 9.31 173.48 9.28 173.27 9.26 173.16 9.24 173.06 9.23 172.97 9.23 172.81 9.21 172.68 9.20 172.54 9.18 172.31 9.16 171.42 9.07 Duration(ms) Syllable Voiced part 269.43 16.50 266.25 16.34 263.43 16.18 261.32 16.05 259.70 15.95 258.46 15.87 257.49 15.81 256.97 15.78 256.61 15.75 256.85 15.77 257.22 15.79 257.56 15.81 258.24 15.85 258.67 15.88 259.16 15.90 259.81 15.95 259.58 15.93 259.13 15.90 254.95 15.66 250.14 15.36 74.22 74.47 74.61 74.67 74.69 74.69 74.65 74.55 74.38 74.12 73.77 73.36 72.90 72.36 71.72 70.97 70.08 68.98 67.64 66.00 129 75 71.84 72.66 73.17 73.44 73.54 73.50 73.37 73.16 72.89 72.54 72.11 71.62 71.05 70.38 69.54 68.46 67.05 65.35 63.37 
61.20 Duration Syllable 0.154 F0 0.092 166.28 8.52 165.50 8.43 165.40 8.41 164.87 8.35 164.13 8.26 162.93 8.11 161.65 7.95 161.41 7.94 162.59 8.14 162.85 8.20 162.68 8.18 162.43 8.15 162.09 8.11 162.02 8.10 161.85 8.09 161.67 8.07 161.44 8.04 161.17 8.01 160.71 7.96 159.67 7.86 Duration(ms) Syllable Voiced part 293.55 18.56 289.61 18.34 287.29 18.21 285.88 18.12 284.37 18.03 283.76 17.99 282.70 17.93 281.91 17.88 281.38 17.84 281.21 17.83 281.18 17.83 281.21 17.82 280.92 17.80 281.13 17.81 281.27 17.82 281.49 17.83 281.07 17.80 281.60 17.83 280.33 17.75 280.52 17.76 72.52 73.52 74.14 74.47 74.65 74.74 74.76 74.68 74.53 74.28 73.97 73.59 73.15 72.63 71.97 71.11 69.96 68.41 66.46 64.15 142 96 71.79 72.91 73.65 74.12 74.35 74.36 74.22 74.01 73.75 73.46 73.15 72.81 72.39 71.83 71.10 70.16 68.92 67.31 65.32 63.02 Duration Syllable 0.182 F0 0.110 178.57 9.71 178.89 9.77 178.16 9.70 177.91 9.67 177.80 9.66 177.57 9.63 177.27 9.60 177.12 9.59 177.00 9.59 176.24 9.51 174.95 9.39 173.77 9.28 172.74 9.19 171.43 9.07 169.59 8.89 167.57 8.68 166.24 8.55 165.10 8.43 162.74 8.15 162.07 8.04 Duration(ms) Syllable Voiced part 311.46 19.61 306.69 19.35 304.61 19.23 303.23 19.16 302.40 19.11 302.06 19.09 301.84 19.08 301.71 19.08 301.29 19.05 300.63 19.01 299.39 18.94 298.05 18.86 297.06 18.81 295.21 18.70 295.00 18.69 293.48 18.60 294.19 18.65 292.11 18.52 285.62 18.12 282.03 17.91 75.21 77.01 77.78 77.91 77.65 77.28 76.86 76.38 75.80 75.05 74.21 73.36 72.41 71.22 69.79 68.28 66.72 65.00 62.88 60.61 291 195 72.53 74.05 74.48 74.62 74.50 74.19 73.80 73.49 73.20 72.74 72.22 71.69 71.19 70.59 69.79 68.61 66.97 64.72 61.72 58.59 Duration Syllable 0.317 F0 0.205
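The Appendix B datasheet reports F0 both in Hz and in semitones. As a hedged illustration of how the two scales relate, the sketch below uses the standard logarithmic conversion, 12 semitones per octave relative to a reference frequency. The 100 Hz reference is an assumption made here for illustration only: the extract does not state which reference the thesis used, so these functions are not guaranteed to reproduce the datasheet values. The function names are hypothetical.

```python
import math


def hz_to_semitones(f0_hz: float, ref_hz: float = 100.0) -> float:
    """Convert a frequency in Hz to semitones relative to ref_hz.

    ref_hz = 100.0 is an assumed reference, not one stated in the source.
    """
    if f0_hz <= 0 or ref_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f0_hz / ref_hz)


def semitones_to_hz(semitones: float, ref_hz: float = 100.0) -> float:
    """Inverse conversion: semitones relative to ref_hz back to Hz."""
    if ref_hz <= 0:
        raise ValueError("reference frequency must be positive")
    return ref_hz * 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    # One octave above the reference is exactly 12 semitones.
    print(hz_to_semitones(200.0))
    print(semitones_to_hz(12.0))
```

A semitone scale is convenient when comparing male and female contours, because equal intervals then correspond to equal frequency ratios rather than equal differences in Hz.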

Date posted: 28/02/2021, 00:01
