(Master's thesis) Integrated Linguistic to Statistical Machine Translation (integrating linguistic information into statistical machine translation)

VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY

HOAI-THU VUONG

INTEGRATED LINGUISTIC TO STATISTICAL MACHINE TRANSLATION

MASTER THESIS

HANOI - 2012

Contents

1 Introduction
  1.1 Overview
    1.1.1 A Short Comparison Between English and Vietnamese
  1.2 Machine Translation Approaches
    1.2.1 Interlingua
    1.2.2 Transfer-based Machine Translation
    1.2.3 Direct Translation
  1.3 The Reordering Problem and Motivations
  1.4 Main Contributions of this Thesis
  1.5 Thesis Organization
2 Related works
  2.1 Phrase-based Translation Models
  2.2 Type of orientation phrases
    2.2.1 The Distance Based Reordering Model
  2.3 The Lexical Reordering Model
  2.4 The Preprocessing Approaches
  2.5 Translation Evaluation
    2.5.1 Automatic Metrics
    2.5.2 NIST Scores
    2.5.3 Other scores
    2.5.4 Human Evaluation Metrics
  2.6 Moses Decoder
3 Shallow Processing for SMT
  3.1 Our proposal model
  3.2 The Shallow Syntax
    3.2.1 Definition of the shallow syntax
    3.2.2 How to build the shallow syntax
  3.3 The Transformation Rule
  3.4 Applying the transformation rule into the shallow syntax tree
4 Experiments
  4.1 The bilingual corpus
  4.2 Implementation and Experiments Setup
  4.3 BLEU Score and Discussion
5 Conclusion and Future Work
  5.1 Conclusion
  5.2 Future work
Appendix A. A hand-written set of transformation rules
Appendix B. Script to train the baseline model
Bibliography

List of Tables

1 Corpus statistics
2 Details of our experiments; AR denotes automatically extracted rules, MR denotes handwritten rules
3 Size of phrase tables
4 Translation performance for the English-Vietnamese task

List of Figures

1 The machine translation pyramid
2 The concept architecture of the Moses decoder
3 An overview of preprocessing before training and decoding
4 A pair of source and target language
5 The training process
6 The decoding process
7 A shallow syntax tree
8 The building of the shallow syntax
9 The building of the shallow syntax

Chapter 1. Introduction

In this chapter, we give a brief introduction to Statistical Machine Translation (SMT), state the problem and the motivations of our work, and summarize the main contributions of this thesis. Firstly, we introduce Machine Translation (MT), one of the major applications of Natural Language Processing (NLP), and the statistical approach to it. Then we introduce the main problem addressed in this thesis and our research motivations. The next section describes the main contributions of this thesis. Finally, the organization of the thesis is outlined.

1.1 Overview

In the field of NLP, MT is a major application that automatically translates a sentence from one language into another. MT is very useful in real life: it helps us read websites in foreign languages we do not understand, or understand the content of an advertising board on the street. However, high-quality MT is still a challenge for researchers. Firstly, the difficulty comes from the ambiguity of natural language at various levels. At the lexical level, we have problems with the morphology of words, such as word tense, or with word segmentation in languages such as Vietnamese, Japanese, Chinese or Thai, in which there is no symbol separating two words. For example, in Vietnamese we have the sentence "học sinh học sinh học.": "học" is a verb meaning "study" in English, "học sinh" is a noun meaning a pupil or student, and "sinh học" is a noun meaning the subject biology. At the syntax level,
we have structural ambiguity. For example, in the sentence "the man saw the girl with the telescope", we can understand either that the man used the telescope to see the girl, or that the girl who has the telescope was seen by the man. The ambiguity becomes even harder to resolve at the semantic level. Secondly, Jurafsky and Martin (2009) show that there are differences between any pair of languages, such as differences in structure and in the lexicon, which make MT challenging. In particular, one of the differences between two languages that we focus on in this thesis is the order of words in each language. For example, English is a Subject-Verb-Object (SVO) language, which means the subject comes first, the verb follows the subject, and the object ends the sentence. In the sentence "I go to school", "I" is the subject, the verb is "go to" and the object is "school". Different from English, Japanese is an SOV language, and Classical Arabic is a VSO language.

In the past, rule-based methods were preferred. They built MT systems from rules created manually by humans, so that in a closed domain or restricted area the quality of a rule-based system is very high. However, with the growth of the internet and social networks, we need broad-coverage MT systems, and the rule-based method is no longer suitable. So we need a new way to build MT systems, and statistics has been applied to the field. At the same time, statistical methods were being applied in many other areas, such as automatic speech recognition, and so the idea of using statistics for MT emerged. Nowadays there are MT systems built with statistical methods whose output can be compared with human translation, such as Google Translate (http://translate.google.com).

1.1.1 A Short Comparison Between English and Vietnamese

English and Vietnamese share some similarities: both are written with Latin characters and both have the SVO structure. For example:

en: I go to school
vn: Tôi đi học

But the order of words in an English noun phrase is different from that in a Vietnamese one. For example:

en: a black hat
vn: mũ màu_đen

In the English example above, "hat" is the head of the noun phrase and it stands at the end of the phrase. In Vietnamese, "mũ" is also the head noun, but it is in the middle of the phrase. The reordering of words can be seen in wh-questions, too:

en: what is your job?
vn: công_việc của anh là gì ?
In this example, the English word "what" corresponds to "gì" in Vietnamese. The difference in the position of these two words is easy to see, because English wh-questions follow the S-Structure while the Vietnamese ones follow the D-Structure.

1.2 Machine Translation Approaches

In this section, we give a short overview of the approaches in the field of machine translation. We begin with the most complex method (interlingua) and end with the simplest one (the direct method). From a source sentence, we use some analysis method to obtain a more abstract structure, and then generate the corresponding structure or sentence in the target language. The most abstract structure is the interlingua (figure 1).

[Figure 1: The machine translation pyramid.]

1.2.1 Interlingua

The interlingua systems (Farwell and Wilks, 1991; Mitamura, 1999) are based on the idea of finding a language, called the interlingua, which can represent the source language and is simple enough to generate sentences in the other language. In figure 1 we can see the process of this approach. The analysis step is the understanding process: from the source sentence we use NLP techniques to map the source sentence to a data structure in the interlingua, and we then retrieve the target sentence by the generation process. The problem is how complex the interlingua should be. If the interlingua is simple, we may get many translation options; on the other hand, the more complex the interlingua is, the more effort analysis and generation cost.

1.2.2 Transfer-based Machine Translation

Another approach is to analyze the source sentence into a complex structure (simpler than the interlingua structure), use some transfer rules to obtain a similar structure in the target language, and then generate the target sentence. In this model, MT involves three phases: analysis, transfer and generation. Normally we use all three phases, but sometimes only two of them, for example transferring from the source sentence directly to a structure in the target language and then generating the target sentence. As an example, here is a simple transfer rule for translating a source-language noun phrase into the target language (the example is taken from Jurafsky and Martin (2009)):

[Nominal → Adj Noun] (source language) ⇒ [Nominal → Noun Adj] (target language)

1.2.3 Direct Translation

1.2.3.1 Example-based Machine Translation

Example-based machine translation was first introduced by Nagao (1984). It uses a bilingual corpus with parallel texts as its main knowledge base at run time. The idea behind it is to find patterns in the bilingual corpus and combine them with the parallel text to generate a new target sentence. This method is similar to the translation process in the human brain. The problems of example-based machine translation come from the matching criteria, the length of the fragments, and so on.

1.2.3.2 Statistical Machine Translation

Extending the idea of using statistics for speech recognition, Brown et al. (1990, 1993) introduced a statistical method, a version of the noisy channel model, for MT. Applying the noisy channel to machine translation, the target sentence is transformed into the source sentence by the noisy channel. We can represent MT as the three tasks of the noisy channel:

forward task: compute the fluency of the target sentence;
learning task: from the parallel corpus, estimate the conditional probability between the target sentence and the source sentence;
decoding task: find the best target sentence for a given source sentence.

The decoding task can be represented by this formula:

ê = argmax_e Pr(e | f)

Applying the Bayes rule, we have:

ê = argmax_e [Pr(f | e) · Pr(e)] / Pr(f)

Because the denominator is the same for every candidate e, we have:

ê = argmax_e Pr(f | e) · Pr(e)

Jurafsky and Martin (2000, 2009) define Pr(e) as the fluency of the target sentence, known as the language model; it is usually modeled by an n-gram (Markov) model. Pr(f | e) is defined as the faithfulness between the source and target sentences. We use an alignment model to compute this value, based on the translation unit of the SMT system. Depending on the definition of the translation unit, we have several approaches:

• word based: using the word as the translation unit (Brown et al., 1993)
• phrase based: using the phrase as the translation unit (Koehn et al., 2003)
• syntax based: using a syntactic structure as the translation unit (Yamada and Knight, 2001)
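To make the noisy-channel decomposition above concrete, the following small Python sketch scores a few candidate Vietnamese translations of one English phrase with a toy bigram language model Pr(e) and a toy word-for-word translation model Pr(f | e), then picks the arg max. All probabilities, the candidate list and the helper names are invented for illustration only; a real system estimates these models from the parallel corpus and searches a far larger hypothesis space.

import math

# Toy bigram language model over target (Vietnamese) tokens: Pr(w_i | w_{i-1}).
# Every number here is invented purely for illustration.
BIGRAM_LM = {
    ("<s>", "tôi"): 0.5, ("tôi", "đi"): 0.6, ("đi", "học"): 0.7, ("học", "</s>"): 0.8,
    ("<s>", "đi"): 0.1, ("đi", "tôi"): 0.05, ("tôi", "học"): 0.1,
}

# Toy translation model Pr(f | e), factored word for word (source word | target word).
TM = {
    ("i", "tôi"): 0.9, ("go", "đi"): 0.8, ("school", "học"): 0.4,
    ("i", "đi"): 0.05, ("go", "tôi"): 0.05, ("school", "tôi"): 0.01,
}

FLOOR = 1e-6  # back-off probability for unseen events

def lm_logprob(target_tokens):
    """log Pr(e): fluency of the target sentence under the bigram model."""
    padded = ["<s>"] + target_tokens + ["</s>"]
    return sum(math.log(BIGRAM_LM.get(pair, FLOOR)) for pair in zip(padded, padded[1:]))

def tm_logprob(source_tokens, target_tokens):
    """log Pr(f | e): naive word-for-word faithfulness score."""
    return sum(math.log(TM.get(pair, FLOOR)) for pair in zip(source_tokens, target_tokens))

def decode(source_tokens, candidates):
    """ê = argmax_e Pr(f | e) * Pr(e), over an explicitly enumerated candidate list."""
    return max(candidates, key=lambda e: tm_logprob(source_tokens, e) + lm_logprob(e))

f = ["i", "go", "school"]                       # simplified source ("to" is dropped)
candidates = [["tôi", "đi", "học"], ["đi", "tôi", "học"], ["tôi", "học", "đi"]]
print(decode(f, candidates))                    # -> ['tôi', 'đi', 'học']

In this toy the language model rewards fluent target word order and the translation model rewards faithfulness to the source, which is exactly the division of labour expressed by the formula above.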
1.3 The Reordering Problem and Motivations

In the field of MT, the reordering problem is the task of reordering the words of the target language to obtain the best target sentence. The reordering model is sometimes called the distortion model. Phrase-based Statistical Machine Translation (PBSMT), introduced by Koehn et al. (2003) and Och and Ney (2004), is currently the state-of-the-art model for word choice and local word reordering. Its translation unit is a sequence of words without linguistic information. Therefore, in this thesis we would like to integrate linguistic information such as chunking, a shallow syntax tree and transformation rules, with the particular aim of addressing the global reordering problem.

There are several studies on integrating syntactic resources into SMT. Chiang (2005) shows a significant improvement by keeping the strengths of phrases while incorporating syntax into SMT: he built a kind of syntax tree based on a synchronous Context Free Grammar (CFG), known as hierarchical phrases, used a log-linear model to determine the weights of the extracted rules, and developed a variant of the CYK algorithm for decoding, so that phrase reordering is defined by the synchronous CFG. Some approaches have been applied at the word level (Collins et al., 2005); they are particularly useful for languages with rich morphology and for reducing data sparseness. Other kinds of syntactic reordering methods require parse trees, such as the work in Quirk et al. (2005); Collins et al. (2005); Huang and Mi (2010). The parse tree is more powerful in capturing the sentence structure. However, it is expensive to create tree structures, and building a good-quality parser is also a hard task. All the above approaches require much decoding time, which is expensive. The approach we are interested in here balances the quality of translation against decoding time. Reordering applied as a preprocessing step (Xia and McCord, 2004; Xu et al., 2009; Talbot et al., 2011; Katz-Brown et al., 2011) is very effective, with significant improvements over state-of-the-art phrase-based and hierarchical machine translation systems and with the possibility of evaluating the quality of the reordering model separately.

1.4 Main Contributions of this Thesis

Inspired by this preprocessing approach, we propose a combined approach which preserves the strength of phrase-based SMT in local reordering and decoding time, as well as the strength of integrating syntax into reordering. As a result, we use an intermediate structure between the Part-of-Speech (POS) tags and the full parse tree: shallow parsing. Firstly, we use shallow parsing as preprocessing for both training and testing. Secondly, we apply a series of transformation rules to the shallow syntax tree.
We obtained two sets of transformation rules: the first set is written by hand, and the other is extracted automatically from the bilingual corpus. The experimental results on an English-Vietnamese pair show that our approach achieves significant improvements over MOSES, which is the state-of-the-art phrase-based system.

3.2 The Shallow Syntax

[Figure 6: The decoding process — the source-language sentence goes through Building Shallow Syntactic and Applying Transformation Rules, and the reordered sentence is then decoded by beam search with a language model h1(e) and a translation model h2(e, f), choosing e* = argmax_e Σ_{m=0}^{M} λ_m · h_m(e, f), producing the target-language sentence.]

Figure 7 is an example of the shallow syntax tree. We have an English sentence, "tom 's two blue books are good", with POS and function tags such as NP and CD (we use the Penn Treebank tag set (Marcus et al., 1993)). This example shows that the tree is not a full parse tree: the root of the tree is S, and its last child has the tag JJ, the POS of the word "good".

[Figure 7: A shallow syntax tree for "tom 's two blue books are good".]

3.2.2 How to build the shallow syntax

Figure 8 represents the process of building the shallow syntax tree. Parsing by chunking (Sang, 2000; Tsuruoka and Tsujii, 2005; Tsuruoka et al., 2009) is a method which builds the syntax tree of a sentence by a recursive chunking process. Firstly, the input sentence is chunked by one base chunker into a shallow tree (figure 8a); in fact, this kind of shallow tree is used in several NLP problems such as named entity recognition, base NP detection, etc. After that, a head word is extracted from each chunk (such as the word "books" in figure 8b), and another chunking model is applied to build the next level of the tree. If we looped this process until we reached a tree with only a root node, we would retrieve the full syntax tree. However, we stop at the first level of the loop and obtain a shallow tree whose maximum height is two (figure 8b). Finally, we get the shallow syntax tree of figure 7.

[Figure 8: The building of the shallow syntax. (a) First step in building the shallow syntax tree; (b) second step in building the shallow syntax tree.]

3.3 The Transformation Rule

After building the shallow tree, we can use this syntax tree to reorder the words of the source sentence. Changing the order of some words in the source sentence is the same as changing the order of nodes in the syntax tree, whose nodes are augmented to include a word and a POS label. To do that, we apply a transformation rule, which is represented as (LHS → RHS, RS). In this form, LHS → RHS is an unlexicalized CFG rule and RS is a reordering sequence. LHS is the left-hand-side symbol; it is usually a POS label or a function tag in the grammar of the source language. RHS is the right-hand side of the rule, a sequence of symbols in the grammar of the source language. The rule is called unlexicalized because the RHS never contains a word of the source or target language. Each element of the reordering sequence is the index of a symbol in the RHS. Suppose we have the rule (NP → JJ NN, 1 0), which transforms the rule (NP → JJ NN) in the source language into the rule (NP → NN JJ) in the target language. Note that the reordering sequence is a permutation of n elements, where each element is the index of a symbol in the RHS and n is the length of the RHS. Therefore, for the same CFG rule we can have a number of different transformation rules.

In this thesis, the transformation rules are either written manually or extracted automatically from the bilingual corpus. A set of handwritten rules is provided in Appendix A. To extract transformation rules from the bilingual corpus automatically, we use the method of Nguyen and Shimazu (2006).
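As a concrete illustration of the (LHS → RHS, RS) form, the sketch below encodes the example rule (NP → JJ NN, 1 0) as a plain Python tuple and applies it to one node of a shallow tree represented as nested (label, children) pairs. The tree encoding and the helper names are our own choices for this sketch, not something specified in the thesis.

# A transformation rule (LHS -> RHS, RS): an unlexicalized CFG rule plus a reordering sequence.
# The rule below is the example from the text: (NP -> JJ NN, 1 0).
RULE = ("NP", ("JJ", "NN"), (1, 0))

# An internal node is (label, list_of_children); a leaf is (pos_tag, word).
def node_signature(node):
    """The CFG production rooted at this node, e.g. ('NP', ('JJ', 'NN'))."""
    label, children = node
    return label, tuple(child[0] for child in children)

def apply_rule(node, rule):
    """Reorder the children of one node if the rule's CFG part matches the node."""
    lhs, rhs, reorder_seq = rule
    if node_signature(node) != (lhs, rhs):
        return node                      # no match: keep the original order
    label, children = node
    return label, [children[i] for i in reorder_seq]

np_node = ("NP", [("JJ", "blue"), ("NN", "books")])
print(apply_rule(np_node, RULE))         # ('NP', [('NN', 'books'), ('JJ', 'blue')])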
3.4 Applying the transformation rule into the shallow syntax tree

Algorithm 1 gives the details of how we apply the transformation rules to the shallow syntax tree. We visit each node of the syntax tree and look for a rule whose CFG part matches the structure of the tree at that node, then apply its reordering sequence. If no rule is found, we keep the order of the words in the sentence the same as in the input.

Algorithm 1: Apply transformation rules to the shallow syntax tree
Require: the root of the shallow syntax tree
Require: a set of transformation rules
  if root is not a terminal node then
    x ← the CFG rule of the root node
    for all transformation rules do
      if x matches the transformation rule then
        reorder the children of this root
        break
      end if
    end for
    for all children do
      recurse on the child
    end for
  end if
  return a source sentence whose word order is similar to the target sentence

For example, we have the following pair of phrases:

en: tom 's two blue books
vn: hai cuốn_sách màu_xanh tom

Figure 9a shows the shallow syntax tree of the English phrase as the input of our preprocessing, figure 9b shows the result of reordering at the base-chunk level, and figure 9c shows the result of the reordering process over the whole shallow syntax tree. The new English phrase is "two books blue 's tom", which follows the word order of the target-language phrase.

[Figure 9: The building of the shallow syntax. (a) An input shallow syntax tree; (b) a shallow syntax tree with the nodes reordered at the base-chunk level; (c) a shallow syntax tree with the overall reordering.]
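A possible runnable rendering of Algorithm 1 is given below, reusing the rule tuples and the (label, children) tree encoding from the previous sketch. The traversal code, the yield_words helper and the flat five-symbol rule are illustrative assumptions of ours: the thesis applies rules level by level on the shallow tree (figures 9b and 9c), whereas this sketch reaches the same final word order with a single invented rule.

RULES = [
    # Hypothetical flat rule reproducing the reordering of Figure 9 in one step.
    ("NP", ("NNP", "POS", "CD", "JJ", "NNS"), (2, 4, 3, 1, 0)),
    # The example rule from Section 3.3.
    ("NP", ("JJ", "NN"), (1, 0)),
]

def is_leaf(node):
    label, children = node
    return isinstance(children, str)     # a leaf stores the word itself

def reorder(node, rules):
    """Recursively apply the first matching transformation rule at every node."""
    if is_leaf(node):
        return node
    label, children = node
    signature = (label, tuple(child[0] for child in children))
    for lhs, rhs, seq in rules:
        if signature == (lhs, rhs):
            children = [children[i] for i in seq]
            break                        # at most one rule per node
    return label, [reorder(child, rules) for child in children]

def yield_words(node):
    """Read the reordered source sentence off the leaves of the tree."""
    label, children = node
    if is_leaf(node):
        return [children]
    return [w for child in children for w in yield_words(child)]

# Shallow tree for "tom 's two blue books" (POS tags as in Figure 9a).
tree = ("NP", [("NNP", "tom"), ("POS", "'s"), ("CD", "two"),
               ("JJ", "blue"), ("NNS", "books")])
print(" ".join(yield_words(reorder(tree, RULES))))   # two books blue 's tom

If no rule matches a node, its children are left untouched, so the output defaults to the original word order, exactly as Algorithm 1 requires.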
Chapter 4. Experiments

This chapter describes the bilingual corpus and the experiments that were carried out, and discusses their results.

4.1 The bilingual corpus

Table 1 gives information about the bilingual corpus, which was also used in Nguyen et al. (2007, 2008a,b). Before using this corpus, we did some cleaning, such as converting the encoding of the corpus to UTF-8, tokenizing the English sentences, and lowercasing. As a result, we have about fifty thousand sentence pairs for training, two hundred for tuning and five hundred for testing.

Table 1: Corpus statistics
  Sentence pairs (total): 55341 (Training 54642, Development 200, Test 499)

                     English   Vietnamese
  Training set
    Sentences        54620
    Average length   11.2      10.6
    Words            614578    580754
    Vocabulary       23804     24097
  Development set
    Sentences        200
    Average length   11.1      10.7
    Words            2221      2141
    Vocabulary       825       831
  Test set
    Sentences        499
    Average length   11.2      10.5
    Words            5620      6240
    Vocabulary       1844      1851

4.2 Implementation and Experiments Setup

Following chapter 3, we implemented the idea of Tsuruoka and Tsujii (2005); Tsuruoka et al. (2009) to create the shallow syntax, applied the transformation rules (extracted or handwritten), and obtained a new source sentence whose word order is closer to the target sentence. To run the experiments, we used a server with a 2.66 GHz CPU and GB of memory. We used the SRI language model toolkit (Stolcke, 2002) to train a language model from the training corpus, and the Moses decoder (Koehn et al., 2007) to train a phrase translation model and to decode the source sentences into the target language. Appendix B gives the script used to train the language model and the phrase-based model. After training, we used the development set to tune the parameters.

Table 2 shows the experiments that were carried out. We modified the baseline model of WMT10 (http://www.statmt.org/wmt10/baseline.html) to apply it to the English-Vietnamese phrase-based model, with a different corpus. In table 2, AR means automatic rules, i.e. automatically extracted transformation rules, and MR means manual rules written by us. We also ran experiments with the monotone decoder; in those experiments the distortion model is disabled, so we can estimate the effect of our method with and without the distortion model.

Table 2: Details of our experiments. AR = automatically extracted rules, MR = handwritten rules.
  Baseline: phrase-based system.
  Baseline + MR: phrase-based system, corpus preprocessed with the handwritten rules.
  Baseline + AR: phrase-based system, corpus preprocessed with the automatically learned rules.
  Baseline + AR (monotone): as Baseline + AR, decoded with the monotone decoder.
  Baseline + AR (shallow syntactic): phrase-based system, corpus transformed with the automatic transformation rules on the shallow syntax.
  Baseline + AR (shallow syntactic + monotone): as Baseline + AR (shallow syntactic), decoded with the monotone decoder.

4.3 BLEU Score and Discussion

Table 3 shows the result of applying the transformation rules to preprocess the source sentences: thanks to this method, we obtain more varied phrases in the translation model, which also gives the decoder more options to generate the best translation.

Table 3: Size of phrase tables
  Baseline                                      1237568
  Baseline + MR                                 1251623
  Baseline + AR                                 1243699
  Baseline + AR (monotone)                      1243699
  Baseline + AR (shallow syntactic)             1279344
  Baseline + AR (shallow syntactic + monotone)  1279344

Table 4: Translation performance for the English-Vietnamese task (BLEU %)
  Baseline                                      36.84
  Baseline + MR                                 37.33
  Baseline + AR                                 37.24
  Baseline + AR (monotone)                      35.80
  Baseline + AR (shallow syntactic)             37.66
  Baseline + AR (shallow syntactic + monotone)  37.43

Table 4 reports the BLEU scores (Papineni et al., 2002) of our experiments. As we can see, by applying the preprocessing in both training and decoding, the BLEU score of our best system ("Baseline + AR (shallow syntactic)") increases by 0.82 points over the baseline system. An improvement of 0.82 BLEU points is valuable, because the baseline system is a strong phrase-based SMT system that integrates lexicalized reordering models. The improvement of the "Baseline + AR (shallow syntactic)" system is statistically significant at p < 0.01.
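For readers who want to see what is behind the numbers in Table 4, the following is a compact single-reference sketch of sentence-level BLEU (Papineni et al., 2002): the geometric mean of clipped n-gram precisions multiplied by the brevity penalty. It is a simplified illustration written for this summary, with a crude smoothing constant of our own; it is not the multi-reference, corpus-level script actually used to produce the scores above.

import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Single-reference sentence BLEU: clipped n-gram precisions (n = 1..max_n)
    combined by a geometric mean and scaled by the brevity penalty."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        overlap = sum(min(count, ref[g]) for g, count in cand.items())
        total = max(sum(cand.values()), 1)
        # Crude smoothing so that a missing n-gram does not yield log(0).
        log_precisions.append(math.log(max(overlap, 0.1) / total))
    brevity = min(0.0, 1.0 - len(reference) / len(candidate))
    return math.exp(brevity + sum(log_precisions) / max_n)

hyp = "hai cuốn_sách màu_xanh tom".split()
ref = "hai cuốn_sách màu_xanh tom".split()
print(round(bleu(hyp, ref), 4))   # 1.0 for an exact match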
We also carried out experiments with the handwritten rules. Using the handwritten rules helps the phrase translation model generate somewhat better translations than the automatic rules, and the results also confirm the effect of applying the transformation rules on the shallow syntax, for which the BLEU score is highest. The coverage of the handwritten rules is larger than that of the automatic rules; furthermore, the handwritten rules are made by humans and focus on the common cases. As a result, we obtain sentence pairs with better alignments, and we can then extract more and better phrase pairs.

Finally, the BLEU score with the monotone decoder decreases by about 1% when we preprocess at the base-chunk level only, and decreases slightly with our shallow syntactic preprocessing, because the default reordering model of the baseline system is better than the one in this experiment. (The reordering model here is the distance-based model introduced in Koehn et al. (2003), which is the default reordering model of the Moses decoder (Koehn et al., 2007).)

Chapter 5. Conclusion and Future Work

This last chapter summarizes the main results and contributions of this thesis and mentions directions for future work.

5.1 Conclusion

This thesis introduced the global reordering problem and described a method to address it using linguistic information: the shallow syntax tree and the transformation rules. We apply the transformation rules to reorder the nodes of the shallow syntax tree as a preprocessing step before training and decoding. The BLEU score obtained by our method is about 37.66%. The result is not highly accurate, but it shows that our method can be applied in the future to build a better SMT system. Last but not least, by finishing this thesis we have studied SMT in more depth and reported on the reordering problem.

5.2 Future work

The work done here can be extended in many directions, some of which we summarize below. In chapter 4 we saw the first limitation of our research: the corpus is small. The European corpus project provides a large amount of data (about ten gigabytes of monolingual and five gigabytes of parallel text). We hope that in the future we will have a bigger corpus to prove the effect of our method and to deploy it on other language pairs. Moreover, we need experiments comparing performance and accuracy when constructing the shallow tree versus the full parse tree. In addition, because our method is inspired by Xia and McCord (2004), we also need to compare our method with theirs. We can also improve the method of Nguyen and Shimazu (2006) to extract better transformation rule sets. Last but not least, by using the Moses decoder (Koehn et al., 2007) we "forget" the linguistic information (chunk phrases or shallow syntax) built during preprocessing, so we would like to develop a new decoder in which the linguistic information is integrated into the decoding phase.

Appendix A. A hand-written set of transformation rules

$NP → $DT $CD $CD $JJ $NN 04123
$NP → $DT $JJ $JJ $JJ $NN 04123
$NP → $DT $JJ $JJ $JJ $NN $NN 012345
$NP → $DT $JJ $JJ $NN 0312
$NP → $DT $JJ $JJ $NN $NN 04312
$NP → $DT $JJ $JJ $RP $NN $NN 054312
$NP → $DT $JJ $NN 021
$NP → $DT $JJ $NN $NN 0321
$NP → $PDT $DT $JJ $JJ $NN $NN 015423
$NP → $DT $CD $CD $JJ $NNS 04123
$NP → $DT $JJ $JJ $JJ $NNS 04123
$NP → $DT $JJ $JJ $NNS 0312
$NP → $DT $JJ $NNS 021
$NP → $DT $JJ $JJ $JJ $NN $NNS 012345
$NP → $DT $JJ $JJ $JJ $NNS $NN 012345
$NP → $DT $JJ $JJ $JJ $NNS $NNS 012345
$NP → $DT $JJ $JJ $NN $NNS 04312
$NP → $DT $JJ $JJ $NNS $NN 04312
$NP → $DT $JJ $JJ $NNS $NNS 04312
$NP → $DT $JJ $JJ $RP $NN $NNS 054312
$NP → $DT $JJ $JJ $RP $NNS $NN 054312
$NP → $DT $JJ $JJ $RP $NNS $NNS 054312
$NP → $DT $JJ $NNS $NN 0321
$NP → $DT $JJ $NN $NNS 0321
$NP → $DT $JJ $NNS $NNS 0321
$NP → $PDT $DT $JJ $JJ $NNS $NNS 015423
$NP → $PDT $DT $JJ $JJ $NNS $NN 015423
$NP → $PDT $DT $JJ $JJ $NNS $NNS 015423
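Each line of Appendix A has a simple textual shape: a left-hand-side symbol, an arrow, the right-hand-side symbols, and a digit string giving the reordering sequence. The sketch below shows one way such lines could be parsed into the (LHS, RHS, RS) tuples used in the earlier sketches; the assumption that every reordering index fits in a single digit is ours, inferred from the rules listed above.

def parse_rule(line):
    """Parse e.g. '$NP → $DT $JJ $NN 021' into ('NP', ('DT', 'JJ', 'NN'), (0, 2, 1))."""
    lhs_part, rhs_part = line.split("→")
    tokens = rhs_part.split()
    order = tuple(int(d) for d in tokens[-1])          # assumes single-digit indices
    rhs = tuple(t.lstrip("$") for t in tokens[:-1])
    return lhs_part.strip().lstrip("$"), rhs, order

# Two lines copied from Appendix A.
for line in ["$NP → $DT $JJ $NN 021",
             "$NP → $PDT $DT $JJ $JJ $NN $NN 015423"]:
    print(parse_rule(line))
# ('NP', ('DT', 'JJ', 'NN'), (0, 2, 1))
# ('NP', ('PDT', 'DT', 'JJ', 'JJ', 'NN', 'NN'), (0, 1, 5, 4, 2, 3))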
Appendix B. Script to train the baseline model

ngram-count -order 3 -interpolate -kndiscount \
    -text training-target-file -lm lm-model-output

$SCRIPTS_ROOTDIR/training/train-model.perl \
    -scripts-root-dir $SCRIPTS_ROOTDIR \
    -root-dir $PWD \
    -corpus corpus-file-prefix \
    -f $F -e $E \
    -alignment grow-diag-final-and \
    -reordering msd-bidirectional-fe \
    -lm 0:3:lm-model-output:0

Bibliography

Peter F. Brown, J. Cocke, Stephen A. Della Pietra, Vincent J. Della Pietra, F. Jelinek, John D. Lafferty, R. L. Mercer, and P. S. Roossin. A statistical approach to machine translation. Computational Linguistics, 16(2):79–85, 1990.

Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and R. L. Mercer. The mathematics of statistical machine translation: parameter estimation. Computational Linguistics, 19(2):263–311, 1993.

David Chiang. A hierarchical phrase-based model for statistical machine translation. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 263–270, Ann Arbor, Michigan, June 2005.

David Chiang. Hierarchical phrase-based translation. Computational Linguistics, 33(2):201–228, 2007.

M. Collins, P. Koehn, and I. Kucerová. Clause restructuring for statistical machine translation. In Proceedings of ACL 2005, pages 531–540, Ann Arbor, USA, 2005.

George Doddington. Automatic evaluation of machine translation quality using n-gram co-occurrence statistics. In Proceedings of the Second International Conference on Human Language Technology Research, pages 138–145, San Francisco, CA, USA, 2002. Morgan Kaufmann Publishers Inc.

D. Farwell and Y. Wilks. Ultra: A multi-lingual machine translator. In Proceedings of Machine Translation Summit III, pages 19–24, Washington DC, USA, 1991.

Marcello Federico, Nicola Bertoldi, and Mauro Cettolo. IRSTLM: an open source toolkit for handling large scale language models. In INTERSPEECH, pages 1618–1621, 2008.

Michel Galley and Christopher D. Manning. A simple and effective hierarchical phrase reordering model. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 848–856, Honolulu, Hawaii, October 2008. Association for Computational Linguistics. URL http://www.aclweb.org/anthology/D08-1089.

Michel Galley, Jonathan Graehl, Kevin Knight, Daniel Marcu, Steve DeNeefe, Wei Wang, and Ignacio Thayer. Scalable inference and training of context-rich syntactic translation models. In Proceedings of COLING/ACL 2006, pages 961–968, Sydney, Australia, 2006.
