... a max-chart-span 15 for the hierarchical phrase-based SMT. We used distortion limits of 12 or 20 for PBMT and a max-chart-span 15 for HPBMT. The parameters for SMT were tuned by MERT using the first ... Feature Forest Models for Probabilistic HPSG Parsing. In Computational Linguistics, Volume 34, Number 1, pages 81–88. Slav Petrov and Dan Klein. 2007. Improved Inference for Unlexical...
Uploaded: 30/03/2014, 17:20
... preprocessing simply included down-casing, separating punctuation from words, and splitting off “’s”. OOV Handling Techniques and their Combination. We compare our baseline system (BASELINE) to each of ... Rate Training for Statistical Machine Translation. In Proc. of ACL. H. Okuma, H. Yamamoto, and E. Sumita. 2007. Introducing translation dictionary into phrase-based SMT. In Proc. of...
Uploaded: 31/03/2014, 00:20
Scientific report: "Efficient Path Counting Transducers for Minimum Bayes-Risk Decoding of Statistical Machine Translation Lattices" pptx
... 2004. Minimum Bayes-risk decoding for statistical machine translation. In Proceedings of Human Language Technologies: The 2004 Annual Conference of the North American Chapter of the Association for ... Efficient minimum error rate training and minimum Bayes-risk decoding for translation hypergraphs and lattices. In Proceedings of the Joint Conference of the 4...
Uploaded: 23/03/2014, 16:20
Document: Scientific report: "Unsupervised Search for The Optimal Segmentation for Statistical Machine Translation" doc
... include the translation model probability in its cost calculation. Specifically, the segmentation model takes into account the likelihood of both sides of the parallel corpus while searching for the optimal ... to the next. The incremental updates are derived from the equations for the count collection and probability estimation steps of the EM algorithm as follows. In...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Segmentation for English-to-Arabic Statistical Machine Translation" ppt
... for the baseline system and a 4-gram for segmented Arabic. The average sentence length is for English, for Arabic, and 10 for segmented Arabic. Since most of the data was originally intended for ... segmented clitics. For example, for the word wlAwlAdh ('and for his kids'), the factored words are AwlAd and w+l+N+P:3MS. We use two language models: a trigram for surface words a...
Uploaded: 20/02/2014, 09:20
Scientific report: "A non-contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation" pptx
... contiguous tree sequence pairs but also from the non-contiguous tree sequence pairs, where a non-contiguous tree sequence is a sequence of sub-trees and gaps. With the help of the non-contiguous tree sequence, ... Figure 1: Rule extraction of tree-to-tree model based on tree sequence pairs algorithm for syntax-based SMT to facilitate the non-contiguous const...
Uploaded: 17/03/2014, 01:20
Scientific report: "Bridging Morpho-Syntactic Gap between Source and Target Sentences for English-Korean Statistical Machine Translation" pot
... preprocessing method for both training and decoding in English-Korean SMT. In particular, we transform a source language sentence by inserting pseudo words and syntactically reordering it to form a target sentence ... sentence length ratio between source sentences and target sentences. The largest gain (+2.39) is achieved when the combined pseudo word insertion (PWI) and...
Uploaded: 17/03/2014, 02:20
Scientific report: "Tree-to-String Alignment Template for Statistical Machine Translation" pdf
... maximum mutual information and minimum classification error training for statistical machine translation. In Proceedings of the Tenth Conference of the European Association for Machine Translation ... avoid notational overhead. Tree-to-String Alignment Template. A tree-to-string alignment template z is a triple $\langle \tilde{T}, \tilde{S}, \tilde{A} \rangle$, which describes the alignment $\tilde{A}$ be˜ = T (F J )...
Uploaded: 17/03/2014, 04:20
Document: Scientific report: "Translation Model Adaptation for Statistical Machine Translation with Monolingual Topic Information" doc
... Semi-supervised Model Adaptation for Statistical Machine Translation. Machine Translation, pages 77-94. Hua Wu, Haifeng Wang and Chengqing Zong. 2008. Domain Adaptation for Statistical Machine Translation with ... out-of-domain translation model for domain-specific translation task. In detail, we build an adapted translation model in the following steps: • Bui...
Uploaded: 19/02/2014, 19:20
Document: Scientific report: "Modified Distortion Matrices for Phrase-Based Statistical Machine Translation" doc
... Chunk-lattices for verb reordering in Arabic-English statistical machine translation. Machine Translation, Published Online. David Chiang. 2005. A hierarchical phrase-based model for statistical machine ... testing for statistical machine translation: Controlling for optimizer instability. In Proceedings of the Association for Computational Linguistics, ACL 2011, Portland,...
Uploaded: 19/02/2014, 19:20
Document: Scientific report: "A Ranking-based Approach to Word Reordering for Statistical Machine Translation" doc
... the word order in target language. To this end, we propose a simple but effective ranking-based approach to word reordering. The ranking model is automatically derived from the word aligned ... baseline system for In order to show whether the improved performance is really due to improved reordering, we would like to measure the reorder performance directly. Reorder...
Uploaded: 19/02/2014, 19:20
Document: Scientific report: "Bilingual Sense Similarity for Statistical Machine Translation" ppt
... part of the given rule, into the sense similarity measure. Experiments. We evaluate the algorithm of bilingual sense similarity via machine translation. The sense similarity scores are used as feature ... vector space model to compute the sense similarity for terms from parallel corpora and applied it to statistical machine translation. We saw that the bilingual sense...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Bucking the Trend: Large-Scale Cost-Focused Active Learning for Statistical Machine Translation" docx
... translation). Figure shows the learning curves for the same systems and selection methods as in Figure, but now the x-axis measures the number of foreign words in the training data. The difference between ... and Anoop Sarkar. 2009. Active learning for multilingual statistical machine translation. In Proceedings of the Joint Conference of the 47th Annual Meeting of th...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Fixed Length Word Suffix for Factored Statistical Machine Translation" pdf
... $P(e \mid f) \sim p_{\text{lm-word}}(e_{\text{word}}) \cdot p_{\text{lm-suffix}}(e_{\text{suffix}}) \cdot \sum_{i=1}^{n} p(e_{\text{word-}j} \,\&\, e_{\text{suffix-}j} \mid f_j) \cdot \sum_{i=1}^{n} p(f_j \mid e_{\text{word-}j} \,\&\, e_{\text{suffix-}j})$, where $p_{\text{lm-word}}$ is the n-gram language model probability over the word surface ... surface forms. Similarly, $p_{\text{lm-suffix}}(e_{\text{suffix}})$ is the language model probability over suffix sequences. $p(e_{\text{word-}j} \,\&\, e_{\text{suffix-}j} \mid f_j)$ and $p(f_j \mid e_{\text{word-}j} \,\&\, e_{\text{suffix-}j})$ are translation probabilit...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Corpus Expansion for Statistical Machine Translation with Semantic Role Label Substitution Rules" doc
... corpus without substitution. In this paper, we call a tuple of semantic frame and semantic role a semantic signature. Two phrase pairs with the same semantic signature are considered valid substitutions ... used directly in translation tasks or be interpolated with baseline phrase tables. SRL Substitution Rules. Swapping phrase pairs that serve as the same semantic role of...
Uploaded: 20/02/2014, 04:20
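
The semantic-signature idea in the last snippet above (a tuple of semantic frame and semantic role, with phrase pairs sharing a signature treated as valid substitutions) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the tuple layout, the frame labels, and the toy phrase pairs are all hypothetical, chosen only to show the grouping step.

```python
from collections import defaultdict
from itertools import permutations


def group_by_signature(phrase_pairs):
    """Group (source, target) phrase pairs by semantic signature.

    Each input item is (frame, role, source_phrase, target_phrase);
    the signature is the (frame, role) tuple. This layout is an
    assumption made for illustration.
    """
    groups = defaultdict(list)
    for frame, role, src, tgt in phrase_pairs:
        groups[(frame, role)].append((src, tgt))
    return dict(groups)


def valid_substitutions(phrase_pairs):
    """Return ordered pairs (a, b) where phrase pair b may replace a.

    Two phrase pairs qualify only when they share a signature.
    """
    subs = []
    for pairs in group_by_signature(phrase_pairs).values():
        subs.extend(permutations(pairs, 2))
    return subs


# Hypothetical toy data, not taken from the paper.
PAIRS = [
    ("give.01", "ARG2", "to the boy", "al niño"),
    ("give.01", "ARG2", "to the teacher", "al profesor"),
    ("give.01", "ARG1", "a book", "un libro"),
]

if __name__ == "__main__":
    for original, replacement in valid_substitutions(PAIRS):
        print(original, "->", replacement)
```

In this toy data only the two ARG2 phrase pairs share a signature, so they are the only ones that can be swapped for each other; the ARG1 pair has no substitution partner.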