machine translation to larger corpora

Scientific report: "Scaling Phrase-Based Statistical Machine Translation to Larger Corpora and Longer Phrases" pptx

... arrays to parallel corpora to calculate phrase translation probabilities. 4.1 Applied to parallel corpora: In order to adapt suffix arrays to be useful for statistical machine translation we need ... fre- ... [Figure residue: suffix array 8 3 6 1 9 5 0 4 7 2 with its word-level suffixes in sorted order, e.g. "to aid morocco", "declined to aid morocco", "declined to confirm that spain declined to aid morocco"] ...

Uploaded: 17/03/2014, 05:20

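The first entry's excerpt refers to a word-level suffix array (8 3 6 1 9 5 0 4 7 2). As a minimal sketch, assuming the indexed corpus is the ten-word sentence "spain declined to confirm that spain declined to aid morocco" (inferred from the suffixes quoted in the excerpt, not stated in full there), the following Python builds such an array by naive sorting and finds phrase occurrences by binary search. The names `build_suffix_array` and `find_phrase` are illustrative and are not taken from the paper.

```python
def build_suffix_array(words):
    """Return the start positions of all word-level suffixes, sorted
    lexicographically by the suffix they begin (naive construction)."""
    return sorted(range(len(words)), key=lambda i: words[i:])


def find_phrase(words, sa, phrase):
    """Return the sorted corpus positions where `phrase` (a list of words)
    occurs, using two binary searches over the suffix array `sa`."""
    n = len(phrase)

    def prefix(i):
        # First n words of the suffix starting at position i.
        return words[i:i + n]

    # Lower bound: first suffix whose n-word prefix is >= phrase.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(sa[mid]) < phrase:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose n-word prefix is > phrase.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if prefix(sa[mid]) <= phrase:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


# Assumed example corpus (inferred from the excerpt, see the note above).
corpus = "spain declined to confirm that spain declined to aid morocco".split()
sa = build_suffix_array(corpus)

print(sa)                                          # [8, 3, 6, 1, 9, 5, 0, 4, 7, 2]
print(find_phrase(corpus, sa, ["declined", "to"]))  # [1, 6]
```

All occurrences of a phrase sit in one contiguous block of the sorted array, which is why two binary searches are enough to locate them. The naive sort compares whole suffixes and is only meant for illustration; a corpus-scale index would need a more efficient construction.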

Scientific report: "Toward Statistical Machine Translation without Parallel Corpora" ppt

Uploaded: 24/03/2014, 03:20


Document, scientific report: "A Ranking-based Approach to Word Reordering for Statistical Machine Translation" doc

... English-Hindi Statistical Machine Translation. In Proc. IJCNLP. Roy Tromble. 2009. Search and Learning for the Linear Ordering Problem with an Application to Machine Translation. Ph.D. Thesis. Karthik ... converted to dependency trees using Stanford Parser (Marneffe et al., 2006). We convert the tokens in training data to lower case, and re-tokenize the sentences using the same tokenizer from ... sensitive to parser errors; on the other hand, integrated model is forced to use a longer distortion limit which leads to more search errors during decoding time. It is possible to ...

Uploaded: 19/02/2014, 19:20


Document, scientific report: "HINDI TO PUNJABI MACHINE TRANSLATION SYSTEM" pdf

... Case Study of Hindi to Punjabi Machine Translation System. International Journal of Translation. (Accepted, In Print). Goyal V., Lehal G.S. 2011a. Hindi to Punjabi Machine Translation System. ... 2.6.3 Word-to-Word translation using lexicon lookup: If token is not a title or a surname, it is looked up in the HPDictionary database containing Hindi to Punjabi direct word to word translation. ... gives it to translation engine for analysis till the complete input text is read and processed. 2.6 Translation Engine: The translation engine is the main component of our Machine Translation...

Uploaded: 20/02/2014, 05:20


Document, scientific report: "The Contribution of Linguistic Features to Automatic Machine Translation Evaluation" docx

... according to the automatic metric. For this, we consider in each translation case c, the worse automatic translation t that equals or improves the human-aided translation t_h according to the automatic ... and the Confusion of Tongues: an MT Metric. In Proceedings of the Workshop on MT Evaluation "Who did what to whom?" at Machine Translation Summit VIII, pages 55–59. Christoph Tillmann, Stefan ... Labelled Dependencies in Machine Translation Evaluation. In Proceedings of the ACL Workshop on Statistical Machine Translation, pages 104–111. Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing...

Uploaded: 20/02/2014, 07:20


Document, scientific report: "Name Translation in Statistical Machine Translation Learning When to Transliterate" pptx

... USA me@hal3.name Abstract: We present a method to transliterate names in the framework of end-to-end statistical machine translation. The system is trained to learn when to transliterate. For Arabic to English MT, we developed ... apply it to any base SMT system, and to human translations as well. Our goal in augmenting a base SMT system is to increase this percentage. A secondary goal is to make sure that our overall translation ... transliterator described in section 3 to the tagged items. We limit this transliteration to words that occur up to 50 times in the training corpus for single token names (or up to 100 and 150 times for...

Uploaded: 20/02/2014, 09:20


Document, scientific report: "Segmentation for English-to-Arabic Statistical Machine Translation" ppt

... their effect on translation. We also report on applying Factored Translation Models (Koehn and Hoang, 2007) for English-to-Arabic translation. 2 Previous Work: The only previous work on English-to-Arabic ... techniques. We also report on the use of Factored Translation Models for English-to-Arabic translation. 1 Introduction: Arabic has a complex morphology compared to English. Words are inflected for gender, ... attached to the word. So syArp +P:1S recombines to syArty ('my car') 2. Letter Ambiguity: The character 'Y' (Alf mqSwrp) is normalized to 'y'. In the recombination step we need to be able to decide whether...

Uploaded: 20/02/2014, 09:20


Document, scientific report: "Bilingual Terminology Acquisition from Comparable Corpora and Phrasal Translation to Cross-Language Information Retrieval" pptx

... phrasal translation sequences is selected and collated into the final phrasal translation. Re-scoring through a Test Collection: Large-scale test collections could be used to re-score the translation ... comparable corpora. Re-scoring through the Comparable Corpora: Comparable corpora could be considered for the disambiguation of translation alternatives and thus selection of best phrasal translations ... occurred is used to represent the re-scoring factor RF for each sequence of translation candidates. Phrasal translation candidates are sorted in decreasing order by re-scoring factors RF. Finally,...

Uploaded: 20/02/2014, 16:20


Scientific report: "Syntax-to-Morphology Mapping in Factored Phrase-Based Statistical Machine Translation from English to Turkish" ppt

... English-to-Turkish translation, but without using any morphology. 6 Conclusions: We have presented a novel way to incorporate source syntactic structure in English-to-Turkish phrase-based machine translation ... of the factors. We aligned our training sets using only the root factor to conflate statistics from different forms of the same root. The rest of the factors are then automatically assumed to be aligned, ... possible way to address is to use longer distance constraints on the morphological tag factors, to see if we can select them better. 3.2.3 Experiments with higher-order language models: Factored phrase-based...

Uploaded: 07/03/2014, 22:20


Scientific report: "Hindi-to-Urdu Machine Translation Through Transliteration" pptx

... a pivot to translate from English to Urdu. This work also uses transliteration only for the translation of unknown words. Their work cannot be used for direct translation from Hindi to Urdu ... models look for the most probable Urdu token sequence u_1^n for a given Hindi token sequence h_1^n. We assume that each Hindi token is mapped to exactly one Urdu token and that there is no reordering. ... and the translation model. We refer to the words known to the language model and to the translation model as LM-known and TM-known words respectively and to words that are unknown as LM-unknown and...

Uploaded: 07/03/2014, 22:20


Scientific report: "Applying Morphology Generation Models to Machine Translation" docx

... In the second setting, we allow the model to use up to 100 translations, and to automatically select the best number to use. As seen in Table 3, (n=16) translations were chosen for Russian and ... statistical machine translation. In HLT-NAACL. Xiaodong He. 2007. Using word-dependent transition models in HMM based word alignment for statistical machine translation. In ACL Workshop on Statistical Machine ... analysis for statistical machine translation. In HLT-NAACL. Einat Minkov, Kristina Toutanova, and Hisami Suzuki. 2007. Generating complex morphology for machine translation. In ACL. Sonja Nießen...

Uploaded: 08/03/2014, 01:20


Scientific report: "Tailoring Word Alignments to Syntactic Machine Translation" docx

... model that incorporates syntax-based distortion. Lopez and Resnik (2005) considers a simpler tree distance distortion model. Daumé III and Marcu (2005) employs a syntax-aware distortion model ... case of syntactic machine translation, we want to condition on crossing constituent boundaries, even if no constituents are skipped in the process. 4 Experimental Results: To understand the ... French results to the public baseline GIZA++ using the script published for the NAACL 2006 Machine Translation Workshop Shared Task. Similarly, we compared our Chinese results to the GIZA++...

Uploaded: 08/03/2014, 02:21


Scientific report: "A New String-to-Dependency Machine Translation Algorithm with a Target Dependency Language Model" pot

... tree-to-tree models can represent richer structural information, existing tree-to-tree models did not show advantage over string-to-tree models on translation accuracy due to a much larger ... successfully applied to Statistical Machine Translation (Graehl and Knight, 2004; Chiang, 2005; Ding and Palmer, 2005; Quirk et al., 2005). In some language pairs, i.e. Chinese-to-English translation, ... two-step string-to-CFG-tree translation model which employed a syntax-based language model to select the best translation from a target parse forest built in the first step. Only translation probability...

Uploaded: 17/03/2014, 02:20


Scientific report: "Tree-to-String Alignment Template for Statistical Machine Translation" pdf

... as a bottom-up beam search. To translate a source sentence, we employ a parser to produce a parse tree. Moving bottom-up through the source parse tree, we compute a list of candidate translations ... Candidate translations of subtrees are placed in stacks according to the root index set by postorder traversal. A candidate translation contains the following information: 1. the partial translation 2. ... of tree-to-string alignment templates obtained in training. In the following, we formally describe how to introduce tree-to-string alignment templates into probabilistic dependencies to model...

Uploaded: 17/03/2014, 04:20


Scientific report: "Towards a Unified Approach to Memory- and Statistical-Based Machine Translation" pdf

... used to improve the performance of current translation systems. To determine this, we modified an existing decoding algorithm so that it can exploit information specific both to a statistical translation ... statistical TMEM and the translation model. Our experiments show that the automatically derived translation memory can be used within the statistical framework to often find translations of higher ... words into which e is going to be translated. Each English word e is then translated with probability t e into a French word , where ranges from 1 to the number of words (fertility of e ) into which...

Uploaded: 17/03/2014, 07:20


Scientific report: "A PARAMETERIZED APPROACH TO INTEGRATING ASPECT WITH LEXICAL-SEMANTICS FOR MACHINE TRANSLATION" doc

... "A Cross-Linguistic Approach to Machine Translation," Proceedings of the Third International Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages, ... discussed the application of the theoretical foundations to the automatic acquisition of aspectual representations from corpora in order to augment the lexical-semantic representations that ... Spanish only. Consider the following example: (7) (i) John went to the store when Mary arrived. (ii) John had gone to the store when Mary arrived. In Dorr (1991), we discussed the selection...

Uploaded: 17/03/2014, 08:20


Scientific report: "Private Access to Phrase Tables for Statistical Machine Translation" pptx

Uploaded: 23/03/2014, 14:20


Scientific report: "A Word-Class Approach to Labeling PSCFG Rules for Machine Translation" pot

Uploaded: 23/03/2014, 16:20


Scientific report: "A Probabilistic Approach to Syntax-based Reordering for Statistical Machine Translation" potx

Uploaded: 31/03/2014, 01:20


Scientific report: "Machine Translation by Triangulation: Making Effective Use of Multi-Parallel Corpora" pptx

Uploaded: 31/03/2014, 01:20
