Scientific report: "Statistical Machine Translation through Global Lexical Selection and Sentence Reconstruction"
... 2007. © 2007 Association for Computational Linguistics Statistical Machine Translation through Global Lexical Selection and Sentence Reconstruction Srinivas Bangalore, Patrick Haffner, Stephan ... Introduction Machine translation can be viewed as consisting of two subproblems: (a) lexical selection, where appropriate target language lexical items are chosen for eac...
Uploaded: 17/03/2014, 04:20
... lower-dimensional PMTG and a word-to-word translation model is similar in spirit to the way that regular grammars can help to estimate CFGs (Lari & Young, 1990), and the way that simple translation ... statistical machine translation (MTSMT) is an architecture for SMT that revolves around multitrees. Figure 7 shows how to build and use a rudimentary MTSMT system, starting...
Uploaded: 20/02/2014, 16:20
... local word order and idiomatic expressions through the use of phrases, and by the deployment of large n-gram language models to model fluency and lexical choice. 4.1 Question-Answer Translation Our ... 2000; Echihabi and Marcu, 2003; Soricut and Brill, 2006). Information retrieval (IR) is faced by a similar fundamental problem of “term mismatch” between queries and documents....
Uploaded: 08/03/2014, 02:21
Scientific report: "Statistical Machine Translation with Word- and Sentence-Aligned Parallel Corpora"
... between sentence- and word-aligned training material. 5.4 Ratio of word- to sentence-aligned data We also varied the ratio of word-aligned to sentence-aligned data, and evaluated the AER and Bleu ... word-aligned to sentence-aligned data 500 sentence pairs 2000 sentence pairs 8000 sentence pairs 16000 sentence pairs Figure 4: The effect on Bleu of varying the ratio of w...
Uploaded: 31/03/2014, 03:20
Scientific report: "Online Plagiarism Detection Through Exploiting Lexical, Syntactic, and Semantic Information"
... plagiarism by exploiting lexical, syntactic and semantic features that includes duplication-gram, reordering and alignment of words, POS and phrase tags, and semantic similarity of sentences. We establish ... 1995 Word + Sentence Percentage of matching sentences. White and Joy, 2004 Sentence Average overlap ratio of the sentence pairs using 2 pre-defined threshol...
Uploaded: 23/03/2014, 14:20
Document: Scientific report: "Robust Machine Translation Evaluation with Entailment Features"
... that are semantically close but not identical. Banerjee and Lavie (2005) and Chan and Ng (2008) use WordNet, and Zhou et al. (2006) and Kauchak and Barzilay (2006) exploit large collections of automatically-extracted ... (Harabagiu and Hickl, 2006). The relation between textual entailment and MT evaluation is shown in Figure 1. Perfect MT output and the reference translatio...
Uploaded: 20/02/2014, 07:20
Document: Scientific report: "Collaborative Machine Translation Service for Scientific texts"
... translations for the same type of documents. Without appropriate tools, the expertise and time spent for translation activity by the first community is lost and do not benefit to translation requests of ... terminology choices when available, and the token alignment between source and target sentences. • The translation is proposed to the user for post-editing through a rich...
Uploaded: 22/02/2014, 03:20
Scientific report: "Is Machine Translation Ripe for Cross-lingual Sentiment Classification?"
... 429–433, Portland, Oregon, June 19-24, 2011. © 2011 Association for Computational Linguistics Is Machine Translation Ripe for Cross-lingual Sentiment Classification? Kevin Duh and Akinori Fujino and Masaaki ... where labeled translations and test data have some mismatch. Various prior work have achieved positive results using this approach. In this opinion piece, we take a step ba...
Uploaded: 17/03/2014, 00:20
Scientific report: "Improving Machine Translation of Null Subjects in Italian and Spanish"
... corpus and tuned on 2,000 additional sentence pairs, and includes a 3-gram language model. Tables 3 and 4 show percentages of correct, incorrect and missing translations of personal and impersonal ... language pair, composed of 976 sentences for IT→FR and 1,005 sentences for ES→FR. We opted for machine translations also on the target side, rather than human reference tran...
Uploaded: 24/03/2014, 03:20
Document: Scientific report: "Term-list Translation using Mono-lingual Word Co-occurrence Vectors"
... consists of two words, say w1 and w2, and their translation include w11 for w1 and w23 for w2, then (w11, w23) is a translation candidate. If w1 and w2 have two and three alternatives respectively ... possible translation candidates. 2. Disambiguation: In this step, all possible translation candidates are ranked according to a measure that reflects the 'coherenc...
Uploaded: 20/02/2014, 18:20