Scientific report: "Generalizing Word Lattice Translation" docx
... of word lattices. Figure 1: Three examples of word lattices: (a) sentence, (b) confusion network, and (c) non-linear word lattice. A word lattice ... contiguous words and is translated into a target phrase e_i of one or more contiguous words. Each word in f must be translated exactly once. To generalize this model to word...
Uploaded: 17/03/2014, 02:20
... Example word sense translation output The word sense translation accuracies of the above words are shown in Table 2. The results are highly positive given that those from previous work in word ... bilingual word pair probability from dictionary Pr(cf|ef): Chinese to English frame mapping probability. Pr(cl,cf|el,ef): Chinese to English word sense translation probabilit...
Uploaded: 17/03/2014, 04:20
... compatibility with the word alignment. For a constituent c of t, we consider the set of source words s_c that are aligned to c. If none of the source words in the linear closure s*_c (the words between ... VBN NNS DT AUX The jobs are career oriented . les emplois sont axés sur la carrière . Legend: Correct proposed word alignment consistent with human annotation. Proposed word al...
Uploaded: 08/03/2014, 02:21
Scientific report: "Multi-Engine Machine Translation Guided by Explicit Word Matching" docx
... MEMT combination. 2.1 The Word Alignment Matcher The task of the matcher is to produce a word-to-word alignment between the words of two given input strings. Identical words that appear in both ... word-alignment matcher provides three main benefits. First, it explicitly identifies translated words that appear in multiple MT translations, allowing the MEMT algorithm to reinfo...
Uploaded: 08/03/2014, 04:22
Scientific report: "Rare Word Translation Extraction from Aligned Comparable Documents" doc
... consequence is that in any corpus, there are very few frequent words and many rare words. We propose a novel approach to extract rare word translations from comparable corpora, relying on two main ... co-occurrence computation of functional words or high frequency content words, but we show through observations and experiments that this window size is appropriate for rare words. Both thes...
Uploaded: 17/03/2014, 00:20
Scientific report: "Combining Word-Level and Character-Level Models for Machine Translation Between Closely-Related Languages" ppt
... account: (1) word-internal letter context, and (2) sentence-level word context. We generated a lattice for each Macedonian test sentence, which included the original Macedonian words and the ... decoded the lattice using a Bulgarian language model; this increased BLEU to 22.74. Word-level translation. Naturally, lattice-based transliteration cannot really compete against stan...
Uploaded: 23/03/2014, 14:20
Scientific report: "Pseudo-word for Phrase-based Machine Translation" pot
... without using word aligner before feeding them into PB-SMT pipeline. We call such basic translational unit as pseudo-word to differentiate with word. Pseudo-word is a kind of multi-word expression ... pairs. In SSP, words that should be aligned to “empty” word are programmed to be aligned to real words. Unlike most word alignment methods (Och and Ney, 2003) that add “empt...
Uploaded: 23/03/2014, 16:20
Scientific report: "A Word-Class Approach to Labeling PSCFG Rules for Machine Translation" pot
... distinguishing between one-word, two-word, and multiple-word phrases as follows: Each one-word phrase with tag T simply receives the label T, instead of T-T. Two-word phrases with tag sequence ... Since consecutive words within a rule stem from consecutive words in the training corpus and thus are already consistent, the boundary word tags are more informative than tags of words...
Uploaded: 23/03/2014, 16:20
Scientific report: "Distributed Word Clustering for Large Scale Class-Based Language Modeling in Machine Translation" docx
... Modeling By partitioning all N_v words of the vocabulary into N_c sets, with c(w) mapping a word onto its equivalence class and c(w_i^j) mapping a sequence of words onto the sequence of their ... the history of each n-gram, the sequence of words conditioned on, as well as the predicted word are replaced by their class. Once a partition of the words in the vocabulary is obtained, two-si...
Uploaded: 31/03/2014, 00:20
Scientific report: "Improved Word-Level System Combination for Machine Translation" doc
... the word order between two correct MT outputs may be different and the Levenshtein alignment may not be able to align shifted words in the hypotheses. In (Matusov et al., 2006), different word ... By default, METEOR script counts the words that match exactly, and words that match after a simple Porter stemmer. Additional matching modules including WordNet stemming and synonymy may a...
Uploaded: 31/03/2014, 01:20