Scientific report: "Adaptive Language Modeling for Word Prediction"

... (Companion Volume), pages 61–66, Columbus, June 2008. © 2008 Association for Computational Linguistics. Adaptive Language Modeling for Word Prediction. Keith Trnka, University of Delaware, Newark, DE 19716, trnka@cis.udel.edu. Abstract: We ... of the language model as well as the number of words in the prediction window. We focus on 5-word prediction windows. Many commercial devices...

Uploaded: 31/03/2014, 00:20

Scientific report: "Grounded Language Modeling for Automatic Speech Recognition of Sports Video"

... the grounded language model, which, like traditional language models, encodes the prior probability of words for an ASR system. Unlike traditional language models, however, grounded language models ... necessary to account for the words not included in the grounded language model itself (i.e., stop words, proper names, low-frequency words). The traditional text-only language...

Uploaded: 17/03/2014, 02:20

Scientific report: "Confidence Measure for Word Alignment"

... learned based on word alignment. In this paper we introduce a confidence measure for word alignment, which is robust to extra or missing words in the bilingual sentence pairs, as well as word alignment ... probability of the aligned word pair with the translation probabilities of all the target words given the source word. If a word t occurs N times in the target sentence, for...

Uploaded: 17/03/2014, 01:20

Scientific report: "Probabilistic Document Modeling for Syntax Removal in Text Summarization"

... distribution of the non-stop-words in the input such that for each word w, p(w) = n_w / N, where n_w is the number of occurrences of word w and N is the total number of words in the input. Sentences ... powerful approach to modeling human summarization. Nevertheless, for SumBasic to perform well, stop-words must be removed from the composition scoring function. Because these words...

Uploaded: 07/03/2014, 22:20
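
The excerpt above defines a unigram distribution over the non-stop-words of the input, with each word's probability equal to its count divided by the total word count. A minimal Python sketch of that one formula follows. The stop-word list, the lowercasing, the function name `word_probabilities`, and the sample sentence are all illustrative assumptions and are not taken from the paper.

```python
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = frozenset({"the", "a", "of", "in", "and", "is"})


def word_probabilities(tokens, stop_words=STOP_WORDS):
    """Return p(w) = n_w / N computed over the non-stop-word tokens."""
    # Lowercasing is an assumption; the excerpt does not say how tokens are normalised.
    content = [t.lower() for t in tokens if t.lower() not in stop_words]
    total = len(content)  # N: number of non-stop-word tokens
    if total == 0:
        return {}
    counts = Counter(content)  # n_w for every remaining word w
    return {w: n / total for w, n in counts.items()}


probs = word_probabilities("the cat sat in the hat and the cat ran".split())
```

Here N counts only the tokens that survive stop-word removal, which is one reading of "the non-stop-words in the input"; the full paper may define N differently.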

Scientific report: "Pivot Language Approach for Phrase-Based Statistical Machine Translation"

... performance of statistical translation systems. To solve this problem, this paper proposes a novel method for phrase-based SMT by using a pivot language. To perform translation between languages ... those using pivot language and those using a small bilingual corpus or scarce resources. For the first kind, pivot languages are employed to translate queries in cross-language inf...

Uploaded: 08/03/2014, 02:21

Scientific report: "Randomised Language Modelling for Statistical Machine Translation"

... 1) for j = 1 to qc(x) do; for i = 1 to k do; h_i(x) ← hash of event {x, j} under h_i; BF[h_i(x)] ← 1; end for; end for; end for; return BF. 3.1 Log-frequency Bloom filter. The efficiency of our scheme for ... section of the Europarl (EP) corpus for parallel data and language modelling (Koehn, 2003) and the English Gigaword Corpus (LDC2003T05; GW) for additional language modelli...

Uploaded: 17/03/2014, 04:20
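
The pseudocode fragment above sets, for every n-gram x and every j from 1 to its quantised count qc(x), the k bits selected by hashing the event {x, j}. The Python sketch below illustrates that insertion loop under several assumptions that the excerpt does not confirm: the k hash functions are simulated by salting SHA-256, qc(x) is taken to be 1 + floor(log_base c(x)), and the `lookup` helper (which probes j = 1, 2, ... until some bit is unset) is a hypothetical addition for demonstration.

```python
import hashlib


def _index(event, i, m):
    """Bit position of `event` under the i-th (simulated) hash function."""
    # Salting SHA-256 with i stands in for k independent hash functions (an assumption).
    digest = hashlib.sha256(f"{i}:{event}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % m


def quantise(count, base=2):
    """Assumed quantisation: 1 + floor(log_base(count)), in integer arithmetic."""
    q = 1
    while count >= base:
        count //= base
        q += 1
    return q


def build_filter(counts, m=1024, k=3, base=2):
    """Insert the events {x, j} for j = 1..qc(x) into an m-bit filter BF."""
    bf = bytearray(m)  # one byte per bit, for readability rather than compactness
    for x, c in counts.items():
        for j in range(1, quantise(c, base) + 1):
            for i in range(k):
                bf[_index((x, j), i, m)] = 1
    return bf


def lookup(bf, x, k=3, max_q=32):
    """Hypothetical query: largest j such that events {x, 1..j} all test positive."""
    m = len(bf)
    j = 0
    while j < max_q and all(bf[_index((x, j + 1), i, m)] for i in range(k)):
        j += 1
    return j


bf = build_filter({"the cat": 8, "a dog": 1})
```

Because a Bloom filter has no false negatives, `lookup` never returns less than the quantised count that was inserted; false positives can only push the estimate upward.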

Scientific report: "Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm"

... see Collins (2004) for more discussion. 3 Linear models for speech recognition. We now describe how the formalism and algorithms in section 2 can be applied to language modeling for speech recognition. 3.1 ... The oracle word-error rate for the training set lattices was 12.2%. We also performed trials with 1000-best lists for the same training set, rather than lattices. The orac...

Uploaded: 23/03/2014, 19:20

Scientific report: "Natural Language Input for Scene Generation"

... NATURAL LANGUAGE INPUT FOR SCENE GENERATION. Giovanni Adorni, Mauro Di Manzo, Istituto di Elettrotecnica, University ... coordinates of an indefinite point P are given in the form: COORD K OF P (REFERRED_TO A)=H, where K is a group of possible coordinates, H a set of values for these coordinates and A is the THOUGHT of ... WEIGHT contains information about the range of possible...

Uploaded: 24/03/2014, 05:21

Scientific report: "Semi-Supervised Modeling for Prenominal Modifier Ordering"

... Association for Computational Linguistics: shortpapers, pages 236–241, Portland, Oregon, June 19-24, 2011. © 2011 Association for Computational Linguistics. Semi-Supervised Modeling for Prenominal ... of 200,000 words, with over 50,000 words in the vocabulary of an educated adult (Aitchison, 2003). Up to a quarter of these words may be adjectives, which poses a significant problem fo...

Uploaded: 30/03/2014, 21:20

Scientific report: "Arabic Language Modeling with Finite State Transducers"

... morphologically rich languages such as Arabic, the abundance of word forms resulting from increased morpheme combinations is significantly greater than for languages with fewer inflected forms (Kirchhoff ... or different affixes to create additional word forms. Adding a single clitic to the words in Table 1 will double the number of forms. For instance, the word adrusu, meaning I stu...

Uploaded: 31/03/2014, 00:20
