Scientific report: "Hierarchical Bayesian Language Modelling for the Linguistically Informed"
... 2012. © 2012 Association for Computational Linguistics. Hierarchical Bayesian Language Modelling for the Linguistically Informed. Jan A. Botha, Department of Computer Science, University of Oxford, UK. jan.botha@cs.ox.ac.uk. Abstract: In ... to identify their elementary modelling units. A proper account of compounds in terms of their component words therefore holds the potential o...
Uploaded: 24/03/2014, 03:20
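The snippet above argues for accounting for compounds in terms of their component words as elementary modelling units. A minimal sketch of that idea, assuming a greedy longest-match splitter over a known lexicon (the function name and strategy are illustrative, not the paper's actual hierarchical Bayesian model):

```python
def split_compound(word, lexicon):
    """Greedily split a compound into component words from `lexicon`.

    Tries the longest matching prefix first at each position; falls back
    to returning the word unsplit if no full segmentation exists. This is
    a toy stand-in for identifying elementary modelling units.
    """
    parts, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # longest match first
            if word[i:j] in lexicon:
                parts.append(word[i:j])
                i = j
                break
        else:
            return [word]  # no segmentation found; keep the word whole
    return parts

print(split_compound("raincoat", {"rain", "coat"}))
```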
... repeated for the remaining words in the hypothesis extension. Once the final word in the hypothesis has been processed, the resulting random variable store is associated with that hypothesis. The ... with the new stack element. When a new hypothesis extends an existing hypothesis by more than one word, this process is first carried out for the first new word in the hy...
Uploaded: 20/02/2014, 04:20
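The extension procedure in the snippet above — process each new word in turn, then associate the resulting random-variable store with the new hypothesis — can be sketched as follows. The hypothesis representation and the per-word `process_word` update are hypothetical stand-ins, since the snippet does not show the actual data structures:

```python
def extend_hypothesis(hyp, new_words, process_word):
    """Extend a decoder hypothesis by one or more words.

    Each new word is processed in turn; the final random-variable store
    is associated with the extended hypothesis, leaving the parent
    hypothesis untouched.
    """
    store = dict(hyp["store"])  # copy so the parent hypothesis is unchanged
    for w in new_words:
        store = process_word(store, w)
    return {"words": hyp["words"] + list(new_words), "store": store}

# Toy per-word update: count occurrences of each word in the store.
def bump(store, w):
    return {**store, w: store.get(w, 0) + 1}

h0 = {"words": [], "store": {}}
h1 = extend_hypothesis(h0, ["the", "cat", "the"], bump)
```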
... arbitrary features of the string w together with the acoustic input a. In this paper we restrict Φ(a, w) to only consider the string w and/or the parse tree T(w) for w. For example, Φ(a, w) might ... In the simplest version, these are simply treated like other constituents in the parse tree. However, these can disrupt what may be termed the intended sequence of syntactic...
Uploaded: 20/02/2014, 15:20
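A feature map Φ(a, w) restricted to the string w and its parse tree T(w), as described in the snippet above, might look like this toy sketch. The specific choices (token bigrams plus parse-rule indicators) are assumptions for illustration, not the paper's actual feature set:

```python
def phi(words, tree_rules):
    """Toy feature map over a word string and its parse-tree rules.

    `words` is a list of tokens; `tree_rules` is a list of CFG production
    strings from the parse tree (both hypothetical input encodings).
    Returns sparse features as a dict of counts.
    """
    feats = {}
    # String features: token bigram counts.
    for a, b in zip(words, words[1:]):
        key = f"bigram:{a}_{b}"
        feats[key] = feats.get(key, 0) + 1
    # Tree features: one indicator per production used in the parse.
    for rule in tree_rules:
        key = f"rule:{rule}"
        feats[key] = feats.get(key, 0) + 1
    return feats

f = phi(["the", "dog", "barks"], ["S->NP VP", "NP->DT NN"])
```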
Document — Scientific report: "A Phonotactic Language Model for Spoken Language Identification"
... of the best reported results on the 1996 NIST Language Recognition Evaluation database. 1 Introduction. Spoken language and written language are similar in many ways. Therefore, much of the ... that, although the sounds of different spoken languages overlap considerably, the phonotactics differentiates one language from another. Therefore, one can easily draw the...
Uploaded: 20/02/2014, 15:20
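The phonotactic idea above — that phone-sequence statistics differentiate languages even when their sound inventories overlap — can be illustrated by training a per-language phone bigram model and scoring a test utterance under each. The add-one smoothing and the toy phone inventories are assumptions, not the paper's actual model:

```python
import math
from collections import Counter

def train_phonotactic_lm(phone_seqs, n=2):
    """Estimate add-one-smoothed phone n-gram counts for one language."""
    counts, contexts, vocab = Counter(), Counter(), set()
    for seq in phone_seqs:
        seq = ["<s>"] * (n - 1) + list(seq)
        vocab.update(seq)
        for i in range(n - 1, len(seq)):
            counts[tuple(seq[i - n + 1:i + 1])] += 1
            contexts[tuple(seq[i - n + 1:i])] += 1
    return counts, contexts, len(vocab)

def score(seq, model, n=2):
    """Log-probability of a phone sequence under a smoothed n-gram model."""
    counts, contexts, v = model
    seq = ["<s>"] * (n - 1) + list(seq)
    logp = 0.0
    for i in range(n - 1, len(seq)):
        ng, ctx = tuple(seq[i - n + 1:i + 1]), tuple(seq[i - n + 1:i])
        logp += math.log((counts[ng] + 1) / (contexts[ctx] + v))
    return logp

# Identify the language whose phone LM scores the utterance highest.
models = {
    "en": train_phonotactic_lm([["th", "ih", "s"], ["th", "ae", "t"]]),
    "fr": train_phonotactic_lm([["l", "ax"], ["l", "eh", "s"]]),
}
guess = max(models, key=lambda lang: score(["th", "ih", "s"], models[lang]))
```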
Scientific report: "Mining Association Language Patterns for Negative Life Event Classification"
... can understand the negative life events embedded in the example sentences shown in Table 1. Therefore, this study proposes a framework for negative life event classification. We formulate this ... ⟨w_1, …, w_k⟩. Thus, the task of association pattern mining is to mine the language patterns of frequently associated words from the training sentences. For this purpose...
Uploaded: 08/03/2014, 01:20
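Mining patterns of frequently associated words from training sentences, as described above, can be sketched with a minimal frequent-pair miner. This is a toy stand-in (pair-level only, sentence-level support) for the snippet's association-pattern mining step:

```python
from itertools import combinations
from collections import Counter

def mine_word_pairs(sentences, min_support=2):
    """Mine word pairs that co-occur in at least `min_support` sentences.

    Each sentence contributes each unordered word pair at most once,
    so the count is sentence-level support.
    """
    pair_counts = Counter()
    for sent in sentences:
        for pair in combinations(sorted(set(sent)), 2):
            pair_counts[pair] += 1
    return {p: c for p, c in pair_counts.items() if c >= min_support}

# Toy training sentences for negative life events.
patterns = mine_word_pairs([
    ["lost", "my", "job"],
    ["lost", "job", "today"],
    ["happy", "day"],
])
```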
Scientific report: "A Discriminative Language Model with Pseudo-Negative Samples"
... strengthen the convergence. In each step in the exchange algorithm, the approximate value of the change of the log-likelihood was examined, and the exchange algorithm applied only if the approximate ... data on performance. Figure 6 shows the result of the classification task using SMCM-bi-gram features. The result suggests that the performance could be further improved...
Uploaded: 08/03/2014, 02:21
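The discriminative setup named in the title above — real sentences as positives, sentences sampled from a language model as pseudo-negatives — can be sketched with a bigram-feature perceptron. Here the pseudo-negatives come from a uniform unigram stand-in rather than a trained LM, and the tiny training data is illustrative only:

```python
import random

def pseudo_negative(vocab, length, rng):
    """Draw a pseudo-negative sentence by sampling words independently.

    A uniform unigram draw is an illustrative stand-in; the paper
    samples from a trained language model.
    """
    return [rng.choice(vocab) for _ in range(length)]

def bigram_feats(sent):
    return [f"{a}_{b}" for a, b in zip(sent, sent[1:])]

def train_perceptron(positives, negatives, epochs=5):
    """Binary perceptron: +1 for real sentences, -1 for pseudo-negatives."""
    w = {}
    data = [(s, 1) for s in positives] + [(s, -1) for s in negatives]
    for _ in range(epochs):
        for sent, y in data:
            s = sum(w.get(f, 0.0) for f in bigram_feats(sent))
            if y * s <= 0:  # mistake-driven update
                for f in bigram_feats(sent):
                    w[f] = w.get(f, 0.0) + y
    return w

vocab = ["the", "dog", "runs", "blue"]
sample = pseudo_negative(vocab, 3, random.Random(0))
# Fixed negatives here so the toy run is deterministic.
neg = [["blue", "runs", "the"], ["dog", "blue", "blue"]]
w = train_perceptron([["the", "dog", "runs"]], neg)
real = sum(w.get(f, 0.0) for f in bigram_feats(["the", "dog", "runs"]))
fake = sum(w.get(f, 0.0) for f in bigram_feats(neg[0]))
```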
Scientific report: "BRINGING NATURAL LANGUAGE PROCESSING TO THE MICRO COMPUTER MARKET: THE STORY OF Q&A"
Uploaded: 08/03/2014, 18:20
Scientific report: "Utilizing Dependency Language Models for Graph-based Dependency Parsing Models"
... represent the features based on the DLM. The DLM-based features can capture the N-gram information of the parent-children structures for the parsing model. Then, they are integrated directly in the ... x_{Rm}), where x_{Lk}, …, x_{L1} are the children on the left side from the farthest to the nearest and x_{R1}, …, x_{Rm} are the children on the right side from the nearest t...
Uploaded: 23/03/2014, 14:20
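Capturing N-gram information over a head and its ordered children, as the snippet above describes, might be sketched like this. The child ordering mirrors the snippet (left children farthest-to-nearest, right children nearest-to-farthest), but the feature naming scheme is a hypothetical choice:

```python
def dlm_features(parent, left_children, right_children, n=2):
    """Build N-gram features over a head word and its ordered children.

    `left_children` runs farthest-to-nearest (x_Lk ... x_L1) and
    `right_children` nearest-to-farthest (x_R1 ... x_Rm); the head
    sits between them in the feature sequence.
    """
    seq = list(left_children) + [parent] + list(right_children)
    feats = []
    for i in range(len(seq) - n + 1):
        feats.append("DLM:" + "_".join(seq[i:i + n]))
    return feats

feats = dlm_features("saw", ["yesterday", "John"], ["dog", "quickly"])
```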
Scientific report: "Revisiting Pivot Language Approach for Machine Translation"
... translating the pivot sentence in the source-pivot corpus into the target language with pivot-target translation models. We name it the synthetic method. The working condition with the pivot language approach ... data, the source-pivot RBMT system can be used to translate the test set into the pivot language, which can be further translated into the target language...
Uploaded: 23/03/2014, 16:21
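The two-step pipeline in the snippet above — translate the test set into the pivot language, then translate that into the target language — can be sketched as a cascade. Word-by-word dictionary lookup stands in for the actual RBMT/SMT systems, and the toy German-English-French tables are invented for illustration:

```python
def cascade_translate(sentence, src_pivot, pivot_tgt):
    """Two-step pivot cascade: source -> pivot -> target.

    Each mapping is a toy word dictionary; unknown words pass
    through unchanged. Real systems would translate whole sentences.
    """
    pivot = [src_pivot.get(w, w) for w in sentence]
    return [pivot_tgt.get(w, w) for w in pivot]

# Toy dictionaries: German -> English (pivot) -> French.
de_en = {"hund": "dog", "läuft": "runs"}
en_fr = {"dog": "chien", "runs": "court"}
out = cascade_translate(["hund", "läuft"], de_en, en_fr)
```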
Scientific report: "Continuous Space Language Models for Statistical Machine Translation"
... i.e., the probability should be 1.0 for the next word in the training sentence and 0.0 for all the other ones. The first part of this equation is the cross-entropy between the output and the ... (2) where N is the size of the vocabulary. The input uses the so-called 1-of-n coding, i.e., the ith word of the vocabulary is coded by setting the ith element...
Uploaded: 31/03/2014, 01:20
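The 1-of-n coding and cross-entropy target described above can be made concrete: with a one-hot target (1.0 for the next word, 0.0 elsewhere), the cross-entropy reduces to the negative log-probability the network assigns to that word. The vocabulary size and output distribution below are toy values:

```python
import math

def one_hot(i, n):
    """1-of-n coding: the i-th word is the vector with a 1.0 in slot i."""
    v = [0.0] * n
    v[i] = 1.0
    return v

def cross_entropy(target, output):
    """Cross-entropy between a 1-of-n target and the network's output.

    With a one-hot target this reduces to -log(output[next_word]).
    """
    return -sum(t * math.log(o) for t, o in zip(target, output) if t > 0)

n = 4                            # toy vocabulary size
target = one_hot(2, n)           # the training sentence's next word is word 2
output = [0.1, 0.2, 0.6, 0.1]    # hypothetical network output distribution
loss = cross_entropy(target, output)
```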