constructing statistical language models

Scientific report: "Generalized Algorithms for Constructing Statistical Language Models" pdf

... Class-based models. In many applications, it is natural and convenient to construct class-based language models, that is, models based on classes of words (Brown et al., 1992). Such models are ... construction of language models found in new language processing applications and reported experimental results showing their practicality for constructing very large models. These algorithms ... by assigning them some probabilities. There are classical techniques for constructing language models such as n-gram models with various smoothing techniques (see Chen and Goodman (1998) and...

Uploaded: 08/03/2014, 04:22

Document: Scientific report: "Reading Level Assessment Using Support Vector Machines and Statistical Language Models" pdf

... using statistical language models. In this paper, we also use support vector machines to combine features from traditional reading level measures, statistical language models, and other language ... that category or not, rather than constructing a classifier which ranks documents into different categories relative to each other. 4.1 Statistical Language Models Statistical LMs predict the probability ... of syntax. Our approach uses n-gram language models as a low-cost automatic approximation of both syntactic and semantic analysis. Statistical language models (LMs) are used successfully...

Uploaded: 20/02/2014, 15:20

Document: Scientific report: "Generating statistical language models from interpretation grammars in dialogue systems" potx

... decades of statistical language modeling: Where do we go from here? In Proceedings of IEEE:88(8). Rosenfeld R. 2000. Incorporating Linguistic Structure into Statistical Language Models. In ... comparison of in-grammar recognition performance. 3 Language modelling To generate the different trigram language models we used the SRI language modelling toolkit (Stolcke, 2002) with Good-Turing ... move specific statistical language models (DM-SLMs) by using GF to generate all utterances that are specific to certain dialogue moves from our interpretation grammar. In this way we can produce models...

Uploaded: 22/02/2014, 02:20

Scientific report: "Enhancing Language Models in Statistical Machine Translation with Backward N-grams and Mutual Information Triggers" ppt

... of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2):263–311. Eugene Charniak, Kevin Knight, and Kenji Yamada. 2003. Syntax-based language models for statistical machine ... as language models for statistical machine translation. In Proceedings of AMTA. Sylvain Raybaud, Caroline Lavecchia, David Langlois, and Kamel Smaïli. 2009. New confidence measures for statistical ... Computational Linguistics Enhancing Language Models in Statistical Machine Translation with Backward N-grams and Mutual Information Triggers Deyi Xiong, Min Zhang, Haizhou Li Human Language Technology Institute...

Uploaded: 07/03/2014, 22:20

Document: Scientific report: "Incremental Syntactic Language Models for Phrase-based Translation" pptx

... research in statistical machine translation has effectively used n-gram word sequence models as language models. Modern phrase-based translation using large scale n-gram language models generally ... to incorporate large-scale n-gram language models in conjunction with incremental syntactic language models. The added decoding time cost of our syntactic language model is very high. By increasing ... translation model. Instead, we incorporate syntax into the language model. Traditional approaches to language models in speech recognition and statistical machine translation focus on the use of...

Uploaded: 20/02/2014, 04:20

Document: Scientific report: "The impact of language models and loss functions on repair disfluency detection" pptx

... language models trained from text or speech corpora of various genres and sizes. The largest available language models are based on written text: we investigate the effect of written text language models ... differences among the different language models when extended features are present are relatively small. We assume that much of the information expressed in the language models overlaps with the lexical ... information from the external language models by defining a reranker feature for each external language model. The value of this feature is the log probability assigned by the language model to the candidate...

Uploaded: 20/02/2014, 04:20

Document: Scientific report: "An Empirical Investigation of Discounting in Cross-Domain Language Models" ppt

... 2006. MAP adaptation of stochastic grammars. Computer Speech & Language, 20(1):41–68. Jerome R. Bellegarda. 2004. Statistical language model adaptation: review and perspectives. Speech Communication, ... of English Bigrams. Computer Speech & Language, 5(1):19–54. Joshua Goodman. 2001. A Bit of Progress in Language Modeling. Computer Speech & Language, 15(4):403–434. Bo-June (Paul) Hsu ... N-gram Language Models Based on Ordinary Counts. In Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, pages 349–352. Ronald Rosenfeld. 1996. A Maximum Entropy Approach to Adaptive Statistical...

Uploaded: 20/02/2014, 04:20

Document: Scientific report: "Improved Smoothing for N-gram Language Models Based on Ordinary Counts" doc

... Kneser-Ney and those methods. 1 Introduction Statistical language models are potentially useful for any language technology task that produces natural-language text as a final (or intermediate) output. ... perplexity of any known method for estimating N-gram language models. Kneser-Ney smoothing, however, requires nonstandard N-gram counts for the lower-order models used to smooth the highest-order model. ... best approach when language models based on ordinary counts are desired. References Chen, Stanley F., and Joshua Goodman. 1998. An empirical study of smoothing techniques for language modeling....

Uploaded: 20/02/2014, 09:20

Document: Scientific report: "Japanese OCR Error Correction using Character Shape Similarity and Statistical Language Model" pptx

... distance (Wagner and Fischer, 1974) and ngram distance (Angell et al., 1983). Recently, statistical language models and feature-based method have been used for context-sensitive spelling correction, ... times is c/(n + r). 923 Japanese OCR Error Correction using Character Shape Similarity and Statistical Language Model Masaaki NAGATA NTT Information and Communication Systems Laboratories 1-1 ... novel OCR error correction method for languages without word delimiters that have a large character set, such as Japanese and Chinese. It consists of a statistical OCR model, an approximate...

Uploaded: 20/02/2014, 18:20

Document: Scientific report: "Web augmentation of language models for continuous speech recognition of SMS text messages" docx

... 2007. Large language models in machine translation. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning ... Kneser-Ney smoothed n-gram models. IEEE Transactions on Audio, Speech and Language Processing, 15(5):1617–1624. A. Stolcke. 1998. Entropy-based pruning of backoff language models. In Proc. DARPA ... were selected for each language. The adaptation was thought to take place off-line on a server. 3.2.1 Data sets For each language, the adaptation takes place on two baseline models, which are the...

Uploaded: 22/02/2014, 02:20

Scientific report: "The use of formal language models in the typology of the morphology of Amerindian languages" potx

... grammars for modeling agglutination in this language, but first we will present the former class of languages and its acceptor automata. 3.1 Linear context free languages and two-taped nondeterministic ... 2010. © 2010 Association for Computational Linguistics The use of formal language models in the typology of the morphology of Amerindian languages Andrés Osvaldo Porta Universidad de Buenos Aires hugporta@yahoo.com.ar Abstract The ... natural representation in terms of linear context-free languages. 2 Quichua Santiagueño The quichua santiagueño is a language of the Quechua language family. It is spoken in the Santiago del...

Uploaded: 07/03/2014, 22:20

Scientific report: "Faster and Smaller N-Gram Language Models" pptx

... novel language model caching technique that improves the query speed of our language models (and SRILM) by up to 300%. 1 Introduction For modern statistical machine translation systems, language models ... with two different language models. Our first language model, WMT2010, was a 5-gram Kneser-Ney language model which stores probability/back-off pairs as values. We trained this language model on ... and Smaller N-Gram Language Models Adam Pauls Dan Klein Computer Science Division University of California, Berkeley {adpauls,klein}@cs.berkeley.edu Abstract N-gram language models are a major...

Uploaded: 07/03/2014, 22:20

Scientific report: "Randomized Language Models via Perfect Hash Functions" pptx

... 2007. Compressing trigram language models with Golomb coding. In Proceedings of EMNLP-CoNLL 2007, Prague, Czech Republic, June. P. Clarkson and R. Rosenfeld. 1997. Statistical language modeling using ... 2007a. Randomised language modelling for statistical machine translation. In 45th Annual Meeting of the ACL 2007, Prague. D. Talbot and M. Osborne. 2007b. Smoothed Bloom filter language models: Tera-scale ... alignment template approach to statistical machine translation. Computational Linguistics, 30(4):417–449. Andreas Stolcke. 1998. Entropy-based pruning of backoff language models. In Proc. DARPA Broadcast...

Uploaded: 08/03/2014, 01:20

Scientific report: "Segmented and unsegmented dialogue-act annotation with statistical dialogue models" ppt

... the possibility of applying statistical models to the annotation problem is really interesting. Moreover, it gives the possibility of evaluating the statistical models. The evaluation of the performance ... or statistical machine translation), an alternative data-based approach has been developed in the last decade (Stolcke et al., 2000; Young, 2000). This approach relies on statistical models ... $\Pr(W_{s_{k-(d+1)}+1}^{s_{k-d}} \mid U_k)$ This model can be easily implemented using simple statistical models (N-grams and Hidden Markov Models). The decoding (segmentation and DA assignation) was implemented using...

Uploaded: 08/03/2014, 02:21

Scientific report: "Combining a Statistical Language Model with Logistic Regression to Predict the Lexical and Syntactic Difficulty of Texts for FFL" potx

... use of language models instead of word lists to measure lexical complexity. Schwarm and Ostendorf (2005) developed a SVM categoriser combining a classifier based on trigram language models ... measures for first and second language texts. In Proceedings of NAACL HLT, pages 460–467. M. Heilman, K. Collins-Thompson, and M. Eskenazi. 2008. An analysis of statistical models and features for ... Methods in Language Processing, volume 12. Manchester, UK. S.E. Schwarm and M. Ostendorf. 2005. Reading level assessment using support vector machines and statistical language models. Proceedings...

Uploaded: 08/03/2014, 21:20

