... 2007. Large language models in machine translation. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning ... Kneser-Ney smoothed n-gram models. IEEE Transactions on Audio, Speech and Language Processing, 15(5):1617–1624.
A. Stolcke. 1998. Entropy-based pruning of backoff language models. In Proc. DARPA ... selection and its applications in LM augmentation and adaptation using web data. The language models are part of a continuous speech recognition system that enables users to use speech as an input...
... to incorporate large-scale n-gram language models in conjunction with incremental syntactic language models.
The added decoding time cost of our syntactic language model is very high. By increasing ... translation has effectively used n-gram word sequence models as language models.
Modern phrase-based translation using large-scale n-gram language models generally performs well in terms of lexical ... use supertag n-gram LMs. Syntactic language models have also been explored with tree-based translation models. Charniak et al. (2003) use syntactic language models to rescore the output of a...
... language models trained from text or speech corpora of various genres and sizes. The largest available language models are based on written text: we investigate the effect of written-text language models ... differences among the different language models when extended features are present are relatively small. We assume that much of the information expressed in the language models overlaps with the lexical ... derived from the Switchboard language model, since the fluent sentence itself is part of the language model training data. We solve this by dividing the Switchboard training data into 20 folds....
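The fold-based scoring described above can be sketched as follows. This is my own illustration, not the paper's code: `train_lm` is a hypothetical stand-in for an LM toolkit's training call, and the returned model is assumed to expose a `logprob` method.

```python
def kfold_scores(sentences, k=20, train_lm=None):
    """Score each sentence with a language model trained on the other
    k-1 folds, so no sentence is scored by a model that saw it in training."""
    folds = [sentences[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        # Pool the remaining k-1 folds into the training set.
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        lm = train_lm(train)
        scores.extend((s, lm.logprob(s)) for s in held_out)
    return scores
```

The round-robin split (`sentences[i::k]`) is just one way to form folds; any disjoint partition works, as long as each sentence is scored only by a model trained without it.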
... of English Bigrams. Computer Speech & Language, 5(1):19–54.
Joshua Goodman. 2001. A Bit of Progress in Language Modeling. Computer Speech & Language, 15(4):403–434.
Bo-June (Paul) Hsu ... 2008. N-gram Weighting: Reducing Training Data Mismatch in Cross-Domain Language Model Estimation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 829–838.
Dietrich ... Language Modeling. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing.
Robert C. Moore and William Lewis. 2010. Intelligent selection of language model training...
... estimating N-gram language models. Kneser-Ney smoothing, however, requires nonstandard N-gram counts for the lower-order models used to smooth the highest-order model. For some applications, ... Kneser-Ney and those methods.
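The nonstandard lower-order counts mentioned above are continuation counts: the raw frequency of a word is replaced by the number of distinct left contexts in which it appears. A minimal sketch (my own illustration of the standard definition, not any one paper's code):

```python
from collections import defaultdict

def continuation_counts(bigrams):
    """Kneser-Ney lower-order counts: N1+(. w) is the number of distinct
    words observed immediately before w, rather than w's raw frequency.
    This keeps frequent-but-context-bound words (e.g. "Francisco") from
    receiving inflated unigram probability."""
    left_contexts = defaultdict(set)
    for u, w in bigrams:
        left_contexts[w].add(u)
    return {w: len(ctx) for w, ctx in left_contexts.items()}
```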
1 Introduction
Statistical language models are potentially useful for any language technology task that produces natural-language text as a final (or intermediate) output. ... using a sequence of lower-order to higher-order language models has been shown to be an efficient way of constraining high-dimensional search spaces for speech recognition (Murveit et al., 1993)...
... statistical language models.
In this paper, we also use support vector machines to combine features from traditional reading level measures, statistical language models, and other language processing ... use scores from language models as features in another classifier (e.g., an SVM). For example, perplexity (PP) is an information-theoretic measure often used to assess language models:
PP = 2^{H(t|c)},
... of syntax. Our approach uses n-gram language models as a low-cost automatic approximation of both syntactic and semantic analysis. Statistical language models (LMs) are used successfully...
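Concretely, with H the average negative log2 probability the model assigns per token, perplexity is 2 raised to that entropy. The computation is standard and can be sketched as:

```python
def perplexity(token_log2_probs):
    """PP = 2^H, where H is the mean negative log2 probability per token.
    Input: a list of per-token log2 probabilities under the model."""
    H = -sum(token_log2_probs) / len(token_log2_probs)
    return 2 ** H
```

For instance, a model assigning each token probability 1/4 (log2 prob -2) has perplexity 4: it is, on average, as uncertain as a uniform choice among 4 tokens.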
...
Some Pragmatic Issues in the Planning of Definite and Indefinite Noun Phrases
Douglas E. Appelt
Artificial Intelligence Center, SRI International
and
Center for the Study of Language ... an agent's goals, but allows some of the actions to consist of the utterance of sentences. This approach to language generation emphasizes the view of language as action, and hence assigns ... plan-based analysis of noun phrases is worked out, the taxonomy of actions presented here will still be of practical importance.
Until an analysis like Cohen and Levesque's is worked out, the concept...
... comparison of in-grammar recognition performance.
3 Language modelling
To generate the different trigram language models we used the SRI language modelling toolkit (Stolcke, 2002) with Good-Turing ... decades of statistical language modeling: Where do we go from here? In Proceedings of the IEEE, 88(8).
R. Rosenfeld. 2000. Incorporating Linguistic Structure into Statistical Language Models. In Philosophical Transactions ... statistical language models (DM-SLMs) by using GF to generate all utterances that are specific to certain dialogue moves from our interpretation grammar. In this way we can produce models that...
... grammars for modeling agglutination in this language, but first we will present the former class of languages and its acceptor automata.
3.1 Linear context-free languages and two-taped nondeterministic ... example the Guarani language presents nasal harmony, which expands from the root to both suffixes and prefixes (Krivoshein, 1994). This kind of characterization can have some value in language classification ... 2010.
© 2010 Association for Computational Linguistics
The use of formal language models in the typology of the morphology of Amerindian languages
Andrés Osvaldo Porta
Universidad de Buenos Aires
hugporta@yahoo.com.ar
Abstract
The...
... novel language model caching technique that improves the query speed of our language models (and SRILM) by up to 300%.
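The paper's specific caching technique is not reproduced here; as a generic illustration of why caching n-gram queries pays off during decoding (the same n-grams are looked up many times across hypotheses), a memo table in front of a hypothetical scoring function can be sketched as:

```python
from functools import lru_cache

def cached_lm(score_fn, maxsize=1_000_000):
    """Wrap an n-gram scoring function in an LRU memo table: repeated
    queries for the same n-gram hit the cache instead of the slower
    underlying data structure. `score_fn` is a hypothetical stand-in
    for a real LM's query call; n-grams must be hashable (e.g. tuples)."""
    @lru_cache(maxsize=maxsize)
    def cached(ngram):
        return score_fn(ngram)
    return cached
```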
1 Introduction
For modern statistical machine translation systems, language models ... with two different language models. Our first language model, WMT2010, was a 5-gram Kneser-Ney language model which stores probability/back-off pairs as values. We trained this language model on ... 2010. Storing the web in memory: space efficient language models with constant time retrieval. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.
Boulos Harb,...
... explore a dependency language model to improve translation quality. To some extent, these syntactically-informed language models are consistent with syntax-based translation models in capturing ... or even trillions of English words, huge language models are built in a distributed manner (Zhang et al., 2006; Brants et al., 2007). Such language models yield better translation results but at ... integrate backward n-grams and mutual information (MI) triggers into language models in SMT.
In conventional n-gram language models, we look at the preceding n − 1 words when calculating the probability...
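The preceding-context lookup in a conventional n-gram model backs off to shorter contexts when the full n-gram is unseen. A minimal sketch follows; it uses a fixed back-off penalty in the style of stupid backoff, purely for illustration (Kneser-Ney models instead store per-context back-off weights alongside probabilities):

```python
def backoff_prob(ngram, probs, alpha=0.4):
    """Look up P(w | preceding n-1 words) in a table keyed by n-gram
    tuples, recursively backing off to shorter contexts with a fixed
    penalty alpha. The unknown-word floor is illustrative."""
    if ngram in probs:
        return probs[ngram]
    if len(ngram) == 1:
        return 1e-7  # floor for words never seen in training
    return alpha * backoff_prob(ngram[1:], probs, alpha)
```

Each recursion drops the most distant context word, so a query for ("x", "b", "c") falls through to ("b", "c") and, if needed, to ("c",).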
... (lossless) language models and our randomized language model. Note that the standard practice of measuring perplexity is not meaningful here since (1) for efficient computation, the language model ... 2007. Compressing trigram language models with Golomb coding. In Proceedings of EMNLP-CoNLL 2007, Prague, Czech Republic, June.
P. Clarkson and R. Rosenfeld. 1997. Statistical language modeling using ... pruning of back-off language models. In Proc. DARPA Broadcast News Transcription and Understanding Workshop, pages 270–274.
D. Talbot and M. Osborne. 2007a. Randomised language modelling for...
... construction of language models found in new language processing applications and reported experimental results showing their practicality for constructing very large models. These algorithms ... experimental results demonstrating its efficiency.
Representation of language models by WFAs. Classical n-gram language models admit a natural representation by WFAs in which each state encodes ... is natural and convenient to construct class-based language models, that is, models based on classes of words (Brown et al., 1992). Such models are also often more robust since they may include...
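In a class-based model of the Brown et al. (1992) kind, the word bigram probability factors through word classes: P(w | w_prev) = P(w | class(w)) · P(class(w) | class(w_prev)). A minimal sketch of that factorization (the table names are my own, not from any cited implementation):

```python
def class_bigram_prob(w_prev, w, word2class, p_word_given_class, p_class_bigram):
    """Brown-style class bigram probability:
    P(w | w_prev) = P(w | class(w)) * P(class(w) | class(w_prev)).
    Sharing statistics at the class level is what makes these models
    more robust for word pairs unseen in training."""
    c_prev, c = word2class[w_prev], word2class[w]
    return p_word_given_class[(w, c)] * p_class_bigram[(c_prev, c)]
```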
... to 300 000 words per language, 2) SCIENCE, five scientific articles of about 50 000 words per language, 3) TECH, technical documentation of about 40 000 words per language and 4) VERNE, ...
perspective of some applications.
These problems can be avoided by taking advantage of the fact that a unit of a given granularity (e.g. sentence) can always be seen as a (possibly discontinuous) ... million words (ca. 1.1 million words per language). The part used for JOC was composed of one fifth of the French and English sections (ca. 200 000 words per language).
3.3 BAF
The BAF corpus...
... speech transcripts.
Compared to standard language models, hybrid LMs generalize better to the test data and partially compensate for the disproportion between in-domain and out-of-domain training data. At the ... Hoang. 2007. Factored translation models. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), ... Monz. 2011. Statistical Machine Translation with Local Language Models. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 869–879, Edinburgh, Scotland,...