a stochastic language model using dependency

Document, scientific report: "A Phonotactic Language Model for Spoken Language Identification" pptx

Uploaded: 20/02/2014, 15:20
... NIST Language Recognition Evaluation database. 1 Introduction Spoken language and written language are similar in many ways. Therefore, much of the research in spoken language identification, ... 2003. Acoustic, Phonetic and Discriminative Approaches to Automatic language recognition, In Proc. of Eurospeech Masahide Sugiyama. 1991. Automatic language recognition using acoustic features, ... of acoustic vocabulary (AV) with mixture of token unigram, bigram, and trigram: a) AV1: 32 broad class phonemes as unigram, selected from 12 languages, also referred to as P-ASM as detailed...
  • 8
  • 436
  • 0
Document, scientific report: "A Structured Language Model" ppt

Uploaded: 22/02/2014, 03:20
... Proceedings of the Human Language Technology Workshop, 272-277. ARPA. Raymond Lau, Ronald Rosenfeld, and Salim Roukos. 1993. Trigger-based language models: a maximum entropy approach. In Proceedings ... University, Baltimore, MD. Frederick Jelinek, John Lafferty, David M. Magerman, Robert Mercer, Adwait Ratnaparkhi, Salim Roukos. 1994. Decision Tree Parsing using a Hidden Derivational Model. ... those assigned manually in the Penn Treebank (Marcus95) after undergoing headword percolation and binarization. All four LMs predict a word wk and they were implemented using the Maximum...
  • 3
  • 342
  • 0
Scientific report: "SITS: A Hierarchical Nonparametric Model using Speaker Identity for Topic Segmentation in Multiparty Conversations" pptx

Uploaded: 07/03/2014, 18:20
... second dataset contains three annotated presidential debates (Boydstun et al., 2011) between Barack Obama and John McCain and a vice presidential debate between Joe Biden and Sarah Palin. Each ... Quintana, F. A. (2004). Nonparametric Bayesian data analysis. Statistical Science, 19(1):95–110. [Murray et al., 2005] Murray, G., Renals, S., and Carletta, J. (2005). Extractive summarization of meeting ... moderator. 7 Similarly, the “Question” speaker had a relatively high variance, consistent with an amalgamation of many distinct speakers. These topic shift tendencies suggest that all candidates manage to...
  • 10
  • 555
  • 0
Scientific report: "A Discriminative Language Model with Pseudo-Negative Samples" pptx

Uploaded: 08/03/2014, 02:21
... DLMs are trained using correct sentences from a corpus and negative examples from a Pseudo-Negative generator. An advantage of sampling is that as many negative examples can be collected as correct ... that they have the disadvantage of being computationally expensive, and not all relevant features can be included. A discriminative language model (DLM) assigns a score to a sentence, measuring ... specific applications and therefore were able to obtain real negative examples easily. For example, Roark (2007) proposed a discriminative language model, in which a model is trained so that a correct...
  • 8
  • 315
  • 0
Scientific report: "Combining a Statistical Language Model with Logistic Regression to Predict the Lexical and Syntactic Difficulty of Texts for FFL" potx

Uploaded: 08/03/2014, 21:20
... features, as described below: a statistical language model and a measure of tense difficulty. 4.1 The language model The lexical difficulty of a text is quite an elaborate phenomenon to parameterise. ... poems as outliers). 4 Selection of lexical and syntactic variables Any text classification tasks require an object (here a text) to be parameterised into variables, whether qualitative or quantitative. ... Belgium thomas.francois@uclouvain.be Abstract Reading is known to be an essential task in language learning, but finding the appropriate text for every learner is far from easy. In this context, automatic...
  • 9
  • 514
  • 0
Document, scientific report: "A probabilistic generative model for an intermediate constituency-dependency representation" pptx

Uploaded: 20/02/2014, 04:20
... Portugal. Federico Sangati and Chiara Mazza. 2009. An English Dependency Treebank à la Tesnière. In The 8th International Workshop on Treebanks and Linguistic Theories, pages 173–184, Milan, ... (92). Michael J. Collins. 1999. Head-Driven Statistical Models for Natural Language Parsing. Ph.D. thesis, University of Pennsylvania. Marie-Catherine de Marneffe and Christopher D. Manning. ... coordination, a linguistic phenomena highly abundant in natural language production, but often neglected when it comes to evaluating parsing resources. We have therefore proposed a special evaluation...
  • 6
  • 555
  • 0
Document, scientific report: "A Large Scale Distributed Syntactic, Semantic and Lexical Language Model for Machine Translation" doc

Uploaded: 20/02/2014, 04:20
... significantly. Bear in mind that Charniak et al. (2003) integrated Charniak’s language model with the syntax-based translation model Yamada and Knight proposed (2001) to rescore a tree-to-string ... Stochastic analysis of lexical and semantic enhanced structural language model. The 8th International Colloquium on Grammatical Inference (ICGI), 97-111. K. Yamada and K. Knight. 2001. A syntax-based ... (EMNLP), 858-867. E. Charniak. 2001. Immediate-head parsing for language models. The 39th Annual Conference on Association of Computational Linguistics (ACL), 124-131. E. Charniak, K. Knight and K. Yamada. 2003....
  • 10
  • 567
  • 0
Document, scientific report: "Smoothing a Tera-word Language Model" doc

Uploaded: 20/02/2014, 09:20
... and Linda C. Bauman Peto. 1995. A hierarchical Dirichlet language model. Natural Language Engineering, 1(3):1–19. Y.W. Teh. 2006. A hierarchical Bayesian language model based on Pitman-Yor processes. ... n-grams: C(ab) − C(ab∗). A(ab) = max(1, K(C(ab) − C(ab∗))) A different K constant is chosen for each n-gram order. Using this formulation as an interpolated 5-gram language model gives a cross ... Speech and Language. R. Kneser and H. Ney. 1995. Improved backing-off for m-gram language modeling. In International Conference on Acoustics, Speech, and Signal Processing. David J. C. Mackay and...
  • 4
  • 425
  • 1
Document, scientific report: "A Succinct N-gram Language Model" ppt

Uploaded: 20/02/2014, 09:20
... compression tasks achieved a significant compression rate without any loss. 1 Introduction There has been an increase in available N-gram data and a large amount of web-scaled N-gram data has been ... the ACL-IJCNLP 2009 Conference Short Papers, pages 341–344, Suntec, Singapore, 4 August 2009. © 2009 ACL and AFNLP A Succinct N-gram Language Model Taro Watanabe Hajime Tsukada Hideki Isozaki NTT ... Communication Science Laboratories 2-4 Hikaridai Seika-cho Soraku-gun Kyoto 619-0237 Japan {taro,tsukada,isozaki}@cslab.kecl.ntt.co.jp Abstract Efficient processing of tera-scale text data is an important...
  • 4
  • 457
  • 0
Document, scientific report: "Lexical transfer using a vector-space model" doc

Uploaded: 20/02/2014, 18:20
... using matrix PRICAI-00, 2000, (to appear). Tanaka H. (1995) Statistical Learning of “Case Frame Tree” for Translating English Verbs, Journal of NLP, 2/3, pp. 49-72, (in Japanese). Yamada, ... Laboratories 2-2 Hikaridai, Seika, Soraku Kyoto 619-0288, Japan sumita@slt.atr.co.jp Abstract Building a bilingual dictionary for transfer in a machine translation system is conventionally ... generalization (Akiba et al., 1996 and Tanaka, 1995); (2) approaches using structural matching: to obtain transfer rules, several search methods have been proposed for maximal structural matching between...
  • 7
  • 654
  • 0
Document, scientific report: "Japanese OCR Error Correction using Character Shape Similarity and Statistical Language Model" pptx

Uploaded: 20/02/2014, 18:20
... 923 Japanese OCR Error Correction using Character Shape Similarity and Statistical Language Model Masaaki NAGATA NTT Information and Communication Systems Laboratories 1-1 Hikari-no-oka Yokosuka-Shi ... such as Japanese and Chinese. It consists of a statistical OCR model, an approximate word matching method using character shape similarity, and a word segmentation algorithm using a statistical ... Yokosuka-Shi Kanagawa, 239-0847 Japan nagata@nttnly.isl.ntt.co.jp Abstract We present a novel OCR error correction method for languages without word delimiters that have a large character...
  • 7
  • 472
  • 0