Scientific report: "Maximum Entropy Based Phrase Reordering Model for Statistical Machine Translation" docx
... novel reordering model for phrase-based statistical machine translation (SMT) that uses a maximum entropy (MaxEnt) model to predict reorderings of neighbor blocks (phrase pairs). The model ... memory. 3 Maximum Entropy Based Reordering Model In this section, we discuss how to create a maximum entropy based reordering model. As described above, we de...
Uploaded: 08/03/2014, 02:21
... the global phrase reordering model and its param- 1 It might be misleading to call our reordering model “global” since it considers at most two phrases. A truly global reordering model would ... a translation model and is a target language model. In phrase-based statistical machine translation, the source sentence is segmented into a sequence of phrases, and each...
Uploaded: 31/03/2014, 01:20
... 2009. © 2009 Association for Computational Linguistics Bilingually Motivated Domain-Adapted Word Segmentation for Statistical Machine Translation Yanjun Ma Andy Way National Centre for Language Technology School ... LDC segmenter 2 and Stanford segmenter version 2006-05-11 3 . Both ICTCLAS and Stanford segmenters utilise machine learning techniques, with Hidden Markov Models...
Uploaded: 22/02/2014, 02:20
Scientific report: "A Comparative Study on Reordering Constraints in Statistical Machine Translation" potx
... statistical machine translation (Brown et al., 1990). It allows an independent modeling of target language model $Pr(e_1^I)$ and translation model $Pr(f_1^J \mid e_1^I)$. The target language model ... based lexicon as well as phrase-based models for this initialization. Our choice is the IBM Model4 to make the results as comparable Table 1: Ratio of the number of permitted r...
Uploaded: 17/03/2014, 06:20
Scientific report: "Maximum Entropy Based Restoration of Arabic Diacritics" ppt
... produces a more powerful model. 8 Conclusion We presented in this paper a statistical model for Arabic diacritic restoration. The approach we propose is based on the Maximum entropy framework, which ... lexical, segment-based, and morphological information. Table 2 also shows that, when segment-based information is added to our system, a significant improvement is achieved...
Uploaded: 17/03/2014, 04:20
Scientific report: "A Syllable Based Word Recognition Model for Korean Noun Extraction" potx
... that uses the syllable based word recognition model. It finds the most probable syllable-tag sequence of the input sentence by using automatically acquired statistical information from the POS ... syllable-tag sequence of the input sentence by using statistical information and extracts nouns by detecting the word boundaries. The statistical information is automatically acquired from...
Uploaded: 17/03/2014, 06:20
Scientific report: "Maximum Entropy Model Learning of the Translation Rules" pot
... our target for the experiments included 1,375 English words and 1,195 Japanese words, and we prepared 1,375 feature functions for model 1 and 2,744 for model 2 (56 part-of-speech for English ... algorithm. Notice that the log-likelihood for the model 1+2 is always higher than the model 1. Thus, the model 1+2 is more effective than the model 1 for lear...
Uploaded: 23/03/2014, 19:20
Document: Scientific report: "An ERP-based Brain-Computer Interface for text entry using Rapid Serial Visual Presentation and Language Modeling" ppt
... assumes the EEG-based information and the language model information are statistically independent given the class label) is used to combine the RDA discriminant score and the language model score ... language model integration with RSVP is relatively straightforward, as we shall demonstrate. See Roark et al. (2010) for methods integrating language modeling into grid scanning. 2 RSV...
Uploaded: 20/02/2014, 05:20
Document: Scientific report: "A Graph-based Semi-Supervised Learning for Question-Answering" doc
... Summarization SSL. of the models. As more labeled data is introduced, Hybrid SVM models’ performance increases drastically, even outperforming the state-of-the-art MRR performance on TREC04 datasets ... in semi-supervised learning (SSL) environment, with an emphasis on graph-based methods, can improve the performance of information extraction from data for tasks such as question classi...
Uploaded: 20/02/2014, 07:20
Document: Scientific report: "A FrameNet-based Semantic Role Labeler for Swedish" pdf
... following tools: • An HMM-based POS tagger, • A rule-based chunker, • A rule-based time expression detector, • Two clause identifiers, of which one is rule-based and one is statistical, • The MALTPARSER ... chunk; for classification, it is the type of the largest chunk or clause that starts at the leftmost token of the FE. For prepositional phrases, the preposition is attached to...
Uploaded: 20/02/2014, 12:20