Scientific report: "Language Model Based Arabic Word Segmentation" pdf

... does not handle the multiple affixes per word we observe in Arabic. 2 Words Prefixes Stems Suffixes Arabic Translit. Arabic Translit. Arabic Translit. Arabic Translit. الولايات AlwlAyAt ... segmented Arabic corpus and uses it to bootstrap an unsupervised algorithm to build the Arabic word segmenter from a large unsegmented Arabic corpus. The algorithm uses...

Uploaded: 08/03/2014, 04:22

Document: Scientific report: "Bilingually Motivated Domain-Adapted Word Segmentation for Statistical Machine Translation" pptx

... iterations). 4 Word Lattice Decoding 4.1 Word Lattices In the decoding stage, the various segmentation alternatives can be encoded into a compact representation of word lattices. A word lattice ... utilisation of word lattice decoding. 4.3 Phrase-Based Word Lattice Decoding Given a Chinese input sentence $c_1^J$ consisting of J characters, the traditional approach is to determin...

Uploaded: 22/02/2014, 02:20

Scientific report: "Incremental Joint Approach to Word Segmentation, POS Tagging, and Dependency Parsing in Chinese" potx

... beams to compare words of different lengths using beam search. More recently, Zhang and Clark (2010) proposed an efficient character-based decoder for their word-based model. In their new model, a single beam ... decoding, we take a character-based approach to produce our joint model. The incremental framework of our model is based on the joint POS tagging and dependency pars...

Uploaded: 07/03/2014, 18:20

Document: Scientific report: "An ERP-based Brain-Computer Interface for text entry using Rapid Serial Visual Presentation and Language Modeling" ppt

... language model. For the current study, all language models were estimated from a one million sentence (210M character) sample of the NY Times portion of the English Gigaword corpus. Models were ... language model integration with RSVP is relatively straightforward, as we shall demonstrate. See Roark et al. (2010) for methods integrating language modeling into grid scanning. 2 RSVP base...

Uploaded: 20/02/2014, 05:20

Document: Scientific report: "A Phrase-based Statistical Model for SMS Text Normalization" ppt

... normalization model consists of two sub-models: a word-based language model (LM), characterized by $P(e_n \mid e_{n-1})$, and a phrase-based lexical mapping model (channel model), characterized ... The channel model can be rewritten in equation (3). 4.1 Basic Word-based Model The SMS normalization model is based on the source...

Uploaded: 20/02/2014, 12:20

Scientific report: "Maximum Entropy Based Phrase Reordering Model for Statistical Machine Translation" docx

... flexible. It makes our model reorder any blocks, observed in training or not. The whole maximum entropy based reordering model is embedded inside a log-linear phrase-based model of translation. ... length 28.3 words on a 2GHz Linux system with 4G RAM memory. 3 Maximum Entropy Based Reordering Model In this section, we discuss how to create a maximum entropy based reordering...

Uploaded: 08/03/2014, 02:21

Document: Scientific report: A knowledge-based potential function predicts the specificity and relative binding energy of RNA-binding proteins ppt

... optimistic that this knowledge-based potential function will find broad application to problems requiring the high-resolution modeling of protein–RNA interfaces, such as structure-based genome annotation, ... this work demonstrates that statistical models allow the quantitative analysis of protein–RNA recognition based on their structure and can be applied to modeling protein–RNA inte...

Uploaded: 18/02/2014, 16:20

Document: Scientific report: "Translation Model Adaptation for Statistical Machine Translation with Monolingual Topic Information" doc

... successfully in NLP community. Based on the “bag-of-words” assumption that the order of words can be ignored, these methods model the text corpus by using a co-occurrence matrix of words and documents, ... Markov Model (HTMM) which is the basis of our method, then describe our approach to translation model adaptation in detail. 3.1 Hidden Topic Markov Model During the last couple o...

Uploaded: 19/02/2014, 19:20

Document: Scientific report: "A Ranking-based Approach to Word Reordering for Statistical Machine Translation" doc

... mimic the word order in target language. To this end, we propose a simple but effective ranking-based approach to word reordering. The ranking model is automatically derived from the word aligned ... phrase-based SMT system. 1 Introduction Modeling word reordering between source and target sentences has been a research focus since the emerging of statistical machine tra...

Uploaded: 19/02/2014, 19:20

Document: Scientific report: "A Graph-based Semi-Supervised Learning for Question-Answering" doc

... information extraction of our QA system. The NER module is based on a combination of user defined rules based on Lesk word disambiguation (Lesk, 1988), WordNet (Miller, 1995) lookups, and many user-defined ... variation of Collins rules, hypernym extraction via Lesk word disambiguation (Lesk, 1988), regular expressions for wh-word indicators, n-grams, word-shapes (capitals), etc....

Uploaded: 20/02/2014, 07:20
