Scientific report: "Parsing Flexible Word Order Languages" pdf
... As a matter of fact, ATN's were not originally conceived for flexible word order languages. (In the extreme free word order case, an ATN would have one single node and a large number ... Via dei Monti Tiburtini 509, 00157 Roma user interface. ABSTRACT A parser for "flexible" word order languages must be substantially data driven. In our view syntax ha...
Uploaded: 18/03/2014, 02:20
... languages including Indian and other languages have relatively free word order. In free word order languages, order of words contains only secondary information such as emphasis etc. Primary ... Each word group is uniquely identifiable before the core parser executes, (b) Each demand word has only one karaka chart, and (c) There are no ambiguities between source wor...
Uploaded: 08/03/2014, 07:20
... the “sure” word alignments in the trial data. This resulted in 13,285,942 possible word-to-word translation pairs (plus 66,406 possible null-word-to-word pairs). For most models, the word translation ... the null word are frequently occurring function words. Hence we initialize the distribution for the null word to be the unigram distribution of target words, so that freque...
Uploaded: 23/03/2014, 19:20
Scientific report: "PARSING A FREE-WORD ORDER LANGUAGE: WARLPIRI" doc
... in order to clearly delimit the morphemes.) The second word, karnarla, is the auxiliary which must appear in the second (Wackernagel's) position. Except for the auxiliary, the other words ... case-assignment and argument-linking are not directional. In this way, the fixed-morpheme order and free-word order have been properly accounted for. (KARLI (datum (v -)) (datum (...
Uploaded: 17/03/2014, 20:20
Scientific report: "Parsing the Internal Structure of Words: A New Paradigm for Chinese Word Segmentation" doc
... Generation of Words with Internal Structures Words with rich internal structures can be described using a context-free grammar formalism as word → root (3), word → word suffix (4), word → prefix word (5). Here ... out-of-vocabulary word 戽䊂 䠽吼 ‘English People’. Had there been only a few words with internal structures, current Chinese word segmentation paradigm would be sufficient...
Uploaded: 17/03/2014, 00:20
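The three rules quoted in the snippet above (word → root, word → word suffix, word → prefix word) can be illustrated with a small recursive recognizer. This is only a sketch under stated assumptions: the toy lexicon (`ROOTS`, `PREFIXES`, `SUFFIXES`) uses English stand-ins that are not from the paper, and the function name `parse_word` is hypothetical; it is not the parser the paper describes.

```python
from functools import lru_cache

# Hypothetical toy lexicon (English stand-ins, NOT taken from the paper).
ROOTS = {"friend", "happy"}
PREFIXES = {"un"}
SUFFIXES = {"ly", "ness"}


def parse_word(s):
    """Return every derivation of s under the three rules, as nested tuples."""

    @lru_cache(maxsize=None)
    def go(t):
        trees = []
        if t in ROOTS:  # word -> root
            trees.append(("root", t))
        for suf in SUFFIXES:  # word -> word suffix
            if t.endswith(suf) and len(t) > len(suf):
                for sub in go(t[: -len(suf)]):
                    trees.append(("word", sub, ("suffix", suf)))
        for pre in PREFIXES:  # word -> prefix word
            if t.startswith(pre) and len(t) > len(pre):
                for sub in go(t[len(pre):]):
                    trees.append(("word", ("prefix", pre), sub))
        return tuple(trees)

    return list(go(s))


if __name__ == "__main__":
    # Each printed tuple is one bracketing of the input string.
    for tree in parse_word("unfriendly"):
        print(tree)
```

Because rules (4) and (5) are both recursive on `word`, a string such as "unfriendly" receives more than one bracketing in this toy setting: the prefix can attach before or after the suffix.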
Document: Scientific report: "Fixed Length Word Suffix for Factored Statistical Machine Translation" pdf
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Learning Sub-Word Units for Open Vocabulary Speech Recognition" doc
... coherence. Hybrid word/sub-word recognizers can produce a sequence of sub-word units in place of OOV words. Ideally, the recognizer outputs a complete word for in-vocabulary (IV) utterances, and sub-word ... recognize words beyond their vocabulary, many of which are information rich terms, like named entities or foreign words. Hybrid word/sub-word systems solve this problem by...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Yet Another Word Alignment Tool" docx
... with Yawat. As the mouse is moved over a word, the word and all words linked with it are highlighted. The highlighting is removed when the mouse leaves the word in question. This allows the annotator ... associated words are shown only for one word at a time, as determined by the location of the mouse pointer. When the mouse is moved over a word in the text, the word and all the...
Uploaded: 20/02/2014, 09:20
Document: Scientific report: "Guiding Statistical Word Alignment Models With Prior Knowledge" pdf
... a_1^m specifies the indices of source words that target words are aligned to. In an HMM-based word alignment model, source words are treated as Markov states while target words are observations that are ... as 1. In building word alignment models, a special “NULL” word is usually introduced to address target words that align to no source words. Since this physically non-existing word...
Uploaded: 20/02/2014, 12:20
Document: Scientific report: "Rethinking Chinese Word Segmentation: Tokenization, Character Classification, or Wordbreak Identification" pdf
... co-occurrence. Word based model. In this model, statistical data about word boundary frequencies for each character is retrieved word-wise. For example, in the case of a monosyllabic word only two word ... components of words, instead, they are contextual background providing information about the likelihood of whether each CB is also a wordbreak (WB). In other words, we model Chi...
Uploaded: 20/02/2014, 12:20