... the graph for each target word in the context: for each target word W_i, we concentrate the initial probability mass in the senses of the words surrounding W_i, but not in the senses of the ... on WordNet) in order to perform unsupervised Word Sense Disambiguation. Our algorithm uses the full graph of the LKB efficiently, performing better than previous approaches in English...
Uploaded: 31/03/2014, 20:20
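The first snippet describes concentrating the initial probability mass on the senses of the context words before ranking senses over the knowledge-base graph. A minimal sketch of that idea is personalized PageRank by power iteration. The toy graph, the sense identifiers, the damping value, and the function name below are all illustrative assumptions, not the paper's data or implementation.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power-iteration PageRank whose teleport mass sits only on `seeds`.

    `graph` maps each node to a list of neighbours; every neighbour must
    also be a key of `graph`. All names here are hypothetical.
    """
    nodes = list(graph)
    # Teleport vector: uniform over the seed senses, zero elsewhere.
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for node in nodes:
            neighbours = graph[node]
            if not neighbours:
                # Dangling node: hand its mass back to the seed senses.
                for n in nodes:
                    new_rank[n] += damping * rank[node] * teleport[n]
                continue
            share = damping * rank[node] / len(neighbours)
            for neighbour in neighbours:
                new_rank[neighbour] += share
        rank = new_rank
    return rank


# Toy graph (assumed): two senses of the target word "bank" and the
# senses of nearby concepts. Edges are listed in both directions.
graph = {
    "bank#finance": ["deposit#1", "money#1"],
    "bank#river": ["water#1"],
    "money#1": ["bank#finance", "deposit#1"],
    "deposit#1": ["bank#finance", "money#1"],
    "water#1": ["bank#river"],
}
# The context word is "money", so only its sense receives initial mass;
# the senses of the target word itself receive none.
seeds = {"money#1"}

ranks = personalized_pagerank(graph, seeds)
best_sense = max(["bank#finance", "bank#river"], key=ranks.get)
```

In this toy graph the river sense is unreachable from the seed, so it keeps zero mass and the finance sense is selected.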
... using the senses provided in WordNet. The senses are ranked using two sources of information: (1) the Internet for gathering statistics for word-word co-occurrences and (2) WordNet for measuring ... and using WordNet, form a similarity list for each sense of that word. For this, use the words from the synset of each sense and the words from the hypernym synsets....
Uploaded: 08/03/2014, 06:20
Scientific report: "Topic Models for Word Sense Disambiguation and Token-based Idiom Detection" pdf
... exploiting paraphrase information for the target senses rather than relying on the structure of WordNet as a whole. Topic models have also been applied to the related task of word sense induction. ... semantic relations between senses, etc.). Sometimes such detailed information may not be available, for instance for languages for which such a resource does not exist or fo...
Uploaded: 23/03/2014, 16:20
Scientific report: "Domain Kernels for Word Sense Disambiguation" ppt
... of the text in which the word is located is a crucial information for WSD. For example the (domain) polysemy among the COMPUTER SCIENCE and the MEDICINE senses of the word virus can be solved ... this is clearly unfeasible for all-words WSD tasks, in which all the words of an open text should be disambiguated. On the other hand, the word expert approach works very well for l...
Uploaded: 23/03/2014, 19:20
Scientific report: "Unsupervised Relation Discovery with Sense Disambiguation" docx
... features. For example, for pattern “A play B”, pairs which contain B argument “Mozart” could be in one sense, whereas pairs which have “Mets” could be in another sense. Words: The words between ... produced by sense disambiguation. For each sense, we randomly sample 5 entity pairs. We also show top features for each sense. Each row shows one feature type, where “num” stands fo...
Uploaded: 23/03/2014, 14:20
Scientific report: "Similarity-Based Methods For Word Sense Disambiguation" docx
... two words belong. However, a word is therefore modeled by the average behavior of many words, which may cause the given word's idiosyncrasies to be ignored. For instance, the word ... scheme for deciding which word pairs require a similarity-based estimate, a method for combining information from similar words, and, of course, a function measuring the similarit...
Uploaded: 31/03/2014, 21:20
Document: Scientific report: "Generalization Methods for In-Domain and Cross-Domain Opinion Holder Extraction" pdf
... produces a prediction for every word token in a sentence, CK and RB only produce a prediction for every noun phrase. For evaluation, we project the predictions from RB and CK to word token level ... LEX-PRED. 6 For this learning method, we use CRF++. 7 We choose a configuration that provides good performance on our source domain (i.e. ETHICS). 8 For semantic role labeling we...
Uploaded: 22/02/2014, 02:20
Scientific report: "Data Cleaning for Word Alignment" pdf
... mechanism to augment one source word into several source words or delete a source word, while a NULL insertion is a mechanism of generating several words from blank words. Fertility uses a conditional ... score S W B,X for each pair of sentences where X is 4, 3, 2, and 1 for word-based MT decoder. Step 3: Train phrase-based MT for full parallel corpus. Note that we do not need to...
Uploaded: 08/03/2014, 01:20
Scientific report: "Ensemble Methods for Unsupervised WSD" doc
... polysemous word. Let N(w) = {n_1, n_2, ..., n_k} be the k most (distributionally) similar words to an ambiguous target word w and senses(w) = {s_1, s_2, ..., s_n} the set of senses for w. For ... occurrences of a given word are collected together. For each sense of a target word, the strength of all connections involving that sense are summed, giving that sense a...
Uploaded: 08/03/2014, 02:21
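The snippet above mentions summing, for each sense of a target word, the strength of all connections involving that sense. A minimal sketch of that aggregation step follows. The connection triples, the strength values, and the function name are hypothetical, and choosing the highest total as the predicted sense is an assumption, since the snippet is cut off before stating how the totals are used.

```python
from collections import defaultdict


def score_senses(connections):
    """Sum connection strengths per sense.

    `connections` is an iterable of (sense, other_node, strength) triples;
    this layout is an assumption made for illustration.
    """
    totals = defaultdict(float)
    for sense, _other_node, strength in connections:
        totals[sense] += strength
    return dict(totals)


# Hypothetical connections for two senses of the word "star".
connections = [
    ("star#celestial", "planet", 0.5),
    ("star#celestial", "galaxy", 0.25),
    ("star#celebrity", "actor", 0.125),
]

scores = score_senses(connections)
# Assumed decision rule: pick the sense with the largest summed strength.
best_sense = max(scores, key=scores.get)
```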
Scientific report: "Hybrid Methods for POS Guessing of Chinese Unknown Words" pot
... unknown words only, as recall drops significantly for longer words. 4.4 Combining Models To determine the best way to combine the three models, their individual performances are evaluated for each ... recall for disyllabic words is low. The results for the trigram model are listed in Table 5. Candidates are restricted to the eight POS categories listed in Table 2 for this model...
Uploaded: 08/03/2014, 04:22