Scientific report: "Topic Models for Word Sense Disambiguation and Token-based Idiom Detection" pdf
... neighbour for the chosen word and then assigns a sense based on the word, its neighbour and the topic. Boyd-Graber and Blei (2007) test their method on WSD and information retrieval tasks and find ... topic-document vectors (one for the sense and one for the context). We apply these models to coarse- and fine-grained WSD and find that they outperform comparable sys...
Uploaded: 23/03/2014, 16:20
... each sense of one of the words. Pick one of the words, say W2, and using WordNet, form a similarity list for each sense of that word. For this, use the words from the synset of each sense and ... occurrences and (2) WordNet for measuring the semantic density for a pair of words. We report an average accuracy of 80% for the first ranked sense, and 91% for the...
Uploaded: 08/03/2014, 06:20
... 403–410, Ann Arbor, June 2005. © 2005 Association for Computational Linguistics Domain Kernels for Word Sense Disambiguation Alfio Gliozzo and Claudio Giuliano and Carlo Strapparava ITC-irst, Istituto ... state-of-the-art systems for Word Sense Disambiguation (WSD) are designed according to a supervised learning framework, in which the disambiguation of each word...
Uploaded: 23/03/2014, 19:20
Scientific report: "Personalizing PageRank for Word Sense Disambiguation" docx
... the MFS in both Senseval-2 and Senseval-3 datasets. The results for the supervised system are given for reference, and we can see that the gap is relatively small, specially for Senseval-3. 5 ... the graph for each target word in the context: for each target word $W_i$, we concentrate the initial probability mass in the senses of the words surrounding $W_i$, but not in the...
Uploaded: 31/03/2014, 20:20
Document: Scientific report: "Topic Models for Dynamic Translation Model Adaptation" pptx
... corpus with an additional 150M words randomly selected from the non-NYT and non-LAT portions of the Gigaword v4 corpus using modified Kneser-Ney smoothing (Chen and Goodman, 1996). We used cdec ... 10, and 20. On FBIS, we can see that both models achieve moderate but consistent gains over the baseline on both BLEU and TER. The best model, LTM-10, achieves a gain of about 0.5 an...
Uploaded: 19/02/2014, 19:20
Scientific report: "New Models for Improving Supertag Disambiguation" pdf
... are formed. In our experiments, we have found that with k = 10, k = 20, and k = 40, the resulting models attain 94.61% accuracy and 1.86 tags per word, 95.76% accurate and 2.23 tags per word, ... predict heads. 3.2 Mixed Head and Trigram Models The head model skips words that it does not consider to be head words and hence may lose valuable information. The lack...
Uploaded: 08/03/2014, 21:20
Scientific report: "Data Cleaning for Word Alignment" pdf
... $\bar{e}_{i,j}$ be the $j$-th word in the $i$-th sentence, and $\bar{e}_i$ be the $i$-th word in the parallel corpus (similarly for $f_i$, $\bar{f}_{i,j}$, and $\bar{f}_i$). Let $|e_i|$ be the sentence length of $e_i$, and similarly for $|f_i|$. ... Foundation Ireland (Grant No. 07/CE/I1142). Thanks to Yvette Graham and Sudip Naskar for proofreading, Andy Way, Khalil Sima’an, Yanjun Ma, and anonymous reviewers for...
Uploaded: 08/03/2014, 01:20
Scientific report: "Distortion Models For Statistical Machine Translation" doc
... across sentences. IBM Models 4 and 5 alleviate this limitation by replacing absolute word positions with relative positions. The latter models define the distortion parameters for a cept (one or more words). ... models phrasal movement better since words tend to move in blocks and not independently. The distortion is conditioned on classes of the aligned source and target...
Uploaded: 08/03/2014, 02:21
Scientific report: "Structured Models for Fine-to-Coarse Sentiment Analysis" pdf
... the current sentence and document label $y^s_i$ and $y^d$, and the current and previous sentence labels $y^s_i$ and $y^s_{i-1}$. Note that through these back-off features the joint models feature set will ... learning and/or predicting multiple outputs jointly. This includes parsing and relation extraction (Miller et al., 2000), entity labeling and relation extraction (Roth a...
Uploaded: 08/03/2014, 02:21
Scientific report: "Combining Clues for Word Alignment" pdf
... certain words such as [hand,baggage] and handbagaget. However, between other word pairs such as is and sedan we find only low associations which conflict with others and therefore, they ... position and phrase type labels bear much less information about specific words and phrases than POS tags, therefore, a lower weight of 0.1 was chosen for these two clues. 5.2 The r...
Uploaded: 08/03/2014, 21:20