Scientific report: "Learning Expressive Models for Word Sense Disambiguation"

... 41–48, Prague, Czech Republic, June 2007. © 2007 Association for Computational Linguistics Learning Expressive Models for Word Sense Disambiguation Lucia Specia NILC/ICMC University of São ... verbs. 1 Introduction Word Sense Disambiguation (WSD) is concerned with the identification of the meaning of ambiguous words in context. For example, among the possible senses...

Uploaded: 08/03/2014, 02:21

Scientific report: "Learning Semantic Classes for Word Sense Disambiguation"

... data for each word. This can be done because the semantic classes are common to words unlike senses; for learning the properties of a given class, we can use the data from various words. For instance, ... parts of speech for a window of n words to both sides of word (excluding the word 1 Validation results showed that a window of two words to both sides yields the best performan...

Uploaded: 31/03/2014, 03:20

Scientific report: "Exploiting Parallel Texts for Word Sense Disambiguation: An Empirical Study"

... problem is particularly severe for WSD, since sense-tagged data must be collected separately for each word in a language. One source to look for potential training data for WSD is parallel texts, ... sample task of SENSEVAL-2. We rely on two sources to decide on the sense classes of w: (i) The sense definitions in WordNet 1.7, which lists seven senses for the noun cha...

Uploaded: 08/03/2014, 04:22

Scientific report: "Similarity-Based Methods For Word Sense Disambiguation"

... long. However, a word is therefore modeled by the average behavior of many words, which may cause the given word's idiosyncrasies to be ignored. For instance, the word "red" ... scheme for deciding which word pairs require a similarity-based estimate, a method for combining information from similar words, and, of course, a function measuring the similar...

Uploaded: 31/03/2014, 21:20

Scientific report: "Flow Network Models for Word Alignment and Terminology Extraction from Bilingual Corpora"

... and French words, an empty English word, and an empty French word, • E comprises edges from the source to all the English words (including the empty one), edges from all the French words (including ... 2 The empty words account for the fact that words may not be aligned with other ones, i.e. they are not explicitly translated for example. • from the source to the empty Engl...

Uploaded: 17/03/2014, 07:20

Scientific report: "Learning PP attachment for filtering prosodic phrasing"

... information should be used as features in training data: (i) lexical features (e.g. unigrams and bigrams of head words), and (ii) word cooccurrence strength values (the probability that two words ... of the involved phrases, as well as for combinations of these words. Cooccurrence strength values may provide additional clues to informational ties among words; when we investigate the...

Uploaded: 24/03/2014, 03:20

Scientific report: "Exemplar-Based Models for Word Meaning In Context"

... features for all the senses of the target. For example, among the top 20 features for coach, we get match and team (for the “trainer” sense) as well as driver and car (for the “bus” sense). This ... a fundamental problem for distributional models. Typically, distributional models compute a single “type” vector for a target word, which contains cooccurrence counts for...

Uploaded: 30/03/2014, 21:20

Scientific report: "Log-linear Models for Word Alignment"

... Model 5 training. For log-linear models, POS information and an additional dictionary are used, which is not the case for GIZA++/IBM models. However, treated as a method for performing symmetrization, ... features. Our experiments show that log-linear models significantly outperform IBM translation models. We begin by describing log-linear models for word alignment. The desig...

Uploaded: 31/03/2014, 03:20

Scientific report: "A STOCHASTIC PROCESS FOR WORD FREQUENCY DISTRIBUTIONS"

... relations among words in lexical distributions. These empirical similarity relations, as observed for large corpora of words, impose additional criteria on the adequacy of models for word frequency ... (1975), have been put forward, all of which have Zipf's law as some special or limiting form. Unrelated to Zipf's law is the lognormal hypothesis, advanced for wor...

Uploaded: 08/03/2014, 07:20

Scientific report: "Exploring Entity Relations for Named Entity Disambiguation"

... challenges: Surface forms in text can be ambiguous, and the same entity can be referred to by different surface forms. For example, the surface form “George Bush” may denote either of two former U.S. ... a method for candidate selection that is based on an inverted index of surface forms and entities (Section 3.2). Instead of a bag-of-words approach we use co-occurring NEs in text...

Uploaded: 23/03/2014, 16:20
