integrating multiple knowledge sources to disambiguate word senses

Document - Scientific report: "Using Multiple Sources to Construct a Sentiment Sensitive Thesaurus for Cross-Domain Sentiment Classification" doc

... automatic method to create a thesaurus that is sensitive to the sentiment of words expressed in different domains. • We describe a method to use the created thesaurus to expand feature vectors ... vector d ∈ R^N, where the value of the j-th element d_j is set to the total number of occurrences of the unigram or bigram w_j in the review d. To find the suitable candidates to expand a vector ... excellent and delicious are positive sentiment words, then we can use this knowledge to expand a feature vector that contains the word delicious using the word excellent, thereby reducing the mismatch...

Uploaded: 20/02/2014, 04:20

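The feature-vector description in the snippet above can be illustrated with a minimal sketch. It assumes whitespace tokenization, a sparse dictionary in place of a dense vector in R^N, and a fixed expansion weight of 0.5; the names `count_features`, `expand`, and `toy_thesaurus` are hypothetical, and the snippet does not show how the paper itself scores expansion candidates.

```python
from collections import Counter


def count_features(tokens):
    """Count every unigram and bigram occurring in a tokenized review."""
    unigrams = list(tokens)
    bigrams = [f"{a} {b}" for a, b in zip(tokens, tokens[1:])]
    return Counter(unigrams + bigrams)


def expand(features, thesaurus, weight=0.5):
    """Add thesaurus neighbours of each present feature at a reduced weight.

    `weight` is an illustrative assumption, not a value from the paper.
    """
    expanded = dict(features)
    for feature, count in features.items():
        for neighbour in thesaurus.get(feature, []):
            if neighbour not in features:
                expanded[neighbour] = expanded.get(neighbour, 0.0) + weight * count
    return expanded


review = "the soup was delicious".split()
toy_thesaurus = {"delicious": ["excellent"]}  # hypothetical thesaurus entry
vector = expand(count_features(review), toy_thesaurus)
```

Here the review never mentions "excellent", yet the expanded vector gains a weighted entry for it, which is the mismatch-reducing effect the snippet describes.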
Viral copy - How to trade word for traffic

... marketer’s job to tell people a story they want to hear. It’s up to you whether your story is a complete fabrication. I tend to lean aggressively toward complete ... about a niche topic that wants to increase traffic, use humor to their benefit? The first r good sense of humor is a wonderful asset, it’s all too easy to offend those you’re trying to attract. ... reference to dropping turkeys out of a helicopter. Others got it and smiled. According to Wikipedia, a scoop: is a colloquial term to refer to a news story (especially...

Uploaded: 27/01/2014, 11:07

Document - PEPFAR Guidance on Integrating Prevention of Mother to Child Transmission of HIV, Maternal, Neonatal, and Child Health and Pediatric HIV Services pdf

... relationships and bringing all available resources and agents to the table to find solutions and forge partnerships in order to procure all elements essential to a high quality, comprehensive, integrated ... strengthens human resources capacity in two sectors for a marginal cost increase, as demonstrated in Haiti with syphilis and HIV testing training. Integrating new HIV services into the existing ... Relief (PEPFAR) to use limited resources to leverage other key programs and strengthen the MNCH platform in each PEPFAR country through Partnership Frameworks. In so doing, PEPFAR aims to strengthen...

Uploaded: 12/02/2014, 19:20

Document - Effect modification of air pollution on Urinary 8-Hydroxy-2’-Deoxyguanosine by genotypes: an application of the multiple testing procedure to identify significant SNP interactions ppt

... glutathione to numerous potentially genotoxic compounds [50]. Individuals with the deletion of GSTM1 or GSTT1 have been shown to reduce GST activity and thus may be unable to eliminate toxins as ... examinations, laboratory tests, collection of medical history, social status information, and administration of questionnaires on smoking history, food intake and other factors that may influence ... MALDI-TOF mass spectrometer (Sequenom, CA, USA) with semiautomated primer design (SpectroDESIGNER, Sequenom) and implementation of the very short extension method [45]. Assays failing to multiplex...

Uploaded: 17/02/2014, 22:20

Document - Scientific report: "An Equivalent Pseudoword Solution to Chinese Word Sense Disambiguation" ppt

... few words or only one word, which is called an atom word group, an atom class or an atom node. The words in the same atom node hold the smallest semantic distance. From the root node to ... revised to meet this demand. Extended Version of TongYiCiCiLin To extend the TongYiCiCiLin (Cilin) to hold more words, several linguistic resources are adopted for manually adding new words. ... ambiguous word need to simulate the function of the real ambiguous word, and to acquire semantic knowledge as the real ambiguous word does. Thus, we call it an equivalent pseudoword (EP)...

Uploaded: 20/02/2014, 12:20

Document - Scientific report: "Finding Predominant Word Senses in Untagged Text" pptx

... (hereafter referred to as the WordNet similarity measure) to weight the contribution that each neighbour makes to the various senses of the target word. To find the first sense of a word ( ) we take ... BNC, but the first senses of words like division and goal shift towards the more specific senses (league and score respectively). Moreover, the chosen senses of the word tie proved to be a textbook ... data to automatically find a predominant sense for nouns in WordNet. We use an automatically acquired thesaurus and a WordNet Similarity measure. The automatically acquired predominant senses...

Uploaded: 20/02/2014, 16:20

Document - Scientific report: "Learning Word Senses With Feature Selection and Order Identification Capabilities" pdf

... 2003). The solution of word sense learning is closely related to the interpretation of word senses. Different interpretations of word senses result in different solutions to word sense learning. One ... manually compiled lexical resources. However these lexical resources often miss domain specific word senses, even many new words are not included inside. Learning word senses from free text will ... corresponds to word w_j, then the entry specified by i-th row and j-th column records the number of times that word w_i occurs close to w_j in corpus. We use v(w_i) to represent the word vector of...

Uploaded: 20/02/2014, 16:20

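The co-occurrence matrix mentioned in the snippet above can be sketched as follows. The snippet only says that w_i occurs "close to" w_j, so the symmetric window of one token on each side is an assumption, and the names `cooccurrence` and `word_vector` are hypothetical rather than taken from the paper.

```python
from collections import defaultdict


def cooccurrence(tokens, window=2):
    """Count how often each word occurs within `window` tokens of another."""
    counts = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[word][tokens[j]] += 1
    return counts


def word_vector(counts, word, vocabulary):
    """Return the row v(w_i) of the matrix in a fixed vocabulary order."""
    return [counts[word][other] for other in vocabulary]


corpus = "the bank raised the interest rate".split()
counts = cooccurrence(corpus, window=1)  # window size is an assumption
vocabulary = sorted(set(corpus))
bank_vector = word_vector(counts, "bank", vocabulary)
```

Each row of the resulting table plays the role of v(w_i); a real corpus would call for a sparse matrix rather than nested dictionaries.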
Document - Scientific report: "Discovering Corpus-Specific Word Senses" pot

... certain context. This gives rise to an automatic, unsupervised word sense disambiguation algorithm which is trained on the data to be disambiguated. The ability to map senses into a taxonomy using the ... any set of features. Simple cutoff functions proved unsatisfactory because of the bias they give to more frequent words. Instead we link each word to its top n neighbors where n can be determined by ... to automatically construct corpus-based taxonomies or to tune existing ones. The same corpus evidence which supports a clustering of an ambiguous word into distinct senses can be used to...

Uploaded: 22/02/2014, 02:20

Document - Scientific report: "Using Cycles and Quasi-Cycles to Disambiguate Dictionary Glosses" pdf

... Senseval- 5 Recently, Princeton University released a richer corpus of disambiguated glosses, namely the “Princeton WordNet Gloss Corpus” (http://wordnet.princeton.edu). However, in order to allow for a comparison ... randomly selecting 1,000 word senses from the dictionary and annotating the content words in their glosses according to the dictionary sense inventory. Overall, 2,678 words were sense tagged. The ... part-of-speech-tagged ambiguous content words in the gloss of sense s from our reference dictionary. WordNet. When using WordNet as a reference resource, given a sense s whose gloss we aim to disambiguate, the dictionary...

Uploaded: 22/02/2014, 02:20

Scientific report: "Exploring Deterministic Constraints: From a Constrained English POS Tagger to an Efficient ILP Solution to Chinese Word Segmentation" ppt

... to a typical character sequence. 3.4 Character- and word-based features As studied in previous work, word-based feature templates usually include the word itself, sub-words contained in the word, ... features are incorporated into word-based CWS models, some word-based features are no longer of interest, such as the starting character of a word, sub-words contained in the word, contextual characters ... i.e. all tag types that are assigned to the word in training data. Furthermore, we approximate unknown words in testing data by rare words in training data. For a word that occurs less than 5 times...

Uploaded: 07/03/2014, 18:20

Scientific report: "Investigations on Word Senses and Word Usages" ppt

... annotators, rather than training annotators to conform to a common sense distinction guideline. By asking annotators to provide ratings for each individual sense, we strive to eliminate all bias towards ... data. WSsim is a word sense annotation task using WordNet senses. Unlike previous word sense annotation projects, we asked annotators to provide judgments on the applicability of every WordNet sense ... should be possible to use existing sense-annotated data to explore this question: almost all sense annotation efforts have allowed annotators to assign multiple senses to a single occurrence,...

Uploaded: 08/03/2014, 00:20

Scientific report: "Using Syntax to Disambiguate Explicit Discourse Connectives in Text" pot

... related because of their timing). These top-level discourse relation senses are general enough to be annotated with high inter-annotator agreement and are common to most theories of discourse. 2.2 ... tree which dominates the words in the connective but nothing else. For single word connectives, this might correspond to the POS tag of the word, however for multi-word connectives it will ... 2008. Using automatically labelled examples to classify rhetorical relations: An assessment. Natural Language Engineering, 14:369–416. B. Wellner and J. Pustejovsky. 2007. Automatically identifying...

Uploaded: 08/03/2014, 01:20

Scientific report: "Large Scale Collocation Data and Their Application to Japanese Word Processor Technology" potx

... Kana-to-Kanji conversion system consist of two kinds: (1) idiomatic expressions, whose meanings seem to be difficult to compose from the typical meaning of the individual component words ... results of e-bunsetsu-segmentation: , hitoh.a/kigqkikunikositagotol, taarimasen (there is nothing like being watchful) hitohdv'Mga/Idkimi/ko3itcv;kotoha/arimasen In the above examples, ... its evaluation by the cost. 3.1 Prototype System A We first developed a prototype Kana-to-Kanji conversion system which we call System A, revising Kana-to-Kanji conversion software on the...

Uploaded: 08/03/2014, 05:21

Scientific report: "Lexical Semantics to Disambiguate Polysemous Phenomena of Japanese Adnominal Constituents" potx

... adjectives to consider what its head noun denotes in the sentence (Bouillon, 1996). Also, when we analyze word meanings, it is important to take both context and our world knowledge into account ... adverbial form should apply to the semantics of the common noun, 494 Lexical Semantics to Disambiguate Polysemous Phenomena of Japanese Adnominal Constituents Hitoshi Isahara and Kyoko Kanzaki ... definition which can contain/represent/embody/refer to various items. (b) Fi~IC~Z (junsui_na, pure)J works to constrain this number to one. Extending the Generative Lexicon format, some-...

Uploaded: 08/03/2014, 06:20

Scientific report: "A STRUCTURED REPRESENTATION OF WORD-SENSES FOR SEMANTIC ANALYSIS" pdf

... short term objective is to enlarge the dictionary to 1000 words. A concept editor has been developed to facilitate this task. The editor also allows to visualize, for each word-sense, a list ... approach to represent word-senses. As discussed later, the latter seems not to provide sufficient information to analyze most trivial sentences. To make a clear distinction between word-sense ... word-sense definition really includes some other; each word has its own specific uses and only partially overlaps with other words. The conclusion is that it is not possible to arrange word-senses...

Uploaded: 09/03/2014, 01:20