bagging and distributional similarity

Scientific report: "Reducing semantic drift with bagging and distributional similarity" pdf

... ACL and the 4th IJCNLP of the AFNLP, pages 396–404, Suntec, Singapore, 2-7 August 2009. © 2009 ACL and AFNLP Reducing semantic drift with bagging and distributional similarity Tara McIntosh and ... and WMEB using just the hand-picked seeds (S_hand) and 50 sample supervised bagging (S_gold BAG). Bagging with samples from S_gold successfully increased the performance of both BASILISK and WMEB ... L_hand to sample from and then another round with the 50 sets of randomly unsupervised seeds, S_rand. The next decision is how to sample S_rand from L_hand. One approach is to use uniform random sampling...

Uploaded: 08/03/2014, 00:20

9 339 0
Document: Scientific report: "Finding Synonyms Using Automatic Word Alignment and Measures of Distributional Similarity" pdf

... correct alignments and this causes many mistakes in the distributional similarity algorithm. We have given some examples in rows 4 and 5 of table 5. We have used the distributional similarity score only ... Alignment and Measures of Distributional Similarity Lonneke van der Plas & Jörg Tiedemann Alfa-Informatica University of Groningen P.O. Box 716 9700 AS Groningen The Netherlands {vdplas,tiedeman}@let.rug.nl Abstract There ... using measures of distributional similarity, but these typically are not able to distinguish between synonyms and other types of semantically related words such as antonyms, (co)hyponyms and hypernyms. We...

Uploaded: 20/02/2014, 12:20

8 516 0
Scientific report: "Word classification based on combined measures of distributional and semantic similarity" docx

... combined and the distributional weighting schemas. The combined weighting schema thus showed relative improvement on the distributional one: 1.5% (BNC) and 2.3% (AP) in terms of precision and 9.2% ... each weighted by the distributional similarity of the neighbor to the test word. Figure 3 compares the precision and learning accuracy of the combined weighting schema to the distributional weighting. ... semantic similarity to other classes. Besides distributional data, our method integrates this semantic information: the classification decision is a function of both (1) the distributional similarity of...

Uploaded: 08/03/2014, 21:20

4 345 0
Scientific report: "Finding Word Substitutions Using a Distributional Similarity Baseline and Immediate Context Overlap" potx

... (see Geffet and Dagan, 2004 and Geffet and Dagan, 2005, who improve the output of a distributional similarity system for an entailment task using a web-based feature inclusion check, and comment ... ‘murder’ and ‘abduct’ kill murder abduct two birds with babies that life her and make cancer cells and his wife and an innocent man a mocking bird thousands of innocent unsuspecting people and or ... Geffet and Ido Dagan. 2004. Feature Vector Quality and Distributional Similarity. Proceedings of the 20th International Conference on Computational Linguistics, 2004. Maayan Geffet and Ido...

Uploaded: 08/03/2014, 21:20

9 248 0
Document: Scientific report: "Measures of Distributional Similarity" ppt

... Speech and Language, 9:123-152. Ido Dagan, Lillian Lee, and Fernando Pereira. 1999. Similarity-based models of cooccurrence probabilities. Machine Learning, 34(1-3):43-69. Ute Essen and ... Thomas M. Cover and Joy A. Thomas. 1991. Elements of Information Theory. John Wiley. Ido Dagan, Shaul Marcus, and Shaul Markovitch. 1995. Contextual word similarity and estimation from ... nando Pereira, and Stuart Shieber for helpful discussions, the anonymous reviewers for their insightful comments, Fernando Pereira for access to computational resources at AT&T, and...

Uploaded: 20/02/2014, 18:20

8 338 0
Document: Scientific report: "Distributional Similarity Models: Clustering Neighbors" doc

... 1995. Contextual word similarity and estimation from sparse data. Computer Speech and Language, 9:123-152. Ido Dagan, Lillian Lee, and Fernando Pereira. 1999. Similarity-based models ... parison of distributional clustering and nearest-neighbors averaging on several large datasets, exploring the tradeoff in similarity-based modeling between memory usage on the one hand and estimation ... Douglas Baker and Andrew Kachites McCallum. 1998. Distributional clustering of words for text classification. In 21st Annual International ACM SIGIR Conference on Research and Development...

Uploaded: 20/02/2014, 18:20

8 268 0
Scientific report: "Syntactic Features and Word Similarity for Supervised Metonymy Resolution" pot

... [Figure 1: Context reduction and similarity levels] ... draw this inference, two levels of similarity need to be taken into account. One concerns the similarity of the words ... both the similarity of the heads in the grammatical relation (e.g., “win” and “lose”) and that of the grammatical role (e.g. subject). Figure 1 illustrates context reduction and similarity...

Uploaded: 08/03/2014, 04:22

8 603 0
Scientific report: "Using lexical and relational similarity to classify semantic relations" pptx

... distinct types of word pair similarity: lexical similarity and relational similarity. We present an efficient and flexible technique for implementing relational similarity and show the effectiveness ... relational similarity but not both in combination. Previously proposed lexical models include the WordNet-based methods of Kim and Baldwin (2005) and Girju et al. (2005), and the distributional ... co-occurrence probability vectors for w_1 and w_2. Taking k_jsd as a measure of word similarity and introducing parameters α and β to scale the contributions of w_1 and w_2 respectively, we retrieve...

Uploaded: 08/03/2014, 21:20

9 416 0
Scientific report: "Syntax is from Mars while Semantics from Venus! Insights from Spectral Analysis of Distributional Similarity Networks" ppt

... and intriguing question, whereby we construct the syntactic and semantic distributional similarity network (DSN) and analyze their spectrum to understand their global topology. We observe that there ... commonalities and differences between the syntactic and semantic distributional patterns of the words of a language? This study is an initial attempt to answer this fundamental and intriguing ... popular, visualization of distributional similarity is through graphs or networks, where each word is represented as nodes and weighted edges indicate the extent of distributional similarity...

Uploaded: 17/03/2014, 02:20

4 250 0
Scientific report: "Exploring Distributional Similarity Based Models for Query Spelling Correction" docx

... and describe two methods that can make use of distributional similarity information in Section 3. Experiments and results are presented in Section 4. The last section contains summaries and ... of valid words in certain contexts (Golding and Roth, 1996; Mangu and Brill, 1997). Distributional similarity between words has been investigated and successfully applied in many natural language ... 1998) and language model smoothing (Essen and Steinbiss, 1992; Dagan et al., 1997). An investigation on distributional similarity functions can be found in (Lillian Lee, 1999). 3 Distributional...

Uploaded: 17/03/2014, 04:20

8 309 0
Document: Scientific report: "Word Vectors and Two Kinds of Similarity" pptx

... similarity into two categories: taxonomic similarity and associative similarity. Taxonomic similarity, or categorical similarity, is a kind of semantic similarity between words in the same level ... LSA-based, cooccurrence-based and dictionary-based methods, were compared in terms of the ability to represent two kinds of similarity, i.e., taxonomic similarity and associative similarity. The result ... addresses three methods, LSA-based, cooccurrence-based, and dictionary-based methods, and two kinds of similarity, taxonomic similarity and associative similarity. Word vectors constructed...

Uploaded: 20/02/2014, 12:20

8 473 0
Document: Scientific report: "The Distributional Inclusion Hypotheses and Lexical Entailment" pdf

... 1994; Lee, 1997; Lin, 1998; Pantel and Lin, 2002; Weeds and Weir, 2003). As it turns out, distributional similarity captures a somewhat loose notion of semantic similarity (see Table 1). It does ... Southampton, U.K. Geffet, Maayan and Ido Dagan, 2004. Feature Vector Quality and Distributional Similarity. In Proc. of Coling-04. Geneva, Switzerland. Grefenstette, Gregory. 1994. ... using the filter, with 20 and 40 feature sampling, compared to RFF top-40 and RFF top-26 similarities. ITA-20 and ITA-40 denote the web-sampling method with 20 and random 40 features, respectively....

Uploaded: 20/02/2014, 15:20

8 432 0
Document: Scientific report: "Japanese OCR Error Correction using Character Shape Similarity and Statistical Language Model" pptx

... quality photocopies, and faxes are still difficult to process and cause many errors. The accuracy of handwritten OCR is still about 90% (Hildebrandt and Liu, 1993), and it worsens dramatically ... 91% for magazines and introductory textbooks of science and technology. (Ito and Maruyama, 1992) used part of speech bigram model and beam search in order to get multiple candidates in their ... al., 1991; Golding and Schabes, 1996). Similar techniques are used for correcting the output of English OCRs (Tong and Evans, 1996) and English speech recognizers (Ringger and Allen, 1996)....

Uploaded: 20/02/2014, 18:20

7 472 0
Scientific report: "Learning to Grade Short Answer Questions using Semantic Similarity Measures and Dependency Graph Alignments" ppt

... grammaticality, and coherence of the essay (Higgins et al., 2004), and the assessment of short student answers (Leacock and Chodorow, 2003; Pulman and Sukkarieh, 2005; Mohler and Mihalcea, 2009), ... & St. Onge (1998) [HSO], and two corpus-based measures: Latent Semantic Analysis [LSA] (Landauer and Dumais, 1997) and Explicit Semantic Analysis [ESA] (Gabrilovich and Markovitch, 2007). Briefly, ... MA. G. Hirst and D. St-Onge, 1998. Lexical chains as representations of contexts for the detection and correction of malapropisms. The MIT Press. J. Jiang and D. Conrath. 1997. Semantic similarity...

Uploaded: 07/03/2014, 22:20

11 478 0
Scientific report: "CONTEXTUAL WORD SIMILARITY AND ESTIMATION FROM SPARSE DATA" ppt

... SIGIR. Fernando Pereira, Naftali Tishby, and Lillian Lee. 1993. Distributional clustering of English words. In Proc. of the Annual Meeting of the ACL. Philip Resnik. 1992. Wordnet and distributional ... using frequency information (Good, 1953; Katz, 1987; Jelinek and Mercer, 1985; Church and Gale, 1991). Church and Gale (Church and Gale, 1991) show that for unobserved bigrams, the estimates ... 150 pairs, were constructed randomly and were restricted to words with individual frequencies between 500 and 2500. We term these two sets as the occurring and non-occurring sets. The...

Uploaded: 08/03/2014, 07:20

8 334 0
