Scientific report: "Distributional Similarity vs. PU Learning for Entity Set Expansion" doc

... 11-16 July 2010. © 2010 Association for Computational Linguistics Distributional Similarity vs. PU Learning for Entity Set Expansion Xiao-Li Li Institute for Infocomm Research, 1 Fusionopolis ... distributional similarity is a classic technique for entity set expansion, this paper showed that PU learning performs considerably better on our diverse corpo...

Uploaded: 30/03/2014, 21:20

Scientific report document: "Distributional Similarity Models: Clustering Neighbors" doc

... to train k. Then, the performance of each similarity-based model was evaluated on the test triples for a sequence of settings for k. We expected that clustering performance with respect to ... train-test splits. For each split, test triples were created from the held-out test set. Each model used the training set to calculate all basic quantities (e.g., p(v|n) for each v...

Uploaded: 20/02/2014, 18:20

Scientific report document: "A Maximum Expected Utility Framework for Binary Sequence Labeling" doc

... Association of Computational Linguistics, pages 736–743, Prague, Czech Republic, June 2007. © 2007 Association for Computational Linguistics A Maximum Expected Utility Framework for Binary Sequence ... affiliation: Google Inc. Former affiliation: Center of Computational Learning Systems, Columbia University. the prediction or decoding task. In general, decoding can be a hard computationa...

Uploaded: 20/02/2014, 12:20

Scientific report document: "GEMINI: A NATURAL LANGUAGE SYSTEM FOR SPOKEN-LANGUAGE UNDERSTANDING*" doc

... constituent parser to populate a chart with edges containing syntactic, semantic, and logical form information. Then, a second utterance parser is used to apply a second set of syntactic and ... prediction for nongapped categories. This limited form of left-context constraint greatly reduces the total number of edges built for a very low overhead. In the 5875-utterance train...

Uploaded: 20/02/2014, 21:20

Scientific report: KCTD5, a putative substrate adaptor for cullin3 ubiquitin ligases docx

... domain. As the BTB domain is responsible for homo-oligomerization in BTB proteins [3], we addressed whether KCTD5 might form homo-oligomers. For this purpose, HEK293 cells were transfected with ... Journal compilation © 2008 FEBS KCTD5, a putative substrate adaptor for cullin3 ubiquitin ligases Yolanda Bayón 1, Antonio G. Trinidad 1, María L. de la Puerta 1, María del Carmen...

Uploaded: 07/03/2014, 06:20

Scientific report: "Domain Adaptation with Active Learning for Word Sense Disambiguation" pdf

... Association of Computational Linguistics, pages 49–56, Prague, Czech Republic, June 2007. © 2007 Association for Computational Linguistics Domain Adaptation with Active Learning for Word Sense ... active learning for domain adaptation for WSD. A similar work is the recent research by Chen et al. (2006), where active learning was used successfully to reduce the annotation effort...

Uploaded: 08/03/2014, 02:21

Scientific report: "SenseRelate::TargetWord – A Generalized Framework for Word Sense Disambiguation" doc

... disambiguation needs. 2.1 Format Filter The filter takes as input file(s) annotated in the SENSEVAL-2 lexical sample format, which is an XML-based format that has been used for both the SENSEVAL-2 ... text preprocessing modules, each of which performs a transformation on the input words. For example, the Compound Detection Module identifies sequences of tokens that form compound words that...

Uploaded: 08/03/2014, 04:22

Scientific report: "Classifying Biological Full-Text Articles for Multi-Database Curation" doc

... representation to the set doesn’t improve the cross-validation performance. For classifying the documents with better features, we run the algorithm twice. We first start with an empty set and obtain ... performance, which implies that the inferior ones may contain important exclusive information. The cross-validation performance fairly predicts the performance on the test data,...

Uploaded: 08/03/2014, 21:20

Scientific report: "On-line Language Model Biasing for Statistical Machine Translation" docx

... Work Existing methods for target LM biasing for SMT rely on information retrieval to select a comparable subset from the training corpus. A foreground LM estimated from this subset is interpolated ... exists in LM adaptation for SMT. Snover et al. (2008) used a cross-lingual information retrieval (CLIR) system to select a subset of target documents “comparable” to the source documen...

Uploaded: 17/03/2014, 00:20

Scientific report: "A Morphologically Sensitive Clustering Algorithm for Identifying Arabic Roots" docx

... shows the results of the Two-stage algorithm for our data sets. The maximally effective cut-off point for all sets lies closer. Figures for the first set have to be treated with caution. The perfect ... phenomena on which we wanted to concentrate. Table 8: Two-stage Algorithm Test Results Data set Set 1 Set 2 Set 3 Set 4 Set 5 Benchmark: Total Manual Clusters (A) 9 267 337 1...

Uploaded: 17/03/2014, 07:20
