... For comparison to previous work, all term candidates are extracted from the same domain corpora using the delimiter-based algorithm TCE_DI (Term Candidate Extraction – Delimiter Identification) ... it is likely a term. Limited distribution information of term candidates in different documents often limits the ability of such algorithms to distinguish terms from non-...
Uploaded: 08/03/2014, 01:20
... content within a certain period and/or from a certain group of people such as people in the same region. Existing work on keyphrase extraction identifies keyphrases from either individual documents ... proposed methods are very effective in topical keyphrase extraction from Twitter. Interestingly, our proposed keyphrase ranking method can incorporate users’ interests by modeling th...
Uploaded: 17/03/2014, 00:20
Document - Scientific report: "Extracting Comparative Sentences from Korean Text Documents Using Comparative Lexical Patterns and Machine Learning Techniques" doc
... non-comparative sentences from comparative sentence candidates with a CKL2 keyword, we employ machine learning techniques (MEM and Naïve Bayes). For feature extraction from each comparative-sentence ... eliminate non-comparative sentences from the candidates. As a result, we achieved significant performance, an F1-score of 88.54%, in our experiments using various web docu...
Uploaded: 20/02/2014, 09:20
Document - Scientific report: "INSIDE-OUTSIDE REESTIMATION FROM PARTIALLY BRACKETED CORPORA" ppt
... ( (from SF0) (to San Francisco))))).) GR (Tell ((me (((about the) public) transportation)) ( (from SF0) ((to San) (Francisco .))))) GB ((Tell (me (about (((the public) transportation) ( (from ... of local maxima grows with the number of nonterminals. Finally, while SCFGs do provide a hierarchical model of the language, that structure is undetermined by raw text and only by chance...
Uploaded: 20/02/2014, 21:20
Scientific report: "Mining Entity Types from Query Logs via User Intent Modeling" pdf
... inference procedures using signals from query context, click, entity, entity type, and user intent. • We propose an efficient learning technique and a robust implementation of our models, using real-world ... place of (or in addition to) text corpora for learning semantic classes. Query logs can contain billions of entries, they provide an independent signal from text corpor...
Uploaded: 07/03/2014, 18:20
Scientific report: "Prototyping virtual instructors from human-human corpora" pdf
... ranging from trainers in simulated worlds to non-player characters for virtual games. In this paper we present a novel algorithm for rapidly prototyping virtual instructors from human-human corpora ... 2011. © 2011 Association for Computational Linguistics Prototyping virtual instructors from human-human corpora Luciana Benotti PLN Group, FAMAF National University of Córdoba C...
Uploaded: 07/03/2014, 22:20
Document - Scientific report: "A Bootstrapping Approach to Named Entity Classification Using Successive Learners" pdf
... explored using this method. There is considerable research on NE tagging using different techniques. These include systems based on handcrafted rules (Krupka 1998), as well as systems using ... approach from the co-training-based NE bootstrapping are also discussed. 1 Introduction Named Entity (NE) tagging is a fundamental task for natural language processing and information...
Uploaded: 20/02/2014, 16:20
Scientific report: Human recombinant prolidase from eukaryotic and prokaryotic sources: Expression, purification, characterization and long-term stability studies pptx
... digestion performed using the same conditions. N-Terminal sequence The sequence of the N-terminal 25 amino acids of the recombinant prolidase purified from E. coli was unequivocally determined by automated ... properties mainly indistinguishable from those of the native prolidase from fibroblast lysate. The protein yield was higher from the prokaryotic source, and a detailed long-ter...
Uploaded: 07/03/2014, 11:20
Scientific report: "Learning Common Grammar from Multilingual Corpus" potx
... rewrite a nonterminal as a terminal A → w, and binary productions rewrite a nonterminal as two nonterminals A → BC, where A, B, C ∈ K and w ∈ W_l. The rule probabilities for each nonterminal A ... where K is a set of nonterminals, W_l is a set of terminals, and Φ_l is a set of rule probabilities. Note that a set of nonterminals K is shared among languages, but a set of terminals W_l an...
Uploaded: 07/03/2014, 22:20
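The grammar formalism quoted in the excerpt above (a nonterminal set K shared across languages, with a terminal set and rule probabilities kept per language) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `LanguageGrammar` and `check_grammar`, the toy rules, and the requirement that each nonterminal's rule probabilities sum to 1 (the usual PCFG constraint, which the truncated excerpt does not state) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LanguageGrammar:
    """Per-language part of the grammar (hypothetical container).

    terminals: the terminal set W_l of language l
    unary:     (A, w) -> probability of the unary production A -> w
    binary:    (A, B, C) -> probability of the binary production A -> B C
    """
    terminals: set
    unary: dict = field(default_factory=dict)
    binary: dict = field(default_factory=dict)


def check_grammar(nonterminals, grammar, tol=1e-9):
    """Return True if every rule uses known symbols and, for each
    nonterminal that has rules, the probabilities sum to 1 (assumed)."""
    totals = {}
    for (a, w), p in grammar.unary.items():
        if a not in nonterminals or w not in grammar.terminals:
            return False
        totals[a] = totals.get(a, 0.0) + p
    for (a, b, c), p in grammar.binary.items():
        if not {a, b, c} <= nonterminals:
            return False
        totals[a] = totals.get(a, 0.0) + p
    return all(abs(t - 1.0) <= tol for t in totals.values())


# Shared nonterminal set K; the two toy languages differ only in their
# terminals and rule probabilities.
K = {"S", "NP", "VP"}
english = LanguageGrammar(
    terminals={"dogs", "bark"},
    unary={("NP", "dogs"): 1.0, ("VP", "bark"): 1.0},
    binary={("S", "NP", "VP"): 1.0},
)
german = LanguageGrammar(
    terminals={"Hunde", "bellen"},
    unary={("NP", "Hunde"): 1.0, ("VP", "bellen"): 1.0},
    binary={("S", "NP", "VP"): 1.0},
)

print(check_grammar(K, english), check_grammar(K, german))
```

Keeping K outside the per-language objects mirrors the excerpt's point that the nonterminals are common to all languages while the terminals and probabilities are not.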
Scientific report: "Accurate Collocation Extraction Using a Multilingual Parser" docx
... collocational information from corpora by using a syntactic parser that supports several languages. After describing the underlying methodology (section 2), we report several extraction results ... recursively. Generally speaking, a collocation extraction can be seen as a two-stage process: I. in stage one, collocation candidates are identified from the text corpora, based...
Uploaded: 17/03/2014, 04:20