Scientific report: "Building trainable taggers in a web-based, UIMA-supported NLP workbench" potx

... Computational Linguistics Building trainable taggers in a web-based, UIMA-supported NLP workbench Rafal Rak, BalaKrishna Kolluru and Sophia Ananiadou National Centre for Text Mining School of Computer ... build a statistical model based on manually or semi-automatically tagged sample data and then to tag new data using this model. Since the machine learning algorithms for...

Uploaded: 16/03/2014, 20:20

Scientific report document: "Liars and Saviors in a Sentiment Annotated Corpus of Comments to Political debates" pdf

... opinions? Finally, approaches to opinion mining have implicitly assumed that the problem at stake is a balanced classification problem, based on the general assumption that positive and negative ... identifying the most relevant challenges in mining opinions targeting media personalities, namely politicians, in comments posted by users to online news articles. We are interest...

Uploaded: 20/02/2014, 05:20

Scientific report document: "Identifying Linguistic Structure in a Quantitative Analysis of Dialect Pronunciation" docx

... (Manning and Schütze, 1999). Multidimensional scaling is a data analysis technique that provides a spatial display of the data, revealing relationships between the instances in the data set (Davison, ... multidimensional scaling. The analyses showed that results obtained using aggregate analysis of word pronunciations mostly conform with the traditional phonetic classification of Bulgaria...

Uploaded: 20/02/2014, 12:20

Scientific report document: "Mapping Lexical Entries in a Verbs Database to WordNet Senses" doc

... (WSD) in several ways. First, the words to be disambiguated are entries in a lexical database, not tokens in a text corpus. Second, we take an “all-words” rather than a “lexical-sample” approach ... SimpleProd, and SimpleWtdSum MajSgl+Aggr: Majority vote of MajSimpleSgl and MajAggr MajPair+Aggr: Majority vote of MajSimplePair and MajAggr Table 2 gives recall and precision mea...

Uploaded: 20/02/2014, 18:20

Scientific report document: "Multiple Default Inheritance in a Unification-Based" pdf

... to alternatives at the same conceptual level in the hierarchy, and in many cases reflect the traditional idea of 'paradigm'. Equations within a variant set are absolute constraints, ... of class definitions. In equations, unification variables have initial capitals, and negation of constants is indicated by ' '. 'kk' is the string concatenation o...

Uploaded: 20/02/2014, 21:20

Scientific report document: "Domain-transcending mappings in a system for metaphorical reasoning" docx

... SPACE and IDEAS ARE PHYSICAL OBJECTS, containing the following relevant mappings: When a person's mind is being viewed as a physical space, an idea's being physically located in the space ... view neutral mapping adjuncts (VNMAs). We regard VNMAs as standard but implicit default aspects of all view-specific metaphorical mappings. They are defaults in that they can, i...

Uploaded: 22/02/2014, 02:20

Scientific report: "Unsupervised Coreference Resolution in a Nonparametric Bayesian Model" potx

... Note that we take a Bayesian approach in which all parameters are integrated out (or sampled). The inference task is thus primarily a search problem over the index labels Z. ... is a natural penalty for each cluster which is actually used. With Z observed during sampling, we can integrate out β and calculate $P(Z_{i,j} \mid Z_{-i,j})$ analytically, using the Chines...

Uploaded: 08/03/2014, 02:21

Scientific report: "Searching for Topics in a Large Collection of Texts" doc

... The Local Search Algorithm (i) for , and (ii) cut always arises from by adding or taking away one document into/from it; 2. since the quality of modified cuts cannot increase infinitely, a finite ... simulate the behavior of a human annotator who finds topically coherent clusters in a training collection. The task of -optimization leads to a system of linear inequalities, which we so...

Uploaded: 08/03/2014, 04:22

Scientific report: "Multilingual Text Processing in a Two-Byte Code" pdf

... sequences as a separate letter. There is just as much variation in handling letters with diacritics. The umlauted letter ö is alphabetized as a separate letter following o in Hungarian, and ... ordinary n. In Table I, the digraphs and letters with diacritics which are not in parentheses or brackets are alphabetized separately as distinct single units. Those in par...

Uploaded: 08/03/2014, 18:20
