Scientific report: "A Taxonomy, Dataset, and Classifier for Automatic Noun Compound Interpretation" potx

... Association for Computational Linguistics, pages 678–687, Uppsala, Sweden, 11-16 July 2010. © 2010 Association for Computational Linguistics A Taxonomy, Dataset, and Classifier for Automatic Noun Compound Interpretation Stephen ... annotated dataset, and a supervised classification method for automatic noun compound interpretation. 1 Introduction Noun comp...

Uploaded: 23/03/2014, 16:20

10 475 0

Document: Scientific report: "A New Dataset and Method for Automatically Grading ESOL Texts" pdf

... the Association for Computational Linguistics, pages 180–189, Portland, Oregon, June 19-24, 2011. © 2011 Association for Computational Linguistics A New Dataset and Method for Automatically Grading ... a ceiling for the performance of our system, we calculate the average correlation between the CLC and the examiners’ scores, and find an upper bound of 0.796 and 0.792 Pear-...

Uploaded: 20/02/2014, 04:20

10 538 0

Scientific report: "Exploiting Comparable Corpora and Bilingual Dictionaries for Cross-Language Text Categorization" potx

... $T^{*} = \bigcup_{i} T^{i}$. If the function $\psi$ exists for every text $t^{i}_{z} \in T^{*}$ and for every language $L^{j}$, and is known, then the corpus is parallel and aligned at document level. For the purpose of this paper it ... words. We randomly split both the English and Italian part into 75% training and 25% test (see Table 2). We processed the corpus with PoS taggers, keeping only nouns, verbs...

Uploaded: 17/03/2014, 04:20

8 361 0

Scientific report: "A Syllable Based Word Recognition Model for Korean Noun Extraction" potx

... using automatically acquired statistical information from the POS tagged corpus and extracts nouns by detecting word boundaries. Furthermore, it does not require any labor for constructing and ... sentence by using statistical information and extracts nouns by detecting the word boundaries. The statistical information is automatically acquired from a POS annotated corpus and t...

Uploaded: 17/03/2014, 06:20

8 368 0

Document: Scientific report: "A Mobile Health and Fitness Companion Demonstrator" pptx

... performed before dinner, getting dinner, and activities to be performed after dinner. It knows activities such as playing football, squash, or badminton; going to the gym or shopping; and ... starting point for generation is predicate-form descriptions provided by the dialogue manager. Further details and contextual information are retrieved from the dialogue history and the u...

Uploaded: 22/02/2014, 02:20

4 391 0

Scientific report: "A UNIFIED MANAGEMENT AND PROCESSING OF WORD-FORMS, IDIOMS AND ANALYTICAL COMPOUNDS" ppt

... A UNIFIED MANAGEMENT AND PROCESSING OF WORD-FORMS, IDIOMS AND ANALYTICAL COMPOUNDS Dan Tufts Octav Popescu Research Institute for Informatics Miciurin 8-10, 71316, Bucharest, ... governing the compound verbal forms (including interrogative forms and "aliens" (adverbs, reflexive pronoun insertion) for English, French, Romanian, Russian and Spanish. As an exa...

Uploaded: 09/03/2014, 01:20

6 431 0

Scientific report: "A Unified Single Scan Algorithm for Japanese Base Phrase Chunking and Dependency Parsing" pdf

... Kudo and Matsumoto, 2002; Sassano, 2004) for bunsetsu-based parsers. We use the following features for each morpheme: 1. major POS, minor POS, conjugation type, conjugation form, surface form ... chunking and dependency parsing and, in addition, does them with a single scan. Most of the modern dependency parsers for Japanese require bunsetsu chunking (base phrase chunking) before...

Uploaded: 17/03/2014, 02:20

4 287 0

Scientific report: "A Morphological Analyzer and Generator for the Arabic Dialects" pdf

... Kornai (1995), Bird and Ellison (1994), Pulman and Hepple (1993), whose formalism Kiraz adopts, and others. 4 Design Goals for MAGEAD This work is aimed at a unified processing architecture for the morphology ... many of the analyses incorrect, and only the analysis chosen for the token in context usually hand-corrected. We use LATB files fsa 16* for development, and fo...

Uploaded: 23/03/2014, 18:20

8 319 0

Scientific report: "A Part of Speech Estimation Method for Japanese Unknown Words using a Statistical Model of Morphology and Context" pptx

... experiment, we randomly selected two sets of 100 thousand sentences. The first 100 thousand sentences are used for training the language model. The second 100 thousand sentences are used for test- ... <WT>, <U-t>) and f(ci[ci-t, <WT>, <U-t>) are the relative frequencies of the character unigram and bigram for each word type and part of speech, f(...

Uploaded: 23/03/2014, 19:20

8 397 0

Document: Scientific report: "A Graph-based Semi-Supervised Learning for Question-Answering" doc

... non-copula questions and build the model for only copula questions. ponent of a candidate sentence. For example for the given question, "When did Nixon die?", when the following candidate sentence, ... affirmed questions did not contain any object and they are also in copula (linking) sentence form that is, they are only formed by subject and information about the subject as: {su...

Uploaded: 20/02/2014, 07:20

9 503 1