a reliably annotated corpus

Document: Scientific report: "Liars and Saviors in a Sentiment Annotated Corpus of Comments to Political debates" pdf

... 564–568, Portland, Oregon, June 19-24, 2011. © 2011 Association for Computational Linguistics Liars and Saviors in a Sentiment Annotated Corpus of Comments to Political debates Paula Carvalho ... Natural Language Processing and Computational Natural Language Learning, Prague. Krippendorff, Klaus. 2004. Content Analysis: An Introduction to Its Methodology, 2nd Edition. Sage Publications, ... sentiments for a variety of topics and corresponding targets are potentially involved (Riloff and Wiebe, 2003; Sarmento et al., 2009). Alternative approaches to automatic and manual construction...

Uploaded: 20/02/2014, 05:20

5 499 0
Document: Scientific report: "WebCAGe – A Web-Harvested Corpus Annotated with GermaNet Senses" docx

... hand-crafted sense-annotated corpora have been available (Agirre et al., 2007; Erk and Strapparava, 2012; Mihalcea et al., 2004), while WSD research for languages that lack these corpora has lagged behind ... representative examples in Yarowsky’s approach is performed completely manually and is therefore limited to the amount of data that can reasonably be annotated by hand. Leacock et al. (1998), Agirre ... the 3rd International Language Resources and Evaluation (LREC’02), Las Palmas, Canary Islands, pp. 609–612 Santamaría, C., Gonzalo, J., Verdejo, F. 2003. Automatic Association of Web Directories...

Uploaded: 22/02/2014, 03:20

10 419 0
Document: Scientific report: "Using an Annotated Corpus as a Stochastic Grammar" ppt

... ~ A may be 40 M. Marcus, 1991. "Very Large Annotated Database of American English". DARPA Speech and Natural Language Workshop, Pacific Grove, Morgan Kaufmann. F. Pereira and Y. Schabes, ... VB Amsterdam rens@alf.let.uva.nl Abstract In Data Oriented Parsing (DOP), an annotated corpus is used as a stochastic grammar. An input string is parsed by combining subtrees from the corpus. ... expect a higher accuracy if the corpus is further enlarged. 6 Conclusions and Future Research We have presented a language model that uses an annotated corpus as a stochastic grammar. We...

Uploaded: 22/02/2014, 10:20

8 393 0
Scientific report: "Test Collection Selection and Gold Standard Generation for a Multiply-Annotated Opinion Corpus" potx

... also annotated. The details of this corpus are shown in Table 1. Table 1. Corpus size: 32 topics, 843 documents, 11,907 sentences. 3 Analysis of Annotated Corpus As mentioned, each ... strict and lenient metrics are also applied in annotations of relevance. 4.2 High agreement To see how the generated gold standards agree with the annotations of all annotators, we analyze ... gold standard; for the lenient metric, sentences with annotations agreed by at least two annotators are selected as the testing collection and the majority of annotations are treated as the...

Uploaded: 08/03/2014, 02:21

4 418 0
Document: Scientific report: "Collecting a Why-question corpus for development and evaluation of an automatic QA-system" pdf

... each paid reward. • Qualifications To improve the data quality, a HIT can also be attached to certain tests, “qualifications” that are either system-provided or created by the requester. An example ... the assignments have been completed. • Rewards At upload time, each HIT has to be assigned a fixed reward, that cannot be changed later. Minimum reward is $0.01. Amazon.com collects a 10% (or a ... excess of information. FAQ-pages tend to also answer questions which are not asked, and also contain practical examples. Human-powered answers often contain unrelated information and discourse-like...

Uploaded: 20/02/2014, 09:20

9 611 1
The Proposition Bank: An Annotated Corpus of Semantic Roles pdf

... created and annotation disagreements were adjudicated by a small team of highly trained linguists: Paul Kingsbury created the frames files and managed the annotators, and Olga Babko-Malaya checked ... 2004. Palmer, Martha, Olga Babko-Malaya, and Hoa Trang Dang. 2004. Different sense granularities for different applications. In Second Workshop on Scalable Natural Language Understanding Systems at ... Douglas Appelt, John Bear, David Israel, Megumi Kameyama, Mark E. Stickel, and Mabry Tyson. 1997. FASTUS: A cascaded finite-state transducer for extracting information from natural-language text....

Uploaded: 06/03/2014, 10:20

36 269 0
Scientific report: " a Movie Dialogue Corpus for Research and Development" potx

... Several factors, such as the availability of more powerful computers, an almost unlimited storage capacity, the availability of large volumes of data in digital format, as well as the ... dialogue management and natural language generation. Springer. Stallard D (2000) Talk’n’travel: a conversational system for air travel planning. In Proceedings of the 6th Conference on Applied ... hand, contain all additional information/texts appearing in the scripts, which are typically of narrative nature and explain what is happening in the scene. Figure 1 depicts a browser snapshot...

Uploaded: 07/03/2014, 18:20

5 424 0
Scientific report: "Creating a Corpus of Parse-Annotated Questions" docx

... data; repeat: parse a new section of raw data; manually correct errors in the parser output; add the corrected data to the training set; extract a new grammar for the parser; until all the data has been processed. Algorithm ... of Pennsylvania, Philadelphia, PA. Daniel Gildea. 2001. Corpus variation and parser performance. In Lillian Lee and Donna Harman, editors, Proceedings of EMNLP, pages 167–202, Pittsburgh, PA. Charles ... can be rapidly induced from appropriate treebank material. However, treebank- and machine learning-based grammatical resources reflect the characteristics of the training data. They generally...

Uploaded: 08/03/2014, 02:21

8 405 0
Document: Scientific report: "Creating a manually error-tagged and shallow-parsed learner corpus" pptx

... Computational Linguistics Creating a manually error-tagged and shallow-parsed learner corpus Ryo Nagata Konan University 8-9-1 Okamoto, Kobe 658-0072 Japan rnagata @ konan-u.ac.jp. Edward Whittaker ... 44th Annual Meeting of ACL, pages 241–248. Katsuaki Okihara. 1985. English writing (in Japanese). Taishukan, Tokyo. Alla Rozovskaya and Dan Roth. 2010a. Annotating ESL errors: Challenges and rewards. ... Vera Sheinman The Japan Institute for Educational Measurement Inc. 3-2-4 Kita-Aoyama, Tokyo, 107-0061 Japan whittaker,sheinman @jiem.co.jp Abstract The availability of learner corpora, especially those...

Uploaded: 20/02/2014, 04:20

10 467 0
Document: Scientific report: "ModelTalker Voice Recorder – An Interface System for Recording a Corpus of Speech for Synthesis" ppt

... pitch, amplitude and pronunciation and users are given immediate feedback on the acceptability of each recording. Users can then rerecord an unacceptable utterance. Recordings are automatically ... utterance. This alignment is retained so that each utterance is automatically labeled. Once the entire corpus has been recorded, alignments are automatically refined based on specific individual ... naturalness and individuality one associates with one’s own voice. Individuals with difficulty speaking can be any age, gender, and from any part of the country, with regional dialects and...

Uploaded: 20/02/2014, 09:20

4 419 0

