Scientific report: "Generating Usable Formats for Metadata and Annotations in a Large Meeting Corpus" pptx

... the AMI Corpus metadata and annotations as part of the larger objective of automating the generation of annotation and metadata databases to enhance search and browsing of meeting recordings. ... Proceedings of the ACL 2007 Demo and Poster Sessions, pages 93–96, Prague, June 2007. © 2007 Association for Computational Linguistics Generating Usable Formats for Metada...

Uploaded: 08/03/2014, 02:21

4 373 0
Document: Scientific report: "Re-Usable Tools for Precision Machine Translation∗" pdf

... available for system development and also serve as training data for machine learning approaches. Using the discriminant-based Redwoods approach to treebanking (Oepen, Flickinger, Toutanova, ... distribution, and we combine the MaxEnt model with a traditional language model trained on a much larger corpus (the BNC). The latter, more standard approach to realization ranking, wh...

Uploaded: 20/02/2014, 12:20

4 449 0
Scientific report: "Generating Complex Morphology for Machine Translation" pdf

... gain in both monolingual and bilingual settings in both language pairs. We obtain a particularly large gain in the Russian bilingual case, in which the absolute gain is more than 4%, translating ... which also includes bilingual lexical features; 7 Monolingual-All, which has access to all the information available in the target language, including morphological and syn...

Uploaded: 17/03/2014, 04:20

8 333 0
Document: Scientific report: "Know When to Hold 'Em: Shuffling Deterministically in a Parser for Nonconcatenative Grammars*" pdf

... most appropriate parsing algorithm to take advantage of the information that a semantic head provides. For example, a head usually provides information about the remaining daughters that the ... descriptions includes Bach's (1979) wrapping operations, Pollard's (1984) head-wrapping operations, and Moortgat's (1996) extraction and infixation operations in (...

Uploaded: 20/02/2014, 18:20

7 397 0
Document: Scientific report: "Generating statistical language models from interpretation grammars in dialogue systems" potx

... important than the quantity. This makes extraction of domain data from larger corpora an important issue and increases the interest of generating artificial corpora. As the approach of using SLMs ... recognition performance. We are considering basing our re-ranking on the information held in the dialogue information state, knowledge of what is going on in the graphical interface a...

Uploaded: 22/02/2014, 02:20

8 381 0
Document: Scientific report: "An annotation scheme for discourse-level argumentation in research articles" doc

... Purposes, for tasks as varied as teaching English as a foreign language, human translation and citation analysis (Myers, 1992; Thompson and Ye, 1991; Duszak, 1994), but always for manual analysis ... show that the annotation scheme can be learned by trained annotators and subsequently applied in a consistent way. Because the scheme is reliable, hand-annotated data can...

Uploaded: 22/02/2014, 03:20

8 397 0
Scientific report: "Phrase Table Training For Precision and Recall: What Makes a Good Phrase and a Good Phrase Pair?" doc

... phrases appear only a few times in training data, a phrase pair translation is also evaluated by lexical weights (Koehn et al., 2003) or term weighting (Zhao et al., 2004) as additional features ... training data size is small. 3.2 Bilingual Information Metric Trying to find phrase translations for any possible n-gram is not a good idea for two reasons. First, due to data sparsi...

Uploaded: 08/03/2014, 01:20

8 472 0
Scientific report: "An Integrated Architecture for Shallow and Deep Processing" doc

... trend in application-oriented natural language processing (e.g., in the area of term, information, and answer extraction) has been to argue that for many purposes, shallow natural language processing ... construction, and for providing semantically based selectional restrictions to help constraining the search space during deep parsing. GermaNet (Hamp and Feldweg, 1997) i...

Uploaded: 08/03/2014, 07:20

8 414 0
Scientific report: "Weakly Supervised Learning for Hedge Classification in Scientific Literature" pot

... is well within the range usually accepted as representing ‘good’ agreement, and thus we are confident in accepting human labeling as a gold-standard for the hedge classification task. For our experiments, ... training samples, the basic paradigm for both co-training and self-training. However we generalise by framing the task in terms of the acquisition of labelled training...

Uploaded: 23/03/2014, 18:20

8 470 0
Scientific report: "An Unsupervised System for Identifying English Inclusions in German Text" doc

... training on the space travel and testing on the internet data. We chose these two domain pairs to ensure that both the training and test data contain a relatively large number of English inclusions. ... described above. Although both domains contain a large number of English inclusions, their type-token ratio amounts to 0.29 in the internet data and 0.15 in the s...

Uploaded: 23/03/2014, 19:20

6 333 0