Scientific report: "CS NIPER Annotation-by-query for non-canonical constructions in large corpora" pdf

Scientific report: "Power-Law Distributions for Paraphrases Extracted from Bilingual Corpora" pdf

... HLT/NAACL, pp. 17–24. Inderjit S. Dhillon and Yuqiang Guan. 2003. Information Theoretic Clustering of Sparse Co-Occurrence Data. Proc. IEEE Int’l Conf. Data Mining, pp. 517–520. Inderjit S. Dhillon, ... 2 (pivoting) and 6 (KB). The former can be too restrictive and the latter can lead to excessive noise contamination when taking shallow syntactic information features into account. Instea...
Uploaded: 17/03/2014, 22:20
  • 10
  • 370
  • 0
Scientific report: "CS NIPER Annotation-by-query for non-canonical constructions in large corpora" pdf

... scenario for identifying and annotating non-canonical grammatical constructions in large corpora based on linguistic queries and (ii) evaluation of annotation quality by measuring inter-rater ... grammatical constructions offers interesting insights into the constructional inventory of a language. It also opens up the possibility of comparing seemingly closely related lang...
Uploaded: 16/03/2014, 20:20
  • 6
  • 356
  • 0
Document: Scientific report: "Unsupervised Translation Induction for Chinese Abbreviations using Monolingual Corpora" ppt

... full-form as a bridge involves four components: identifying abbreviations, learning their full-forms, inducing their translations, and integrating the abbreviation translations into the baseline ... popular ordering in Chinese texts. In elimination, one or more words of the original full-form phrase are eliminated and the rest parts remain as an abbreviation. For example, in the...
Uploaded: 20/02/2014, 09:20
  • 9
  • 444
  • 0
Document: Scientific report: "An Ensemble Method for Selection of High Quality Parses" pdf

... parsing model number 2 of Collins (1999) and the reranking parser of Charniak and Johnson (2005), both when the training and test data belong to the same domain (the in-domain scenario) and in ... Reranking and SEPA are thus relatively independent. Bagging (Breiman, 1996) uses an ensemble of instances of a model, each trained on a sample of the training data. Bagging was suggeste...
Uploaded: 20/02/2014, 12:20
  • 8
  • 462
  • 0
Scientific report: "Probabilistic Document Modeling for Syntax Removal in Text Summarization" ppt

... 2010 datasets using the ROUGE metric. 1 Introduction While the dominant problem in Information Retrieval in the first part of the century was finding relevant information within a datastream ... TAC involve taking a set of documents as input and outputting a short summary (either 100 or 250 words, depending on the year) containing what the system deems to be the most important informati...
Uploaded: 07/03/2014, 22:20
  • 6
  • 448
  • 0
Scientific report: "Bypassed Alignment Graph for Learning Coordination in Japanese Sentences" doc

... coordinations in a sentence (Task 1) is straightforward in English. Indeed, nearly 100% precision and recall can be achieved in Task 1 simply by pattern matching with a small number of coordination ... coordinations in the sentences/phrases detected in Task 1. The studies on English coordination listed above are concerned mainly with scope disambiguation, reflecting the fact that...
Uploaded: 08/03/2014, 01:20
  • 4
  • 353
  • 0
Scientific report: "Statistical Machine Translation for Query Expansion in Answer Retrieval" pptx

... two distinct languages. That is, the 10 million question-answer pairs extracted from FAQ pages are fed as parallel training data into an SMT training pipeline. This training procedure includes ... $p_{LM}(\mathrm{syn}_1^I)^{\lambda_{LM}}$ For estimation of the feature weights λ defined in equation (4) we employed minimum error rate (MER) training under the BLEU measure (Och, 2003). Training data for MER tr...
Uploaded: 08/03/2014, 02:21
  • 8
  • 392
  • 0
Scientific report: "Machine-learned contexts for linguistic operations in German sentence realization" doc

... complete, spanning parse: 85.14% of the sentences in the training and parameter tuning set, and 84.59% in the blind test set fall into that category. Most sentences yield more than one training case. ... Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), Philadelphia, July 2002, pp. 25-32. ... a machine learning approach. The linguistically...
Uploaded: 08/03/2014, 07:20
  • 8
  • 338
  • 0
Scientific report: "Towards a Resource for Lexical Semantics: A Large German Corpus with Extensive Semantic Annotation" pot

... raw corpora, often using information from ontologies like WordNet (Miller et al., 1990). Meanwhile, the lack of large, domain-independent lexica providing word-semantic information is one of ... serious bottlenecks for language technology. To train tools for the acquisition of semantic information for such lexica, large, extensively annotated resources are necessary. In this pape...
Uploaded: 17/03/2014, 06:20
  • 8
  • 407
  • 0
Scientific report: "Designing spelling correctors for inflected languages using lexical transducers" pdf

... checker, it is assumed to be a misspelling and a warning is given to the user who has different options, being one of most interesting including its lemma in the user-lexicon. 2.1 The user lexicons ... dictionaries whose entries can recognise both the original and inflected forms. In languages with a high level of inflection such as Basque spelling checking cannot be resolved witho...
Uploaded: 17/03/2014, 23:20
  • 2
  • 263
  • 0
