Scientific report: "Approximation Lasso Methods for Language Modeling" doc

... use of lasso for statistical language modeling for text input. Owing to the very large number of parameters, directly optimizing the penalized lasso loss function is impossible. Therefore, ... the lasso solution can be directly computed via numerical methods. To our knowledge, this paper presents the first empirical study of lasso for a realistic, large scale task: L...
Uploaded: 17/03/2014, 04:20
Scientific report: "Some Psychological Methods for Evaluating the Quality of Translations" doc

... beliebiger Impulsform ergibt sich das Faltungsprodukt aus Membran- und Impulsform. (1) By any form of the impulse yields -self the products of the folding out membrane- and form of the impulse. ... any form of the impulse yields the products of the folding out membrane- and form of an impulse. (3) By any form of the impulse yields the products of the folding out membrane- and for...
Uploaded: 30/03/2014, 17:20
Scientific report: "Similarity-Based Methods For Word Sense Disambiguation" docx

... for each test set, where the base language model was MLE-1. The methods, going from left to right, are RAND, Pc, L, and A. The performances shown are for settings of β that were optimal for ... huge margin; therefore, we conclude that information from other word pairs is very useful for unseen pairs where unigram frequency is not informative. The similarity-based method...
Uploaded: 31/03/2014, 21:20
Scientific report document: "Conditional Random Fields for Word Hyphenation" docx

... many machine learning methods, no strong guidance is available for choosing values for these parameters. For English we use the parameters reported in (Liang, 1983). For Dutch we use the parameters ... positives) for TALO. For the Dutch language, the standard CRF using the Viterbi path has overall error rate 0.08%, compared to 0.81% for the TeX algorithm. The serious e...
Uploaded: 20/02/2014, 04:20
Scientific report document: "Mining the Web for Language Learning" pdf

... new methods for the self-exploration of language based on the applied linguistic theories of “learning as discovery” and Data-Driven Learning (DDL) introduced by Johns (1991). One can search for ... Engkoo, a system for exploring and learning language. It is built primarily by mining translation knowledge from billions of web pages - using the Internet to catch language in m...
Uploaded: 20/02/2014, 05:20
Scientific report: "Polarity Consistency Checking for Sentiment Dictionaries" docx

... assignment γ for a network N is a function from W ∪ S to the set P. Let γ be a polarity assignment for N. We say that γ is consistent if it satisfies the following condition for each w ∈ W: For p ∈ ... logic formulas, we can reduce it to $C(s) = \neg s^{+} \vee \neg s^{-}$ (1). For each word w with polarity p ∈ {−, +, 0} in D we need a clause C(w, p) that states that w has polarity p. So, the Boolean...
Uploaded: 07/03/2014, 18:20
Scientific report: "A Compositional Semantics for Focusing Subjuncts" doc

... operators for the focusing subjunct 7 (see below). 3.3 The sentential operators The sentential operators for only and even are given below. (The one for too is the same as that for even, ... sentence in its internal Prolog format. Secondly, the GPSG category obtained for the sentence, which incorporates a parse tree for the sentence, is displayed. For the sake...
Uploaded: 08/03/2014, 18:20
Scientific report: "A Flexible Architecture for Reference Resolution" docx

... turns input text ~ format into standard format for discourse referents. [Figure: coreference analysis modules; legible labels include semantic type matching, pronouns, definite NPs, Hobbs naive algorithm, agreement] ... especially problematic for designers of dialogue systems trying to predict how anaphora resolution techniques developed for written monologue will perform when adapted for s...
Uploaded: 08/03/2014, 21:20
Scientific report: "Modelling lexical redundancy for machine translation" doc

... Table 4). For each language pair the MRF learned features that capture intuitively redundant patterns: adjectival endings for French, case markings for Czech, and mutation patterns for Welsh. The ... phrase-based SMT system for three different language pairs. The MRF prior improves the results and picks up features that appear to agree with linguistic intuitions of redundanc...
Uploaded: 17/03/2014, 04:20
Scientific report: "Interactive Word Alignment for Language Engineering" pptx

... the sentence level beforehand. Input files may be numbered text files or annotated files in XML format. The annotation records linguistic information on four levels: word form, base form, part-of-speech ... proved to be useful tools for various language and NLP tasks, such as bilingual lexicon extraction for lexicography, bilingual terminology and machine translation. Although perfor...
Uploaded: 17/03/2014, 22:20