Scientific report: "Mining Co-Occurrence Matrices for SO-PMI Paradigm Word Candidates" docx

Scientific report document: "Mining the Web for Language Learning" pdf

... therefore cannot cover fresh words or new usages of existing words. Secondly, their search functions are often limited, making it hard for users to effectively find information ... definition of a word/phrase; 2) retrieve example sentences using keywords, POS tags or collocations; and 3) get the translation of a word/phrase/sentence. While Engkoo is currently buil... (Footnote 1: http://www.engkoo.com.)

Uploaded: 20/02/2014, 05:20

Scientific report document: "Mining Wiki Resources for Multilingual Named Entity Recognition" pdf

... language is available for download (download.wikimedia.org) in a text format suitable for inclusion in a database. For the remainder of this paper, we refer to this format. 1 Within Wikipedia, ... language article, if available, for additional information. • A second pass checks for multi-word phrases that exist as titles of Wikipedia articles. • We look for certain t...

Uploaded: 20/02/2014, 09:20

Scientific report: "Mining the Web for Bilingual Text" pot

... and formally evaluating performance. The most recent end-product is an automatically acquired parallel corpus comprising 2491 English-French document pairs, approximately 1.5 million words ... world changes. For example, Diekema et al., in a presentation at the 1998 TREC-7 conference (Voorhees and Harman, 1998), observed that the performance of their cross-language informati...

Uploaded: 08/03/2014, 06:20

Scientific report document: "A Gibbs Sampler for Phrasal Synchronous Grammar Induction" docx

... p_null = 10^−10 for this value in the experiments we report below. ... either φ P z i for phrase pairs or φ_null for single language phrases. We choose Dirichlet process (DP) priors for these parameters: φ P z i ∼ ... undesirable. Word-based models are incapable of learning translational equivalences between non-compositional phrasal units, while the algorithms used for indu...

Uploaded: 20/02/2014, 07:20

Scientific report document: "An expressive formalism for describing tree-based grammars" docx

... tree description and/or of semantic formulas. The XMG formalism furthermore supports the sharing of identifiers across dimension hence allowing for a straightforward encoding of the syntax/semantics ... framework for the processing of linguistic meta-descriptions. 1 Introduction It is well known that grammar engineering is a complex task and that factorizing grammar information...

Uploaded: 22/02/2014, 02:20

Scientific report: "Improving Decoding Generalization for Tree-to-String Translation" docx

... the most representative work is the forest-based translation method (Mi et al. 2008; Mi and Huang 2008; Zhang et al. 2009) in which a packed forest (forest for short) structure is used to effectively ... Association for Computational Linguistics: shortpapers, pages 418–423, Portland, Oregon, June 19-24, 2011. © 2011 Association for Computational Linguistics Improving Decoding Genera...

Uploaded: 17/03/2014, 00:20

Scientific report: "Soft Syntactic Constraints for Hierarchical Phrased-Based Translation" docx

... syntactic constraints; for example, AdvP, the top performer in Arabic, cannot possibly perform well for Chinese, since in our parses the AdvP constituents rarely include more than a single word. At the ... rules of the grammar directly from word-aligned parallel text. Rules have the form X → ē, f̄, where ē and f̄ are phrases containing terminal symbols (words) and possibly co...

Uploaded: 17/03/2014, 02:20

Scientific report: "A Probabilistic Model for Canonicalizing Named Entity Mentions" docx

... 1 for clarity. We sample the column index c_1 for the first word in the mention, marginalizing out probabilities of other words in the mention. After we sample the column index for the first word, ... features f for our featurized log-linear distribution (§3.1.2). We then downcased all words in mentions for the purpose of defining the table and the mention words w. Ten context wor...

Uploaded: 23/03/2014, 14:20

Scientific report: "Alignment Model Adaptation for Domain-Specific Word Alignment" pptx

... perform statistical word alignment. We use and to represent the bi-directional alignment sets, which are shown in Equation (4) and (5). For alignment in both sets, we use j for source words ... http://www.fjoch.com/GIZA++.html. In the following subsections, we will perform linear interpolation for word alignment in the source to target direction. For the word alignment...

Uploaded: 31/03/2014, 03:20

Scientific report: "An Efficient Method for Determining Bilingual Word Classes" doc

... is met: FOR EACH word e: FOR EACH class E: Determine the change of LP((E, 9v), rig) if e is moved to E. Move e to the class with the largest improvement. FOR EACH word f: FOR EACH ... and Wu, 1995)). We define bilingual word clustering as the process of forming corresponding word classes suitable for machine translation purposes for a pair of languages usin...

Uploaded: 31/03/2014, 21:20
