probabilities from speech corpora

Scientific report document: "Bilingual Terminology Acquisition from Comparable Corpora and Phrasal Translation to Cross-Language Information Retrieval" pptx

... Japanese-English language pair, especially if involving the comparable corpora. Re-scoring through the Comparable Corpora Comparable corpora could be considered for the disambiguation of translation ... comparable corpora-based techniques, respectively compared to the hybrid two-stages comparable corpora and linguistics-based pruning. The proposed approach based on bi-directional comparable corpora ... TR2-007. P. Fung. 2000. A Statistical View of Bilingual Lexicon Extraction: From Parallel Corpora to Non-Parallel Corpora. In Jean Veronis, Ed. Parallel Text Processing. G. Grefenstette. 1999....

Uploaded: 20/02/2014, 16:20

Scientific report document: "Prefix Probabilities from Stochastic Tree Adjoining Grammars*" pptx

... obtained from the first set allows the effective computation of the prefix probability. 4 Computing Prefix Probabilities This section develops an algorithm for the computation of prefix probabilities ... to be the label of the node, which is either a terminal from Σ or the empty string ε. For each other node N, label(N) is an element from NT. At a node N in a tree such that label(N) ∈ NT ... N of some derived elementary tree; and (ii) t's root spans from position i to position j in w, t's foot node spans from position f1 to position f2 in w. In case N does not dominate...

Uploaded: 20/02/2014, 18:20

Scientific report document: "Effect of Cross-Language IR in Bilingual Lexicon Acquisition from Comparable Corpora" pot

... translation knowledge acquisition from WWW news sites, this paper studies issues on the effect of cross-language retrieval of relevant texts in bilingual lexicon acquisition from comparable corpora. We experimentally ... parallel/comparative corpora. However, the sizes as well as the domain of existing parallel/comparative corpora are limited, while it is very expensive to manually collect parallel/comparative corpora. ... translation knowledge acquisition from parallel/comparative corpora, various kinds of translation knowledge are acquired. Within this framework of translation knowledge acquisition from WWW news sites, this...

Uploaded: 22/02/2014, 02:20

Scientific report: "Detecting Highly Confident Word Translations from Comparable Corpora without Any Prior Knowledge" doc

... of bilingual lexicon extraction from parallel corpora. This assumption should also be reasonable for many types of comparable corpora such as Wikipedia or news corpora, which are topically aligned ... translation candidates from multilingual comparable corpora. By employing the algorithm we have improved precision scores of the methods relying on per-topic word distributions from a cross-language ... efficiently bridge the gap between languages. That seed lexicon is usually crawled from the Web or obtained from parallel corpora. Recently, Li et al. (2011) have proposed an approach that improves...

Uploaded: 08/03/2014, 21:20

Scientific report: "Toolkit for Multi-Level Alignment and Information Extraction from Comparable Corpora" pptx

... defined in the toolkit: “parallel data mining from comparable corpora” and “named entity/terminology extraction and mapping from comparable corpora”. The next section provides a general overview ... sentence pairs are extracted from the aligned comparable corpora (section 2.2). The workflow for named entity (NE) and terminology extraction and mapping from comparable corpora extracts data in ... corpus from comparable patents and its experimental application to SMT. Proceedings of the 3rd workshop on building and using comparable corpora: from parallel to non-parallel corpora, ...

Uploaded: 16/03/2014, 20:20

Scientific report: "Building Emotion Lexicon from Weblog Corpora" potx

... 133–136, Prague, June 2007. © 2007 Association for Computational Linguistics Building Emotion Lexicon from Weblog Corpora Changhua Yang Kevin Hsin-Yih Lin Hsin-Hsi Chen Department of Computer Science and ... mine the relationships between words and emotions using weblog corpora. A collocation model is proposed to learn emotion lexicons from weblog articles. Emotion classification at sentence level ... Blog from January to July, 2006, spanning a period of 212 days. In total, 336,161 bloggers’ articles were collected. Each blogger posts 16 articles on average. We used the articles from...

Uploaded: 17/03/2014, 04:20

Scientific report: "Discovering Relations among Named Entities from Large Corpora" pot

... discovering relations among various entities from large text corpora. Our method does not need the richly annotated corpora required for supervised learning, corpora which take great time and effort ... discovery, however, needed large annotated corpora which cost a great deal of time and effort. We propose an unsupervised method for relation discovery from large corpora. The key idea is clustering ... these mea- Discovering Relations among Named Entities from Large Corpora Takaaki Hasegawa Cyberspace Laboratories Nippon Telegraph and Telephone Corporation 1-1 Hikarinooka, Yokosuka, Kanagawa 239-0847,...

Uploaded: 17/03/2014, 06:20

Scientific report: "Constructing Transliteration Lexicons from Web Corpora" docx

... targeted language pair. To generate confusion matrices from automated speech recognition requires the effort of collecting many speech corpora for model training, costing time and labor. Automatically ... Regularly exploring Web corpora is a good way to update dictionaries. Transliterated-term extraction using non-parallel corpora has also been conducted (Kuo, 2003). Automated speech recognition-generated ... importance of term transliteration can be realized from our analysis of the terms used in 200 qualifying sentences that were randomly selected from English-Chinese mixed news pages. Each qualifying...

Uploaded: 17/03/2014, 06:20

Scientific report: "Constructing Semantic Space Models from Parsed Corpora" potx

... also differ from Livesay and Burgess (1997) who found that mediated primes were further from their targets than unrelated controls, using however a model and corpus different from the ones ... revealed that category coordination is reliably distinguished from all other relations and that phrasal association is reliably different from antonymy and synonymy. Taxonomy related relations ... traditional vector-based model (F(2,138) = 9.384, p < .001). Constructing Semantic Space Models from Parsed Corpora Sebastian Padó Department of Computational Linguistics Saarland University PO Box...

Uploaded: 17/03/2014, 06:20

Scientific report: "Flow Network Models for Word Alignment and Terminology Extraction from Bilingual Corpora" docx

... from the source to all the English words (including the empty one), edges from all the French words (including the empty one) to the sink, an edge from the sink to the source, and edges from ... or through two edges, one from bandwidth to largeur de bande, and one from bandwidth to either largeur or bande (type 2), or even through the two edges from bandwidth to largeur ... distortion parameters, multiword notions, or information on part-of-speech, information derived from bilingual dictionaries or from thesauri. The integration of new parameters is in general...

Uploaded: 17/03/2014, 07:20

Scientific report: "PRECISE N-GRAM PROBABILITIES FROM STOCHASTIC CONTEXT-FREE GRAMMARS" pptx

... we assume all SCFGs to be in CNF. Probabilities from expectations The first key insight towards a solution is that the n-gram probabilities can be obtained from the associated expected frequencies ... periment 2, a different set of bigram probabilities was used, computed from the context-free grammar, whose probabilities had previously been estimated from the same training corpus, using ... task, where it helped to improve bigram estimates obtained from relatively small amounts of data. Deriving n-gram probabilities from more sophisticated language models appears to be a generally...

Uploaded: 17/03/2014, 09:20

Scientific report: "Learning Translations of Named-Entity Phrases from Parallel Corpora" ppt

... Named-Entity Phrases from Parallel Corpora Robert C. Moore Microsoft Research Redmond, WA 98052, USA bobmoore@microsoft.com Abstract We develop a new approach to learning phrase translations from parallel ... same measures used by Melamed (2000) in his work on learning single-word translations from parallel corpora. We use the coverage metric rather than recall, because in this data, phrases often ... not actually a generative model, the probabilities being combined are comparable, and it seems to work well in practice. Since in named-entity translation from English to Spanish or French, capitalization...

Uploaded: 17/03/2014, 22:20

