using bilingual comparable corpora

Scientific report: "Using Bilingual Parallel Corpora for Cross-Lingual Textual Entailment" pptx

... more generic multilingual resources (e.g. bilingual dictionaries). 3 Using Parallel Corpora for CLTE Bilingual parallel corpora represent a possible solution to overcome the inadequacy of ... extracted from bilingual corpora, we conducted a series of experiments using the different resources mentioned in Section 4.2. As it can be observed in Table 1, the highest results are achieved using ... parallel corpora be useful also for monolingual TE? To answer this question, we experiment on monolingual RTE datasets using paraphrase tables extracted from bilingual parallel corpora. Our results...

Uploaded: 17/03/2014, 00:20

Document: Scientific report: "Word Alignment for Languages with Scarce Resources Using Bilingual Corpora of Other Language Pairs" pptx

... amounts of bilingual data are available for the desired language pair L1-L2, large-scale bilingual corpora in L1-L3 and L2-L3 are available. Using these two additional bilingual corpora, we ... as compared with the method using the two corpora in L1-L3 and L3-L2, and a relative error rate reduction of 21.30% as compared with the method using the small bilingual corpus in L1 and ... resources using bilingual corpora of other language pairs. To perform word alignment between languages L1 and L2, we introduce a third language L3. Although only small amounts of bilingual...

Uploaded: 20/02/2014, 12:20

Document: Scientific report: "Bilingual Terminology Acquisition from Comparable Corpora and Phrasal Translation to Cross-Language Information Retrieval" pptx

... Japanese-English language pair, especially if involving the comparable corpora. Re-scoring through the Comparable Corpora Comparable corpora could be considered for the disambiguation of translation ... scheme. The proposed two-stages model using comparable corpora showed a better improvement in terms of average precision compared to the simple model (one-stage comparable corpora-based translation) ... News (1998-1999) for English were considered as comparable corpora. We have also considered documents of NTCIR-2 test collection as comparable corpora in order to cope with special features of...

Uploaded: 20/02/2014, 16:20

Document: Scientific report: "Effect of Cross-Language IR in Bilingual Lexicon Acquisition from Comparable Corpora" pot

... of relevant texts in bilingual lexicon acquisition from comparable corpora. We experimentally show that it is quite effective to reduce the candidate bilingual term pairs against which bilingual term correspondences ... of acquiring bilingual term correspondences from comparable corpora, i.e., in reducing useless bilingual term pairs and in increasing the estimated confidence of useful bilingual term pairs. More ... of bilingual term correspondences. 2 Acquisition of Bilingual Term Correspondences from Comparable Corpora Previously studied techniques of estimating bilingual term correspondences from comparable...

Uploaded: 22/02/2014, 02:20

Scientific report: "Exploiting Comparable Corpora and Bilingual Dictionaries for Cross-Language Text Categorization" potx

... training parts. Using only comparable corpora. Figure 2 reports the performance without any use of bilingual dictionaries. Each graph show the learning curves respectively using a BoW kernel ... methodologies and Section 6 concludes the paper suggesting some future developments. 2 Comparable Corpora Comparable corpora are collections of texts in different languages regarding similar topics ... on comparable corpora is a feasible task. In particular, it is possible to deal with it even when no bilingual resources are available. On the other hand when it is possible to exploit bilingual...

Uploaded: 17/03/2014, 04:20

Document: Scientific report: "Exploring Syntactic Structural Features for Sub-Tree Alignment using Bilingual Tree Kernels" docx

... Conclusion In this paper, we explore syntactic structure features by means of Bilingual Tree Kernels and apply them to bilingual sub-tree alignment along with various lexical and plain structural ... translation, tree kernels are seldom applied. In this paper, we propose Bilingual Tree Kernels (BTKs) to model the bilingual translational equivalences, in our case, to conduct sub-tree alignment. ... structures. We propose two kinds of BTKs named dependent Bilingual Tree Kernel (dBTK), which takes the sub-tree pair as a whole and independent Bilingual Tree Kernel (iBTK), which individually models...

Uploaded: 20/02/2014, 04:20

Document: Scientific report: "Semi-Supervised Learning of Partial Cognates using Bilingual Bootstrapping" doc

... monolingual bootstrapping technique, we also use bilingual bootstrapping. Diab (2002) has shown that unsupervised WSD systems that use parallel corpora can achieve results that are close to ... organizations and academic textbooks. We are using this set of sentences in our experiments to show that our methods perform well on multi-domain corpora and also because our aim is to be able ... that with simple methods and using available tools we can achieve good results in the task of partial cognate disambiguation. The accuracy might be increased by using dependencies relations,...

Uploaded: 20/02/2014, 12:20

Document: Scientific report: "Weakly Supervised Named Entity Transliteration and Discovery from Multilingual Comparable Corpora" ppt

... English non-NEs paired with random Russian words. 4 Experimental Study We ran experiments using a bilingual comparable English-Russian news corpus we built by crawling a Russian news web site ( The ... resource free language, given a bilingual corpora in which it is weakly temporally aligned with a resource rich language. NEs have similar time distributions across such corpora, and often some of ... Named Entities encountered in such corpora, and use them to develop an algorithm that extracts pairs of NEs across languages. Specifically, given a bilingual corpora that is weakly temporally...

Uploaded: 20/02/2014, 12:20

Document: Scientific report: "Context-dependent SMT Model using Bilingual Verb-Noun Collocation" doc

... automatically acquired from the chunk-aligned bilingual corpora. 4.1 Automatic Extraction of Bilingual Verb-Noun Collocation (BiVN) To automatically extract the bilingual verb-noun collocations, we utilize ... obtain more semantically plausible translation results, we use bilingual verb-noun collocations; these are automatically extracted by using chunk alignment and a monolingual dependency parser. ... Arbor, June 2005. © 2005 Association for Computational Linguistics Context-dependent SMT Model using Bilingual Verb-Noun Collocation Young-Sook Hwang ATR SLT Research Labs 2-2-2 Hikaridai Seika-cho Soraku-gun...

Uploaded: 20/02/2014, 15:20

Document: Scientific report: "Word Translation Disambiguation Using Bilingual Bootstrapping" doc

... Learning, vol. 34, pp. 107-130. G. Kikui, 1999. Resolving Translation Ambiguity Using Non-parallel Bilingual Corpora. In Proceedings of ACL ’99 Workshop on Unsupervised Learning in Natural ... Output: classifiers in English and Chinese Figure 2: Bilingual Bootstrapping Word Translation Disambiguation Using Bilingual Bootstrapping Cong Li Microsoft Research Asia 5F ... paper proposes a new method for word translation disambiguation using a machine learning technique called ‘Bilingual Bootstrapping’. Bilingual Bootstrapping makes use of in learning a small...

Uploaded: 20/02/2014, 21:20

Scientific report: "Paraphrasing with Bilingual Parallel Corpora" pot

... word- and sentence-aligned parallel corpora. In Proceedings of ACL. Mona Diab and Philip Resnik. 2002. An unsupervised method for word sense tagging using parallel corpora. In Proceedings of ACL. Ali ... and comfort as console. While monolingual parallel corpora often have identical contexts that can be used for identifying paraphrases, bilingual parallel corpora do not. Instead, we use phrases in ... probability to include multiple corpora, as follows: $\hat{e}_2 = \arg\max_{e_2 \neq e_1} \sum_{C} \sum_{f \text{ in } C} p(f \mid e_1)\, p(e_2 \mid f)$ (5) where C is a parallel corpus from a set of parallel corpora. For this condition we...

Uploaded: 08/03/2014, 04:22
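
The multi-corpus condition quoted in the excerpt above picks the paraphrase e2 (different from e1) that maximizes the sum of p(f|e1) p(e2|f) over pivot phrases f and over parallel corpora C. A minimal Python sketch of that computation follows. The function name, the table layout, and the toy probabilities are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def best_paraphrase(e1, corpora):
    """Return the phrase e2 != e1 with the highest pivot score, or None.

    `corpora` is a list with one entry per parallel corpus C. Each entry is
    a dict holding two hypothetical phrase tables:
      "p_f_given_e": {english_phrase: {foreign_phrase: probability}}
      "p_e_given_f": {foreign_phrase: {english_phrase: probability}}
    """
    scores = defaultdict(float)
    for tables in corpora:  # outer sum over corpora C
        pivots = tables["p_f_given_e"].get(e1, {})
        for f, p_f_e1 in pivots.items():  # inner sum over pivot phrases f
            for e2, p_e2_f in tables["p_e_given_f"].get(f, {}).items():
                if e2 != e1:  # the arg max excludes e1 itself
                    scores[e2] += p_f_e1 * p_e2_f
    if not scores:
        return None
    return max(scores, key=scores.get)


# Toy phrase tables with invented probabilities, for illustration only.
toy_corpora = [
    {
        "p_f_given_e": {"under control": {"unter kontrolle": 1.0}},
        "p_e_given_f": {
            "unter kontrolle": {"under control": 0.6, "in check": 0.4}
        },
    },
    {
        "p_f_given_e": {"under control": {"bajo control": 1.0}},
        "p_e_given_f": {
            "bajo control": {
                "under control": 0.5,
                "in check": 0.2,
                "contained": 0.3,
            }
        },
    },
]

print(best_paraphrase("under control", toy_corpora))
```

With these toy tables, "in check" collects score from both corpora while "contained" appears in only one, which is how adding corpora can reinforce a candidate.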

Scientific report: "Unsupervised Sense Disambiguation Using Bilingual Probabilistic Models" pdf

... have presented two novel probabilistic models for unsupervised word sense disambiguation using parallel corpora and have shown that both models outperform existing unsupervised approaches. In addition, ... 39:1–38. Mona Diab and Philip Resnik. 2002. An unsupervised method for word sense tagging using parallel corpora. In Proceedings of the 40th Anniversary Meeting of the Association for Computational ... Washington, April 4-5. David Yarowsky. 1992. Word-sense disambiguation using statistical models of Roget’s categories trained on large corpora. In Proceedings of COLING-92, pages 454–460, Nantes, France, July. David...

Uploaded: 08/03/2014, 04:22

Scientific report: "Detecting Highly Confident Word Translations from Comparable Corpora without Any Prior Knowledge" doc

... on precision and recall of bilingual lexicon extraction from parallel corpora. This assumption should also be reasonable for many types of comparable corpora such as Wikipedia or news corpora, which are ... need of a seed lexicon as a prerequisite for bilingual lexicon extraction. They train a cross-language topic model on document-aligned comparable corpora and introduce different methods for ... applicable to any language pair for which there exist sufficient comparable data for training of the topic model. Since comparable corpora often construct a very noisy environment, it is of the...

Uploaded: 08/03/2014, 21:20

