noisy parallel software corpora

Scientific report: "Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpora"

... desired. Paraphrases can be extracted from non-parallel corpora using contextual similarity (Lin, 1998). They can also be obtained from parallel corpora if such data is available (Barzilay and ... 4 August 2009. © 2009 ACL and AFNLP Extracting Paraphrases of Technical Terms from Noisy Parallel Software Corpora Xiaoyin Wang 1,2, David Lo 1, Jing Jiang 1, Lu Zhang 2, Hong Mei 2 1 School ... multilingual corpora (Bannard and Callison-Burch, 2005; Zhao et al., 2008). The approach in (Barzilay and McKeown, 2001) does not use deep linguistic analysis and therefore is suitable to noisy corpora...

Uploaded: 08/03/2014, 01:20

Scientific report: "A Pattern Matching Method for Finding Noun and Proper Noun Translations from Noisy Parallel Corpora"

... dim(V2) 240 A Pattern Matching Method for Finding Noun and Proper Noun Translations from Noisy Parallel Corpora Pascale Fung Computer Science Department Columbia University New York, NY 10027 ... aligned, parallel texts. However, sentence alignment (Brown et al. 1991; Kay & Röscheisen 1993; Gale & Church 1993; Church 1993; Chen 1993; Wu 1994) is not always practical when corpora ... small Chinese/English parallel corpus of approximately 5760 unique English words. The outline of the algorithm is as follows: 1. Tag the English half of the parallel text. In the first...

Uploaded: 20/02/2014, 22:20

A parallel implementation on modern hardware for geo-electrical tomographical software


... three groups [1]: • Instruction-Level Parallel (Fine-Grained Control Parallelism) • Process-Level Parallel (Coarse-Grained Control Parallelism) OpenCL's data parallelism is quite like CUDA. There ... A PARALLEL IMPLEMENTATION ON MODERN HARDWARE FOR GEO-ELECTRICAL TOMOGRAPHICAL SOFTWARE NGUYỄN HOÀNG VŨ ... 1.1 An overview of modern parallel architectures 1.1.1 Instruction-Level Parallel Architectures 1.1.2 Process-Level Parallel Architectures 1.1.3 Data parallel architectures 1.1.4...

Uploaded: 23/11/2012, 15:03

Scientific report: "ParaSense or How to Use Parallel Corpora for Word Sense Disambiguation"

... as there are parallel corpora available for the targeted languages. Although large multilingual corpora are still rather scarce, we strongly believe there will be more parallel corpora available in ... to WSD that directly incorporates evidence from four other languages. To this end, we build further on two well-known research ideas: (1) the possibility to use parallel corpora to extract translation ... of large quantities of parallel text, internet corpora such as the ever growing Wikipedia corpus, etc.). Another line of research could be the exploitation of comparable corpora to acquire additional...

Uploaded: 20/02/2014, 05:20

Scientific report: "ALIGNING SENTENCES IN PARALLEL CORPORA"

... sentences of each length less than 8]. We estimated the probabilities 173 ALIGNING SENTENCES IN PARALLEL CORPORA Peter F. Brown, Jennifer C. Lai, and Robert L. Mercer IBM Thomas J. Watson Research ... describe a statistical technique for aligning sentences with their translations in two parallel corpora. In addition to certain anchor points that are available in our data, the only information ... sentence, or even a whole passage, may be missing from one or the other of the corpora. If a person is given two parallel texts and asked to match up the sentences in them, it is natural for...

Uploaded: 20/02/2014, 21:20

Scientific report: "Extracting Parallel Sub-Sentential Fragments from Non-Parallel Corpora"

... for parallel data acquisition is highly beneficial for the SMT field. Comparable corpora exhibit various degrees of parallelism. Fung and Cheung (2004a) describe corpora ranging from noisy parallel, ... comparable or non-parallel sentence pairs. Since our approach can extract parallel data from texts which contain few or no parallel sentences, it greatly expands the range of corpora which can ... the relevant information concerning these corpora. 3.2 Extraction Experiments On each of our comparable corpora, and using each of our initial parallel corpora, we apply both the fragment extraction...

Uploaded: 08/03/2014, 02:21

Scientific report: "Paraphrasing with Bilingual Parallel Corpora"

... probability to include multiple corpora, as follows: $\hat{e}_2 = \arg\max_{e_2 \neq e_1} \sum_{C} \sum_{f \in C} p(f|e_1)\, p(e_2|f)$ (5) where C is a parallel corpus from a set of parallel corpora. For this condition we ... word- and sentence-aligned parallel corpora. In Proceedings of ACL. Mona Diab and Philip Resnik. 2002. An unsupervised method for word sense tagging using parallel corpora. In Proceedings of ... and comfort as console. While monolingual parallel corpora often have identical contexts that can be used for identifying paraphrases, bilingual parallel corpora do not. Instead, we use phrases...
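
The multi-corpus paraphrase probability in the excerpt above (equation 5) sums p(f|e1) p(e2|f) over every pivot phrase f in every parallel corpus C, then picks the best e2 other than e1. A minimal Python sketch of that computation follows. The function name `best_paraphrase`, the nested-dictionary layout of the probability tables, and all phrases and probability values are assumptions made for illustration; none of them come from the paper.

```python
from collections import defaultdict


def best_paraphrase(e1, corpora):
    """Return (e2, score) maximizing sum_C sum_f p(f|e1) * p(e2|f), with e2 != e1.

    `corpora` is a list with one (p_f_given_e, p_e_given_f) pair per parallel
    corpus. Both tables are nested dicts (a hypothetical layout):
    p_f_given_e[e][f] and p_e_given_f[f][e].
    """
    scores = defaultdict(float)
    for p_f_given_e, p_e_given_f in corpora:
        # Every foreign phrase f aligned with e1 acts as a pivot.
        for f, p_f in p_f_given_e.get(e1, {}).items():
            for e2, p_e2 in p_e_given_f.get(f, {}).items():
                if e2 != e1:  # a phrase is not its own paraphrase
                    scores[e2] += p_f * p_e2
    if not scores:
        return None, 0.0
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy probability tables for two corpora (invented values, not real data).
corpus_de = (
    {"under control": {"unter kontrolle": 1.0}},
    {"unter kontrolle": {"under control": 0.5, "in check": 0.5}},
)
corpus_fr = (
    {"under control": {"sous contrôle": 1.0}},
    {"sous contrôle": {"under control": 0.5, "in check": 0.25, "controlled": 0.25}},
)

result = best_paraphrase("under control", [corpus_de, corpus_fr])
```

With these toy tables, "in check" collects evidence from both corpora (0.5 + 0.25), so it outranks "controlled", which appears in only one.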

Uploaded: 08/03/2014, 04:22

Scientific report: "AUTOMATIC ALIGNMENT IN PARALLEL CORPORA"

... (associated with content words) reduces the number of parameters 335 AUTOMATIC ALIGNMENT IN PARALLEL CORPORA Harris Papageorgiou, Lambros Cranias, Stelios Piperidis I Institute for Language ... the optimum alignment of units. The proposed scheme has been tested at sentence level on parallel corpora of the CELEX database. The success rate exceeded 99%. The next steps of the work ... paper addresses the alignment issue in the framework of exploitation of large bi- multilingual corpora for translation purposes. A generic alignment scheme is proposed that can meet varying...

Uploaded: 08/03/2014, 07:20

Course textbook: Software Testing

... Confidential - 2 - Table of Contents 1 INTRODUCTION TO SOFTWARE 1.1 EVOLUTION OF THE SOFTWARE TESTING DISCIPLINE 1.2 THE TESTING PROCESS AND THE SOFTWARE TESTING LIFE CYCLE 1.3 BROAD CATEGORIES ... workings of the software more visible. These contrast with black box techniques that simply look at the official outputs of a program. White box testing is concerned only with testing the software ... FOR SYSTEM TESTING Software Testing Confidential Cognizant Technology Solutions Performance Testing Process & Methodology Proprietary & Confidential - 8 - software development life...

Uploaded: 18/08/2012, 10:59

