Scientific report: "Minimum Error Rate Training in Statistical Machine Translation"

... obtained by simply summing over scores for individual sentences. 4 Training Criteria for Minimum Error Rate Training In the following, we assume that we can measure the number of errors in sentence ... Minimum Error Rate Training in Statistical Machine Translation Franz Josef Och Information Sciences Institute University of Southern California 4676 Admiralty Way, Suite...

Uploaded: 23/03/2014, 19:20

Scientific report: "Segmentation for English-to-Arabic Statistical Machine Translation"

... domains: text news, trained on a large corpus, and spoken travel conversation, trained on a significantly smaller corpus. We show that segmenting the Arabic target in training and decoding ... 200,000 word training set, a 500 sentence tuning set and a 500 sentence test set. We use the Arabic side of the training data to train the language model and use trigrams for the baseline sys...

Uploaded: 20/02/2014, 09:20

Scientific report: "A Localized Prediction Model for Statistical Machine Translation"

... change in performance between training on the original training data in Eq. 2 or on the modified training data in Eq. 10. Line shows that even when training the float weights on an event set obtained ... results in line - are obtained by training ’float’ weights only. Here, the training is carried out by running only once over % of the training data. The model including t...

Uploaded: 20/02/2014, 15:20

Scientific report: "Refined Lexicon Models for Statistical Machine Translation using a Maximum Entropy Approach"

... shown in Table 5. 6.2 Training and test perplexities In order to compute the training and test perplexities, we split the whole aligned training corpus in two parts as shown in Table 6. The training and ... computed automatically using another statistical training procedure (Och, 1999) which often produces word classes including words with the same semantic meaning in...

Uploaded: 20/02/2014, 18:20

Scientific report: "A DP based Search Algorithm for Statistical Machine Translation"

... detail. Finally, experimental results for a bilingual corpus are reported. 1.1 Statistical Machine Translation In statistical machine translation, the goal of the search strategy can ... spoken dialogs in the domain of appointment scheduling (Wahlster, 1993). German source sentences are translated into English. In Table 1 the characteristics of the training and...

Uploaded: 20/02/2014, 18:20

Scientific report: "Cohesive Phrase-based Decoding for Statistical Machine Translation"

... weakness in order modeling, without affecting its strengths. To further increase flexibility, we incorporate cohesion as a decoder feature, creating a soft constraint. The resulting cohesive, ... according to Minimum Error Rate Training or MERT (Och, 2003). Phrasal SMT draws strength from being able to memorize non-compositional and context-specific translations, as well as loc...

Uploaded: 08/03/2014, 01:20

Scientific report: "Moses: Open Source Toolkit for Statistical Machine Translation"

... data, train the language models and the translation models. It also contains tools for tuning these models using minimum error rate training (Och 2003) and evaluating the resulting translations ... additional sources of information have been shown to be valuable when integrated into pre-processing or post-processing steps. Moses also integrates confusion network decoding, wh...

Uploaded: 23/03/2014, 18:20

Scientific report: "Using Noisy Bilingual Data for Statistical Machine Translation"

... probabilities increases the robustness. Wrong phrase pairs resulting from errors in the Viterbi alignment will have a low probability. 3 What's in the Training Data 3.1 The Corpora To train the Chinese-to-English ... covered by the training data. Coverage can be expressed in terms of tokens, i.e. how many of the tokens in the test sentences are covered by the vocabulary of th...

Uploaded: 24/03/2014, 03:20

Scientific report: "Computing Lattice BLEU Oracle Scores for Machine Translation"

... relaxing the clipping constraints: starting from an unconstrained problem, the counts clipping is enforced by incrementally strengthening the weight of paths satisfying the constraints. The ... defined by Equation (4). In the beginning each arc is initialized with a singleton set containing one tuple with a single word as the partial hypothesis. For the semiring operations we define one ....

Uploaded: 24/03/2014, 03:20

Scientific report: "Enriching Morphologically Poor Languages for Statistical Machine Translation"

... answering questions, making suggestions and providing support. References Birch, A., Osborne, M., and Koehn, P. 2007. CCG Supertags in factored Statistical Machine Translation. In Proceedings ... Computational Linguistics. The Association for Computer Linguistics. Carpuat, M. and Wu, D. 2007. Improving Statistical Machine Translation using Word Sense Disambiguation. In Pro...

Uploaded: 31/03/2014, 00:20
