... 4a and 4b, evaluation metrics always correlate better on the initial task than on the update task. This suggests that there is much room for improvement for readability metrics, and metrics need ... DICOMER – a DIscourse COherence Model for Evaluating Readability.
LIN outperforms all metrics on all correlations on
both tasks. On the initial task, it outperforms the
best scores by 3.62%, 16.20%, ... Explicit/Non-Explicit
information, and demonstrate that they improve the
original model.
There are parallels between evaluations of machine translation (MT) and summarization with respect to textual content. For...
... offering a rich set of metrics and meta-metrics for assessing MT quality (Giménez and Màrquez, 2010a). Although automatic MT evaluation is still far from manual evaluation, it is indeed ... Association for Computational Linguistics, pages 139–144, Jeju, Republic of Korea, 8–14 July 2012.
© 2012 Association for Computational Linguistics
A Graphical Interface for MT Evaluation and ... existing evaluation measures and to support the development of further improvements or even totally new evaluation metrics. This information can be gathered both from the experiments ...
Figure 1: MT...
... word alignment information.
3 Experiments
3.1 PORT as an Evaluation Metric
We studied PORT as an evaluation metric on
WMT data; test sets include WMT 2008, WMT
2009, and WMT 2010 all-to-English, ... Birch and M. Osborne. 2011. Reordering Metrics for MT. In Proceedings of ACL.
C. Callison-Burch, C. Fordyce, P. Koehn, C. Monz and J. Schroeder. 2008. Further Meta-Evaluation of Machine Translation. ... and 22.0% ties).
1 Introduction
Automatic evaluation metrics for machine translation (MT) quality are a key part of building statistical MT (SMT) systems. They play two
PORT: Precision-Order-Recall...
... human assessment are higher than standard automatic evaluation metrics.
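The correlation with human assessment referred to here is usually computed at the system or segment level. As a minimal sketch, here is a plain Pearson correlation between metric scores and human judgments; the score lists below are invented for illustration and are not taken from the WMT data discussed in this paper.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical system-level scores: one metric value and one human
# adequacy score per MT system (illustrative numbers only).
metric_scores = [0.31, 0.28, 0.35, 0.22]
human_scores = [3.1, 2.9, 3.4, 2.5]
print(round(pearson(metric_scores, human_scores), 3))
```

In practice one would compute this per test set and compare metrics by which correlates more strongly with the human scores.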
2 MT Evaluation
Recent automatic evaluation metrics typically frame the evaluation problem as a comparison task: how similar ... invaluable resource for measuring the reliability of automatic evaluation metrics. In this paper, we show that they are also informative in developing better metrics.
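The comparison-task framing can be made concrete with a toy similarity computation. This is a minimal sketch assuming a single reference and whitespace tokenization; it illustrates the generic precision/recall comparison only, not the PORT metric itself.

```python
from collections import Counter

def precision_recall_f(hypothesis, reference):
    """Unigram precision, recall and F1 of an MT hypothesis against a
    single reference; hypothesis counts are clipped by the reference."""
    hyp = Counter(hypothesis.split())
    ref = Counter(reference.split())
    overlap = sum(min(c, ref[w]) for w, c in hyp.items())
    p = overlap / sum(hyp.values())
    r = overlap / sum(ref.values())
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# 5 of the 6 hypothesis tokens are matched by the reference
print(precision_recall_f("the cat sat on the mat",
                         "the cat is on the mat"))
```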
3 MT Evaluation with Machine ... Meeting of the Association for Computational Linguistics, July.
Chin-Yew Lin and Franz Josef Och. 2004b. ORANGE: a method for evaluating automatic evaluation metrics for machine translation....
... these metrics cor-
relate highly with human judgments.
1 Introduction
Machine paraphrasing has many applications for natural language processing tasks, including machine translation (MT), MT evaluation, ... Paraphrase Evaluation Metrics
One of the limitations to the development of ma-
chine paraphrasing is the lack of standard metrics
like BLEU, which has played a crucial role in driv-
ing progress in MT. ... for what constitutes a high-quality para-
phrase. In addition to the lack of standard datasets
for training and testing, there are also no standard
metrics like BLEU (Papineni et al., 2002) for...
... Similarity Metrics
We begin by defining a set of 22 similarity metrics
taken from the list of standard evaluation metrics
in Subsection 2.1. Evaluation metrics can be tuned into similarity metrics ... families
of similarity metrics form a set of 104 metrics. Our
goal is to obtain the subset of metrics with highest
descriptive power; for this, we rely on the KING
probability. A brute force exploration ... references:
ORANGE was introduced by Lin and Och (2004b) for the meta-evaluation of MT evaluation metrics. The measure provides information about the average behavior of automatic and manual...
... R² for the family of metrics AEv(α, N), for correctness scores, second QA evaluation
A Unified Framework for Automatic Evaluation using
N-gram Co-Occurrence Statistics
Radu SORICUT
Information ...
penalized). Another evaluation we consider in this paper, the DUC 2001 evaluation for Automatic Summarization (also performed by NIST), had specific guidelines for coverage evaluation, which ... Unified Framework for Automatic
Evaluation
In this section we propose a family of evaluation
metrics based on N-gram co-occurrence statistics.
Such a family of evaluation metrics provides...
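As a hedged sketch of what one member of such an N-gram co-occurrence family might look like, the following computes the clipped fraction of candidate n-grams that also occur in the reference, indexed by n. The function name and tokenization are illustrative; this is not the paper's AEv(α, N) definition.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_cooccurrence(candidate, reference, n):
    """Fraction of candidate n-grams also present in the reference
    (clipped by reference counts): one member, indexed by n, of a
    family of N-gram co-occurrence scores."""
    cand = ngrams(candidate.split(), n)
    ref_counts = Counter(ngrams(reference.split(), n))
    cand_counts = Counter(cand)
    hits = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
    return hits / max(len(cand), 1)

for n in (1, 2, 3):
    print(n, ngram_cooccurrence("a b c d", "a b c e", n))
```

Varying n trades lexical matching (n = 1) against fluency/order sensitivity (higher n), which is what makes a parameterized family useful.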
... used in the vector-space model for Information Retrieval (Salton and Lesk, 1968) and the S-score proposed for
evaluating MT output corpora for the purposes of
Information Extraction (Babych ... scores for both runs were
compared using a standard deviation measure.
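The vector-space comparison referred to above can be sketched as plain cosine similarity over term-frequency vectors. This is a simplification for illustration; the S-score itself is defined differently.

```python
from collections import Counter
from math import sqrt

def cosine(text_a, text_b):
    """Cosine similarity between term-frequency vectors, in the spirit
    of the vector-space model for Information Retrieval."""
    a, b = Counter(text_a.split()), Counter(text_b.split())
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(c * c for c in a.values()))
    nb = sqrt(sum(c * c for c in b.values()))
    return dot / (na * nb)

print(round(cosine("the cat", "the dog"), 3))  # 0.5: one shared token
```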
3. The results of the MT evaluation with frequency weights
With respect to evaluating MT systems, the correlation for ... for translation: MT systems that have no
means for prioritising this information often in-
troduce excessive information noise into the tar-
get text by literally translating structural
information,...
... 9000 factors for evaluation and strategic university planning. For the implementation, a Web-based DSS is built on ISO 9000 factors for the evaluation and strategic planning of a case study ... alternatives for an evaluation model / strategic university planning.
3. DSS model application for evaluation and strategy planning
3.1. Application model using ISO 9000 factors for a strategic ... The fourth step is to analyze the hierarchy model using ISO 9000 factors for evaluation and strategic planning. The final step is to build a Web-based DSS application based on the AHP model for...
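As a hedged sketch of the AHP analysis step, the following uses the common normalized-column-average approximation of the priority (eigen)vector. The pairwise judgments among three ISO 9000 factors are purely illustrative and not taken from the case study.

```python
def ahp_priorities(matrix):
    """Approximate AHP priority weights from a pairwise-comparison
    matrix by averaging the normalized columns (a standard shortcut
    for the principal eigenvector)."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [sum(matrix[i][j] / col_sums[j] for j in range(n)) / n
            for i in range(n)]

# Hypothetical judgments: factor 1 is 3x as important as factor 2,
# 5x as important as factor 3, etc. (illustrative only).
m = [[1, 3, 5],
     [1 / 3, 1, 3],
     [1 / 5, 1 / 3, 1]]
print(ahp_priorities(m))
```

The resulting weights sum to one and rank the factors; in a full AHP application one would also check the consistency ratio of the judgment matrix.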
... on
overall driving forces for education reforms be consid-
ered (Figure 5).
Indicators
Finally, we deduce ten core indicators from the above framework for the purpose of monitoring and evaluation
via ... higher policy
and decision-making fora, but equally - and potentially more important - they can be bottom-up, that is, promoted
and enforced by the health workforce, for instance by
means of addressing ... the evaluation
of educational interventions or the monitoring of curri-
culum development during education reforms. It further
suggests comprehensive consideration of the driving
forces for education...
... tabular form CN, and E_i(k) to denote the cell at the k-th row and the i-th column. W(k) is the weight for E(k), and W_i(k) = W(k) is the weight for E_i(k). p_i(k) is the normalized weight for ... newsgroup sections of MT06,
whereas the test set is the entire MT08. The 10-best translations for every source sentence in the dev and test sets are collected from eight MT systems. Case-insensitive ... Open MT evaluation.
1 Introduction
Word-level combination using confusion networks (Matusov et al., 2006; Rosti et al., 2007) is a widely adopted approach for combining Machine Translation (MT) ...
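One plausible reading of the normalized cell weight p_i(k) mentioned earlier is the cell weight divided by the total weight of its confusion-network row. The sketch below makes that assumption explicit; it is an illustration, not necessarily the paper's exact definition.

```python
def normalized_weights(row_weights):
    """Normalize confusion-network cell weights so a row sums to one:
    p_i(k) = W_i(k) / sum_j W_j(k).
    The normalization rule is an assumption made for illustration,
    not taken verbatim from the paper."""
    total = sum(row_weights)
    return [w / total for w in row_weights]

# e.g. three candidate words competing in one confusion-network slot
print(normalized_weights([2.0, 1.0, 1.0]))
```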
... 2006.
© 2006 Association for Computational Linguistics
An Automatic Method for Summary Evaluation
Using Multiple Evaluation Results by a Manual Method
Hidetsugu Nanba
Faculty of Information Sciences, ... section, are
necessary for a more accurate summary
evaluation.
3 Investigation of an Automatic Method
using Multiple Manual Evaluation
Results
3.1 Overview of Our Evaluation Method
and ... Consortium.
http://www.nist.gov/speech/tests/mt/mt2001/resource/
tested ROUGE and cosine distance, both of
which have been used for summary evaluation.
If a score by Yasuda’s method exceeds...
... is, therefore, how to find informative
metrics, and then how to combine them into an op-
timal single quality estimation for automatic sum-
maries. The most immediate way of combining
metrics is ... and (iii) test whether
evaluating with that test-bed is reliable (JACK
measure).
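The "most immediate way of combining metrics" mentioned above can be sketched as a weighted linear combination of individual metric scores. The component scores and weights below are purely illustrative, not values from the paper.

```python
def combine_metrics(scores, weights):
    """Weighted linear combination of metric scores, normalized by the
    weight mass: the simplest single quality estimate for a summary."""
    assert len(scores) == len(weights)
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

# e.g. hypothetical ROUGE-like, cosine and length-ratio scores
print(combine_metrics([0.42, 0.55, 0.90], [0.5, 0.3, 0.2]))
```

In practice the weights themselves would be fit against human judgments rather than chosen by hand.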
2 Formal constraints on any evaluation
framework based on similarity metrics
We are looking for a framework to evaluate ... Lin. 2004. ORANGE: a Method for Evaluating Automatic Metrics for Machine Translation. In Proceedings of the 20th International Conference on Computational Linguistics (COLING)...
... whole corpus (BNC). C is the total number of
categories. W stands for Written, S for Spoken. C1, C2,
DE, UN are demographic classes for the spontaneous
conversations, no-cat is the BNC undefined category.
ples ... to
investigate how the choice of the biased sampling
method affects the performance of our procedure
and its relations to uniform sampling.
3.1 Corpora as unigram distributions
A compact way of representing ... collections of documents is closely related to the similarity of the
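The "corpora as unigram distributions" representation from Subsection 3.1 can be sketched as relative word-type frequencies over a corpus; the toy corpus below is illustrative only.

```python
from collections import Counter

def unigram_distribution(tokens):
    """Represent a corpus as a unigram distribution: the relative
    frequency of each word type."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(unigram_distribution("the cat the dog".split()))
```

Two corpora represented this way can then be compared with any distributional similarity measure (cosine, KL divergence, and so on).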
A Figure of Merit for the Evaluation of Web-Corpus Randomness
Massimiliano Ciaramita
Institute of Cognitive Science and...