
2012 Fourth International Conference on Knowledge and Systems Engineering

A Systematic Comparison Between Various Statistical Alignment Models for Statistical English-Vietnamese Phrase-Based Translation

Cuong Hoang¹, Cuong Anh Le¹, Son Bao Pham¹,²
¹ Faculty of Information Technology, University of Engineering and Technology, Vietnam National University, Hanoi
² Information Technology Institute, Vietnam National University, Hanoi
{cuongh, cuongla, sonpb}@vnu.edu.vn

Abstract

In statistical phrase-based machine translation, the step of phrase learning relies heavily on word alignments. This paper provides a systematic comparison of various statistical alignment models applied to statistical English-Vietnamese phrase-based machine translation. We also investigate a heuristic method for raising the translation quality obtained with the higher word-alignment models by improving the quality of lexical modelling. In detail, we show experimentally that working on the lexical translation appears to be an appropriate approach for enabling the "higher" word-based translation models to efficiently "boost" their merits. We hope this work will serve as a reliable comparison benchmark for other studies on using and improving the statistical alignment models for English-Vietnamese machine translation systems.

1. Introduction

Statistical Machine Translation (SMT) is a machine translation approach that builds a probabilistic parameter model by analyzing parallel sentence pairs in a bilingual corpus. In SMT, the best performing systems are based in some way on phrases. The basic idea of the phrase-based translation paradigm is to learn to break a given source sentence into phrases, translate each of them separately, and finally combine the translated phrases to generate the target sentence [9].

The step of phrase learning, a vital component of a phrase-based SMT system, usually relies on the alignments between words. In general, for popular language pairs such as English-French or English-German, using the HMM or fertility-based alignment models (IBM Models 3-5) yields better results than applying a simple word-based translation model [9][10]. Surprisingly, however, we found that for English-Vietnamese phrase-based SMT this conclusion is not always true: the quality of SMT systems trained with these alignment models is often considerably worse than with the simple word-based alignment models (IBM Models 1-2). No previous work has systematically analyzed the effects of the alignment models on an English-Vietnamese statistical phrase-based SMT system. Hence, this paper focuses on a systematic comparison between the alignment models. Following the analysis results, we also point out some important aspects of deploying the word-alignment component for the language pair English-Vietnamese that can significantly affect the overall translation quality: the best training scheme [16], the number of iterations for training each model, and the probability of tossing in a spurious word [1].

In addition, we propose a scheme for improving the translation quality obtained with the higher word-based alignment models. In detail, we found that working on the lexical translation seems to be the right approach for allowing the higher alignment models to "boost" their quality. To evidence this paradigm, we initialize Model 1 with a better heuristic parameter estimation and then present the overall "boosting" capacity. Besides the experimental evaluation with GIZA++ [16], we also implement LGIZA¹, a lightweight SMT toolkit for training Models 1-3. LGIZA is implemented strictly following the original descriptions by [1], which helps us obtain an accurate comparison between the alignment models. We also evaluate on the pair English-French to clearly see the difference from our specific case. We hope this work will be a reliable comparison benchmark for subsequent research on building an English-Vietnamese SMT system.

¹ LGIZA is available at: http://code.google.com/p/lgiza/
2. Word-based Machine Translation Models

2.1 IBM Models 1-2 and the HMM alignment model

Model 1 assumes a source sentence f_1^J of length J is translated into a target sentence e_1^I of length I. It is defined as a particularly simple instance of the translation framework, by assuming that all possible lengths for f_1^J (less than some arbitrary upper bound) have a uniform probability. The word order does not affect the alignment probability, i.e. Pr(J|e_1^I) is independent of e_1^I and J. Therefore, all possible choices of generating the target words from source words are equally likely. Let t(f_j|e_i) be the translation probability of f_j given e_i. The alignment is determined by specifying the values of a_j for j from 1 to J. [1] yields the following summarizing equation:

    \Pr(f|e) = \frac{\epsilon}{(I+1)^J} \prod_{j=1}^{J} \sum_{i=0}^{I} t(f_j|e_i)    (1)

The parameter t is normally estimated by the EM algorithm [2]. In Model 1 we take no cognizance of where words appear in either string: the first word in the f_1^J string is just as likely to be connected to a word at the end of the e_1^I string as to one at the beginning. For Model 2 we make the same assumptions as in Model 1, except that we assume the alignment probability \Pr(a_j|a_1^{j-1}, f_1^{j-1}, J, e) depends on j, a_j, and J, as well as on I:

    \Pr(f|e) = \epsilon \prod_{j=1}^{J} \sum_{i=0}^{I} t(f_j|e_i)\, a(i|j, J, I)    (2)

Model 2 attempts to model the absolute distortion of word positions in sentence pairs. [18] suggests that alignments have a strong tendency to maintain the local neighbourhood after translation. The HMM word-based alignment model therefore uses a first-order Hidden Markov Model to restructure the alignment model of Model 2 so as to include first-order alignment dependencies:

    \Pr(f_1^J, a_1^J|e_1^I) = \prod_{j=1}^{J} t(f_j|e_{a_j})\, P(a_j|a_{j-1}, I)    (3)

where the alignment probability P(a_j|a_{j-1}, I) is calculated as

    P(i|i', I) = \frac{c(i - i')}{\sum_{k=1}^{I} c(k - i')}    (4)

From this formulation, the distortion probability does not depend on the absolute word positions but only on the jump width (i - i').
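To make the estimation of t in equation (1) concrete, the following is a minimal, self-contained Python sketch of EM training for Model 1. This is our own illustration (the toy data and names are ours), not the LGIZA or GIZA++ implementation:

```python
from collections import defaultdict

def train_model1(corpus, iterations=5):
    """EM training of IBM Model 1 lexical probabilities t(f|e).

    corpus: list of (source_words, target_words) sentence pairs.
    A NULL token is prepended to every target sentence so that
    source words may align to "nothing" (position 0 in the model).
    """
    corpus = [(fs, ["NULL"] + es) for fs, es in corpus]
    # Uniform initialization of t(f|e) over all co-occurring word pairs.
    t = defaultdict(float)
    f_vocab = {f for fs, _ in corpus for f in fs}
    uniform = 1.0 / len(f_vocab)
    for fs, es in corpus:
        for f in fs:
            for e in es:
                t[(f, e)] = uniform

    for _ in range(iterations):
        count = defaultdict(float)   # fractional counts c(f, e)
        total = defaultdict(float)   # normalizers c(e)
        for fs, es in corpus:
            for f in fs:
                # Total mass of f being generated by any e_i (E-step).
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z    # expected fractional count
                    count[(f, e)] += c
                    total[e] += c
        # M-step: re-normalize the fractional counts.
        for (f, e) in t:
            t[(f, e)] = count[(f, e)] / total[e]
    return t

if __name__ == "__main__":
    pairs = [("the house".split(), "ngôi nhà".split()),
             ("the book".split(), "quyển sách".split()),
             ("a book".split(), "một quyển sách".split())]
    t = train_model1(pairs, iterations=5)
    print(sorted(((e, f, round(p, 3)) for (f, e), p in t.items()),
                 key=lambda x: -x[2])[:6])
```

Model 2 training extends the same loop by collecting counts for the position table a(i|j, J, I) alongside t; the HMM model replaces that table with jump-width counts c(i - i'), normalized as in equation (4).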
2.2 Fertility-based Alignment Models

Models 3-4, which are considerably more complex², yield more accurate results than Models 1-2, mainly thanks to the fertility-based alignment scheme. The original equation for Models 3-4, which describes the joint likelihood of a tableau τ and a permutation π, is³:

    \Pr(\tau, \pi|e) = \prod_{i=1}^{I} \Pr(\phi_i|\phi_1^{i-1}, e)\; \Pr(\phi_0|\phi_1^{I}, e)\; \prod_{i=0}^{I} \prod_{k=1}^{\phi_i} \Pr(\tau_{ik}|\tau_{i1}^{k-1}, \tau_0^{i-1}, \phi_0^{I}, e)\; \prod_{i=1}^{I} \prod_{k=1}^{\phi_i} \Pr(\pi_{ik}|\pi_{i1}^{k-1}, \pi_1^{i-1}, \tau_0^{I}, \phi_0^{I}, e)\; \prod_{k=1}^{\phi_0} \Pr(\pi_{0k}|\pi_{01}^{k-1}, \pi_1^{I}, \tau_0^{I}, \phi_0^{I}, e)    (5)

² For convenience, from now on we use the term "higher models" to refer to Models 3-4.
³ For more detail on how Models 3-5 parameterize the fertility scheme, please refer to [1].

The comparison between applying these word-based alignment models to the English-Vietnamese statistical phrase-based SMT system is described in depth in Section 5 (Experiment).

3. How lexical models impact the quality of applying fertility-based models

Following the scheme proposed by [1], to train Models 1-2 we first initialize the lexical translation values t uniformly for every pair of words. For Model 1, in each iteration of the training process we collect the fractional counts over every possible alignment (pair of words) and then revise the values of the parameter t. Similarly, after training Model 1, we initialize the position alignment values a uniformly; in each iteration of Model 2, however, we revise both the lexical translation t and the position alignment a. For the training of Model 3, we use everything learnt from the Model 2 estimation to set the initial values. Then, to collect a subset of "reasonable" alignments, we start with the best Viterbi alignment found from the "perspective" of Model 2 and use it to greedily search for the Viterbi alignment of Model 3. That is, we collect only the reasonable neighbours of the best Viterbi alignment.

Hence, the quality of applying the fertility-based models is heavily dependent on the accuracy of the lexical translation and position alignment probabilities derived from Model 2; these parameters directly initialize the fertility-based alignment parameters. More importantly, there is no "trick" that lets us train Model 3 in a very fast way by inferring all possible alignments for each pair of parallel sentences. Our strategy is to carry out the sum over the translations of only the highly probable alignments, ignoring the vast sea of much less probable ones. Specifically, we begin with the most probable alignment that we can find and then include all alignments that can be obtained from it by small changes.
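The greedy neighbourhood search just described can be written down compactly. The following is a simplified illustration (our own code, not LGIZA); `model3_score` stands in for a function scoring a complete alignment under Model 3, and the move/swap neighbourhood follows the description in [1]:

```python
def neighbours(alignment, target_len):
    """Alignments reachable by one 'move' or one 'swap' [1].

    alignment: list where alignment[j] = i means source word j is
    aligned to target position i (0 denotes the NULL word).
    """
    J = len(alignment)
    for j in range(J):                       # moves: re-link one word
        for i in range(target_len + 1):
            if i != alignment[j]:
                a = alignment[:]
                a[j] = i
                yield a
    for j1 in range(J):                      # swaps: exchange two links
        for j2 in range(j1 + 1, J):
            if alignment[j1] != alignment[j2]:
                a = alignment[:]
                a[j1], a[j2] = a[j2], a[j1]
                yield a

def hillclimb(alignment, target_len, model3_score):
    """Greedy ascent starting from the Model 2 Viterbi alignment."""
    best, best_score = alignment, model3_score(alignment)
    improved = True
    while improved:
        improved = False
        for a in neighbours(best, target_len):
            s = model3_score(a)
            if s > best_score:
                best, best_score, improved = a, s, True
    return best

if __name__ == "__main__":
    # Toy stand-in score that prefers monotone alignments.
    score = lambda a: -sum(abs(j - a[j]) for j in range(len(a)))
    print(hillclimb([2, 0, 1], target_len=3, model3_score=score))
    # climbs to the monotone alignment [0, 1, 2]
```

Fractional counts for the fertility, distortion, and NULL parameters are then collected over the climbed alignment and its neighbourhood, rather than over all (I+1)^J alignments.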
the Pearson’s chi-square test 4.2 where ranges over rows of the table4 , ranges over columns, Oij is the observed value for cell (i, j) and Eij is the expected value [3] realized that it seems to be a particularly good choice for using the “independence” information Actually they used a measure of judge which they call φ2 , which is a X -like statistic The value of φ2 is bounded between and For more detail on the tutorial to calculate φ2 , please refer to [3] Of course, the performance for identifying word correspondences by using φ2 method is not good as using Model together with EM training scheme [16] However, we believe that this information is quite valuable From the above analysis, we could see that the accuracy of the lexical translation parameter obtains a very important aspect of improving the quality of using higher wordbased alignment models This section will focus on another approach to improve the quality of using higher alignment models in overall 4.1 (Oij − Eij ) Eij Experiment This experiment is deployed on various kinds of training corpora to have an accurate and reliable result The EnglishVietnamese training data was credited by [5] The EnglishFrench training corpus was the Hansards corpus [16] We use MOSES framework [8] as the phrase-based SMT framework In additions to use GIZA++, we also implement LGIZA toolkit Different to GIZA++, LGIZA is originally implemented based on the original documentary [1] without applying other latter improved techniques which are integrated to GIZA++ These are determining word classes for giving a low translation lexicon perplexity (Och, 1999), various smoothing techniques for the fertility, distortion or alignment parameters, symmetrization [16], etc which are Pearson’s Chi-Square Information Some previous researches pointed out that using Pearson’s chi-square test could also assist us in identifying word correspondences in bilingual training corpora[3] In fact, the essence of Pearson’s chi-square test is to compare the observed frequencies in a table with the frequencies expected for independence If the difference between observed and expected frequencies is large, then we can reject the null hypothesis of independence In the simplest case, the X test is applied to 2-by-2 tables The X statistic Sometimes 145 it is called “contingency tables” 5.1.2 Model vs HMM applied in GIZA++ [16] and therefore, the applying other improved techniques could make our results a little bit noise in comparison 5.1 Very different to other comparison for other well-known languages, we found that HMM gives a bad result when we compare to Model for the language pair EnglishVietnamese This comes from the fact that HMM model extends Model and Model 2, which models the lexical translation and the distortion translation, by also modelling the relative distortion In detail, the relative distortion is estimated by applying a first-order Hidden Markov Model, where each alignment probability is dependent on the distortion of the previous alignment However, for the language pair English-Vietnamese, the assumption that each alignment probability is dependent on the distortion of the previous alignment is not true We could see that the transformation of position alignment for the pair English-Vietnamese is quite more complicated than other well-known languages It reflects the quite difference in the word order between English and Vietnamese [14] This is another important aspect It leads to a bad quality when we apply the fertility-based models which trained based on the 
4.3 Improving IBM Model 1

Normally, the lexical translation parameter of Model 1 is initialized to a uniform distribution over the target-language vocabulary. From the above analysis, we have strong reasons to believe that these values do not produce the most accurate alignments. Hence, we use a heuristic model based on the log-likelihood-ratio (LLR) statistic recommended by [4, 13]. There is no guarantee, of course, that this is the optimal way; however, we found that by applying our heuristic the lexical translation model improves significantly. In addition, and more impressively, by improving the lexical translation model, the fertility-based translation models also achieve a better final result.
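The following sketches the LLR statistic and one way of turning its scores into an initial lexical distribution. This is our own illustration in the spirit of [13] (the exact normalization used by [13] differs in its details), with hypothetical count inputs:

```python
import math

def llr(n_ef, n_e, n_f, n):
    """Log-likelihood-ratio association for a 2x2 contingency table.

    Each entry is (observed count, row total, column total).
    """
    table = [(n_ef,                 n_e,     n_f),
             (n_e - n_ef,           n_e,     n - n_f),
             (n_f - n_ef,           n - n_e, n_f),
             (n - n_e - n_f + n_ef, n - n_e, n - n_f)]
    score = 0.0
    for o, row, col in table:
        e = row * col / n        # expected count under independence
        if o > 0:
            score += o * math.log(o / e)
    return 2.0 * score

def llr_init(cooc, n, smooth=1e-9):
    """Initialize t(f|e) proportionally to LLR scores.

    cooc: dict (e, f) -> (n_ef, n_e, n_f) sentence-level counts.
    """
    raw = {(e, f): max(llr(n_ef, n_e, n_f, n), smooth)
           for (e, f), (n_ef, n_e, n_f) in cooc.items()}
    norm = {}
    for (e, f), s in raw.items():
        norm[e] = norm.get(e, 0.0) + s
    return {(e, f): s / norm[e] for (e, f), s in raw.items()}

if __name__ == "__main__":
    print(round(llr(40, 50, 45, 1000), 2))
```

EM training of Model 1 then starts from this non-uniform distribution instead of the uniform one, which is the heuristic initialization evaluated in Section 5.2.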
5. Experiment

The experiments are deployed on various kinds of training corpora to obtain accurate and reliable results. The English-Vietnamese training data is credited to [5]; the English-French training corpus is the Hansards corpus [16]. We use the MOSES framework [8] as the phrase-based SMT framework. In addition to GIZA++, we also use our LGIZA toolkit. Unlike GIZA++, LGIZA is implemented strictly following the original description [1], without the later improvement techniques that are integrated into GIZA++ [16]: determining word classes to obtain a low translation-lexicon perplexity (Och, 1999), various smoothing techniques for the fertility, distortion, and alignment parameters, symmetrization [16], etc. Applying those additional techniques could make the comparison slightly noisy.

5.1 Comparing IBM Models

In this evaluation, we iteratively use the various word-based alignment models and measure how much each one "boosts" the overall quality of the phrase-based SMT system. Table 2 compares BLEU scores [17] for various word-based alignment training schemes on the language pair English-Vietnamese; similarly, Table 3 presents the comparison results for English-French. More detail is given in the following sections.

5.1.1 The best training schemes

A training scheme refers to the sequence of models used and the number of training iterations for each model. Our standard training scheme on the training data is 1^5 2^3 3^3 4^3; this notation means that five iterations of Model 1, three iterations of Model 2, three iterations of Model 3, and three iterations of Model 4 are performed. In practice, we found that this scheme typically gives very good results for the language pair English-Vietnamese compared to other training schemes, and it does not lead to the over-fitting problem. Choosing the best training scheme is an important task: if we apply the default GIZA++ training scheme to English-Vietnamese, the overall quality is considerably worse on all training corpora. Table 1 clearly shows the adverse effect of the default GIZA++ training scheme compared with our defined scheme.

Table 1. Comparison with the default training scheme (BLEU, English-Vietnamese)

  Corpus    Default    1^5 2^3 3^3 4^3    Δ(%)
  20,000    15.77      16.38              +0.61
  30,000    16.91      17.34              +0.43
  40,000    17.22      18.06              +0.84
  50,000    18.22      18.63              +0.41
  60,000    18.80      19.76              +0.96

Table 2. BLEU scores for various training schemes (English-Vietnamese)

  Model      Training scheme    20,000   30,000   40,000   50,000   60,000
  Model 1    1^5                15.78    17.59    17.79    18.43    18.77
  Model 2    1^5 2^3            17.13    17.71    18.43    19.00    19.60
  HMM        1^5 H^5            16.75    16.76    17.34    18.30    19.02
  Model 3    1^5 H^5 3^3        16.13    16.56    17.70    18.27    18.80
  Model 3    1^5 2^3 3^3        16.59    17.67    18.24    18.82    19.49
  Model 4    1^5 H^5 3^3 4^3    16.01    16.74    17.47    18.56    19.12
  Model 4    1^5 2^3 3^3 4^3    16.38    17.34    18.06    18.63    19.76
  Model 4    Default GIZA++     15.77    16.91    17.22    18.22    18.80

Table 3. BLEU scores for various training schemes (English-French)

  Model      Training scheme    20,000   30,000   40,000   50,000   60,000
  Model 1    1^5                23.17    22.70    24.95    25.14    25.88
  Model 2    1^5 2^5            23.18    24.19    25.31    25.81    26.61
  HMM        1^5 H^5            23.17    24.34    25.38    25.70    26.10
  Model 3    1^5 H^5 3^3        22.91    24.55    25.18    25.84    26.33
  Model 3    1^5 2^5 3^3        22.74    24.39    25.40    26.02    26.70
  Model 4    Default GIZA++     22.73    24.69    25.56    24.43    26.59
  Model 4    1^5 2^5 3^3 4^3    23.18    24.53    25.66    25.98    26.29
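The scheme notation can be handled mechanically. Below is a small illustrative sketch (our own code; the `trainers` callbacks are hypothetical stand-ins for a toolkit's per-model training routines, not a real GIZA++ or LGIZA interface):

```python
def parse_scheme(scheme):
    """Parse a training scheme such as '1^5 2^3 3^3 4^3' into a
    list of (model, iterations) stages, in training order."""
    stages = []
    for token in scheme.split():
        model, iters = token.split("^")
        stages.append((model, int(iters)))
    return stages

def run_scheme(scheme, trainers, params):
    """Run each stage, passing the learnt parameters forward.

    trainers: dict mapping a model name ('1', '2', 'H', '3', '4')
    to a function (params, iterations) -> params (hypothetical).
    """
    for model, iters in parse_scheme(scheme):
        params = trainers[model](params, iters)
    return params

print(parse_scheme("1^5 2^3 3^3 4^3"))
# -> [('1', 5), ('2', 3), ('3', 3), ('4', 3)]
```

The essential point the tables make is that the stage sequence and iteration counts are tunable, and the defaults shipped with a toolkit are not necessarily right for English-Vietnamese.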
5.1.2 Model 2 vs HMM

Very differently from comparisons reported for other well-known languages, we found that the HMM gives a worse result than Model 2 for the language pair English-Vietnamese. This comes from the fact that the HMM model extends Models 1 and 2, which model the lexical translation and the distortion, by also modelling the relative distortion. In detail, the relative distortion is estimated by applying a first-order Hidden Markov Model, where each alignment probability depends on the distortion of the previous alignment. For the language pair English-Vietnamese, however, this assumption does not hold. The transformation of position alignment for English-Vietnamese is considerably more complicated than for other well-known languages, reflecting the substantial difference in word order between English and Vietnamese [14]. This is another important aspect: it leads to poor quality when we apply fertility-based models trained from initial parameters transferred from the HMM model instead of Model 2. It also points to one of the most difficult problems in enhancing the quality of an English-Vietnamese machine translation system, namely the reordering problem [6].

5.1.3 Model 2 vs Model 3

From the above analysis tables, we see that Model 3 gives a worse result than Model 2 for the language pair English-Vietnamese. Tables 4 and 5 show the difference between Model 2 and Model 3 for the language pairs English-Vietnamese and English-French respectively.

Table 4. Comparing IBM Model 2 and IBM Model 3 (English-Vietnamese)

  Corpus    Model 2 (1^5 2^3)    Model 3 (1^5 2^3 3^3)    Δ(%)
  20,000    17.13                16.59                    -0.54
  30,000    17.71                17.67                    -0.04
  40,000    18.43                18.24                    -0.19
  50,000    19.00                18.82                    -0.18
  60,000    19.60                19.49                    -0.11

Table 5. Comparing IBM Model 2 and IBM Model 3 (English-French)

  Corpus    Model 2 (1^5 2^5)    Model 3 (1^5 2^5 3^3)    Δ(%)
  20,000    23.18                22.74                    -0.44
  30,000    24.19                24.39                    +0.20
  40,000    25.31                25.40                    +0.09
  50,000    25.81                26.02                    +0.21
  60,000    26.61                26.70                    +0.11

Also, we found that GIZA++ applies many improvement techniques that are mainly used to "boost" the quality of the fertility-based models. Indeed, when using LGIZA, which follows the original IBM model descriptions of [1] and applies no later improvement techniques, the contrast is even stronger. Tables 6 and 7 present the comparison results when applying each model to the statistical phrase-based SMT systems for English-Vietnamese and English-French. Notably, without the additional techniques, the quality of applying Model 3 is poor for English-Vietnamese, whereas for English-French Model 3 is only slightly worse than Model 2. That is why GIZA++ (with the help of the improvement techniques discussed above) usually obtains a better result than Model 2.

Table 6. Comparing IBM Model 2 and IBM Model 3 (English-Vietnamese) with LGIZA

  Corpus    Model 2 (1^5 2^3)    Model 3 (1^5 2^3 3^3)    Δ(%)
  20,000    17.02                16.32                    -0.70
  30,000    17.84                17.19                    -0.65
  40,000    18.23                17.46                    -0.77
  50,000    18.75                18.41                    -0.34
  60,000    19.29                18.38                    -0.91

Table 7. Comparing IBM Model 2 and IBM Model 3 (English-French) with LGIZA

  Corpus    Model 2 (1^5 2^5)    Model 3 (1^5 2^5 3^3)    Δ(%)
  20,000    23.31                23.27                    -0.04
  30,000    24.07                23.95                    -0.12
  40,000    25.29                25.21                    -0.08
  50,000    25.75                25.49                    -0.26
  60,000    26.51                26.30                    -0.21

5.1.4 IBM Model 4 vs IBM Model 3

It is steadily confirmed that Model 4 is significantly better than Model 3 [16]. This comes from the fact that the source language string constitutes phrases that are translated as units into the target language, and the distortion probabilities of Model 3 do not account well for this tendency of phrases to move around as units. More importantly, Model 4 also provides a very efficient way to integrate linguistic knowledge about the language pair into the statistical alignment model. However, the training of Model 4 depends on Model 3: Model 4 uses the fertility-based modelling probability and the other probabilities as the initial parameters transferred from Model 3. Since, as the above results show, Model 3 cannot "boost" its full merits for English-Vietnamese as it does for some other well-known language pairs, the improvement of Model 4 is not very good for this pair either.
5.1.5 P0 vs P1 values

For the fertility-based models, there is an important parameter: the probability of generating a spurious word from the empty cept. In the formal explanation for the language pair English-French by [7], after we assign fertilities to all the "real" English words (excluding NULL), we are ready to generate (say) z French words. As we generate each of these z words, we optionally toss in a spurious French word with probability p1; we refer to the probability of not tossing in a spurious word (at each point) as p0 = 1 - p1. The pair (p0, p1) takes a unique value for a given language pair. Tables 8 and 9 show the (p0, p1) values for English and Vietnamese in the pair English-Vietnamese; Tables 10 and 11 show the corresponding values for English and French in the pair English-French.

Table 8. p0 vs p1 of English for the pair English-Vietnamese

  Corpus    p0        p1
  20,000    0.9153    0.0847
  30,000    0.9012    0.0988
  40,000    0.8960    0.1040
  50,000    0.8911    0.1089
  60,000    0.8869    0.1131

Table 9. p0 vs p1 of Vietnamese for the pair English-Vietnamese

  Corpus    p0        p1
  20,000    0.8537    0.1462
  30,000    0.8314    0.1686
  40,000    0.8191    0.1809
  50,000    0.8112    0.1888
  60,000    0.8074    0.1926

Table 10. p0 vs p1 of English for the pair English-French

  Corpus    p0        p1
  20,000    0.8281    0.1719
  30,000    0.8341    0.1659
  40,000    0.8480    0.1520
  50,000    0.8432    0.1568
  60,000    0.8397    0.1603

Table 11. p0 vs p1 of French for the pair English-French

  Corpus    p0        p1
  20,000    0.8426    0.1574
  30,000    0.8207    0.1793
  40,000    0.8299    0.1701
  50,000    0.8495    0.1505
  60,000    0.8460    0.1540

As the training corpus grows, the p0 value of English converges to an approximate value around 0.89, and p1 to around 0.11; in the other direction, the p0 value of Vietnamese converges to an approximate value around 0.81. The default GIZA++ training scheme sets the p0 of English to 0.999 and keeps this value fixed in Models 3 and 4. Since 0.999 is far from 0.89, it is better to let the value of p0 change during training to obtain a better result. Also, because modelling the NULL translation is difficult and the probability p1 of Vietnamese is greater than that of English, it is harder to model the translation from English to Vietnamese than from Vietnamese to English. Therefore, we will obtain a better system from English to Vietnamese than from Vietnamese to English (Bayesian reasoning [1]). In other words, an English-to-Vietnamese translation system will usually obtain a higher BLEU score than a Vietnamese-to-English one. However, Tables 10 and 11 show that this does not happen for the pair English-French. Hence, our suggestion is as follows: if we want to build an English-Vietnamese parallel-extraction system, it is better to translate from English to Vietnamese and then process the translated sentences in the processing framework; otherwise, if we apply an improving technique, it is better to test its effect on a Vietnamese-English translation system.
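A toy sketch of the spurious-word step in the generative story of [7] may make the role of p1 concrete (our own illustration; the token name is a placeholder):

```python
import random

def generate_spurious(real_words, p1, rng=random):
    """After fertility assignment yields the 'real' target words,
    optionally toss in a spurious NULL-generated word after each
    one with probability p1 (no insertion with p0 = 1 - p1)."""
    out = []
    for w in real_words:
        out.append(w)
        if rng.random() < p1:
            out.append("<NULL-word>")   # placeholder spurious token
    return out

if __name__ == "__main__":
    random.seed(0)
    print(generate_spurious(["tôi", "yêu", "ngôn", "ngữ"], p1=0.19))
```

With the higher p1 of Vietnamese from Table 9, spurious insertions are noticeably more frequent than with the English p1 of Table 8, which is the asymmetry the discussion above builds on.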
5.2 Evaluation on Improving Lexical Modelling

From the above comparison, we can see that we need some way to improve the quality of using the higher models. The problem can be stated as: the higher models cannot "boost" all their hidden power. As we mentioned, this is another important aspect; in our opinion, improving the quality of word alignment is one of the most important tasks for obtaining a state-of-the-art English-Vietnamese SMT system. Addressing this problem, and unlike previous methods that improve the quality of statistical machine translation by combining the final result with other features in a log-linear combination model [15][12], we focus on improving the lexical modelling so as to better boost the quality of the fertility-based models. This experimental section provides the evidence for our method.

5.2.1 Baseline Results

We test our method on various training corpora for both language pairs, English-Vietnamese and English-French, to see the effect of our heuristic initialization of the Model 1 parameters. The original results for the original model implementations are given in Table 12 for the pair English-Vietnamese and in Table 13 for the pair English-French; each column reports the BLEU score for the corresponding IBM translation model.

Table 12. Baseline results for the pair English-Vietnamese

  Corpus    Model 1    Model 2    Model 3
  20,000    15.90      17.02      16.32
  30,000    17.44      17.84      17.19
  40,000    18.05      18.23      17.46
  50,000    18.49      18.75      18.41
  60,000    19.00      19.29      18.38

Table 13. Baseline results for the pair English-French

  Corpus    Model 1    Model 2    Model 3
  20,000    22.19      23.31      23.27
  30,000    23.57      24.07      23.95
  40,000    24.47      25.29      25.21
  50,000    25.01      25.75      25.49
  60,000    25.75      26.51      26.30

5.2.2 Improved IBM Models by Heuristic Initialization

Each translation model has its own specific view of modelling the way of translation, and each differs in the equation denoting the translation probability. However, there is a strong relationship between them: each more complex translation model uses the estimates derived from a simpler translation model as its initial values. Our experimental results point this out clearly. Table 14 describes the improved results for the language pair English-Vietnamese; similarly, Table 15 shows the effects for the language pair English-French.

Table 14. Improved results of the IBM Models for the pair English-Vietnamese

  Corpus    M.1      Δ        M.2      Δ        M.3      Δ
  20,000    14.48    -1.42    16.64    -0.38    16.21    -0.11
  30,000    17.42    -0.02    17.86    +0.02    17.22    +0.03
  40,000    18.44    +0.39    18.62    +0.39    18.02    +0.56
  50,000    18.51    +0.02    18.96    +0.21    18.61    +0.20
  60,000    19.27    +0.27    19.89    +0.60    18.85    +0.47

Table 15. Improved results of the IBM Models for the pair English-French

  Corpus    M.1      Δ        M.2      Δ        M.3      Δ
  20,000    22.38    +0.19    23.76    +0.45    23.90    +0.63
  30,000    23.81    +0.24    24.44    +0.37    24.49    +0.54
  40,000    24.56    +0.09    25.52    +0.23    25.55    +0.34
  50,000    25.18    +0.17    25.93    +0.64    25.94    +0.45
  60,000    26.04    +0.29    26.94    +0.43    26.86    +0.56

Recent research points out that it is difficult to achieve large gains in translation performance by improving word-alignment results: a better lexical translation model can be quite strong, yet it is very hard to boost the overall quality of a translation system [10]. However, with a very basic improvement in initializing the Model 1 parameters, we can see that the BLEU score of using Model 3 increases even more than the improvements of Models 1-2. Moreover, the improvement is better for a larger training corpus.

6. Conclusion

The step of phrase learning in statistical phrase-based translation, the current state of the art in SMT, is critically important; in brief, the word-based alignment component directly affects the phrase pairs that are extracted from the training corpora. This research has carried out a systematic comparison between various word-based alignment models for phrase-based SMT systems. We have found that using the HMM and fertility-based alignment models usually gives better results for the language pair English-French; for English-Vietnamese, however, the comparison result is usually the opposite. Previous research on improving the overall quality of statistical phrase-based translation systems points out that it is very hard to improve the BLEU score by more than 1%. From our comparison results, however, we can see that appropriately configuring the best training scheme and other features, such as the probability of tossing in spurious words for each language pair, can significantly improve the quality of statistical phrase-based machine translation. The other contribution of our work is to have clearly demonstrated the importance of the lexical alignment model to the higher translation models in the training process. In detail, we have pointed out that improving the quality of the lexical model is essential for enhancing the quality of using the higher, fertility-based word-alignment models for statistical phrase-based machine translation. This is especially important for the language pair English-Vietnamese, for which the quality of using Model 4 as the word-based alignment component is poor compared to the pair English-French.

ACKNOWLEDGEMENT

This work is partially supported by the CN.10.01 project at the University of Engineering and Technology, Vietnam National University, Hanoi. It is also partially supported by Vietnam's National Foundation for Science and Technology Development (NAFOSTED), project code 102.99.35.09, and by the project KC.01.TN04/11-15. We are thankful to the anonymous reviewers for their comments, especially to the reviewer who suggested using the Berkeley aligner and recommended corrections to some of our affirmations.
References

[1] P. F. Brown, V. J. Della Pietra, S. A. Della Pietra, and R. L. Mercer. The mathematics of statistical machine translation: parameter estimation. Computational Linguistics, 19:263-311, June 1993.
[2] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39(1):1-38, 1977.
[3] W. A. Gale and K. W. Church. Identifying word correspondence in parallel texts. In Proceedings of the Workshop on Speech and Natural Language, HLT '91, pages 152-157, Stroudsburg, PA, USA, 1991. Association for Computational Linguistics.
[4] W. A. Gale and K. W. Church. A program for aligning sentences in bilingual corpora. Computational Linguistics, 19:75-102, March 1993.
[5] C. Hoang, A. Le, P. Nguyen, and T. Ho. Exploiting non-parallel corpora for statistical machine translation. In Proceedings of the 9th IEEE-RIVF International Conference on Computing and Communication Technologies, pages 97-102. IEEE Computer Society, 2012.
[6] V. Hoang, M. Ngo, and D. Dinh. A dependency-based word reordering approach for statistical machine translation. In RIVF, pages 120-127, 2008.
[7] K. Knight. A Statistical MT Tutorial Workbook. Aug. 1999.
[8] P. Koehn, H. Hoang, A. Birch, C. Callison-Burch, M. Federico, N. Bertoldi, B. Cowan, W. Shen, C. Moran, R. Zens, C. Dyer, O. Bojar, A. Constantin, and E. Herbst. Moses: open source toolkit for statistical machine translation. In Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions, ACL '07, pages 177-180, Stroudsburg, PA, USA, 2007. Association for Computational Linguistics.
[9] P. Koehn, F. J. Och, and D. Marcu. Statistical phrase-based translation. In Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, Volume 1, NAACL '03, pages 48-54, Stroudsburg, PA, USA, 2003. Association for Computational Linguistics.
[10] A. Lopez. Word-based alignment, phrase-based translation: What's the link? In Proceedings of AMTA, pages 90-99, 2006.
[11] C. D. Manning and H. Schütze. Foundations of Statistical Natural Language Processing. MIT Press, Cambridge, MA, USA, 1999.
[12] J. B. Mariño, R. E. Banchs, J. M. Crego, A. de Gispert, P. Lambert, J. A. R. Fonollosa, and M. R. Costa-jussà. N-gram-based machine translation. Computational Linguistics, 32(4):527-549, Dec. 2006.
[13] R. C. Moore. Improving IBM word-alignment Model 1. 2005.
[14] T. P. Nguyen, A. Shimazu, T.-B. Ho, M. Le Nguyen, and V. Van Nguyen. A tree-to-string phrase-based model for statistical machine translation. In Proceedings of the Twelfth Conference on Computational Natural Language Learning, CoNLL '08, pages 143-150, Stroudsburg, PA, USA, 2008. Association for Computational Linguistics.
[15] F. J. Och and H. Ney. Discriminative training and maximum entropy models for statistical machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, ACL '02, pages 295-302, Stroudsburg, PA, USA, 2002. Association for Computational Linguistics.
[16] F. J. Och and H. Ney. A systematic comparison of various statistical alignment models. Computational Linguistics, 29:19-51, March 2003.
[17] K. Papineni, S. Roukos, T. Ward, and W.-J. Zhu. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, ACL '02, pages 311-318, Stroudsburg, PA, USA, 2002. Association for Computational Linguistics.
[18] S. Vogel, H. Ney, and C. Tillmann. HMM-based word alignment in statistical translation. In Proceedings of the 16th Conference on Computational Linguistics, Volume 2, COLING '96, pages 836-841, Stroudsburg, PA, USA, 1996. Association for Computational Linguistics.
