machine translation to larger corpora

Scientific report: "Scaling Phrase-Based Statistical Machine Translation to Larger Corpora and Longer Phrases" pptx

Scientific report

... arrays to parallel corpora to calculate phrase translation probabilities. 4.1 Applied to parallel corpora In order to adapt suffix arrays to be useful for statistical machine translation we need ... fre- [figure residue: suffix array over an example sentence; suffixes include "declined to aid morocco" and "declined to confirm that spain declined to aid morocco"] ...
Scientific report document: "A Ranking-based Approach to Word Reordering for Statistical Machine Translation" doc

Scientific report

... English-Hindi Statistical Machine Translation. In Proc. IJCNLP. Roy Tromble. 2009. Search and Learning for the Linear Ordering Problem with an Application to Machine Translation. Ph.D. Thesis. Karthik ... converted to dependency trees using Stanford Parser (Marneffe et al., 2006). We convert the tokens in training data to lower case, and re-tokenize the sentences using the same tokenizer from ... sensitive to parser errors; on the other hand, integrated model is forced to use a longer distortion limit which leads to more search errors during decoding time. It is possible to ...
Scientific report document: "HINDI TO PUNJABI MACHINE TRANSLATION SYSTEM" pdf

Scientific report

... Case Study of Hindi to Punjabi Machine Translation System. International Journal of Translation. (Accepted, In Print). Goyal V., Lehal G.S. 2011a. Hindi to Punjabi Machine Translation System. ... 2.6.3 Word-to-Word translation using lexicon lookup If token is not a title or a surname, it is looked up in the HPDictionary database containing Hindi to Punjabi direct word to word translation. ... gives it to translation engine for analysis till the complete input text is read and processed. 2.6 Translation Engine The translation engine is the main component of our Machine Translation...
Scientific report document: "The Contribution of Linguistic Features to Automatic Machine Translation Evaluation" docx

Scientific report

... according to the automatic metric. For this, we consider in each translation case c, the worse automatic translation t that equals or improves the human-aided translation t_h according to the automatic ... and the Confusion of Tongues: an MT Metric. In Proceedings of the Workshop on MT Evaluation "Who did what to whom?" at Machine Translation Summit VIII, pages 55–59. Christoph Tillmann, Stefan ... Labelled Dependencies in Machine Translation Evaluation. In Proceedings of the ACL Workshop on Statistical Machine Translation, pages 104–111. Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing...
Scientific report document: "Name Translation in Statistical Machine Translation Learning When to Transliterate" pptx

Scientific report

... USA me@hal3.name Abstract We present a method to transliterate names in the framework of end-to-end statistical machine translation. The system is trained to learn when to transliterate. For Arabic to English MT, we developed ... apply it to any base SMT system, and to human translations as well. Our goal in augmenting a base SMT system is to increase this percentage. A secondary goal is to make sure that our overall translation ... transliterator described in section 3 to the tagged items. We limit this transliteration to words that occur up to 50 times in the training corpus for single token names (or up to 100 and 150 times for...
Scientific report document: "Segmentation for English-to-Arabic Statistical Machine Translation" ppt

Scientific report

... their effect on translation. We also report on applying Factored Translation Models (Koehn and Hoang, 2007) for English-to-Arabic translation. 2 Previous Work The only previous work on English-to-Arabic ... techniques. We also report on the use of Factored Translation Models for English-to-Arabic translation. 1 Introduction Arabic has a complex morphology compared to English. Words are inflected for gender, ... attached to the word. So syArp +P:1S recombines to syArty ('my car') 2. Letter Ambiguity: The character 'Y' (AlfmqSwrp) is normalized to 'y'. In the recombination step we need to be able to decide whether...
Scientific report document: "Bilingual Terminology Acquisition from Comparable Corpora and Phrasal Translation to Cross-Language Information Retrieval" pptx

Scientific report

... phrasal translation sequences is selected and collated into the final phrasal translation. Re-scoring through a Test Collection Large-scale test collections could be used to re-score the translation ... comparable corpora. Re-scoring through the Comparable Corpora Comparable corpora could be considered for the disambiguation of translation alternatives and thus selection of best phrasal translations ... occurred is used to represent the re-scoring factor RF for each sequence of translation candidates. Phrasal translation candidates are sorted in decreasing order by re-scoring factors RF. Finally,...
Scientific report: "Syntax-to-Morphology Mapping in Factored Phrase-Based Statistical Machine Translation from English to Turkish" ppt

Scientific report

... English-to-Turkish translation, but without using any morphology. 6 Conclusions We have presented a novel way to incorporate source syntactic structure in English-to-Turkish phrase-based machine translation ... of the factors. We aligned our training sets using only the root factor to conflate statistics from different forms of the same root. The rest of the factors are then automatically assumed to be aligned, ... possible way to address is to use longer distance constraints on the morphological tag factors, to see if we can select them better. 3.2.3 Experiments with higher-order language models Factored phrase-based...
Scientific report: "Hindi-to-Urdu Machine Translation Through Transliteration" pptx

Scientific report

... a pivot to translate from English to Urdu. This work also uses transliteration only for the translation of unknown words. Their work cannot be used for direct translation from Hindi to Urdu ... models look for the most probable Urdu token sequence u_1^n for a given Hindi token sequence h_1^n. We assume that each Hindi token is mapped to exactly one Urdu token and that there is no reordering. ... and the translation model. We refer to the words known to the language model and to the translation model as LM-known and TM-known words respectively and to words that are unknown as LM-unknown and...
Scientific report: "Applying Morphology Generation Models to Machine Translation" docx

Scientific report

... In the second setting, we allow the model to use up to 100 translations, and to automatically select the best number to use. As seen in Table 3, (n=16) translations were chosen for Russian and ... statistical machine translation. In HLT-NAACL. Xiaodong He. 2007. Using word-dependent transition models in HMM based word alignment for statistical machine translation. In ACL Workshop on Statistical Machine ... analysis for statistical machine translation. In HLT-NAACL. Einat Minkov, Kristina Toutanova, and Hisami Suzuki. 2007. Generating complex morphology for machine translation. In ACL. Sonja Nießen...
Scientific report: "Tailoring Word Alignments to Syntactic Machine Translation" docx

Scientific report

... model that incorporates syntax-based distortion. Lopez and Resnik (2005) considers a simpler tree distance distortion model. Daumé III and Marcu (2005) employs a syntax-aware distortion model ... case of syntactic machine translation, we want to condition on crossing constituent boundaries, even if no constituents are skipped in the process. 4 Experimental Results To understand the ... French results to the public baseline GIZA++ using the script published for the NAACL 2006 Machine Translation Workshop Shared Task. Similarly, we compared our Chinese results to the GIZA++...
Scientific report: "A New String-to-Dependency Machine Translation Algorithm with a Target Dependency Language Model" pot

Scientific report

... tree-to-tree models can represent richer structural information, existing tree-to-tree models did not show advantage over string-to-tree models on translation accuracy due to a much larger ... successfully applied to Statistical Machine Translation (Graehl and Knight, 2004; Chiang, 2005; Ding and Palmer, 2005; Quirk et al., 2005). In some language pairs, i.e. Chinese-to-English translation, ... two-step string-to-CFG-tree translation model which employed a syntax-based language model to select the best translation from a target parse forest built in the first step. Only translation probability...
Scientific report: "Tree-to-String Alignment Template for Statistical Machine Translation" pdf

Scientific report

... as a bottom-up beam search. To translate a source sentence, we employ a parser to produce a parse tree. Moving bottom-up through the source parse tree, we compute a list of candidate translations ... Candidate translations of subtrees are placed in stacks according to the root index set by postorder traversal. A candidate translation contains the following information: 1. the partial translation 2. ... of tree-to-string alignment templates obtained in training. In the following, we formally describe how to introduce tree-to-string alignment templates into probabilistic dependencies to model...
Scientific report: "Towards a Unified Approach to Memory- and Statistical-Based Machine Translation" pdf

Scientific report

... used to improve the performance of current translation systems. To determine this, we modified an existing decoding algorithm so that it can exploit information specific both to a statistical translation ... statistical TMEM and the translation model. Our experiments show that the automatically derived translation memory can be used within the statistical framework to often find translations of higher ... words into which e is going to be translated. Each English word e is then translated with probability t e into a French word , where ranges from 1 to the number of words (fertility of e) into which...
Scientific report: "A PARAMETERIZED APPROACH TO INTEGRATING ASPECT WITH LEXICAL-SEMANTICS FOR MACHINE TRANSLATION" doc

Scientific report

... "A Cross-Linguistic Approach to Machine Translation," Proceedings of the Third International Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages, ... discussed the application of the theoretical foundations to the automatic acquisition of aspectual representations from corpora in order to augment the lexical-semantic representations that ... Spanish only. Consider the following example: (7) (i) John went to the store when Mary arrived. (ii) John had gone to the store when Mary arrived. In Dorr (1991), we discussed the selection...