Scientific report: "Active Learning-Based Elicitation for Semi-Supervised Word Alignment" pptx

Scientific report: "Active Sample Selection for Named Entity Transliteration" pptx

... Volume), pages 53–56, Columbus, Ohio, USA, June 2008. © 2008 Association for Computational Linguistics Active Sample Selection for Named Entity Transliteration Dan Goldwasser Dan Roth Department of ... which may not be available for all languages. We show how to effectively train an accurate transliteration classifier using very little data, obtained automatically. To perform this tas...

Uploaded: 31/03/2014, 00:20

4 186 0
Document, scientific report: "Using Confidence Bands for Parallel Texts Alignment" pptx

... intercept (the value of y when x is 0), substituting x for the Portuguese word position. For Table 3, the expected word position for the word I at pt word position 3877 is 0.9165 × 3877 + 141.65 = ... brackets). For average size texts (e.g. the Written Questions), these words account for about 5% of the total (about 3k words / text). This number varies according to langu...

Uploaded: 20/02/2014, 18:20

8 464 0
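
The expected-position calculation quoted in the excerpt above is a straight-line fit, y = slope × x + intercept. A minimal Python sketch follows, using only the slope (0.9165) and intercept (141.65) quoted in the excerpt. The function name `expected_position` and its parameter names are illustrative assumptions, not taken from the paper.

```python
def expected_position(x, slope=0.9165, intercept=141.65):
    """Expected word position in the other text for source position x.

    slope and intercept default to the values quoted in the excerpt;
    the function and parameter names are hypothetical.
    """
    return slope * x + intercept


# Portuguese word position 3877, as in the excerpt's example.
print(expected_position(3877))

# At x = 0 the result is the intercept itself.
print(expected_position(0))
```

Keeping the slope and intercept as parameters makes it straightforward to reuse the same function with a different regression line.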
Scientific report: "Semi-Supervised Training for Statistical Word Alignment" docx

... parameters used in generating Foreign words which are unaligned 11 backoff fertility for words with count <= 5 4 d 1 (j) movement probs of leftmost Foreign word translated from a particular ... English words (“zero fertility”) and aligned English words, and unaligned Foreign words (“NULL-generated” words) and aligned Foreign words. This is a small sampling of the kinds of knowledge...

Uploaded: 31/03/2014, 01:20

8 193 0
Document, scientific report: "An Unsupervised Model for Joint Phrase Alignment and Extraction" ppt

... will call HEUR-W as it generates word alignments. It should be noted that forcing alignments smaller than the model suggests is only used for generating alignments for use in heuristic extraction, ... not optimal for the final task of generating phrase tables that are used in translation. As a solution to this, they proposed a supervised discriminative model that performs joint...

Uploaded: 20/02/2014, 04:20

10 641 0
Document, scientific report: "A Declarative Language for Implementing Dynamic Programs" pptx

... sentence and grammar by asserting values for certain items. If the input is John loves Mary, the user should assert values of 1 for word(John,0,1), word(loves,1,2), word(Mary,2,3), and end(3). If the ... acquire more over time: we intend for it to generalize and encapsulate best practices, and serve as a testbed for new practices. Dyna is now being used for parsing, machin...

Uploaded: 20/02/2014, 16:20

4 560 0
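
The excerpt above describes asserting one `word` item per token, spanning positions i to i+1, plus a single `end` item at the sentence length. As a rough illustration only, written in Python rather than Dyna, those items could be generated as below. The function name `sentence_axioms` and the tuple representation are hypothetical and are not Dyna syntax.

```python
def sentence_axioms(sentence):
    """Map each input item to the asserted value 1.

    Items are plain tuples: ("word", token, start, end) and ("end", length).
    This representation is an assumption made for illustration.
    """
    words = sentence.split()
    axioms = {}
    for i, token in enumerate(words):
        # Each token covers the span from position i to i + 1.
        axioms[("word", token, i, i + 1)] = 1
    # One extra item marks where the sentence ends.
    axioms[("end", len(words))] = 1
    return axioms


for item, value in sentence_axioms("John loves Mary").items():
    print(item, "=", value)
```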
Scientific report: "A Statistical Model for Lost Language Decipherment" pptx

... potential Hebrew word forms for each Ugaritic word. By intersecting two such FSAs and minimizing the result we can efficiently represent all potential Hebrew words for a particular Ugaritic word. We ... language, in its written form (which lacks vowels) words can easily be segmented (e.g. wyplṭn becomes wy-plṭ-n). Overall, we identified Hebrew cognates for 2,155 word forms,...

Uploaded: 17/03/2014, 00:20

10 429 0
Scientific report: "A Semantic Framework for Translation Quality Assessment" pptx

... it still performs well above random chance predictions, which, for the given average of 4 items per ranking, is about 25% for best and worst ranking predictions, and about 8.33% for both. Again, ... single measure by implementing a weighted harmonic mean, for which the weighting factor can be adjusted for optimizing the metric performance. The rest of the paper is organized a...

Uploaded: 17/03/2014, 00:20

6 449 0
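
The chance-level figures quoted in the excerpt above (about 25% and about 8.33% for 4 items per ranking) can be reproduced with a short calculation. The sketch below assumes a uniform random guesser that picks the best item from all n items and then the worst item from the remaining n - 1. That model, and the function names, are assumptions made for illustration and are not stated in the excerpt.

```python
from fractions import Fraction


def chance_best_or_worst(n_items):
    """Probability of guessing the best (or the worst) item uniformly at random."""
    return Fraction(1, n_items)


def chance_both(n_items):
    """Probability of guessing both: best from n items, then worst from the other n - 1."""
    return Fraction(1, n_items) * Fraction(1, n_items - 1)


n = 4  # average number of items per ranking, as quoted in the excerpt
print(chance_best_or_worst(n))  # 1/4
print(chance_both(n))           # 1/12
```

Under this assumption, 1/4 matches the quoted 25% and 1/12 matches the quoted 8.33%.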
Scientific report: "A Portable Algorithm for Mapping Bitext Correspondence" pptx

... for the tokenizer to know which words are compounds. A word that has another word as a substring should result in one axis position for the substring and one for the superstring. When lexical ... is acceptable for the tokenization program to overgenerate just as it is acceptable for the matching predicate. For example, when tokenizing German text, it is not necessa...

Uploaded: 17/03/2014, 23:20

8 363 0
Scientific report: "An Integrated Architecture for Generating Parenthetical Constructions" pptx

... important parts of the message. By structuring information this way, parentheticals make it easier for readers to decode the message conveyed by a text. Consider for example the following message that ... (Joshi, 2004). The input to the generator is a set of rhetorical relations and semantic formulas. For each formula the system selects a set of trees from the grammar, resulting in a n...

Uploaded: 23/03/2014, 17:20

6 248 0