... performance (around 60-70% F1-measure) is obtained only for specific domains (e.g., an ICT corpus) and patterns (Borg et al., 2009). Only few papers try to cope with the generality of patterns ... rather from specific patterns like “X such as Y”. Therefore a direct comparison with these methods is not possible. Nonetheless, we decided to implement Hearst’s patterns for the...
Uploaded: 20/02/2014, 04:20
... likely pronunciation for each word. It is straightforward to extend to multiple pronunciations by first sampling a pronunciation for each word and then sampling a segmentation for that pronunciation. Once ... b ax, d ae n. The latter is more useful for automatically recovering the word’s orthographic form, identifying that an OOV was spoken, or improving performance of a spoken t...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Learning with Unlabeled Data for Text Categorization Using Bootstrapping and Feature Projection Techniques" doc
... unlabeled data and a small amount of seed information to tell the learner about the specific task. In this paper, we consider seed information in the form of title words associated with categories. ... machine-labeled data. This paper provides solutions for these problems. For the first problem, we employ the bootstrapping framework. For the second, we use the TCFP classifier with...
Uploaded: 20/02/2014, 16:20
Document: Scientific report: "Incremental Syntactic Language Models for Phrase-based Translation" pptx
... category for input “meets the”. A sample phrase structure tree is shown before and after the right-corner transform in Figures 2 and 3. Our parser operates over a right-corner transformed probabilistic ... Association for Computational Linguistics, pages 620–631, Portland, Oregon, June 19-24, 2011. © 2011 Association for Computational Linguistics Incremental Syntactic Language Models...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "A Localized Prediction Model for Statistical Machine Translation" ppt
... blocks for for which . 4 Online Training of Maximum-entropy Model The local model described in Section 3 leads to the following abstract maximum entropy training formulation: (8) In this formulation, ... corresponding to label . The symbol is short-hand for the feature-vector . This formulation is slightly different from the standard maximum entropy formulation typically enco...
Uploaded: 20/02/2014, 15:20
Document: Scientific report: "Refined Lexicon Models for Statistical Machine Translation using a Maximum Entropy Approach" pptx
... lexicon models lack from context information that can be extracted from the same parallel corpus. This additional information could be: Simple context information: information of the words surrounding ... surrounding the word pair; Syntactic information: part-of-speech information, syntactic constituent, sentence mood; Semantic information: disambiguation information (e.g. from WordNe...
Uploaded: 20/02/2014, 18:20
Document: Scientific report: "A DP based Search Algorithm for Statistical Machine Translation" docx
... additional parameter into the recursion formula for DP. In the following, we will explain this method in detail. 2.3 Recursion Formula for DP In the DP formalism, the search process is described ... little meaningful information or the information is different from the input. Examples for each category are given in Table 3. Table 4 shows the statistics of the translation perfo...
Uploaded: 20/02/2014, 18:20
Document: Scientific report: "Learning to Translate with Multiple Objectives" doc
... recording the set of hypotheses that maximizes $\sum_k p_k M_k(h)$. For $0.6 < p_1 \le 1$ we get $h = (0.9, 0.1)$, for $p_1 = 0.6$ we get $(0.7, 0.6)$, and for $0 < p_1 < 0.6$ we get $(0.4, 0.8)$. At no setting ... exponentiated-combination $\sum_k p_k M_k(h)^q$, for a suitable $q > 0$, does satisfy necessary conditions for Pareto optimality. However the proper tuning of $q$ is not known a prior...
Uploaded: 19/02/2014, 19:20
Document: Scientific report: "Learning Syntactic Verb Frames Using Graphical Models" doc
... between pairs of SCFs) and a mapping from surface frames to the underlying predicate-argument structure. Information about verb subcategorization is useful for tasks like information extraction (Cohen and ... that for each verb, it has an accurate distribution over that inventory. We therefore compare the lexicons based on their performance on a task that a good SCF lexicon should b...
Uploaded: 19/02/2014, 19:20
Document: Scientific report: "Learning to Find Translations and Transliterations on the Web" doc
... system that outperforms previous work. 1 Introduction The phrase translation problem is critical to machine translation, cross-lingual information retrieval, and multilingual terminology (Bian ... translation for a given term, or to supplement a bilingual terminology bank (e.g., adding multilingual titles to existing Wikipedia); alternatively, they can be used as additional tra...
Uploaded: 19/02/2014, 19:20