Scientific report: "Prefix Probability for Probabilistic Synchronous Context-Free Grammars" ppt
... Association for Computational Linguistics, pages 460–469, Portland, Oregon, June 19-24, 2011. © 2011 Association for Computational Linguistics Prefix Probability for Probabilistic Synchronous Context-Free ... probability is more precisely defined as the sum of the probabilities of translation pairs of the form $[v_1 w_1, v_2 w_2]$, for any strings $w_1$ and $w_2$. A special case...
Uploaded: 23/03/2014, 16:20
... in order to focus the research effort on common problems for the language pair in question. There have been previous attempts of describing typologies for EA for MT, but they are not unproblematic. ... fine-grained typology and guidelines for EA. I have also created a tool for performing MT error analysis (Stymne, 2011a). Initial annotations have helped to focus my research efforts...
Uploaded: 23/03/2014, 16:20
... HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading. Association for Computational Linguistics, Los Angeles, California, June. Tom O’Hara and Janyce Wiebe. ... equal importance for each parameter. It thus prefers “general” solutions, assigning part of the probability mass to unlikely states (Johnson, 2007). We ran EM on each model for 100 it...
Uploaded: 20/02/2014, 05:20
Document: Scientific report: "A Gibbs Sampler for Phrasal Synchronous Grammar Induction" docx
... p null = 10^−10 for this value in the experiments we report below. ... either φ P z i for phrase pairs or φ null for single language phrases. We choose Dirichlet process (DP) priors for these parameters: φ P z i ∼ ... estimator of Koehn et al. (2003). We develop a novel Gibbs sampler to perform inference over the latent synchronous derivation trees for our training instances. Th...
Uploaded: 20/02/2014, 07:20
Document: Scientific report: "Unsupervised Topic Modelling for Multi-Party Spoken Discourse" ppt
... Association for Computational Linguistics Unsupervised Topic Modelling for Multi-Party Spoken Discourse Matthew Purver CSLI Stanford University Stanford, CA 94305, USA mpurver@stanford.edu Konrad ... desired general level of granularity. For each set of annotations, we therefore performed two sets of segmentations: one in which the threshold was set for each meeting to give the known...
Uploaded: 20/02/2014, 11:21
Document: Scientific report: "An expressive formalism for describing tree-based grammars" docx
... tree description and/or of semantic formulas. The XMG formalism furthermore supports the sharing of identifiers across dimension hence allowing for a straightforward encoding of the syntax/semantics ... framework for the processing of linguistic meta-descriptions. 1 Introduction It is well known that grammar engineering is a complex task and that factorizing grammar information...
Uploaded: 22/02/2014, 02:20
Scientific report: "Measure Word Generation for English-Chinese SMT Systems" ppt
... quantity of objects. Therefore, in the English-to-Chinese machine translation task we need to take additional efforts to generate the missing measure words in Chinese. For example, when translating ... (Chiang and Bikel, 2002) which can heuristically identify head words for sub-trees. For the bilingual corpus, we also perform word alignment to get correspondences between source...
Uploaded: 08/03/2014, 01:20
Scientific report: "Learning Intonation Rules for Concept to Speech Generation" pptx
... words. The baseline performance achieved by always guessing the majority class is 67.09% for break index, 54.10% for pitch accent, 66.23% for phrase accent and 79.37% for boundary tone based ... component provides very rich syntactic and semantic information which has not been explored before for intonation modeling. This includes, for example, the semantic role played by...
Uploaded: 08/03/2014, 05:21
Scientific report: "Collective Classification for Fine-grained Information Status" ppt
... Association for Computational Linguistics, pages 795–804, Jeju, Republic of Korea, 8-14 July 2012. © 2012 Association for Computational Linguistics Collective Classification for Fine-grained Information ... best of our knowledge, we therefore present the first English corpus reliably annotated for a wide range of IS categories as well as full anaphoric information for three main anap...
Uploaded: 30/03/2014, 17:20
Scientific report: "Exemplar-Based Models for Word Meaning In Context" pptx
... dataset, we obtained 33.1–36.5 for actT; 33.3–38.0 for actP; 37.7–39.9 for actTP. On the EP09 dataset, the numbers were 35.8–39.1 for actT; 38.1–39.9 for actP; 37.2–39.8 for actTP. 4 We did not have ... contextual features for all the senses of the target. For example, among the top 20 features for coach, we get match and team (for the “trainer” sense) as well as driver a...
Uploaded: 30/03/2014, 21:20