Scientific report: "Learning Tense Translation from Bilingual Corpora"
... Learning Tense Translation from Bilingual Corpora Michael Schiehlen* Institute for Computational Linguistics, University ... future in the past (3) In some cases, tense was ambiguous when considered in isolation, and had to be resolved in tandem with tense translation. Ambiguous tenses on the target side were disambiguated ... Formally, we define source tense and targe...
Uploaded: 31/03/2014, 04:20
... instances), from the TimeBank corpus annotated in TimeML (Pustejovsky et al., 2003). The non-WSJ articles (mainly political and disaster news) include both print and broadcast news that are from ... two peaks in this distribution. One is from 5 to 7 in the natural logarithmic scale, which corresponds to about 1.5 minutes to 30 minutes. The other is from 14 to 17 in the natural l...
Uploaded: 20/02/2014, 12:20
... role in machine translation. The importance of term transliteration can be realized from our analysis of the terms used in 200 qualifying sentences that were randomly selected from English-Chinese ... organized, much invaluable information can be obtained from this large text corpus. Many researchers dealing with natural language processing, machine translation, and informatio...
Uploaded: 17/03/2014, 06:20
Scientific report: "Learning Hierarchical Translation Structure with Linguistic Annotations"
... integrating linguistic information in translation systems. Syntax-based MT often suffers from inadequate constraints in the translation rules extracted, or from striving to combine these rules ... advancing from structures which mimic linguistic syntax, to learning linguistically aware latent recursive structures targeting translation, we achieve significant improvements in tr...
Uploaded: 20/02/2014, 04:20
Scientific report: "Computing Consensus Translation from Multiple Machine Translation Systems Using Enhanced Hypotheses Alignment"
... (EPPS). In our experiments, we used the translations from Span- Table 2: Improved translation results for the consensus translation computed from 5 translation outputs on the Chinese-English ... translations. These translations can be improved if we compute a consensus translation from the output of at least two different speech translation systems. From each system, we...
Uploaded: 22/02/2014, 02:20
Scientific report: "Learning Common Grammar from Multilingual Corpus"
... languages from non-parallel multilingual corpora in an unsupervised fashion. For this purpose, we assume a generative model for multilingual corpora, where each sentence is generated from a language ... borrowing from nearby languages, and 3) the innate abilities of humans (Chomsky, 1965). We assume hidden commonalities in syntax across languages, and try to extract a common grammar fr...
Uploaded: 07/03/2014, 22:20
Scientific report: "Learning Semantic Links from a Corpus of Parallel Temporal and Causal Relations"
... null label is NO-REL. train/test split from Table 1 and the feature sets: Syntactic The syntactic features from Section 4. Semantic The semantic features from Section 4. All Both syntactic and ... 2nd words. POS Pair As 1st Event, but using part of speech tag pairs. POS tags encode tense, so this suggests the performance of a tense-based classifier. The results on our test data are...
Uploaded: 08/03/2014, 01:20
Scientific report: "Learning Semantic Categories from Clickthrough Logs"
... both precision and recall. We cast semantic category acquisition from search logs as the task of learning labeled instances from few labeled seeds. To our knowledge this is the first study that ... different from ours. Another line of new research is to combine various resources such as web documents with search query logs (Paşca and Durme, 2008; Talukdar et al., 2008). We differ...
Uploaded: 08/03/2014, 01:20
Scientific report: "Learning Transliteration Lexicons from the Web"
... from corpora. The EX approach aims to construct a large and up-to-date transliteration lexicon from live corpora. Towards this objective, some have proposed extracting translation pairs from ... transliteration pairs (EX) from corpora. The TM approach models phoneme-based or grapheme-based mapping rules using a generative model that is trained from a large bilingual lexi...
Uploaded: 31/03/2014, 01:20
Scientific report: "Building Emotion Lexicon from Weblog Corpora"
... Blog from January to July, 2006, spanning a period of 212 days. In total, 336,161 bloggers’ articles were collected. Each blogger posts 16 articles on average. We used the articles from ... and emotions using weblog corpora. A collocation model is proposed to learn emotion lexicons from weblog articles. Emotion classification at sentence level is experimented by using the mined...
Uploaded: 17/03/2014, 04:20