... extends DOP1 to unsupervised parsing (Bod 2006). Its key idea is to assign all unlabeled binary trees to a set of sentences and then to use (in principle) all subtrees from these binary trees to parse ... but since ML estimation is known to be very sensitive to the initialization of the parameters, it is convenient to start with parameters that are known to perform well. To...
Uploaded: 20/02/2014, 12:20
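The excerpt above mentions assigning all unlabeled binary trees to a sentence. The following is a minimal sketch of that enumeration only, assuming trees are represented as nested tuples over the words; the function name `binary_trees` is hypothetical and is not taken from the paper, and the subtree extraction and parsing steps are not shown.

```python
from functools import lru_cache


def binary_trees(words):
    """Return every unlabeled binary bracketing of `words` as nested tuples."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def build(i, j):
        # All binary trees spanning words[i:j].
        if j - i == 1:
            return (words[i],)
        trees = []
        for k in range(i + 1, j):  # k is the split point between left and right child
            for left in build(i, k):
                for right in build(k, j):
                    trees.append((left, right))
        return tuple(trees)

    return list(build(0, len(words)))
```

The number of trees grows with the Catalan numbers (2 trees for three words, 5 for four), which is why the excerpt qualifies the use of all subtrees with "in principle".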
... as tailored to the financial domain, and some forward-looking extensions to the approach that enable users to specify classifications on the fly. 1 Introduction Our goal is to support the ... layoff. A second concern is thus to enable end users to interpret facts and events through automated context assessment. The route we have taken towards this end is to model the doma...
Uploaded: 08/03/2014, 21:20
Scientific report: "An Ontology–Based Approach for Key Phrase Extraction" docx
... all redirects that link to the article. A LINK. Each page consists of many links which function not only to point from the page to others, but also to guide readers to pages that provide ... of C2 then C1 is the most specific category. 5. to traverse the ViO ontology from C1 & C2 to find the nearest common ancestor node (C’). Calculate the distance between C...
Uploaded: 08/03/2014, 01:20
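Step 5 in the excerpt above traverses the ontology from two categories to their nearest common ancestor. The sketch below illustrates that traversal under stated assumptions: the ontology is a tree stored as a child-to-parent dictionary, and the category names in the usage note are invented for illustration and are not from the ViO ontology. The excerpt is cut off before the distance formula, so the code only returns the raw edge counts from each category to the ancestor.

```python
def ancestors_with_depth(parent, node):
    """Map `node` and each of its ancestors to the edge distance from `node`."""
    dist = {}
    steps = 0
    while node is not None and node not in dist:
        dist[node] = steps
        node = parent.get(node)  # None once we step past the root
        steps += 1
    return dist


def nearest_common_ancestor(parent, c1, c2):
    """Return (ancestor, edges from c1, edges from c2), or None if there is none."""
    up1 = ancestors_with_depth(parent, c1)
    node, steps = c2, 0
    seen = set()
    while node is not None and node not in seen:
        if node in up1:
            return node, up1[node], steps
        seen.add(node)
        node = parent.get(node)
        steps += 1
    return None
```

With a toy hierarchy such as `{"Dog": "Mammal", "Cat": "Mammal", "Mammal": "Animal", "Bird": "Animal"}`, calling `nearest_common_ancestor(parent, "Dog", "Bird")` walks up from both categories and meets at `"Animal"`.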
Document: Scientific report: "A Ranking-based Approach to Word Reordering for Statistical Machine Translation" doc
... improves the performance for both English-to-Japanese and Japanese-to-English experiments over the BTG baseline system. It also outperforms the manual rule set on English-to-Japanese result, but the ... for German-English, Chinese-English, English-Hindi and English-Japanese respectively. Xu et al. (2009) designed a clever precedence reordering rule set for translation from...
Uploaded: 19/02/2014, 19:20
Document: Scientific report: "An Equivalent Pseudoword Solution to Chinese Word Sense Disambiguation" ppt
... EPs. A Chinese thesaurus is adopted and revised to meet this demand. Extended Version of TongYiCiCiLin To extend the TongYiCiCiLin (Cilin) to hold more words, several linguistic resources ... word, which is called an atom word group, an atom class or an atom node. The words in the same atom node hold the smallest semantic distance. From the root node to the leaf node, the...
Uploaded: 20/02/2014, 12:20
Document: Scientific report: "A Fully Bayesian Approach to Unsupervised Part-of-Speech Tagging" docx
... differences hold to a lesser degree when a partial dictionary is provided. With MLHMM, different tokens of the same word type are usually assigned to the same cluster, but types are assigned to clusters ... We would also like to thank Noah Smith for providing us with his data sets. Eisner, 2005). Nearly all of these approaches have one aspect in common: the goal of learning is to id...
Uploaded: 20/02/2014, 12:20
Document: Scientific report: "A Feature Based Approach to Leveraging Context for Classifying Newsgroup Style Discussion Segments" pptx
... state of a simple finite-state automaton that only has two states. The automaton is set to initial state (q0) at the top of a message. It makes a transition to state (q1) when it encounters ... is to enable the quality and nature of discussions that occur within an on-line discussion board to be communicated in a summary to a potential newcomer or group moderators. We p...
Uploaded: 20/02/2014, 12:20
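The excerpt above describes a two-state automaton that starts in q0 at the top of a message and moves to q1 when it encounters something the excerpt elides. The sketch below shows one way such a state could be tracked per line. The transition condition is unknown, so it is passed in as a hypothetical predicate `is_trigger`; the function name `track_states` and the assumption that the automaton stays in q1 once it gets there are also illustrative and not taken from the paper.

```python
def track_states(lines, is_trigger):
    """Return the automaton state ("q0" or "q1") in effect for each line of one message."""
    state = "q0"  # initial state at the top of the message
    states = []
    for line in lines:
        # Assumption: the only transition is q0 -> q1, fired by the caller's predicate.
        if state == "q0" and is_trigger(line):
            state = "q1"
        states.append(state)
    return states
```

Calling `track_states` once per message gives the reset to q0 described in the excerpt, and the resulting state could then be supplied as a feature for each line or segment.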
Document: Scientific report: "Probing the lexicon in evaluating commercial MT systems Martin" pot
... we used the lexicon evaluation to check for agreement within the noun phrase. Translating from English to German, the MT system has to get the gender of the German noun from the lexicon since ... evaluated the output. These steps had to be done for both translation directions (German to English and vice versa), but here we concentrate on English to German. 2.1 Pr...
Uploaded: 22/02/2014, 03:20
Scientific report: "A Nonparametric Bayesian Approach to Acoustic Model Discovery" docx
... the future, we plan to explore phonological context and use more flexible topological structures to model acoustic units within our framework. Acknowledgements The authors would like to thank Hung-an ... R^39 to denote the t-th feature frame of the i-th utterance. Fig. 1 illustrates how the speech signal of a single word utterance banana is converted to a sequence of feature vectors...
Uploaded: 07/03/2014, 18:20
Scientific report: "A Two-step Approach to Sentence Compression of Spoken Utterances" pdf
... first step, 8 annotators were asked to select words to be removed to compress the sentences. In the second step, 6 annotators (different from the first step) were asked to pick the best one ... propose to use a two-step approach in this paper for sentence compression of spontaneous speech utterances. The contributions of our work are: • Our proposed two-step approach allows...
Uploaded: 07/03/2014, 18:20