Scientific report: "Exploiting Bilingual Information to Improve Web Search" pdf
... is helpful. 6 Conclusions and Future Work We aim to improve web search ranking for an important set of queries, called bilingual queries, by exploiting bilingual information derived from clickthrough logs ... unique to web search; it appears in a wide range of natural language processing problems. Much recent work on bilingual data has focused on exploiting these variati...
Uploaded: 30/03/2014, 23:20
... novel topics. We utilize TH_{t,*}, the row vector of the topic-hidden matrix TH for a topic t, as a feature set. In brief, we apply LDA to extract the topic-hidden vector TH_{t,*} to model topic ... to determine the size of hidden topic categories. TH indicates the distribution of each topic to hidden topic categories, and HW indicates the distribution of each lexical term to hi...
Uploaded: 30/03/2014, 17:20
... Topic t Words t ∀t ∈ T Topic None → ## Topic t → T t Topic None ∀t ∈ T Topic t → T None Topic t ∀t ∈ T T t → t Topical c 1 ∀t ∈ T Topical c i → (c i ) Topical c i+1 i = 1, . . . , − 1 Topical c → ... topic. Sentence Topic.pig T.None .dog NotTopical.child.eyes NotTopical.child.hands NotTopical.mom.eyes NotTopical.mom.hands NotTopical.mom.point # Topic.pig T.pig .pig Topical.child.eyes...
Uploaded: 19/02/2014, 19:20
Scientific report: "Using Similarity Scoring To Improve the Bilingual Dictionary for Word Alignment" doc
... 15213 ralf@cs.cmu.edu Abstract We describe an approach to improve the bilingual cooccurrence dictionary that is used for word alignment, and evaluate the improved dictionary using a version of the Competitive ... also other applications such as cross-lingual information retrieval. Since it is a hard and time-consuming task to hand-align bilingual data, the automation of this task...
Uploaded: 08/03/2014, 07:20
Scientific report: "From Bilingual Dictionaries to Interlingual Document Representations" Raghavendra Udupa Micros pptx
... x_i^{(t)} is aligned to a target document y_j^{(t)}. We want each document to align to at least one document from other language. Moreover, we want to encourage similar documents to align to each other. ... USA. ACM. Wei Gao, John Blitzer, Ming Zhou, and Kam-Fai Wong. 2009. Exploiting bilingual information to improve web search. In Proceedings of Human Language Technolo...
Uploaded: 17/03/2014, 00:20
Scientific report: "Using Mutual Information to Resolve Query Translation Ambiguities and Query Term Weighting" doc
... mutual information is used not only to select the best candidate but also to assign a weight to query terms in the target language. 1 Overall Query Translation Process Our Korean-to-English ... decided to use a bilingual dictionary at the second stage and a target-language corpus for the third and the fourth stages. Our strategy was to try not to depend on scarce...
Uploaded: 17/03/2014, 07:20
Scientific report: "Using bilingual dependencies to align words in English/French parallel corpora" ppt
... Linguistics Using bilingual dependencies to align words in English/French parallel corpora Sylwia Ozdowska ERSS - CNRS & Université de Toulouse le Mirail 5 allées Antonio Machado 31058 Toulouse ... and fall to the bottom. Les oeufs de très petite taille tombent sur le fond. (5) X is a model which was designed to stimulate… X est un modèle qui a été conçu pour stimul...
Uploaded: 23/03/2014, 19:20
Scientific report: "Using Bilingual Information for Cross-Language Document Summarization" pptx
... one-side information for sentence ranking, which is not very reliable. In order to leverage both-side information for sentence ranking, we propose the following two methods to incorporate the bilingual ... M^{encn} = (M^{encn})^T. Then M^{encn} is normalized to \tilde{M}^{encn} to make the sum of each row equal to 1. We use two column vectors u = [u(s_i^{cn})]_{n×1} and v = [v(s^{en...
Uploaded: 30/03/2014, 21:20
Document: Scientific report: "An Unsupervised Approach to Recognizing Discourse Relations" pdf
... word pairs that are not likely to be good predictors of discourse relations. To test this hypothesis, we decided to carry out a second experiment that used as predictors only a subset of the word ... of lexical items are likely to co-occur in conjunction with each discourse relation and a means to apply the learned parameters to any pair of text spans in order to determine...
Uploaded: 20/02/2014, 21:20
Scientific report: "Using Anaphora Resolution to Improve Opinion Target Identification in Movie Reviews" docx
... in comparison to documents from other domains: Turney (2002) observes that the movie reviews are hardest to classify since the review authors tend to give information about the storyline of the ... are about. This aboutness has been referred to as the opinion target or opinion topic in the literature from the field. In this work our goal is to extract opinion target - opinion wor...
Uploaded: 17/03/2014, 00:20