Scientific report: "Automatically Mining Question Reformulation Patterns from Search Log Data" pdf
... concept, we propose a question reformulation framework. First, we mine the question reformulation patterns from search logs that record users’ reformulation behavior. Second, given a new question, we ... challenge to search systems. In this paper, we automatically mined 5w1h question reformulation patterns from large scale search log data. The question refo...
Uploaded: 07/03/2014, 18:20
... annotate spoken dialogues. A major challenge to address is in reducing the dimensionality of the many features available to the learners. The motivation for our research arises from the need ... Tutoring SPOKEn dialogue System) project (Litman and Silliman, 2004). Ongoing research in ITSPOKE aims to recognize emotional states of students in order to build a spoken dialogue tutoring...
Uploaded: 20/02/2014, 16:20
... comparable corpora from the Romanian translations of the European Union’s acquis communautaire which we mined from the Web, and has about 10M English words. We downloaded comparable data from three online ... boxes from the figure show, some parallel fragments of data do exist; but they are present at the sub-sentential level. In this paper, we present a method for extracting such pa...
Uploaded: 08/03/2014, 02:21
Scientific report: "Predicting Strong Associations on the Basis of Corpus Data" pdf
... the logarithm of rank+1, in order to avoid division by zero.
Columns: systems; med, mean, rank1 (mean); med, mean, rank1 (harmonic mean)
loglik 10 (baseline): 3 16.6 194
loglik 10 + doc: 2 13.1 223 3 13.4 211
loglik 10 + ... word 10: 3 13.8 182 3 14.2 187
loglik 10 + doc + syn: 3 14.4 179 4 14.7 184
loglik 10 + doc + comp: 2 11.8 249 2 12.2 221
Table 4: Results of ensemble methods. loglik 10 = log-likelihood r...
Uploaded: 08/03/2014, 21:20
Scientific report: "Word Clustering and Disambiguation Based on Co-occurrence Data" pdf
... the above as $\delta L_{dat} = -\sum_{C_v \in T_v} f(C_{ij}, C_v) \log \frac{P(C_{ij}, C_v)}{P(C_{ij}) \cdot P(C_v)} + \sum_{C_v \in T_v} f(C_i, C_v) \log \frac{P(C_i, C_v)}{P(C_i) \cdot P(C_v)} + \sum_{C_v \in T_v} f(C_j, C_v) \log \frac{P(C_j, C_v)}{P(C_j) \cdot P(C_v)}$. Thus, the quantity $\delta L_{dat}$ ... $L(S|M) = -\sum_{(n,v) \in S} \log \hat{P}(n, v)$, where $\hat{P}$ stands for the maximum likelihood estimate of $P$ (as defined in Section 3). We then calculate the model description length as k L(M)...
Uploaded: 17/03/2014, 07:20
Document: Scientific report: "Automatically Generating Term-frequency-induced Taxonomies" doc
... Mohania IBM Research - India {karinmur|ftanveer|lvsubram|hkaranam|mkmukesh}@in.ibm.com Abstract We propose a novel method to automatically acquire a term-frequency-based taxonomy from a corpus ... use lexical-syntactic formulation to define patterns, either manually (Kozareva et al., 2008) or automatically (Girju et al., 2006), and apply those patterns to mine instances of the pat-...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Automatically Extracting Polarity-Bearing Topics for Cross-Domain Sentiment Classification" pptx
... classification accuracy. Manually examining the extracted polarity topics from JST reveals that when the topic number is small, each topic cluster contains well-mixed words from different domains. However, ... sentiment-topic (JST) model (?; ?) was extended from the latent Dirichlet allocation (LDA) model (?) to detect sentiment and topic simultaneously from text. The only superv...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "The QuALiM Question Answering Demo: Supplementing Answers with Paragraphs drawn from Wikipedia" ppt
... the QuALiM Question Answering system. While the system actually gets answers from the web by querying major search engines, during presentation answers are supplemented with relevant passages from ... improves a user’s search experience. 1 Introduction This paper describes the online demo of the QuALiM Question Answering system (http://demos.inf.ed.ac.uk:8080/qualim/). We wi...
Uploaded: 20/02/2014, 09:20
Scientific report: "Automatic Compilation of Travel Information from Automatically Identified Travel Blogs" doc
... 'www.travelblog.org' and 'travel.blogmura.com' are portal sites for travel blogs. At these sites, travel blogs are manually registered by bloggers themselves, and the blogs are ... blogs, which are defined as travel journals written by bloggers in diary form. Travel blogs are considered a useful information source for obtaining travel information, because many bl...
Uploaded: 08/03/2014, 01:20
Scientific report: "Opinion Mining Using Econometrics: A Case Study on Reputation Systems" pdf
... in this paper are available from http://economining.stern.nyu.edu. There are many other applications beyond reputation systems. For example, using sales rank data from Amazon.com, we can examine ... the semantic orientation and the strength of an evaluation from the changes in the observed economic variable. Following this idea, we use techniques from econometrics to identify the...
Uploaded: 08/03/2014, 02:21