Scientific report: "Topic-Focused Multi-document Summarization Using an Approximate Oracle Score" doc
... Linguistics Topic-Focused Multi-document Summarization Using an Approximate Oracle Score John M. Conroy, Judith D. Schlesinger IDA Center for Computing Sciences Bowie, Maryland, USA conroy@super.org, ... multi-document summary given a collection of documents. Most automatic methods of multi-document summarization are largely extractive. This mimics the behavior of humans f...
Uploaded: 08/03/2014, 02:21
... Computational Linguistics Comparative News Summarization Using Linear Programming Xiaojiang Huang Xiaojun Wan ∗ Jianguo Xiao Institute of Computer Science and Technology, Peking University, Beijing ... (Peking University), MOE, China {huangxiaojiang, wanxiaojun, xiaojianguo}@icst.pku.edu.cn Abstract Comparative News Summarization aims to highlight the commonalities and differences betwe...
Uploaded: 07/03/2014, 22:20
... importance of this task. One can cast answer-finding as a traditional document retrieval problem by considering each candidate answer as an isolated document and ranking each candidate answer ... containing an answer to a question is rather stricter than mere relevance. Put another way, only a small number of documents actually contain the answer to a given query, while every docum...
Uploaded: 23/03/2014, 19:20
Scientific report: Protein database searches using compositionally adjusted substitution matrices docx
... the analysis of such proteins, we have previously described a rationale and an efficient algorithm, improved here, for transforming a standard matrix into one appropriate for any specified nonstandard compositional ... only substantial E-value changes, of greater than a factor of 10, i.e., score changes greater than 3.3 bits, the case by case advantage of mode D is vitiated. We therefore pr...
Uploaded: 07/03/2014, 21:20
Scientific report: "Named Entity Recognition using an HMM-based Chunk Tagger" pptx
... F-measures of 96.6% and 94.1% respectively. It shows that the performance is significantly better than reported by any other machine-learning system. Moreover, the performance is even consistently ... ContainsDigitAndAlpha and ContainsDigitAndDash, the former will take precedence. The first eleven features arise from the need to distinguish and annotate monetary amounts, percentages...
Uploaded: 17/03/2014, 08:20
Scientific report: "Constraint-based Sentence Compression An Integer Programming Approach" docx
... summarisation (Jing 2000), subtitle generation from spoken transcripts (Vandeghinste and Pan 2004) and information retrieval (Olivers and Dolan 1999). Sentence compression is a complex paraphrasing ... standard, D: Decision-tree, LM: IP language model, Sig: IP language model with significance score) Model CompR Rating Decision-tree 56.1% 2.22 ∗† LangModel 49.0% 2.23 ∗† LangModel+Significanc...
Uploaded: 08/03/2014, 02:21
Scientific report: "Stochastic Language Generation Using WIDL-expressions and its Application in Machine Translation and Summarization" pot
... this algorithm, using an average of tiles per sentence (for an average input sentence length of 30 words) and an average of possible translations per tile, encodes a candidate set of about 10 possible translations. ... large set of candidate realizations, and, in a second phase, statistical knowledge about the target language (such as stochastic language models) to rank the candidat...
Uploaded: 23/03/2014, 18:20
Scientific report document: "Grammar Error Correction Using Pseudo-Error Sentences and Domain Adaptation" pdf
... errors, and the CRF models are trained using the alignment results as supervised data. 2.2 Insertion / Deletion Since an insertion can be regarded as replacing an empty word with an actual word, and ... statistical machine translation (PBSMT), but there are three differences; 1) it adopts the conditional random fields, 2) it allows insertion and deletion, and 3) binary and real fea...
Uploaded: 19/02/2014, 19:20
Scientific report document: "Identifying Text Polarity Using Random Walks" pptx
... Polarity Using Random Walks Ahmed Hassan University of Michigan Ann Arbor Ann Arbor, Michigan, USA hassanam@umich.edu Dragomir Radev University of Michigan Ann Arbor Ann Arbor, Michigan, USA radev@umich.edu Abstract Automatically ... product is very important for marketing and customer relation management (Morinaga et al., 2002). Manually handling reviews to identify reputation is a ver...
Uploaded: 20/02/2014, 04:20
Scientific report document: "Automatic Headline Generation using Character Cross-Correlation" doc
... to handle huge amount of documents, which is a tedious and time-consuming process. Instead of reading every document, the headline can be used to decide which of them contains important information. ... extractive and abstractive. In the work of (Douzidia and Lapalme, 2004), an extractive method was used to produce a 10-words summary (which can be considered as a headline) of an...
Uploaded: 20/02/2014, 05:20