Scientific report: "Text Summarization Evaluation" doc
... allow an analyst to quickly and correctly categorize a document. Here the topic was not known to the summarization system. Given a document, which could be a generic summary or a full-text ... topic, a 50-document subset was created from the top 200 ranked documents retrieved by a standard IR system. For the categorization task, only 10 topics were selected, with 100 documents us...
Uploaded: 31/03/2014, 21:20
... of summarization tasks: single-document summarization and multi-document summarization. While single-document summarization is to generate a summary from a single document, multi-document summarization is ... multiple documents regarding one topic. Such a set of multiple documents is called a document cluster. The method proposed in this paper is applicable to both tasks. In both...
Uploaded: 17/03/2014, 22:20
Uploaded: 18/02/2014, 13:20
Document: Scientific report: "Text Alignment in a Tool for Translating Revised Documents" docx
... that a new document needs to be translated and there exist a collection of bilingual documents in the same domain. It would be interesting to see how many sentences of the new document can ... "re-translation" problem. It occurs when a new version of a previously translated document needs to be translated. The tool identifies the changes between the two versions of the ....
Uploaded: 22/02/2014, 10:20
Scientific report: "Topic-Focused Multi-document Summarization Using an Approximate Oracle Score" doc
... documents. Most automatic methods of multi-document summarization are largely extractive. This mimics the behavior of humans for single document summarization; (Kupiec, Pedersen, and Chen ... performance for the 2005 Document Understanding Conference (DUC) evaluation. 1 Introduction We consider the problem of producing a multi-document summary given a collection of documen...
Uploaded: 08/03/2014, 02:21
Scientific report: "On the Evaluation and Comparison of Taggers: the Effect of Noise in Testing Corpora." doc
Uploaded: 08/03/2014, 05:21
Scientific report: "Contents and evaluation of the first Slovenian-German online dictionary" doc
Uploaded: 17/03/2014, 22:20
Scientific report: "Text Analysis for Automatic Image Annotation" doc
... (term frequency) x idf (inverse document frequency) weight of the words (Amir et al., 2005). In exceptional cases, the hierarchical XML structure of a text document (which was manually annotated) ... probability of appearing in the image than less important persons. Because of the short lengths of the documents in our corpus, an analysis of lexical cohesion between terms in the text...
Uploaded: 23/03/2014, 18:20
Scientific report: "An Integrated Multi-document Summarization Approach based on Word Hierarchical Representation" pot
... introduces a novel hierarchical summarization approach for automatic multi-document summarization. By creating a hierarchical representation of the words in the input document set, the proposed ... objectives of multi-document summarization through an integrated framework. The evaluation is conducted on the DUC 2007 data set. 1 Introduction and Background Multi-document summariz...
Uploaded: 31/03/2014, 00:20
Scientific report: "Text Segmentation Using Reiteration and Collocation" docx
... Lindsay J. Evett Department of Computing Nottingham Trent University Nottingham NG1 4BU, UK lje@doc.ntu.ac.uk Abstract A method is presented for segmenting text into subtopic areas. The proportion ... Each segment could be summarised individually and then combined to provide an abstract for a document. Previous work on text segmentation has used term matching to identify cluster...
Uploaded: 31/03/2014, 04:20