... experimentally tested using a manually labeled set of positive and negative words. It outperforms the state-of-the-art methods in the semi-supervised setting. The results in the unsupervised setting are ... word. The method could be used both in a semi-supervised setting where a training set of labeled words is used, and in an unsupervised setting where a handful of seeds is used to...
Uploaded: 20/02/2014, 04:20
... 5–8, Columbus, June 2008. © 2008 Association for Computational Linguistics. Generating research websites using summarisation techniques. Advaith Siddharthan & Ann Copestake, Natural Language and Information ... stylesheets are often considered inappropriate for diverse organisations. Research summary pages using stylesheets can offer alternative methods of information access and browsin...
Uploaded: 20/02/2014, 09:20
Document: Scientific report: "Unsupervised Relation Disambiguation Using Spectral Clustering" ppt
... Dataset. (b) The clustering result using K-means; (c) Three elongated clusters in the 2D clustering space using spectral clustering: two dominant eigenvectors; (d) The clustering result using ... three-circle dataset using K-means and spectral-based clustering. From Figure 1(b), we can see that K-means cannot separate the non-convex clusters in the three-circle dataset successfully...
Uploaded: 20/02/2014, 12:20
Document: Scientific report: "Tag set Reduction Without Information Loss" doc
... constraint on the clustering of tags. Let W be the set of words, C the set of clusters (i.e. the reduced tagset), and T the original tagset. To restore the original tag from a combined ... and tagging was done with the reduced tagset. The reduced tagset was only used internally; the output of the tagger consisted of the original tagset for all experiments. The Susan...
Uploaded: 20/02/2014, 22:20
Scientific report: "Comparative News Summarization Using Linear Programming" ppt
... retrieve ten related news articles for each topic using the Google News search engine. Finally, we write the comparative summary for each topic pair manually. The topics are shown in Table 1. 4.2 Evaluation ... differences between two comparable news topics by using human-readable sentences. The summarization system is given two collections of news articles, each of which is related...
Uploaded: 07/03/2014, 22:20
Scientific report: "AN ENVIRONMENT FOR ACQUIRING SEMANTIC INFORMATION" pptx
... design and transportability of IRUS. Finally, a fifth kind of knowledge is a set of domain plans. Though no extensive set of such plans has been developed yet, there is growing agreement that ... relations. Using such a representation limits the kinds and number of questions that have to be asked of the user by the semantic acquisition component. The representation dovetails we...
Uploaded: 08/03/2014, 18:20
Scientific report: "Automatic Set Instance Extraction using the Web" pptx
... This leads to sets of instances that are noisy; however, we will show that set expansion and re-ranking can improve the initial sets dramatically. Below, we will refer to the initial set of noisy ... name of the semantic set and a seed instance. Paşca (Paşca, 2007b; Paşca, 2007a) illustrated a set expansion approach that extracts instances from Web search queries given a set o...
Uploaded: 08/03/2014, 00:20
Document: Scientific report: "Grammar Error Correction Using Pseudo-Error Sentences and Domain Adaptation" pdf
... and method. • TRG: The models were trained using only the real-error corpus (baseline). • SRC: Trained using only the pseudo-error corpus. • ALL: Trained using the real-error and pseudo-error corpora ... task is hindered by the difficulty of collecting large error corpora. We tackle this problem by using pseudo-error sentences generated automatically. Furthermore, we apply domain...
Uploaded: 19/02/2014, 19:20
Document: Scientific report: "Automatic Headline Generation using Character Cross-Correlation" doc
... appropriate set of consecutive words (phrase) from a document body that should represent an adequate headline for the document. Then, evaluate those headlines by calculating ROUGE score against a set ... “and he wrote it” is compared with the word “ﺐﺘﻛ” “he wrote” using the EWM method, the resulting score will be 0, but when using the CCC method it will be 0.667. The CCC me...
Uploaded: 20/02/2014, 05:20
Document: Scientific report: "Japanese Dependency Parsing Using Co-occurrence Information and a Combination of Case Elements" pdf
... the set of case elements that modify the i-th verb in T; rs_i(T): the set of particles in the set of case elements that modify the i-th verb in T; ns_i(T): the set of nouns in the set of ... element set (bunsetsu set) and the co-occurrence probability of the particle set, and we select the T̂ that maximizes the probability. Some of the co-occurrence probabilities of the...
Uploaded: 20/02/2014, 12:20