Scientific report: "Topic Analysis for Psychiatric Document Retrieval" potx
... 1024–1031, Prague, Czech Republic, June 2007. © 2007 Association for Computational Linguistics Topic Analysis for Psychiatric Document Retrieval Liang-Chih Yu*‡, Chung-Hsien Wu*, Chin-Yew ... relevant documents. This work proposes the use of high-level topic information extracted from consultation documents to improve the precision of retrieval results. The topic informat...
Uploaded: 23/03/2014, 18:20
... unseen for a given domain s, we are already performing an implicit form of smoothing (when computing the expected counts), since each document has a distribution over all topics, and therefore we ... consistent within a document. Results Results for both settings are shown in Table 2. GTM models the latent topics at the document level, while LTM models each sentence as a separate...
Uploaded: 19/02/2014, 19:20
... initially to be uniform 2. for several iterations do: a. set up a count( , ) table with zero entries b. for = 1 to Q[ ,1] = b( boundary) c. for = 2 to for = 1 to Q[ , ] = 0 for = 1 to Q[ , ] += ... 1 to Q[ , ] += Q[ , ] b( ) s( ) d. for = 1 to R[ , ] = b(boundary ) e. for = to 1 for = 1 to R[ , ] = 0 for = 1 to R[ , ] += R[ , ] b( ) s( ) f. for = 1 to for = 1 to count( ,...
Uploaded: 08/03/2014, 02:21
Scientific report: "Topic Models for Word Sense Disambiguation and Token-based Idiom Detection" pdf
... detailed information may not be available, for instance for languages for which such a resource does not exist or for expressions that are not very well covered in WordNet, such as idioms. For those ... value of two topic-document vectors (one for the sense and one for the context). We apply these models to coarse- and fine-grained WSD and find that they outperform comparabl...
Uploaded: 23/03/2014, 16:20
Scientific report: "Text Analysis for Automatic Image Annotation" doc
... 2007. © 2007 Association for Computational Linguistics Text Analysis for Automatic Image Annotation Koen Deschacht and Marie-Francine Moens Interdisciplinary Centre for Law & IT Department ... bulk of unstructured information such as text, images and video, a situation witnessed in many domains (news, biomedical information, intelligence information, business documents, etc.)....
Uploaded: 23/03/2014, 18:20
Scientific report: "Resource Analysis for Question Answering" doc
... [Figure 1: Web retrieval: relevant document density and rank of first relevant document. Plot title: Web Retrieval Performance For QA; axes: document rank, # questions; series: Correct Doc Density, First Correct Doc.] In order ... the density of documents containing a correct answer, as well as the rank of the first document containing a correct answer. The simple wo...
Uploaded: 31/03/2014, 03:20
Scientific report: "Transductive learning for statistical machine translation" potx
... over all possible valid translations t for a particular input sentence s. We can initialize this probability distribution to the uniform distribution for each sentence s in the unlabeled data ... Algorithm 1 we run it for a fixed number of iterations and instead focus on finding useful definitions for Estimate, Score and Select that can be experimentally shown to improve MT performa...
Uploaded: 08/03/2014, 02:21
Scientific report: "Pronunciation Modeling for Improved Spelling Correction" potx
... presents a method for incorporating word pronunciation information in a noisy channel model for spelling correction. The proposed method builds an explicit error model for word pronunciations. ... correspond to nothing (i.e., are silent). For example, the entry in NETtalk (when we remove the empties, which contain information for phone level alignment) for the word able is A...
Uploaded: 17/03/2014, 08:20
Scientific report: "Extractive Summaries for Educational Science Content" potx
... granularity and therefore graphs that may not be suitable for presentation to a student. Multi-document summarization (MDS) research also informs our work. XDoX analyzes large document sets to ... basis for the construction of knowledge maps useful both as computational knowledge representations and as learning resources for presentation to the student. 2 Related Work Ou...
Uploaded: 23/03/2014, 17:20
Scientific report: "Discourse Cues for Broadcast News Segmentation" potx
... Figure 1) or MS-NBC. For example, the structure for the Jim Lehrer News Hour provides not only segmentation information but also content information for each segment. Thus, the order of stories ... content-based information browsing, retrieval, extraction, and summarization to ensure their value for tasks such as real-time profiling and retrospective search. Whereas image proc...
Uploaded: 23/03/2014, 19:20