Scientific report: "A Generative Entity-Mention Model for Linking Entities with Knowledge Base" doc
... new generative model, the entity-mention model, which can leverage heterogenous entity knowledge (including popularity knowledge, name knowledge and context knowledge) for the entity linking ... 2 The Generative Entity-Mention Model for Entity Linking In this section we describe the generative entity-mention model. We first describe the generative sto...
Uploaded: 23/03/2014, 16:20
... hierarchical model and regression model to score sentences in new documents, eliminating the need for building a generative model for new document clusters. 3 Summary-Focused Hierarchical Model Our ... two step learning problem building a generative model for pattern discovery and a regression model for inference. We calculate scores for sentences in document c...
Uploaded: 20/02/2014, 04:20
... retrieval model, and further assumed that all retrieved documents contained relevant opinions. (2) Doc: The 2-stage document-based opinion retrieval model was adopted. The model used sentiment ... only 8 relevant documents without any opinion and 14 documents with relevant opinions. As a result, the graph constructed by insufficient documents worked ineffectively. Except f...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "A Joint Statistical Model for Simultaneous Word Spacing and Spelling Error Correction for Korean" pdf
... 61–64, Prague, June 2007. © 2007 Association for Computational Linguistics A Joint Statistical Model for Simultaneous Word Spacing and Spelling Error Correction for Korean Hyungjong Noh* Jeong-Won ... errors. Likewise, algorithms for solving spelling error problem cannot work well with word spacing errors. To cope with the limitation, there is an algorithm proposed fo...
Uploaded: 20/02/2014, 12:20
Document: Scientific report: "A Phonotactic Language Model for Spoken Language Identification" pptx
... June 2005. © 2005 Association for Computational Linguistics A Phonotactic Language Model for Spoken Language Identification Haizhou Li and Bin Ma Institute for Infocomm Research Singapore ... spoken languages. In addition, spoken documents, in the form of digitized wave files, are far less structured than written documents and need to be treated with techniques that go be...
Uploaded: 20/02/2014, 15:20
Document: Scientific report: "A Localized Prediction Model for Statistical Machine Translation" ppt
... blocks for for which . 4 Online Training of Maximum-entropy Model The local model described in Section 3 leads to the following abstract maximum entropy training formulation: (8) In this formulation, ... Training Results We compare model performance with respect to the number and type of features used as well as with respect to different re-ordering models. Results for e...
Uploaded: 20/02/2014, 15:20
Document: Scientific report: "A SPEECH-FIRST MODEL FOR REPAIR DETECTION AND CORRECTION" docx
... cues for repair processing. Discussion In this paper, we have presented a "speech-first" model, the Repair Interval Model, for studying repairs in spontaneous speech. This model ... problem of fragment identification. Rather, models for fragment identification might make use of initial phoneme distributions, in combination with information on fragment length a...
Uploaded: 20/02/2014, 21:20
Scientific report: "A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging" pdf
... all features, PER: perceptron model, WLM: word language model, PLM: POS language model, GPR: generating model, LPR: labelling model, LEN: word count penalty. LM with Witten-Bell smoothing, and ... the cascaded model has a two-layer architecture, with a character-based perceptron as the core combined with other real-valued features such as language models. We ... Core Linear M...
Uploaded: 08/03/2014, 01:20
Scientific report: "A Unified Statistical Model for the Identification of English BaseNP" pptx
... calculation formulas are similar with equations (13) and (14) respectively. Before training trigram model (3), all possible baseNP rules should be extracted from the training corpus. For instance, ... of "stock was down 9.1 points yesterday morning" Figure 3: the transformed form of the path with dash line for the second pass processing 2.4 The statistical parameter traini...
Uploaded: 08/03/2014, 05:20
Scientific report: "A Noisy-Channel Model for Document Compression" pptx
... on longer documents. Unfortunately, the forests generated even for relatively small documents are huge. Because there are an exponential number of summaries that can be generated for any given ... memory for longer documents; therefore, we selected shorter subtexts from the original documents. We used both the WSJ and Mitre data for evaluation because we wanted to see whether th...
Uploaded: 08/03/2014, 07:20