Scientific report: "Discourse Segmentation of Multi-Party Conversation" doc
... one of a growing number of corpora with human-to-human multi-party conversations. In this corpus, recordings of meetings ranged primarily over three different recurring meeting types, all of ... error of P_k = 15.79, while the average performance of the algorithm is P_k = 15.31 on the WSJ test corpus (unknown number of segments). mean and the variance of the hypothesized...
Uploaded: 23/03/2014, 19:20
... Automatic Segmentation of Multiparty Dialogue Pei-Yun Hsueh School of Informatics University of Edinburgh Edinburgh, EH8 9LW, GB p.hsueh@ed.ac.uk Johanna D. Moore School of Informatics University of ... shifts to the problem of identifying subtopic boundaries. We then explore the impact on performance of using ASR output as opposed to human transcription. Examination of th...
Uploaded: 24/03/2014, 03:20
... Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pages 428–435, Sydney, July 2006. © 2006 ... Computational Linguistics [Figure: plot with axis labels "entropy" and "offset"; tick values omitted]
Uploaded: 20/02/2014, 12:20
Scientific report: "Thematic segmentation of texts: two methods for two kinds of texts" pdf
... compound nouns in 11 years of the French Le Monde newspaper. They have been collected with the INTEX tool of Silberztein (1994). The part of speech tagger TreeTagger of Schmid (1994) is applied ... which is an indicator of the importance of a term according to its distribution in a text. It is defined by: w_ij = ... log ..., where tf_ij is the number of occurrences of a...
Uploaded: 08/03/2014, 05:21
Scientific report: "Discourse Processing of Dialogues with Multiple Threads" pot
... comparison of the performance of two versions of our discourse processor, one based on strict TST, and one with our extended version of TST, demonstrating that our extension of TST yields ... Implications of this model of Attentional State are explored more fully in (Rosé 1995). 3 Discourse Processing We evaluated the effectiveness of our theory of discourse struc...
Uploaded: 08/03/2014, 07:20
Scientific report: "Unsupervised Segmentation of Words Using Prior Distributions of Morph Length and Frequency" ppt
... numerator of the multinomial is the factorial of the total number of morph tokens, N, which equals the sum of frequencies of every morph type. The denominator is the product of the factorial of the ... [Figure 2 legend: Probabilistic; Recursive MDL; Linguistica; No segmentation] Figure 2: Expectation of the percentage of recognized morphemes for English data. a baseline of no segment...
Uploaded: 31/03/2014, 03:20
Document: Scientific report: "Discourse Obligations in Dialogue Processing" docx
... parts of the discourse context to extend the coverage of a dialogue system. 1 Motivation Most computational models of discourse are based primarily on an analysis of the intentions of the ... complex set of motivations for action. In particular, much of one's behavior arises from a sense of obligation to behave within limits set by the society that the agent is...
Uploaded: 20/02/2014, 21:20
Scientific report: "Towards resolution of bridging descriptions" docx
... Background As part of our research on definite description (DD) interpretation, we asked 3 subjects to classify the uses of DDs in a corpus using a taxonomy related to the proposals of (Hawkins, ... DDs) found a total of 240 relations, distributed over 107 cases of DDs. There were 54 correct resolutions (distributed over 34 DDs) and 186 false positives. Types of bridg...
Uploaded: 08/03/2014, 21:20
Scientific report: "Automatic Creation of Domain Templates" doc
... we obtained TDT document clusters for 2 instances of airplane crashes, 3 instances of earthquakes, 6 instances of presidential elections and 3 instances of terrorist attacks. The number of the documents corresponding ... (from two documents for one of the earthquakes up to 156 documents for one of the terrorist attacks). This variation in the number of documents per topic is t...
Uploaded: 17/03/2014, 04:20
Scientific report: "Semantic Transliteration of Personal Names" docx
... decision of gender had led to deterioration in MRR performance of the male names compared to the case where no prior information was assumed. Soft decision of gender yielded further gains of 17.1% ... the C-C corpus, out of the total of 4,507 characters, only 776 of them are for surnames. It is interesting to find that female given names are represented by a smaller set...
Uploaded: 17/03/2014, 04:20