Scientific report: "Recognizing Textual Parallelisms with edit distance and similarity degree" docx
... Recognizing Textual Parallelisms with edit distance and similarity degree Marie Guégan and Nicolas Hernandez LIMSI-CNRS Université de Paris-Sud, France guegan@aist.enst.fr | hernandez@limsi.fr Abstract Detection ... it: a similarity degree measure, a string editing distance (Wagner and Fischer, 1974) and a tree editing distance (Zhang and Shasha, 1989). Secti...
Uploaded: 24/03/2014, 03:20
... the edit distance algorithm. A and B are two word patterns; M is the matrix in which the edit distance is calculated, and D is the matrix indicating the choice that produced the minimal distance ... starts. 2.3 Edit distance calculation So as to calculate the similarity between two patterns, a slightly modified version of the dynamic programming algorithm for edit-dist...
Uploaded: 31/03/2014, 01:20
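The snippet above describes a dynamic-programming edit distance over two word patterns A and B, with a distance matrix M and a matrix D recording the choice that produced each minimum. A minimal sketch of the standard Wagner-Fischer algorithm in that shape follows. It assumes unit costs for insertion, deletion, and substitution, and it does not reproduce the paper's "slightly modified version", whose details are truncated in the snippet; the function name `edit_distance` and the operation labels are illustrative.

```python
def edit_distance(a, b):
    """Standard Wagner-Fischer edit distance over two sequences of words.

    Returns (M, D): M[i][j] is the distance between a[:i] and b[:j];
    D[i][j] names the operation that produced that minimum.
    Unit costs are an assumption, not taken from the source.
    """
    n, m = len(a), len(b)
    M = [[0] * (m + 1) for _ in range(n + 1)]
    D = [[None] * (m + 1) for _ in range(n + 1)]

    # Border cells: turning a prefix into the empty pattern, or the reverse.
    for i in range(1, n + 1):
        M[i][0], D[i][0] = i, "delete"
    for j in range(1, m + 1):
        M[0][j], D[0][j] = j, "insert"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            candidates = [
                (M[i - 1][j - 1] + cost, "match" if cost == 0 else "substitute"),
                (M[i - 1][j] + 1, "delete"),
                (M[i][j - 1] + 1, "insert"),
            ]
            # Keep the cheapest choice; ties go to the earliest candidate.
            M[i][j], D[i][j] = min(candidates, key=lambda c: c[0])
    return M, D


A = "the cat sat".split()
B = "the dog sat down".split()
M, D = edit_distance(A, B)
print(M[len(A)][len(B)])
```

Following D backwards from the bottom-right cell would recover one minimal sequence of edits between the two patterns.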
... (Pantel and Lin, 2002; Schütze, 1998), there are other related efforts on word sense discrimination (Dorow and Widdows, 2003; Fukumoto and Suzuki, 1999; Pedersen and Bruce, 1997). In (Pedersen and ... For i = 1 to q do (2.1) Randomly split C T into disjoint halves, denoted as C T A and C T B ; (2.2) Estimate GMM parameter and cluster number on C T A using Cluster, and the par...
Uploaded: 20/02/2014, 16:20
Scientific report: "A Dialogue System with Contextually Appropriate Spoken Output Intonation" docx
... the Focus within Theme and Rheme. A Rheme must always contain a Focus, while Themes can be unmarked (without Focus) or marked (with Focus). Tunes are obtained by combining accents with appropriate ... context, and that contextually inappropriate intonation may have negative effect on intelligibility or lead to confusion. We demonstrate improvements of contextual appropriateness of Engl...
Uploaded: 17/03/2014, 22:20
Scientific report: "Attacking Parsing Bottlenecks with Unlabeled Data and Relevant Factorizations" pdf
... dogs and cats, edge-based counts would measure the associations between dogs and and, and and and cats, but never any web counts that include both dogs and cats. For the phrase ate spaghetti with ... sibling and grandparent factorizations described above–for Conversion 1, sibling scoring may help conjunctions and grandparent scoring may help prepositions, and for Conversion...
Uploaded: 23/03/2014, 14:20
Scientific report: "Combining POMDPs trained with User Simulations and Rule-based Dialogue Management in a Spoken Dialogue System" docx
... to spoken dialogue systems that includes rule-based and trainable dialogue managers, spoken language understanding and generation modules, and a comprehensive dialogue system architecture. ... simulations. To optimize Q and populate the policy with expected values, the learner needs to explore untried actions (system moves) to gain more experiences, and combine this wit...
Uploaded: 23/03/2014, 17:20
Scientific report: "Enhancing electronic dictionaries with an index based on associations" docx
... appropriate in a given context. Obviously, readers and writers come to the dictionary with different mindsets, information and expectations concerning input and output. While the decoder can provide ... network composed of nodes (words and concepts) and links (associations), with either being able to activate the other. Finding a word involves entering the network a...
Uploaded: 23/03/2014, 18:20
Scientific report: "Complementing WordNet with Roget's and Corpus-based Thesauri for Information Retrieval" pdf
... co-occurrence of words a and b with the independent probabilities of occurrence of a and b (Church and Hanks, 1990). $I(a, b) = \log \frac{P(a, b)}{P(a)\,P(b)}$ where the probabilities of P(a) and P(b) are ... only one parse tree with highest possibility. During the parsing process, the parser keeps the unexpanded active nodes in a heap, and always expands the active node with...
Uploaded: 31/03/2014, 21:20
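The mutual information measure quoted in the snippet above compares the joint probability of two words with the product of their independent probabilities. A minimal sketch of that computation follows. The snippet truncates how the probabilities are estimated, so this function takes them as given inputs; the base-2 logarithm and the name `mutual_information` are assumptions, not taken from the source.

```python
import math


def mutual_information(p_ab, p_a, p_b):
    """Return log2( P(a, b) / (P(a) * P(b)) ).

    All three probabilities are supplied by the caller; how they are
    estimated from a corpus is not specified here. Base 2 is an assumption.
    """
    if not (p_ab > 0 and p_a > 0 and p_b > 0):
        raise ValueError("all probabilities must be positive")
    return math.log2(p_ab / (p_a * p_b))


# Independent words: the joint probability equals the product, so the score is 0.
print(mutual_information(0.25, 0.5, 0.5))
# Words that co-occur more often than chance score above 0.
print(mutual_information(0.5, 0.5, 0.5))
```

A score above zero indicates that the two words co-occur more often than independence would predict, and a score below zero indicates the opposite.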
Scientific report: "Searching Questions by Identifying Question Topic and Question Focus" docx
... identifying question topic and question focus. Thus, this section will begin with a brief review of the MDL-based tree cut model and then follow that by an explanation of steps (a) and (b). 2.1 The ... units: BaseNP (Base Noun Phrase) and WH-ngram. A BaseNP is defined as a simple and non-recursive noun phrase (Cao and Li, 2002). A WH-ngram is an ngram beginning with WH...
Uploaded: 17/03/2014, 02:20
Scientific report: "ON THE INDEPENDENCE OF DISCOURSE STRUCTURE AND SEMANTIC DOMAIN" docx
... successive mention of each room and its position. This tour forms a tree composed of the entry to the apartment as root with the rooms and their locations as nodes, and with an associated pointer ... getting themselves and their husbands and children off to work in the morning. (Linde, in preparation) These "morning routines" are typically well-structured and r...
Uploaded: 17/03/2014, 19:20