Scientific report: "Automatic Story Segmentation using a Bayesian Decision Framework for Statistical Models of Lexical Chain Features" pdf
... distribution for lexical chain starts and ends at story boundaries, the uniform distribution for lexical chain start/end at non-story boundary, and the normal distribution for lexical chain continuations. ... Modeling of Lexical Chain Features 4.1 Chain starts and ends We follow (Chan et al. 2007) to model the lexical chain starts and ends at a s...
Uploaded: 23/03/2014, 17:20
... Linear Text Segmentation using a Dynamic Programming Algorithm Athanasios Kehagias Dept. of Math., Phys. and Comp. Sciences Aristotle Univ of Thessaloniki GREECE kehagias@egnatia.ee.auth.gr Fragkou ... (Heinonen, 1998) and Utiyama and Isahara (Utiyama and Isahara, 2001). Finally, other researchers use probabilistic approaches to text segmentation including the use of hidden...
Uploaded: 31/03/2014, 20:20
... is a nominated list of headlines of a length of 10 words. In the case of a paragraph of a length less than 10, there will be only one nominated headline of the same length of that paragraph. ... the category of short summaries. 3 Preparing Data The dataset used in this work was extracted from Arabic Gigaword (Graff, 2007). The Arabic Gigaword is a collec...
Uploaded: 20/02/2014, 05:20
Scientific report: "Automatic Sanskrit Segmentizer Using Finite State Transducers" pdf
... verse, San: nāradam paripapraccha vālmīkirmunipuṅgavam gloss: to the Narada asked Valmiki- to the wisest among sages Eng: Valmiki asked the Narada, the wisest among the sages. In the above ... vacaḥ. We assume that the sandhi handler handling the sandhi involving spaces is available and it splits the above string as, śrutvā caitattrilokajñaḥ vālmīkernāradah...
Uploaded: 07/03/2014, 22:20
Scientific report: "Automatic Image Annotation Using Auxiliary Text Information" potx
... have large and diverse datasets both for training and evaluation. In this work, we aim to relieve the data acquisition bottleneck associated with automatic image annotation by taking advantage ... Mirella Lapata School of Informatics, University of Edinburgh 2 Buccleuch Place, Edinburgh EH8 9LW, UK Y.Feng-4@sms.ed.ac.uk, mlap@inf.ed.ac.uk Abstract The availability of databases of...
Uploaded: 31/03/2014, 00:20
Scientific report: "Automatic Verb Classification Using Distributions of Grammatical Features" ppt
... .158] 4 All raw and normalized corpus data are available from the authors. Table 1: Accuracy of the Verb Clustering Task. Features, Accuracy: 1. VBD ACT INTR CAUS, 52%; 2. VBD ACT CAUS, 54% ... accuracy and standard error. This procedure is then repeated for 10 different random divisions of the data, and accuracy and standard error are again averaged across the ten runs. e...
Uploaded: 31/03/2014, 21:20
Scientific report: "Finding Word Substitutions Using a Distributional Similarity Baseline and Immediate Context Overlap" potx
... the head ‘rescue’, and lemma:failing arg:ARG1 var:bank which indicates that the argument of ‘failing’ is ‘bank’. Note that any tree can be transformed into a feature for a particular lexical item ... (The pattern-based approach uses a set of manually-constructed patterns applied to a web search.) In the same vein, Geffet and Dagan (2005) filter the result of a pattern-ba...
Uploaded: 08/03/2014, 21:20
Scientific report: "Accurate Collocation Extraction Using a Multilingual Parser" docx
... varies from 50 to 500 at intervals of 50) are checked in each case for grammatical well-formedness and for lexicalization. By lexicalization we mean the quality of a pair to constitute (part ... Its author had to simulate parsing because of the lack, at the time, of parsing tools for German. Our report, that concerns an actual system and a large data set, validates Bre...
Uploaded: 17/03/2014, 04:20
Scientific report: "Fast Semantic Extraction Using a Novel Neural Network Architecture" docx
... is labeled for each particular verb as so-called frames. Additionally, semantic roles can also be labeled with one of 13 ARGM adjunct labels, such as ARGM-LOC or ARGM-TMP for additional locational ... solutions are complicated, consist of several stages and hand-built features, and are too slow to be applied as part of real applications that require such semantic labels, partly...
Uploaded: 17/03/2014, 04:20
Scientific report: "Learning Stochastic OT Grammars: A Bayesian approach using Data Augmentation and Gibbs Sampling" pptx
... of the 43rd Annual Meeting of the ACL, pages 346–353, Ann Arbor, June 2005. © 2005 Association for Computational Linguistics Learning Stochastic OT Grammars: A Bayesian approach using Data Augmentation ... contains all the information needed for linguists’ use: for example, if there is a grammar that will generate the exact frequencies as in the data, such a grammar wil...
Uploaded: 23/03/2014, 19:20