... appear in P than in N from the set of candidate words. To select these words, we calculate the log likelihood ratio $\log\left(\frac{C(t,P)}{C(t,N)+1}\right)$ for each candidate word $t$, where $C(t,P)$ is the number ... the shaper Familiarity and the word “rely”, or between the shaper Pressure and the word “extend”. We suspect that the bootstrapping algorithm is likely to make poor word selectio...
Uploaded: 08/03/2014, 00:20
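The log ratio quoted in the excerpt above can be illustrated with a small sketch. The excerpt is truncated where it defines C(t, P), so this sketch assumes C(t, X) is simply the number of occurrences of word t in collection X. The function names `log_ratio_scores` and `select_words` and the top-k cutoff are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def log_ratio_scores(pos_docs, neg_docs, candidates):
    """Score each candidate t by log(C(t,P) / (C(t,N) + 1)).

    Assumption: C(t, X) is the raw occurrence count of t in collection X.
    Each document is a list of tokens.
    """
    pos_counts = Counter(w for doc in pos_docs for w in doc)
    neg_counts = Counter(w for doc in neg_docs for w in doc)
    scores = {}
    for t in candidates:
        c_pos = pos_counts[t]
        if c_pos == 0:
            # log(0) is undefined, so words never seen in P are skipped.
            continue
        scores[t] = math.log(c_pos / (neg_counts[t] + 1))
    return scores


def select_words(scores, k):
    """Return the k highest-scoring words (ties broken alphabetically)."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    P = [["rely", "engine"], ["engine", "fire"]]
    N = [["rely", "rely"]]
    scores = log_ratio_scores(P, N, ["rely", "engine", "fire", "absent"])
    print(select_words(scores, 2))
```

The +1 in the denominator keeps the ratio finite for words that never occur in N, which is why only the numerator needs a zero check here.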
... noun and modifier words as attributes, using a lexical hierarchy to extract these properties. This approach was used by Rosario and Hearst (2001) within a specific domain – medical texts. Using ... over 16 classes. Nastase and Szpakowicz (2003) use the position of the noun and modifier words within general semantic hierarchies (Roget's Thesaurus and WordNet) as...
Uploaded: 20/02/2014, 12:20
Scientific report: "Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models" pptx
... Allan, 2004). Pedersen et al. (2006) and Purandare and Pedersen (2004) integrate second-order co-occurrence of words into the similarity function. Mann and Yarowsky (2003) use biographical ... on Learning on Cores, Clusters and Clouds. Ben Wellner, Andrew McCallum, Fuchun Peng, and Michael Hay. 2004. An integrated, conditional model of information extraction and coreference with...
Uploaded: 07/03/2014, 22:20
Document: Scientific report: "Chinese Word Segmentation without Using Lexicon and Hand-crafted Training Data" pdf
... Chinese Word Segmentation without Using Lexicon and Hand-crafted Training Data Sun Maosong, Shen Dayang*, Benjamin K Tsou** State Key Laboratory of Intelligent Technology and Systems, ... Chinese word segmentation developed so far, both statistical and rule-based, exploited two kinds of important resources, i.e., lexicon and hand-crafted linguistic resources (manually segm...
Uploaded: 20/02/2014, 18:20
Scientific report: "Chinese Segmentation with a Word-Based Perceptron Algorithm" docx
... characters $c_1$ and $c_2$ of two consecutive words; (13) a word of length l and the previous word w; (14) a word of length l and the next word w. [Table 1: feature templates] do not involve word information. ... first and last characters $c_1$ and $c_2$ of any word; (9) word w immediately before character c; (10) character c immediately before word w; (11) the starting characters $c_1$ and...
Uploaded: 08/03/2014, 02:21
Document: Scientific report: "Fixed Length Word Suffix for Factored Statistical Machine Translation" pdf
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Learning Sub-Word Units for Open Vocabulary Speech Recognition" doc
... coherence. Hybrid word/sub-word recognizers can produce a sequence of sub-word units in place of OOV words. Ideally, the recognizer outputs a complete word for in-vocabulary (IV) utterances, and sub-word ... a sub-word lexicon, the word and sub-words are combined to form a hybrid language model (LM) to be used by the LVCSR system. This hybrid LM captures dependencies betwee...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Yet Another Word Alignment Tool" docx
... into devising and improving automatic word alignment algorithms, and into evaluating their performance (e.g., Och and Ney, 2003; Taskar et al., 2005; Moore et al., 2006; Fraser and Marcu, ... visualization and creation of word alignments have been developed (e.g., Melamed, 1998; Smith and Jahr, 2000; Ahrenberg et al., 2002; Rassier and Pedersen, 2003; Daumé; Tiede...
Uploaded: 20/02/2014, 09:20
Document: Scientific report: "Predicting Unknown Time Arguments based on Cross-Event Propagation" ppt
... Filatova and Hovy, 2001; Mani et al., 2003; Lapata and Lascarides, 2006; Eidelman, 2008). Most of the prior work focused on the sentence level by clustering sentences into topics and ordering ... Propagation EM i and EM j are in the same sentence and only one time expression exists in the sentence; this follows the within-sentence inference idea in (Lapata and Lascari...
Uploaded: 20/02/2014, 09:20
Document: Scientific report: "Guiding Statistical Word Alignment Models With Prior Knowledge" pdf
... Bag-of-word Model One method we investigate is a simple bag-of-word model as in monolingual LSA. We treat each sentence pair as a document and do not distinguish source words and target words ... bag-of-word model puts all source words and target words as rows in the matrix, another method of deriving semantic constraint constructs the sparse matrix by taking source words as...
Uploaded: 20/02/2014, 12:20
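The bag-of-word construction described in the last excerpt (each sentence pair treated as one document, with source and target words all placed as rows) might look roughly like the sketch below. The excerpt is truncated, so the details are assumptions: raw counts are used as cell values, sentences are pre-tokenized lists, and the function name `build_word_by_pair_matrix` is hypothetical. As a simplification of this sketch, identical strings on the source and target side share a row.

```python
def build_word_by_pair_matrix(sentence_pairs):
    """Build a word-by-document count matrix from bilingual sentence pairs.

    Assumption: each (source_tokens, target_tokens) pair is one "document",
    and source and target words are pooled into a single row vocabulary.
    Returns (vocab, matrix) with matrix[i][j] = count of vocab[i] in pair j.
    """
    vocab = sorted({w for src, tgt in sentence_pairs for w in src + tgt})
    row_of = {w: i for i, w in enumerate(vocab)}
    matrix = [[0] * len(sentence_pairs) for _ in vocab]
    for j, (src, tgt) in enumerate(sentence_pairs):
        for w in src + tgt:
            matrix[row_of[w]][j] += 1
    return vocab, matrix


if __name__ == "__main__":
    pairs = [
        (["the", "house"], ["das", "haus"]),
        (["the", "book"], ["das", "buch"]),
    ]
    vocab, matrix = build_word_by_pair_matrix(pairs)
    for word, row in zip(vocab, matrix):
        print(word, row)
```

In an LSA-style pipeline a matrix like this would typically be reweighted and factorized afterwards; those steps are not shown because the excerpt does not describe them.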