Scientific report document: "Improving Word Representations via Global Context and Multiple Word Prototypes" pdf
... Computational Linguistics Improving Word Representations via Global Context and Multiple Word Prototypes Eric H. Huang, Richard Socher∗, Christopher D. Manning, Andrew Y. Ng Computer Science Department, ... architecture which 1) learns word embeddings that better capture the semantics of words by incorporating both local and global document context, and 2) account...
Uploaded: 19/02/2014, 19:20
... case, and numbers have been collapsed to a single token; the subset consists of 18,188,548 total words and 159,713 unique words. II. Context Priming It is not an uncommon notion that a word ... RAISINS, SULTANAS, AND CURRANTS: LEXICAL CLASSIFICATION AND ABSTRACTION VIA CONTEXT PRIMING David J. Hutches Department of Computer Science and Engineering, Mail Code 0114 Uni...
Uploaded: 20/02/2014, 21:20
... the quality of word alignments (Wu, 1997; Och and Ney, 2000; Marcu and Wong, 2002; Cherry and Lin, 2003; Liu et al., 2005; Huang, 2009), the correlation of the words in multi-word alignments ... set of source words that are connected to the same target word (Brown et al., 1993). An alignment between a source multi-word cept and a target word is a many-to-one multi-wo...
Uploaded: 20/02/2014, 04:20
Scientific report document: "Improving Chinese Semantic Role Labeling with Rich Syntactic Features" ppt
... 2009)). Head word POS, head word of PP phrases, category of c_k's left and right siblings, CFG rewrite rule that expands c_k and c_k's parent (from (Ding and Chang, 2008)). 3.2 New Word Features We ... word and POS of head word of parent, left sibling and right sibling of c_k. Lexicalized Rewrite rules: Conjunction of rewrite rule and head word of its corresponding...
Uploaded: 20/02/2014, 04:20
Scientific report document: "Improving Classification of Medical Assertions in Clinical Notes" pdf
... preceding adjective and preposition and one additional following verb and preposition. Contextual Features: We incorporated the ConText algorithm (Chapman et al., 2001) to detect four contextual ... the three words preceding it, and the three words following it. We used the LVG annotator in Lexical Tools (McCray et al., 1994) to normalize each word (e.g., with respect to ca...
Uploaded: 20/02/2014, 05:20
Scientific report document: "ParaSense or How to Use Parallel Corpora for Word Sense Disambiguation" pdf
... to the focus word itself being the word form of the focus word, the lemma, Part-of-Speech and chunk information • local context features related to a window of three words preceding and following ... modules (e.g. (Gale and Church, 1993; Ng et al., 2003; Diab and Resnik, 2002; Chan and Ng, 2005; Dagan and Itai, 1994)) and for WSD systems that use a combination of exis...
Uploaded: 20/02/2014, 05:20
Scientific report document: "Improving Automatic Speech Recognition for Lectures through Transformation-based Rules Learned from Minimal Data" ppt
... rules. of contextual replacement rules is generated. The set contains the mismatched pair, by themselves and together with three contexts formed from the left, right, and both anchor context words. ... manual and ASR transcripts of training data, and then extracting the mismatched word sequences, anchored by matching words. The matching words serve as contexts for the rules’ appli...
Uploaded: 20/02/2014, 07:20
Scientific report document: "Improving the Scalability of Semi-Markov Conditional Random Fields for Named Entity Recognition" pdf
... sentence. “Preceding Entity” and “Prev Word” are features that capture specifically words for conjunctions such as “and” or “, (comma)”, e.g., for the phrase “OCIM1 and K562”, both “OCIM1” and “K562” are ... Example of Features: Start/End Word: w_s, w_e; Inside Word: w_s, w_{s+1}, ..., w_e; Context Word: w_{s−1}, w_{e+1}; Start/End SP: sp_s, sp_e; Inside SP: sp_s, sp_{s+1}, ..., sp_e; Context SP...
Uploaded: 20/02/2014, 12:20
Scientific report document: "Improving Pronoun Resolution Using Statistics-Based Semantic Compatibility Information" doc
... frequency StatSem(candi, ana) = count(candi, ana) (1) where count(candi, ana) is the count of the tuple formed by candi and ana, or alternatively, in terms of conditional probability (P(candi, ana | candi)), where ... compatibility between an anaphor and its antecedent candidate is commonly evaluated by examining the relationships between the candidate and the anaphor’s context, bas...
Uploaded: 20/02/2014, 15:20
Scientific report document: "A Practical Solution to the Problem of Automatic Word Sense Induction" doc
... from the contexts of a single word. That is, our computations are based on the concordance of a word. Also, we do not consider a term/term but a term/context matrix. This means, for each word ... respective word occurs in a context or not. We use binary vectors since we assume short contexts where words usually occur only once. By looking at the matrix it is easy to see that c...
Uploaded: 20/02/2014, 16:20