Document - Scientific report: "Direct Word Sense Matching for Lexical Substitution" ppt
... of the 25 words in the Senseval sample as a target word for the sense matching task. Next, we had to pick for each target word a corresponding synonym to play the role of the source word. This ... one definition for each possible word sense. The algorithm looks for words in the sense definitions that overlap with context words in the given sentence, and chooses the sens...
Uploaded: 20/02/2014, 12:20
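The definition-overlap idea in the snippet above can be illustrated with a minimal sketch, in the spirit of the Lesk algorithm. This is not the paper's implementation: the function names, the stop-word list, and the toy sense definitions are all assumptions made for illustration.

```python
# Hypothetical stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "for",
              "and", "or", "is", "that", "at"}


def tokenize(text):
    """Return the set of lowercased tokens, with surrounding punctuation stripped."""
    tokens = {tok.strip(".,;:!?\"'()").lower() for tok in text.split()}
    return tokens - {""}


def overlap_sense(context_sentence, sense_definitions):
    """Pick the sense whose definition shares the most words with the context.

    sense_definitions maps a sense label to its dictionary definition.
    Ties are resolved in favour of the sense listed first.
    """
    context = tokenize(context_sentence) - STOP_WORDS
    best_sense, best_overlap = None, -1
    for sense, definition in sense_definitions.items():
        overlap = len(context & (tokenize(definition) - STOP_WORDS))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


# Toy definitions, invented for this example only.
BANK_SENSES = {
    "bank/finance": "an institution that accepts deposits and lends money",
    "bank/river": "sloping land beside a body of water such as a river",
}

print(overlap_sense("He sat on the bank of the river and watched the water",
                    BANK_SENSES))
```

Exact-token matching is the simplest possible choice here; "deposit" and "deposits" do not match, so a practical variant would probably stem or lemmatize before comparing.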
... describes SENSELEARNER – a minimally supervised word sense disambiguation system that attempts to disambiguate all content words in a text using WordNet senses. We evaluate the accuracy of SENSELEARNER ... topically related word classes, semantic density, and others. In recent SENSEVAL-3 evaluations, the most successful approaches for all words word sense disambiguation...
Uploaded: 20/02/2014, 15:20
... frequent words F to generalize words to word classes”. We define a word class as either a word itself or its part of speech. Given a sentence $s = w_1, w_2, \ldots, w_{|s|}$, where $w_i$ is the $i$-th word ... representation for elearning environments. In Proc. of Applications for Romanian. Proceedings of RANLP workshop, pages 19–25. Wenbin Jiang, Haitao Mi, and Qun Liu. 2008. Word l...
Uploaded: 20/02/2014, 04:20
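One plausible reading of the word-class definition in the snippet above is that a word in the frequent set F stands for itself, while any other word is replaced by its part of speech. The sketch below follows that reading. The mapping rule, the function names, and the tag set are assumptions; the snippet does not confirm how the paper builds F or applies it.

```python
from collections import Counter


def frequent_words(tagged_corpus, top_k):
    """Return the top_k most frequent lowercased words (a stand-in for the set F)."""
    counts = Counter(word.lower()
                     for sentence in tagged_corpus
                     for word, _pos in sentence)
    return {word for word, _count in counts.most_common(top_k)}


def to_word_classes(tagged_sentence, frequent):
    """Map each (word, pos) pair to its word class.

    Assumed rule: a frequent word is its own class; any other word is
    generalized to its part-of-speech tag.
    """
    return [word.lower() if word.lower() in frequent else pos
            for word, pos in tagged_sentence]


# Toy tagged sentence with Penn Treebank style tags, for illustration only.
sentence = [("The", "DT"), ("cat", "NN"), ("sat", "VBD"),
            ("on", "IN"), ("the", "DT"), ("mat", "NN")]
print(to_word_classes(sentence, {"the", "on"}))
```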
Document - Scientific report: "Discriminative Word Alignment with Conditional Random Fields" ppt
... model many-to-one word alignments, where each source word is aligned with zero or one target words, and therefore each target word can be aligned with many source words. Each source word is labelled ... Section 2 presents CRFs for word alignment, describing their form and their inference techniques. The features of our model are presented in Section 3, and experimental results fo...
Uploaded: 20/02/2014, 11:21
Document - Scientific report: "Bayesian Word Sense Induction" pdf
... plausible senses for drug on the WSJ corpus (top half of Table 1). Sense 1 corresponds to the “enforcement” sense of drug, Sense 2 refers to “medication”, Sense 3 to the “drug industry” and Sense ... of $N_c$ word tokens. We shall write $\phi^{(j)}$ as a shorthand for $P(w_i \mid s_i = j)$, the multinomial distribution over words for sense $j$, and $\theta^{(c)}$ as a shorthand for the distr...
Uploaded: 22/02/2014, 02:20
Document - Scientific report: "A structure-sharing parser for lexicalized grammars" pptx
... precompiling additional information, parsing can be broken down into recognition followed by parse recovery; • providing a formal treatment of the algorithms for transforming and minimising ... Introduction It is well-known that fully lexicalised grammar formalisms such as LTAG (Joshi and Schabes, 1991) are difficult to parse with efficiently. Each word in the parser's in...
Uploaded: 20/02/2014, 18:20
Document - Scientific report: "An Efficient Generation Algorithm for Lexicalist MT" ppt
... it is well-formed or ill-formed. • maximal iff it is well-formed and its parent (if it has one) is ill-formed. In other words, a maximal TNCB is a largest well-formed component of a TNCB. ... a polynomial time algorithm for lexicalist MT generation provided that sufficient information can be transferred to ensure more determinism. 1 Introduction Lexicalist approaches to MT, ...
Uploaded: 20/02/2014, 22:20
Document - Scientific report: "Performance Confidence Estimation for Automatic Summarization" ppt
... system performance for a given input is in fact relevant not only for summarization, but in general for all applications aimed at facilitating information access. In question answering for example, ... examples for good and bad performance. We also extend the analysis to single document summarization, for which predicting system performance turns out to be much more accurate...
Uploaded: 22/02/2014, 02:20
Document - Scientific report: Direct identification of hydrophobins and their processing in Trichoderma using intact-cell MALDI-TOF MS docx
... 847 tides. A considerable amount of information is available on eukaryotic signal peptidase specificities, and cleavage site predictions can be performed using, for example, signalp (http://www.cbs.dtu.dk/services/SignalP/) ... 1500 Da. For desorption of the components, a nitrogen laser beam (λ = 337 nm) was focused on the template. The laser power was set to just above the threshold of...
Uploaded: 19/02/2014, 02:20
Document - Scientific report: "Improving Word Representations via Global Context and Multiple Word Prototypes" pdf
... context and one representation per word. This is problematic because words are often polysemous and global context can also provide useful information for learning word meanings. We present a new neural ... for clustering word instances, which is used in the multi-prototype version of our model that accounts for words with multiple senses. We evaluate our new model on the s...
Uploaded: 19/02/2014, 19:20