Document - Scientific report: "AUTOMATIC ACQUISITION OF SUBCATEGORIZATION FRAMES FROM UNTAGGED TEXT" doc
... AUTOMATIC ACQUISITION OF SUBCATEGORIZATION FRAMES FROM UNTAGGED TEXT Michael R. Brent MIT AI Lab 545 Technology Square Cambridge, ... *greet to attend Table 1: The five subcategorization frames (SFs) detected so far The SF acquisition program has been tested on a corpus of 2.6 million words of the Wall Street Journal (kindly ... Vergnaud, 1980), a proposed rul...
Uploaded: 20/02/2014, 21:20
... application of the method is automatic or semi-automatic compilation of a glossary or technical-term dictionary for a certain domain. Recursive application of the method enables to collect a list of ... select the appropriate passages from a document set. We use the Web for the document set and select the passages that describe s for the corpus. The actual procedure of compi...
Uploaded: 20/02/2014, 16:20
... Automatic Acquisition of Script Knowledge from a Text Collection Toshiaki Fujiki Hidetsugu Nanba Interdisciplinary Graduate School of Graduate School of Science and Engineering Information ... sequences (pairs) of actions from the text collection. 3. Selecting typical sequences. We show the outline of our method in Figure 1, where the process of automatic acquisition...
Uploaded: 31/03/2014, 20:20
Document - Scientific report: "Automatic Extraction of Lexico-Syntactic Patterns for Detection of Negation and Speculation Scopes" pdf
... all node children (starting from the root of the subtree) to the rule pattern subtree. Nodes of type *scope* and * match any number of nodes, similar to the semantics of Regex Kleene star (*). 5 ... Statistics of the BioScope corpus. The 2nd and 3rd columns show the total number of cues within the datasets; the 4th and 5th columns show the percentage of negated and speculativ...
Uploaded: 20/02/2014, 04:20
Document - Scientific report: "Automatic learning of textual entailments with cross-pair similarities" ppt
... examples of the previous section. From the point of view of bag-of-word methods, the pairs (T1, H1) and (T1, H2) have both the same intra-pair similarity since the sentences of T1 and ... rules that describe a non-trivial set of entailment cases. The experiments with the data sets of the RTE 2005 challenge show an improvement of 4.4% over the state-of-the-art...
Uploaded: 20/02/2014, 12:20
Document - Scientific report: "Automatic Construction of Polarity-tagged Corpus from HTML Documents" docx
... polarity of words There are some works that discuss learning the polarity of words instead of sentences. Hatzivassiloglou and McKeown proposed a method of learning the polarity of adjectives from corpus ... of reviews are not available. In addition, the corpus created from reviews is often noisy as we discuss in Section 2. This paper proposes a novel method of building p...
Uploaded: 20/02/2014, 12:20
Document - Scientific report: "Automatic Identification of Pro and Con Reasons in Online Reviews" ppt
... computational model because of the difficulty of determining the unit of an opinion. In general, researchers study opinion at three different levels: word level, sentence level, and document level. ... a combination of two methods. The first method derived a list of opinion-bearing words from a large news corpus by separating opinion articles such as letters or editorial...
Uploaded: 20/02/2014, 12:20
Document - Scientific report: "Automatic Evaluation of Sentence-Level Fluency Andrew Mutton∗" pdf
... Methods PoStag In the first of these, we constructed a rough approximation of typical sentence grammar structure by taking bigrams over part-of-speech tags. Then, given a string of PoS tags of length n, t 1 . ... instead of PoS tags. The idea is that the supertags might give a more fine-grained definition of structure, using partial trees rather than parts of speech. CFG We...
Uploaded: 20/02/2014, 12:20
Document - Scientific report: "Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics" doc
... construction of N-best translation lexicons from parallel text. Melamed (1995) used the ratio (LCSR) between the length of the LCS of two words and the length of the longer word of the two ... WLCS score ending at word x_i of X and y_j of Y, w is the table storing the length of consecutive matches ended at c table position i and j, and f is a function of consecut...
Uploaded: 20/02/2014, 16:20
Document - Scientific report: "Automatic clustering of collocation for detecting practical sense boundary" ppt
... vocabularies are selected from a given corpus and 2P C/VP is all sets of C/V. In the equation (1), the frequency of x is m in c. We can also express m=|c/x|. The window size of a collocation is ... the word senses numbered i of the word x. I x is the word sense indexing function of x that gives an index to each sense of the word x. All contextual words x i ±j of a cen...
Uploaded: 20/02/2014, 16:20