Scientific report: "Automatic Prediction of Cognate Orthography Using Support Vector Machines" potx
... Proceedings of the ACL 2007 Student Research Workshop, pages 25–30, Prague, June 2007. © 2007 Association for Computational Linguistics Automatic Prediction of Cognate Orthography Using Support Vector ... the cognate creation algorithm in detail. Input: C1, a list of English-German cognate pairs {L1,L2}; C2, a test file of cognates in L1 Output: AL, a list of a...
Uploaded: 31/03/2014, 01:20
... Pattern Singular “a(x) x is made up of” NP QT is made up of NP’ C “a(x) x is made of” NP QT is made of NP’ C “a(x) x comprises” NP QT comprises (of)? NP’ C “a(x) x consists of” NP QT consists of NP’ C Plural “p(x) ... NP’ C Plural “p(x) are made up of” NP QT is made up of NP’ C “p(x) are made of” NP QT are made of NP’ C “p(x) comprise” NP QT comprise (of)? NP’ C “p(x) consis...
Uploaded: 08/03/2014, 02:21
... (possible) medical conditions. The importance of the task of negation and speculation (a.k.a. hedge) detection is attested by a number of research initiatives. The creation of the BioScope corpus (Vincze ... Statistics of the BioScope corpus. The 2nd and 3rd columns show the total number of cues within the datasets; the 4th and 5th columns show the percentage of negated and...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Automatic learning of textual entailments with cross-pair similarities" ppt
... examples of the previous section. From the point of view of bag-of-word methods, the pairs (T1, H1) and (T1, H2) have both the same intra-pair similarity since the sentences of T1 and ... improvement of 4.4% over the state-of-the-art methods. 1 Introduction Recently, textual entailment recognition has been receiving a lot of attention. The main reason is that the...
Uploaded: 20/02/2014, 12:20
Document: Scientific report: "Automatic Construction of Polarity-tagged Corpus from HTML Documents" docx
... there are two sentences in each of the 454 (1) kono software-no riten-ha hayaku ugoku koto this software-POST advantage-POST quickly run to The advantage of this software is to run quickly. (2) ... the polarity of words There are some works that discuss learning the polarity of words instead of sentences. Hatzivassiloglou and McKeown proposed a method of learning the polarity...
Uploaded: 20/02/2014, 12:20
Document: Scientific report: "Automatic Identification of Pro and Con Reasons in Online Reviews" ppt
... specific and tangible features. Also, there are somewhat a fixed set of features of a specific type of product, for example, ease of use, durability, battery life, photo quality, and shutter lag ... examples of sentences that our system identified as reasons of complaints. (1) Unfortunately, I find that I am no longer comfortable in your establishment because of the...
Uploaded: 20/02/2014, 12:20
Document: Scientific report: "Automatic Evaluation of Sentence-Level Fluency", Andrew Mutton pdf
... magnitude of the first three parser metrics, however, lends support to the idea of Wan et al. (2005) to use something like these as indicators of generated sentence fluency. The aim of the next ... baseline of 50%, indicating that the SVM can classify satisfactorily. We now move from looking at classification accuracy to the main purpose of the SVM, using distance from support...
Uploaded: 20/02/2014, 12:20
Document: Scientific report: "Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics" doc
... cognate candidates during construction of N-best translation lexicons from parallel text. Melamed (1995) used the ratio (LCSR) between the length of the LCS of two words and the length of ... score over N sets of N-1 references. The final score is the average of the N best scores using N different sets of N-1 references. The Jackknifing procedure is adopted since we...
Uploaded: 20/02/2014, 16:20
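The snippet above mentions the longest common subsequence ratio (LCSR) but is cut off before it names the denominator. The following is a minimal Python sketch of how such a ratio might be computed, assuming the common convention of dividing the LCS length by the length of the longer word; that denominator is an assumption, not something the excerpt confirms. The function names `lcs_length` and `lcsr` are hypothetical.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)  # LCS lengths for the previous row of the table
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Matching characters extend the LCS of both prefixes by one.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of the two neighbouring cells.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcsr(a: str, b: str) -> float:
    """LCS ratio; ASSUMPTION: the denominator is the longer word's length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0  # Convention chosen here for two empty strings.
    return lcs_length(a, b) / longest
```

For example, `lcsr("colour", "color")` divides an LCS of length 5 by the longer length 6. A different denominator (for instance the shorter word) would only require changing the `longest` line.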
Document: Scientific report: "Automatic clustering of collocation for detecting practical sense boundary" ppt
... the word senses numbered i of the word x. I_x is the word sense indexing function of x that gives an index to each sense of the word x. All contextual words x_{i±j} of a central word x have ... normalization is using LSI (Latent Semantic Indexing). Throughout the LSI transformation, we can remove the dimension of the context vector and express the hidden features into the...
Uploaded: 20/02/2014, 16:20
Document: Scientific report: "Automatic Collection of Related Terms from the Web" pptx
... application of the method is automatic or semi-automatic compilation of a glossary or technical-term dictionary for a certain domain. Recursive application of the method enables to collect a list of ... Web by using search engines and produces a dozen technical terms that are closely related to the seed word. 2 System Figure 1 shows the configuration of the system. The system c...
Uploaded: 20/02/2014, 16:20