Scientific report: "Detecting Erroneous Sentences using Automatically Mined Sequential Patterns" pdf
... Republic, June 2007. © 2007 Association for Computational Linguistics Detecting Erroneous Sentences using Automatically Mined Sequential Patterns Guihua Sun ∗ Xiaohua Liu Gao Cong Ming Zhou Chongqing ... identifying erroneous/correct sentences. A set of training data containing correct and erroneous sentences is given. Unlike some previous work, our technique requires neit...
Uploaded: 08/03/2014, 02:21
... Effects of using the empty categories 5 Experiments with Automatically Parsed data The next set of experiments use the BNC and Treebank, but strip POS and parse information, and parse them automatically ... summarise the findings: • Using the BNC, which is tagged with a complex tagging scheme but has no parse data, it is possible to get 76% F1 using lexical forms and POS data alon...
Uploaded: 17/03/2014, 06:20
... problem by using pseudo-error sentences generated automatically. Furthermore, we apply domain adaptation, the pseudo-error sentences are from the source domain, and the real-error sentences ... and the correct sentences. However, collecting a sufficient number of pairs is expensive. To avoid this problem, we use additional corpus consisting of pseudo-error sentences automat...
Uploaded: 19/02/2014, 19:20
Document: Scientific report: "Extracting Comparative Sentences from Korean Text Documents Using Comparative Lexical Patterns and Machine Learning Techniques" doc
... finally annotated 7,384 sentences. Table 3 shows the number of comparative sentences and non-comparative sentences in our corpus. Table 3. The numbers of annotated sentences Total Comparative ... ([gat]: same)’. But many sentences also express comparison without those keywords. Similarly, although some sentences contain some keywords, they cannot be comparative sentence...
Uploaded: 20/02/2014, 09:20
Scientific report: "Detecting Semantic Relations between Named Entities in Text Using Contextual Features" pdf
... following NE is used as a feature. We call this feature Centering Top (CT). 2.4 Using Stack Structure The sorting algorithm using centering theory tends to rank highly those words that easily become ... are now so advanced that named entity (NE) taggers are in practical use. Researchers are now focusing on extracting semantic relations between NEs, such as “George Bush (person)” is “presi...
Uploaded: 17/03/2014, 04:20
Document: Scientific report: "Detecting Semantic Equivalence and Information Disparity in Cross-lingual Documents" doc
... languages, and ii) entailment relations between T and H have to be checked in both directions. Using a combination of lexical, syntactic, and semantic features to train a cross-lingual textual ... of lexical evidence. When only unidirectional entailment relations from T to H have to be determined (RTE-like setting), the full mapping of the hypothesis into the text usually provides eno...
Uploaded: 19/02/2014, 19:20
Document: Scientific report: "Identifying Text Polarity Using Random Walks" pptx
... achieves better performance by only using WordNet synonym, hypernym and similar to relations. Adding co-occurrence statistics slightly improved performance, while using glosses did not help at ... and it has a wide variety of applications. We proposed a method for automatically predicting the semantic orientation of words using random walks and hitting time. The proposed metho...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Automatic Headline Generation using Character Cross-Correlation" doc
... have shown the effectiveness of using character cross-correlation in choosing the best headline out of nominated sentences from Arabic document. The advantage of using character cross-correlation ... and S is the total number of sentences. Figure 1: Scaling function of a 1000 nominated headline document. According to the nominating mechanism hundreds of sentences could be...
Uploaded: 20/02/2014, 05:20
Document: Scientific report: "Generating research websites using summarisation techniques" pptx
... used. We do not use the full paper, as PDFs are not available for all papers in publication pages (due to copyright and other issues). The titles are then parsed using the RASP parser (Briscoe and ... stylesheets are often considered inappropriate for diverse organisations. Research summary pages using stylesheets can offer alternative methods of information access and browsing, aiding na...
Uploaded: 20/02/2014, 09:20
Document: Scientific report: "Japanese Dependency Parsing Using Co-occurrence Information and a Combination of Case Elements" pdf
... data are as follows: • Training data: 24,263 sentences, 234,474 bunsetsus • Development data: 4,833 sentences, 47,580 bunsetsus • Test data: 9,287 sentences, 89,982 bunsetsus The test data contained ... r, and verb v) by using probabilistic latent semantic indexing (PLSI) (Hofmann, 1999). If n, r, v is the co-occurrence of n and r, v, we can calculate P(n, r, v) by using the...
Uploaded: 20/02/2014, 12:20