Scientific report: "Exploiting Comparable Corpora and Bilingual Dictionaries for Cross-Language Text Categorization"
... Linguistics and 44th Annual Meeting of the ACL, pages 553–560, Sydney, July 2006. © 2006 Association for Computational Linguistics Exploiting Comparable Corpora and Bilingual Dictionaries for Cross-Language ... of texts defined by $T^* = \bigcup_i T^i$. If the function $\psi$ exists for every text $t^i_z \in T^*$ and for every language $L^j$, and is known, then the corpus is parall...
Uploaded: 17/03/2014, 04:20
... Properties of Bilingual Dictionaries For Distinguishing Senses of English Words and Inducing English Sense Clusters Charles SCHAFER and David YAROWSKY Department of Computer Science and Center for Language ... between fair and blond, and between fair and just. S does not hold between blond and fair. We can infer that fair has at least 2 senses and, further, we can repre...
Uploaded: 23/03/2014, 19:20
... in our parallel text. 5 Data Web-scale text data is used for monolingual feature counts, parallel text is used for classifier co-training, and labeled data is used for training and evaluation. Web-scale ... meat asbestos and English: polyvinyl chloride and asbestos w 2 h and w 1 polyvinyl English: asbestos , and polyvinyl chloride w 1 , and w 2 h chloride English: asb...
Uploaded: 17/03/2014, 00:20
Scientific report: "Clustering Comparable Corpora For Bilingual Lexicon Extraction"
... Association for Computational Linguistics: short papers, pages 473–478, Portland, Oregon, June 19-24, 2011. © 2011 Association for Computational Linguistics Clustering Comparable Corpora For Bilingual ... studies and is now standard. 3.2.2 Results and Analysis In a first series of experiments, bilingual lexicons were extracted from the corpora obtained by our approach (P 1...
Uploaded: 23/03/2014, 16:20
Scientific report: "Using comparable corpora to solve problems difficult for human translators"
... several comparable corpora for English and Russian, including large reference corpora (the BNC and the Russian Reference Corpus) and corpora of major British and Russian newspapers. All corpora ... comparable corpora, POS-tagging and lemmatisation tools, and bilingual dictionaries are available. For example, we conducted a small study for translation b...
Uploaded: 31/03/2014, 01:20
Scientific report: "Hand-held Scanner and Translation Software for non-Native Readers"
... a pseudo formalism which attempts to be both intuitive for linguists and relatively straightforward to code (for the time being, this is done manually). The conditions take the form of boolean ... computers (or even PDAs) and uses a hand-held scanner to get the input material. In other words, TwicPen consists of (i) a simple hand-held scanner and (ii) parsing and translation s...
Uploaded: 20/02/2014, 12:20
Scientific report: "Skip N-grams and Ranking Functions for Predicting Script Events"
... field of script and narrative event chain understanding: • We explore for the first time the use of skip-grams for collecting narrative event statistics, and show that this approach performs better than ... achieves the lowest average rank and the highest Recall@50. For the Fairy Tale corpus, 1-skip bigrams and 2-skip bigrams perform similarly, and both have lower average rank...
Uploaded: 22/02/2014, 02:20
Scientific report: Cell biology, regulation and inhibition of β-secretase (BACE-1)
... (see text for full details). (B) Sites of cleavage of APP by β- and γ-secretases to form Aβ peptides. The sites of the juxtamembrane and intramembrane cleavages of transmembrane APP by β- and ... [36,43,44], and a truncated, soluble form of BACE-1 can be detected by activity assay in cerebrospinal fluid, which may provide a useful biomarker in AD and a source for monitoring the...
Uploaded: 07/03/2014, 00:20
Scientific report: "Resolving It, This, and That in Unrestricted Multi-Party Dialog"
... category restrictions for predicate argument positions, and achieves a precision of 75.0 and a recall of 65.0 for it (50 instances) and a precision of 67.0 and a recall of 62.0 for that (93 instances) if ... provided for ALL and for NP and VP antecedents individually. The parameter tipster is not available for the baseline system. The best baseline performance is pre...
Uploaded: 08/03/2014, 02:21
Scientific report: "Combining Textual Entailment and Argumentation Theory for Supporting Online Debates Interactions"
... relation between T1 and H (Example 1), and a contradiction between T2 and H (Example 2). As introduced before, our paper proposes an approach to support the participants in forums or debates to ... on tokenized text is also considered, and obtains an accuracy of 0.61 on the training set and 0.62 on the test set). Even using a basic configuration of EDITS, and a small data set (...
Uploaded: 16/03/2014, 20:20