Scientific report: "Multilingual Document Clustering: an Heuristic Approach Based on Cognate Named Entities" docx
... process non-English documents are translated using simple dictionary lookup techniques for translating Japanese and Russian documents, and the Systran translation system for the other languages ... MDC by Cognate NE Identification We propose an approach for MDC based only on cognate NE identification. The NE categories that we take into account are: PERSON, ORGANIZATION, LOCATI...
Uploaded: 31/03/2014, 01:20
... goal, including random initialization (Fred and Jain, 2002), feature extraction based on random projection (Fern and Brodley, 2003) and the combination of sets of “weak” partitions (Topchy et ... 1990). Dimensional reduction maps data in a high-dimensional space into a low-dimensional space, and improves both clustering accuracy and speed. NMF is a dimensional reduction method (Xu ....
Uploaded: 20/02/2014, 12:20
... solutions for all values of k from 1 to N, and then determines the mean and standard deviation of the criterion function. Then, a score is computed for each value of k by subtracting the mean from ... from the criterion function, and dividing by the standard deviation. We adapt this technique by using the H2 criterion function, and limit k from 1 to deltaK: PK1(k) = (H2(k) − mean(H2[1...deltaK])) / s...
Uploaded: 22/02/2014, 02:20
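The score described in the snippet above (criterion value minus the mean, divided by the standard deviation) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name pk1_scores is hypothetical, the input is assumed to be a list of H2 values already computed for k = 1 to deltaK, and the use of the population standard deviation is an assumption, since the snippet is truncated before it says which variant is used.

```python
from statistics import mean, pstdev


def pk1_scores(h2_values):
    """Standardize criterion values: (H2(k) - mean) / standard deviation.

    h2_values[i] is assumed to hold H2(k) for k = i + 1.
    Hypothetical helper; population standard deviation is an assumption.
    """
    mu = mean(h2_values)
    sigma = pstdev(h2_values)
    if sigma == 0:
        # All criterion values are identical, so no k stands out.
        return [0.0 for _ in h2_values]
    return [(value - mu) / sigma for value in h2_values]


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper.
    for k, score in enumerate(pk1_scores([1.0, 2.0, 3.0]), start=1):
        print(k, round(score, 3))
```

A score near zero means the criterion value for that k is close to the average over the range considered; how a final k is picked from these scores is not recoverable from the truncated snippet.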
Scientific report: "The Treegram Index: An Efficient Technique for Retrieval in Linguistic Treebanks" docx
... Index An Efficient Technique for Retrieval in Linguistic Treebanks Hans Argenton and Anke Feldhaus Infineon Technologies, DAT CIF, Postbox 801709, D-81617 München hans.argenton@infineon.com ... which handles the BHt (Biblia Hebraica transcripta) treebank comprising 508,650 phrase structure trees with maximum degree eight and maximum height 17, containing altogether 3.3 millio...
Uploaded: 17/03/2014, 23:20
Scientific report: A new clan of CBM families based on bioinformatics of starch-binding domains from families CBM20 and CBM21 potx
... with a common ancestry, very similar tertiary structure and conserved catalytic machinery and reaction mechanism [79]. Here we propose that a clan of carbohydrate-binding modules contains CBM ... Minassian BA, Ianzano L, Meloche M, Andermann E, Rouleau GA, Delgado-Escueta AV & Scherer SW (2000) Mutation spectrum and predicted function of laforin in Lafora’s progressive myoclonus epilepsy...
Uploaded: 07/03/2014, 21:20
Scientific report: "A Framework for Figurative Language Detection Based on Sense Differentiation" pptx
... represent senses of both an expression and a context as sets of documents. Our hypothesis is that these document sets differ significantly if and only if an expression is used figuratively. Thus, ... similar meanings are often used in similar contexts. As was mentioned, we can treat a meaning of a metaphoric usage of an expression as an additional, not common for the expressi...
Uploaded: 07/03/2014, 22:20
Scientific report: "Automatic Acquisition of English Topic Signatures Based on a Second Language" potx
... as context vectors in Concept Space. A context vector is the sum of the vectors of concepts that occur in a context window. If many of the concepts in a window have a strong component for one ... [Figure labels: English word w; English-Chinese Lexicon; Chinese translation of sense 1; Chinese translation of sense 2; Chinese Search Engine; Chinese document 1; Chinese document 2; Chinese segmentation...]
Uploaded: 08/03/2014, 04:22
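The definition in the snippet above (a context vector is the sum of the vectors of the concepts in a context window) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name context_vector, the dictionary-of-lists representation, and the choice to skip concepts with no known vector are all hypothetical and are not taken from the paper.

```python
def context_vector(window, concept_vectors, dim):
    """Sum the vectors of the concepts that occur in a context window.

    window: list of concept names in the window (hypothetical input format).
    concept_vectors: dict mapping a concept name to a list of `dim` floats.
    Concepts without a known vector are skipped (an assumption).
    """
    total = [0.0] * dim
    for concept in window:
        vector = concept_vectors.get(concept)
        if vector is None:
            continue
        for i, component in enumerate(vector):
            total[i] += component
    return total


if __name__ == "__main__":
    # Toy two-dimensional concept space, for illustration only.
    vectors = {"bank": [1.0, 0.0], "money": [0.5, 0.5]}
    print(context_vector(["bank", "money", "unknown"], vectors, 2))
```

With this representation, a window in which many concepts share a strong component in one dimension yields a sum that is large in that dimension, which matches the intuition the snippet begins to describe.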
Scientific report: "Word Sense Disambiguation in Untagged Text based on Term Weight Learning" ppt
... that wp and nq are semantically related if w'pi and nq are semantically related and (wp, nq) and (w'pi, nq) are semantically similar (Dagan et al., 1993). Using the estimation, collocations ... between vl and nj. We recall that wp and nq are semantically related if w'pi and nq are semantically related and (wp, nq) and (w'pi, nq) are semantically similar. (a) ' an...
Uploaded: 08/03/2014, 21:20
Scientific report: "Dependency Parsing of Japanese Spoken Monologue Based on Clause Boundaries" docx
... University, Japan ‡ Information Technology Center, Nagoya University, Japan § ATR Spoken Language Communication Research Laboratories, Japan The National Institute for Japanese Language, Japan Faculty ... Parsing Based on Clause Boundaries In accordance with the assumption described in Section 2, in our method, the transcribed sentence on which morphological analysis, clause bound-...
Uploaded: 17/03/2014, 04:20
Scientific report: Diversity of human U2AF splicing factors Based on the EMBO Lecture delivered on 7 July 2005 at the 30th FEBS Congress in Budapest pptx
... gene functions, as well as by the creation and loss of different exons. Both the emergence of additional genomic copies by gene duplication and retrotransposition, and an increase in transcript ... mRNAs and ESTs for this gene region. Making allowance only for GT_AG, GC_AG or AT_AC splice site consensus and excluding isoforms with extensive intron retentions, the non-redundant set of lon...
Uploaded: 23/03/2014, 10:20