Scientific report: "Distributed Word Clustering for Large Scale Class-Based Language Modeling in Machine Translation" docx

... obtain automatic word classifications for large vocabularies (>1 million words) using such large training corpora (>30 billion tokens). The resulting clusterings are then used in training ... Proceedings of ACL-08: HLT, pages 755–762, Columbus, Ohio, USA, June 2008. © 2008 Association for Computational Linguistics Distributed Word Clustering for Large Scale Cl...

Uploaded: 31/03/2014, 00:20

8 336 0

Scientific report document: "Learning Word Vectors for Sentiment Analysis" ppt

... emphasis in LDA is on modeling topics, not word meanings, there is no guarantee that the row (word) vectors are sensible as points in a k-dimensional space. Indeed, we show in section 4 that using ... weighting in previous work suggests that incorporating sentiment information into VSM values via supervised methods is helpful for sentiment analysis. We adopt this insight, bu...

Uploaded: 20/02/2014, 04:20

9 591 0

Scientific report: "SVD and Clustering for Unsupervised POS Tagging" docx

... descriptor for word type i. We next include a normalization step in which each row in each of L* and R* is scaled to unit length, yielding matrices L** and R**. Finally, we form a single ... descriptors into k1 = 500 groups, using a k-means clustering algorithm. Centroid initialization is done by placing the k initial centroids on the descriptors of the k most freq...

Uploaded: 07/03/2014, 22:20

5 269 0

Scientific report: "Measure Word Generation for English-Chinese SMT Systems" ppt

... objects. Therefore, in the English-to-Chinese machine translation task we need to take additional efforts to generate the missing measure words in Chinese. For example, when translating the English ... four major kinds of errors as listed in Table 8. Most errors are caused by failures in finding positions to generate measure words. The main reason for this is some hint in...

Uploaded: 08/03/2014, 01:20

8 288 0

Scientific report: "Interactive Word Alignment for Language Engineering" pptx

... inspectors for viewing, searching and editing the static and dynamic resources and a Link Reporter that can summarize and configure the information in the database, including compiling fine-grained ... updated incrementally during the manual revision stage. Each time the user confirms a proposed link the information inherent in the link is stored in the different dynamic resourc...

Uploaded: 17/03/2014, 22:20

4 310 0

Scientific report: "Exploring Asymmetric Clustering for Statistical Language Modeling" docx

... bigram ACM used in a Chinese text input system [Gao et al. 2002]. However, quite a few techniques (including clustering) were integrated to construct a Chinese language modeling system, and ... Asymmetric clustering The basic criterion for statistical clustering is to maximize the resulting probability (or minimize the resulting perplexity) of the training data. Many tradit...

Uploaded: 23/03/2014, 20:20

8 357 0

Scientific report: "Clique-Based Clustering for improving Named Entity Recognition systems" pot

... similarities between NEs. The approach that we propose is inspired from the language modeling framework introduced in the information retrieval field (see for example (Lavrenko and Croft, 2003)). Then, we ... cliques containing Oxford 2.4 Cliques clustering We use a clustering technique in order to group cliques of NEs which are mutually highly similar. The clusters of cliques...

Uploaded: 31/03/2014, 20:20

9 297 0

Scientific report document: Pathways and products for the metabolism of vitamin D3 by cytochrome P450scc docx

... for 10 min in a bath-type sonicator [22]. Vitamin D3, cholesterol and hydroxyvitamin D3 derivatives were included in the mixture for sonication as required. Purified P450scc was incorporated into ... 20,23-dihydroxyvitamin D3 (80 µg) for structure determination by NMR was performed using a 50 mL incubation of 50 µM vitamin D3 with 2 µM P450scc in 0.45% cyclodextrin, with the prod...

Uploaded: 18/02/2014, 17:20

12 705 0

Scientific report: "A Descriptive Framework for Translating Speaker's Meaning Towards a Dialogue Translation System between Japanese and English" pot

... x-WISH INFORMATIVE various S-INFORM 3.2. Unification-based analysis Figure 1 diagrams an overview of the procedure for translating speaker's meaning. In contrast to a conventional machine ... REQUESTING COMPLAINING ADVISING CONFIRMING etc. Conversely, the same intention can be conveyed through various surface expressions, as in the following variations of (2-1): RE...

Uploaded: 09/03/2014, 01:20

8 329 0

Scientific report: A new paradigm for oxygen binding involving two types of αβ contacts docx

... a basis for the continuity of haemoglobin and myoglobin functions in vivo, since the autoxidation reaction is inevitable in nature for all oxygen-binding haem proteins [21,23,24], as well as for ... contacts in HbA In haemoglobin (Hb) research, the central problem is understanding the mechanism for the cooperative oxygen binding to the α2β2 tetramer. For human HbA, the a...

Uploaded: 17/03/2014, 10:20

11 371 0