Scientific report: "Identifying Word Translations from Comparable Corpora Using Latent Topic Models"

... language-independent framework for mining translations of words from latent topic models. We have proven that topical knowledge is useful and improves the quality of word translations. The quality of translations depends ... 2011. © 2011 Association for Computational Linguistics Identifying Word Translations from Comparable Corpora Using Latent Topic Models Iva...

Uploaded: 23/03/2014, 16:20

6 449 0
Scientific report: "Identifying Word Translations in Non-Parallel Texts"

... algorithms for sentence and word alignment allow the automatic identification of word translations from parallel texts. This study suggests that the identification of word translations should also ... determine the translations of words from comparable or even unrelated texts. 2 Approach It is assumed that there is a correlation between the co-occurrences of words...

Uploaded: 08/03/2014, 07:20

3 219 0
Scientific report document: "Bilingual Terminology Acquisition from Comparable Corpora and Phrasal Translation to Cross-Language Information Retrieval"

... the comparable corpora. Re-scoring through the Comparable Corpora Comparable corpora could be considered for the disambiguation of translation alternatives and thus selection of best phrasal translations ... Acquisition (Two-stages Comparable Corpora-based Model) Linguistic-based Pruning Phrasal Translation / Selection Japanese Doc. Content words (nouns, verbs, adjectives...

Uploaded: 20/02/2014, 16:20

4 377 0
Scientific report: "Mining Parenthetical Translations from the Web by Word Alignment"

... correspondences between the words using a word alignment algorithm. At first glance, word alignment appears to be a more difficult problem than the extraction of parenthetical translations. Extraction ... translations. Extraction of parenthetical translations need only determine the first pre-parenthesis word aligned with an in-parenthesis word, whereas word alignment requ...

Uploaded: 17/03/2014, 02:20

9 612 0
Scientific report document: "Joint Word Segmentation and POS Tagging using a Single Perceptron"

... are different to those from Collins (2002) and are specific to Chinese, are shown in Table 2. The word segmentation features are extracted from word bigrams, capturing word, word length and character ... last word can be a complete word or a partial word. A problem arises in whether to give POS tags to incomplete words. If partial words are given POS tags, it is likely that som...

Uploaded: 20/02/2014, 09:20

9 576 0
Scientific report: "A novel trehalase from Mycobacterium smegmatis - purification, properties, requirements"

... fraction or in the purified fraction from the column. Partial characterization of the trehalase from M. tuberculosis The M. tuberculosis trehalase was isolated from cytosolic extracts and partially ... to the hypothetical protein from M. avium. However, the aggregated form of the tuberculosis trehalase emerges from a Sephacryl S-300 column later than the enzyme from M. smegmatis,...

Uploaded: 16/03/2014, 11:20

14 271 0
Scientific report: "Building Emotion Lexicon from Weblog Corpora"

... between words and emotions using weblog corpora. A collocation model is proposed to learn emotion lexicons from weblog articles. Emotion classification at sentence level is experimented by using ... the total word occurrences. A word entry of a lexicon may contain several emotion senses. They are ordered by the collocation strength co. Figure 2 shows two Chinese example...

Uploaded: 17/03/2014, 04:20

4 302 0
Scientific report: "Constructing Transliteration Lexicons from Web Corpora"

... sentence contained at least one English word. Analysis showed that 17.43% of the English terms were transliterated, and that most of them were content words (words that carry essential meaning, ... existing dictionaries. Regularly exploring Web corpora is a good way to update dictionaries. Transliterated-term extraction using non-parallel corpora has also been conducted (Kuo, 20...

Uploaded: 17/03/2014, 06:20

4 218 0
Scientific report: "Learning Bilingual Lexicons from Monolingual Corpora"

... input two monolingual corpora and perhaps some seed translations, and we produce as output a bilingual lexicon, defined as a list of word pairs deemed to be word-level translations. Precision ... the word type features from a baseline normal distribution with variance $\sigma^2 I_{d_S}$, with hyperparameter $\sigma^2 \gg 0$; unmatched target words are similarly generated. If two word type...

Uploaded: 31/03/2014, 00:20

9 300 0
Scientific report: "Learning Tense Translation from Bilingual Corpora"

... morphosyntactically (analytic tenses). 2 Words Are Not Enough Often, sentence meaning is not compositional but arises from combinations of words (1). (1) a. Ich habe ihn gestern gesehen. ... context dependence. For every ambiguous word, the part of the context relevant for disambiguation must be identified (disambiguation strategy), and every word potentially occurring in thi...

Uploaded: 31/03/2014, 04:20

5 279 0