... 449–459, Avignon, France, April 23–27, 2012. © 2012 Association for Computational Linguistics. Detecting Highly Confident Word Translations from Comparable Corpora without Any Prior Knowledge. Ivan Vulić and ... In other words, if the most probable translation candidate for a source word $w_1^S$ is a target word $w_2^T$ and, vice versa, the most probable translation candidate of the target word $w_2^T$ ... the list. In other words, if the first translation candidate for the source word isola is the target word island, and, vice versa, the first translation candidate for the target word island is isola,...
Uploaded: 08/03/2014, 21:20
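The symmetry condition quoted in the snippet above (a pair is kept only when each word is the other's first translation candidate) can be sketched as follows. This is a minimal illustration, assuming every word already comes with a ranked candidate list; the function name `mutual_best_pairs` and the toy dictionaries are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def mutual_best_pairs(
    src_to_tgt: Dict[str, List[str]],
    tgt_to_src: Dict[str, List[str]],
) -> List[Tuple[str, str]]:
    """Return (source, target) pairs that are each other's first candidate."""
    pairs = []
    for src, candidates in src_to_tgt.items():
        if not candidates:
            continue  # no candidates for this source word
        best_tgt = candidates[0]
        back = tgt_to_src.get(best_tgt, [])
        # Keep the pair only if the target word points back to the source word.
        if back and back[0] == src:
            pairs.append((src, best_tgt))
    return pairs


# Toy ranked candidate lists, made up purely for illustration.
it_to_en = {"isola": ["island", "sea"], "mare": ["island", "sea"]}
en_to_it = {"island": ["isola", "mare"], "sea": ["mare", "isola"]}

print(mutual_best_pairs(it_to_en, en_to_it))
```

In this toy input, "mare" is dropped because its first candidate "island" points back to "isola" rather than to "mare".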
Scientific report: "Automatic Identification of Word Translations from Unrelated English and German Corpora"
... two words ahead of another word B, a second vector for the case that word A is one word ahead of word B, a third vector for A directly following B, and a fourth vector for A following two words ... frequencies: k11 = frequency of common occurrence of word A and word B; k12 = corpus frequency of word A - k11; k21 = corpus frequency of word B - k11; k22 = size of corpus (no. of tokens) - ... of test words. 4 This means that alternative translations of a word were not considered. Another approach, as conducted by Fung & Yee (1998), would be to consider all possible translations...
Uploaded: 08/03/2014, 06:20
Scientific report: "Identifying Word Translations in Non-Parallel Texts"
... algorithms for sentence and word-alignment allow the automatic identification of word translations from parallel texts. This study suggests that the identification of word translations should also ... determine the translations of words from comparable or even unrelated texts. 2 Approach It is assumed that there is a correlation between the co-occurrences of words which are translations ... co-occurrences of German word pairs in the German corpus. As a starting point, word order in the two matrices was chosen such that word n in the German matrix was the translation of word n in the English...
Uploaded: 08/03/2014, 07:20
Scientific report: "Identifying Word Translations from Comparable Corpora Using Latent Topic Models"
... Italian word vectors and English word vectors with TF-IDF scores in the original word-document space (Cos), with aligned documents. Table 1 shows the Precision@1 scores (the percentage of words ... These methods need an initial lexicon of translations, cognates or similar words which are then used to acquire additional translations of the context words. In contrast, our method does not ... TF-IDF scores for the original word-document space (Manning and Schütze, 1999). If we are given a source word $w_i$, $n_{k,S}^{(w_i)}$ denotes the number of times the word $w_i$ is associated with...
Uploaded: 23/03/2014, 16:20
Scientific report: "Detecting Compositionality in Multi-Word Expressions"
... sequences of words that tend to cooccur more frequently than chance and are either idiosyncratic or decomposable into multiple simple words (Baldwin, 2006). Deciding idiomaticity of MWEs is highly ... evaluation set is derived from WordNet in a semi-supervised way. Graph connectivity measures are employed for unsupervised parameter tuning. 1 Introduction and related work Multi-word expressions (MWEs) ... Papers, pages 65–68, Suntec, Singapore, 4 August 2009. © 2009 ACL and AFNLP. Detecting Compositionality in Multi-Word Expressions. Ioannis Korkontzelos, Department of Computer Science, The University...
Uploaded: 23/03/2014, 17:20
Document: "Word of Mouth": a double-edged sword
... inside is also taken out and sold off. 5. Carry it out like a technology. "Word of Mouth": a double-edged sword. The Word of Mouth Marketing (WOMM) method is a form of advertising campaign ... your product or service is of truly "top" quality, then the marketing effort using this Word of Mouth method will be carried out automatically by the customers themselves. At that point, you only need to ... think and simply press Copy and send it out en masse, what would the consequences be? Or another example in which "Word of Mouth" brought a soft-drink brand to its knees when people whispered to one another: "Has anyone heard the story...
Uploaded: 20/01/2014, 15:20
Scientific report: "Confidence Measure for Word Alignment"
... English words following each Chinese word is its literal translation. We find untranslated Chinese and English words (marked with underlines). These spurious words cause significant word alignment ... lexical translation probability of the aligned word pair with the translation probabilities of all the target words given the source word. If a word t occurs N times in the target sentence, for any ... learned based on word alignment. In this paper we introduce a confidence measure for word alignment, which is robust to extra or missing words in the bilingual sentence pairs, as well as word alignment...
Uploaded: 17/03/2014, 01:20
Scientific report: "Mining Parenthetical Translations from the Web by Word Alignment"
... between words, we also compute the $\phi^2$ scores of prefixes and suffixes of Chinese and English words. For both languages, the prefix of a word is defined as the first three bytes of the word ... Competitive Linking to deal with multi-word alignments and takes advantage of word-internal correspondences between transliterated words or morphologically composed words. Finally, through our discussion ... the alignments are restricted word-to-word alignments, which implies that multi-word expressions can only be partially linked at best. 4.1 Dealing with multi-word alignment We made a small...
Uploaded: 17/03/2014, 02:20
Scientific report: "Subword-based Tagging for Confidence-dependent Chinese Word Segmentation"
... characters in a word. The length of a Chinese word has discriminative roles for word composition. For example, single-character words are more apt to form new words than are multiple-character words. ... stands for word and t, for IOB tag. The subscripts are position indicators, where 0 means the current word/tag; −1, −2, the first or second word/tag to the left; 1, 2, the first or second word/tag ... parts: a dictionary-based N-gram word segmentation for segmenting IV words, a maximum entropy subword-based tagger for recognizing OOVs, and a confidence-dependent word disambiguation used for merging...
Uploaded: 17/03/2014, 04:20
Scientific report: "Smaller Alignment Models for Better Translations: Unsupervised Word Alignment with the 0"
Uploaded: 30/03/2014, 17:20
The Marxist theory of the human being and the human question in the country's industrialization and modernization
Uploaded: 07/08/2012, 10:38
The human being from the perspective of philosophy and the human question in the current renewal process
Uploaded: 07/08/2012, 16:25