Scientific report: "Unsupervised Discovery of Rhyme Schemes" pdf

... work on finding rhyme schemes. 3 Finding Stanza Rhyme Schemes A collection of rhyming poetry inevitably contains repetition of rhyming pairs. For example, the word trees will often rhyme with breeze ... $+\, I_{i,r} \prod_{j<i:\, r_i = r_j} \theta_{x_i,x_j} \big/ \sum_{w} \theta_{w,x_i}$ (2) 3 While the number of rhyme schemes of length n is technically the number of partitions of an n-element set (the Be...

Uploaded: 07/03/2014, 22:20

Scientific report: "Unsupervised Discovery of Persian Morphemes" docx

... is the left part of word, RP is the right part of it, Len(p) is the length of part P (number of characters), freq(p) is the frequency of part P in corpus, WN is the number of words (corpus ... length of the corpus. Given a probabilistic model of the corpus, the description length is the sum of the most compact statement of the model expressible in some universal la...

Uploaded: 17/03/2014, 22:20

Scientific report: "Unsupervised Discovery of Domain-Specific Knowledge from Text" pptx

... Research with a Series of Reading Tasks. In Proceedings of LREC 2010. Fabian M. Suchanek, Gjergji Kasneci, and Gerhard Weikum. 2007. Yago: a core of semantic knowledge. In Proceedings of the 16th international ... number of classes per entity is 6.87. The total number of distinct classes for entities is 63,942. This is a huge number to model in our state space. 1 Instead of man...

Uploaded: 23/03/2014, 16:20

Scientific report: "Unsupervised Discovery of Generic Relationships Using Pattern Clusters and its Evaluation by Automatically Generated SAT Analogy Questions" pot

... are often being manifested by several different patterns. In this paper, unlike the majority of studies that use patterns in order to find instances of given relationships, we use sets of patterns ... sets of pairs into several clusters, where each cluster corresponds to one of a known set of relationship types. Their classification setting is thus very different from our unsupe...

Uploaded: 23/03/2014, 17:20

Document: Scientific report: "Unsupervised Segmentation of Chinese Text by Use of Branching Entropy" pdf

... Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pages 428–435, Sydney, July 2006. © 2006 ... Computational Linguistics [Figure: plot with axes labeled "entropy" and "offset"]

Uploaded: 20/02/2014, 12:20

Scientific report: "Unsupervised Learning of Acoustic Sub-word Units" pot

... France emmanuel.dupoux@gmail.com Abstract Accurate unsupervised learning of phonemes of a language directly from speech is demonstrated via an algorithm for joint unsupervised learning of the topology and parameters of a hidden Markov model ... improvement in the efficacy of the SSS algorithm as described in Section 2. It is based on observing that the improvement in the goodness...

Uploaded: 08/03/2014, 01:20

Scientific report: "Unsupervised Learning of Arabic Stemming using a Parallel Corpus" pot

... indicates an improvement of 22-38% in average precision over unstemmed text, and 96% of the performance of the proprietary stemmer above. 1 Introduction Stemming is the process of normalizing word ... two examples use the joint probability of the prefix and suffix, with a smoothing back-off (the product of the individual probabilities). Scoring models of this form proved to...

Uploaded: 08/03/2014, 04:22

Scientific report: "Unsupervised Decomposition of a Document into Authorial Components" pdf

... One of the advantages of using biblical literature is the availability of a great deal of manual annotation. In particular, we are able to identify synsets by exploiting the availability of ... that’s the nature of the clustering algorithm, but in fact are not part of what we might think of as the core of either cluster. Informally, we say that a unit is in the core...

Uploaded: 17/03/2014, 00:20

Scientific report: "Unsupervised Part-of-Speech Tagging Employing Efficient Graph Clustering" ppt

... All of them employ a syntactic version of Harris’ distributional hypothesis: Words of similar parts of speech can be observed in the same syntactic contexts. Contexts in that sense are often ... state-of-the-art approaches, the kind and number of different tags is generated by the method itself. We compute and merge two partitionings of word graphs: one based on context...

Uploaded: 17/03/2014, 04:20

Scientific report: "Automatic Discovery of Named Entity Variants – Grammar-driven Approaches to Non-alphabetical Transliterations" pptx

... proposal has great potential of increasing robustness of future NER work by enabling discovery of new and unknown transliterated NE’s. Our study shows that resolution of transliterated NE variations ... Taiwan shukai@gmail.com Abstract Identification of transliterated names is a particularly difficult task of Named Entity Recognition (NER), especially in the Chinese context....

Uploaded: 17/03/2014, 04:20
