Scientific report: "Unsupervised Multilingual Grammar Induction" doc

Scientific report: "Prototype-Driven Grammar Induction" pdf

... improvements over naive PCFG induction for English and Chinese grammar induction. 1 Introduction There has been a great deal of work on unsupervised grammar induction, with motivations ranging from scientific ... first two, one does not confuse the ability of the system to learn a consistent grammar with its ability to learn the grammar a user has in mind. In this paper, we presen...

Uploaded: 08/03/2014, 02:21

8 328 0
Document: Scientific report: "Unsupervised Discourse Segmentation of Documents with Inherently Parallel Structure" pdf

... our model is as follows:
• Draw the document-level topic proportions $\beta^{(doc)} \sim \mathrm{GEM}(\alpha^{(doc)})$.
• Choose the document-level language model $\phi^{(doc)}_i \sim \mathrm{Dir}(\gamma^{(doc)})$ for $i \in \{1, 2, \ldots\}$.
• Draw ... and each sentence $n$:
  – Draw type $t^{(k)}_n \sim \mathrm{Unif}(Doc, Part)$.
  – If $t^{(k)}_n = Doc$: draw topic $z^{(k)}_n \sim \beta^{(doc)}$; generate words $x^{(k)}_n \sim \mathrm{Mult}(\phi^{(doc)}_{z^{(k)}_n})$.
  – Otherwise: draw t...

Uploaded: 20/02/2014, 04:20

5 376 0
Document: Scientific report: "Unsupervised Translation Induction for Chinese Abbreviations using Monolingual Corpora" ppt

... implementation, the contexts include current sentence, the title of current document, and previous and next sentence in the document. Then, for each ngram (i.e., full) of the current sentence (i.e., ... phenomena, we identify possible abbreviations for full-form phrases. Figure 2 presents the pseudocode of the full-abbreviation relation extraction algorithm. Relation-Extraction(Corpus...

Uploaded: 20/02/2014, 09:20

9 445 0
Scientific report: "Learning Common Grammar from Multilingual Corpus" potx

... languages, and try to extract a common grammar from non-parallel multilingual corpora. For this purpose, we propose a generative model for multilingual grammars that is learned in an unsupervised ... non-annotated multilingual corpus, where $X_l$ is a set of sentences in language $l$, and $L$ is a set of languages. The task is to learn multilingual PCFGs $G = \{G_l\}_{l \in L}$ and a common gr...

Uploaded: 07/03/2014, 22:20

5 326 0
Document: Scientific report: "Unsupervised Topic Modelling for Multi-Party Spoken Discourse" ppt

... USA jbt@mit.edu Abstract We present a method for unsupervised topic modelling which adapts methods used in document classification (Blei et al., 2003; Griffiths and Steyvers, 2004) to unsegmented multi-party ... that the dataset itself provides the other, for example by the explicit separation of individual documents or news stories in a collection). Spoken multi-party meetings pose a diffic...

Uploaded: 20/02/2014, 11:21

8 366 0
Document: Scientific report: "TREE UNIFICATION GRAMMAR" pptx

... relations that are present in the grammar roles. Tree unification grammar (TUG) is a formalism which uses function-argument (FA) specifications as its primary grammar structures. These specifications ... transparently. Tree adjoining grammars (TAGs) (Joshi, Levy and Takahashi, 1975; Vijay-Shanker and Joshi, 1988) possess trees as basic grammar structures, and grammar rules ar...

Uploaded: 21/02/2014, 20:20

9 422 0