Scientific report: "Unsupervised Topic Modelling for Multi-Party Spoken Discourse" ppt
... Association for Computational Linguistics Unsupervised Topic Modelling for Multi-Party Spoken Discourse Matthew Purver CSLI Stanford University Stanford, CA 94305, USA mpurver@stanford.edu Konrad ... simultaneously address the problems of topic segmentation and topic identification: automatically segmenting multi-party meetings into topically coherent segments with perfor...
Uploaded on: 20/02/2014, 11:21
... approach is to transform the abbreviation into its full-form for which the current SMT system knows how to translate. For example, if the baseline system knows that the translation for “ ” is “Hong ... Performance 4.5.1 Precision on Translations of Chinese Full-form Phrases For the relations manually tagged as correct in Section 4.4, we manually look at the top-5 translation...
Uploaded on: 20/02/2014, 09:20
... co-occurrence counts of the two units. Therefore, questions emerge: how good is the sense similarity computed via VSM for two units from parallel corpora? Is it useful for multilingual applications, ... (WSD) techniques in SMT for translation selection. However, WSD techniques for SMT do so indirectly, using source-side context to help select a particular translation for a so...
Uploaded on: 20/02/2014, 04:20
Scientific report: "A Declarative Language for Implementing Dynamic Programs" pptx
... acquire more over time: we intend for it to generalize and encapsulate best practices, and serve as a testbed for new practices. Dyna is now being used for parsing, machine translation, morphological analysis, ... probabilities, which permits heuristic early stopping before the agenda is empty. With Viterbi values, it amounts to uniform-cost search for the best parse, and an item’...
Uploaded on: 20/02/2014, 16:20
Scientific report: "Using Confidence Bands for Parallel Texts Alignment" pptx
... longer afford to waste human time and effort manually building these ever-changing and incomplete databases or design language-specific applications to solve this problem. The need for an automatic ... brackets). For average size texts (e.g. the Written Questions), these words account for about 5% of the total (about 3k words / text). This number varies according to language similar...
Uploaded on: 20/02/2014, 18:20
... requirements for the formalism used. Hence, the indexing technique has a wide spectrum of applications for testing command relations in syntactic analysis. Furthermore, this method can also be used for ... special cases of definitions 1.1 and 1.2 for the property of being a set of maximal projections. Before I formulate the general command definition in a formal way, I will now...
Uploaded on: 22/02/2014, 10:20
Scientific report: "Translation Model Adaptation for Statistical Machine Translation with Monolingual Topic Information" doc
... context information at the sentence level, we adopt the topical context information in our method for the following reasons: (1) the topic information captures the context information beyond the ... specific topical contexts, and the topical context information has a great effect on translation selection. For example, “bank” often occurs in the sentences related to the economy t...
Uploaded on: 19/02/2014, 19:20
Scientific report: "A Statistical Model for Unsupervised and Semi-supervised Transliteration Mining" pptx
... labelled information for training. Our system extracts transliteration pairs in an unsupervised fashion. It is also able to utilize labelled information if available, obtaining improved performance. We ... of the Association for Computational Linguistics, pages 469–477, Jeju, Republic of Korea, 8-14 July 2012. © 2012 Association for Computational Linguistics A Statistical Model for...
Uploaded on: 19/02/2014, 19:20
Scientific report: "Unsupervised Semantic Role Induction with Global Role Ordering" doc
... PARAMETERS: for all predicate p do for all voice vc ∈ {active, passive} do draw θ^order_{p,vc} ∼ Dirichlet(α^order) for all interval I do draw θ^SR_{p,I} ∼ Dirichlet(α^SR) for all adjacency ... the ordering of core roles is important information for SRL systems, and (ii) the intervals bounded by core roles provide good context information for classification of other roles. 4...
Uploaded on: 19/02/2014, 19:20
Scientific report: "Unsupervised Discourse Segmentation of Documents with Inherently Parallel Structure" pdf
... part-specific topic proportions β^(k) ∼ GEM(α^(k)) for k ∈ {1, ..., K}. • Choose the part-specific language models φ^(k)_i ∼ Dir(γ^(k)) for k ∈ {1, ..., K} and i ∈ {1, 2, ...}. • For each ... parts. The formal definition of our model is as follows: • Draw the document-level topic proportions β^(doc) ∼ GEM(α^(doc)). • Choose the document-level language model φ^(doc)_i ∼ Dir(γ...
Uploaded on: 20/02/2014, 04:20