Scientific report: "Experiments with Interactive Question-Answering" doc
... associated with each topic signature more general (a) by replacing words with their (morphological) root form (e.g. wounded with wound, weapons with weapon), (b) by replacing lexemes with their ... determine a binary partition of the document collection into (1) a relevant set of documents (that is, the documents relevant to relation ) and (2) a set of non-relevant documents - . I...
Uploaded: 08/03/2014, 04:22
... $c_0 \cdot \sqrt{d}$ with probability $\frac{2}{3}$, and the lemma follows from the law of large numbers. ✷ Corollary 6: With high probability $\tau = \theta(1)$. Proof: Lemma 5 shows that for a sample $x_1, \ldots, x_N$ with high ... $F^{-1}(\theta(\frac{1}{\sqrt{N}}) + \frac{1}{2})$. According to Theorem 1, for a 10K dataset with a 15% hard case rate, a hard case bias of about 1% cannot be ruled out with 95% confidence. Theorem 1 suggests t...
Uploaded: 30/03/2014, 23:20
... between the signs, with respect to the groups, but there is no difference between the signs with respect to any single common member of the groups. When we are concerned with that particular ... have dealt with 500 rows, but 2000 have been prepared. For the initial sample of 500 a small number of words that we have called “starting words,”* with varying ranges of uses, but wit...
Uploaded: 19/02/2014, 19:20
Document: Scientific report: "Unsupervised Discourse Segmentation of Documents with Inherently Parallel Structure" pdf
... our model is as follows: • Draw the document-level topic proportions $\beta^{(doc)} \sim \mathrm{GEM}(\alpha^{(doc)})$. • Choose the document-level language model $\phi^{(doc)}_i \sim \mathrm{Dir}(\gamma^{(doc)})$ for $i \in \{1, 2, \ldots\}$. • Draw ... and each sentence $n$: – Draw type $t^{(k)}_n \sim \mathrm{Unif}(Doc, Part)$. – If $t^{(k)}_n = Doc$: draw topic $z^{(k)}_n \sim \beta^{(doc)}$; generate words $x^{(k)}_n \sim \mathrm{Mult}(\phi^{(doc)}_{z^{(k)}_n})$ – Otherwise: draw t...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Learning with Unlabeled Data for Text Categorization Using Bootstrapping and Feature Projection Techniques" doc
... words within a document as a context. To extract contexts from a document, we use sliding window techniques (Maarek et al., 1991). The window slides from the first word of the document ... title word with the maximum similarity score with a word W, c max is the category of the title word T max , and T secondmax is the other title word with the second-highest similarity score with...
Uploaded: 20/02/2014, 16:20
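The sliding-window context extraction mentioned in the snippet above can be illustrated with a minimal sketch. The function name `sliding_contexts`, the window size, and the step are assumptions made for illustration; the snippet does not state the values the authors used.

```python
def sliding_contexts(words, window_size=3, step=1):
    """Return overlapping word windows, starting at the first word.

    window_size and step are illustrative defaults, not values from the paper.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    if not words:
        return []
    # A document shorter than the window yields a single context.
    if len(words) <= window_size:
        return [list(words)]
    last_start = len(words) - window_size
    return [list(words[i:i + window_size]) for i in range(0, last_start + 1, step)]


if __name__ == "__main__":
    for context in sliding_contexts("the window slides over the text".split()):
        print(context)
```

Each returned window would then serve as one context for the later similarity computation; how contexts are scored against title words is not recoverable from the truncated snippet.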
Document: Scientific report: "Wysiwym with wider coverage" pdf
... multilingual technical documentation, and the formulation of queries to a database or expert system. In the first case, Wysiwym editing encodes the desired content of the document in an interlingua, ... accompanying demonstration) is to describe these novelties. 2 Editing with simple types [Figure 1: A-box with simple types; labels: take, patient, aspirin, ARG-1, ARG-2] In early Wysiwym applicati...
Uploaded: 20/02/2014, 16:20
Document: Scientific report: "Parsing with generative models of predicate-argument structure" pptx
... derivation tree itself. It assumes that binary trees (with parent category ) have one head child (with category ) and one non-head child (with category ), and that each node has one lexical head . ... projection of constituents. After expanding the node to and , the NP that is co-indexed with woman cannot be unified with the object of saw anymore. These examples have shown that two...
Uploaded: 20/02/2014, 16:20
Document: Scientific report: "Generating with a Grammar Based on Tree Descriptions: a Constraint-Based Approach" pptx
... indices. Specifically, an active edge with category A( )/C(c ) (with c the semantics index of the missing component) is restricted to combine with inactive edges with category C(c ), and vice ... matrix with exactly one lexical entry, then parsing a sentence consists in finding the saturated model(s) with yield such that satisfies the conjunction of lexical tree descriptions...
Uploaded: 20/02/2014, 18:20
Document: Scientific report: "EXPERIMENTS AND PROSPECTS OF EXAMPLE-BASED MACHINE TRANSLATION" ppt
... with detailed comparisons between RBMT and EBMT. 2 BASIC IDEA OF EBMT 2.1 BASIC FLOW In this section, the basic idea of EBMT, which is general and applicable to many phenomena dealt with ... retrieved, EBMT generates the most likely translation with a reliability factor based on distance and frequency. If there is no similar example within the given threshold, EBMT tells the u...
Uploaded: 20/02/2014, 21:20
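The retrieval step described in the snippet above (pick the most likely translation among similar examples, or report that nothing falls within the threshold) might look roughly like the sketch below. The snippet does not define the distance measure or the reliability formula, so the Jaccard word distance, the names `word_distance` and `translate_by_example`, and the default threshold are all assumptions; the function returns the raw distance and frequency rather than guessing how they are combined.

```python
from collections import Counter


def word_distance(a, b):
    """Jaccard distance between word sets (an assumed stand-in measure)."""
    set_a, set_b = set(a.split()), set(b.split())
    if not set_a and not set_b:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(set_a | set_b)


def translate_by_example(source, examples, threshold=0.5):
    """examples: list of (source_phrase, translation) pairs.

    Returns (translation, distance, frequency), or None when no example
    lies within the threshold.
    """
    scored = [(word_distance(source, src), tgt) for src, tgt in examples]
    near = [(dist, tgt) for dist, tgt in scored if dist <= threshold]
    if not near:
        return None  # no similar example within the given threshold
    best = min(dist for dist, _ in near)
    # Among the closest examples, prefer the most frequent translation.
    counts = Counter(tgt for dist, tgt in near if dist == best)
    translation, frequency = counts.most_common(1)[0]
    return translation, best, frequency


if __name__ == "__main__":
    # Toy placeholder data, not taken from the paper.
    toy = [("a b c", "X"), ("a b c", "X"), ("a b d", "Y")]
    print(translate_by_example("a b c", toy))
    print(translate_by_example("q r", toy))
```

A caller would treat a `None` result as the "no similar example" case and could derive a reliability score from the returned distance and frequency in whatever way the full paper specifies.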
Document: Scientific report: "INTEGRATING WITH WORD BOUNDARY IDENTIFICATION SENTENCE UNDERSTANDING" docx
... words. It performs word boundary disambiguation concurrently with sentence understanding. In our investigation, we focus on sentences with clearly ambiguous word boundaries as they constitute ... Crescent, Singapore 0511 Internet: gankw@iscs.nus.sg Abstract Chinese sentences are written with no special delimiters such as spaces to indicate word boundaries. Existing Chinese NLP...
Uploaded: 20/02/2014, 21:20