Scientific report: "Power-Law Distributions for Paraphrases Extracted from Bilingual Corpora" pdf
... 27 2012. © 2012 Association for Computational Linguistics Power-Law Distributions for Paraphrases Extracted from Bilingual Corpora Spyros Martzoukos Christof Monz Informatics Institute, University ... and target phrases) for: (a) Components extracted from P . ‘1-1’ components are not shown. (b) Components extracted from the decomposition of P 0 . In the components emer...
Uploaded: 17/03/2014, 22:20
... do not perform nearly as well as characters. In fact, the "words" variation increases the number of errors dramatically (from 36 to 50 for English-French and from 19 to 35 for English-German). ... genetic code sequences from different species, speech sequences from different speakers, gas chromatograph sequences from different compounds, and geologic sequences f...
Uploaded: 20/02/2014, 21:20
... learning approach, outperforming them by as much as 4–7% on the three data sets for one of the performance metrics. 2 Related Work As mentioned before, our approach differs from the standard approach ... ranker underperforms the perfect ranker by about 5% for BNEWS and 3% for both NPAPER and NWIRE in terms of F-measure, suggesting that the supervised ranker still has room for impr...
Uploaded: 20/02/2014, 15:20
Scientific report: "Variational Inference for Grammar Induction with Prior Knowledge" pdf
... be from a mixture family of distributions. We will use x to denote observable random variables, y to denote hidden structure, and θ to denote the to-be-learned parameters of the model (coming from ... steps, for 1 ≤ i ≤ r: E-step: For each i ∈ {1, …, r}, optimize the bound given λ and q_{i'}(y), i' ∈ {1, …, r} \ {i}, and q_{i'}(θ), i' ∈ {1, …, r}, by selecting a new distribution q_i(y). M...
Uploaded: 08/03/2014, 01:20
Scientific report: "Structured Models for Fine-to-Coarse Sentiment Analysis" pdf
... extensions for more complex situations. For example, longer documents might benefit from an analysis on the paragraph level as well as the sentence and document levels. One possible model for this ... a new value k. For each document label, the k highest scoring labelings were [Figure 4: An extension to the model from Figure 1 incorporating paragraph level analysis.] extr...
Uploaded: 08/03/2014, 02:21
Scientific report: "A Framework for Customizable Generation of Hypertext Presentations" pdf
... (producing the text string) and formatting (determining the formatting marks to insert in the text string). Developing an application to present the information for a given domain is often ... alization, and formatting. PRESENTOR is implemented and is portable cross-platform and cross-domain. It has been used with success in several application domains including weather fo...
Uploaded: 08/03/2014, 05:21
Scientific report: "ENGLISH GENERATOR FOR A CASE-LABELLED DEPENDENCY REPRESENTATION" pdf
... It places no restrictions on the form of the fillers for any slot in a gran~ node. The production rules enforce categorial and ordering restrictions. So, for example, the templates reflect ... verbs; it is also used to cover some other forms of attachment to, and modification of, nouns, for example by determiners (like "a") and even for plural or singular number. I...
Uploaded: 09/03/2014, 01:20
Scientific report: "AN ALGORITHM FOR GENERATION IN UNIFICATION CATEGORIAL GRAMMAR" pdf
... of commutativity or associativity are available for testing logical equivalence¹. One of the ¹Strictly speaking, we test for a very strict form of consistency. Two LFs are considered logically ... arguments. reduce(Sign0, Sign) :- transform(Sign0, Sign1), reduce(Sign1, Sign). transform(Daughter, Mother) :- unary_rule(Mother, Daughter). transform(Sign0, Sign) :- path_value(...
Uploaded: 09/03/2014, 01:20
Scientific report: "Reranking Answers for Definitional QA Using Language Modeling" pdf
... systems, for a given question, a vector is formed consisting of the most frequent co-occurring terms with the question target as the question profile. Candidate answers extracted from a given ... are for people (e.g., Aaron Copland), 10 are for organizations (e.g., Friends of the Earth) and 10 are for other entities (e.g., Quasars). We employ Lemur 6 to retrieve relevant d...
Uploaded: 23/03/2014, 18:20
Scientific report: "Evaluation tool for rule-based anaphora resolution methods" pdf
... unified treatment of the files used for training and of those used for evaluation (which are already annotated in XML format) and it is also useful if the file submitted for analysis to FDG already contains ... algorithms implemented for the workbench enriches this set of data with information relevant to its particular needs. Kennedy and Boguraev (1996), for example, need additional i...
Uploaded: 23/03/2014, 19:20