... more contextualization in the trees results in more accurate models: the simplest model, baseline, has the worst oracle performance; filler-parent and parent-context models, adding similar contextualization ... discuss some important models here. Beyond the models for parsing discussed in section 4, together with motivations for using them or not in our work, another important model for syntactic parsing has ... features and labels for the CRF models (# features and # labels), and the number of rules for PCFG models (# rules). As we can see from the table, the number of rules is the same for the tree representations...
Uploaded: 24/03/2014, 03:20
... network formed by the web hold also for the networks induced by semantic relations in text mining applications, for various semantic classes, semantic relations, and languages. We can therefore apply ... harvests various kinds of semantic information and use this information to improve the performance of tasks such as information extraction (Riloff, 1993), textual entailment (Zanzotto et al., ... log-log scale. We can see that for all networks the high-degree nodes tend to connect to other high-degree ones. This explains why text mining algorithms should focus their effort on high-degree nodes...
Uploaded: 30/03/2014, 21:20
Probabilistic models for unsupervised learning
... Intractability. For many probabilistic models of interest, exact inference is not computationally feasible. This occurs for two (main) reasons: distributions may have complicated forms (non-linearities ... algorithm. Summary: Why probabilistic models?; Factor analysis and beyond; Inference and the EM algorithm; Generative Model for Generative Models; A few models in detail; Approximate ... model complexity penalty (i.e. coding cost for all the parameters of the model) so it can be compared across models. Optimal form of falls out of free-form variational optimisation (i.e. not assumed...
Uploaded: 24/04/2014, 13:20
Conditional random fields: probabilistic models for segmenting and labeling sequence data
... Freitag, D., & Pereira, F. (2000). Maximum entropy Markov models for information extraction and segmentation. Proc. ICML 2000 (pp. 591–598). Stanford, California. Mohri, M. (1997). Finite-state transducers in ... entries, for each y, y , and p_α(· | y, x) can have at most three nonzero entries for each y, x. For each randomly generated model, a sample of 1,000 sequences of length 25 is generated for training ... of conditional models with the global normalization of random field models. Other applications of exponential models in sequence modeling have either attempted to build generative models (Rosenfeld,...
Uploaded: 24/04/2014, 13:20
Probabilistic models for reliability assessment of ageing equipment and maintenance optimization
... Statistics for Transformers K-Means Clustered by First Year of Operation; Table 3.6: Test Statistics for Transformers K-Means Clustered by Loading; Table 3.7: Test Statistics for Transformers ... process model for transformers; Figure A.1 (b): The proposed Markov decision process model for transformers; Figure A.1 (c): The proposed Markov decision process model for transformers ... process model for transformers; Figure A.1 (e): The proposed Markov decision process model for transformers; Figure A.1 (f): The proposed Markov decision process model for transformers ...
Uploaded: 08/09/2015, 19:22
Beyond lexical meaning: probabilistic models for sign language recognition
... Model (MH-HMM) for continuous sign recognition. Just as in the case for the BN, the MH-HMM models the probabilistic relationship between lexical meaning and inflections, and the information streams ... 6.2 Test results on trained models for two Q-level H-HMMs for handshape and orientation components; 6.3 Test results on MH-HMM combining trained models of location, handshape ... manual sign information and NMS, perform accurately in real-time and robustly in arbitrary environments, and allow for maximum user mobility. Such a translation system is not the only use for SL recognition...
Uploaded: 12/09/2015, 09:07
Document, scientific report: "Towards History-based Grammars: Using Richer Models for Probabilistic Parsing"
... parser must have a mechanism for estimating the coherence of an interpretation, both in isolation and in context. Probabilistic language models provide such a mechanism. A probabilistic language model ... very rich probabilistic models of context. In this work, we present a model, the history-based grammar model, which incorporates a very rich model of context, and we describe a technique for estimating ... close to incorporating enough context to disambiguate many cases of ambiguity. A significant reason researchers have limited the contextual information used by their models is the difficulty...
Uploaded: 20/02/2014, 21:20
Document, scientific report: "Web augmentation of language models for continuous speech recognition of SMS text messages"
... 17.0 for English, 18.7 for Spanish, and 22.5 for French. For English, we also created web mixture models with KN smoothing. The error rates were 16.5, 15.9 and 15.7 for the 20 MB, 40 MB and 70 MB models, ... results for the different LMs are given in the table. The results are consistent in the sense that the web mixture models outperform the in-domain models, and augmentation helps more with larger models ... possible to create larger mixture models than in-domain models, there are no in-domain results for the largest model sizes. Especially if large models can be afforded, the perplexity reductions...
Uploaded: 22/02/2014, 02:20
Scientific report: "Data-Defined Kernels for Parse Reranking Derived from Probabilistic Models"
... propose a new method for deriving a kernel from a probabilistic model which is specifically tailored to reranking tasks, and we apply this method to natural language parsing. For the probabilistic model, ... 20 parses from the probabilistic model. This method achieves a significant improvement over the accuracy of the probabilistic model alone. Kernels Derived from Probabilistic Models. In recent years, ... the probabilistic model (i.e. the maximum a posteriori (MAP) classifier). There is guaranteed to be a linear classifier for the derived kernel which performs at least as well as the MAP classifier for...
Uploaded: 08/03/2014, 04:22
A Comparison of Event Models for Naive Bayes Text Classification
... event models an average of 4.8 percentage points better. This domain tends to require smaller vocabularies for best performance. See the figure for the remaining Reuters results. Joachims (1998) found performance ... comparison of event models for different vocabulary sizes on the Yahoo data set. Note that the multi-variate Bernoulli performs best with a small vocabulary and that the multinomial performs best with ... the event models diverge, the assumptions and formulations of each are presented. Consider the task of text classification in a Bayesian learning framework. This approach assumes that the text data...
Uploaded: 16/03/2014, 19:20
Scientific report: "Hybrid Parsing: Using Probabilistic Models as Predictors for a Symbolic Parser"
... trained on huge amounts of plain text. Another reason for considering hybrid approaches is the influence that contextual factors might exert on the process of determining the most plausible sentence ... very great effort for the grammar writer. Also, because many incorrect analyses are allowed, the space of possible trees becomes even larger than it would be for a prescriptive grammar. For the task ... obvious or would require too much effort. This has already been demonstrated for the case of part-of-speech tagging: because contextual cues are very effective in determining the categories of ambiguous...
Uploaded: 31/03/2014, 01:20
Scientific report: "Probabilistic disambiguation models for wide-coverage HPSG parsing"
... inconsistencies in probabilistic models estimated using simple relative frequency (Abney, 1997). Log-linear models are required for credible probabilistic models and are also beneficial for incorporating ... incorporating various overlapping features. This study follows previous studies on the probabilistic models for HPSG. The probability, p(t|s), of producing the parse result t from a given sentence ... instance of a feature forest (Miyao and Tsujii, 2002; Geman and Johnson, 2002). A feature forest is an and/or graph to represent exponentially many tree structures in a packed form. If T(s) is represented...
Uploaded: 31/03/2014, 03:20
Medical report: "Anni 2.0: a multipurpose text-mining tool for the life sciences"
... the literature: a case report of a search for new potential therapeutic uses for thalidomide. J Am Med Inform Assoc 2003, 10:252-259. Srinivasan P: Text mining: generating hypotheses from MEDLINE ... [additional data files: genes used for the prostate cancer use case; overview of published text-mining tools, including Anni 2.0, and their functionality] ... genetically inherited diseases using data mining. Nat Genet 2002, 31:316-319. Jensen LJ, Saric J, Bork P: Literature mining for the biologist: from information retrieval to biological discovery...
Uploaded: 14/08/2014, 08:21
Medical report: "Text-mining and information-retrieval services for molecular biology"
... corpus for bio-textmining. Bioinformatics 2003, 19:i180-i182. 78. Yeh A, Hirschman L, Morgan A: Evaluation of text data mining for database curation: lessons learned from the KDD Challenge Cup. Bioinformatics ... proteins have also been valuable for text-mining tools. Because of the restricted availability of full-text articles, most of the existing text-mining systems for biology are centered on the analysis ... an Internet text-mining tool for biomedical information, with application to gene expression profiling. Biotechniques 1999, 27:1210-1217. 70. MedMiner [http://discover.nci.nih.gov/textmining/main.jsp]...
Uploaded: 14/08/2014, 14:21
TEXT MINING OF ONLINE BOOK REVIEWS FOR NON-TRIVIAL CLUSTERING OF BOOKS AND USERS
... August, 2012. Text Mining of Online Book Reviews for Non-trivial Clustering of Books and Users. Major Professor: Shiaofen Fang. The classification of consumable media by mining relevant text for their ... to be established, before mining the user’s reviews. For the vast majority of our users, there simply was not enough review text to extract any meaningful information. Therefore, we decided to only ... review text, we would be able to identify key characteristics present in that book. By performing this mining process for multiple groups, we hoped to be able to categorize groups into naturally forming...
Uploaded: 24/08/2014, 10:44
Tapping into the Power of Text Mining
... specifically designed for text mining or (as a subgroup of text mining methods and a typical application of visualization methods) information retrieval. In text mining or information retrieval ... = Information Extraction. The first approach assumes that text mining essentially corresponds to information extraction (cf. section 3.3), the extraction of facts from texts. Text Mining = Text ... Definition of Text Mining. Text mining or knowledge discovery from text (KDT), first mentioned in Feldman et al. [FD95], deals with the machine supported analysis of text. It uses techniques...
Uploaded: 31/08/2012, 16:46
Text mining power ACM05
... however. For example, if a user sets up an alert for text mining, s/he will receive several news stories on mining for minerals, and very few that are actually on text mining. Some of the better text ... trends for text mining applications appears to involve the integration of data mining and text mining into a single system. The combination of data and text mining is referred to as “duo-mining ... ClearForest Text Analysis Suite; SAS Text Miner; RetrievalWare; TextAnalyst; LexiQuest, Clementine; Intelligent Miner for Text, TAKMI. Table: List of vendor websites and the names of the text mining...
Uploaded: 31/08/2012, 17:12
Some studies on a probabilistic framework for finding object-oriented information in unstructured data
... the format of the number in standard and Vietnamese format. For example, “123456.78” in standard format is “123,456.78”, which in Vietnamese format is “123.456,78”. We use regular expressions for ... average precision of the probabilistic framework is much higher than the baseline approach, increasing 29% for query 1, 59% for query 2, 55% for query 3, 27% for query and 22% for the last query. With ... is crucial for Internet users to obtain the desired information in an efficient and direct manner. Currently, there is a lot of information available in structured format on the web. For example,...
Uploaded: 23/11/2012, 15:04
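The number-format conversion mentioned in the excerpt above can be illustrated with a minimal sketch. The excerpt only says that regular expressions are used; the function names format_standard and standard_to_vietnamese, and the exact patterns, are assumptions made for illustration and are not taken from the thesis.

```python
import re


def format_standard(raw: str) -> str:
    """Add comma thousands separators: '123456.78' -> '123,456.78'.

    Hypothetical helper; assumes `raw` is plain digits with an
    optional '.' followed by a fractional part.
    """
    integer, _, fraction = raw.partition(".")
    # Insert a comma at every position that is preceded by a digit and
    # followed by one or more complete groups of three digits.
    grouped = re.sub(r"(?<=\d)(?=(?:\d{3})+$)", ",", integer)
    return grouped + ("." + fraction if fraction else "")


def standard_to_vietnamese(text: str) -> str:
    """Swap '.' and ',' so '123,456.78' becomes '123.456,78'."""
    swap = {",": ".", ".": ","}
    return re.sub(r"[.,]", lambda m: swap[m.group()], text)


if __name__ == "__main__":
    standard = format_standard("123456.78")
    print(standard)
    print(standard_to_vietnamese(standard))
```

Swapping both separators in a single pass avoids the usual pitfall of two sequential replacements, where the second replacement would undo the first.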
Data Preparation for Data Mining- P3
... reason to prepare the data set for mining to best expose the information contained in it to the mining tool. Indeed, the whole purpose for mining data is to transform the information content of a data ... Transformations and Difficulties: Variables, Data, and Information. Much of this discussion has pivoted on information: information in a data set, information content of various scales, and transforming ... of various scales, and transforming information. The concept of information is crucial to data mining. It is the very substance enfolded within a data set for which the data set is being mined...
Uploaded: 24/10/2013, 19:15