... k_train(w) denote the number of occurrences of w in the training corpus, and k_test(w) denote the number of occurrences of w in the test corpus. We define the empirical discount of w to be d(w) = k_train(w) ... pervasive phenomenon of growing empirical discounts, except in the case of extremely similar corpora. Growing discounts of this sort were previously suggested...
Uploaded: 20/02/2014, 04:20
... measure of the degree of surprise of a text or corpus given a language model. In our case, we build a language model LM(M_r) for the reference report M_r, and measure the perplexity of the contrastive ... U} n-gram∈C Count where MU is the set of model units, Count_m is the maximum number of n-grams co-occurring in a peer summary and a model unit, and Count is the numbe...
Uploaded: 20/02/2014, 15:20
Document: Scientific report: "An Empirical Investigation of Proposals in Collaborative Dialogues" (docx)
... exist to the set of constraint equations, each varl in the set of equations must have a solution. For example, if 5 instances of sofas are known for varsola, but every assignment of a value to ... degrees of strength) to some future course of action. The only distinction is whether the commitment is conditional on H's agreement (Offer) or not (Commit). With an O...
Uploaded: 20/02/2014, 18:20
Scientific report: "An Empirical Study of Chinese Chunking" (docx)
... study.

Table 2: Information of the CTB4 Corpus

                   Training   Test
Num of Files       728        110
Num of Sentences   9,878      5,290
Num of Words       238,906    165,862
Num of Phrases     141,426    101,449

3 Chinese Chunking 3.1 Models for ... conducted an empirical study of Chinese chunking. We compared the performance of four models: SVMs, CRFs, MBL, and TBL. We also investigated the effects of using differ...
Uploaded: 08/03/2014, 02:21
Scientific report: "An Empirical Study of the Influence of Argument Conciseness on Argument Effectiveness" (docx)
... functions, one for each primitive attribute of the entity. A value tree is a decomposition of the value of an entity into a hierarchy of aspects of the entity, in which the leaves correspond ... User Model Refiner (Figure 4 (3)) to produce a Refined Model of the User's Preferences (Figure 4 (4)). At this point, the stage is set for argument generation. Given the Refi...
Uploaded: 08/03/2014, 05:20
Document: Scientific report: An autoinhibitory effect of the homothorax domain of Meis2 (ppt)
... presence of alternative splicing around the 5′-end of exon 6 of Meis2 and Meis3 was tested by RT-PCR. The positions of molecular mass markers are shown to the left, and the size in base pairs of the ... DNA-binding cofactors [10–12]. Meis2 is a member of the TALE superfamily of HD proteins, which are characterized by the presence of a three amino acid loop insertion between he...
Uploaded: 16/02/2014, 15:20
Document: Scientific report: A kinetic model of the branch-point between the methionine and threonine biosynthesis pathways in Arabidopsis thaliana (doc)
... Pi considerably affects the dynamics of the system. Indeed, in the presence of 10 mM Pi, the model indicates that the catalytic rates of CGS and TS are divided by a factor of 6 and 11, respectively, ... a computer model of the branch-point and validated it in vitro. A satisfying but imperfect agreement of the predictions with the experimental results led us to improve the...
Uploaded: 20/02/2014, 02:21
Document: Scientific report: "An Unsupervised Model for Joint Phrase Alignment and Extraction" (ppt)
... thesis, Massachusetts Institute of Technology. John DeNero and Dan Klein. 2008. The complexity of phrase alignment problems. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics, ... Ignacio Thayer. 2006. Scalable inference and training of context-rich syntactic translation models. In Proceedings of the 44th Annual Meeting of the Associa...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "An Implemented Description of Japanese: The Lexeed Dictionary and the Hinoki Treebank" (ppt)
... description of the most familiar 28,000 words of Japanese. 1 Introduction In this paper we describe the current state of a new lexical resource: the Hinoki treebank. The ultimate goal of our research ... syntactic model is embodied in a grammar, while the semantic model is linked by an ontology. This makes it possible to test the use of similarity and/or semantic class bas...
Uploaded: 20/02/2014, 12:20
Document: Scientific report: "An Evaluation Method of Words Tendency using Decision " (docx)
... classes of the input analysis data (test data). 2. POPULARITY OF WORDS CONSIDERING TIME-SERIES VARIATION 2.1 Stability Classes of the Words: To judge the index of popularity of words ... than that of straight line (2). The value of the slice of regression straight line (1) is also higher than that of regression straight line (2). So, we can decide that the words...
Uploaded: 20/02/2014, 16:20