Scientific report: "Detecting Verbal Participation in Diathesis Alternations" docx
... The minimum description length principle is then used to produce a model and cost for storing the head noun instances from a training corpus at the relevant argument slots. Alternating sub- ... models are obtained using the minimum description length (MDL) principle. MDL selects an appropriate model by comparing potential candidates in terms of the cost of storing the model and ....
Uploaded: 31/03/2014, 04:20
... 1997). RIPPER is a fast rule induction algorithm. It starts with splitting the training set in two. On the basis of one half, it induces rules in a straightforward way (roughly, by trying to maximize coverage ... system, using machine learning techniques. Their approach also raises a number of interesting follow-up questions, some concerned with problem detection, others with th...
Uploaded: 23/03/2014, 19:20
... ichthyosiform erythroderma (NCIE, in layman’s terms translating as a nonblistering, inherited, scaly red skin). An independent study later extended these findings following the identification of 17 ... isomerase (hepoxilin synthase) The genetic findings concerning 12R-LOX and eLOX3 mutations in ichthyosis are intriguing from the biochemical point of view, partly because the eLOX3 pro-...
Uploaded: 07/03/2014, 09:20
Scientific report: "GENERATING PRECONDITION EXPRESSIONS IN INSTRUCTIONAL TEXT" docx
... proven useful in analyzing various kinds of conditions and circumstances that frequently arise in instructions. The analysis involves addressing two related issues: 1. Determining the range ... here the terminating condition). Finally, they may or may not be combined into a single sentence with the expression of their related action (the issue of clause combining). Tex...
Uploaded: 08/03/2014, 07:20
Scientific report: "HANDLING SYNTACTICAL AMBIGUITY IN MACHINE TRANSLATION" docx
... confine to problems to be met with (i), and, more concretely, to such English strings containing V_inf. These strings are mapped onto Bulgarian strings containing da-construction or a verbal ... convenient to distinguish two cases: Case A, in which to each syntactically ambiguous string in English corresponds a syntactically ambiguous string in Bulgarian, and Case B, in wh...
Uploaded: 17/03/2014, 19:21
Scientific report: "Measuring Syntactic Difference in British English" docx
... inserted in a path containing a leaf that is a leftmost sibling and a right bracket is inserted in a path containing a leaf that is a rightmost sibling. The bracket is inserted at the highest ... linguistic knowledge of the area being surveyed. These features, while probably lacking in completeness of coverage, certainly allowed a rough comparison of distance in all linguistic domai...
Uploaded: 31/03/2014, 01:20
Scientific report: "Non-Verbal Cues for Discourse Structure" docx
... intervals (inter-discourse-segment and inter-turn) normalization by number of inter-segment occurrences was sufficient (ps/int); however, for long intervals (intra-discourse segment and intra-turn) ... ps/s ps/int energy inter-turn 0.140 0.268 0.742 intra-turn 0.022 0.738 Initially, we classified data as being inter- or intra-turn. Table 4.1.2 shows that turn structure does have an i...
Uploaded: 31/03/2014, 04:20
Scientific report: "Modeling Filled Pauses in Medical Dictations" docx
... in the CONTROLLED-FP-CORPUS. 4. Models The language modeling process in this study was conducted in two stages. First, a bigram model containing bigram probabilities of FP's in ... of using bigram probabilities for extracting FP distribution from a corpus of hand-transcribed data. The resulting bigram model is used to populate another training corpus that original...
Uploaded: 31/03/2014, 04:20
Document: Scientific report: "Detecting Semantic Equivalence and Information Disparity in Cross-lingual Documents" doc
... portions in P1 that are more informative than portions in P2 (forward entailment). In such cases, the entailing (more informative) portions from P1 have to be translated and migrated to P2 in order ... occur in the original bilingual parallel corpora used for phrase table extraction. Our hypothesis is that the increase in recall obtained from relaxed matches through semantic...
Uploaded: 19/02/2014, 19:20
Document: Scientific report: "Detecting Errors in Part-of-Speech Annotation" docx
... occurring in a hand-cleaned sub-corpus, as well as linguistic intuition. Using this method, Květoň and Oliva (2002) report finding 2661 errors in the NEGRA corpus (containing 396,309 tokens). Interestingly, ... work Considering the significant effort that has been put into obtaining pos-tagged reference corpora in the past decade, there are surprisingly few publications on the issue...
Uploaded: 22/02/2014, 02:20