Scientific report: "Chinese sentence segmentation as comma classification" ppt
... paper, we formulate Chinese sentence segmentation as a comma disambiguation problem. The problem is basically one of separating commas that mark sentence boundaries (such as [2] and [5] in (1)) from ... disambiguate the two types of commas. Commas that mark sentence boundaries delimit loosely coordinated top-level IPs, as illustrated in Figure 1, and commas that don’t cover a...
Uploaded: 07/03/2014, 22:20
... NLP tasks can be modeled as a sequence classification problem, such as POS tagging, chunking, and incremental parsing. A traditional method to solve this problem is to decompose the whole task into ... Random Fields (CRF) as a general solution for sequence classification. CRF models a sequence as an undirected graph, which means that all the individual tasks are solved simultaneousl...
Uploaded: 20/02/2014, 12:20
... defined as: d(dts(x:y)) = min { dts(v:x) dts(x.y), dts(y:w) dts(x:y) } Two basic hypotheses can be easily made as the consequence of context-dependability of dts (note: mi has not ... without any assistance of human. We believe the performance of the existing Chinese segmenters, that is, the ability to deal with segmentation ambiguities and unknown words as well as...
Uploaded: 20/02/2014, 18:20
Scientific report: "Extracting Narrative Timelines as Temporal Dependency Structures" ppt
... extraction models formulate temporal linking as a pair-wise classification task, where each pair of events and/or times is examined and classified as having a temporal relation or not. Early work ... models (SRP and MST) we proposed two baselines. Both are based on the assumption of linear temporal structures of narratives as the temporal ordering process that was evidenced by studies in...
Uploaded: 07/03/2014, 18:20
Document: Scientific report: "The Sentimental Factor: Improving Review Classification via Human-Provided Information" docx
... $s_i$), $i = 1, \ldots, n$, a class’s marginal probability $\pi_k$ can be estimated trivially as the proportion of training samples belonging to the class. Thus the critical aspect of classification by Bayes’ ... Equation 9 has a simple structure: it is a linear function of d. Models that take this form are commonplace in classification. 2.3 Turney’s Classifier as Naive Bayes. Although Naive Bayes cla...
Uploaded: 20/02/2014, 16:20
Document: Scientific report: "Unsupervized Word Segmentation: the case for Mandarin Chinese" doc
... autonomy as $a(x) = \tilde{\delta}_{\leftarrow}h(x) + \tilde{\delta}h_{\rightarrow}(x)$. The more an n-gram is autonomous, the more likely it is to be a word. With this measure, we can redefine the sentence segmentation problem as the ... solely on a separation measure and get high segmentation scores. When maximized over a sentence, this measure captures at least in part what can be modeled by a cohesion measure without...
Uploaded: 19/02/2014, 19:20
Document: Scientific report: Peroxiredoxin II functions as a signal terminator for H2O2-activated phospholipase D1 doc
... other proteins, such as actin, protein kinase N, casein-kinase-2-like serine kinase and amphiphysin, Keywords: hydrogen peroxide; peroxiredoxin II; phosphatidic acid; phospholipase D1; PMA. Correspondence: M. ... none (as shown in Fig. 2) to small amounts (as shown in Fig. 4). In all Fig. 1. PLD activity time course upon PMA stimulation. In vivo transphosphatidylation PLD assays were perform...
Uploaded: 20/02/2014, 01:20
Document: Scientific report: Stem–loop oligonucleotides as tools for labelling double-stranded DNA pdf
... Moreover, as the triple helix was not stable in the absence of intercalator, it was possible to switch easily from conditions where the triple helix was very stable to conditions where it was totally ... the labelled fragment and the plasmid was observed. We were able to detect 250 attomol of plasmid without ambiguity, and a band could be identified with as few as 50 attomol (Fig. 5)....
Uploaded: 20/02/2014, 03:20
Document: Scientific report: "Unsupervised Discourse Segmentation of Documents with Inherently Parallel Structure" pdf
... experiments. 5 Experiment. 5.1 Dataset and setup. Dataset: We apply our model to the ESL podcast dataset (Noh et al., 2010) of 200 episodes, with an average of 17 sentences per story and 80 sentences ... lecture side. However, given that the segmentation of the story was obtained by an automatic sentence splitter, there is no reason to attempt to reproduce this segmentation. Therefo...
Uploaded: 20/02/2014, 04:20
Document: Scientific report: "Joint Word Segmentation and POS Tagging using a Single Perceptron" docx
... being treated as 15. The POS tagging features are based on contextual information from the tag trigram, as well as the neighboring three-word window. To reduce overfitting and increase the decoding ... input sentence, and T is the size of the tag set (T = 1 for pure word segmentation). It worked well for word segmentation alone (Zhang and Clark, 2007), even with an agenda size...
Uploaded: 20/02/2014, 09:20