Scientific report: "Automatic training of lemmatization rules that handle morphological changes in pre-, in- and suffixes alike" docx

... the algorithms and by not subdividing the training words in word classes. 4 Generation of rules and look-up data structure 4.1 Building a rule set from training pairs The training algorithm ... one of the remaining candidates instead. The training pairs that are matched by the pattern of the winning rule become the supporters and non-supporters of that n...

Uploaded: 23/03/2014, 16:21

Scientific report: "Automatic Detection of Grammar Elements that Decrease Readability" pdf

... Automatic Detection of Grammar Elements that Decrease Readability Masatoshi Tsuchiya and Satoshi Sato Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University tsuchiya@pine.kuee.kyoto-u.ac.jp, ... unreadable. The goal of our study is to present tools that help rewriting work of improving readability in Japanese. The first tool is t...

Uploaded: 08/03/2014, 04:22

Scientific report: Dual effect of echinomycin on hypoxia-inducible factor-1 activity under normoxic and hypoxic conditions docx

... Effect of echinomycin on HIF-1a protein level. HepG2 and HeLa cells were incubated for 5 or 16 h under hypoxia or normoxia in the presence or absence of increasing concentrations of echinomycin. ... effect is caused by an increase in HIF-1a protein level, resulting from an increase in the transcription of the HIF-1A gene in the presence of a low concentration of echino...

Uploaded: 16/03/2014, 05:20

Scientific report: "Automatic Detection of Syllable Boundaries Combining the Advantages of Treebank and Bracketed Corpora Training" docx

... samples of words serve as input to the training procedure. In a treebank training step we observe for each rule in the training grammar how often it is used for the training corpus. The grammar rules ... want to examine the influence of the size of the training corpus on the results of the evaluation. Therefore, we split the training corpus into 9 corpora, where...

Uploaded: 23/03/2014, 19:20

Document: Scientific report: "Automatic Extraction of Lexico-Syntactic Patterns for Detection of Negation and Speculation Scopes" pdf

... adaptation of the rule set to new domains and corpora. 1 Motivation Information Extraction (IE) systems often face the problem of distinguishing between affirmed, negated, and speculative information in ... of phrases split into subsets (preceding vs. following their scope) to identify cues using string matching. The cue scopes extend from the cue to the beginning or end of the s...

Uploaded: 20/02/2014, 04:20

Document: Scientific report: "Automatic learning of textual entailments with cross-pair similarities" ppt

... also that the dashed lines connecting placeholders of two texts (hypotheses) indicate structurally equivalent nodes. For instance, the dashed line between 3 and b links the main verbs both in ... the point of view of bag-of-word methods, the pairs (T1, H1) and (T1, H2) have both the same intra-pair similarity since the sentences of T1 and H1 as well as t...

Uploaded: 20/02/2014, 12:20

Document: Scientific report: "Automatic Construction of Polarity-tagged Corpus from HTML Documents" docx

... sentences in table. We can predict that there are opinion sentences in this table, because the left column acts as a header and there are indicators (plus and minus) in that column. 3.3 Linguistic ... Learning the polarity of words There are some works that discuss learning the polarity of words instead of sentences. Hatzivassiloglou and McKeown proposed a method o...

Uploaded: 20/02/2014, 12:20

Document: Scientific report: "Automatic Identification of Pro and Con Reasons in Online Reviews" ppt

... described in that paper. The motivation for including the list of opinion-bearing words as one of our features is that pro and con sentences are quite likely to contain opinion-bearing expressions ... sentence in those reviews collected from each domain with the features described in Section 3.1. We divided the data for training and testing. We then trained our mod...

Uploaded: 20/02/2014, 12:20

Document: Scientific report: "Automatic Evaluation of Sentence-Level Fluency Andrew Mutton" pdf

... noting that human translations are generally good and machine translations poor, that binary training data can be created by taking the human translations as positive training instances and ... for doing this, as we were interested in the level of agreement of intuitive understanding of fluency. We instructed them also that they should evaluate the sentence without co...

Uploaded: 20/02/2014, 12:20

Document: Scientific report: "Automatic Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics" doc

... Evaluation of Machine Translation Quality Using Longest Common Subsequence and Skip-Bigram Statistics Chin-Yew Lin and Franz Josef Och Information Sciences Institute University of Southern ... using bag-of-words instead. Instead of error measures, we can also use accuracy measures that compute similarity between candidate and reference translations in proportion to t...

Uploaded: 20/02/2014, 16:20
