Document (scientific report): "Which Are the Best Features for Automatic Verb Classification" pdf
... lexical information are useful for verb classification. Although neither SCF nor CO performs well on its own, a combination of them proves to be the most informative feature for this task. Other ... with the normal- 1 In our experiment, we only use monosemous verbs from these 48 verb classes. Due to the space limit, we do not list the 48 verb classes. The size of the...
Uploaded: 20/02/2014, 09:20
... least. The words of the top N ranked answers are then added to the gold standard answer. The remaining answers are then rescored according to the new gold standard vector. In practice, we hold the ... exploration of the effects of domain-specific data, we also look at the effect of size on the overall performance. The main intuitive trends are there, i.e., the pe...
Uploaded: 22/02/2014, 02:20
... and, in the case of equal lengths, alphabetically. The division is based on the evolutionary trees (Fig. 4). For the amylopullulanase from Thermococcus hydrothermalis (Q9Y8I8_THEHY), the numbering ... of hypothetical proteins. Therefore, at present it is impossible to form any conclusions for Fig. 3. Active site of the 4-α-glucanotransferase from Thermococcus litoralis. T...
Uploaded: 19/02/2014, 12:20
Document (scientific report): Stem–loop oligonucleotides as tools for labelling double-stranded DNA pdf
... 2). The first type contains G and T in the loop of the AB Fig. 1. Scheme of the padlock structures. The central part of the TFO forms a triplex with the target dsDNA. The 5′- and 3′-part of the ... than the melting temperature of the stem. Other Fig. 7. Scheme for interpretation of the results. The target, the stem–loop TFO and the short hairpin oligonucleotide a...
Uploaded: 20/02/2014, 03:20
Document (scientific report): "A New Dataset and Method for Automatically Grading ESOL Texts" pdf
... order to broaden the space of candidate features suitable for the task. The features used in our experiments are mainly motivated by the fact that lexical and grammatical features should be ... the number of words and is mainly added to balance the effect the length of a script has on other features. Finally, features whose overall frequency is lower than four are...
Uploaded: 20/02/2014, 04:20
Document (scientific report): "Semi-supervised latent variable models for sentence-level sentiment analysis" pdf
... sentences s i are always observed. Note that there are no factors connecting the document node, y d, with the input nodes, s, so that the sentence-level variables, y s, in effect form a bottleneck ... cascaded model in which the predictions at one level are used as input to the other. Figure 1a outlines the factor graph of the corresponding conditional random field...
Uploaded: 20/02/2014, 05:20
Document (scientific report): "Demonstration of the UAM CorpusTool for text and image annotation" docx
... Abstract This paper introduced the main features of the UAM CorpusTool, software for human and semi-automatic annotation of text and images. The demonstration will show how to set ... would rather spend their time annotating text than learning how to use the system. The software is thus designed from the ground up to support typical user workflow, and everythi...
Uploaded: 20/02/2014, 09:20
Document (scientific report): "Learning Source-Target Surface Patterns for Web-based Terminology Translation" pdf
... Replace the tokens for E’s instances with the symbol “E” and the type-III token containing the translation F with the symbol “F”. Note the token denoted as “F” is a maximal string covering the ... system for finding English translations for a given Japanese technical term by searching for mixed Japanese-English texts on the Web. The method involves locating...
Uploaded: 20/02/2014, 15:20
Document (scientific report): "Which words are hard to recognize? Prosodic, lexical, and disfluency factors that increase ASR error rates" ppt
... controlling for other features. However, as with the other prosodic features, predictions of the joint model are dominated by quadratic trends, i.e., predicted error rates are lower for average ... Effects of numeric features on IWER of the SRI system for the no-contractions data set. All feature values were binned, and the average IWER for each bin is plotted, wit...
Uploaded: 20/02/2014, 09:20
Document (scientific report): "How Are Spelling Errors Generated and Corrected?" docx
... in the MTurk task. In order to simplify the interpretation of the log, we disabled the cursor movements and text highlighting via a mouse or the arrow keys in the text box; the workers are therefore forced ... are marching in the snow.” and “Mummy, my feet can’t touch the bottom.”) 3.2 Task interface For logging the keystrokes including the use of backspaces, we de...
Uploaded: 19/02/2014, 19:20