Scientific report: "Large Scale Collocation Data and Their Ap

Document: Scientific report: "Logistic Online Learning Methods and Their Application to Incremental Dependency Parsing" doc

... computes an update to the weight vector based on the current example. The resulting weight vector tends to be overfit to the last few examples; one way to reduce overfitting is to use the average of ... has to be taken to handle the large space of possible outputs. The integration of the cost function into the logistic framework leads to two distinct (although related) updat...

Uploaded: 20/02/2014, 12:20


Document: Scientific report: "Large-Scale Syntactic Language Modeling with Treelets" docx

... w₋₂) to p(w|P, R, r, w₋₁) and then p(w|P, R, r). From there, we back off to p(w|P, R) where R is the sibling immediately to the right of P, then to a raw PCFG p(w|P), and finally to a ... trained on the WSJ and Brown corpora because it does not scale to large amounts of data. We used the Berkeley LM toolkit (Pauls and Klein, 2011), which implements Kneser-Ne...

Uploaded: 19/02/2014, 19:20


Scientific report: "Large-Scale Cross-Document Coreference Using Distributed Inference and Hierarchical Models" pptx

... entity e_s to entity e_t. The set of factors to compute acceptance of the first proposal are factors between l and mentions in e_s and e_t, while the set of factors required to compute acceptance ... inference (Gooi and Allan, 2004). Pedersen et al. (2006) and Purandare and Pedersen (2004) integrate second-order co-occurrence of words into the similarity function. M...

Uploaded: 07/03/2014, 22:20


Document: Scientific report: Plant α-amylase inhibitors and their interaction with insect α-amylases ppt

... strands and a characteristic disulfide topology. It revealed structural similarity to other proteins such as the proteinase inhibitor from Cucurbita maxima [80], charybdotoxin and conotoxins ... inds of α-amylase and proteinase inhibitors, present in seeds and vegetative organs, act to regulate numbers of phytophagous insects [9–11]. α-Amylase inhibitors are attractive cand...

Uploaded: 21/02/2014, 03:20


Scientific report: MicroRNA-143 reduces viability and increases sensitivity to 5-fluorouracil in HCT116 human colorectal cancer cells potx

... clinic for several decades, and its metabolic pathways and cytotoxic modes of action through the inhibition of thymidylate synthase activity and incorporation into RNA and DNA are well known. ... overcome tumour cell resistance to chemotherapy and to increase drug efficacy, thereby minimizing toxic effects, are critically important. The molecular mechanisms of 5-FU cytotoxi...

Uploaded: 07/03/2014, 00:20


Scientific report: Thermodynamic characterization of substrate and inhibitor binding to Trypanosoma brucei 6-phosphogluconate dehydrogenase pot

... required to donate a proton to the C3 carbonyl group of the keto-intermediate to facilitate decarboxylation. Both a base and an acid are needed in the tautomerization of the enediol intermediate to yield ... per se. To better understand why these analogues have high affinity and to help in rational drug design, we undertook a thermodynamic characterization of substrate and...

Uploaded: 07/03/2014, 05:20


Document: Scientific report: "Collecting Highly Parallel Data for Paraphrase Evaluation" doc

... are incentivized to cheat in order to maximize their rewards. To encourage native and fluent contributions, we asked annotators to write the descriptions in the language of their choice. The ... large corpora and no consistent standards for what constitutes a high-quality paraphrase. In addition to the lack of standard datasets for training and testing, there are also n...

Uploaded: 20/02/2014, 04:20


Document: Scientific report: "Web-Scale Features for Full-Scale Parsing" doc

... a given head-argument pair (we consider the words h and a to be indexed, and so features can be sensitive to their order and distance, as is also standard). 2.1 Affinity Features Affinity statistics, ... which contains English n-grams (n = 1 to 5) and their observed frequency counts, generated from nearly 1 trillion word tokens and 95 billion sentences. This corpus allow...

Uploaded: 20/02/2014, 04:20


Document: Scientific report: "Learning with Unlabeled Data for Text Categorization Using Bootstrapping and Feature Projection Techniques" doc

... co-occurred with the title words and keywords: ‘driver’, ‘clutch’, ‘trunk’, and so on. They are words in first-order co-occurrence with the title words and the keywords. To gather more vocabulary, ... we need to find words that are semantically related to a title word, and we define them as keywords of each category. The score of semantic similarity between a title wor...

Uploaded: 20/02/2014, 16:20


Document: Scientific report: "Large linguistically-processed Web corpora for multiple languages" doc

... it. The German function word list contains 124 terms. We require that a minimum of 10 types and 30 tokens appear in a page, with a ratio of function words to total words of at least one quarter. ... language identifier. Finally, we use a stop list of words likely to occur in pornographic Web pages, not out of prudery, but because they tend to contain randomly generated text...

Uploaded: 22/02/2014, 02:20


Scientific report: "Large Scale Collocation Data and Their Application to Japanese Word Processor Technology" potx
