Scientific report: "Improving the IBM Alignment Models Using

Document: Scientific report: "Improving the Scalability of Semi-Markov Conditional Random Fields for Named Entity Recognition" pdf

... compared the result of the recognizers with and without filtering using only 2000 sentences as the training data. Table 5 shows the result of the total system with different filtering thresholds. The ... “O”, which indicates a non-named entity. For 98.0% of the named entities in the training data of the shared task in the 2004 JNLPBA, the label of the preceding ent...

Uploaded: 20/02/2014, 12:20

8 527 0

Scientific report: "Improving the Use of Pseudo-Words for Evaluating Selectional Preferences" docx

... Given two nouns, the noun with the higher co-occurrence count with the verb is chosen. As with the other models, if the two nouns have the same counts, it randomly guesses. The smoothing model ... select the nearest neighbor, the noun with frequency closest to the original. These methods evaluate the range of choices used in previous work. Our experiments compare...

Uploaded: 07/03/2014, 22:20

9 405 0

Scientific report: "Improving the Interpretation of Noun Phrases with Cross-linguistic Information" doc

... provided or they didn’t know what interpretation to give, they had to tag it as “OTHER-SR”, and respectively “OTHER-PP”. The details of the annotation task and the observations drawn from there ... corpora and the contribution the features exemplified in one baseline and six versions of the SVM model. The baseline is defined only for the English part of the NP feature s...

Uploaded: 17/03/2014, 04:20

8 386 0

Scientific report: "Improving the Accuracy of Subcategorizations Acquired from Corpora" pdf

... lexicon of the two grammars into the training SCFs and the testing SCFs. The words in the testing SCFs were included in the acquired SCFs. When I apply my method to the acquired SCFs using the training ... c m. I then initialize the number of clusters k to the number of c m. I finally update the acquired SCFs using the obtained clusters and the confidence v...

Uploaded: 17/03/2014, 06:20

6 317 0

Document: Scientific report: "Guiding Statistical Word Alignment Models With Prior Knowledge" pdf

... associations among all competing hypotheses. The more reasonable constraints are imposed on this process, the easier the task would become. For instance, the most relaxed IBM Model-1, which assumes that ... Constrained Word Alignment Models The framework that we propose to incorporate statistical constraints into word alignment models is generic. It can be applied to c...

Uploaded: 20/02/2014, 12:20

8 495 0

Document: Scientific report: "ON THE SYNTACTIC-SEMANTIC OF BOUND ANAPHORA ANALYSIS" potx

... about the right predictions: It is not the antecedent which must c-command the pronoun, but the quantificational NP, the host operator of the antecedent's discourse referent. In (4), the ... Since the discourse referent provided by a book takes its place at the top level of the restriction part of the every-NP, the indefinite should count as a proper an...

Uploaded: 22/02/2014, 10:20

6 390 0

Scientific report: "Improving On-line Handwritten Recognition using Translation Models in Multimodal Interactive Machine Translation" docx

... direct IBM1 and IBM2 models, and inverse IBM1-inv and IBM2-inv models with the inverse dictionary from Eq. 9. However, a more interesting setup than using language models or translation models ... respect to IBM models. However, linear interpolated models perform the best. In the Spanish test set the result is not better than the IBM2 since the linear para...

Uploaded: 07/03/2014, 22:20

6 314 0

Scientific report: "Determining the Specificity of Terms using Compositional and Contextual Information" pptx

... database using the disease names as queries. Therefore, all the abstracts are related to some of the disease names. The set consists of about 170,000 abstracts (20,000,000 words). The abstracts ... information. The methods are formulated as information-theory-like measures. Because the methods don't use domain-specific information, they are easily adapted to terms o...

Uploaded: 08/03/2014, 04:22

6 385 0

Scientific report: "Improving data-driven dependency parsing using large-scale LFG grammars" pptx

... with the complex label corresponding to the concatenation of the labels from the multiple head attachments (Complex). The converted dependency analysis in Figure 1 shows the f-structure and the ... from the output of the deep grammars we wish to capture as much of the precise, linguistic generalizations embodied in the grammars as possible, whilst keeping with the re-...

Uploaded: 17/03/2014, 02:20

4 279 0

Scientific report: "Resolving Personal Names in Email Using Context Expansion" pot

... from the whole collection and build the identity models. The first step in the resolution process is to determine the list of identity models that are viable candidates as the true referent. For the ... Figure 1. In the network, the observed mention l is distributed conditionally on both the identity c and the name-type t. p(c) is the prior probability of observi...

Uploaded: 08/03/2014, 01:20

9 419 0

Scientific report: "Improving the IBM Alignment Models Using Variational Bayes" pot
