Scientific report: "Relieving The Data Acquisition Bottleneck In Word Sense Disambiguation" ppt
... the target word. 3.1.1 Manually Annotated Training Data The manually-annotated training data is the SENSEVAL2 Lexical Sample training data for the English task (SV2LS Train). 2 This training ... significance. L1 words that translate into the same L2 word are grouped into clusters; SALAAM identifies the appropriate senses for the words in those clusters based on t...
Uploaded: 08/03/2014, 04:22
... Results The Senseval-3 Chinese ambiguous words are taken as the testing set, which includes 20 words, each with 2-8 senses. The data for the ambiguous words are divided into a training set ... decide the position of the ambiguous word starting from the leaf node of the tree structure. Words in the same leaf node are identical or similar in the linguistic funct...
Uploaded: 20/02/2014, 12:20
... reflecting nonlinear patterns in the data distribution, in ways that linear PCA cannot do. Note that in this space, the sense 1 instances are already better separated from sense 2 data points. ... of the principal component vectors obtained by transforming it into (a) a principal component vector z 9 using the linear PCA transform obtained from training, and (b) a no...
Uploaded: 20/02/2014, 16:20
Scientific report: "Domain Adaptation with Active Learning for Word Sense Disambiguation" pdf
... ensuring that the sense priors in the training data follow as closely as possible the sense priors in the evaluation data, while retaining enough training examples. These results highlight the ... possible to the sense priors in WSJ. Assume sense s_i is the predominant sense in the WSJ evaluation data, s_i has a sense prior of p_i in the WSJ dat...
Uploaded: 08/03/2014, 02:21
Scientific report: "SenseRelate::TargetWord – A Generalized Framework for Word Sense Disambiguation" doc
... found in chains that include the target word. 2.4 Sense Inventory After having reduced the context to n words, the Sense Inventory stage determines the possible senses of each of the n words. ... the word senses. In our system, this module first decides the base (uninflected) form of each of the n words. It then retrieves all the senses for each word from...
Uploaded: 08/03/2014, 04:22
Scientific report: "An Empirical Study on Class-based Word Sense Disambiguation" pdf
... is very interesting since the MFC obtains high results due to the way it is defined, since the MFC over the total corpus is assigned if there are no occurrences of the word in the training corpus. ... verbs measure around 2% while increasing the training corpus from 25% to 100% of SemCor. In SE3, the system again only improves the F1 measure around 2% while increasing...
Uploaded: 24/03/2014, 03:20
Scientific report: "Towards the Unsupervised Acquisition of Discourse Relations" pptx
... relation words • for every utterance – create a candidate triple consisting of the event type of the utterance, the relation word, and the event type of the preceding utterance. – add the candidate ... whether the relative frequency of a relation word for a pair of events is significantly higher or lower than the relative frequency of the relation word in the entire...
Uploaded: 16/03/2014, 20:20
... under the control of the monitor which directs and coordinates the activity of the specialists. The monitor starts by examining the first word of the sentence and puts the following information ... 60-90. There is not enough information in the working memory to allow performing this activity and the monitor, therefore, starts another processing step. In the ne...
Uploaded: 01/04/2014, 00:20
Document: Scientific report: "Modeling the Translation of Predicate-Argument Structure for SMT" ppt
... Section 6.1) in our word-aligned bilingual training data. Then we extract all training events for verbal predicates which occur at least 10 times in the training data. A training event for a verbal ... structure in Chinese and its aligned English translation. The bold word in Chinese is the verbal predicate. The subscripts on the Chinese sentence show the indexes of...
Uploaded: 19/02/2014, 19:20
Document: Scientific report: "Learning the Fine-Grained Information Status of Discourse Entities" pptx
... mentioned VP. In this case, the NP will receive the subtype event, as exemplified by the NP the bus in the sentence below, which is triggered by the VP traveling in Miami. (8) We were traveling in Miami, ... dialogues. Reporting results using these two ways of obtaining chains facilitates the comparison of the IS determination results that we can realistically obt...
Uploaded: 22/02/2014, 03:20