robust disambiguation of named entities in text bibtex

Scientific report: "Detecting Semantic Relations between Named Entities in Text Using Contextual Features" pdf

... the antecedents of (zero) pronouns. When there is a (zero) pronoun in the text, noun phrases that are in the previous context of the pronoun are sorted in order of likelihood of being the antecedent. The ... property-sharing constraint in centering. Annual Meeting of Association of Computational Linguistics, pages 200–206. T. Kudo and Y. Matsumoto. 2004. A boosting algorithm for classification of semi-structured ... relations out of 236,142 pairs in the annotated text. We conducted ten-fold cross-validation over 236,142 pairs of NEs so that sets of pairs from a single text were not divided into the training and...

Uploaded: 17/03/2014, 04:20

Document: Scientific report: "Robust Extraction of Named Entity Including Unfamiliar Word" doc

... compare performances of proposed methods and baseline methods. 3 Robust Extraction of Named Entities Including Unfamiliar Words The proposed method of extracting NEs consists of two steps. Its first ... Example of Training Instance for Proposed Method (parsing direction: left to right; feature sets F_{i−2}, F_{i−1}, F_i, F_{i+1}, F_{i+2}; chunk labels c_{i−2}, c_{i−1}, c_i). Figure 1 shows an example of training instance of the ... know. 1 2.2 Chunking of Named Entities It is quite common that the task of extracting Japanese NEs from a sentence is formalized as a chunking problem against a sequence of mor- 1 The organizer of the...

Uploaded: 20/02/2014, 09:20

Scientific report: "Annotating and Recognising Named Entities in Clinical Notes" pot

... baseline system was built using only bag-of-word features from the training corpus. A context-window size of 2 and tag prediction of previous token were used in all experiments. Without using ... acronyms in the notes. This also suggests that this kind of clinical notes are very noisy, and require a considerable amount of effort in pre-processing. Allowing partial matching increased ... overall increase of 2.47 F-score. Partial matching discovered a larger number of matching candidates using a looser matching criteria, therefore decreased in precision with compensation of an increase...

Uploaded: 08/03/2014, 01:20

Scientific report: "Recognizing Named Entities in Tweets" docx

... Linguistics Recognizing Named Entities in Tweets Xiaohua Liu ‡†, Shaodian Zhang ∗§, Furu Wei †, Ming Zhou † ‡ School of Computer Science and Technology Harbin Institute of Technology, Harbin, 150001, China § Department ... mingzhou}@microsoft.com § zhangsd.sjtu@gmail.com Abstract The challenges of Named Entities Recognition (NER) for tweets lie in the insufficient information in a tweet and the unavailability of training data. We propose to combine a K-Nearest Neighbors ... 2010. Annotating named entities in twitter data with crowdsourcing. In CSLDAMT, pages 80–88. Jenny Rose Finkel and Christopher D. Manning. 2009. Nested named entity recognition. In EMNLP, pages 141–150. Jenny...

Uploaded: 17/03/2014, 00:20

Document: Scientific report: "The Effect of Corpus Size in Combining Supervised and Unsupervised Training for Disambiguation" pdf

... structures giving rise to the same set of dependencies (a piece of a tile of a roof of a house vs. a piece of a roof of a tile of a house) cannot be distinguished. We believe that an inverted index ... performance of 87.6% for a training set of about 85% of WSJ. That number is not that far from the 82.8% achieved by Collins’ parser in our experiments when trained on 50% of WSJ. Some of the supervised ... co-training for statistical parsers. In Workshop on the Continuum from Labeled to Unlabeled Data in Machine Learning and Data Mining, ICML. Mark Johnson and Stefan Riezler. 2000. Exploiting...

Uploaded: 20/02/2014, 12:20

Document: Scientific report: "SenseLearner: Word Sense Disambiguation for All Words in Unrestricted Text" doc

... reported during the recent SENSEVAL evaluations. 1 Introduction The task of word sense disambiguation consists of assigning the most appropriate meaning to a polysemous word within a given context. ... advantage of providing larger coverage. In this paper, we present a method for solving the semantic ambiguity of all content words in a text. The algorithm can be thought of as a minimally supervised word ... back-off method using the most frequent sense in WordNet when no training examples were found in SEMCOR. This resulted in significantly higher complexity, with a very large number of models...

Uploaded: 20/02/2014, 15:20

Scientific report: "A Method for Word Sense Disambiguation of Unrestricted Text" potx

... be able to distinguish later the correct sense association from such a small pool. 3 Contextual ranking of word senses Since the Internet contains the largest collection of texts electronically ... results in a value indicating the frequency of occurrences for W1 and the sense of W2. In our experiments we used (Altavista, 1996) since it is one of the most powerful search engines currently ... SemCor was done of course within a larger context, the context of sentence and discourse. By working only with a pair of words we do not take advantage of such a broader context. For example,...

Uploaded: 08/03/2014, 06:20

Means of transport available in Hanoi.DOC

... Hanoi and arrive at Lao Cai train station in the early morning. Trains also leave in the evening from Lao Cai and arrive in the morning in Hanoi. Traveling within Sapa: Sapa is a small town ... History of VietNam trains: The first train in Indochina ran from Sai Gon to Cho Lon on 27 December 1881. In the national communication and transport system, VietNam Railway came into being later ... were installed in soft and hard berth cars, soft seat car as buffer car. Besides, there were automatically boiling kettles in these cars for the individual demand of passengers. The installation of fresh...

Uploaded: 03/09/2012, 09:19

Immobilization of heavy metals in sediment dredged from a seaport by iron bearing materials

... Table of Contents: What is medical waste; Definition of medical wastes; Definition according to Wikipedia; Definition according to Ministry of Public Health of Vietnam; Classification of ... easily by using one of the latest technologies – Biofast – a kind of machine for filtering liquid wastes. Biofast is outstanding because of its effectiveness and efficiency. Biofast operates ... testing of biological products. 1.1.2. Definition according to the Ministry of Public Health of Vietnam: Medical wastes are materials in the form of solids, liquids or gases which are eliminated...

Uploaded: 23/09/2012, 15:38
