
...utomatic Semantic Labelling MeVisLab.pdf




DOCUMENT INFORMATION

Basic information

Format: PDF
Pages: 11
Size: 652.42 KB

Content

Proceedings of the ACL 2010 Student Research Workshop, pages 91–96, Uppsala, Sweden, 13 July 2010. © 2010 Association for Computational Linguistics

Adapting Self-training for Semantic Role Labeling
Rasoul Samad Zadeh Kaljahi
FCSIT, University of Malaya, 50406 Kuala Lumpur, Malaysia. rsk7945@perdana.um.edu.my

Abstract

Supervised semantic role labeling (SRL) systems trained on hand-crafted annotated corpora have recently achieved state-of-the-art performance. However, creating such corpora is tedious and costly, and the resulting corpora are not sufficiently representative of the language. This paper describes part of an ongoing work on applying bootstrapping methods to SRL to deal with this problem. Previous work shows that, due to the complexity of SRL, this task is not straightforward. One major difficulty is the propagation of classification noise into the successive iterations. We address this problem by employing balancing and preselection methods for self-training as a bootstrapping algorithm. The proposed methods achieve improvement over the baseline, which does not use these methods.

1 Introduction

Semantic role labeling has been an active research field of computational linguistics since its introduction by Gildea and Jurafsky (2002). It reveals the event structure encoded in the sentence, which is useful for other NLP tasks and applications such as information extraction, question answering, and machine translation (Surdeanu et al., 2003). Several CoNLL shared tasks (Carreras and Marquez, 2005; Surdeanu et al., 2008) dedicated to semantic role labeling affirm the increasing attention to this field.

One important supportive factor in studying supervised statistical SRL has been the existence of hand-annotated semantic corpora for training SRL systems. FrameNet (Baker et al., 1998) was the first such resource, and it made the emergence of this research field possible through the seminal work of Gildea and Jurafsky (2002). However, this corpus only exemplifies semantic role assignment by selecting some illustrative examples for annotation, which calls into question its suitability for statistical learning. PropBank was started by Kingsbury and Palmer (2002) with the aim of developing a more representative resource of English, appropriate for statistical SRL study. PropBank has been used as the learning framework by the majority of SRL work and by competitions such as the CoNLL shared tasks. However, it only covers newswire text from a specific genre and deals only with verb predicates.

All state-of-the-art SRL systems show a dramatic drop in performance when tested on a new text domain (Punyakanok et al., 2008). This evinces the infeasibility of building a comprehensive hand-crafted corpus of natural language useful for training a robust semantic role labeler. A possible relief for this problem is the use of semi-supervised learning methods together with the huge amount of natural language text available at low cost. Semi-supervised methods compensate for the scarcity of labeled data by utilizing an additional, much larger amount of unlabeled data via a variety of algorithms.

Self-training (Yarowsky, 1995) is a semi-supervised algorithm which has been well studied in the NLP area and has yielded promising results. It iteratively extends its training set by labeling unlabeled data with a base classifier trained on the labeled data.
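To make the loop concrete, here is a minimal self-training sketch in Python. It is not the paper's implementation: the classifier, the confidence-based preselection, and the per-label balancing quota are generic stand-ins for the methods the paper studies, and all parameter values are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, rounds=5, pool=100, per_class=20):
    """Generic self-training loop with preselection and balancing.

    A sketch of the scheme described above, not the paper's system:
    each round, label a pool of unlabeled examples, then grow the
    training set with only the most confident predictions
    (preselection), taken in equal numbers per label (balancing).
    """
    clf = LogisticRegression(max_iter=1000)
    for _ in range(rounds):
        if len(X_unlab) == 0:
            break
        clf.fit(X_lab, y_lab)
        probs = clf.predict_proba(X_unlab[:pool])
        preds = clf.classes_[probs.argmax(axis=1)]   # predicted labels
        conf = probs.max(axis=1)                     # prediction confidence
        chosen = []
        for label in np.unique(preds):               # balancing: quota per label
            idx = np.where(preds == label)[0]
            idx = idx[np.argsort(-conf[idx])][:per_class]  # preselection
            chosen.extend(idx.tolist())
        chosen = np.array(chosen, dtype=int)
        X_lab = np.vstack([X_lab, X_unlab[chosen]])
        y_lab = np.concatenate([y_lab, preds[chosen]])
        X_unlab = np.delete(X_unlab, chosen, axis=0)  # consume labeled pool
    return clf
```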
Although the algorithm is theoretically straightforward, it involves a large number of parameters that are highly influenced by the specifications of the underlying task. Thus, to achieve the best-performing parameter set, or even to investigate the usefulness of these algorithms for a learning task such as SRL, a thorough experiment ...

Automatic Semantic Labelling of Urban Areas using a rule-based approach and realized with MeVisLab
Thijmen Speldekamp, Chris Fries, Caroline Gevaert, Markus Gerke
RESEARCH, April 2015. DOI: 10.13140/RG.2.1.3345.0408. Available at: http://www.researchgate.net/publication/275639040

Abstract

This article is about a project carried out at the ITC faculty of the University of Twente. It describes a rule-based system and the actual implementation of a model to perform automatic semantic labelling of urban areas. After a short introduction and some information about the software used in this project and the subset data, we explain the model, then show the results and assess their accuracy.

Introduction

This report is the result of the individual final assignment of Geo Data Processing & Spatial Information, given at the ITC faculty of the University of Twente. For this assignment we looked at automatic semantic labelling of very high resolution airborne images (a derived ortho image and a height model). The assignment consisted of making a model that performs this automatic semantic labelling in a simple program.

Method

MeVisLab

For this project a program called MeVisLab (http://www.mevislab.de/) was used. This is a program originally intended for medical image processing and visualization, so the program is not used for its original purpose, which brings new insights for the subject but also problems: the help guide in MeVisLab doesn't contain much explanation concerning problems you encounter when producing a model for automatic semantic labelling. The program uses modules, which allows for a simplified way of programming. The modules are connected to each other to pass information from one to another. The network that is created is the model, which in this case performed the automatic semantic labelling.

Subsets

For the project, the TOP (True Ortho Photos) and DSM (Digital Surface Model) files provided by the ISPRS (http://www2.isprs.org/commissions/comm3/wg4/semantic-labeling.html) were used. The 'area 30' subset, created by the ISPRS, was used as the training sample because it had a good division between the four larger classes: impervious surfaces, buildings, trees, and low vegetation. It also allowed for experimenting with the car class, which is the most challenging. We also made use of the 'area 26' subset to set some parameters for the training sample, because water is present in this subset. For the testing stage, the 'area 32' subset was used.

The Model: Image processing

The TOP file is loaded into the model via an image load operator and then split into three bands: green, red, and near infrared. Two new arithmetic modules are then placed. The first one calculates the NDVI values of the image. The formula used in this arithmetic is

NDVI = (NIR − RED) / (NIR + RED + 0.0001)

where NIR stands for near infrared. The term "+ 0.0001" in the denominator prevents errors caused by dividing by zero. The use of the NDVI is based on the fact that green vegetation has a low reflectance in the red region due to chlorophyll and a high reflectance in the NIR due to the cell structure; the reflectance in the NIR is much higher than in the red region, which is a unique characteristic of vegetation. So when you calculate the NDVI, if NIR is much higher than red, you will get a value closer to 1. This is why green vegetation has high NDVI values and all other objects, such as cars and roads, have low NDVI values. Another benefit is that the NDVI looks at the relative difference between NIR and red, so it is also capable of identifying vegetation in shadows, as shown in figure 2. The same holds for the low vegetation in the image.
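As a concrete illustration, here is a minimal NDVI computation over numpy arrays. It mirrors the formula above, including the same 0.0001 stabilizer in the denominator; the band arrays are assumed inputs, not part of the original MeVisLab network.

```python
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """NDVI = (NIR - RED) / (NIR + RED + 0.0001).

    The small constant in the denominator avoids division by zero,
    exactly as in the arithmetic module described above.
    """
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red + 0.0001)
```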
The second arithmetic module calculates the intensity, which will be used to find shadows and very dark areas in the image. The formula used here is a simple average of the three input bands. This data is then sent to the thresholds, which will be discussed later.

Figure 1: The modules used for the NDVI.
Figure 2: The image from the NDVI output.

The DSM file is also loaded via an image load operator. The grey values in this file are encoded as 32-bit float values. However, the DSM files in the data set contain values which do not represent a surface height: for instance, if a slope is present inside an image, a building on one side of the image may be lower (in height) than a road on the other side. For automatic semantic labelling we need to adjust for this.

Figure 3: The modules used for the DSM.

In our model we used normalized DSM data. Using the ...
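The excerpt does not say how the DSM was normalized, so the following is only a hedged sketch of one common approach, not necessarily the report's: estimate the bare-earth surface with a grey-scale morphological opening and subtract it, so heights become relative to the local terrain and the slope problem above disappears. The window size is an assumed parameter.

```python
import numpy as np
from scipy.ndimage import grey_opening

def normalized_dsm(dsm: np.ndarray, window: int = 51) -> np.ndarray:
    """Approximate an nDSM: height above the local ground level.

    A sketch of one common normalization, not necessarily the one
    used in the report: an opening with a window larger than the
    biggest building estimates the bare-earth surface; subtracting
    it removes the terrain slope described above.
    """
    ground = grey_opening(dsm, size=(window, window))
    return dsm - ground
```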
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pages 937–944, Sydney, July 2006. © 2006 Association for Computational Linguistics

Stochastic Discourse Modeling in Spoken Dialogue Systems Using Semantic Dependency Graphs
Jui-Feng Yeh, Chung-Hsien Wu and Mao-Zhu Yang
Department of Computer Science and Information Engineering, National Cheng Kung University, No. 1, Ta-Hsueh Road, Tainan, Taiwan, R.O.C. {jfyeh, chwu, mzyang}@csie.ncku.edu.tw

Abstract

This investigation proposes an approach to modeling the discourse of spoken dialogue using semantic dependency graphs. By characterizing the discourse as a sequence of speech acts, discourse modeling becomes the identification of the speech act sequence. A statistical approach is adopted to model the relations between words in the user's utterance using the semantic dependency graphs. The dependency relation between the headword and the other words in a sentence is detected using the semantic dependency grammar. In order to evaluate the proposed method, a dialogue system for medical service was developed. Experimental results show that the rates for speech act detection and task completion are 95.6% and 85.24%, respectively, and the average number of turns per dialogue is 8.3. Compared with the Bayes' classifier and the partial-pattern-tree based approaches, we obtain 14.9% and 12.47% improvements in accuracy for speech act identification, respectively.

1 Introduction

It is a very tremendous vision of computer technology to communicate with the machine using spoken language (Huang et al., 2001; Allen et al., 2001). Understanding of spontaneous language is arguably the core technology of spoken dialogue systems, since the more accurate the information obtained by the machine (Higashinaka et al., 2004), the higher the possibility of finishing the dialogue task. Practical use of speech act theories in spoken language processing (Stolcke et al., 2000; Walker and Passonneau, 2001; Wu et al., 2004) has given both insight and deeper understanding of verbal communication. Therefore, when considering the whole discourse, the relationship between the speech acts of the dialogue turns becomes extremely important. In the last decade, several practicable dialogue systems (McTear, 2002), such as the air travel information service system, weather forecast system, automatic banking system, automatic train timetable information system, and the Circuit-Fix-It shop system, have been developed to extract the user's semantic entities using semantic frames/slots and conceptual graphs. The dialogue management in these systems is able to handle the dialogue flow efficaciously. However, it is not applicable to more complex applications such as "Type 5: the natural language conversational applications" defined by IBM (Rajesh and Linda, 2004). In Type 5 dialogue systems, it is possible for users to switch directly from one ongoing task to another. In the traditional approaches, the absence of precise speech act identification without discourse analysis will result in failure in task switching. The capability to identify the speech act and to extract the semantic objects by reasoning plays an even more important role for dialogue systems. This research proposes a semantic dependency-based discourse model to capture and share the semantic objects among tasks that switch during a dialogue, for semantic resolution. Besides acoustic speech recognition, natural language understanding is one of the most important research issues, since ...
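To ground the idea of identifying a speech act from dependency-derived word relations, here is a small hedged sketch. It is not the authors' model: it scores each speech act with a smoothed multinomial model over (headword, dependent) pairs, a crude stand-in for the semantic dependency graphs, and the toy counts and act names are invented for illustration.

```python
import math

# Toy statistics: counts of (head, dependent) pairs per speech act.
# Invented for illustration; a real system estimates these from a corpus.
PAIR_COUNTS = {
    "register":  {("want", "appointment"): 8, ("see", "doctor"): 5},
    "query_fee": {("cost", "visit"): 7, ("pay", "how_much"): 6},
}

def identify_speech_act(dep_pairs, alpha=1.0):
    """Pick the speech act maximizing the smoothed log-likelihood of
    the utterance's dependency pairs (stand-in for the paper's graphs)."""
    best_act, best_score = None, -math.inf
    for act, counts in PAIR_COUNTS.items():
        total = sum(counts.values())
        vocab = len(counts) + 1                      # +1 for unseen pairs
        score = sum(
            math.log((counts.get(p, 0) + alpha) / (total + alpha * vocab))
            for p in dep_pairs
        )
        if score > best_score:
            best_act, best_score = act, score
    return best_act

print(identify_speech_act([("want", "appointment"), ("see", "doctor")]))
# -> "register"
```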
Database Description with SDM: A Semantic Database Model
MICHAEL HAMMER, Massachusetts Institute of Technology
DENNIS McLEOD, University of Southern California
ACM Transactions on Database Systems, Vol. 6, No. 3, September 1981, pages 351–386.

SDM is a high-level, semantics-based database description and structuring formalism (database model) for databases. This database model is designed to capture more of the meaning of an application environment than is possible with contemporary database models. An SDM specification describes a database in terms of the kinds of entities that exist in the application environment, the classifications and groupings of those entities, and the structural interconnections among them. SDM provides a collection of high-level modeling primitives to capture the semantics of an application environment. By accommodating derived information in a database structural specification, SDM allows the same information to be viewed in several ways; this makes it possible to directly accommodate the variety of needs and processing requirements typically present in database applications. The design of the present SDM is based on our experience in using a preliminary version of it. SDM is designed to enhance the effectiveness and usability of database systems. An SDM database description can serve as a formal specification and documentation tool for a database; it can provide a basis for supporting a variety of powerful user interface facilities; it can serve as a conceptual database model in the database design process; and it can be used as the database model for a new kind of database management system.

Key Words and Phrases: database management, database models, database semantics, database definition, database modeling, logical database design
CR Categories: 3.73, 3.74, 4.33

1. INTRODUCTION

Every database is a model of some real world system. At all times, the contents of a database are intended to represent a snapshot of the state of an application environment, and each change to the database should reflect an event (or sequence of events) occurring in that environment. Therefore, it is appropriate that the structure of a database mirror the structure of the system that it models. A database whose organization is based on naturally occurring structures will be easier for a database designer to construct and modify than one that forces him to translate the primitives ...
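As a loose illustration, in plain Python rather than SDM's own notation, of the three ingredients the description names (kinds of entities, classifications/groupings, and structural interconnections), here is a tiny sketch; the ship/port domain and all names are invented for this example.

```python
from dataclasses import dataclass

# Not SDM syntax: just the three ideas rendered as plain Python.

@dataclass
class Port:                      # a kind of entity
    name: str

@dataclass
class Ship:                      # another kind of entity
    name: str
    home_port: Port              # structural interconnection to a Port

def tankers(ships: list[Ship], registry: dict[str, str]) -> list[Ship]:
    """A derived grouping: membership computed from entity attributes
    rather than stored separately (derived information)."""
    return [s for s in ships if registry.get(s.name) == "tanker"]

rotterdam = Port("Rotterdam")
fleet = [Ship("Vesta", rotterdam), Ship("Orion", rotterdam)]
print([s.name for s in tankers(fleet, {"Vesta": "tanker"})])  # ['Vesta']
```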
Hindawi Publishing Corporation, EURASIP Journal on Image and Video Processing, Volume 2007, Article ID 18019, 14 pages. doi:10.1155/2007/18019

Research Article
Enabling Seamless Access to Digital Graphical Contents for Visually Impaired Individuals via Semantic-Aware Processing
Zheshen Wang, Xinyu Xu, and Baoxin Li
Department of Computer Science and Engineering, School of Computing and Informatics, Arizona State University, Tempe, AZ 85287-8809, USA
Received 15 January 2007; Revised 2 May 2007; Accepted 20 August 2007. Recommended by Thierry Pun

Vision is one of the main sources through which people obtain information from the world, but unfortunately, visually impaired people are partially or completely deprived of this type of information. With the help of computer technologies, people with visual impairment can independently access digital textual information by using text-to-speech and text-to-Braille software. However, in general, there still exists a major barrier for people who are blind to access graphical information independently in real time without the help of sighted people. In this paper, we propose a novel multilevel and multimodal approach aimed at addressing this challenging and practical problem, with the key idea being semantic-aware visual-to-tactile conversion through semantic image categorization and segmentation, and semantic-driven image simplification. An end-to-end prototype system was built based on the approach. We present the details of the approach and the system, report sample experimental results with realistic data, and compare our approach with current typical practice.

1. INTRODUCTION

Visual information in digital form has become widely available with the prevalence of computers and the Internet. A significant part of digital visual information is conveyed in graphical form (e.g., digital images, maps, diagrams). Sighted people can easily enjoy the added value that graphical contents bring to a digital document. Nowadays, people with visual impairment can independently access digital textual information with the help of text-to-speech and text-to-Braille software (e.g., [1]). Unfortunately, in general, without assistance from sighted people, computer users with visual impairment are partially or completely deprived of the benefit of graphical information, which may be vital to understanding the underlying digital media. For example, there are still no well-accepted systems or technologies that can readily convert any online graphics into tactile forms that can be immediately consumed by a computer user who is blind. In other words, despite the improved access to information enabled by recent computer systems and software, there still exists a major barrier for a computer user who is blind to access digital graphical information independently without the help of sighted people. Our work aims at addressing this challenging problem.

Conventional procedures for producing tactile graphics by a sighted tactile graphics specialist (TGS) are in general time-consuming and labor-intensive. Therefore, it is impractical to expect a computer user who is blind to rely on such procedures for instant help. It is thus desirable to have a self-sufficient method that may deliver a tactile printout on demand whenever the user wants it, independent of the assistance of a sighted person. ...
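To make the simplification step more tangible, here is a hedged toy sketch, not the authors' semantic-aware system: it reduces a grayscale image to the strong region boundaries a tactile printout could emboss, discarding regions below an assumed size that an embossed page could not resolve.

```python
import numpy as np
from scipy import ndimage

def simplify_for_tactile(gray: np.ndarray, thresh: float = 128.0,
                         min_region: int = 200) -> np.ndarray:
    """Toy stand-in for semantic-driven simplification: binarize,
    drop small regions, and keep only region boundaries for embossing."""
    binary = gray > thresh
    labels, n = ndimage.label(binary)                      # connected regions
    sizes = ndimage.sum(binary, labels, range(1, n + 1))   # pixels per region
    keep = np.isin(labels, np.nonzero(sizes >= min_region)[0] + 1)
    eroded = ndimage.binary_erosion(keep)
    return keep & ~eroded                                  # boundary pixels
```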
Hindawi Publishing Corporation, EURASIP Journal on Advances in Signal Processing, Volume 2008, Article ID 693731, 10 pages. doi:10.1155/2008/693731

Research Article
Optimizing Training Set Construction for Video Semantic Classification
Jinhui Tang (1), Xian-Sheng Hua (2), Yan Song (1), Tao Mei (2), and Xiuqing Wu (1)
(1) Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China
(2) Microsoft Research Asia, Beijing 100080, China
Correspondence should be addressed to Jinhui Tang, jhtang@mail.ustc.edu.cn
Received 9 March 2007; Revised 14 September 2007; Accepted 12 November 2007. Recommended by Mark Kahrs

We exploit criteria to optimize training set construction for large-scale video semantic classification. Due to the large gap between low-level features and higher-level semantics, as well as the high diversity of video data, it is difficult to represent the prototypes of semantic concepts with a training set of limited size. In video semantic classification, most learning-based approaches require a large training set to achieve good generalization capacity, in which large amounts of labor-intensive manual labeling are ineluctable. However, it is observed that the generalization capacity of a classifier depends highly on the geometrical distribution of the training data rather than on its size. We argue that a training set which includes most of the temporal and spatial distribution information of the whole data will achieve good performance even if its size is limited. In order to capture the geometrical distribution characteristics of a given video collection, we propose four metrics for constructing/selecting an optimal training set: salience, temporal dispersiveness, spatial dispersiveness, and diversity. Furthermore, based on these metrics, we propose a set of optimization rules to capture the most distribution information of the whole data using a training set of a given size. Experimental results demonstrate that these rules are effective for training set construction in video semantic classification and significantly outperform random training set selection.

1. INTRODUCTION

Video content analysis is an elementary step in mining the semantic information in video collections, in which semantic classification (or annotation) of video segments is essential for further analysis, as well as important for enabling semantic-level video search. For human beings, most semantic concepts are clear and easy to identify, while due to the large gap between semantics and low-level features, the corresponding features generally are not well separated in feature space and are thus difficult for a computer to identify. This is an open difficulty in the computer vision and visual content analysis areas. Generally, learning-based video semantic classification methods use statistical learning algorithms to model the semantic concepts (generative learning) or the discriminations among different concepts (discriminative learning). In [1], a hidden Markov model and dynamic programming are applied to play/break segmentation in soccer videos. Fan et al. [2] classify semantic concepts for surgery education videos by using Bayesian classifiers with an adaptive EM algorithm. Zhong and Chang [3] propose a unified framework for scene ...
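As a hedged sketch of the selection idea, not the paper's four-metric optimization, the following greedily builds a fixed-size training set that spreads over feature space; farthest-point selection stands in for the spatial-dispersiveness and diversity criteria, and temporal dispersiveness would apply the same treatment to timestamps.

```python
import numpy as np

def select_training_set(features: np.ndarray, budget: int) -> list[int]:
    """Greedy farthest-point selection: repeatedly add the sample
    farthest from the already-chosen set, so the selection covers the
    feature distribution instead of clustering in one region."""
    n = len(features)
    chosen = [0]                                   # seed: first sample
    dist = np.linalg.norm(features - features[0], axis=1)
    while len(chosen) < min(budget, n):
        nxt = int(dist.argmax())                   # farthest from chosen set
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(features - features[nxt], axis=1))
    return chosen

# Usage: pick 10 representative samples from 500 feature vectors.
rng = np.random.default_rng(0)
print(select_training_set(rng.normal(size=(500, 8)), budget=10))
```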

Date posted: 04/11/2017, 14:15