A BIOLOGICAL AND BIOINFORMATICS ONTOLOGY FOR SERVICE DISCOVERY AND DATA INTEGRATION


Mindi M. Dippold

Submitted to the faculty of Indiana University in partial fulfillment of the requirements for the degree Master of Science in the School of Informatics, Indiana University

December 2005

Accepted by the Graduate Faculty, Indiana University, in partial fulfillment of the requirements for the degree of Master of Science in Bioinformatics.

Malika Mahoui, PhD
Zina Ben Miled, PhD
Jake Chen, PhD

Acknowledgements

I would like to thank everyone whose support made the completion of this work possible. This work was supported in part by NSF CAREER DBI-DBI0133946 and NSF DBI-0110854. I would also like to thank Dr. Malika Mahoui and Dr. Zina Ben Miled, my advisors, who provided exquisite support and direction throughout the research process. In addition, I would like to thank Dr. Jake Chen, who, together with Dr. Mahoui and Dr. Ben Miled, gave the knowledge and time to serve on my committee. I would also like to thank the members of my research team, Nianhua Li, Bing Yao, and Ali Farooq, who provided great insight and encouragement throughout my work. Finally, I would like to thank my husband, Ryan Dippold, for his great support and patience throughout this process. Without everyone's support I could not have made it this far. Thank you.

Table of Contents

LIST OF FIGURES
ABSTRACT
INTRODUCTION
  1.1 BIOLOGICAL DOMAIN
  1.2 BIOINFORMATICS DOMAIN
  1.3 WHAT IS ONTOLOGY?
    1.3.1 WHAT IS OWL AND WHY USE IT?
  1.4 REASONING
RELATED RESEARCH
  2.1 BACIIS
  2.2 SIBIOS
  2.3 ADDITIONAL RESOURCES
    2.3.1 TAMBIS
    2.3.2 PROTEUS
    2.3.3 BIOMOBY
    2.3.4 MYGRID
  2.4 PROPOSED THESIS WORK
MATERIALS
  3.1 PROTÉGÉ
  3.2 RACERPRO
  3.3 WONDERWEB ONTOLOGY VALIDATOR
PROCEDURES AND INTERVENTIONS
  4.1 LEARNING OWL
  4.2 STUDYING SERVICES AND DATASOURCES
  4.3 ONTOLOGY DESIGN - ADVANCEMENTS OF PREVIOUS WORK
    4.3.1 BIOLOGICAL DOMAIN
    4.3.2 BIOINFORMATICS DOMAIN
      4.3.2.1 SERVICE PROCESS CLASSIFICATION
      4.3.2.2 BIOINFORMATICS RESOURCE CLASSIFICATION
      4.3.2.3 FORMAT CLASSIFICATION
      4.3.2.4 SERVICE ALGORITHM CLASSIFICATION
      4.3.2.5 BIOINFORMATICS TERMS CLASSIFICATION
      4.3.2.6 CHALLENGES
    4.3.3 APPLICATION DOMAIN
    4.3.4 RESTRICTIONS
      4.3.4.1 HAS_INPUT / HAS_OUTPUT
      4.3.4.2 PERFORMS_TASK
      4.3.4.3 USES_ALGORITHM
      4.3.4.4 USES_RESOURCE
  4.4 ANALYSIS / TESTING
  4.5 EXPECTED RESULTS
  4.6 ALTERNATE PLANS
CONCLUSION
DISCUSSION

List of Figures

Figure 1.1 A Schematic Drawing of the Process of Protein Functions and Origin
Figure 1.2 Class Definition in DAML + OIL
Figure 1.3 Class Definition in OWL
Figure 2.1 BACIIS Architecture
Figure 2.2 A Partial Structure of BAO
Figure 2.3 SIBIOS Architecture
Figure 2.4 The myGrid ontology model
Figure 2.5 The myGrid Service Classification model
Figure 3.1 The Protégé OWL Plugin Interface
Figure 4.1 Ontology Domain representation
Figure 4.2 The top level figure of the distributed ontology domain
Figure 4.3 A representation of a few of the top level of the Biological Domain
Figure 4.5 The reorganization of Enzyme Classification
Figure 4.6 The hierarchical relationship of Protein and Protein Classification
Figure 4.7 The Bioinformatics Domain Hierarchy
Figure 4.8 The Diagram hierarchy in the Bioinformatics Ontology
Figure 4.9 Bioinformatics data-format sub tree
Figure 4.10 The Service Algorithm Classification hierarchy
Figure 4.11 The overall depiction of the Bioinformatics Terms classification
Figure 4.12 The Bioinformatics Data Structures classification
Figure 4.13 Bioinformatics format sub tree
Figure 4.14 A depiction of the application domain for SIBIOS
Figure 4.15 The has_input, has_output properties of BLASTN_SERVICE
Figure 4.16 The SIBIOS Service Discovery Query Interface
Figure 4.17 SIBIOS Service browsing capabilities for service discovery
Figure 4.18 Selection panes for Service Discovery
Figure 4.19 SIBIOS Service Discovery System Workflow

Abstract

This project addresses the need for increased expressivity and robustness in the ontologies that already support BACIIS and SIBIOS, two systems for data and service integration in the life sciences. The previous ontology solutions, which served as a global schema and as a facilitator of service discovery, fulfilled the purposes for which they were built, but needed updating in order to keep up with more recent standards in ontology description and utilization as well as increase
the breadth of the domain and the expressivity of the content. Thus, several tasks were undertaken to increase the value of the system ontologies. These include an upgrade to a more recent ontology language standard, increased domain coverage, increased expressivity through additional relationships and hierarchies within the ontology, and easier maintenance through a distributed design.

1 INTRODUCTION

1.1 BIOLOGICAL DOMAIN

Biology is a complex and diverse science that is ever evolving. One aspect of this complexity is the complexity of the living systems that are studied and represented. One example of a process that occurs within a living system is the transcription of DNA into RNA and the translation of that RNA into a protein that performs a particular function. This process has many steps, and many entities are involved in producing the outcome of a specific protein function. As depicted in Figure 1.1, the simple concept of "protein function" emerges from a very complex system. These complex systems must therefore be clearly defined in database systems so that information of interest can be queried precisely. In addition, definitions (i.e. constraints and relationships) need to be included in a well-defined knowledge base from which queries can be built.

Figure 1.1 A Schematic Drawing of the Process of Protein Functions and Origin (diagram labels: Organism, Human, contains, DNA, regulates, Genes, gene regulatory, encode, Protein, function as: receptor, storage, structure, motors, signals, transport, enzymes)

Not only are biological systems and processes complex, but so are the terms that represent such entities. With the onset of advanced technology, data has been ...

2.3 ADDITIONAL RESOURCES

2.3.1 TAMBIS

TAMBIS is another biological database integration system. The TAMBIS Ontology (TaO) closely resembles the BAO in its structure of biological entities [41]. These ontologies also serve similar functions in aiding the creation of queries for biological web database
integration. TaO supports a hierarchical structure that represents biological entities from the most general depiction down to more specific descriptions of structures, functions, processes, and substances [41]. To provide for retrieval and analysis tasks, the TaO has been designed as a broad and shallow ontology that can support a wide variety of queries over generalized biological terminology while still allowing broad modeling of complex entities such as proteins. A subset of the high-level classification of terms provided by TaO includes protein, enzyme, expressed sequence tag, nucleic acid, sequence homology, and taxonomy [41]. For the purposes of the TAMBIS system, the ontology was a good support tool, but as systems progress and the field of bioinformatics demands more support, more expressive ontologies will be necessary.

2.3.2 PROTEUS

Proteus is a service discovery system driven by an ontological knowledge base, much like SIBIOS. The Proteus design includes two types of ontologies: a domain ontology, which describes biological and bioinformatics concepts, and an application ontology, which describes the main bioinformatics applications represented as workflows. This ontology takes a two-tier approach, with an upper ontology level that describes the rationale of applications and software and a lower level that contains specific metadata about installed software and data sources. The ontology divides the bioinformatics resources into two classes. The class of biological data stores is classified based on (1) the kind of biological data, (2) the format in which the data is stored, (3) the type of data stored, and (4) annotations. The class of bioinformatics processes and software components is classified based on (1) the task performed, (2) the steps composing the task, (3) the methodology of the software, (4) the algorithm, (5) the data source, (6) the output, and (7) the software components [37]. This ontology design appears to be well classified and semantically rich, which is also the goal of the SIBIOS ontology. Thus, the recommendations of the Proteus ontology should be examined throughout the design of our bioinformatics ontology for service discovery.

2.3.3 BIOMOBY

The BioMOBY service system was another application that was researched at the time of the original SIBIOS ontology design [36]. Its ontology design was implemented in order to discover service "pipelines" that automate the service discovery process. The MOBY-S registry uses ontologies, represented in OWL DL, to determine the structure of and relationships between datatypes and services in order to invoke service discovery [38]. The data entities are represented as classes in OWL, and properties such as hasa and isa exist. Three ontologies comprise the information necessary to perform service discovery and implementation in BioMOBY: a Namespace ontology, a Service ontology, and an Object ontology. The Namespace ontology includes namespaces for services such as Pfam [56], Swiss-Prot [51], GO [27], etc. The Service ontology contains entities that describe the generic processes necessary to conduct services, i.e. Retrieval, Resolution, Parsing, Analysis, and Registration. The BioMOBY Object ontology classifies the object entities that may be needed to perform a task and makes no distinction between bioinformatics terms and general object types; e.g. Phenotype Description and String are both found as subclasses of Object. The design of distributed ontologies for the BioMOBY application has been hypothesized to support a broad knowledge base in semantic content [38].

2.3.4 MYGRID

myGrid is a bioinformatics ontology that set the basis for the current SIBIOS ontology design [36]. myGrid classifies services based on functionality and focuses on finding similar services by defining the functional class of a service. myGrid service descriptions are divided into two categories: domain metadata and business metadata. The domain metadata describes bioinformatics services, while the business metadata covers data quality, quality of service, cost, geographical location, etc. [39]. The myGrid ontology uses a tier model of services, which includes:

1) the class of a service, such as a protein sequence database;
2) specific examples of an abstract service, such as Swiss-Prot;
3) the instance service description of a specific service, which is provided by the business data and includes facts such as "Swiss-Prot is provided by EMBL" [51]; and
4) the invoked instance description, which describes the instantiated parameters, such as the date and time, for record keeping.

The tier model defines descriptions, classifications, and constraints in order to create a functional bioinformatics ontology. Figure 2.4 represents the overall myGrid ontology model [39]. The myGrid ontology service classification design is represented in Figure 2.5. This design classifies services based on the function that the service provides; for example, BLAST_service is a pairwise_sequence_alignment_service. myGrid not only defines a hierarchical organization but also uses semantic relationships to represent the services. These properties include has_input, has_output, performs_task, uses_resource, and is_function_of (all of which are defined in the SIBIOS ontology), as well as geographicalRadius, provided_by, and qualityRating. Thus, it is important to understand the design specification of the myGrid ontology in order to best enhance the current design and coverage of the SIBIOS ontology.

Figure 2.4 The myGrid ontology model (diagram labels: Upper level ontology, Task ontology, Molecular biology, Informatics ontology, Publishing ontology, Organization ontology, Bioinformatics ontology, Web service ontology)

Figure 2.5 The myGrid Service Classification model (diagram labels: Bioinformatics_service, sequence_alignment service, pairwise_sequence alignment_service, multiple_sequence alignment_service, global_sequence alignment_service, Smith-Waterman sequence_alignment, BLAST_service, BLASTn service, BLASTp service, tBLASTn service, EMBOSS water service, EMBOSS_needle service, EMBOSS_stretcher service, ClustalW_service)

CONCLUSION

In conclusion, the proposed plan of providing a robust biological and bioinformatics ontology for the SIBIOS system has been delivered. This project has matured from a biological ontology of 244 terms and a service discovery ontology of 433 terms into an expressive ontology that abides by the recent W3C standards, with 1958 defined classes and 44 properties defined for restrictions. Given the current state of the field of ontology creation and knowledge representation, it is understood that additional ontologies are available to expand the knowledge contained within the current solution for SIBIOS service discovery and integration. With the design of this ontology, however, additions to its scope and detail will be easily supported. The broad categorization of the terms represented within this project allows for dynamic classification of bioinformatics services and easy integration of additional knowledge sources. The field of knowledge representation has grown quickly within the time frame of this project, and as it continues to grow, further advancements are sure to extend the capabilities of data integration.

Despite the limitations of the tools available for developing this ontology, it has proved to be a vital part of the SIBIOS integration and service discovery system. However, some limitations fell beyond the scope of this project. Because the field of knowledge representation has grown so quickly, the language and representations used within this project have already been superseded by new and improved capabilities. There are therefore some clear recommendations for future enhancements to the ontology and to the systems that support its design, development, and maintenance. As far as design, it may be beneficial to capture business information for
the integration system. This would include date and time stamps to document usage, as well as information concerning the efficiency and reliability of particular services. A further enhancement would be to integrate additional ontologies with ours in the distributed design, which encourages data sharing and reuse. To make it easier to iterate over the ontology, it may also be beneficial to include unique identifiers for ontology terms, much like those used in database design, in order to ensure consistency within the domain. In terms of development and maintenance, a user interface that allows users to make their own ontology specifications is currently being developed and will be useful in support of this knowledge base. Also, translating the current ontology to the OWL-S language for use and development in Web Services may lead to additional functionality and better support for service discovery [58].

Overall, I understand that there is a great need for knowledge representation now that the amount of information available to the general user has become overwhelming. Therefore, systems such as SIBIOS and BACIIS, and their supporting knowledge bases, will continue to prove useful tools for biological research in both academia and industry. The work presented here has a direct impact on the systems it was made to support, as well as an indirect impact on the field of bioinformatics as a whole. This flexible and extensible ontology design has the potential to serve as a base knowledge base for the bioinformatics and biological domains in many cases. It could be integrated with the other ontologies available in the domain in order to provide domain information for a number of domain integration systems. With the continuing advancement of integration systems and knowledge bases in the bioinformatics domain, a fully semantic web based bioinformatics domain could someday be feasible.

References

[1] Zina Ben
Miled, Malika Mahoui, Ning Gao, Lingma Lu, Jessica Chen, Yue He. A Service Discovery Approach in Support of Web Service Integration. BIBE'04, 2004.
[2] Thomas R. Gruber. Toward Principles for the Design of Ontologies Used for Knowledge Sharing. Formal Ontology in Conceptual Analysis and Knowledge Representation. 1993.
[3] Zina Ben Miled, Ning Gao, Omran Bukhres, Lingma Lu, Nianhua Li, Yue He, Malika Mahoui, and Jessica Chen. SIBIOS: A System for the Integration of Bioinformatics Services. Proceedings of the Second International Workshop on Challenges of Large Applications in Distributed Environments. IEEE, 2004.
[4] Christopher Welty. Ontology Research. AI Magazine. 2003.
[5] Jim Hendler, Eric Miller. Web Ontology Language. 2004. http://www.w3.org/2004/OWL/
[6] Natalya F. Noy and Deborah L. McGuinness. "Ontology Development 101: A Guide to Creating Your First Ontology." Stanford Knowledge Systems Laboratory Technical Report KSL-01-05 and Stanford Medical Informatics Technical Report SMI-2001-0880, March 2001.
[7] David Beckett. RDF/XML Syntax Specification (Revised). W3C Recommendation, 10 February 2004. http://www.w3.org/TR/rdf-syntax-grammar/
[8] Dan Connolly, Frank van Harmelen, Ian Horrocks, Deborah L. McGuinness, Peter F. Patel-Schneider, Lynn Andrea Stein. DAML + OIL (March 2001) Reference Description. 2001. http://www.w3.org/TR/daml+oil-reference
[9] Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler, François Yergeau. Extensible Markup Language (XML) 1.0 (Third Edition). W3C Recommendation, 04 February 2004. http://www.w3.org/TR/2004/REC-xml-20040204/
[10] Michael K. Smith, Chris Welty, Deborah L. McGuinness. OWL Web Ontology Language Guide. W3C Recommendation, 10 February 2004. http://www.w3.org/TR/owl-guide/
[11] Deborah L. McGuinness. Conceptual Modeling for Distributed Ontology Environments. In the Proceedings of The Eighth International Conference on Conceptual Structures: Logical, Linguistic, and Computational Issues (ICCS 2000). Darmstadt, Germany, August 14-18, 2000.
[12] A. Maedche, B. Motik, and L. Stojanovic. Managing Multiple Distributed Ontologies in the Semantic Web. VLDB Journal 12 (4), 286-302. 2003.
[13] Holger Knublauch, Ray W. Ferguson, Natalya F. Noy, and Mark A. Musen. The Protégé OWL Plugin: An Open Development Environment for Semantic Web Applications. ISWC 2004.
[14] Javier Gonzalez-Castillo, David Trastour, and Claudio Bartolini. Description Logics for Matchmaking of Services. Hewlett-Packard Company Technical Report. 2001.
[15] Sean B. Palmer. The Semantic Web: An Introduction. 2001. http://infomesh.net/2001/swintro/
[16] Ben Miled, Z., Liu, Y., Li, N., and Bukhres, O. "Distributed databases." Wiley Encyclopedia for Biomedical Engineering, 2004.
[17] Ben Miled, Li, N., Baumgartner, M., Liu, Y. "A Decentralized Approach to the Integration of Life Science Web Databases." Informatica, Vol. 27, No. 1, pages 3-14, 2003.
[18] Ben Miled, Z., Webster, Y., Li, N., Liu, Y. "An Ontology for the Semantic Integration of Life Science Web Databases." International Journal of Cooperative Information Systems, Vol. 12, No. 2, 2003.
[19] Ben Miled, Z., Wang, Y., Li, N., Ben Miled, Z., Bukhres, O., Martin, J., Nayar, A., and Oppelt, R. "BAO, A Biological and Chemical Ontology For Information Integration." Online Journal Bioinformatics, Vol. 1, pages 59-73, 2002.
[20] R.M. MacGregor, H. Chalupsky, and E.R. Melz. PowerLoom Manual, 1997.
[21] Biotech Life Science Dictionary. http://biotech.icmb.utexas.edu/search/dictsearch.html
[22] Dictionary.com. http://dictionary.reference.com/
[23] UniProt Knowledgebase, Swiss-Prot Protein Knowledgebase, TrEMBL Protein Database: User Manual. 2005. http://us.expasy.org/sprot/userman.html
[24] Schomburg I., Chang A., Hofmann O., Ebeling C., Ehrentreich F., Schomburg D. BRENDA: a resource for enzyme data and metabolic information. Trends Biochem Sci. 2002 Jan; 27(1):54-6.
[25] Julia Boguslavsky. New Year Tools for the 'New' Biology. Bio-IT World. 2004. http://www.bio-itworld.com/archive/011204/equipped.html
[26] A. Maedche, B. Motik, L. Stojanovic, R. Studer, and R. Volz. An Infrastructure for Searching, Reusing and Evolving Distributed Ontologies. ACM, 2003.
[27] The Gene Ontology Consortium. http://www.geneontology.org/
[28] European Bioinformatics Institute. http://www.ebi.ac.uk/
[29] National Center for Biotechnology Information. http://www.ncbi.nlm.nih.gov/
[30] Applied Biosystems. http://www.appliedbiosystems.com/
[31] EMBOSS. http://www.ch.embnet.org/EMBOSS/
[32] Merriam-Webster Online. http://www.m-w.com/
[33] ClustalW. http://www.ebi.ac.uk/clustalw/
[34] Scordis P., Flower D.R., Attwood T.K. FingerPRINTScan: intelligent searching of the PRINTS motif database. Bioinformatics. 1999 Oct; 15(10):799-806.
[35] Ben Miled, Z., Gao, N., Bukhres, O., Lu, L., Li, N., He, Y., Mahoui, M., Chen, J. SIBIOS: A System for the Integration of Bioinformatics Services. Proceedings of the Second International Workshop on Challenges of Large Applications in Distributed Environments. IEEE, 2004.
[36] Ben Miled, Z., Mahoui, M., Gao, N., Lu, L., Chen, J., He, Y. A Service Discovery Approach in Support of Web Service Integration. BIBE'04, 2004.
[37] Mario Cannataro, Carmela Comito, Filippo Lo Schiavo, and Pierangelo Veltri. Proteus, a Grid based Problem Solving Experiment for Bioinformatics: Architecture and Experiments. IEEE Computational Intelligence Bulletin. Feb 2004. Vol No
[38] Wilkinson, M.D., Gessler, D., Farmer, A., Stein, L. The BioMOBY Project Explores Open-Source, Simple, Extensible Protocols for Enabling Biological Database Interoperability. Proceedings of the Virtual Conference on Genomics and Bioinformatics.
[39] Chris Wroe, Robert Stevens, Carole Goble, Angus Roberts, Mark Greenwood. "A Suite of DAML+OIL Ontologies to Describe Bioinformatics Web Services Data." URL: http://www.mygrid.org.uk
[40] Robert Stevens, Carole Goble, Ian Horrocks, and Sean Bechhofer. Building a Bioinformatics Ontology Using OIL. IEEE Trans. Inf. Technol. Biomed. 6, 135. 2002.
[41] Patricia G. Baker, Carole A. Goble, Sean Bechhofer, Norman W. Paton, Robert Stevens, Andy Brass. An Ontology for Bioinformatics Applications. Bioinformatics, Vol. 15, No. 6, pp. 510-520. 1999.
[42] The Protégé Ontology Editor and Knowledge Acquisition System. http://protege.stanford.edu/
[43] Holger Knublauch, Ray W. Ferguson, Natalya F. Noy, and Mark A. Musen. The Protégé OWL Plugin: An Open Development Environment for Semantic Web Applications. http://protege.stanford.edu/
[44] Volker Haarslev and Ralf Moller. RACER System Description. 2001.
[45] European Molecular Biology Lab. http://www.embl.org/
[46] BLAST Information Guide. http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/information3.html
[47] WonderWeb OWL Ontology Validator. http://phoebus.cs.man.ac.uk:9999/OWL/Validator
[48] Wang, Xiao Zhang, Da Qing Gu, Tao, Keng Pung, Hung. Ontology Based Context Modeling and Reasoning using OWL.
[49] Lambrix, P. Description Logics. http://www.ida.liu.se/labs/iislab/people/patla/DL/
[50] Gao, Ning. An Approach of Integrating Bioinformatics Services in Distributed Environments. Masters Thesis. 2004.
[51] B. Boeckmann, A. Bairoch, R. Apweiler, M.C. Blatter, A. Estreicher, E. Gasteiger, M.J. Martin, K. Michoud, C. O'Donovan, I. Phan, S. Pilbout, and M. Schneider. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Research, Vol. 31, pp. 365-370. January 2003.
[52] Abel K.J., Xu J., Yin G.Y., Lyons R.H., Meisler M.H., Weber B.L. Mouse Brca1: localization sequence analysis and identification of evolutionarily conserved domains. Human Molecular Genetics. 1995 Dec; 4(12):2265-73.
[53] Michael Wessel and Ralf Moller. A High Performance Semantic Web Query Answering Engine.
[54] The World Wide Web Consortium. http://www.w3.org/
[55] Transeq, EMBOSS tool for translating DNA/RNA into protein. http://www.ebi.ac.uk/emboss/transeq/
[56] A. Bateman, L. Coin, R. Durbin, R.D. Finn, V. Hollich, S. Griffiths-Jones, A. Khanna, M. Marshall, S. Moxon, E.L. Sonnhammer, D.J. Studholme, C. Yeats, and S.R. Eddy. "The Pfam protein families database." Nucleic Acids Res., vol 1, p 32, January 2004.
[57] Ben Miled, Z., Mahoui, M., Dippold, M., Farooq, A., Li, N., and
Bukhres, O. "A Wrapper Induction Application with Knowledge Base Support: A Use Case for Initiation and Maintenance of Wrappers." BIBE 2005.
[58] OWL-S 1.0 Release. http://www.daml.org/services/owl-s/1.0/
[59] Robert Stevens, Carole A. Goble, Sean Bechhofer. Ontology Based Knowledge Representation for Bioinformatics. Ontology Briefings. 2000.
[60] Yue W. Webster. "Ontology for Biological and Chemical Information Integration." MS Thesis, May 2002.

Appendix

Curriculum Vita

MINDI M DIPPOLD
12403 Hurlock Drive
Fishers, IN 46038
cellular: (317) 502-4926
Mindi_dippold@yahoo.com

EDUCATION

August 2003-December 2005
Indiana University-Purdue University of Indianapolis, Indianapolis, IN
Master of Science, Bioinformatics

August 1998-May 2003
The University of Toledo, Toledo, OH
Bachelor of Science, Bioengineering

THESIS

M. Dippold. A Biological and Bioinformatics Ontology for Service Discovery and Data Integration. Master of Science Thesis, Indiana University, December 2005.

PUBLICATIONS

Z. Ben Miled, M. Mahoui, M. Dippold, A. Farooq, N. Li, O. Bukhres. "A Wrapper Induction Application with Knowledge Base Support: A Use Case for Initiation and Maintenance of Wrappers." IEEE BIBE Conference, Minneapolis, MN, October 19-21, 2005.

J. D. Johnson, D. Plenz, J. Beggs, W. Li, M. Meier, K. Owen. "Analysis of spontaneous activity in cultured brain tissue using the discrete wavelet transform." IEEE BIBE Conference, Washington, D.C., March 10-12, 2003.

INDUSTRY EXPERIENCE

June 2003-December 2005
Distributed and Parallel Information Systems Laboratory, IUPUI
Research Assistant
• Designed and implemented an OWL DL Distributed Ontology for Biological Information Systems and Service Discovery
• Implemented developmental tools for the task of service integration via a domain ontology

May 2005-Present
Dow Agrosciences, Indianapolis, IN
Discovery Information Management Contractor
• Develop a relational database system for stable storage and sharing of Vector NTI constructs and annotations
• Develop an automated curator system in PERL to support business rules
• Provide system support and user training for Vector NTI and supporting software

May-August 2004
Dow Agrosciences, Indianapolis, IN
Discovery Information Management Intern
• Configured Spotfire Decision Site for viewing Biological Pathways, and the Gene Ontology browser, in order to provide necessary research information for scientists
• Provided Spotfire user support by troubleshooting and creating user-specified queries to retrieve information from various information databases
• Designed and supported a relational database and PERL scripts to provide automatic curation and storage for experimental constructs

September 2002-May 2003
The Bioinformatics Laboratory, The University of Toledo, Toledo, Ohio
Program Developer, Senior Design Project
• Proposed and prepared a new computational program to analyze neurological data
• Utilized Matlab and C++ to write an efficient, user-friendly computer program that can model large quantities of experimental data

May-August 2002
The Ohio State University/OARDC, Wooster, OH
Plant Pathology Intern
• Completed experiments for a population genetic study of Phytophthora sojae in Ohio
• Performed DNA isolations, gel electrophoresis, and Southern blots, and prepared DNA probes
• Assisted in field evaluations of disease management strategies for Phytophthora sojae
• Evaluated soybean populations for molecular markers associated with a novel resistance gene

Jan-May / Aug-Dec 2001
Depuy, a Johnson & Johnson Company, Warsaw, Indiana
Co-op Engineer
• Biomechanical testing in support of knee, hip, and shoulder development
• Responsibilities included specimen preparation, testing, data acquisition, data analysis, and report writing
• Initiated and developed an updated design for the Depuy Impact Tester
• Operated several different servo-hydraulic test frames, and used a variety of microscopes and photographic equipment

SKILLS

• PERL
• SQL
• Spotfire
• Vector NTI
• XML/RDF syntax

HONORS AND ACTIVITIES

• Society of Women Engineers: Vice President 2000, Secretary 2003
• Biomedical Engineering Society: President 2000, Student Member
• Bioengineering Student and Industrial Advisory Committees
• Geist Fitness and IUPUI Recreational Sports Aerobics Instructor

... the Biological and Bioinformatics Domains, we conclude that a knowledge base that can define and constrain biological data is necessary. Thus, a Biological and Bioinformatics Ontology that ...

This ontology was created in an effort to aid in data integration by resolving incompatibilities in data formats, query formulation, data representations, and data source schema [18]. BAO (BACIIS ontology) ...

Bioinformatics Resources Classification (diagram labels: ... database, databases_and_search_engines, Bioinformatics presenting_media, Bioinformatics_service provider, reference database, sequence database, genome mapping database)
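
The functional service classification discussed for myGrid and SIBIOS (for example, BLAST_service being a pairwise_sequence_alignment_service, with properties such as has_input and has_output) can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not the SIBIOS implementation: the class names come from the myGrid discussion, but every parent link other than the BLAST_service one, the input and output values, and the helper functions `is_a` and `discover` are hypothetical.

```python
# Hypothetical fragment of a service hierarchy: child class -> parent class.
# Only "BLAST_service is a pairwise_sequence_alignment_service" is stated in
# the text; the remaining links are assumptions made for illustration.
PARENTS = {
    "BLASTn_service": "BLAST_service",
    "BLAST_service": "pairwise_sequence_alignment_service",
    "pairwise_sequence_alignment_service": "sequence_alignment_service",
    "ClustalW_service": "multiple_sequence_alignment_service",
    "multiple_sequence_alignment_service": "sequence_alignment_service",
    "sequence_alignment_service": "Bioinformatics_service",
}

# Hypothetical has_input / has_output restrictions attached to service classes.
RESTRICTIONS = {
    "BLASTn_service": {
        "has_input": {"nucleotide_sequence"},
        "has_output": {"pairwise_alignment"},
    },
    "ClustalW_service": {
        "has_input": {"nucleotide_sequence", "protein_sequence"},
        "has_output": {"multiple_alignment"},
    },
}


def is_a(service, ancestor):
    """Return True if `service` equals `ancestor` or is a descendant of it."""
    current = service
    while current is not None:
        if current == ancestor:
            return True
        current = PARENTS.get(current)
    return False


def discover(ancestor, required_input):
    """List services under `ancestor` whose has_input includes `required_input`."""
    return sorted(
        name
        for name, props in RESTRICTIONS.items()
        if is_a(name, ancestor) and required_input in props["has_input"]
    )


if __name__ == "__main__":
    print(discover("sequence_alignment_service", "nucleotide_sequence"))
    print(discover("pairwise_sequence_alignment_service", "nucleotide_sequence"))
```

Walking the parent links is a stand-in for the subsumption reasoning that a description logic reasoner would perform over the OWL class hierarchy; a query for a general class such as sequence_alignment_service then also returns services registered under its more specific subclasses.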
