Web information systems

DOCUMENT INFORMATION

Basic information

Format
Pages	391
File size	4.41 MB

Content

Hershey • London • Melbourne • Singapore
IDEA GROUP PUBLISHING

Web Information Systems

David Taniar
Monash University, Australia

Johanna Wenny Rahayu
La Trobe University, Australia

Acquisitions Editor: Mehdi Khosrow-Pour
Senior Managing Editor: Jan Travers
Managing Editor: Amanda Appicello
Development Editor: Michele Rossi
Copy Editor: Jennifer Wade
Typesetter: Jennifer Wetzel
Cover Design: Lisa Tosheff
Printed at: Yurchak Printing, Inc.

Published in the United States of America by
Idea Group Publishing (an imprint of Idea Group Inc.)
701 E. Chocolate Avenue, Suite 200
Hershey PA 17033 USA
Tel: 717-533-8845
Fax: 717-533-8661
E-mail: cust@idea-group.com
Web site: http://www.idea-group.com

and in the United Kingdom by
Idea Group Publishing (an imprint of Idea Group Inc.)
3 Henrietta Street
Covent Garden
London WC2E 8LU
Tel: 44 20 7240 0856
Fax: 44 20 7379 3313
Web site: http://www.eurospan.co.uk

Copyright © 2004 by Idea Group Inc. All rights reserved. No part of this book may be reproduced in any form or by any means, electronic or mechanical, including photocopying, without written permission from the publisher.

Library of Congress Cataloging-in-Publication Data

Web information systems / David Taniar, editor ; Johanna Wenny Rahayu, editor.
p. cm.
ISBN 1-59140-208-5 (hardcover) -- ISBN 1-59140-283-2 (pbk.) -- ISBN 1-59140-209-3 (ebook)
1. Information technology. 2. World Wide Web. I. Taniar, David. II. Rahayu, Johanna Wenny.
T58.5.W37 2004
004.67'8--dc22
2003022612

British Cataloguing in Publication Data
A Cataloguing in Publication record for this book is available from the British Library.

All work contributed to this book is new, previously-unpublished material. The views expressed in this book are those of the authors, but not necessarily of the publisher.

Web Information Systems
Table of Contents

Preface

SECTION I: WEB INFORMATION MODELING

Chapter I. Story Boarding for Web-Based Information Systems
Roland Kaschek, Massey University, New Zealand
Klaus-Dieter Schewe, Massey University, New Zealand
Catherine Wallace, Massey University, New Zealand
Claire Matthews, Massey University, New Zealand

Chapter II. Structural Media Types in the Development of Data-Intensive Web Information Systems
Klaus-Dieter Schewe, Massey University, New Zealand
Bernhard Thalheim, Brandenburgian Technical University, Germany

Chapter III. Toward a Model of the Migration of Communication Between Media Devices
Richard Hall, La Trobe University, Australia

SECTION II: WEB INFORMATION REPRESENTATION, STORAGE, AND ACCESS

Chapter IV. Storage and Access Control Issues for XML Documents
George Pallis, Aristotle University of Thessaloniki, Greece
Konstantina Stoupa, Aristotle University of Thessaloniki, Greece
Athena Vakali, Aristotle University of Thessaloniki, Greece

Chapter V. Transformation of XML Schema to Object Relational Database
Nathalia Devina Widjaya, Monash University, Australia
David Taniar, Monash University, Australia
Johanna Wenny Rahayu, La Trobe University, Australia

SECTION III: WEB INFORMATION EXTRACTION

Chapter VI. A Practical Approach to the Derivation of a Materialized Ontology View
Carlo Wouters, La Trobe University, Australia
Tharam Dillon, University of Technology Sydney, Australia
Johanna Wenny Rahayu, La Trobe University, Australia
Elizabeth Chang, Curtin University, Australia
Robert Meersman, Vrije Universiteit Brussel, Belgium

Chapter VII. Web Information Extraction via Web Views
Wee Keong Ng, Nanyang Technological University, Singapore
Zehua Liu, Nanyang Technological University, Singapore
Zhao Li, Nanyang Technological University, Singapore
Ee Peng Lim, Nanyang Technological University, Singapore

SECTION IV: WEB INFORMATION MINING

Chapter VIII. A Knowledge-Based Web Information System for the Fusion of Distributed Classifiers
Grigorios Tsoumakas, Aristotle University of Thessaloniki, Greece
Nick Bassiliades, Aristotle University of Thessaloniki, Greece
Ioannis Vlahavas, Aristotle University of Thessaloniki, Greece

Chapter IX. Indexing Techniques for Web Access Logs
Yannis Manolopoulos, Aristotle University of Thessaloniki, Greece
Mikolaj Morzy, Poznan University of Technology, Poland
Tadeusz Morzy, Poznan University of Technology, Poland
Alexandros Nanopoulos, Aristotle University of Thessaloniki, Greece
Marek Wojciechowski, Poznan University of Technology, Poland
Maciej Zakrzewicz, Poznan University of Technology, Poland

Chapter X. Traversal Pattern Mining in Web Usage Data
Yongqiao Xiao, Georgia College & State University, USA
Jenq-Foung (J.F.) Yao, Georgia College & State University, USA

About the Authors
Index

Preface

The chapters of this book provide an excellent overview of current research and development activities in the area of web information systems. They supply an in-depth description of different issues in web information systems areas, including web-based information modeling, migration between different media types, web information mining, and web information extraction issues. Each chapter is accompanied by examples or case studies to show the applicability of the described techniques or methodologies. The book is a reference for the state of the art in web information systems, including how information on the Web can be retrieved effectively and efficiently. Furthermore, this book will help the reader to gain an understanding of web-based information representation using XML, XML documents storage and access, and web views.

Following our call for chapters in 2002, we received 29 chapter proposals. Each proposed chapter was carefully reviewed and, eventually, 10 chapters were accepted for inclusion in this book.
This book brought together academic and industrial researchers and practitioners from many different countries, including Singapore, Greece, Poland, Germany, New Zealand, the US and Australia. Their research and industrial experience, which are reflected in their work, will certainly allow readers to gain an in-depth knowledge of their areas of expertise.

INTENDED AUDIENCE

Web Information Systems is intended for individuals who want to enhance their knowledge of issues relating to modeling, representing, storing and mining information on the Web. Specifically, these individuals could include:

• Computer Science and Information Systems researchers: All of the topics in this book will give researchers insight into new developments in the web information systems area. The topics on mining web usage data and mining data across geographically distributed environments will give researchers an understanding of the state of the art of web data mining. Information Systems researchers will also find this book useful, as it includes some topics in the area of information extraction and ontology, as well as techniques for modeling information on the Web.
• Computer Science and Information Systems students and teachers: The chapters in this book are grouped into four categories to cover important issues in the area. This will allow students and teachers in the web information systems field to effectively use the appropriate materials as reference or reading resources. These categories are: (i) information modeling; (ii) information representation, storage and access; (iii) information extraction; and (iv) information mining. The chapters also provide examples to guide students and lecturers in using the methods or implementing the techniques.
• Web-based Application Developers: The chapters in this book can be used by web application developers as a reference for choosing the correct techniques for modeling and design, for migrating from other media devices, and for efficiently handling huge amounts of web information. For example, the practical techniques for materialized ontology view, as well as the techniques for deriving customized web views, can be used to manage large web-based application development more effectively.
• General community interested in current issues of web information systems: The general computer (IT) community will benefit from this book through its technical, as well as practical, overview of the area.

PREREQUISITES

The book as a whole is meant for anyone professionally interested in the development of web information systems and who, in some way, wants to gain an understanding of how the issues in modeling and implementation of a web-based information system differ from the traditional development techniques. Each chapter may be studied separately or in conjunction with other chapters. As each chapter may cover topics different from other chapters, the prerequisites for each may vary. However, we assume the readers have at least a basic knowledge of:

• Web representation techniques, including HTML, XML, XML Schema, and DTD.
• Web information repositories, including XML databases, relational databases, and object-relational databases.

OVERVIEW OF WEB INFORMATION SYSTEMS

The era of web technology has enabled information and application sharing through the Internet. The large amount of information on the Internet, the large number of users, and the complexity of the application and information types have introduced new areas whereby these issues are explored and addressed. Many of the existing information systems techniques and methods for data sharing, modeling, and system implementation are no longer effective and, therefore, need major adjustment.
This has stimulated the emergence of web information systems.

First, the way we model a web information system requires techniques different from those of existing information system modeling. The fact that a web-based system is accessed by users with numerous (often unpredictable) characteristics, through different end-user devices and with different internet connectivity, has introduced high complexity in defining a suitable modeling technique that will be capable and flexible enough to facilitate the above aspects. Another issue related to designing a web information system is how to migrate existing information between different media types, in particular from another media type to a web-based system.

The second important issue in web information systems is how information can be represented in a uniform way to allow communication and interchange between different information sites. XML has been widely used as a standard for representing semi-structured information on the Web. Currently, one of the major issues in XML-based information systems is how to efficiently store and access the XML documents. The fact that relational databases have been widely used and tested has encouraged many practitioners in this area to use them as XML data repositories. On the other hand, native XML database systems are currently being developed and tested as an alternative for storing XML documents.

The third issue relates to the way we can efficiently retrieve and use the large amount of information on the Web. Moreover, very often users have interest in a specific aspect of the information only, and, therefore, downloading or accessing the whole information repository will be inefficient. In this book, techniques for deriving a materialized ontology view and for generating a personalized web view are presented.

Another issue, which is also closely related to data retrieval, is data or information mining.
Data mining is discovering new information or patterns which were previously unknown in the collection of information. With web accesses, mining over web data becomes important. Web mining is basically a means for discovering patterns in user accesses and behaviour on the Web. This information will be particularly useful in building a web portal which is tailored for each user. New techniques for mining distributed information on the Web are needed.

All of these issues need to be addressed, particularly in order to understand the benefits and features that web information systems bring, and this book is written for this purpose.

ORGANIZATION OF THIS BOOK

The book is divided into four major sections:

I. Web information modeling
II. Web information representation, storage, and access
III. Web information extraction
IV. Web information mining

Each section, in turn, is divided into several chapters:

Section I focuses on the topic of modeling web information. This section includes chapters on general web information system modeling and data-intensive web system modeling techniques. This section also incorporates a chapter which describes a model to allow information migration and preservation between different media types. Section I consists of three chapters.

Chapter 1, contributed by Roland Kaschek, Klaus-Dieter Schewe, Catherine Wallace, and Claire Matthews, proposes a holistic usage-centered approach for analyzing requirements and conceptual modeling of web information systems (WIS) using a technique called story boarding. In this approach, WIS is conceptualized as an open information system whereby the linguistic, communicational and methodological aspects are described. The WIS is viewed from a business perspective, and this perspective is used to distinguish WIS from IS in general.

Chapter 2, presented by Klaus-Dieter Schewe and Bernhard Thalheim, discusses a conceptual modeling approach for the design of data-intensive WIS.
In this chapter, the notion of media type, which is a view on an underlying database schema that allows transformation of database contents into a [...]

... issues in web information systems is still difficult to find. Most books are about either web technology focusing on developing websites, HTML, and possibly XML, or covering very specific areas only, such as information retrieval and semantic web. This book is, therefore, different in that it covers an extensive range of topics, including web information conceptual modeling, XML related issues, web information extraction, and web mining.

This book gives a good overview of important aspects in the development of web information systems. The four major aspects covering web information modeling, storage, extraction and mining, described in four sections of this book respectively, form the fundamental flow of web information system development cycle. The uniqueness of this ... of web information system development.

The chapters on web conceptual modeling demonstrate techniques for capturing the complex requirements of web information systems in general, and then followed by more specific techniques for the development of data intensive web information systems. These chapters are more specialized than the topics on traditional information system modeling normally found in information systems publications.

Web information extraction is described using the concept of views, both at the interface level using web views as well as at the underlying ontology level using ontology views. Both concepts are described in a practical manner, with case studies and examples throughout the chapters.

The chapters on information mining are solely focused on mining web information, ...

... seeking of information, readers interested in conceptual modeling of web information systems and how to migrate existing information in a different media type to the Web may read Chapters 1, 2, and 3. Readers interested in looking at XML and the recent development for efficiently storing and accessing XML documents may study the chapters in the second section. Readers who are interested in web-based information ...

SECTION I
WEB INFORMATION MODELING

Chapter I
Story Boarding for Web-Based Information Systems

Roland Kaschek, Massey University, New Zealand
Klaus-Dieter Schewe, Massey University, New Zealand
Catherine Wallace, Massey University, New Zealand
Claire Matthews, Massey University, New Zealand

ABSTRACT

The present chapter is about story boarding for web information systems (WIS). It is a holistic usage-centered approach for analyzing requirements and conceptual modeling of WIS. We conceptualize web information systems as open information systems and discuss them from a business point of view, including their linguistic, communicational and methodological foundations. To illustrate story boarding, we discuss a simple application example.

INTRODUCTION

Information ...

... data in, copies, or deletes data from a collection.
• A disseminating operation imports data from or exports data to an InS.

Web Information Systems as Open Information Systems

ISs traditionally were closed systems in three respects. Exchange of data with other than the foreseen systems was not easy to establish, if possible at all. Only staff of the organization running the IS were given access to it ...

... semantic web (Berners-Lee et al., 2001), or a web of ideas (Cherry, 2002), or on new business models due to the impact of information technology (see Kaner, 2002; Kaschek et al., 2003a). Since long information systems ...

Copyright © 2004, Idea Group Inc. Copying or distributing in print or electronic forms without written permission of Idea Group Inc. is prohibited.

... measures and activities which make the WIS effective. Clearly, for a more complete understanding of ISs, their maintenance, i.e., the measures and activities required for keeping them efficient, as well as deployment, i.e., actually making them effective, and retirement, i.e., the measures to make them stop being effective, would need to be discussed.

Information Systems

Date posted: 19/10/2013, 03:15
