Data Mining: Special Issue in Annals of Information Systems (Stahlbock, Crone, Lessmann), 2009-11-23




Annals of Information Systems

Series Editors
Ramesh Sharda, Oklahoma State University, Stillwater, OK, USA
Stefan Voß, University of Hamburg, Hamburg, Germany

For further volumes: http://www.springer.com/series/7573

Robert Stahlbock · Sven F. Crone · Stefan Lessmann
Editors

Data Mining
Special Issue in Annals of Information Systems

Editors

Robert Stahlbock
Department of Business Administration
University of Hamburg
Institute of Information Systems
Von-Melle-Park
20146 Hamburg
Germany
stahlbock@econ.uni-hamburg.de

Sven F. Crone
Department of Management Science
Lancaster University Management School
Lancaster
United Kingdom LA1 4YX
sven.f.crone@crone.de

Stefan Lessmann
Department of Business Administration
University of Hamburg
Institute of Information Systems
Von-Melle-Park
20146 Hamburg
Germany
lessmann@econ.uni-hamburg.de

ISSN 1934-3221, e-ISSN 1934-3213
ISBN 978-1-4419-1279-4, e-ISBN 978-1-4419-1280-0
DOI 10.1007/978-1-4419-1280-0
Springer New York Dordrecht Heidelberg London
Library of Congress Control Number: 2009910538
© Springer Science+Business Media, LLC 2010

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.

The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)


Preface

Data mining has experienced an explosion of interest over the last two decades. It has been established as a sound paradigm to derive knowledge from large, heterogeneous streams of data, often using computationally intensive methods. It continues to attract researchers from multiple disciplines, including computer sciences, statistics, operations research, information systems, and management science. Successful applications include domains as diverse as corporate planning, medical decision making, bioinformatics, web mining, text recognition, speech recognition, and image recognition, as well as various corporate planning problems such as customer churn prediction, target selection for direct marketing, and credit scoring.

Research in information systems equally reflects this inter- and multidisciplinary approach. Information systems research exceeds the software and hardware systems that support data-intensive applications, analyzing the systems of individuals, data, and all manual or automated activities that process the data and information in a given organization.

The Annals of Information Systems devotes a special issue to topics at the intersection of information systems and data mining in order to explore the synergies between information systems and data mining. This issue serves as a follow-up to the International Conference on Data Mining (DMIN), which is held annually in conjunction with WORLDCOMP, the largest annual gathering of researchers in computer science, computer engineering, and applied computing. The special issue includes significantly extended versions of prior DMIN submissions as well as contributions without DMIN context.

We would like to thank the members of the DMIN program committee. Their support was essential for the quality of the conferences and for attracting interesting contributions. We wish to express our sincere gratitude and respect toward Hamid R. Arabnia, general chair of all WORLDCOMP conferences, for his excellent and tireless support, organization, and coordination of all WORLDCOMP conferences. Moreover, we would like to thank the two series editors, Ramesh Sharda and Stefan Voß, for their valuable advice, support, and encouragement. We are grateful for the pleasant cooperation with Neil Levine, Carolyn Ford, and Matthew Amboy from Springer and their professional support in publishing this volume. In addition, we would like to thank the reviewers for their time and their thoughtful reviews. Finally, we would like to thank all authors who submitted their work for consideration to this focused issue. Their contributions made this special issue possible.

Robert Stahlbock, Hamburg, Germany
Stefan Lessmann, Hamburg, Germany
Sven F. Crone, Lancaster, UK


Contents

1 Data Mining and Information Systems: Quo Vadis?
  Robert Stahlbock, Stefan Lessmann, and Sven F. Crone
  1.1 Introduction
  1.2 Special Issues in Data Mining
    1.2.1 Confirmatory Data Analysis
    1.2.2 Knowledge Discovery from Supervised Learning
    1.2.3 Classification Analysis
    1.2.4 Hybrid Data Mining Procedures
    1.2.5 Web Mining
    1.2.6 Privacy-Preserving Data Mining
  1.3 Conclusion and Outlook
  References

Part I Confirmatory Data Analysis

2 Response-Based Segmentation Using Finite Mixture Partial Least Squares
  Christian M. Ringle, Marko Sarstedt, and Erik A. Mooi
  2.1 Introduction
    2.1.1 On the Use of PLS Path Modeling
    2.1.2 Problem Statement
    2.1.3 Objectives and Organization
  2.2 Partial Least Squares Path Modeling
  2.3 Finite Mixture Partial Least Squares Segmentation
    2.3.1 Foundations
    2.3.2 Methodology
    2.3.3 Systematic Application of FIMIX-PLS
  2.4 Application of FIMIX-PLS
    2.4.1 On Measuring Customer Satisfaction
    2.4.2 Data and Measures
    2.4.3 Data Analysis and Results
  2.5 Summary and Conclusion
  References

Part II Knowledge Discovery from Supervised Learning

3 Building Acceptable Classification Models
  David Martens and Bart Baesens
  3.1 Introduction
  3.2 Comprehensibility of Classification Models
    3.2.1 Measuring Comprehensibility
    3.2.2 Obtaining Comprehensible Classification Models
  3.3 Justifiability of Classification Models
    3.3.1 Taxonomy of Constraints
    3.3.2 Monotonicity Constraint
    3.3.3 Measuring Justifiability
    3.3.4 Obtaining Justifiable Classification Models
  3.4 Conclusion
  References

4 Mining Interesting Rules Without Support Requirement: A General Universal Existential Upward Closure Property
  Yannick Le Bras, Philippe Lenca, and Stéphane Lallich
  4.1 Introduction
  4.2 State of the Art
  4.3 An Algorithmic Property of Confidence
    4.3.1 On UEUC Framework
    4.3.2 The UEUC Property
    4.3.3 An Efficient Pruning Algorithm
    4.3.4 Generalizing the UEUC Property
  4.4 A Framework for the Study of Measures
    4.4.1 Adapted Functions of Measure
    4.4.2 Expression of a Set of Measures of Ddconf
  4.5 Conditions for GUEUC
    4.5.1 A Sufficient Condition
    4.5.2 A Necessary Condition
    4.5.3 Classification of the Measures
  4.6 Conclusion
  References

5 Classification Techniques and Error Control in Logic Mining
  Giovanni Felici, Bruno Simeone, and Vincenzo Spinelli
  5.1 Introduction
  5.2 Brief Introduction to Box Clustering
  5.3 BC-Based Classifier
  5.4 Best Choice of a Box System
  5.5 Bi-criterion Procedure for BC-Based Classifier
  5.6 Examples
    5.6.1 The Data Sets
    5.6.2 Experimental Results with BC
    5.6.3 Comparison with Decision Trees
  5.7 Conclusions
  References

Part III Classification Analysis

6 An Extended Study of the Discriminant Random Forest
  Tracy D. Lemmond, Barry Y. Chen, Andrew O. Hatch, and William G. Hanley
  6.1 Introduction
  6.2 Random Forests
  6.3 Discriminant Random Forests
    6.3.1 Linear Discriminant Analysis
    6.3.2 The Discriminant Random Forest Methodology
  6.4 DRF and RF: An Empirical Study
    6.4.1 Hidden Signal Detection
    6.4.2 Radiation Detection
    6.4.3 Significance of Empirical Results
    6.4.4 Small Samples and Early Stopping
    6.4.5 Expected Cost
  6.5 Conclusions
  References

7 Prediction with the SVM Using Test Point Margins
  Süreyya Özöğür-Akyüz, Zakria Hussain, and John Shawe-Taylor
  7.1 Introduction
  7.2 Methods
  7.3 Data Set Description
  7.4 Results
  7.5 Discussion and Future Work
  References

8 Effects of Oversampling Versus Cost-Sensitive Learning for Bayesian and SVM Classifiers
  Alexander Liu, Cheryl Martin, Brian La Cour, and Joydeep Ghosh
  8.1 Introduction
  8.2 Resampling
    8.2.1 Random Oversampling
    8.2.2 Generative Oversampling
  8.3 Cost-Sensitive Learning
  8.4 Related Work
  8.5 A Theoretical Analysis of Oversampling Versus Cost-Sensitive Learning
  ...

... while maintaining the efficiency and feasibility of a rule mining algorithm. The field of logic mining represents a special form of classification rule mining in the sense that the resulting models ...

Date posted: 23/10/2019, 15:16
