
Commercial Data Mining: Processing, Analysis and Modeling for Predictive Analytics Projects (The Savvy Manager's Guide), Nettleton, 2014-03-05


DOCUMENT INFORMATION

Basic information

Format
Pages 361
Size 14.56 MB

Content

Commercial Data Mining

Contents

Data Representation
  Introduction
  Basic Data Representation
  Basic Data Types
  Representation, Comparison, and Processing of Variables of Different Types
  Normalization of the Values of a Variable
  Distribution of the Values of a Variable
  Atypical Values Outliers
  Advanced Data Representation
  Hierarchical Data
  Semantic Networks
  Graph Data
  Fuzzy Data

Data Quality
  Introduction
  Examples of Typical Data Problems
  Content Errors in the Data
  Relevance and Reliability
  Quantitative Evaluation of the Data Quality
  Data Extraction and Data Quality – Common Mistakes and How to Avoid Them
  Data Extraction
  Derived Data
  Summary of Data Extraction Example
  How Data Entry and Data Creation May Affect Data Quality

Selection of Variables and Factor Derivation
  Introduction
  Selection from the Available Data
  Statistical Techniques for Evaluating a Set of Input Variables
  Summary of the Approach of Selecting from the Available Data
  Reverse Engineering: Selection by Considering the Desired Result
  Statistical Techniques for Evaluating and Selecting Input Variables for a Specific Business Objective
  Transforming Numerical Variables into Ordinal Categorical Variables
  Customer Segmentation
  Summary of the Reverse Engineering Approach
  Data Mining Approaches to Selecting Variables
  Rule Induction
  Neural Networks
  Clustering
  Packaged Solutions: Preselecting Specific Variables for a Given Business Sector
  The FAMS (Fraud and Abuse Management) System
  Summary

Data Sampling and Partitioning
  Introduction
  Sampling for Data Reduction
  Partitioning the Data Based on Business Criteria
  Issues Related to Sampling
  Sampling versus Big Data

Data Analysis
  Introduction
  Visualization
  Associations
  Clustering and Segmentation
  Segmentation and Visualization
  Analysis of Transactional Sequences
  Analysis of Time Series
  Bank Current Account: Time Series Data Profiles
  Typical Mistakes when Performing Data Analysis and Interpreting Results

Data Modeling
  Introduction
  Modeling Concepts and Issues
  Supervised and Unsupervised Learning
  Cross Validation
  Evaluating the Results of Data Models
  Measuring Precision
  Neural Networks
  Predictive Neural Networks
  Kohonen Neural Network for Clustering
  Classification: Rule/Tree Induction
  The ID3 Decision Tree Induction Algorithm
  The C4.5 Decision Tree Induction Algorithm
  The C5.0 Decision Tree Induction Algorithm
  Traditional Statistical Models
  Regression Techniques
  Summary of the Use of Regression Techniques
  K-means
  Other Methods and Techniques for Creating Predictive Models
  Applying the Models to the Data
  Simulation Models – “What If?”
  Summary of Modeling

10 Deployment Systems: From Query Reporting to EIS and Expert Systems
  Introduction
  Query and Report Generation
  Query and Reporting Systems
  Executive Information Systems
  EIS Interface for a “What If” Scenario Modeler
  Executive Information Systems (EIS)
  Expert Systems
  Case-Based Systems
  Summary

11 Text Analysis
  Basic Analysis of Textual Information
  Advanced Analysis of Textual Information
  Keyword Definition and Information Retrieval
  Identification of Names and Personal Information of Individuals
  Identifying Blocks of Interesting Text
  Information Retrieval Concepts
  Assessing Sentiment on Social Media
  Commercial Text Mining Products

12 Data Mining from Relationally Structured Data, Marts, and Warehouses
  Introduction
  Data Warehouse and Data Marts
  Creating a File or Table for Data Mining

13 CRM – Customer Relationship Management and Analysis
  Introduction
  CRM Metrics and Data Collection
  Customer Life Cycle
  Example: Retail Bank
  Integrated CRM Systems
  CRM Application Software
  Customer Satisfaction
  Example CRM Application

14 Analysis of Data on the Internet I – Website Analysis and Internet Search (Online Chapter)

15 Analysis of Data on the Internet II – Search Experience Analysis (Online Chapter)

16 Analysis of Data on the Internet III – Online Social Network Analysis (Online Chapter)

17 Analysis of Data on the Internet IV – Search Trend Analysis over Time (Online Chapter)

18 Data Privacy and Privacy-Preserving Data Publishing
  Introduction
  Popular Applications and Data Privacy
  Legal Aspects – Responsibility and Limits
  Privacy-Preserving Data Publishing
  Privacy Concepts
  Anonymization Techniques
  Document Sanitization

19 Creating an Environment for Commercial Data Analysis
  Introduction
  Integrated Commercial Data Analysis Tools
  Creating an Ad Hoc/Low-Cost Environment for Commercial Data Analysis

20 Summary

Appendix: Case Studies
  Case Study 1: Customer Loyalty at an Insurance Company
    Introduction
    Definition of the Operational and Informational Data of Interest
    Data Extraction and Creation of Files for Analysis
    Data Exploration
    Modeling Phase
  Case Study 2: Cross-Selling a Pension Plan at a Retail Bank
    Introduction
    Data Definition
    Data Analysis
    Model Generation
    Results and Conclusions
    Example Weka Screens: Data Processing, Analysis, and Modeling
  Case Study 3: Audience Prediction for a Television Channel
    Introduction
    Data Definition
    Data Analysis
    Audience Prediction by Program
    Audience Prediction for Publicity Blocks

Glossary (Online)
Bibliography
Index

... Sources of Data and Information,” discusses possible sources of data and information that can be used for a commercial data mining project and how to establish which data sources are available and can...

... values (for example, 50). On the other hand, there could be just 10 other variables, each of which has only two possible values. Data volumes: The more records there are in the data, the higher the...

... the data, the operations, and the IT processes. The IT manager may also dedicate time to technical interpretation of the data in order to extract the required data from the data sources. Thus there...

Date posted: 23/10/2019, 15:15
