A Manager’s Guide to Data Warehousing
Laura L. Reeves
Wiley Computer Publishing
Timely. Practical. Reliable.

An ideal guide for the non-technical professional eager to learn more about data warehousing

A successful data warehouse project can provide immense value for business enterprises or other organizations. Building and maintaining a data warehouse demands the combined efforts of both IT and non-technical personnel. While there are plenty of resources aimed at the technology professionals who design and build data warehouses, there has to date been no useful guide written for a non-technical audience. This book fills that void and serves as an ideal resource for business and IT managers and others from the non-IT side who want to do their part to ensure data warehousing success.

This helpful book provides a solid introduction to the fundamentals of data warehousing. The author details each step of a data warehouse project, and provides a clear explanation of what’s involved in efficiently building a data warehouse and what must be done to deliver the data. You’ll examine the business management of a data warehouse and discover essential methods for cultivating a strong partnership between the business and IT elements of your organization. You can use this knowledge to be more effective when sharing your requirements and concerns during a project.

A Manager’s Guide to Data Warehousing explains what you need to create your data warehouse and establish long-term success. The book covers:
• The most common factors for ensuring data warehousing success and the roadblocks that can prevent it
• How to ensure that business and technical staff have a common understanding of the data warehouse project
• How to effectively communicate your business requirements for the data warehouse
• The tools you need to make certain that data is organized and can be delivered as needed
• Ways to deploy the data warehouse and ensure sustainable success

LAURA L. REEVES, coauthor of The Data Warehouse Lifecycle Toolkit, has over 23 years of experience in end-to-end data warehouse development focused on developing comprehensive project plans, collecting business requirements, designing business dimensional models and database schemas, and creating enterprise data warehouse strategies and data architectures.

Database/Data Warehousing
Visit our Web site at www.wiley.com/compbooks/
ISBN: 978-0-470-17638-2

A Manager’s Guide to Data Warehousing
Published by Wiley Publishing, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com

Copyright © 2009 by Wiley Publishing, Inc., Indianapolis, Indiana
Published simultaneously in Canada
ISBN: 978-0-470-17638-2
Manufactured in the United States of America
10 9 8 7 6 5 4 3 2 1

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at www.wiley.com/go/permissions.
Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Web site may provide or recommendations it may make. Further, readers should be aware that Internet Web sites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services please contact our Customer Care Department within the United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Library of Congress Cataloging-in-Publication Data
Reeves, Laura L.
A manager’s guide to data warehousing / Laura L. Reeves.
p. cm.
Includes index.
ISBN 978-0-470-17638-2 (paper/website)
1. Data warehousing–Management. I. Title.
QA76.9.D37R44 2009
005.74068–dc22
2009007401

Trademarks: Wiley and the Wiley logo are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. All other trademarks are the property of their respective owners.
Wiley Publishing, Inc. is not associated with any product or vendor mentioned in this book.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

About the Author

Laura L. Reeves started designing and implementing data warehouse solutions in 1986. Since then she has been involved in hundreds of projects. She has extensive experience in end-to-end data warehouse development, including developing comprehensive project plans, collecting business requirements, developing business dimensional models, designing database schemas (both star and snowflake designs), and developing enterprise data warehouse architecture and strategies. These have been implemented for many business functions for private and public industry.

Laura co-founded StarSoft Solutions, Inc., in 1995 and has been a faculty member with The Data Warehousing Institute since 1997. She is a contributing author of Building a Data Warehouse for Decision Support (Prentice Hall, 1996) and a co-author of the first edition of The Data Warehouse Lifecycle Toolkit (Wiley, 1998).

Laura graduated magna cum laude from Alma College with a bachelor of science degree in mathematics and computer science, with departmental honors.

Credits

Executive Editor: Robert Elliott
Development Editor: Sara Shlaer
Technical Editor: Jonathan Geiger
Production Editor: Melissa Lopez
Copy Editor: Luann Rouff
Editorial Manager: Mary Beth Wakefield
Production Manager: Tim Tate
Vice President and Executive Group Publisher: Richard Swadley
Vice President and Executive Publisher: Barry Pruett
Associate Publisher: Jim Minatel
Proofreader: Josh Chase, Jen Larsen, and Kyle Schlesinger, WordOne
Indexer: Robert Swanson
Cover Image: © Digital Vision

[...]
referred to as the extract, transform, and load (ETL) process. The database in which the data is organized to support the business is called a data mart. A data mart includes all of the data that is loaded into a single database and used together for analysis. Data marts are often developed to meet the needs of a business group such as marketing or finance. The key to a successful data mart is to create it in an...

critical concept that warrants some attention: the mechanism used to help organize data, which is called a data model.

What Is a Data Model?

A data model is an abstraction of how individual data elements relate to each other. It visually depicts how the data is to be organized and stored in a database. A data model provides the mechanism for documenting and understanding how data is organized. There are many...

Contents

Chapter 8 Managing Data As a Corporate Asset
What Is Information Management?
Information Management Example—Customer Data
IM Beyond the Data Warehouse
Master Data Management
Master Data Feeds the Data Warehouse
Finding the Right Resources
Data Governance
Data Ownership
Who Really Owns the Data?
Your Responsibilities If You Are ‘‘the Owner’’
What are IT’s Responsibilities?
Challenges with Data Ownership...

Implementing a Data Dictionary
The Data Dictionary Application
Populating the Data Dictionary
Accessing the Data Dictionary
Maintaining the Data Dictionary
Getting Started with Information Management
Understanding Your Current Data Environment
What Data Do You Have?
What Already Exists?
Where Do You Want to Be?
Develop a Realistic Strategy
Sharing the Information Management Strategy
Setting Up a Sustainable...
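The extract, transform, and load (ETL) process and the data mart described in the excerpt above can be pictured with a very small sketch. This is only an illustration under stated assumptions, not the book's method: the two source systems, the `sales_mart` table, and every function name here are hypothetical, and an in-memory SQLite database stands in for the data mart.

```python
import sqlite3

# Hypothetical extracts from two source systems (illustrative names and values).
ORDERS_SYSTEM = [
    {"customer": "ACME", "amount": "120.50"},
    {"customer": "acme ", "amount": "79.50"},
]
BILLING_SYSTEM = [
    {"customer": "Globex", "amount": "200.00"},
]


def extract():
    """Extract: gather raw rows from every source system."""
    return ORDERS_SYSTEM + BILLING_SYSTEM


def transform(rows):
    """Transform: standardize customer names and convert amounts to numbers."""
    return [(r["customer"].strip().upper(), float(r["amount"])) for r in rows]


def load(conn, rows):
    """Load: write the prepared rows into a single data mart table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_mart (customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales_mart VALUES (?, ?)", rows)
    conn.commit()


def run_etl():
    """Run the three steps in order against an in-memory stand-in data mart."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract()))
    return conn


def sales_by_customer(conn):
    """Reporting layer: one query over the integrated data."""
    cur = conn.execute(
        "SELECT customer, SUM(amount) FROM sales_mart "
        "GROUP BY customer ORDER BY customer"
    )
    return cur.fetchall()
```

Calling `sales_by_customer(run_etl())` returns one total per customer. The two differently spelled "ACME" rows are combined because the transform step standardizes names before loading, which is the kind of integration work the preparation step is meant to do.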
history

Characteristic | Operational system | DW system
Data integration and balancing | Data is balanced within the scope of this one system | Data must be integrated and balanced from multiple systems

[Figure 1-1 Basic data warehousing environment. Labels: Source System Data; Prepare the Data; Data Organized to Support the Business; Access & Use of Data]

Ownership
Data Quality
Profiling the Data
How Clean Does the Data Really Need to Be?
Measuring Quality
Quality of Historical Data
Cleansing at the Source
Cleaning Up for Reporting
Managing the Integrity of Data Integration
Quality Improves When It Matters
Example: Data Quality and Grocery Checkout Scanners
Example: Data Quality and the Evaluation of Public Education
Realizing the Value of Data Quality
Implementing...

performance, and exception reporting

Characteristic | Operational system | DW system
Data usage | Capture and maintain the data | Exploit the data
Data validation | Data verification occurs upon entry | Data verification occurs after the fact
Update frequency | Data is updated when business transactions occur (e.g., client uses debit card, web order is placed) | Data is updated by periodic, scheduled processes
Historical data requirement | Current data | Multiple years of...

in an integrated manner. It is also recommended that data be loaded into only one data mart and then shared across the organization to ensure data consistency. Finally, an application or reporting layer is provided to facilitate access and analysis of the data. This is where business users access reports, dashboards, and analytical applications. Collections of these reports and analyses are called business...

Infrastructure, and Tools
What Is Architecture?
Why Do We Need Architecture?
Making Architecture Work
Data Architecture
Revisiting DW Goals
Components of DW Data Architecture
A Closer Look at Common Data Warehouse Architectures
Bottom-Up Data Architecture
Top-Down Data Architecture
Publish the Data: Data Marts
Adopting an Architecture
Technical Architecture
Technical Architecture Basics
Components of Technical Architecture...

Part I The Essentials of Data Warehousing
Chapter 1 Gaining Data Warehouse Success
The Essentials of Data Warehousing
What Is a Data Warehouse?
Differences Between Operational and DW Systems
The Data Warehousing Environment
What Is a Data Model?
Understanding Industry Perspectives
Design and Development Sequence
Why Build a Data Warehouse?
The Value of Data Warehousing
The Promises of Data Warehousing
Keys to Success
Developing and Maintaining Strong Business and Technology Partnerships