
Data Warehousing Architecture and Implementation, Part 4


Pages: 30
File size: 419.64 KB


Hardware or Operating System Platforms

The following evaluation criteria can be applied to hardware and operating system platforms:

• Scalability. The warehouse solution can scale up in terms of space and processing power. This scalability is particularly important if the warehouse is projected to grow at a rapid rate.
• Financial stability. The product vendor has proven to be a strong and visible player in the hardware segment, and its financial performance indicates growth or stability.
• Price/performance. The product performs well in a price/performance comparison with other vendors of similar products.
• Delivery lead time. The product vendor can deliver the hardware or an equivalent service unit within the required time frame. If the unit is not readily available within the same country, there may be delays due to importation logistics.
• Reference sites. The hardware vendor has a reference site that is using a similar unit for the same purpose. The warehousing team can either arrange a site visit or interview representatives from the reference site. Alternatively, an onsite test of the unit can be conducted, especially if no reference site is available.
• Availability of support. Support for the hardware and its operating system is available, and support response times are within the acceptable downtime for the warehouse.

How Does Data Warehousing Affect My Existing Systems?

Existing operational systems are the source of internal warehouse data. Extractions can take place only during the batch windows of the operational systems, typically after office hours. If batch windows are sufficiently large, warehouse-related activities will have little or no disruptive effect on normal, day-to-day operations.

Improvement Areas in Operational Systems

Data warehousing, however, does highlight where existing operational systems can be improved, particularly in two areas:

• Missing data items.
Decisional information needs almost always require the collection of data that are currently outside the scope of the existing systems. If possible, the existing systems are extended to support the collection of such data. The team will have to study alternatives to data collection if the operational systems cannot be modified (for example, if the operational system is an application package whose warranties will be void if modifications are made).
• Insufficient data quality. The data warehouse efforts may also identify areas where the data quality of the operational systems can be improved. This is especially true for data items that are used to uniquely identify customers, such as social security numbers.

The data warehouse implementation team should continuously provide constructive feedback regarding the operational systems. Easy improvements can be quickly implemented, and improvements that require significant effort and resources can be prioritized during IT planning. By ensuring that each rollout of a data warehouse phase is always accompanied by a review of the existing systems, the warehousing team can provide valuable inputs to plans for enhancing operational systems.

Data Warehousing and Its Impact on Other Enterprise Initiatives

By its enterprise-wide nature, a data warehousing initiative will naturally have an impact on other enterprise initiatives, two of which are discussed below.

How Does Data Warehousing Tie In with BPR?

Data warehousing refers to the gamut of activities that support the decisional information requirements of the enterprise. BPR is "the radical redesign of strategic and value-added processes (and the systems, policies, and organizational structures that support them) to optimize the work flows and productivity in an organization." Most BPR projects have focused on the optimization of operational business processes.
Data warehousing, on the other hand, focuses on optimizing the decisional (or decision-making) processes within the enterprise. It can be said that data warehousing is the technology enabler for reengineering decisional processes.

The ready availability of integrated data for corporate decision-making also has implications for the organizational structure of the enterprise. Most organizations are structured or designed to collect, summarize, report, and direct the status of operations (i.e., there is an operational monitoring purpose). The availability of integrated data at different levels of detail may encourage a flattening of the organizational structure.

Data warehouses also provide the enterprise with the measures for gauging competitive standing. The use of the warehouse leads to insights as to what drives the enterprise. These insights may quickly lead to business process reengineering initiatives in the operational areas.

How Does Data Warehousing Tie In with Intranets?

The term intranet refers to the use of Internet technologies for internal corporate networks. Intranets have been touted as cost-effective, client/server solutions to enterprise computing needs. Intranets are also popular due to the universal, easy-to-learn, easy-to-use front-end, i.e., the web browser.

The web-publishing nature of the Internet, and the browser's metaphor of searching for information, are consistent with the data warehouse's querying metaphor. The availability of many web-based tools that draw their data from relational database structures has naturally encouraged the use of web technology as a means for delivering warehouse data to end-users. A data warehouse with a web-enabled front-end therefore provides enterprises with interesting options for intranet-based solutions.
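The idea of a web-enabled front-end drawing its data from relational structures can be illustrated with a small sketch. This is not a tool described in the text: the table name sales_summary, its columns, and the helper functions below are hypothetical, and an in-memory SQLite database merely stands in for a relational warehouse. The sketch only shows the core step such tools perform, turning a query result into markup that a browser can display.

```python
import html
import sqlite3


def render_report(conn: sqlite3.Connection, sql: str) -> str:
    """Run a query and return the result set as a plain HTML table."""
    cursor = conn.execute(sql)
    headers = [column[0] for column in cursor.description]
    lines = ["<table>"]
    # Header row built from the column names of the result set.
    header_cells = "".join(f"<th>{html.escape(name)}</th>" for name in headers)
    lines.append(f"<tr>{header_cells}</tr>")
    # One table row per result row; values are escaped for safe display.
    for row in cursor:
        cells = "".join(f"<td>{html.escape(str(value))}</td>" for value in row)
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


def build_demo_warehouse() -> sqlite3.Connection:
    """Create a tiny stand-in warehouse table (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales_summary (region TEXT, total_sales INTEGER)")
    conn.executemany(
        "INSERT INTO sales_summary VALUES (?, ?)",
        [("North", 1200), ("South", 950)],
    )
    return conn


if __name__ == "__main__":
    query = "SELECT region, total_sales FROM sales_summary ORDER BY region"
    print(render_report(build_demo_warehouse(), query))
```

In a real intranet deployment the rendered page would be served by a web server and the query would run against the actual warehouse; both are left out here to keep the sketch self-contained.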
With the introduction of technologies that enable secure connections over the public Internet infrastructure, enterprises now also have a cost-effective way of distributing or delivering warehouse data to users in multiple locations.

When Is a Data Warehouse Not Appropriate?

Not all organizations are ready for a data warehousing initiative. Below are two instances when a data warehouse is simply inappropriate.

When the Operational Systems Are Not Ready

The data warehouse is populated with information primarily from the operational systems of the enterprise. A good indicator of operational system readiness is the amount of IT effort focused on operational systems. A number of telltale signs indicate a lack of readiness. These include the following:

• Many new operational systems are planned for development or are in the process of being deployed. Much of the enterprise's IT resources will be assigned to this effort and will therefore not be available for data warehousing projects.
• Many of the operational systems are legacy applications that require much firefighting. The source systems are brittle or unstable and are candidates for replacement. IT resources are also directed at fighting operational system fires.
• Many of the operational systems require major enhancements and must be overhauled. If the operational systems require major enhancements, then chances are these systems do not sufficiently support the day-to-day operations of the enterprise. Again, IT resources will be directed to enhancement or replacement efforts. Furthermore, deficient operational systems almost always fail to capture all the data required to meet the decisional information needs of business managers.

Regardless of the reason for a lack of operational system readiness, the bottom line is simple: an enterprise-wide data warehouse is out of the question due to the lack of adequate source systems.
However, this does not preclude a phased data warehousing initiative, as illustrated in Figure 4-2.

Figure 4-2 Data Warehouse Rollout Strategy

The enterprise may opt for an interleaved deployment of systems. A series of projects can be conducted, where a project to deploy an operational system is followed by a project that extends the scope of the data warehouse to encompass the newly stabilized operational system.

The main focus of the majority of IT staff remains on deploying the operational systems. However, a data warehouse scope extension project is initiated as each operational system stabilizes. This project extends the data warehouse scope with data from each new operational system.

Note, however, that this approach may create unrealistic end-user expectations, particularly during earlier rollouts. The scope and strategy should therefore be communicated clearly and consistently to all users. Most, if not all, business users will understand that enterprise-wide views of data are not possible while most of the operational systems are not feeding the warehouse.

When the Need Is Operational Integration

Despite its ability to provide integrated data for decisional information needs, a data warehouse does not in any way contribute to meeting the operational information needs of the enterprise. Data warehouses are refreshed at best on a daily basis. They do not integrate data quickly enough or often enough for operational management purposes.

If the enterprise needs operational integration, then the typical data warehouse deployment (as shown in Figure 4-3) is insufficient.

Figure 4-3 Traditional Data Warehouse Architecture

Instead, the enterprise needs an Operational Data Store and its accompanying front-end applications. As mentioned in Chapter 1, flash monitoring and reporting tools are often likened to a dashboard that is constantly refreshed to provide operational management with the latest information about enterprise operations.
Figure 4-4 illustrates the Operational Data Store architecture.

Figure 4-4 The Data Warehouse and the Operational Data Store

When the intended users of the system are operational managers and when the requirements are for an integrated view of constantly refreshed operational data, an Operational Data Store is the appropriate solution.

Enterprises that have implemented Operational Data Stores will find it natural to use the Operational Data Store as one of the primary source systems for their data warehouse. Thus, the Data Warehouse contains a series (i.e., layer upon layer) of ODS snapshots, where each layer corresponds to data as of a specific point in time.

How Do I Manage or Control a Data Warehouse Initiative?

There are several ways to manage or control a data warehouse project. Note that most of the techniques described below are useful in any technology project.

Milestones. Clearly defined milestones provide project management and the Project Sponsor with regular checkpoints to track the progress of the data warehouse development effort. Milestones should be far enough apart to show real progress, but not so far apart that senior management becomes uneasy or loses focus and commitment. In general, one data warehouse rollout should be treated as one project, lasting between three and six months.

Incremental Rollouts, Incremental Investments. Avoid biting off more than you can chew; projects that are gigantic leaps forward are more likely to fail. Instead, break up the data warehouse initiative into incremental rollouts. By doing so, you give the warehouse team manageable but ambitious targets and clearly defined deliverables. Applying a phased approach also has the added benefit of allowing the Project Sponsor and the warehousing team to set priorities and manage end-user expectations. The benefits of each rollout can be measured separately, and the data warehouse is justified on a phase-per-phase basis.
A phased approach, however, requires an overall architect so that each phase also lays the foundation for subsequent warehousing efforts, and earlier investments remain intact.

Clearly Defined Rollout Scopes. To the maximum extent possible, clearly define the scope of each rollout to set the expectations of both senior management and warehouse end-users. Each rollout should deliver useful functionality. As in most development projects, the project manager will be walking the fine line between increasing the scope to better meet user needs and ruthlessly controlling the scope to meet the rollout deadline.

Individually Cost-Justified Rollouts. The scope of each rollout determines the corresponding rollout cost. Each rollout should be cost-justified on its own merits to ensure appropriate return on investment. However, this practice should not preclude long-term architectural investments that do not have an immediate return in the same rollout.

Plan to Have Early Successes. Data warehousing is a long-term effort that must have early and continuous successes that justify the length of the journey. Focus early efforts on areas that can deliver highly visible success, and that success will increase organizational support.

Plan to Be Scalable. Initial successes with the data warehouse will result in a sudden demand for increased data scope, increased functionality, or both! The warehousing environment and design must both be scalable to deal with increased demand as needed.

Reward Your Team. Data warehousing is hard work, and teams need to know their work is appreciated. A motivated team is always an asset in long-term initiatives.

In Summary

The Chief Information Officer (CIO) has the unenviable task of juggling the limited IT resources of the enterprise. He or she makes the resource assignment decisions that determine the skill sets of the various IT project teams. Unfortunately, data warehousing is just one of the many projects on the CIO's plate.
If the enterprise is still in the process of deploying operational systems, data warehousing will naturally be at a lower priority.

CIOs also have the difficult responsibility of evolving the enterprise's IT architecture. They must ensure that the addition of each new system, and the extension of each existing system, contributes to the stability and resiliency of the overall IT architecture. Fortunately, data warehouse and operational data store technologies allow CIOs to migrate reporting and analytical functionality from legacy or operational environments, thereby creating a more robust and stable computing environment for the enterprise.

Chapter 5. The Project Manager

The warehouse Project Manager is responsible for any and all technical activities related to planning, designing, and building a data warehouse. Under ideal circumstances, this role is fulfilled by internal IT staff. It is not unusual, however, for this role to be outsourced, especially for early or pilot projects, because warehousing technologies and techniques are so new.

How Do I Roll Out a Data Warehouse Initiative?

If you are starting a data warehouse initiative, there are three main things to keep in mind. Always start with a planning activity. Always implement a pilot project as your "proof of concept." And always extend the functionality of the warehouse in an iterative manner.

Start with a Data Warehouse Planning Activity

The scope of a data warehouse varies from one enterprise to another. The desired scope and scale are typically determined by the information requirements that drive the warehouse design and development. These requirements, in turn, are driven by the business context of the enterprise: the industry, the fierceness of competition, and the state of the art in industry practices.

Regardless of the industry, however, it is advisable to start a data warehouse initiative with a short planning activity. The Project Manager should launch and manage the activities listed below.
Decisional Requirements Analysis. Start with an analysis of the decision support needs of the organization. The warehousing team must understand the user requirements and attempt to map these to the data sources available. The team also designs potential queries or reports that can meet the stated information requirements. Note that unlike system development projects for OLTP applications, the information needs of decisional users cannot be pinned down and are frequently changing. The Requirements Analysis team should therefore gain enough of an understanding of the business to be able to anticipate likely changes to end-user requirements.

Decisional Source System Audit. Conduct an audit of all potential sources of data. This crucial and very detailed task verifies that data sources exist to meet the decisional information needs identified during requirements analysis. There is no point in designing a warehouse schema that cannot be populated because of a lack of source data.

[...]

... Tiered Data Warehousing Architectures

The enterprise is free to mix and match these two database technologies, depending on the scale and size of the data warehouse, as illustrated in Figure 5-9.

Figure 5-9 Tiered Data Warehousing Architecture

It will not be unusual to find an enterprise with the following tiered data warehousing architecture:

• ROLAP tools, which run directly against relational databases, ...

... the data.

Figure 5-7 Relational Databases

MDDBs in Warehousing Architectures

Alternatively, data is extracted from the relational data warehouse and placed in multidimensional data structures to reflect the multidimensional nature of the data (see Figure 5-8). Multidimensional OLAP (MOLAP) tools run against the multidimensional server, rather than against the data warehouse.

Figure 5-8 Multidimensional Databases ...

... one another.

Figure 5-5 Data Warehouse Components

Do I Still Use Relational Databases for Data Warehousing?
Although there were initial doubts about the use of relational database technology in data warehousing, experience has shown that there is actually no other appropriate database management system for an enterprise-wide data warehouse.

MDDBs

The confusion about relational databases arises from the ...

... innovation will certainly introduce new data warehousing architectures.

How Long Does a Data Warehousing Project Last?

Data warehousing is a long, daunting task; it requires significant, prolonged effort on the part of the enterprise and may have the unpleasant side effect of highlighting problem areas in operational systems. Like any task of great magnitude, the data warehousing effort must be partitioned ...

... together for data warehouse and data mart implementation.

RDBMSes in Warehousing Architectures

Data warehouses are built on relational database technology. Online Analytical Processing (OLAP) tools are then used to interact directly with the relational data warehouse or with a relational data mart (see Figure 5-7). Relational OLAP (ROLAP) tools recognize the relational nature of the database but still present ...

... enterprise data warehouse. These subsets are determined either by geography (i.e., one data mart per location) or by user group. Data marts, due to their smaller size, may take advantage of multidimensional databases for better reporting and analysis performance.

Warehousing Architectures

Below, we present how the relational and multi-dimensional database technologies can be used together for data warehouse ...

... parts of the warehouse architecture

• Enterprise data warehouses. These have a tendency to grow significantly beyond the size limit of most MDDBs and are therefore typically implemented with relational database technology. Only relational database technology is capable of storing up to terabytes of data while still providing acceptable load and query performance.
• Data marts. A data mart is typically ...
... These tools create, store, and manage the warehouse metadata.
• Data access and retrieval tools. These are tools used by warehouse end users to access, format, and disseminate warehouse data in the form of reports, query results, charts, and graphs. Other data access and retrieval tools actively search the data warehouse for patterns in the data (i.e., data mining tools). Decision Support Systems and Executive ...

... volume of data. Databases of this size require different database optimization and tuning techniques.

Project Progress and Effort Are Highly Dependent on Accessibility and Quality of Source System Data

The progress of a data warehouse project is highly dependent on where the operational data resides. Enterprises that make use of proprietary application packages will find themselves dealing with locked data ...

... establishing the working relationship among warehousing team members, especially if third parties are involved. Note that easy access to warehousing data may also be limited to the organizational scope that is within the control or authority of the Project Sponsor.

• What are the IS or IT groups in the organization? Which are involved in the data warehousing effort? Since data warehousing is very much a technology-based ...
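The excerpts above distinguish ROLAP tools, which run directly against the relational warehouse, from MOLAP tools, which run against multidimensional structures loaded from it. The following Python sketch contrasts the two access patterns under simplifying assumptions: the sales_fact table, its two dimensions (quarter and region), the sample figures, and all function names are hypothetical, SQLite stands in for the relational warehouse, and a plain dictionary stands in for a multidimensional server.

```python
import sqlite3
from collections import defaultdict

# Hypothetical fact rows: (quarter, region, amount).
SALES = [
    ("Q1", "North", 100),
    ("Q1", "South", 80),
    ("Q2", "North", 120),
    ("Q2", "South", 90),
]


def build_relational_warehouse() -> sqlite3.Connection:
    """Load the fact rows into an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales_fact (quarter TEXT, region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)", SALES)
    return conn


def rolap_total(conn, quarter=None, region=None):
    """ROLAP style: turn the dimensional request into SQL at query time."""
    clauses, params = [], []
    if quarter is not None:
        clauses.append("quarter = ?")
        params.append(quarter)
    if region is not None:
        clauses.append("region = ?")
        params.append(region)
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    sql = "SELECT COALESCE(SUM(amount), 0) FROM sales_fact" + where
    return conn.execute(sql, params).fetchone()[0]


def build_cube(conn):
    """MOLAP style: extract once and pre-aggregate every dimension combination.

    None in a key position means 'all values of that dimension'.
    """
    cube = defaultdict(int)
    rows = conn.execute("SELECT quarter, region, amount FROM sales_fact")
    for quarter, region, amount in rows:
        for q in (quarter, None):
            for r in (region, None):
                cube[(q, r)] += amount
    return dict(cube)


def molap_total(cube, quarter=None, region=None):
    """Answer the same request with a lookup in the pre-built cube."""
    return cube.get((quarter, region), 0)


if __name__ == "__main__":
    warehouse = build_relational_warehouse()
    cube = build_cube(warehouse)
    print(rolap_total(warehouse, region="North"), molap_total(cube, region="North"))
```

The trade-off the sketch is meant to surface: the ROLAP path does its aggregation work on every request but always reads current warehouse rows, while the MOLAP path pays the aggregation cost once at load time and must be rebuilt when the warehouse is refreshed. The number of pre-aggregated cells also grows with every added dimension, which is consistent with the excerpt's remark that enterprise warehouses tend to outgrow MDDB size limits.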

Date posted: 14/08/2014, 06:22
