The ADVERTISEMENT table is chosen as the fact table. The MUSICIAN table is a single layer fact table. The BAND, MERCHANDISE, SHOWS, and DISCOGRAPHY tables constitute a two-layer dimensional hierarchy. There are some gray areas between static and dynamic information in this situation, and there are other ways of building this data warehouse database model.

Project Management

When it comes to planning and establishing timelines, there are some remarkably good project-planning software tools available. Any software project, including a database model design, usually involves more than one person, not counting the planner. Without a software tool, when multiple people perform multiple overlapping, interdependent tasks, you could end up with lots of drawings and an awfully full garbage can.

Plans change. People get sick and go on vacation. They find new jobs. Sometimes people simply don’t show up at all! In other words, expect things to change.

When it comes to planning, establishing timelines, and budgeting, there is no enforceable or reasonably useful set of rules and steps. It just doesn’t exist. In any planning process (including budgeting), for any type of project, there is no way to predict outcomes with any degree of mathematical certainty. None of us can really read the future. Planning, establishing timelines, and budgeting entail a flexible process, which is almost entirely dependent on the expertise and past experience of the planner. And if changes and unexpected snags or events are not anticipated, a project can very quickly spiral out of control.

Project Planning and Timelines

A software development company can meet its financial demise by not allowing enough time in project plans. It is quite common in the software development field for project-tender budgets from multiple companies to differ by as much as a factor of 50.
In other words, the difference in dollar estimates between the lowest and highest bidders can be enormous. This indicates guesswork. It is often the case that the most inexperienced companies put in the lowest bids. The highest bids are probably greedy. Those between the average and the highest are probably the most realistic. Companies bidding in that range are also likely to make a profit, and thus be around to support very expensive software tools, say, 5 to 10 years from now.

As already stated, there is much research into project planning in general. Unfortunately, few, if any, quantifiable and useful results have been obtained. Expert assessment based on past experience is usually the best measure of possibilities.

There is an International Organization for Standardization (ISO) model called the “ISO 9000-3 Model.” This model is used more as a method of quality assurance against the final product of, say, an analysis of a database model. It does not give a set of instructions for performing the analysis process itself, but rather presents a method of validation after the fact.

253 Planning and Preparation Through Analysis 15_574906 ch09.qxd 11/4/05 10:49 AM Page 253

The accuracy of a planned budget is dependent on the experience of the planner.

Database designers, administrators, and programmers think that their project managers and planners do nothing. They are wrong. The planners take all the risk and make all the wild guesses; the programmers get to write all that mathematically precise programming code. So, the next time you see your project manager in a cold sweat, you know why. It’s not your job on the line; it’s theirs.

Here are some interesting, and sometimes amusing, terms often used to describe planning, budgeting, and project management:

❑ “Why did the chicken cross the road?”: This is a well-known quotation about a now unfortunately discontinued software development consultancy company.
This company was famous for producing enormous amounts of paper. Lots of paper can help to make more profit for the software development company. The more paper produced, the more useless waffle. However, the more paper produced, the less likely it is that something has been overlooked, and the more likely it is that the plan and budget are accurate. The other problem with lots of paper is that no one has time to read it, and no one really wants to either.
❑ “Get someone with a great, big, long signature and a serious title to sign off on it ASAP.”: This one is also called “passing the buck.” That buck has to stop somewhere, and the higher up, the better, and preferably not with the planner. The planner has enough things to think about without worrying about whether their wildest guesses will bear fruit, or simply crash and burn.
❑ “Don’t break it if it’s already fixed.”: If something is working properly, why change it?
❑ “Use information in existing systems, be they computerized or on paper.”: Existing structures can usually tell more about a company than its people can. There is, however, a danger: if a system is being replaced, there is probably something very wrong with the older system.
❑ “Try not to reinvent the wheel.”: When planning on building software or a new database model, use other people’s ideas if they are appropriate. Of course, beware of outdated ideas. Do thorough research. The Internet is an excellent source of freely available ideas, both old and new.
❑ “More resources do not mean faster work.”: The more people involved in a project, the more confusion. Throwing more bodies at a project can just as likely make a project impossible to manage as it can make it go faster.

Figure 9-30 shows what a project timeline might look like, containing multiple people, multiple skill levels, and multiple tasks (both overlapping and interdependent). Project timelines can become incredibly complicated. Simplify, if possible.
Too much interdependency can lead to problems if one area overruns its time limits. Keep something spare in terms of people, available hours, and otherwise. Expect changes. Plan with some resources in reserve.

Figure 9-30 shows five separate tasks in a simplistic project-timeline Gantt chart. Task 1 is assigned to Jim. There is no time conflict between Task 2 and Task 3, so both can be assigned to Joe (one person). Task 3 and Task 4 overlap between both Janet and Joe, in more ways than one. Joe is going to be very busy indeed. A project manager would allow overlap where the person doing the assigned tasks is known to be capable of multiple concurrent activities.

Gantt charts were invented as a graphical tool for project management. Gantt charts allow for a pictorial illustration of a schedule, thus helping with the planning, coordination, and tracking of multiple tasks. Tasks can be independent or interdependent. Many off-the-shelf software tools allow computerized project management with Gantt charts (and otherwise), such as Microsoft Project, Excel spreadsheet plug-ins, and even Visio.

Figure 9-30: An example project timeline Gantt chart.

Budgeting

Budgeting is part of planning, and is thus open to expert interpretation. Once again, there is research on how to apply formal methods to budgeting. Much of that research is unsuccessful. In the commercial world, most observations offer a lot of guesswork based on experience. Also, there is an intense reluctance on the part of the experts to justify and quantify the how and why of the results they come up with. Budgeting is, as already stated, just like planning. It is educated guesswork. There is nothing to really present in a book such as this one, telling you, the reader, how to go about budgeting a project, such as a database modeling project.
One big problem with software development (including database model development and design) is its intangibility. Software projects are difficult to quantify, partially because they are unpredictable and can change drastically during development. The sheer complexity of software is another factor. Complexity is not only within each separate unit and step. One step could be the database model analysis and design. Another step could be the writing of front-end application code. There is a complexity issue simply because of the need to account for all possibilities. There is also complexity because of the heavy interdependence between all parts of software development. This interdependence stems from the highest level (comparing, for example, database model and application coding) even down to the smallest piece (comparing two different fields in a single table, in a database model).

In short, database model analysis and design is extremely difficult to budget for. The most common and probably most successful practice is to guess at an estimated cost based on the estimated time to be spent (work hours). Then take the result and add a percentage, even as much as 50 percent, sometimes multiplying the first conservative estimate by much more.

[Figure 9-30 chart residue: Task numbers 1 through 5 plotted Monday to Friday over the weeks beginning 5/23/2005, 5/30/2005, and 6/6/2005; assignees shown are Jim, Joe, Janet, and Janet and Joe.]

A number of detail areas have already been discussed in this chapter:
❑ Hiring help costs money: Hired help in the form of software development expertise can be astronomically expensive.
❑ Hardware costs: Hardware can range from cheap to incredibly expensive. Generally, more expensive hardware leads to the following scenarios:
    ❑ Expensive hardware can help to alleviate software costs, but usually in the form of hiding performance problems.
Sloppy database design and software development can be compensated for by upgrading hardware. If growth is expected, and the hardware selected is ever outgrown, costly problems could be the result. The temptation when using expensive hardware is to not worry too much about building database and software applications properly. Fast-performing hardware lets you get away with a lot of things.
    ❑ Expensive hardware is complicated. Hiring people to set up and maintain that complexity can be extremely expensive. Simple and cheap hardware requires less training and lower skill levels. Less-skilled labor is much cheaper.
❑ Maintenance: Maintenance is all about complexity and quality. The more complex something is, the more difficult it is to maintain. Somewhat unrelated, but just as important, poor quality results in a lot more maintenance. Indirectly, maintenance is related to the life of a database model, and ultimately applications. How long will they last, how long will something be useful, and how long will it help to turn a profit? There is a point where the cost of constant maintenance outweighs the cost of a complete rewrite. In other words, sometimes rebuilding databases and software applications from scratch can save money, rather than continuing to maintain older software that has seen so many changes that it is no longer cost-effective to maintain.
❑ Training: Training affects all levels, from technical staff to unknown huge quantities of end-users on the Internet. Obviously, you don’t want to have to train Internet users located at far-flung parts of the globe. Attempting to train Internet users is pointless. People lose interest in applications that are difficult to use. You might as well start again in this case. Training in-house staff is different. The higher the complexity, the more training is involved. Training costs money, and sometimes a lot of money!
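The budgeting practice described in this section (estimate the work hours, cost them, then add a percentage of as much as 50 percent) amounts to simple arithmetic. The following is only a minimal sketch of that arithmetic; the function name, the hourly rate, and the default contingency are illustrative assumptions, not figures from this chapter.

```python
def estimate_budget(work_hours, hourly_rate, contingency=0.5):
    """Return a padded cost estimate.

    work_hours and hourly_rate are the planner's own guesses (hypothetical
    inputs); contingency is the added percentage expressed as a fraction,
    so 0.5 means "add 50 percent".
    """
    if work_hours < 0 or hourly_rate < 0 or contingency < 0:
        raise ValueError("inputs must not be negative")
    base_cost = work_hours * hourly_rate      # first, conservative estimate
    return base_cost * (1 + contingency)      # padded for the unexpected


if __name__ == "__main__":
    # 100 estimated hours at an assumed rate of 80 per hour, padded by 50 percent.
    print(estimate_budget(100, 80))
```

The padding is a single multiplier on purpose: the text treats the percentage as a judgment call by the planner, so it is left as a parameter rather than being derived from anything.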
Summary

In this chapter, you learned about:
❑ The basics of analysis and design of a database model
❑ A usable set of semi-formal steps for the database modeling analysis process
❑ Some common problem areas and misconceptions that can arise
❑ How talking, listening, and following a paper trail can be of immense value
❑ How to create a database model to cover objectives for both an OLTP and a data warehouse database model
❑ How to refine a database model using the application of business rules, through basic analysis, creating both an OLTP and a data warehouse database model
❑ How to apply everything learned so far with a comprehensive case study, creating both an OLTP and a data warehouse database model

This chapter has attempted to apply everything learned in previous chapters by going through the motions of beginning the process of creating a database model. An online auction company has been used as the case study example.

Chapter 10 expands on the analysis process (simplistic in this chapter at best) and on design, with more detail provided for the OLTP and data warehouse database models presented analytically in this chapter. The idea is to build things step-by-step, with as much planning as possible. This chapter has begun that database model building process, as the first little step in the right direction.

Exercises

Use the ERDs in Figure 9-19 and Figure 9-29 to help you answer these questions.
1. Create scripts to create tables for the OLTP database model in Figure 9-19. Create the tables in the proper order by understanding the relationships between the tables.
2. Create scripts to create tables for the data warehouse database model in Figure 9-29. Once again, create the tables in the proper order by understanding the relationships between the tables.
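One way to think about "the proper order" in these exercises is as a dependency ordering: a table can be created only after every table it refers to. The sketch below is an illustration of that idea, not a solution to the exercises. The dependency map is an assumption built from relationships described for the online auction case study (listings belong to sellers; bids refer to listings and buyers); the real map should be read off the ERDs.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each table maps to the tables it depends on.
DEPENDS_ON = {
    "SELLER": set(),
    "BUYER": set(),
    "LISTING": {"SELLER"},
    "BID": {"LISTING", "BUYER"},
}


def creation_order(depends_on):
    """Return table names ordered so that parents come before children."""
    # static_order() yields each node only after all of its dependencies.
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for table in creation_order(DEPENDS_ON):
        print(table)
```

If the map ever contains a circular dependency, `TopologicalSorter` raises `graphlib.CycleError`, which is a useful signal that the relationships need another look before any scripts are written.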
10 Creating and Refining Tables During the Design Phase

“Everything in the world must have design or the human mind rejects it. But in addition, it must have purpose or the human conscience shies away from it.” (John Steinbeck)

Analysis is all about what needs to be done. Design does it!

This chapter builds on and expands the basic analytical process and structure discovered during the case study approach in Chapter 9, which covered analysis. Analysis is the process of discovering what needs to be done. Analysis is all about the operations of a company: what it does to make a living. Design is all about how to implement what was analyzed as a useful database model. This chapter passes from the analysis phase into the design phase by discovering, through a case study example, how to build and relate tables in a relational database model.

By the end of this chapter, you will have a much deeper understanding of how database models are created and designed. This chapter teaches you how to begin the implementation of what was analyzed (discovered) in Chapter 9. In short, implementation is the process of creating tables, and sticking those tables together with appropriate relationships.

In this chapter, you learn about the following:
❑ Database model design
❑ The difference between analysis and design
❑ Creating tables
❑ Enforcing inter-table relationships and referential integrity
❑ Normalization without going too far
❑ Denormalization without going too far

A Little More About Design

When designing a database model, the intention is to refine database model structure for reasons of maintaining data integrity, good performance, and adaptation to specific types of applications, such as a data warehouse.
A data warehouse database model requires a different basic table structure than an OLTP database model, mostly for reasons of acceptable reporting performance.

Performance is not exclusively a programming or implementation concern, or even a testing or “wait until it’s in production” activity. Performance should be included at the outset, in both the analysis and design stages. This is particularly important for database model analysis and design for two reasons:
❑ Redesigning a database model after applications are written will change applications (a little like a rewrite, and a complete waste of time and money).
❑ There is simply no earthly reason or excuse to not build for performance right from the beginning of the development process. If this is not possible, you might want additional expertise in the form of specialized personnel. Hiring a short-term consultant at the outset of a development project could save enormous amounts of maintenance, rewriting, redevelopment costs, and time in the future.

Business events and operations discovered in the analysis stage should be utilized to drive the design process, which consists of refining table pictures and ERDs already drawn. For larger projects, the design stage can also consist of detailed technical specifications. Technical specifications are used by programmers and administrators to create databases and programming code.

Essentially, the beginning of the design process marks the point where the thought processes of analysts and programmers begin to mix. In the analysis stage, the approach was that of a business operation (a business-wide view) of a company. In the design stage, it starts to get more technical. When designing a database model, you should begin thinking about it from the perspective of how applications will use that database model.
In other words, when considering how to build fact and dimensional table structures in a data warehouse database model, consider how reports will be structured. Consider how long those reports will take to run. For a data warehouse, not only are the table contents and relationships important, but factors such as reconstruction of data using materialized views and alternate indexing can also help with the building and performance of data warehouse reporting.

Relational database model design includes the following:
❑ Refine Database Models: At this stage of the game, most of this is about normalization and denormalization.
❑ Finalization and Approval: This includes finalization (and most especially, approval) of business and technical design issues. You need to get it signed off for two reasons:
    ❑ Software development involves large amounts of investment in both money and time. Approval and responsibility from management, and probably even executive management, is required. A designer does not need this level of worry, but may need some powerful clout to back up the development process. Other departments and managers getting in the way of the development process could throw your schedule and budget for a complete loop.

    Every project needs a sponsor and champion; otherwise, there is no point in progressing.

    ❑ You need to cover your back. You also need to be sure that you are going in the right direction because you may very well not be. There is usually a good reason why your boss is your boss, and it usually has a lot to do with him or her having a lot more experience. Experience is always valuable. It is extremely likely that this person will help you get things moving when you need it, such as when you need information and help from other departments. So, it’s not really about passing the buck up the ladder. It’s really about getting a job done.
The stronger the approval and support for a project, the better the chance of success, unless, of course, it’s your money. In that case, it’s your problem! So, pinch those pennies. Of course, some entrepreneurs say that the secrets to making money are all about cash flow, spending it, and not stashing it in the bank for a rainy day. Be warned, however, that software development is a very expensive and risky venture!

So, the design stage is the next stage following the analysis stage. Design is a process of figuring out how to implement what was discovered during the analysis stage. As described in Chapter 9, analysis is about what needs to be done. Design is about how it should be done. The design stage deals with the following aspects of database model creation:
❑ More precise tables, including practical application of normalization.
❑ Establishment of primary and foreign key fields.
❑ Enforcement of referential integrity by establishing and quantifying precise relationships between tables, using primary and foreign key fields.
❑ Denormalization in the design stage (the sooner the better), particularly in the case of data warehouse table structures.
❑ Alternate and extra indexing in addition to that of referential integrity, primary and foreign keys; however, alternate indexing is more advanced (detailed) design, and is discussed in Chapter 11.
❑ Advanced database structures, such as materialized views, and some specialized types of indexing. Similar to alternate indexing, this is also more advanced (detailed) design, and is discussed in Chapter 11.
❑ Precise field definitions, structure, and their respective datatypes (again advanced design).

The intention of this chapter is to focus on the firm and proper establishment of inter-table relationships, through the application of normalization and denormalization for both OLTP and data warehouse database models.
This process is performed as a case study, continuing with the use of the online auction company introduced in Chapter 9. Let’s create some tables.

Case Study: Creating Tables

In Chapter 9, tables were created on an analytical level, creating basic pictures. Following the basic pictures, simple ERDs were constructed. In this section, basic commands are used to create the initial simple tables, as shown in the analytical process of Chapter 9. The idea is to retain the step-by-step instruction of each concept layer in the database modeling design process, for both OLTP and data warehouse database models. These models are created for the online auction house case study.

The OLTP Database Model

Figure 10-1 shows a simple analytical diagram, covering the various operational aspects of the online auction house OLTP database model. Notice how the BIDS table is connected to both the LISTING and BUYER tables. This is the only table in this database structure that is connected to more than one table and not as part of a hierarchy. Category tables are part of a hierarchy.

[Figure 10-1 diagram residue, tables shown: Primary Category, Secondary Category, Tertiary Category, Buyer History, Listing, Seller History, Seller, Bids, Buyer]

Figure 10-1: Analytical OLTP online auction house database model.

Figure 10-2 shows the application of business rules to the simple analytical diagram shown in Figure 10-1. Once again, notice the dual links for the BIDS table (now called BID for technical accuracy because each record represents a single bid), to both the LISTING and the BUYER tables. This double link represents a [...]...
... as 100.000

The Data Warehouse Database Model

The previous section created very basic tables for the online auction house OLTP database model. Now do exactly the same thing for the data warehouse database model of the online auction house. Figure 10-3 shows a simple analytical diagram displaying the various operational sections for the online auction house data warehouse database model. All the fact information...

... SECONDARY, followed by TERTIARY tables:

CREATE TABLE CATEGORY_PRIMARY(PRIMARY STRING);
CREATE TABLE CATEGORY_SECONDARY(SECONDARY STRING);
CREATE TABLE CATEGORY_TERTIARY(TERTIARY STRING);

Some databases (if not many databases) do not allow use of keywords, such as PRIMARY, or even SECONDARY. PRIMARY could be reserved to represent a primary key and SECONDARY could be reserved to represent secondary (alternate)...

[Figure 10-2 diagram residue: ... popularity_rating, join_date, address; Buyer_History: seller, comment_date, listing#, comments; Bid: bidder, seller, bid_price, bid_date]

Figure 10-2: Business rules OLTP online auction house database model.

The easiest way to create tables in a database model is to create them in a top-down fashion, from static to dynamic tables, gradually introducing more and more dependent detail. In other words, information that...

[Figure 10-3 diagram residue: ... Buyer, Location, Time]

Figure 10-3: Analytical data warehouse online auction house database model.

Multiple star schemas within a single data warehouse are sometimes known as individual data marts.

Figure 10-4 shows the application of business rules to the simple analytical diagram shown in Figure 10-3 for the data warehouse database model. Once again, you must take table dependencies into account. It is significant to observe how the three category tables, shown in Figure 10-2 (the OLTP database model), have been merged into a single hierarchical category table (CATEGORY_HIERARCHY) in the data warehouse model shown in Figure 10-4. This is a form of denormalization, used especially in data warehouse databases to simplify and compress dimensional information use...

... datatype called BINARY, used to store an image. That image could be a JPG, BMP, or any other type of graphic file format. Binary object datatypes allow storage of binary formatted data inside relational databases. The BINARY datatype is not really important to this book; however, storing images in text strings is awkward.

Lastly, create the BID table (the BID table is dependent on the LISTING table):...

[Figure 10-4 diagram residue: ... history_buyer_comments, history_seller, history_seller_comment_date, history_seller_comments; buyer, popularity_rating, join_date, address; Time: month, quarter, year]

Figure 10-4: Business rules data warehouse online auction house database model.

Now you create the tables for the data warehouse model shown in Figure 10-4. In a well-designed data warehouse star schema, there is only one layer of dependence between a single layer of...

... HISTORY_SELLER_COMMENTS BIGSTRING );

This fact table shows the source of all fields as being listing, bidder, buyer history, and seller history.

Tables have been created for the OLTP and data warehouse database models for the online auction house. The next step is to establish and enforce referential integrity.

Case Study: Enforcing Table Relationships

Referential integrity maintains and enforces the data...

... deleting all the seller’s listings first. If the seller is deleted, their listings become orphaned records. An orphaned record is a term applied to a record not findable within the logical table structure of a database model. Essentially, the seller’s name and address details are stored in the SELLER table and not in the LISTING table. If the seller record was deleted, any of their listings are useless because...

... to a child table, whose uniquely identifying value lies in another table (the parent table containing the primary key).

Now demonstrate implementation of primary and foreign keys by re-creating the OLTP database model tables, as shown in Figure 10-2. The ERD in Figure 10-2 has been changed to the ERD shown in Figure 10-5.

[Figure 10-5 diagram residue: Seller: seller; Category_Primary: primary; Category_Secondary: primary (FK), secondary; Category_Tertiary ...]
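The orphaned-record problem described above for SELLER and LISTING can be tried out directly. The following is a minimal sketch using SQLite through Python's standard `sqlite3` module, not the generic commands used in this chapter. The column names, the `listing_id` surrogate key, and the helper function names are assumptions made for illustration; SQLite's `TEXT` type stands in for the chapter's `STRING`.

```python
import sqlite3


def create_schema(conn):
    """Create a parent SELLER table and a child LISTING table."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE seller ("
        " seller TEXT PRIMARY KEY,"
        " address TEXT)"
    )
    conn.execute(
        "CREATE TABLE listing ("
        " listing_id INTEGER PRIMARY KEY,"
        # The foreign key ties every listing to an existing seller.
        " seller TEXT NOT NULL REFERENCES seller(seller),"
        " description TEXT)"
    )


def try_delete_seller(conn, seller):
    """Return True if the delete was allowed, False if listings block it."""
    try:
        conn.execute("DELETE FROM seller WHERE seller = ?", (seller,))
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    conn.execute("INSERT INTO seller VALUES ('alice', '1 Main St')")
    conn.execute(
        "INSERT INTO listing (seller, description) VALUES ('alice', 'lamp')"
    )
    # Blocked: a listing still refers to this seller.
    print(try_delete_seller(conn, "alice"))
    conn.execute("DELETE FROM listing WHERE seller = 'alice'")
    # Allowed: no child records remain to be orphaned.
    print(try_delete_seller(conn, "alice"))
```

With the foreign key in place, the database itself refuses to delete a seller who still has listings, which is the behavior the text describes: the listings must go first, so no orphaned records can be left behind.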