
Beginning Database Design- P16 doc



Chapter 10: Creating and Refining Tables During the Design Phase

In Figure 10-8, the relationship between the CATEGORY_PRIMARY and CATEGORY_SECONDARY tables is one-to-zero, one, or many. This means that a CATEGORY_PRIMARY record can exist with no related CATEGORY_SECONDARY records. Conversely, a CATEGORY_SECONDARY record cannot exist unless it has a related parent CATEGORY_PRIMARY record. Similarly, a seller does not have to have any history (SELLER_HISTORY records), but there can be no SELLER_HISTORY record if there is no seller for that history to be entered against.

Figure 10-8: Parent records can exist and child records are not absolutely required (one-to-zero, one, or many).

Child Records with Optional Parents

A table containing child records with optional parent records is typical of data warehouse fact tables, where not all dimensions need be defined for every fact. This is especially true where facts stem from differing sources, such as BID in Figure 10-8. The result is some fact records with one or more NULL valued foreign keys.

In Figure 10-8, a LISTING table record can be assigned to either a secondary or a tertiary category. Thus, the relationship from each of the CATEGORY_SECONDARY and CATEGORY_TERTIARY tables to the LISTING table is zero or one-to-zero, one, or many. In other words, a listing can be assigned to a secondary or a tertiary category (not both). The result is that for every LISTING record, either the SECONDARY_ID or the TERTIARY_ID field can be NULL valued. Thus, LISTING table records can be said to have optional parents. Optional parents are technically, and more commonly, known as NULL valued foreign keys.
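The idea of optional parents can be tried out in a few lines. The following is only a sketch, using Python's built-in sqlite3 module rather than the generic script syntax of this chapter. The names listing_no and try_insert, the TEXT types, and the CHECK constraint that forbids setting both categories are assumptions made for illustration, not part of the auction house model as drawn.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE category_secondary (secondary_id INTEGER PRIMARY KEY, secondary TEXT);
CREATE TABLE category_tertiary  (tertiary_id  INTEGER PRIMARY KEY, tertiary  TEXT);
CREATE TABLE listing (
    listing_no   TEXT PRIMARY KEY,                       -- stands in for LISTING#
    secondary_id INTEGER REFERENCES category_secondary,  -- nullable: optional parent
    tertiary_id  INTEGER REFERENCES category_tertiary,   -- nullable: optional parent
    -- assumption: a listing may not name both categories at once
    CHECK (secondary_id IS NULL OR tertiary_id IS NULL)
);
INSERT INTO category_secondary VALUES (1, 'Cameras');
INSERT INTO category_tertiary  VALUES (10, 'Digital SLR');
""")


def try_insert(listing_no, secondary_id, tertiary_id):
    """Return True if the listing row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO listing VALUES (?, ?, ?)",
                (listing_no, secondary_id, tertiary_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("L1", 1, None),     # secondary category only
    try_insert("L2", None, 10),    # tertiary category only
    try_insert("L3", None, None),  # no category: both parents are optional
    try_insert("L4", 1, 10),       # both categories: rejected by the CHECK
    try_insert("L5", 99, None),    # unknown parent: rejected by the foreign key
]
print(results)
```

A NULL foreign key value is simply skipped by the referential integrity check, which is what makes the parent optional; a non-NULL value must still match an existing parent record.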
[Figure 10-8 ERD: the Seller, Seller_History, Buyer, Buyer_History, Category_Primary, Category_Secondary, Category_Tertiary, Listing, and Bid tables.]

The OLTP Database Model with Referential Integrity

The final step in this section on enforcing table relationships is to create the tables. Following is a sample script for creating tables for the OLTP database model of the online auction house. In this version, all primary and foreign key definitions are included, to enforce referential integrity:

    CREATE TABLE CATEGORY_PRIMARY
    (
        PRIMARY_ID INTEGER PRIMARY KEY,
        PRIMARY STRING
    );

    CREATE TABLE CATEGORY_SECONDARY
    (
        SECONDARY_ID INTEGER PRIMARY KEY,
        PRIMARY_ID INTEGER FOREIGN KEY REFERENCES CATEGORY_PRIMARY,
        SECONDARY STRING
    );

    CREATE TABLE CATEGORY_TERTIARY
    (
        TERTIARY_ID INTEGER PRIMARY KEY,
        SECONDARY_ID INTEGER FOREIGN KEY REFERENCES CATEGORY_SECONDARY,
        TERTIARY STRING
    );

The CATEGORY_PRIMARY table has no foreign keys. The CATEGORY_TERTIARY table has no direct link to the CATEGORY_PRIMARY table, as a result of surrogate key use and non-identifying relationships.

A foreign key specification references the parent table, not the parent table primary key. The parent table already “knows” what its primary key field is.
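As a quick check of the point about referencing the parent table rather than its key field, here is a minimal sketch using Python's sqlite3 module. SQLite is one engine that accepts a REFERENCES clause naming only the parent table and resolves it to that table's primary key. The TEXT types and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript("""
CREATE TABLE category_primary (
    primary_id INTEGER PRIMARY KEY,
    name       TEXT
);
CREATE TABLE category_secondary (
    secondary_id INTEGER PRIMARY KEY,
    -- only the parent table is named; its primary key is implied
    primary_id   INTEGER NOT NULL REFERENCES category_primary,
    name         TEXT
);
""")

conn.execute("INSERT INTO category_primary VALUES (1, 'Electronics')")
conn.execute("INSERT INTO category_secondary VALUES (1, 1, 'Cameras')")
conn.commit()

# A child row pointing at a parent that does not exist must be refused.
try:
    conn.execute("INSERT INTO category_secondary VALUES (2, 42, 'Orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
conn.rollback()

child_count = conn.execute("SELECT COUNT(*) FROM category_secondary").fetchone()[0]
print(orphan_rejected, child_count)
```

Other database engines may require the parent key field to be spelled out, so treat the short form as engine-specific.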
    CREATE TABLE SELLER
    (
        SELLER_ID INTEGER PRIMARY KEY,
        SELLER STRING,
        POPULARITY_RATING INTEGER,
        JOIN_DATE DATE,
        ADDRESS STRING,
        RETURN_POLICY STRING,
        INTERNATIONAL STRING,
        PAYMENT_METHODS STRING
    );

    CREATE TABLE BUYER
    (
        BUYER_ID INTEGER PRIMARY KEY,
        BUYER STRING,
        POPULARITY_RATING INTEGER,
        JOIN_DATE DATE,
        ADDRESS STRING
    );

The SELLER and BUYER tables are at the top of the hierarchy, so they have no foreign keys.

The SELLER_HISTORY and BUYER_HISTORY tables are incorrect as shown in Figure 10-8 because the lone foreign key is also the primary key. A primary key value must be unique across all records in a table, so with the structure as it is in Figure 10-8, each seller and each buyer would be restricted to a single history record.

One solution is shown in Figure 10-9, with the script following it. In Figure 10-9, the primary key becomes a composite of the non-unique SELLER_ID or BUYER_ID, plus a subsidiary SEQ# (counting sequence number). The sequence number counts upwards from 1 for each seller and buyer (with history records). So, if one buyer has 10 history entries, that buyer’s SEQ# values would be 1 to 10 for those 10 records. Similarly, a second buyer with 25 history records would have SEQ# values of 1 to 25.

Figure 10-9: Non-unique table records become unique using subsidiary sequence auto counters.
[Figure 10-9 ERD: the same tables as Figure 10-8, with seq# added next to seller_id (FK) in Seller_History and next to buyer_id (FK) in Buyer_History.]

Following is the script for Figure 10-9:

    CREATE TABLE SELLER_HISTORY
    (
        SELLER_ID INTEGER FOREIGN KEY REFERENCES SELLER NOT NULL,
        SEQ# INTEGER NOT NULL,
        BUYER STRING,
        COMMENT_DATE DATE,
        LISTING# STRING,
        COMMENTS STRING,
        CONSTRAINT PRIMARY KEY(SELLER_ID, SEQ#)
    );

    CREATE TABLE BUYER_HISTORY
    (
        BUYER_ID INTEGER FOREIGN KEY REFERENCES BUYER NOT NULL,
        SEQ# INTEGER NOT NULL,
        SELLER STRING,
        COMMENT_DATE DATE,
        LISTING# STRING,
        COMMENTS STRING,
        CONSTRAINT PRIMARY KEY(BUYER_ID, SEQ#)
    );

The SELLER_ID, BUYER_ID, and SEQ# fields have all been specifically declared as NOT NULL. This means that a value must be entered into these fields when creating a new record (or changing an existing record). All the fields in a composite (multiple field) primary key must be declared as NOT NULL. This ensures uniqueness of the resulting composite primary key.

A primary key declared on more than one field (a composite key) cannot be specified inline with a single field definition, because there is more than one field. Instead, the primary key is declared out-of-line from the field definitions, as a separate constraint. This is why the NOT NULL specification has to be given explicitly for each of the primary key fields.
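The composite key and its counting sequence can be sketched as follows, again with Python's sqlite3 module. This is an illustration under assumptions: the field is called seq because # is not valid in an unquoted SQLite name, add_history is a hypothetical helper, and computing the next number as MAX + 1 is only one simple way to produce the counter (it is not safe against concurrent writers without further locking). The last part shows why the explicit NOT NULL matters in at least one engine: SQLite's documentation notes that an ordinary table accepts NULL in a composite primary key field unless NOT NULL is declared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE seller (seller_id INTEGER PRIMARY KEY, seller TEXT);
CREATE TABLE seller_history (
    seller_id INTEGER NOT NULL REFERENCES seller,
    seq       INTEGER NOT NULL,      -- stands in for SEQ#
    comments  TEXT,
    PRIMARY KEY (seller_id, seq)     -- composite key, declared out-of-line
);
INSERT INTO seller VALUES (1, 'alice'), (2, 'bob');
""")


def add_history(seller_id, comments):
    """Insert a history row numbered 1, 2, 3, ... within its own seller."""
    with conn:
        conn.execute(
            """INSERT INTO seller_history (seller_id, seq, comments)
               SELECT ?, COALESCE(MAX(seq), 0) + 1, ?
               FROM seller_history WHERE seller_id = ?""",
            (seller_id, comments, seller_id),
        )


for text in ("fast shipping", "good packaging", "as described"):
    add_history(1, text)
add_history(2, "slow to respond")

rows = conn.execute(
    "SELECT seller_id, seq FROM seller_history ORDER BY seller_id, seq"
).fetchall()
print(rows)  # each seller's sequence restarts at 1


def null_key_rejected():
    """A NULL in a NOT NULL key field must be refused."""
    try:
        with conn:
            conn.execute("INSERT INTO seller_history VALUES (1, NULL, 'no sequence')")
        return False
    except sqlite3.IntegrityError:
        return True


# Same composite key without NOT NULL: SQLite lets the NULL through.
conn.execute(
    "CREATE TABLE loose_history (seller_id INTEGER, seq INTEGER, PRIMARY KEY (seller_id, seq))"
)
conn.execute("INSERT INTO loose_history VALUES (1, NULL)")
conn.commit()
loose_null_rows = conn.execute(
    "SELECT COUNT(*) FROM loose_history WHERE seq IS NULL"
).fetchone()[0]
```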
Another more elegant, but perhaps more mathematical and somewhat confusing, solution is to create a surrogate key for the BUYER_HISTORY and SELLER_HISTORY tables as well. The result, shown in Figure 10-10, is non-identifying relationships from the SELLER to SELLER_HISTORY tables, and from the BUYER to BUYER_HISTORY tables.

Figure 10-10: Non-unique table records can become unique by using surrogate key auto counters.

[Figure 10-10 ERD: the Seller, Seller_History, Buyer, Buyer_History, Category_Primary, Category_Secondary, Category_Tertiary, Listing, and Bid tables; Seller_History and Buyer_History now carry the surrogate keys seller_history_id and buyer_history_id, each with seller_id (FK) and buyer_id (FK).]

As a reminder, a non-identifying relationship is when the parent table primary key is not part of the primary key in the child table. A child table record is not specifically or uniquely identified by a parent table record.

Further changes in Figure 10-10 are as follows:

❑ The addition of relationships between the SELLER and BUYER_HISTORY tables, and between the BUYER and SELLER_HISTORY tables. Every history record of seller activity is related to something purchased from that seller (by a buyer). The same applies to buyers.

❑ A buyer can be a seller as well, and vice versa. This database model is beginning to look a little messy.
Following is the script for changes introduced in Figure 10-10:

    CREATE TABLE SELLER_HISTORY
    (
        SELLER_HISTORY_ID INTEGER PRIMARY KEY,
        SELLER_ID INTEGER FOREIGN KEY REFERENCES SELLER,
        BUYER_ID INTEGER FOREIGN KEY REFERENCES BUYER,
        COMMENT_DATE DATE,
        COMMENTS STRING
    );

    CREATE TABLE BUYER_HISTORY
    (
        BUYER_HISTORY_ID INTEGER PRIMARY KEY,
        BUYER_ID INTEGER FOREIGN KEY REFERENCES BUYER,
        SELLER_ID INTEGER FOREIGN KEY REFERENCES SELLER,
        COMMENT_DATE DATE,
        COMMENTS STRING
    );

Figure 10-11 shows further refinement for the BID table.

Essentially, this section has included some specific analysis-design reworking. Some things can’t be assessed accurately by simple analysis. Don’t mistake the fiddling with relationships, and with specific fields for primary and foreign keys, for normalization or denormalization. Note, however, that some extensive normalization has occurred merely by the establishment of one-to-many relationships. This normalization activity has actually been performed from an analytical perspective.

Figure 10-11: Refining the BID table and related tables.

Figure 10-11 has essentially redefined the BID table and made a few other small necessary changes. These changes all make analytical sense and don’t need normalization. Potential buyers place bids on auction listings, but do not necessarily win the auction; however, the history of all bids for a listing needs to be retained. The BID table, therefore, contains all bids for a listing, including all losing bids and the final winning bid. The final winning bid is recorded as the winning bid by setting the BUYER_ID field in the LISTING table. As a result, a losing buyer is not stored in the LISTING table as a buyer because he or she is only a bidder (a buyer is a bidder who wins the listing).
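How the winning bid might be recorded can be sketched with Python's sqlite3 module. Everything specific here is an assumption for illustration: close_listing is a hypothetical helper, the highest BID_PRICE is taken as the winner, reserve prices and ties are ignored, listing_no stands in for LISTING#, and NUMERIC stands in for MONEY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE buyer (buyer_id INTEGER PRIMARY KEY, buyer TEXT);
CREATE TABLE listing (
    listing_no    TEXT PRIMARY KEY,
    buyer_id      INTEGER REFERENCES buyer,  -- NULL until a bidder wins
    winning_price NUMERIC
);
CREATE TABLE bid (
    bidder_id  INTEGER NOT NULL REFERENCES buyer,
    listing_no TEXT    NOT NULL REFERENCES listing,
    bid_price  NUMERIC NOT NULL,
    PRIMARY KEY (bidder_id, listing_no)
);
INSERT INTO buyer VALUES (1, 'ann'), (2, 'ben'), (3, 'cat');
INSERT INTO listing (listing_no) VALUES ('A100'), ('B200');
INSERT INTO bid VALUES (1, 'A100', 25), (2, 'A100', 40), (3, 'A100', 32);
""")


def close_listing(listing_no):
    """Copy the highest bid into LISTING; leave the listing untouched if nobody bid."""
    top = conn.execute(
        "SELECT bidder_id, bid_price FROM bid "
        "WHERE listing_no = ? ORDER BY bid_price DESC LIMIT 1",
        (listing_no,),
    ).fetchone()
    if top is None:
        return None
    with conn:
        conn.execute(
            "UPDATE listing SET buyer_id = ?, winning_price = ? WHERE listing_no = ?",
            (top[0], top[1], listing_no),
        )
    return top[0]


winner_a = close_listing("A100")  # the bidder with the highest bid
winner_b = close_listing("B200")  # no bids were placed on this listing
listings = conn.execute(
    "SELECT listing_no, buyer_id, winning_price FROM listing ORDER BY listing_no"
).fetchall()
bids_kept = conn.execute("SELECT COUNT(*) FROM bid").fetchone()[0]
print(winner_a, winner_b, listings, bids_kept)
```

The losing bids stay in the bid table as history; only the winner is copied to the listing.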
[Figure 10-11 ERD: the Seller, Seller_History, Buyer, Buyer_History, Category_Primary, Category_Secondary, Category_Tertiary, Listing, and Bid tables; Bid holds bidder_id (FK), listing# (FK), bid_price, and bid_date.]

The results are as follows:

❑ LISTING to BID is one-to-zero, one, or many: A listing that has just been listed is likely to have no bids. Also, an unpopular listing may have no bids placed over the life of the auction listing. It is still a listing, but it has no bids.

❑ BUYER to LISTING is zero or one to zero, one, or many: A losing buyer is only a bidder and not the winning bidder. Losing bidders are not entered into the LISTING table as buyers because they lost the auction.

❑ BUYER to BID is one-to-one or many (zero is not allowed): A bid cannot exist without a potential buyer. This item is not a change, but is listed as a reinforcing explanation of the relationships between the BID, BUYER, and LISTING tables.

You don’t actually have to apply the rules of normalization, using Normal Forms, to create a first pass of a database model. So far, it’s all been common sense. This is one of the reasons why these final chapters are presented as a case study example. This case study is not an application of theory, by applying normalization and Normal Forms, to a bucket of information. A bucket of information implies a whole pile of things thrown into a heap, on the floor in front of you.
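The relationships between the LISTING, BID, and BUYER tables can be observed directly in a small sketch using Python's sqlite3 module. The sample rows, the listing_no name (standing in for LISTING#), and the NUMERIC type are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE buyer (buyer_id INTEGER PRIMARY KEY, buyer TEXT);
CREATE TABLE listing (
    listing_no TEXT PRIMARY KEY,
    buyer_id   INTEGER REFERENCES buyer      -- zero or one winning buyer
);
CREATE TABLE bid (
    bidder_id  INTEGER NOT NULL REFERENCES buyer,    -- a bid always has a bidder
    listing_no TEXT    NOT NULL REFERENCES listing,
    bid_price  NUMERIC,
    PRIMARY KEY (bidder_id, listing_no)
);
INSERT INTO buyer VALUES (1, 'ann'), (2, 'ben');
INSERT INTO listing VALUES ('A100', NULL), ('B200', NULL);
INSERT INTO bid VALUES (1, 'A100', 25), (2, 'A100', 40);
""")

# LISTING to BID is one-to-zero, one, or many: the outer join keeps
# listings that have no bids and reports a count of zero for them.
bid_counts = conn.execute("""
    SELECT l.listing_no, COUNT(b.bidder_id)
    FROM listing l
    LEFT JOIN bid b ON b.listing_no = l.listing_no
    GROUP BY l.listing_no
    ORDER BY l.listing_no
""").fetchall()

# BUYER to BID does not allow zero: a bid without a bidder is refused.
try:
    conn.execute("INSERT INTO bid VALUES (NULL, 'B200', 10)")
    anonymous_bid_rejected = False
except sqlite3.IntegrityError:
    anonymous_bid_rejected = True
conn.rollback()

print(bid_counts, anonymous_bid_rejected)
```

An inner join in place of the LEFT JOIN would silently drop the listing with no bids, which is the usual pitfall when a relationship allows zero children.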
The result is the following script for creating the LISTING and BID tables:

    CREATE TABLE LISTING
    (
        LISTING# STRING PRIMARY KEY,
        BUYER_ID INTEGER FOREIGN KEY REFERENCES BUYER WITH NULL,
        SELLER_ID INTEGER FOREIGN KEY REFERENCES SELLER,
        SECONDARY_ID INTEGER FOREIGN KEY REFERENCES CATEGORY_SECONDARY WITH NULL,
        TERTIARY_ID INTEGER FOREIGN KEY REFERENCES CATEGORY_TERTIARY WITH NULL,
        DESCRIPTION STRING,
        IMAGE BINARY,
        START_DATE DATE,
        LISTING_DAYS INTEGER,
        CURRENCY STRING,
        STARTING_PRICE MONEY,
        RESERVE_PRICE MONEY,
        BUY_NOW_PRICE MONEY,
        NUMBER_OF_BIDS INTEGER,
        WINNING_PRICE MONEY
    );

The BUYER_ID field is specified as a WITH NULL foreign key, indicating that LISTING records only contain the BUYER_ID for the winning bidder. If no one bids, then a LISTING record will never have a BUYER_ID value. The SECONDARY_ID and TERTIARY_ID category fields are also listed as WITH NULL foreign key fields because either is allowed (not both).

    CREATE TABLE BID
    (
        BIDDER_ID INTEGER FOREIGN KEY REFERENCES BUYER,
        LISTING# STRING FOREIGN KEY REFERENCES LISTING,
        BID_PRICE MONEY,
        BID_DATE DATE,
        CONSTRAINT PRIMARY KEY(BIDDER_ID, LISTING#)
    );

The BIDDER_ID field references the BUYER table because a bidder is a potential buyer, and the LISTING# field has the same STRING datatype as the LISTING table primary key it references.

If tables from earlier versions of the script already exist, the CREATE TABLE commands would have to be preceded by DROP TABLE commands for all tables, preferably in reverse order to that of creation. Some databases will allow changes to primary and foreign keys using ALTER TABLE commands. Some databases even allow these changes to be made directly in an ERD GUI tool. Microsoft Access allows these changes to be made very easily, using a GUI.

The Data Warehouse Database Model with Referential Integrity

The data warehouse database model for the online auction house is altered slightly in Figure 10-12, including the addition of surrogate primary keys to all tables. All of the dimension-to-fact table relationships are zero or one, to zero, one, or many. This indicates that the fact table contains a mixture of multiple fact sources (multiple transaction types, including listings, bids, and histories).
Essentially, a fact table does not absolutely have to contain all fields from all records, for all facts. In other words, fact records do not always have to contain location (LOCATION table) information, for example. The LOCATION_ID foreign key in the fact table, which references the LOCATION table, can contain NULL values.

Figure 10-12: The data warehouse database model ERD.

[Figure 10-12 ERD: the Listing_Bids_History fact table (fact_id; foreign keys time_id, buyer_id, location_id, seller_id, and category_id; plus listing, bidder, and history detail fields) with the Seller, Buyer, Location, Time, and Category_Hierarchy dimension tables.]

A script to create the tables shown in Figure 10-12 is as follows:

    CREATE TABLE CATEGORY_HIERARCHY
    (
        CATEGORY_ID INTEGER PRIMARY KEY,
        PARENT_ID INTEGER FOREIGN KEY REFERENCES CATEGORY_HIERARCHY WITH NULL,
        CATEGORY STRING
    );

The PARENT_ID field points at a possible parent category. If there is no parent, then the PARENT_ID is NULL valued (WITH NULL). Primary categories will have NULL valued PARENT_ID fields.
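A self-referencing table such as CATEGORY_HIERARCHY is typically read back by following PARENT_ID values upward until a NULL is reached. The following sketch uses Python's sqlite3 module and a recursive query; the sample categories and the path_to_root helper are assumptions for illustration, and recursive query support and syntax vary between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE category_hierarchy (
    category_id INTEGER PRIMARY KEY,
    parent_id   INTEGER REFERENCES category_hierarchy,  -- NULL for primary categories
    category    TEXT
);
INSERT INTO category_hierarchy VALUES
    (1, NULL, 'Electronics'),
    (2, 1,    'Cameras'),
    (3, 2,    'Digital SLR'),
    (4, NULL, 'Books');
""")


def path_to_root(category_id):
    """Return category names from the primary category down to the given one."""
    rows = conn.execute(
        """
        WITH RECURSIVE chain(category_id, parent_id, category, depth) AS (
            SELECT category_id, parent_id, category, 0
            FROM category_hierarchy
            WHERE category_id = ?
            UNION ALL
            SELECT p.category_id, p.parent_id, p.category, c.depth + 1
            FROM category_hierarchy p
            JOIN chain c ON p.category_id = c.parent_id
        )
        SELECT category FROM chain ORDER BY depth DESC
        """,
        (category_id,),
    ).fetchall()
    return [name for (name,) in rows]


# Primary categories are exactly the rows with a NULL valued PARENT_ID.
primary_categories = [
    name
    for (name,) in conn.execute(
        "SELECT category FROM category_hierarchy "
        "WHERE parent_id IS NULL ORDER BY category_id"
    )
]

print(path_to_root(3))
print(primary_categories)
```

The single table replaces the three fixed category tables of the OLTP model, at the cost of needing a recursive (or repeated) lookup to rebuild a full category path.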
Data warehouse database model SELLER and BUYER tables are the same as for the OLTP database model:

    CREATE TABLE SELLER
    (
        SELLER_ID INTEGER PRIMARY KEY,
        SELLER STRING,
        POPULARITY_RATING INTEGER,
        JOIN_DATE DATE,
        ADDRESS STRING,
        RETURN_POLICY STRING,
        INTERNATIONAL STRING,
        PAYMENT_METHODS STRING
    );

    CREATE TABLE BUYER
    (
        BUYER_ID INTEGER PRIMARY KEY,
        BUYER STRING,
        POPULARITY_RATING INTEGER,
        JOIN_DATE DATE,
        ADDRESS STRING
    );

The LOCATION and TIME tables are as follows:

    CREATE TABLE LOCATION
    (
        LOCATION_ID INTEGER PRIMARY KEY,
        REGION STRING,
        COUNTRY STRING,
        STATE STRING,
        CITY STRING
    );

    CREATE TABLE TIME
    (
        TIME_ID INTEGER PRIMARY KEY,
        MONTH STRING,
        QUARTER STRING,
        YEAR STRING
    );

Finally, the single fact table has optional dimensions for all but sellers, and thus all foreign keys (except the SELLER_ID foreign key) are declared as WITH NULL fields:

    CREATE TABLE LISTING_BIDS_HISTORY
    (
        FACT_ID INTEGER PRIMARY KEY,
        CATEGORY_ID INTEGER FOREIGN KEY REFERENCES CATEGORY_HIERARCHY WITH NULL,
        TIME_ID INTEGER FOREIGN KEY REFERENCES TIME WITH NULL,
        LOCATION_ID INTEGER FOREIGN KEY REFERENCES LOCATION WITH NULL,
        BUYER_ID INTEGER FOREIGN KEY REFERENCES BUYER WITH NULL,
        SELLER_ID INTEGER FOREIGN KEY REFERENCES SELLER
    );

That is how referential integrity is enforced using primary and foreign keys. There are other ways of enforcing referential integrity, such as using stored procedures, event triggers, or even application code. These methods are, of course, not necessarily built-in database model business rules. Even so, primary and foreign keys are a direct application of business rules in the database model and thus are the only relevant topic.

So, where do normalization and denormalization come in here?

Normalization and Denormalization

Normalization divides things up into smaller pieces.
Denormalization does the opposite of normalization by reconstituting those little-bitty pieces back into larger pieces. When implementing normalization and denormalization, there are a number of general conceptual approaches to consider:

❑ Don’t go overboard with normalization for an OLTP database model.

❑ Don’t be afraid to denormalize, even in an OLTP database model.

❑ Generally, an OLTP database model is normalized and a data warehouse model is denormalized. Doing the opposite to each database model is usually secondary, and usually a result of going too far initially in the opposite direction.

❑ An OLTP database model should be normalized and a data warehouse database model should be denormalized, where appropriate.

At this stage, to maintain the case study approach, it makes sense to continue with use of the online auction company to go through the normalization and denormalization process in detail. Use of the term “detail” implies going through the whole process from scratch once again, as for analysis in Chapter 9, but executing the process using the mathematical approach (normalization), rather than the analytical approach.

[...]

Case Study: Normalizing an OLTP Database Model

Figure 10-11 shows the most recent version of the OLTP database model for the online auction house. From an operational perspective, you identified categories, sellers, buyers, listings, bids, seller histories, and buyer histories... be to begin from scratch but perhaps to identify mathematically (using normalization) which Normal Forms were applied to create the OLTP database model shown in Figure 10-11.

Figure 10-13 identifies the different layers of normalization applied to create the OLTP database model shown in Figure 10-11.

[Figure 10-13 ERD: Seller, Category_Primary...]
... example

Beyond 3NF the Easy Way

Many commercial relational database models do not extend beyond 3NF. Sometimes 3NF is not used. The simple rule is not to make a rule out of it. Commerce requires flexibility, not rigidity. The reason is the generation of too many tables, and the resulting complex SQL code joins, with resulting terrible database response times. One common case that bears mentioning... seen a relational database that physically allows Data Definition Language (DDL) commands that permit the creation of a table with more than one primary key per table. Using a CREATE TABLE command or even a table-design GUI (such as in Microsoft Access), more than one primary key is simply not an option. In other words, your Microsoft Access table-modeling GUI and your Oracle or SQL Server database DDL commands...

Figure 10-19: Over application of BCNF for the online auction house OLTP database model.

The items shown in Figure 10-19 are obviously all very extreme applications of BCNF, and normalization in general. Perhaps it would suffice to say that the various Normal Forms, and particularly those beyond 3NF, are not the be-all and end-all of database model design. The extremes of application of BCNF are unnecessary... very likely the case for the online auction house OLTP database model, as shown in Figure 10-19; however, it does look good from a mathematical perspective.

Let’s demonstrate by building a simple query to return all sellers of specific popularity rating, for all their listings and histories, listed on a particular date. Using the database model depicted in Figure 10-18, the query could...
... ‘12-DEC-2004’;

This query joins three tables. Query optimization procedures in any database have a much easier task deciding how best to execute a query joining only three tables than figuring out an efficient method of joining the eight tables shown in the following query (based on the BCNF normalized database model shown in Figure 10-19):

    SELECT * FROM SELLER S JOIN SELLER_NAME SN...

Figure 10-17: Identifying 3NF normalization for the online auction house OLTP database model.

Normal Forms are not necessarily applied as 1NF, 2NF, 3NF, but could be iterative, such as 1NF, 2NF, 3NF, 1NF, 3NF. Purists will state... is likely because the sequential application of Normal Form layers applies more to individual tables (or small groupings of tables). On the contrary, normalization is not typically applied to an entire database model, at least not for all tables at the same time.

Deeper Normalization Layers

Now examine various cases for applying Normal Forms beyond 3NF to the current structure as represented in Figure 10-18.

Figure 10-18: The online auction house OLTP database model normalized to 3NF.

BCNF

Boyce-Codd Normal Form (BCNF): Every determinant in a table is a candidate key. If there is only one candidate key, 3NF and Boyce-Codd normal form are one and the ...

Date posted: 03/07/2014, 01:20
