CHAPTER 6 Logical Database Design Using Normalization
Databases Demystified / Oppel / 225364-9 / Chapter 6

effort is underway, which includes building integrated application and database systems to perform basic business functions.

The User Views

UTLA wishes to construct a system to track their academic activities, including course offerings, instructor qualifications for the courses, course enrollment, and student grades. The following illustrations show the desired output reports with sample data (these are the user views that should be normalized).

[Illustrations: Student report, Course report, Instructor report, Section report]

One cannot design a database without some knowledge of the business rules and processes of an organization. Here are a few such items to keep in mind:

• Only one mailing address and one contact phone number are kept for each student.
• Each course has a fixed number of credits (that is, there are no variable credit courses).
• Each course may have one or more prerequisite courses. The list of all prerequisite courses for each course is shown in the Course report.
• Only one mailing address, one home phone number, and one office phone number are kept for each instructor.
• A qualifications committee must approve instructors before they are permitted to teach a particular course. The qualifications (that is, the courses that the committee has determined the instructor is qualified to teach) are then added to the instructor’s records, as shown in the Instructor report. The list of qualified courses does not imply that the instructor has ever actually taught the course but only that he or she is qualified to do so.
• Based on demand, any course may be offered multiple times, even in the same year and semester. Each offering is called a “section,” as shown in the Section report.
• Students enroll in a particular section of a course and receive a grade for their participation in that course offering. Should they take the course again at a later time, they receive another grade, and both grades are part of their permanent academic record.
• Although the day, time, building, and room for each section are noted in the Section report, this is done merely to facilitate registering students. The scheduling of classrooms is out of scope for this project.
• The day(s) and time(s) attributes on the Section report are merely text descriptions of the meeting schedule. The building of a meeting calendar for sections is out of scope for this project.

As a convenience, here are the attributes rewritten using our relation listing method, with repeating groups and multivalued attributes enclosed in parentheses:

STUDENT REPORT: # ID, NAME, STREET ADDRESS, CITY, STATE, ZIP CODE, HOME PHONE
COURSE REPORT: # ID, TITLE, NUMBER OF CREDITS, (PREREQUISITE COURSES), DESCRIPTION
INSTRUCTOR REPORT: # ID, NAME, STREET ADDRESS, CITY, STATE, ZIP CODE, HOME PHONE, OFFICE PHONE, (QUALIFIED COURSES)
SECTION REPORT: YEAR, SEMESTER, BUILDING, ROOM, DAYS, TIMES, INSTRUCTOR ID, INSTRUCTOR NAME, COURSE ID, NUMBER OF CREDITS, (STUDENT ID, STUDENT NAME, GRADE)

Author’s Solution

Database design is not an exact science, so there is some latitude for alternative solutions. However, all must meet the criteria for third normal form.
Here are the normalized relations, with the hash mark (#) denoting primary key attributes:

COURSE: # COURSE ID, TITLE, DESCRIPTION, NUMBER OF CREDITS
INSTRUCTOR: # INSTRUCTOR ID, NAME, HOME ADDRESS STREET, HOME ADDRESS CITY, HOME ADDRESS STATE, HOME ADDRESS ZIP CODE, HOME PHONE, OFFICE PHONE
COURSE SECTION: # SECTION ID, YEAR, SEMESTER, COURSE ID, BUILDING, ROOM, MEETING DAY, MEETING TIME, INSTRUCTOR ID
STUDENT: # STUDENT ID, NAME, HOME ADDRESS, CITY, STATE, ZIP CODE, PHONE
STUDENT SECTION: # STUDENT ID, # SECTION ID, GRADE
COURSE PREREQUISITE: COURSE ID, PREREQUISITE COURSE ID
COURSE INSTRUCTOR QUALIFIED: INSTRUCTOR ID, COURSE ID

A few notes on this particular solution are in order:

• There was no simple natural key for the Course Section relation, so a surrogate key was added.
• The Course Prerequisite relation can be quite confusing. This is the intersection relation for a many-to-many recursive relationship. A course can have many prerequisites, which may be found by joining COURSE ID in the COURSE relation with COURSE ID in the COURSE PREREQUISITE relation. At the same time, any course may be a prerequisite for many other courses. These may be found by joining COURSE ID in the COURSE relation with PREREQUISITE COURSE ID in the COURSE PREREQUISITE relation. This means that there are two relationships between the COURSE and COURSE PREREQUISITE relations: one where COURSE ID is the foreign key and another where PREREQUISITE COURSE ID is the foreign key. Comparing the upcoming illustrations for the COURSE and COURSE_PREREQUISITE tables should help make this point clear.
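
The two joins described in the note on the Course Prerequisite relation can be tried out in a few lines. What follows is only a minimal sketch, not the author's implementation (the book uses Microsoft Access): it uses Python's built-in sqlite3 module, and the table and column names, the function names, the composite primary key on COURSE_PREREQUISITE, and the three sample courses are all assumptions made for illustration.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with COURSE and COURSE_PREREQUISITE."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless asked to.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE course (
            course_id         TEXT PRIMARY KEY,
            title             TEXT NOT NULL,
            number_of_credits INTEGER NOT NULL
        );
        -- Intersection relation: each column is a foreign key to course.
        -- Assumption: the two columns together form the primary key.
        CREATE TABLE course_prerequisite (
            course_id              TEXT NOT NULL REFERENCES course (course_id),
            prerequisite_course_id TEXT NOT NULL REFERENCES course (course_id),
            PRIMARY KEY (course_id, prerequisite_course_id)
        );
        """
    )
    # Hypothetical sample data, not taken from the book's reports.
    conn.executemany(
        "INSERT INTO course VALUES (?, ?, ?)",
        [
            ("C101", "Intro to Databases", 3),
            ("C201", "Data Modeling", 3),
            ("C301", "Database Administration", 4),
        ],
    )
    conn.executemany(
        "INSERT INTO course_prerequisite VALUES (?, ?)",
        [("C201", "C101"), ("C301", "C101"), ("C301", "C201")],
    )
    conn.commit()
    return conn


def prerequisites_of(conn, course_id):
    """First relationship: join COURSE to COURSE_PREREQUISITE on course_id."""
    rows = conn.execute(
        """
        SELECT p.prerequisite_course_id
        FROM course AS c
        JOIN course_prerequisite AS p ON p.course_id = c.course_id
        WHERE c.course_id = ?
        ORDER BY p.prerequisite_course_id
        """,
        (course_id,),
    ).fetchall()
    return [row[0] for row in rows]


def courses_requiring(conn, course_id):
    """Second relationship: join COURSE on prerequisite_course_id instead."""
    rows = conn.execute(
        """
        SELECT p.course_id
        FROM course AS c
        JOIN course_prerequisite AS p ON p.prerequisite_course_id = c.course_id
        WHERE c.course_id = ?
        ORDER BY p.course_id
        """,
        (course_id,),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    demo = build_demo()
    print(prerequisites_of(demo, "C301"))   # prerequisites of C301
    print(courses_requiring(demo, "C101"))  # courses that list C101 as a prerequisite
```

The only difference between the two queries is which column of COURSE_PREREQUISITE is matched against COURSE ID, which is what it means for the intersection relation to carry two foreign keys to the same parent relation.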
To assist you in visualizing how all this works, the following illustrations show each of the tables as implemented in a Microsoft Access database, each loaded with the data from the original user view (report) examples. Figure 6-5 shows the ERD for the solution, using the Microsoft Relationships panel as the presentation medium.

[Illustrations: COURSE table, INSTRUCTOR table, COURSE_SECTION table, STUDENT table, STUDENT_SECTION table, COURSE_PREREQUISITE table, COURSE_INSTRUCTOR_QUALIFIED table]

Figure 6-5 ERD (Relationships panel)

Computer Books Company

The Computer Books Company (CBC) buys books from publishers and sells them to individuals via mail and telephone orders. They are looking to expand their services by offering online ordering via the Internet, and in doing so, have a compelling need to build a database to hold their business information.

The User Views

Throughout these user views, “sale” and “price” are references to the retail sale of a book to a CBC customer, whereas “purchase” and “cost” are references to the purchase of books from a publisher (CBC supplier). Each user view is described briefly with a list of the attributes in the view following each description. Per our convention, multivalued attributes and repeating groups are enclosed in parentheses.
The Book Catalog lists all the books that CBC has for sale. Each book is uniquely identified by the International Standard Book Number (ISBN). Although an ISBN uniquely identifies a book, it is essentially a surrogate key, so there is no way to tell what edition a particular book is simply by looking at the ISBN. When new editions come out, CBC typically has leftover stock of prior editions and offers them at a reduced price. The previous edition code in the Book Catalog is intended to help the buyer find the prior edition, if there is one. Books are organized by subject, with each book having only one subject. Any book may have multiple authors. (Although the catalog shows only author names, keep in mind that people’s names are seldom unique, and nothing would stop two people with the same name from both writing books.) Here is the information in the Book Catalog:

BOOK CATALOG: SUBJECT CODE, SUBJECT DESCRIPTION, BOOK TITLE, BOOK ISBN, BOOK PRICE, PREVIOUS EDITION ISBN, PREVIOUS EDITION PRICE, (BOOK AUTHORS), PUBLISHER NAME

The Book Inventory Report helps the warehouse manager control the inventory in the warehouse. The Recommended Quantity is the reorder point, meaning when on-hand inventory falls below the recommended quantity, it is time to order more books of that title.

INVENTORY REPORT: BOOK ISBN, BOOK EDITION CODE, COST, SELLING PRICE, QUANTITY ON HAND, QUANTITY ON ORDER, RECOMMENDED QUANTITY

The Customer Book Orders view shows orders placed by CBC customers for purchases of books:

CUSTOMER BOOK ORDERS: CUSTOMER ID, CUSTOMER NAME, STREET ADDRESS, CITY, STATE, ZIP CODE, (ISBN, BOOK EDITION CODE, QUANTITY, PRICE), ORDER DATE, TOTAL PRICE

CBC bills customers as books are shipped. An invoice is created for each shipment. (An order can have zero, one, or more invoices, but each invoice belongs to only one order.)
The Book Sales Invoice looks like this:

BOOK SALES INVOICE: SALES INVOICE NUMBER, CUSTOMER ID, CUSTOMER NAME, CUSTOMER STREET ADDRESS, CUSTOMER CITY, CUSTOMER STATE, CUSTOMER ZIP CODE, (BOOK ISBN, TITLE, EDITION CODE, (BOOK AUTHORS), QUANTITY, PRICE, PUBLISHER NAME), SHIPPING CHARGES, SALES TAX

The Master Billing Report helps the Collections and Customer Service Departments manage customer accounts. A system for recording customer payments against invoices is out of scope for the current project, but the CBC project sponsors do want to keep a running balance showing what each customer owes CBC. As invoices are generated, a database trigger will be used to add invoice totals to the Balance Due. As payments are received, the CBC staff will manually adjust the Balance Due. The Master Billing Report attributes are as follows:

MASTER BILLING REPORT: CUSTOMER ID, NAME, STREET ADDRESS, CITY, STATE, ZIP CODE, PHONE, BALANCE DUE

Each time CBC buys books from a publisher, the publisher sends an invoice to CBC. To assist in managing inventory cost, CBC wishes to store the Purchase Invoice information and report it using this view:

PURCHASE INVOICE: PUBLISHER ID, PUBLISHER NAME, STREET ADDRESS, CITY, STATE, ZIP CODE, PURCHASE INVOICE NUMBER, INVOICE DATE, (BOOK ISBN, EDITION CODE, TITLE, QUANTITY, COST EACH, EXTENDED COST), TOTAL COST

Note that Extended Cost is calculated as Cost Each times Quantity.

Author’s Solution

As before, there is some room for alternative solutions, provided all relations are in third normal form.
The normalized relations in this solution follow, with primary keys noted with a hash mark (#):

BOOK: # ISBN, BOOK TITLE, SUBJECT CODE, PUBLISHER ID, EDITION CODE, COST, SELLING PRICE, QUANTITY ON HAND, QUANTITY ON ORDER, RECOMMENDED QUANTITY, PREVIOUS EDITION ISBN
CUSTOMER ORDER: # CUSTOMER ORDER NUMBER, CUSTOMER ID, ORDER DATE, CANCEL DATE
CUSTOMER ORDER BOOK: # CUSTOMER ORDER NUMBER, # ISBN, QUANTITY, BOOK PRICE
SUBJECT: # SUBJECT CODE, DESCRIPTION
AUTHOR: # AUTHOR ID, AUTHOR NAME
BOOK-AUTHOR: # AUTHOR ID, # ISBN
CUSTOMER: # CUSTOMER ID, NAME, STREET ADDRESS, CITY, STATE, ZIP CODE, PHONE, BALANCE DUE
PUBLISHER: # PUBLISHER ID, NAME, STREET ADDRESS, CITY, STATE, ZIP CODE, AMOUNT PAYABLE
RECEIVABLE (SHIPPED) ORDER: # SALES INVOICE NUMBER, CUSTOMER ORDER NUMBER, SALES TAX, SHIPPING CHARGES
RECEIVABLE ORDER BOOK: # SALES INVOICE NUMBER, # ISBN, QUANTITY
PAYABLE (PURCHASES): # PURCHASE INVOICE NUMBER, PUBLISHER ID, INVOICE DATE, INVOICE AMOUNT
PAYABLE BOOK: # PURCHASE INVOICE NUMBER, # ISBN, QUANTITY, COST EACH

Figure 6-6 shows the complete design, implemented in Microsoft Access.

Figure 6-6 CBC ERD (Microsoft Access Relationships panel)

Quiz

Choose the correct responses to each of the multiple-choice questions. Note that there may be more than one correct response to each question.

1. Normalization:
   a. Was developed by Dr. Codd
   b. Was first introduced with five normal forms
   c. First appeared in 1972
   d. Provides a set of rules for each normal form
   e. Provides a procedure for converting relations to each normal form

2. The purpose of normalization is
   a. To eliminate redundant data
   b. To remove certain anomalies from the relations
   c. To provide a reason to denormalize the database
   d. To optimize data-retrieval performance
   e. To optimize data for inserts, updates, and deletes

3. When implemented, a third normal form relation becomes
   a. An index
   b. A referential constraint
   c. A table
   d. A view
   e. A database

4. The insert anomaly refers to a situation where:
   a. Data must be inserted before it can be deleted.
   b. Too many inserts cause the table to fill up.
   c. Data must be deleted before it can be inserted.
   d. A required insert cannot be done due to an artificial dependency.
   e. A required insert cannot be done due to duplicate data.

5. The delete anomaly refers to a situation where:
   a. Data must be deleted before it can be inserted.
   b. Data must be inserted before it can be deleted.
   c. Data deletion causes unintentional loss of another entity’s data.
   d. A required delete cannot be done due to referential constraints.
   e. A required delete cannot be done due to lack of privileges.

6. The update anomaly refers to a situation where:
   a. A simple update requires updates to multiple rows of data.
   b. Data cannot be updated because it does not exist in the database.

... “many.”

5. The IDEF1X ERD format:
   a. Was first released in 1983
   b. Follows a standard developed by the National Institute of Standards and Technology
   c. Has many variants
   d. Has been adopted as a U.S. Federal Government standard
   e. Covers both data and process models

6. The IDEF1X ERD format shows
   a. Identifying relationships with a solid line
   b. Minimal cardinality using a...
data model is transformed into a physical database design, it is essential to have a physical ERD that the project team can use in developing the application system. The beginnings of the physical model are shown here to help make that point.

Figure 7-2 Acme Industries logical ERD, relational format

Here are the particulars of the relational ERD format:

• Relationship cardinality... “one” (participation in the relationship is mandatory). Figure 7-3 notes a few combinations of minimum and maximum cardinality.

• A Product may have zero to many associated Invoice Line Items (shown as a circle and a crow’s foot); an Invoice Line Item must have one and only one associated Product (shown as two vertical bars).
• An Invoice must have one or more associated Invoice...

   c. Data cannot be updated due to lack of privileges
   d. Data cannot be updated due to an existing unique constraint
   e. Data cannot be updated due to an existing referential constraint

7. The roles of unique identifiers in normalization are
   a. They are unnecessary
   b. They are required once you reach third normal form
   c. All normalized forms require designation... normal form

CHAPTER 7 Data and Process Modeling

As you saw in Chapter 5, data and process modeling are major undertakings that are part of the logical design stage of an application system development project. You have already seen the rudiments of data modeling when we used entity relationship diagrams (ERDs) in prior chapters. In this chapter, we will look at ERDs and data...
file, database, or even a printed page. The term was chosen so that no particular type of storage is implied. Because we already have an ERD for our example, the data stores should closely align with the entities we have already identified.

Figure 7-10 Data flow diagram page for the Acme Industries order-fulfillment process

• Sources and destinations of data (external entities...

... that are not primary or candidate keys
   e. Constraints that are not the result of the definitions of domains and keys

18. Domain key normal form deals with anomalies caused by:
   a. Multivalued attributes
   b. Transitive dependencies
   c. Join dependencies
   d. Determinants that are not primary or candidate keys
   e. Constraints that are not the result of the definitions of domains and keys

... Multivalued attributes
   c. Partial dependency on the primary key
   d. Repeating groups
   e. Join dependencies

13. In general, violations of a normalization rule are resolved by:
   a. Combining relations
   b. Moving attributes or groups of attributes to a new relation
   c. Combining attributes
   d. Creating summary tables
   e. Denormalization

14. A foreign key in a normalized relation may be
   a. The...

• Flows of data are shown using lines with arrowheads indicating the direction of flow. Above each flow, words are used to describe the content of the data being sent. Bidirectional flows are permissible but are usually shown as separate flows because the data is seldom exactly the same in both directions.

The strengths of the data flow diagram are as follows:

• It easily...
something, as opposed to a nonprocedural language, such as SQL, where the programmer merely describes the desired results. The most commonly used procedural language today is probably C and its variants (C++, C#, and so on), but others, such as FORTRAN and COBOL, still see some use. Also, specialized procedural languages for relational databases, including PL/SQL for Oracle and Transact SQL for Sybase and Microsoft...