SQL PROGRAMMING STYLE

1.2 Follow the ISO-11179 Standards Naming Conventions

Although not required, the correlation name on a table expression can be followed by a list of new column names in parentheses. If this list is missing, the correlation name inherits the names from the base tables or views in the table expression. In the case of a simple table correlation name, such a list would probably be redundant because we usually want to use the original column names. In the case of a table expression correlation name, such a list would probably be a good idea to avoid ambiguous column names. It also forces the programmer to trim the expression of extraneous columns that were not actually needed in the query.

Exceptions: If there is no obvious, clear, simple name for the table correlation name, then use an invented name, such as a single letter like X. Likewise, if a computation has no immediate name, then you might use an invented name.

1.2.7 Relationship Table Names Should Be Common Descriptive Terms

Rationale: Tables and views can model relationships, usually one-to-many or many-to-many, as well as entities. If the relationship has a common name that is understood in the context, then use it. There is a tendency for newbies to concatenate the names of the tables involved to build a nonce word. For example, name a table “Marriages,” because that is the common term for that relationship, rather than “ManWoman,” “HusbandsWives,” or something really weird. Likewise, “Enrollment” makes more sense than “Students_Courses”; once you start looking for the names, they come easily. This concatenation falls apart when the relationship is not a simple binary one, such as an escrow on a house that has a buyer, a seller, and a lender.

Exceptions: If there is no common term for the relationship, you will need to invent something, and it might well be a concatenation of table names.
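The two naming rules above can be sketched roughly as follows. This is only an illustration: the Students and Courses tables, their columns, and the course_cnt name are hypothetical and do not come from the text. The column list after the correlation name is Standard SQL, but not every product accepts it, so check your own dialect.

```sql
-- Hypothetical schema for illustration; table and column names are assumptions.
CREATE TABLE Students
(student_id CHAR(9) NOT NULL PRIMARY KEY,
 student_name VARCHAR(50) NOT NULL);

CREATE TABLE Courses
(course_nbr CHAR(7) NOT NULL PRIMARY KEY,
 course_title VARCHAR(50) NOT NULL);

-- The relationship gets its common name, "Enrollment",
-- rather than a concatenation such as "Students_Courses".
CREATE TABLE Enrollment
(student_id CHAR(9) NOT NULL
   REFERENCES Students (student_id),
 course_nbr CHAR(7) NOT NULL
   REFERENCES Courses (course_nbr),
 PRIMARY KEY (student_id, course_nbr));

-- A table expression whose correlation name carries its own column list,
-- so the computed COUNT(*) column gets an unambiguous name and only
-- the columns the outer query needs are exposed.
SELECT S.student_name, E.course_cnt
  FROM Students AS S
       INNER JOIN
       (SELECT student_id, COUNT(*)
          FROM Enrollment
         GROUP BY student_id) AS E (student_id, course_cnt)
       ON S.student_id = E.student_id;
```

Without the column list, the COUNT(*) column would have no usable name in the outer query, or a vendor-generated one.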
CHAPTER 1: NAMES AND DATA ELEMENTS

1.2.8 Metadata Schema Access Objects Can Have Names That Include Structure Information

This rule does not apply to the schema information tables, which come with standardized names. It is meant for naming indexes and other things that deal directly with storage and access. The postfix “_idx” is acceptable.

Rationale: This is simply following the principle that a name should tell you what something is. In the case of indexes and other things that deal directly with storage and access, that is what they are. They have nothing to do with the data model.

Exceptions: This does not apply to schema objects that are seen by the user. Look for the rules for the other schema objects as we go along.

1.3 Problems in Naming Data Elements

Now that we have talked about how to do it right, let’s spend some time on common errors in names that violate the rules we set up.

1.3.1 Avoid Vague Names

Rationale: “That sounds vaguely obscene to me! I can’t stand vagueness!” (Groucho Marx)

At one extreme, the name is so general that it tells us nothing. The column name is a reserved word such as “date,” or it is a general word like “id,” “amount,” and so forth. Given a column called “date,” you have to ask, “date of what?” An appointment? Birth? Hire? Termination? Death? The name begs the question on the face of it.

At the other extreme, the name is made useless by a string of qualifiers that contradict each other. Consider the typical newbie column name “type_code_id” as an example. If it is an identifier, then it is unique for every entity that has it, like the vehicle identification number (VIN) on an automobile. If it is a code, then what is the trusted source that maintains it, as with a ZIP code? A code is drawn from a domain of values and is not unique to one entity. If it is a type, then what is the taxonomy to which it belongs? Why not go all the way and call it “type_code_id_value” instead?
Why did we not find a mere “customer_type” that would have been understood on sight?

Exceptions: None

Improperly formed data element names seem to be the result of ignorance and object-oriented (OO) programming. In particular, OO programmers put “_id” on every primary key in every table and have problems understanding that SQL is a strongly typed language in which things do not change their data types in programs. The names get absurd at times. Consider a lookup table for colors:

CREATE TABLE TblColors
(color_value_id INTEGER NOT NULL PRIMARY KEY,
 color_value VARCHAR(50) NOT NULL);

But what does “_value_id” mean? Names like this are generated without thought or research. Assume that we are using the Pantone color system in the database, so we have a trusted source and a precise description; we did the research! This might have been written as follows:

CREATE TABLE Colors
(pantone_nbr INTEGER NOT NULL PRIMARY KEY,
 color_description VARCHAR(50) NOT NULL);

1.3.2 Avoid Names That Change from Place to Place

Rationale: The worst possible design flaw is changing the name of an attribute on the fly, from table to table. As an example, consider this slightly cleaned-up piece of actual code from a SQL newsgroup:

SELECT Incident.Type, IPC.DefendantType, Recommendation.Notes,
       Offence.StartDate, Offence.EndDate, Offence.ReportedDateTime,
       IPC.URN
  FROM IPC
       INNER JOIN Incident
       ON IPC.URN = Incident.IPCURN
       INNER JOIN IncidentOffence
       ON Incident.URN = IncidentOffence.IncidentURN
       INNER JOIN Offence
       ON Offence.URN = IncidentOffence.OffenceURN
       INNER JOIN IPCRecommendation
       ON IPC.URN = IPCRecommendation.IPCURN
       INNER JOIN Recommendation
       ON IPCRecommendation.RecommendationID = Recommendation.ID;

Those full table names are difficult to read, but the newbie who wrote this code thinks that the table name must always be part of the column name. That is the way that a file worked in early COBOL programs.
This means that if you have hundreds of tables, each appearance of the same attribute gets a new name, so you can never build a proper data dictionary. Did you also notice that it is not easy to see underscores, commas, and periods? Try this cleaned-up version, which clearly shows a simple star schema centered on the IPC table.

SELECT I1.incident_type, IPC.defendant_type, R1.notes,
       O1.start_date, O1.end_date, O1.reported_datetime, IPC.urn
  FROM Incidents AS I1, IPC, Recommendations AS R1, Offences AS O1
 WHERE IPC.recommendation_id = R1.recommendation_id
   AND IPC.urn = O1.urn
   AND IPC.urn = I1.urn
   AND IPC.urn = R1.urn
   AND I1.urn = O1.urn;

I have no idea what a URN is, but it looks like a standard identifier of some kind. Look at all of the kinds of “URNs” (e.g., URN, IPCURN, IncidentURN, and OffenceURN) in the original version of the query. It gives you the feeling of being in a crematorium gift shop.

As you walk from room to room in your house, do you also change your name, based on your physical location? Of course not! The name we seek identifies the entity, not the location.

Exceptions: Aliases inside a query can temporarily give a new name to an occurrence of a data element. These are temporary and disappear at the end of the statement. We discuss rules for this in Section 1.2.6.

1.3.3 Do Not Use Proprietary Exposed Physical Locators

Rationale: The most basic idea of modern data modeling is to separate the logical model and the physical implementation from each other. This allows us to reuse the model on different platforms and not be tied to just one platform.

In the old days, the logical and physical implementations were fused together. I will explain this in more detail in the next chapter, but for now the rule is to never use proprietary physical locators. We want to have portable code.

But the real problem is that the proprietary physical locator violates the basic idea of a key in the relational model.
When new SQL programmers use IDENTITY, GUID, ROWID, or other auto-numbering vendor extensions to get a key that can be used for locating a given row, they are imitating a magnetic tape’s sequential access. It lets them know the order in which a row was added to the table, just like individual records went onto the end of the magnetic tape! We will spend more time discussing this flaw in Chapter 3.

Exceptions: You might want to fake a sequential file when you are using a SQL table structure for some purpose other than a relational database management system (RDBMS). For example, you might be staging and scrubbing data outside the “Real Schema” that do not have any data integrity issues.
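The rule in this section can be sketched with a small contrast. The tables, columns, and sample VIN below are hypothetical; only the idea of the VIN as an identifier from a trusted source comes from the text. The GENERATED ALWAYS AS IDENTITY clause is the Standard SQL spelling of an auto-number column; several products use their own proprietary syntax instead, so treat the exact wording as an assumption about your dialect.

```sql
-- Hypothetical tables for illustration; names and sample values are assumptions.

-- Discouraged: the key is an auto-number that records arrival order
-- and says nothing about the vehicle itself.
CREATE TABLE Vehicles_By_Arrival
(vehicle_seq INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
 vin CHAR(17) NOT NULL,
 vehicle_color VARCHAR(20) NOT NULL);

-- Both rows are accepted, so the same automobile is now stored twice
-- under two different "keys".
INSERT INTO Vehicles_By_Arrival (vin, vehicle_color)
VALUES ('1HGCM82633A004352', 'red');
INSERT INTO Vehicles_By_Arrival (vin, vehicle_color)
VALUES ('1HGCM82633A004352', 'red');

-- Preferred: the key is the identifier maintained by the trusted source.
CREATE TABLE Vehicles
(vin CHAR(17) NOT NULL PRIMARY KEY,
 vehicle_color VARCHAR(20) NOT NULL);

INSERT INTO Vehicles (vin, vehicle_color)
VALUES ('1HGCM82633A004352', 'red');
-- Repeating the same INSERT here would be rejected as a duplicate key.
```

The auto-number identifies a row's position in the arrival sequence, not the automobile, which is why it cannot stop the duplicate.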
