Copyright (c) 2003 C. J. Date
Suggestion: Show the following picture of a tuple as an example
and annotate it dynamically to illustrate the formal terms tuple
value (tuple for short), component, attribute, attribute name,
attribute value, attribute type (and attribute type name), degree,
heading, tuple type (and tuple type name). Note in particular
that we define an attribute to consist specifically of an
attribute-name / type-name pair. Note further that it will follow
(when we get to relvars) that, e.g., attributes S# in relvar S and
S# in relvar SPJ are the same attribute. This will become
important when we get to the relational algebra in Chapter 7.
┌──────────┬────┬──────────┬────┬─────┬─────┐
│ MAJOR_P# : P# │ MINOR_P# : P# │ QTY : QTY │
├──────────┴────┼──────────┴────┼─────┴─────┤
│ P2            │ P4            │ 7         │
└───────────────┴───────────────┴───────────┘
Note: Attribute values should really be P#('P2'), P#('P4'),
QTY(7)──explain. In a similar vein, we often omit the type names
when we give informal examples of tuple (and relation) headings.
Don't bother to talk through the precise formal definition of
"tuple"──just say it's in the book (you can show it, if you like,
but the point is that, as so often, precise definitions make
simple concepts look very complicated).
Show a Tutorial D tuple selector invocation, and explain the
following important properties of tuples:
• Every tuple contains exactly one value (of the appropriate
type) for each of its attributes. Note: As an aside, you
might want to point out that null is not a value, so right
here we have an overwhelming argument against nulls (as
usually understood).
• There's no left-to-right ordering to the components.
• Every subset of a tuple is a tuple, and every subset of a
heading is a heading──and these remarks are true of the empty
subset in particular.
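The three properties above can be made concrete with a small sketch. Python is used purely for illustration (the book's own notation is Tutorial D); the encoding of a tuple as a set of (attribute name, type name, value) triples and the helper names make_tuple and heading are assumptions of this example, not anything defined in the book.

```python
# Sketch only: a tuple is modelled as a frozenset of components, where each
# component is an (attribute name, type name, value) triple. This encoding
# and the helper names are assumptions of the example, not the book's.

def make_tuple(components):
    """components: dict mapping attribute name -> (type name, value).

    Using a dict guarantees exactly one value per attribute.
    """
    return frozenset((name, typ, val) for name, (typ, val) in components.items())

def heading(t):
    """The heading: the set of attribute-name / type-name pairs."""
    return frozenset((name, typ) for name, typ, _ in t)

t1 = make_tuple({"MAJOR_P#": ("P#", "P2"),
                 "MINOR_P#": ("P#", "P4"),
                 "QTY": ("QTY", 7)})
# Same components written in a different order: still the same tuple.
t2 = make_tuple({"QTY": ("QTY", 7),
                 "MINOR_P#": ("P#", "P4"),
                 "MAJOR_P#": ("P#", "P2")})

degree = len(t1)                                       # number of components
qty_only = frozenset(c for c in t1 if c[0] == "QTY")   # a subset of a tuple
empty_tuple = frozenset()                              # the empty subset (0-tuple)
```

Because the components live in a set, there is simply no position to talk about, which is the point of the second property.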
Explain the TUPLE type generator. Probably don't get into the
point that tuple types have no name apart from the one we already
know about of the form TUPLE { <attribute commalist> }.
Explain tuple equality very carefully (so much depends on
it!). As the book says, all of the following are defined in terms
of tuple equality:
• Candidate keys (see Chapter 9)
• Foreign keys (see Chapter 9 again)
• Essentially all of the operators of the relational algebra (see Chapter 7)
• Functional and other dependencies (see Chapters 11-13)
and more besides.
The "<" and ">" operators do not apply to tuples (explain
why──fundamentally because tuples are sets).
Mention tuple projection.
Don't discuss tuple types vs. possreps unless someone asks
about it. Even then, I'd probably deal with the issue offline.
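Tuple equality and tuple projection can be sketched in a few lines. As before, this is an illustration in Python rather than Tutorial D; tuples are modelled informally as dicts from attribute name to value (type names omitted), and the helper names tuples_equal and project are hypothetical.

```python
# Sketch only: tuples are modelled as dicts from attribute name to value
# (type names omitted, as in informal examples). Helper names are hypothetical.

def tuples_equal(t1, t2):
    """Equal iff same attributes and, attribute by attribute, equal values."""
    return t1.keys() == t2.keys() and all(t1[a] == t2[a] for a in t1)

def project(t, attrs):
    """Tuple projection: keep only the named attributes."""
    return {a: t[a] for a in attrs}

t = {"MAJOR_P#": "P2", "MINOR_P#": "P4", "QTY": 7}
u = {"QTY": 7, "MAJOR_P#": "P2", "MINOR_P#": "P4"}   # different written order
v = {"MAJOR_P#": "P2", "MINOR_P#": "P4", "QTY": 8}

same = tuples_equal(t, u)        # order of components is irrelevant
different = tuples_equal(t, v)   # QTY values differ
p = project(t, ["MAJOR_P#", "QTY"])
```

Note that nothing here suggests a sensible meaning for "t < u": the components form a set, and sets carry no natural total ordering.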
6.3 Relation Types
Suggestion: Show the following picture of a relation as an
example and annotate it dynamically to illustrate the formal terms
relation value (relation for short), attribute, attribute name,
attribute value, attribute type (and attribute type name), degree,
cardinality, heading, body, relation type (and relation type
name).
┌──────────┬────┬──────────┬────┬─────┬─────┐
│ MAJOR_P# : P# │ MINOR_P# : P# │ QTY : QTY │
├══════════╧════┼══════════╧════┼─────┴─────┤
│ P1            │ P2            │ 5         │
│ P1            │ P3            │ 3         │
│ P2            │ P3            │ 2         │
│ P2            │ P4            │ 7         │
│ P3            │ P5            │ 4         │
│ P4            │ P6            │ 8         │
└───────────────┴───────────────┴───────────┘
Note: Attribute values should really be P#('P1') etc. Note that
the tuple we talked about in the previous section is a tuple in
(the body of) this relation.
Don't bother to talk through the precise formal definition of
"relation"──just say it's in the book.
Show a Tutorial D relation selector invocation. Every subset
of a heading is a heading (as with tuples); every subset of a body
is a body. In both cases, the subset in question might be empty.
Explain the RELATION type generator and relation equality.
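A relation, seen as a heading plus a body, and relation equality can be sketched as follows. This is an informal Python model, not Tutorial D: the Relation class and make_relation helper are hypothetical names, and type names are left out of the heading for brevity.

```python
from typing import NamedTuple

class Relation(NamedTuple):
    heading: frozenset  # attribute names (type names omitted for brevity)
    body: frozenset     # tuples, each a frozenset of (name, value) pairs

def make_relation(attr_names, rows):
    """Build a relation value; every row must conform to the heading."""
    heading = frozenset(attr_names)
    body = frozenset(frozenset(row.items()) for row in rows)
    for t in body:
        if frozenset(name for name, _ in t) != heading:
            raise ValueError("tuple does not conform to the heading")
    return Relation(heading, body)

names = ["MAJOR_P#", "MINOR_P#", "QTY"]
r1 = make_relation(names, [
    {"MAJOR_P#": "P1", "MINOR_P#": "P2", "QTY": 5},
    {"MAJOR_P#": "P2", "MINOR_P#": "P4", "QTY": 7},
])
# Rows and attributes written in another order, one row repeated:
# the value is the same relation.
r2 = make_relation(names, [
    {"QTY": 7, "MAJOR_P#": "P2", "MINOR_P#": "P4"},
    {"MAJOR_P#": "P1", "MINOR_P#": "P2", "QTY": 5},
    {"MAJOR_P#": "P1", "MINOR_P#": "P2", "QTY": 5},
])
empty = make_relation(names, [])   # the empty subset of a body is a body

degree = len(r1.heading)
cardinality = len(r1.body)
```

Two relations are equal exactly when their headings are equal and their bodies are equal as sets, which in turn rests on tuple equality.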
6.4 Relation Values
This section is perhaps the core of this chapter. State "the four
properties" of relations:
1. Relations are normalized.
2. Attributes are unordered, left to right.
3. Tuples are unordered, top to bottom.
4. There are no duplicate tuples.
Now justify them:
1. Regarding normalization: You should be aware that the
history here is somewhat confused (in particular, the first few
editions of this book were confused). The true state of affairs
is as follows: Attribute values are single values, but those
values can be absolutely anything. We reject the old notion of
"value atomicity," on the grounds that it has no absolute
meaning──it simply depends on your point of view.* (Draw a
parallel with atoms in physics, if you like, which are regarded as
indivisible for some purposes but not for others.) Thus, all
relations are normalized in the relational model──even relations
that contain other relations nested inside themselves. It's true
that relations with others nested inside themselves are often
contraindicated, but that's a separate point (which we'll be
addressing in Chapter 12).
──────────
* In other words, the concept of "nonatomic values" has never been
very clearly defined (certainly it's not very precise). After
all, even a number might be decomposed (e.g., into decimal digits,
or into integer and fractional parts) in suitable circumstances;
so is a number atomic? What about bit and character strings,
which are obviously decomposable? What about dates and times?
And so on.
──────────
You might want to add that one reason relation-valued
attributes (RVAs) are often──though not
always──contraindicated is that relations involving RVAs are
usually asymmetric, leading to complications over query
formulation (see Section 11.6 for further discussion). Another is
that the predicate for such a relation is often fairly
complicated.* For example, consider the relation of Fig. 6.2.
That relation shows among other things that supplier S1 supplies
the set of parts {P1,P2,P3,P4,P5,P6}. It is thus a "true fact"
that supplier S1 supplies the set of parts {P1,P2,P3,P4,P5} and
the set of parts {P1,P2,P3,P4} and the set of parts {P1,P2,P3}
and many other sets of parts as well (actually 60 others: a
six-element set has 64 subsets, and four have just been named).
Doesn't the Closed World Assumption thus require the relation to
include tuples corresponding to these additional "true facts" as
well? Well, obviously not; but why not, exactly?
──────────
* Some might say it's second-order.
──────────
Note: Further discussion of the whole issue of all
relations being in first normal form (also of relation-valued
attributes) can be found in an article by myself, "What Does
First Normal Form Really Mean?" (in two parts), to appear soon
on the website http://www.dbdebunk.com (probably before the
book itself is published). Among other things, this article
offers some thoughts on the current flurry of interest in the
so-called "multi-value" (or "multi-value column") systems, which
you might find you need to be aware of in order to fend off
certain possible criticisms.
2. Regarding no attribute ordering: The book doesn't explicitly
make this point, but a good pragmatic argument to justify this
property is that, without it, A JOIN B is different from B JOIN
A! Another is that, in SQL, programs that use "SELECT *"
are fragile (they can break in the face of left-to-right
column rearrangements in the database──lack of data
independence!). Note: Further discussion of this issue can
be found in another article by myself, "A Sweet Disorder,"
also due to appear soon on the website www.dbdebunk.com.
3. Regarding no tuple ordering: The argument that "n ways to
represent information means n sets of operators" (and n = 1 is
sufficient) is a very strong one. Of course, "no tuple
ordering" doesn't mean we can't do ORDER BY, but it does
mean the result of ORDER BY isn't a relation (important
point!).
4. Regarding no duplicate tuples:
■ A strong logical argument here is the one that relies on
the fact that tuples are supposed to represent true
propositions. If I tell you "The sun is shining outside" and
"The sun is shining outside," then I'm simply telling you "The
sun is shining outside." If something is true, saying it twice
doesn't make it more true!
■ One philosophical argument is: If things are distinct, they
must have distinct identities (quote The Principle of Identity
of Indiscernibles?*); the relational model says let's represent
those identities in the same way as everything else (namely, as
attribute values within tuples), and then all kinds of good
things will happen.
──────────
* If there's no discernible difference between two entities, then
there aren't two but only one.
──────────
■ One technical argument is: Duplicates inhibit the
optimizer (because they make expression transformation──aka
"query rewrite"──harder to do and less widely applicable),
thereby leading to worse performance among other things. We'll
elaborate on this argument in Chapter 7.
■ Another (and this one is, specifically, an SQL argument):
Suppose rows r1 and r2 are duplicates. If we position a cursor
on r1 (say) and issue a DELETE via that cursor, there's no
guarantee──at least according to my reading of the
standard──that the effect won't be to delete r2 instead (!).
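Two of the justifications above (A JOIN B versus B JOIN A, and "saying it twice doesn't make it more true") lend themselves to a short sketch. The code below is an informal Python model rather than Tutorial D: relations are sets of tuples, tuples are sets of (name, value) pairs, and the helpers rel and natural_join are hypothetical names introduced only for this illustration.

```python
# Sketch only: relation bodies as frozensets of tuples; each tuple is a
# frozenset of (attribute name, value) pairs. Helper names are hypothetical.

def rel(*rows):
    """Build a relation body from dicts of attribute name -> value."""
    return frozenset(frozenset(row.items()) for row in rows)

def natural_join(a, b):
    """Combine every pair of tuples that agree on their common attributes."""
    result = set()
    for ta in a:
        for tb in b:
            da, db = dict(ta), dict(tb)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                result.add(frozenset({**da, **db}.items()))
    return frozenset(result)

s = rel({"S#": "S1", "CITY": "London"}, {"S#": "S2", "CITY": "Paris"})
sp = rel({"S#": "S1", "P#": "P1"}, {"S#": "S1", "P#": "P2"})

s_join_sp = natural_join(s, sp)
sp_join_s = natural_join(sp, s)   # same relation: attributes are unordered

# If columns were positional, operand order would leak into the result:
cols_s_sp = ("S#", "CITY", "P#")
cols_sp_s = ("S#", "P#", "CITY")

# Stating a true proposition twice adds nothing: the body is a set.
s1_twice = rel({"S#": "S1", "CITY": "London"}, {"S#": "S1", "CITY": "London"})
```

With unordered attributes the two joins are the same value; with positional columns they would differ in their left-to-right column lists even though they carry the same information.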
Relations vs Tables
The book summarizes some of the main differences between relations
and tables. It's worth spending a few minutes on that topic here;
in fact, all of the points made in this subsection are worth an
airing in a live class. Note that (as the book says) the list of
differences is not exhaustive; others include (a) the fact that
tables are usually thought of as having at least one column (we'll
talk about this one in a few minutes); (b) the fact that tables
(at least in SQL) are allowed to include nulls (forward reference
to Chapter 19); and (c) the horrible but widespread perception
that "relations are flat" (forward reference to Chapter 22).
Note: The book also makes the point that columns (as opposed
to attributes) might have duplicate names, or even no names at all, and asks the question: What are the column names in the
result of the following SQL query?
SELECT S.CITY, S.STATUS * 2, P.CITY
FROM S, P ;
Answer: Column 1 is called CITY; column 2 has no name; column 3
is called CITY again. Note for barrack-room lawyers: Actually,
the SQL standard does say the implementation is required to assign
names to otherwise anonymous columns, but those names are
implementation-dependent (they vary from system to system,
possibly even from release to release or even more frequently).
In any case, those names are also invisible (they're not exposed
to the user). Besides, this implementation requirement, even if
you believe in it, still doesn't address the problem of duplicate
column names.
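The duplicate-name problem is easy to demonstrate on a real system. The sketch below runs the query through SQLite via Python's standard sqlite3 module; SQLite is chosen only as a convenient assumption (it is not mentioned in the book), and tables S and P are cut down to the columns the query needs. Whatever name a given product reports for the middle column is product-specific, so the sketch relies only on the first and third names.

```python
import sqlite3

# Sketch only: minimal versions of S and P, just enough for the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S (CITY TEXT, STATUS INTEGER);
    CREATE TABLE P (CITY TEXT);
    INSERT INTO S VALUES ('London', 20);
    INSERT INTO P VALUES ('Paris');
""")

cur = conn.execute("SELECT S.CITY, S.STATUS * 2, P.CITY FROM S, P")
# cursor.description holds one 7-item entry per result column; item 0 is
# the column name as reported by this particular implementation.
column_names = [d[0] for d in cur.description]
rows = cur.fetchall()
conn.close()
```

A program that looks up result columns by name cannot tell the first CITY from the third, which is exactly the point being made about columns versus attributes.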
Relation-Valued Attributes
Some of the points made in this subsection are probably best made
under the earlier discussion of normalization──it probably isn't
worth making a separate topic out of them in a live class.
Relations with No Attributes
A gentle introduction to this concept is DEFINITELY worth
including as a separate topic. Strong logical justification:
TABLE_DEE plays a role in the relational algebra analogous to that
played by zero in ordinary arithmetic. Don't get into details──I
think the point's intuitively clear. Can you imagine an
arithmetic without zero? Of course not.* Well, just as you can't
imagine an arithmetic without zero, so you shouldn't be able
to imagine a relational algebra without TABLE_DEE.
──────────
* Of course, we did have an arithmetic without zero for many
centuries (think of the ancient Romans), but it didn't work very
well. In fact, the invention (or discovery) of the concept of
zero is arguably one of the great intellectual achievements of the
human race.
──────────
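For instructors who do want one concrete illustration of the analogy, here is a minimal sketch. It models the two relations with no attributes in Python (not Tutorial D) and shows TABLE_DEE behaving as an identity for join, much as 0 does for addition. The helpers rel and natural_join are hypothetical names introduced for the example.

```python
# Sketch only: relation bodies as frozensets of tuples; each tuple is a
# frozenset of (attribute name, value) pairs. Helper names are hypothetical.

TABLE_DEE = frozenset({frozenset()})   # no attributes, one tuple (the 0-tuple)
TABLE_DUM = frozenset()                # no attributes, no tuples

def rel(*rows):
    return frozenset(frozenset(row.items()) for row in rows)

def natural_join(a, b):
    """Combine every pair of tuples that agree on their common attributes."""
    result = set()
    for ta in a:
        for tb in b:
            da, db = dict(ta), dict(tb)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                result.add(frozenset({**da, **db}.items()))
    return frozenset(result)

r = rel({"S#": "S1", "CITY": "London"}, {"S#": "S2", "CITY": "Paris"})

r_join_dee = natural_join(r, TABLE_DEE)   # r is unchanged
dee_join_r = natural_join(TABLE_DEE, r)   # likewise
r_join_dum = natural_join(r, TABLE_DUM)   # no tuples at all
```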
Operators on Relations
Definitely discuss relation comparisons (including "=" in
particular, though it was mentioned previously in Section 6.3).
Relation comparisons are another (important!) topic typically
omitted in other database texts. Note that the availability of
relational comparisons makes the "complicated" operator DIVIDEBY
logically unnecessary (forward pointer to Chapter 7). Mention the
IS_EMPTY shorthand (it is shorthand; to be specific, IS_EMPTY(r)
is shorthand for r{} = TABLE_DUM).
Relational comparisons aren't relational operators, since they
return a truth value, not a relation.
Explain "t ∈ r" and TUPLE FROM r (also not relational
operators). Don't bother with type inference (here or anywhere
else in this chapter).
You've probably already discussed ORDER BY──but if not, then
certainly discuss it here.
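The operators mentioned in this subsection can be sketched together. This is an informal Python model, not Tutorial D: relation bodies are sets of tuples, tuples are sets of (name, value) pairs, and every helper name below (rel, project, is_empty, tuple_from, order_by) is hypothetical. Note how none of the last three returns a relation.

```python
# Sketch only: relation bodies as frozensets of tuples; each tuple is a
# frozenset of (attribute name, value) pairs. Helper names are hypothetical.

TABLE_DUM = frozenset()   # no attributes, no tuples

def rel(*rows):
    return frozenset(frozenset(row.items()) for row in rows)

def project(r, attrs):
    """Relational projection of r over the attribute names in attrs."""
    return frozenset(frozenset((n, v) for n, v in t if n in attrs) for t in r)

def is_empty(r):
    """IS_EMPTY(r), expanded as r{} = TABLE_DUM; returns a truth value."""
    return project(r, ()) == TABLE_DUM

def tuple_from(r):
    """TUPLE FROM r: the single tuple of a cardinality-one relation."""
    if len(r) != 1:
        raise ValueError("TUPLE FROM requires a relation of cardinality one")
    (t,) = r
    return t

def order_by(r, attr):
    """Returns a list (a sequence of tuples), not a relation."""
    return sorted((dict(t) for t in r), key=lambda d: d[attr])

r = rel({"P#": "P2", "QTY": 7}, {"P#": "P1", "QTY": 5})
one = rel({"P#": "P1", "QTY": 5})

included = one <= r                                   # relation comparison
member = frozenset({"P#": "P1", "QTY": 5}.items()) in r   # t ∈ r
extracted = dict(tuple_from(one))
ordered = order_by(r, "QTY")
```

Projecting a nonempty relation over no attributes yields the relation containing just the 0-tuple, which differs from TABLE_DUM; that is why the r{} = TABLE_DUM expansion works.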
6.5 Relation Variables
Remind students what a relvar is (relations vs. relvars is an
important special case of values vs. variables in general). We
distinguish base relvars vs. views ("real vs. virtual relvars" in
The Third Manifesto). Here we're primarily concerned with base
relvars, but anything we say about "relvars" without that "base"
qualifier is true of relvars in general, not just base ones.
Remind students that base relvars are not necessarily physically
stored! To be more specific, the degree of variation allowed
between base and stored relvars should be at least as great as
that allowed between views and base relvars (see Chapter 10); the
only logical requirement is that it must be possible to obtain the
base relvars somehow from those that are physically stored (and
then the derived ones can be obtained too). Possible forward
pointer to Appendix A?
Explain base relvar definition syntax (and cover default
values briefly). The terms heading, body, attribute, tuple,
degree, etc., are all interpreted in the obvious way to apply to
relvars as well as relations. Candidate keys and foreign keys
will be discussed in detail in Chapter 9. Note: Prior to Chapter
9, the book assumes for simplicity that each base relvar has
exactly one candidate key, called the primary key. In Chapter 9,
we're going to argue that the historical emphasis on primary keys
has always been a little bit off base, but don't get into that
discussion here.
Relvars have predicates (also discussed in Chapter 9).
Explain relational assignment (including a reminder re
multiple assignment) and INSERT, DELETE, and UPDATE shorthands
(including Tutorial D expansions). Further points to emphasize:
• Remind students re the use of WITH.
• Relational assignment, and hence INSERT, UPDATE, and DELETE,
are all set-level operations. These operations sometimes
can't be simulated by a sequence of tuple-level operations (in
fact, there are no tuple-level operations in the relational
model──one of several reasons why SQL's cursor operations are
a bad idea, incidentally).
• Of course, sets sometimes have cardinality one, but updating
a set containing just one tuple isn't always possible
(assuming the system supports integrity constraints
properly──but most don't). See Chapter 9 for further
discussion.
• Expressions such as (e.g.) "updating a tuple" are really
rather sloppy (though convenient); tuples, like relations, are
values and can't be updated, by definition (quite apart from
the fact that we should really be talking about the set that
contains the tuple in question anyway, instead of about the
tuple itself).
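The idea that INSERT, DELETE, and UPDATE are shorthands for relational assignment can be sketched as below. This is a Python model under stated assumptions, not the Tutorial D expansions themselves: the Relvar class and the rel helper are hypothetical, and each shorthand is written as "compute a new relation value from the old one, then assign it."

```python
# Sketch only: a relvar holds one (immutable) relation value at a time.
# Relation bodies are frozensets of tuples; tuples are frozensets of
# (attribute name, value) pairs. All names here are hypothetical.

def rel(*rows):
    return frozenset(frozenset(row.items()) for row in rows)

class Relvar:
    def __init__(self, value):
        self.value = value

    def assign(self, new_value):
        """Relational assignment: replace the whole value."""
        self.value = new_value

    def insert(self, r):
        """Shorthand for assigning the union of the old value and r."""
        self.assign(self.value | r)

    def delete(self, pred):
        """Shorthand for assigning the tuples that do NOT satisfy pred."""
        self.assign(frozenset(t for t in self.value if not pred(dict(t))))

    def update(self, pred, changes):
        """Shorthand for assigning: untouched tuples plus changed versions."""
        untouched = frozenset(t for t in self.value if not pred(dict(t)))
        changed = frozenset(frozenset({**dict(t), **changes}.items())
                            for t in self.value if pred(dict(t)))
        self.assign(untouched | changed)

S = Relvar(rel({"S#": "S1", "CITY": "London"}, {"S#": "S2", "CITY": "Paris"}))
before = S.value                       # a value; later updates cannot alter it

S.insert(rel({"S#": "S3", "CITY": "Paris"}))
S.update(lambda t: t["CITY"] == "Paris", {"CITY": "Rome"})   # set-level
S.delete(lambda t: t["S#"] == "S1")
```

Nothing in the sketch ever modifies a tuple or a relation in place: only the variable changes, which is the sense in which "updating a tuple" is sloppy wording.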
Relvars and Their Interpretation
Although not new, this stuff is important and bears repeating.
Explain intended interpretation and the Closed World Assumption.
Forward reference to Chapter 9.
6.6 SQL Facilities
SQL supports rows, not tuples (remind students of [some of] the
differences). Briefly explain columns, fields, row value
constructors, row type constructors, row assignment, row
comparisons. Note: As a practical matter, nobody──no SQL vendor,
that is──supports rows (apart from rows within tables) at the time
of writing.
SQL supports tables, not relations (remind students of [some
of] the differences, or at least of the fact that they are
different). Explain table value constructors. SQL does not
support (a) "table type constructors," (b) table assignment, or
(c) table comparisons. (It does support IS_EMPTY, more or less,
via NOT EXISTS.) Explain the IN operator and "row subqueries"
(this term isn't used in the book, but it means a table
expression enclosed in parentheses that is required to evaluate
to a table containing just one row; note the coercion involved
here!). SQL doesn't properly distinguish between table values
and table variables.
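A few of these facilities can be tried out directly. The sketch below uses SQLite through Python's standard sqlite3 module, which is an assumption made for convenience (the book does not mention it), with a cut-down suppliers table whose column SNO stands in for S#. It shows a table value constructor, NOT EXISTS playing the role of IS_EMPTY, and the IN operator with a subquery.

```python
import sqlite3

# Sketch only: a cut-down suppliers table; SNO stands in for S#.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S (SNO TEXT, CITY TEXT);
    INSERT INTO S VALUES ('S1', 'London');
    INSERT INTO S VALUES ('S2', 'Paris');
""")

# A table value constructor used on its own as a table expression.
constructed = conn.execute(
    "VALUES ('S1', 'London'), ('S2', 'Paris')").fetchall()

# NOT EXISTS as a rough counterpart of IS_EMPTY: 1 means "no such rows".
nobody_in_rome = conn.execute(
    "SELECT NOT EXISTS (SELECT * FROM S WHERE CITY = 'Rome')").fetchone()[0]

# IN with a subquery: suppliers located in the same city as S2.
same_city_as_s2 = conn.execute(
    "SELECT SNO FROM S WHERE CITY IN "
    "(SELECT CITY FROM S WHERE SNO = 'S2')").fetchall()
conn.close()
```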
Discuss CREATE TABLE* (classic version──we'll get to "typed
tables" in a little while). No table-valued columns. Mention
DROP and ALTER TABLE if you like.
──────────
* Note that "TABLE" in this context means a base table
specifically: a prime indicator of SQL's lack of understanding of relational concepts right there!
──────────
The SQL INSERT, UPDATE, and DELETE operations were covered in
Chapter 4. SELECT will be covered in more detail in Chapter 8.
There's more to say regarding CREATE TABLE. Recall structured
types from Chapter 5. In that chapter we implied that such types
were scalar──though the availability of SQL's "observer and
mutator methods" means they aren't really scalar, because those
methods "break encapsulation" for those structured types (in fact,
structured types are more like tuple types in some ways).
And──following on from this observation──such types can be used as
the basis for creating base tables: The attributes of the
structured type become columns of the base table. (Actually the
base table has one extra column too, which we'll get to in a
moment.) Here's the example from the book:
CREATE TYPE POINT AS ( X FLOAT, Y FLOAT ) NOT FINAL
       REF IS SYSTEM GENERATED ;
CREATE TABLE POINTS OF POINT
     ( REF IS POINT# SYSTEM GENERATED ) ;
Follow the explanation as given in the book but no further (more
details will come at more appropriate points later). What's this
stuff all about? Well, it has to do primarily with the idea of
incorporating some kind of "object functionality" into SQL; that's
why we defer detailed discussion for now (we need to talk about
"objects" in some detail first). But there's nothing in the
standard to say that the features in question can be used only in
connection with that object functionality, which is why we at
least mention them here. We'll ignore them from this point on,
however, until much later (Chapter 26).
References and Bibliography
Reference [6.1] (either version) is strongly recommended and
should be distributed to students if at all possible. By
contrast, reference [6.2] is mentioned in the book only because it
would be inappropriate not to! Students should be warned that few
authorities agree with all──or even very many──of the positions
articulated in reference [6.2]. See references [6.7] and [6.8]
for some specific criticisms.
Answers to Exercises
6.1 Fundamentally, cardinality is a concept that applies to sets:
The cardinality of a set is the number of elements it contains.
However, the concept is extended to other kinds of "collections"
also; thus, we speak of the cardinality of a bag, the cardinality
of a list, and so on. In particular, the cardinality of a
relation is the number of tuples in the body of that relation, and
the cardinality of a relvar is the cardinality of the relation
that happens to be the current value of that relvar. Sometimes
the term is even applied to an attribute of some relation, in
which case it means the cardinality of either (a) the bag or (b)
the set of values (with duplicates eliminated) appearing in that
attribute in that relation. Note: Since interpretation (a) is
guaranteed to give a result identical to the cardinality of the
containing relation, interpretation (b) is probably more
common──but watch out for the possibility of confusion in this
regard (especially since, to repeat, cardinality is fundamentally
a concept that applies to sets rather than bags).
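The two readings of "cardinality of an attribute" can be checked against the relation pictured in Section 6.3. The sketch below is an informal Python model (not Tutorial D); the variable names are hypothetical, and attribute values are shown without their type selectors, as in the informal picture.

```python
# Sketch only: the relation from Section 6.3, with each tuple modelled as a
# frozenset of (attribute name, value) pairs. Names here are hypothetical.
body = frozenset(
    frozenset({"MAJOR_P#": major, "MINOR_P#": minor, "QTY": qty}.items())
    for major, minor, qty in [
        ("P1", "P2", 5), ("P1", "P3", 3), ("P2", "P3", 2),
        ("P2", "P4", 7), ("P3", "P5", 4), ("P4", "P6", 8),
    ]
)

relation_cardinality = len(body)

# Interpretation (a): the bag of MAJOR_P# values, duplicates retained.
major_bag = [dict(t)["MAJOR_P#"] for t in body]
bag_cardinality = len(major_bag)

# Interpretation (b): the set of MAJOR_P# values, duplicates eliminated.
set_cardinality = len(set(major_bag))
```

As the answer says, reading (a) can never differ from the cardinality of the relation itself, so only reading (b) conveys anything new.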
6.2 See Sections 6.2 and 6.3.
6.3 Note first that two x's are equal if and only if they are the
same x, and this observation is valid regardless of whether the
x's are tuples, or tuple types, or relations, or relation types (or anything else).* For tuples, see Section 6.2, subsection
"Operators on Tuples." For tuple types, see Section 6.2,
subsection "The TUPLE Type Generator." For relations, see Section 6.4, subsection "Operators on Relations." For relation types, see Section 6.3, subsection "The RELATION Type Generator."
──────────