Trang 1Copyright (c) 2003 C J Date page 3.7
and what the primary and foreign keys are (it isn't so important
to know exactly what the sample data values are!)." Mention the fact that Fig 3.8 is repeated inside the back cover of the book, for ease of subsequent reference.
Answers to Exercises
3.1 As usual, some of the following definitions elaborate slightly
on those given in the body of the chapter.
• The term automatic navigation refers to the fact that (in a
relational system) the process of "navigating" around the
stored data in order to implement user requests is performed automatically by the system, not manually by the user.
• A base relvar──also known as a real relvar [3.3]──is a relvar
that has independent or autonomous existence. More precisely, it's a relvar that isn't a derived relvar (q.v.). It's not
necessarily the same thing as a "stored relvar."
• The catalog is a set of system relvars whose purpose is to
contain descriptors regarding the various objects that are of
interest to the system itself, such as base relvars, views, indexes, users, integrity constraints, security constraints, and so on.
• The term closure (of relational operations) refers to the
fact that (a) the output from any relational operation is the same kind of object as the input──they're all relations──and
so (b) the output from one operation can become input to
another. Closure implies that we can write nested (relation-valued) expressions.
Note: We stress the point that when we say that the output
from each operation is another relation, we are talking from a conceptual point of view. We don't necessarily mean to imply that the system actually has to materialize the result of
every individual operation in its entirety. In fact, of
course, the system tries very hard not to, if such
materialization is logically unnecessary (see the brief
discussion of pipelined evaluation in the body of the
chapter).
• Commit is the operation that signals successful
end-of-transaction. Any updates made to the database by the
transaction in question are now "made permanent" and become visible to other transactions.
• A derived relvar is a relvar whose value at any given time is
the result of evaluating a specified relational expression,
typically involving other relvars (ultimately, base relvars).
Note that (like a base relvar) a derived relvar is still a
variable!*──in other words, the term "relvar" does not refer
just to base relvars; moreover, derived relvars must be
updatable (for otherwise they cannot be said to be variables).
──────────
* To be more precise, a derived relvar is a variable if and only
if its defining relational expression involves at least one
relvar; otherwise it would be more accurate to think of it as a
relation constant (a "relcon"?), and it wouldn't be updatable.
──────────
• A foreign key is a column or combination of columns in one
relvar whose values are required to match those of the primary
key in some other relvar (or possibly in the same relvar).
Note: This definition is only approximate. A more precise
definition is given in Chapter 9 (where, among other things,
the point is stressed that a foreign key is a set of columns and a foreign key value is a set of values──in fact, a
(sub)tuple).
• Join is a relational operation that joins two relations
together on the basis of common values in a common column.
Note: This definition is only approximate. A more precise
definition is given in Chapter 7.
• Optimization is the process of deciding how to implement user
access requests. In other words, it's the process of deciding
how to perform automatic navigation (q.v.).
• A predicate is a truth-valued function. Every relation has a
corresponding predicate that defines (loosely) "what the
relation means." Each row in a given relation denotes a
certain true proposition, obtained from the predicate by
substituting certain argument values of the appropriate type for the parameters of the predicate ("instantiating the
predicate"). Note: These remarks are all true of relvars as well as relations, mutatis mutandis.
• The primary key of a given relvar is a column or combination
of columns in that relvar whose values can be used to identify rows within that relvar uniquely (in other words, it's a
unique identifier for the rows of that relvar). Note: This
definition is only approximate. A more precise definition is given in Chapter 9 (where, among other things, the point is
stressed that a primary key is a set of columns and a primary key value is a set of values──in fact, a (sub)tuple).
• Projection is a relational operation that extracts specified
columns from a relation. Note: This definition is only
approximate. A more precise definition is given in Chapter 7.
• A proposition is, loosely, something that evaluates to either
TRUE or FALSE, unequivocally.
• A relational database is a database in which the data is
perceived by the user at any given time as relations (and
nothing but relations). Equivalently, a relational database
is a container for relvars (and nothing but relvars).
• A relational DBMS is a DBMS that supports relational
databases and relational operations such as restrict, project, and join on the data in those databases.
• The relational model is an abstract theory of data that's
based on certain aspects of mathematics (principally set
theory and predicate logic). It can be thought of as a way of looking at data──i.e., as a prescription for a way of
representing data (namely, by means of relations), and a
prescription for a way of manipulating such a representation
(namely, by means of operators such as join). Note: The very
abstract definition of the relational model given at the end
of Section 3.2 is explained in detail in Chapter 10 of these notes (in the answer to Exercise 10.20).
• Restriction (also known as selection) is a relational
operation that extracts specified rows from a relation. Note:
This definition is only approximate. A more precise
definition is given in Chapter 7.
• Rollback is the operation that signals unsuccessful
end-of-transaction. Any updates made to the database by the
transaction in question are "rolled back" (undone) and are
never made visible to other transactions.
• A set-level operation is an operation that operates on entire
sets as operands and returns an entire set as a result.
Relational operations are all set-level, since they operate on
and return entire relations, and relations contain sets of
rows.
• A (relational) view──also known as a virtual relvar [3.3]──is
a named derived relvar. Views are virtual, in the sense that
they don't have any existence apart from the base relvars from which they're derived (but users should typically not be aware that a given view is in fact virtual in this sense, though SQL falls very short in this regard, owing to its weak support for view updating). Operations on views are processed by
translating them into equivalent operations on those
underlying base relvars.
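The closure property and the approximate definitions of restrict, project, and join given above can be made concrete with a few lines of Python. What follows is only a sketch under stated assumptions, not the book's formalism: a relation is modeled as a frozenset of rows, each row as a frozenset of (column, value) pairs, and the function names relation, restrict, project, and join, together with the sample values, are invented for the illustration.

```python
# Hypothetical model: a relation is a frozenset of rows; a row is a
# frozenset of (column, value) pairs. Names and data are illustrative only.

def relation(*rows):
    """Build a relation from dicts; duplicate rows collapse (it is a set)."""
    return frozenset(frozenset(r.items()) for r in rows)

def restrict(rel, pred):
    """Keep the rows for which pred(row_as_dict) is true."""
    return frozenset(r for r in rel if pred(dict(r)))

def project(rel, cols):
    """Keep only the named columns; duplicate rows are eliminated."""
    return frozenset(frozenset((c, v) for c, v in r if c in cols) for r in rel)

def join(a, b):
    """Natural join: combine rows that agree on all common columns."""
    out = set()
    for ra in a:
        da = dict(ra)
        for rb in b:
            db = dict(rb)
            common = da.keys() & db.keys()
            if all(da[c] == db[c] for c in common):
                out.add(frozenset({**da, **db}.items()))
    return frozenset(out)

S = relation(
    {"S#": "S1", "CITY": "London"},
    {"S#": "S2", "CITY": "Paris"},
)
SP = relation(
    {"S#": "S1", "P#": "P2"},
    {"S#": "S2", "P#": "P1"},
)

# Closure: every result is itself a relation, so expressions nest freely.
result = project(
    restrict(join(S, SP), lambda row: row["P#"] == "P2"),
    {"S#", "CITY"},
)
rows = sorted(tuple(sorted(r)) for r in result)
print(rows)
```

Because each function takes relations and returns a relation, the nested expression needs no intermediate variables, which is the point of closure. The sketch materializes every intermediate result; as the Note under closure stresses, a real system tries hard not to.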
3.2 The following figure doesn't include the catalog entries for
relvars TABLE and COLUMN themselves. Note: The figure is
incomplete in many other ways as well. See Exercise 5.10 in
Chapter 5.
╔════════════════════════════════════════════════════════════════╗
║          ┌─────────┬──────────┬──────────┬───────┐             ║
║ TABLE    │ TABNAME │ COLCOUNT │ ROWCOUNT │       │             ║
║          ├═════════┼──────────┼──────────┼───────┤             ║
║          │ S       │        4 │        5 │       │             ║
║          │ P       │        5 │        6 │       │             ║
║          │ SP      │        3 │       12 │       │             ║
║          │         │          │          │       │             ║
║                                                                ║
║          ┌─────────┬──────────┬───────┐                        ║
║ COLUMN   │ TABNAME │ COLNAME  │       │                        ║
║          ├═════════┼══════════┼───────┤                        ║
║          │ S       │ S#       │       │                        ║
║          │ S       │ SNAME    │       │                        ║
║          │ S       │ STATUS   │       │                        ║
║          │ S       │ CITY     │       │                        ║
║          │ P       │ P#       │       │                        ║
║          │ P       │ PNAME    │       │                        ║
║          │ P       │ COLOR    │       │                        ║
║          │ P       │ WEIGHT   │       │                        ║
║          │ P       │ CITY     │       │                        ║
║          │ SP      │ S#       │       │                        ║
║          │ SP      │ P#       │       │                        ║
║          │ SP      │ QTY      │       │                        ║
║          │         │          │       │                        ║
║                                                                ║
╚════════════════════════════════════════════════════════════════╝
3.3 The following figure shows the entries for the TABLE and COLUMN relvars only (i.e., the entries for the user's own relvars are omitted). It's obviously not possible to give precise COLCOUNT and ROWCOUNT values.
╔════════════════════════════════════════════════════════════════╗
║          ┌─────────┬──────────┬──────────┬───────┐             ║
║ TABLE    │ TABNAME │ COLCOUNT │ ROWCOUNT │       │             ║
║          ├═════════┼──────────┼──────────┼───────┤             ║
║          │ TABLE   │     (>3) │     (>2) │       │             ║
║          │ COLUMN  │     (>2) │     (>5) │       │             ║
║          │         │          │          │       │             ║
║                                                                ║
║          ┌─────────┬──────────┬───────┐                        ║
║ COLUMN   │ TABNAME │ COLNAME  │       │                        ║
║          ├═════════┼══════════┼───────┤                        ║
║          │ TABLE   │ TABNAME  │       │                        ║
║          │ TABLE   │ COLCOUNT │       │                        ║
║          │ TABLE   │ ROWCOUNT │       │                        ║
║          │ COLUMN  │ TABNAME  │       │                        ║
║          │ COLUMN  │ COLNAME  │       │                        ║
║          │         │          │       │                        ║
║                                                                ║
╚════════════════════════════════════════════════════════════════╝
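The self-describing nature of the catalog in 3.2 and 3.3 can be sketched in Python. This is a toy model only: the helper name build_catalog and the dict-based schema are assumptions of the sketch, and the catalog is cut down to exactly the three TABLE columns and two COLUMN columns named in the figures. That cut is the only reason the counts for TABLE and COLUMN come out exact here; a real catalog has further columns and further relvars, which is why the answer to 3.3 gives lower bounds instead.

```python
# Illustrative only: the catalog is modeled as two lists of tuples.
# The column names come from the figures; the helper name build_catalog
# and the dict-based schema are assumptions of this sketch.

def build_catalog(user_relvars):
    """user_relvars maps relvar name -> (list of column names, row count)."""
    # The catalog relvars describe themselves as well as the user relvars.
    headings = {name: cols for name, (cols, _) in user_relvars.items()}
    headings["TABLE"] = ["TABNAME", "COLCOUNT", "ROWCOUNT"]
    headings["COLUMN"] = ["TABNAME", "COLNAME"]

    # COLUMN: one row per column of every relvar, catalog relvars included.
    column = [(t, c) for t, cols in headings.items() for c in cols]

    counts = {name: n for name, (_, n) in user_relvars.items()}
    counts["TABLE"] = len(headings)   # one TABLE row per relvar
    counts["COLUMN"] = len(column)    # one COLUMN row per column

    # TABLE: one row per relvar, giving its column count and row count.
    table = [(t, len(cols), counts[t]) for t, cols in headings.items()]
    return table, column

table, column = build_catalog({
    "S":  (["S#", "SNAME", "STATUS", "CITY"], 5),
    "P":  (["P#", "PNAME", "COLOR", "WEIGHT", "CITY"], 6),
    "SP": (["S#", "P#", "QTY"], 12),
})
for row in table:
    print(row)
```

Note that the ROWCOUNT value for TABLE counts the row that describes TABLE itself, and likewise for COLUMN, so adding any relvar to the database changes the catalog's own entries.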
3.4 The query retrieves supplier number and city for suppliers who supply part P2.
3.5 The meaning of the query is "Get supplier numbers for London suppliers who supply part P2." The first step in processing the query is to replace the name V by the expression that defines V, giving:
   ( ( ( ( S JOIN SP ) WHERE P# = P# ('P2') ) { S#, CITY } )
                       WHERE CITY = 'London' ) { S# }
This simplifies to:
   ( ( S WHERE CITY = 'London' ) JOIN
     ( SP WHERE P# = P# ('P2') ) ) { S# }
For further discussion and explanation, see Chapters 10 and 18.
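The substitution and simplification steps in 3.5 can be checked mechanically. The sketch below uses Python's built-in sqlite3 module and SQL in place of the book's relational expressions, so it is an analogy and not a transcription: the column names SNO and PNO stand in for S# and P#, the sample rows are assumed for illustration, and SQLite's dialect is not standard SQL:1999.

```python
import sqlite3

# Assumed sample data and column names (SNO/PNO stand in for S#/P#).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S  (SNO TEXT PRIMARY KEY, CITY TEXT);
    CREATE TABLE SP (SNO TEXT, PNO TEXT, PRIMARY KEY (SNO, PNO));
    INSERT INTO S  VALUES ('S1','London'), ('S2','Paris'),
                          ('S4','London'), ('S5','Athens');
    INSERT INTO SP VALUES ('S1','P1'), ('S1','P2'), ('S2','P2'),
                          ('S4','P2'), ('S4','P4');
    -- View V: supplier number and city for suppliers who supply part P2.
    CREATE VIEW V AS
        SELECT DISTINCT SNO, CITY FROM S JOIN SP USING (SNO)
        WHERE PNO = 'P2';
""")

def run(sql):
    """Return the single result column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

# The query as the user writes it, against the view.
via_view = run("SELECT DISTINCT SNO FROM V WHERE CITY = 'London'")

# Step 1: replace the name V by its defining expression.
substituted = run("""
    SELECT DISTINCT SNO FROM
        (SELECT DISTINCT SNO, CITY FROM S JOIN SP USING (SNO)
         WHERE PNO = 'P2') AS X
    WHERE CITY = 'London'
""")

# Step 2: the simplified form restricts S and SP before joining them.
simplified = run("""
    SELECT DISTINCT SNO FROM
        (SELECT * FROM S  WHERE CITY = 'London') AS A
        JOIN
        (SELECT * FROM SP WHERE PNO = 'P2') AS B
        USING (SNO)
""")

print(via_view, substituted, simplified)
```

All three formulations return the same supplier numbers on this data. Agreement on one sample proves nothing in general, of course; the equivalence itself rests on the transformation laws discussed in the chapters cited above.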
3.6 Atomicity means that transactions are guaranteed (from a
logical point of view) either to execute in their entirety or not
to execute at all, even if (say) the system fails halfway through
the process. Durability means that once a transaction
successfully commits, its updates are guaranteed to be applied to the database, even if the system subsequently fails at any point.
Isolation means that database updates made by a given transaction
T1 are kept hidden from all distinct transactions T2 until and
unless T1 successfully commits. Serializability means that the
interleaved execution of a set of concurrent transactions is
guaranteed to produce the same result as executing those same
transactions one at a time in some (unspecified) serial order.
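Atomicity as described in 3.6 can be observed directly with Python's built-in sqlite3 module. The sketch below is an assumption-laden illustration: the ACCOUNT table, its CHECK constraint, and the transfer function are hypothetical, and SQLite spells the operators BEGIN, COMMIT, and ROLLBACK. It shows only the all-or-nothing behavior; durability, isolation, and serializability need more than one connection (and a crash or two) to demonstrate.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# explicit BEGIN / COMMIT / ROLLBACK statements below delimit transactions.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE ACCOUNT (ID TEXT PRIMARY KEY, "
    "BALANCE INTEGER CHECK (BALANCE >= 0))"
)
conn.execute("INSERT INTO ACCOUNT VALUES ('A', 100), ('B', 50)")

def transfer(amount):
    """Move amount from A to B as one all-or-nothing unit of work."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE ACCOUNT SET BALANCE = BALANCE + ? WHERE ID = 'B'",
            (amount,),
        )
        # This update violates the CHECK constraint if A would go negative.
        conn.execute(
            "UPDATE ACCOUNT SET BALANCE = BALANCE - ? WHERE ID = 'A'",
            (amount,),
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")   # undoes the first update as well
        return False

def balances():
    return dict(conn.execute("SELECT ID, BALANCE FROM ACCOUNT"))

ok = transfer(30)
after_commit = balances()
failed = transfer(500)
after_rollback = balances()
print(ok, after_commit, failed, after_rollback)
```

The second transfer fails partway through, after B has already been credited, yet the balances afterward are exactly those left by the first, committed transfer.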
3.7 The Information Principle states that the entire information
content of the database is represented in one and only one way: namely, as explicit values in column positions in rows in tables. Equivalently: The database contains relvars, and nothing but
relvars. Note: As indicated in the chapter, The Information
Principle might better be called The Principle of Uniform
Representation.
3.8 No answer provided.
Chapter 4
An Introduction to SQL
Principal Sections
• Overview
• The catalog
• Views
• Transactions
• Embedded SQL
• Dynamic SQL and SQL/CLI
• SQL isn't perfect
General Remarks
The overall purpose of Chapter 3 was to give the student the big picture of what relational systems in general are (or should be!) all about. By contrast, the overall purpose of the present
chapter is to give the student the big picture of what SQL systems
in particular are all about.
All SQL discussions in the book are based on the current
standard SQL:1999 (except for a few brief mentions here and there
of the expected next version, SQL:2003). Warn the students that
"their mileage may vary" when it comes to commercial SQL
dialects!──see reference [4.22]. Also warn them that we
deliberately won't be using SQL as a vehicle for teaching database
principles; we'll cover the principles first and then consider how
(and to what extent) those principles are realized──or departed
from──in SQL afterward. While SQL is obviously important from a
pragmatic standpoint, it's a very poor realization of proper
database principles, as well as being a very poorly designed
language from just about any standpoint. Better that students
learn proper concepts and principles first before getting their heads bent out of shape by SQL.
Incidentally, I can't resist the temptation to point out that it's really a bit of a joke──or a confidence trick──to be talking
about "SQL:2003," when nobody has yet implemented even SQL:1992 in
its entirety, let alone SQL:1999. Nor in fact could anybody do so!──given that SQL:1992 is full of gaps and contradictions, gaps and contradictions that still exist in SQL:1999 and will certainly still exist in SQL:2003 as well. See reference [4.20], Appendix
D, for an extended discussion of some of those gaps and
contradictions.
I also can't resist mentioning the fact that upgrading the SQL coverage to the SQL:1999 level caused me more trouble than
anything else in producing the eighth edition. The 1999 standard
is simultaneously enormous in size and extremely hard to
understand (in this regard, you can get a sense of the general flavor from the not atypical quote that appears in Chapter 10, Section 10.6; that same quote is repeated in Chapter 10 of this manual).
The foregoing negative remarks notwithstanding, the chapter
per se contains little in the way of detailed or specific
criticism; rather, such criticisms appear, where relevant, at
appropriate points in later chapters. See also references
[4.15-4.20] at the end of the chapter. Note: The chapter and all "SQL
Facilities" sections in later chapters could be skipped if the course is concerned only with principles and not pragma. But few instructors are likely to enjoy such a luxury.
One point instructors need to be aware of: Exercise 4.1
introduces the extended version of the running suppliers-and-parts
example (viz., suppliers, parts, and projects). Subsequent
chapters tend to use suppliers-and-parts as a basis for the main body of the text and suppliers, parts, and projects as a basis for
exercises; however, this separation is not rigidly adhered to. Be
aware, therefore, that there might be some occasional potential for confusion in this area. The endpapers can help here (Figs 3.8 and 4.5 are both repeated inside the back cover).
BNF Notation
Chapter 4 is the first in the book to use standard BNF notation,
or rather a simple variant thereof. The variant in
question──which isn't explained in detail in the book──is defined
as follows:
• Special characters and material in uppercase must be written exactly as shown. Material in lowercase enclosed in angle brackets "<" and ">" represents a syntactic category that
appears on the left side of another production rule, and hence must eventually be replaced by specific items chosen by the user.
• Vertical bars "|" are used to separate alternatives.
• Square brackets "[" and "]" are used to indicate that the material enclosed in those brackets is optional.
The text also makes extensive use of a shorthand based on
lists and commalists. These terms are explained in the book (in
Sections 5.4 and 4.6, respectively), but I'll repeat the
explanations here for convenience. Let <xyz> denote an arbitrary
syntactic category (i.e., anything that appears on the left side
of some BNF production rule). Then:
• The expression <xyz list> denotes a sequence of zero or more
<xyz>s in which each pair of adjacent <xyz>s is separated by
one or more blanks.
• The expression <xyz commalist> denotes a sequence of zero or more <xyz>s in which each pair of adjacent <xyz>s is separated
by a comma (and possibly one or more blanks on either side of the comma).
Give some simple examples.
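One way to make the list and commalist shorthands concrete is a pair of tiny splitting functions. This is a sketch under stated assumptions only: the function names parse_list and parse_commalist are hypothetical, and each <xyz> item is assumed to be a simple token containing no blanks or commas of its own (true of a column name, say, but not of an arbitrary expression).

```python
import re

# Hypothetical helpers: each takes the text of an <xyz list> or an
# <xyz commalist> and returns the individual <xyz> items as strings.
# Items are assumed to contain no blanks or commas themselves.

def parse_list(text):
    """<xyz list>: zero or more items separated by one or more blanks."""
    return text.split()

def parse_commalist(text):
    """<xyz commalist>: zero or more items separated by commas, with
    optional blanks on either side of each comma."""
    text = text.strip()
    if not text:
        return []            # the empty commalist
    return re.split(r"\s*,\s*", text)

print(parse_list("A B   C"))               # ['A', 'B', 'C']
print(parse_commalist("S# , SNAME,CITY"))  # ['S#', 'SNAME', 'CITY']
print(parse_commalist(""))                 # []
```

The empty cases matter: both shorthands are defined as zero or more items, so an empty list and an empty commalist are legal as far as the notation goes, whatever a particular production rule might go on to require.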
4.2 Overview
SQL talks in terms of tables (and rows and columns), not relations
(and tuples and attributes). SQL is often said to include both
data definition and data manipulation facilities (though these
terms have become increasingly inappropriate as SQL has expanded
to become a computationally complete programming language*). It also includes a bunch of miscellaneous other facilities.
──────────
* With the ratification of SQL/PSM in 1996, SQL is indeed now computationally complete──entire applications can now be written
in SQL, without any need for a distinct host language (except for I/O facilities, which SQL doesn't provide).
──────────
Regarding data definition, cover CREATE TABLE and (briefly)
built-in scalar types. Note: User-defined types were added in
SQL:1999, and we'll discuss them in detail in the next chapter (we'll say a bit more in that chapter about built-in types as
well). Do not discuss SQL-style "domains"! (See reference [4.20]
for an explanation of how SQL-style domains differ from true
types.)
Regarding data manipulation, cover SELECT (including "SELECT
*" and SELECT formulations of restrict, project, and join queries)
and set-level INSERT, DELETE, and UPDATE (no relational assignment
as such!). Note carefully, however, that this section
deliberately doesn't get into a lot of detail on SELECT (and so the exercises and answers don't, either); such matters are
deferred to Section 8.6, after the relevant relational concepts have been described.* INSERT, DELETE, and UPDATE, by contrast,
are not explained much further in any later chapter (the treatment
here is it, more or less).
──────────
* If you like, you could beef up the treatment of SELECT by
bringing in some of the material from Section 8.6 in here.
──────────
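The SELECT formulations of restrict, project, and join mentioned above, together with set-level UPDATE and DELETE, might be demonstrated along the following lines. The sketch drives SQLite through Python's built-in sqlite3 module, so the dialect is SQLite's and not standard SQL:1999; the column names SNO and PNO stand in for S# and P#, and the handful of sample rows is assumed for illustration.

```python
import sqlite3

# Assumed tables and data; SNO/PNO stand in for S#/P#, and SQLite's
# dialect differs from standard SQL:1999 in places.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S (
        SNO    TEXT    PRIMARY KEY,
        SNAME  TEXT    NOT NULL,
        STATUS INTEGER NOT NULL,
        CITY   TEXT    NOT NULL
    );
    CREATE TABLE SP (
        SNO TEXT    NOT NULL REFERENCES S (SNO),
        PNO TEXT    NOT NULL,
        QTY INTEGER NOT NULL,
        PRIMARY KEY (SNO, PNO)
    );
    INSERT INTO S  VALUES ('S1','Smith',20,'London'),
                          ('S2','Jones',10,'Paris'),
                          ('S3','Blake',30,'Paris');
    INSERT INTO SP VALUES ('S1','P1',300), ('S2','P1',300), ('S2','P2',400);
""")

def q(sql):
    """Run a query and return its rows in sorted order."""
    return sorted(conn.execute(sql).fetchall())

# Restrict: some rows, all columns.
restrict_ = q("SELECT * FROM S WHERE CITY = 'Paris'")
# Project: some columns; DISTINCT is needed to eliminate duplicate rows.
project_ = q("SELECT DISTINCT CITY FROM S")
# Join: rows of S and SP combined on matching supplier numbers.
join_ = q("SELECT S.SNO, S.CITY, SP.PNO FROM S, SP WHERE S.SNO = SP.SNO")

# Set-level UPDATE: one statement changes every qualifying row.
conn.execute("UPDATE S SET STATUS = STATUS * 2 WHERE CITY = 'Paris'")
# Set-level DELETE: remove all shipments for suppliers in Paris.
conn.execute(
    "DELETE FROM SP WHERE SNO IN (SELECT SNO FROM S WHERE CITY = 'Paris')"
)
conn.commit()

statuses = q("SELECT SNO, STATUS FROM S")
shipments = q("SELECT * FROM SP")
print(restrict_, project_, join_, statuses, shipments, sep="\n")
```

The projection is the one worth dwelling on in class: without DISTINCT, SQL returns Paris twice, which is a first glimpse of the duplicate-row problem taken up in later chapters.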
4.3 The Catalog / 4.4 Views / 4.5 Transactions
Briefly survey the relevant SQL features:
• Information Schemas
• CREATE VIEW and the substitution mechanism (how does it look
in SQL?──leads to a brief introduction to nested subqueries).
Do not get into details of SQL view updating.
• START TRANSACTION, COMMIT WORK, ROLLBACK WORK. No need to
get into the effect of these operations on cursors yet (unless anyone asks)──that material's covered in Chapter 15. Don't
mention SET TRANSACTION. Note: START TRANSACTION was added
in SQL:1999; prior to that, transactions could be started in SQL only implicitly, a state of affairs that caused some
grief. For reasons of backward compatibility, of course, it's still possible to start transactions implicitly, but I
wouldn't get into this unless anyone asks about it. A tiny point of syntax: It's really odd that the SQL committee chose
to call the operator START TRANSACTION and not BEGIN
TRANSACTION, given that BEGIN was already a reserved word and START wasn't. An illustration of the point that designing a language by committee isn't a very good idea?
4.6 Embedded SQL
This section is probably the most important in the chapter; it gives details (some of them unfortunately a little tedious) that don't logically belong anywhere else in the book. Discuss:
• The dual-mode principle