8.4 If supplier S2 currently supplies no parts, the original query will return all supplier numbers currently appearing in S
(including in particular S2, who presumably appears in S but not
in SP). If we replace SX by SPX throughout, it will return all
supplier numbers currently appearing in SP. The difference
between the two formulations is thus as follows: The first means
"Get supplier numbers for suppliers who supply at least all those parts supplied by supplier S2" (as required). The second means
"Get supplier numbers for suppliers who supply at least one part and supply at least all those parts supplied by supplier S2."
8.5 a. Get part name and city for parts supplied to every project
in Paris by every supplier in London in a quantity less than 500.
b. The result of this query is empty.
8.6 This exercise is very difficult, especially when we take into account the fact that part weights aren't unique. (If they were,
we could paraphrase the query as "Get all parts such that the
count of heavier parts is less than three.") The exercise is so difficult, in fact, that we don't even attempt to give a pure
calculus solution here. It illustrates very well the point that
relational completeness is only a basic measure of expressive
power, and probably not a sufficient one. (The next two exercises also illustrate this point.) See reference [7.5] for an extended discussion of queries of this type.
8.7 Let PSA, PSB, PSC, ..., PSn be range variables ranging over
(the current value of) relvar PART_STRUCTURE, and suppose the
given part is part P1. Then:
a. A calculus expression for the query "Get part numbers for all
parts that are components, at the first level, of part P1" is:
   PSA.MINOR_P# WHERE PSA.MAJOR_P# = P# ( 'P1' )
b. A calculus expression for the query "Get part numbers for all
parts that are components, at the second level, of part P1"
is:
   PSB.MINOR_P# WHERE EXISTS PSA
      ( PSA.MAJOR_P# = P# ( 'P1' ) AND PSB.MAJOR_P# = PSA.MINOR_P# )
c. A calculus expression for the query "Get part numbers for all
parts that are components, at the third level, of part P1" is:
   PSC.MINOR_P# WHERE EXISTS PSA EXISTS PSB
      ( PSA.MAJOR_P# = P# ( 'P1' ) AND PSB.MAJOR_P# = PSA.MINOR_P# AND PSC.MAJOR_P# = PSB.MINOR_P# )
And so on. A calculus expression for the query "Get part
numbers for all parts that are components, at the nth level,
of part P1" is:
   PSn.MINOR_P# WHERE EXISTS PSA EXISTS PSB ... EXISTS PS(n-1)
      ( PSA.MAJOR_P# = P# ( 'P1' ) AND PSB.MAJOR_P# = PSA.MINOR_P# AND PSC.MAJOR_P# = PSB.MINOR_P# AND ... AND
        PSn.MAJOR_P# = PS(n-1).MINOR_P# )
All of these result relations a., b., c., ... then need to be
"unioned" together to construct the PART_BILL result.
The problem is, of course, that there's no way to write n such expressions if the value of n is unknown. In fact, the part
explosion query is a classic illustration of a problem that can't
be formulated by means of a single expression in a language that's only relationally complete (i.e., a language that's no more
powerful than the original calculus or algebra). We therefore need another extension to the original calculus (and algebra). The TCLOSE operator discussed briefly in Chapter 7 is part of the solution to this problem (but only part). Further details are beyond the scope of this book.
Note: Although this problem is usually referred to as "bill-of-materials" or "parts explosion," it's actually of much wider applicability than those names might suggest. In fact, the kind
of relationship typified by the "parts contain parts" structure occurs in a very wide range of applications. Other examples
include management hierarchies, family trees, authorization
graphs, communication networks, software module invocation
structures, transportation networks, etc., etc.
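To make the point about the unknown n concrete, here is a minimal sketch in Python (not in the calculus) of the level-by-level union, repeated until no new components appear. The sample PART_STRUCTURE tuples and the function name explode are assumptions made for illustration only; the loop is the ingredient that a merely relationally complete language lacks.

```python
# Hypothetical PART_STRUCTURE tuples: (MAJOR_P#, MINOR_P#).
PART_STRUCTURE = {
    ("P1", "P2"), ("P1", "P3"),
    ("P2", "P4"), ("P4", "P5"),
    ("P6", "P7"),
}

def explode(part, structure):
    """Return the components of `part` at every level (first, second, ...)."""
    result = set()
    frontier = {part}
    while frontier:
        # Components exactly one level below the parts in the current frontier.
        next_level = {minor for (major, minor) in structure if major in frontier}
        # Only parts not seen before need exploring, so cycles cannot loop forever.
        frontier = next_level - result
        result |= next_level
    return result

components = explode("P1", PART_STRUCTURE)
```

Each pass of the loop corresponds to one of the expressions a., b., c., ... above, and the number of passes depends on the data rather than on the text of the query.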
8.8 This query can't be expressed in either the calculus or the algebra. For example, to express it in the calculus, we would basically need to be able to say something like the following:
   Does there exist a relation r such that there exists a tuple t
   in r such that t.S# = S#('S1')?
In other words, we would need to be able to quantify over
relations instead of tuples, and we would therefore need a new kind of range variable, one that denoted relations instead of
tuples. The query therefore can't be expressed in the relational calculus as currently defined.
Note, incidentally, that the query under discussion is a
"yes/no" query (the desired answer is basically a truth value). You might be tempted to think, therefore, that the reason the
query can't be handled in the calculus or the algebra is that
calculus and algebra expressions are relation-valued, not
truth-valued. However, yes/no queries can be handled in the calculus
and algebra if properly implemented! The crux of the matter is to recognize that yes and no (equivalently, TRUE and FALSE) are
representable as relations. The relations in question are
TABLE_DEE and TABLE_DUM, respectively.
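The following Python sketch illustrates how a yes/no answer can come back as a relation. It assumes a toy representation in which a relation is a set of tuples and each tuple is a frozenset of (attribute, value) pairs; the helpers restrict and project and the sample data are hypothetical. Projecting over no attributes at all yields TABLE_DEE if the operand has at least one tuple and TABLE_DUM otherwise.

```python
# Toy representation (an assumption of this sketch): a relation is a set of
# tuples, and each tuple is a frozenset of (attribute, value) pairs.
TABLE_DEE = {frozenset()}   # no attributes, one (empty) tuple: "yes"
TABLE_DUM = set()           # no attributes, no tuples: "no"

def restrict(relation, predicate):
    """Keep the tuples for which predicate(tuple-as-dict) is true."""
    return {t for t in relation if predicate(dict(t))}

def project(relation, attrs):
    """Project each tuple onto attrs; duplicates vanish because it is a set."""
    return {frozenset((a, v) for (a, v) in t if a in attrs) for t in relation}

# Hypothetical shipments relation.
SP = {
    frozenset({("S#", "S1"), ("P#", "P1")}),
    frozenset({("S#", "S2"), ("P#", "P2")}),
}

# "Does supplier S1 supply any part?" and the same question for S9.
answer_s1 = project(restrict(SP, lambda t: t["S#"] == "S1"), set())
answer_s9 = project(restrict(SP, lambda t: t["S#"] == "S9"), set())
```

Here answer_s1 is TABLE_DEE and answer_s9 is TABLE_DUM, so the truth value is carried entirely by the presence or absence of the single empty tuple.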
8.9 In order to show that SQL is relationally complete, we have to show, first, (a) that there exist SQL expressions for each of the five primitive (algebraic) operators restrict, project, product, union, and difference, and then (b) that the operands to those SQL expressions can be arbitrary SQL expressions in turn.
We begin by observing that SQL effectively does support the relational algebra RENAME operator, thanks to the availability of
the optional "AS <column name>" specification on items in the
SELECT clause.* We can therefore ensure that all tables do have proper column names, and in particular that the operands to
product, union, and difference satisfy the requirements of (our version of) the algebra with respect to column naming.
Furthermore, provided those operand column-naming requirements are indeed satisfied, the SQL column name inheritance rules in fact coincide with those of the algebra as described (under the name
relation type inference) in Chapter 7.
──────────
* To state the matter a little more precisely: An SQL analog of the algebraic expression T RENAME A AS B is the (extremely
inconvenient!) SQL expression SELECT A AS B, X, Y, ..., Z FROM T (where X, Y, ..., Z are all of the columns of T apart from A, and
we choose to overlook the fact that the SQL expression results in
a table with a left-to-right ordering to its columns).
──────────
Here then are SQL expressions corresponding approximately to the five primitive operators:
   Algebra               SQL
   A WHERE p             SELECT * FROM A WHERE p
   A { X, Y, ..., Z }    SELECT DISTINCT X, Y, ..., Z FROM A
   A TIMES B             SELECT * FROM A, B
   A UNION B             SELECT * FROM A UNION SELECT * FROM B
   A MINUS B             SELECT * FROM A EXCEPT SELECT * FROM B
Reference [4.20] shows that each of A and B in the SQL
expressions shown above is in fact a <table reference>. It also
shows that if we take any of the five SQL expressions shown and
enclose it in parentheses, what results is in turn a <table
reference>.* It follows that SQL is indeed relationally complete.
──────────
* We ignore the fact that SQL would in fact require such a <table reference> to include a pointless range variable definition.
──────────
Note: Actually there is a glitch in the foregoing: SQL fails
to support projection over no columns at all (because it also
fails to support empty SELECT clauses). As a consequence, it
doesn't support TABLE_DEE or TABLE_DUM.
8.10 SQL supports EXTEND but not SUMMARIZE (at least, not very
directly). Regarding EXTEND, the relational algebra expression
   EXTEND A ADD exp AS Z
can be represented in SQL as
   SELECT A.*, exp' AS Z
   FROM ( A ) AS A
The expression exp' in the SELECT clause is the SQL counterpart of the EXTEND operand exp. The parenthesized A in the FROM clause is
a <table reference> of arbitrary complexity (corresponding to the EXTEND operand A); the other A in the FROM clause is a range
variable name.
Regarding SUMMARIZE, the basic problem is that the relational algebra expression
   SUMMARIZE A PER B ...
yields a result with cardinality equal to that of B, while the SQL
"equivalent"
   SELECT ...
   FROM A
   GROUP BY C ;
yields a result with cardinality equal to that of the projection
of A over C.
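The cardinality difference can be observed with a small experiment. The sketch below uses Python's standard sqlite3 module; the table contents are hypothetical, and the column names SNO and PNO stand in for S# and P#, which SQLite doesn't accept as unquoted identifiers. The LEFT JOIN query is just one possible workaround for approximating SUMMARIZE SP PER S { S# }, not a construct taken from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S  (SNO TEXT PRIMARY KEY);
    CREATE TABLE SP (SNO TEXT, PNO TEXT, PRIMARY KEY (SNO, PNO));
    INSERT INTO S  VALUES ('S1'), ('S2'), ('S3');         -- S3 supplies nothing
    INSERT INTO SP VALUES ('S1','P1'), ('S1','P2'), ('S2','P1');
""")

# Plain GROUP BY: one row per SNO value that actually appears in SP.
group_by = conn.execute(
    "SELECT SNO, COUNT(*) FROM SP GROUP BY SNO ORDER BY SNO"
).fetchall()

# SUMMARIZE-like result: one row per supplier in S, including a zero count.
summarize_like = conn.execute("""
    SELECT S.SNO, COUNT(SP.PNO)
    FROM S LEFT JOIN SP ON SP.SNO = S.SNO
    GROUP BY S.SNO
    ORDER BY S.SNO
""").fetchall()
```

With this data the GROUP BY result has two rows, whereas the SUMMARIZE-like result has three, one of them showing a count of zero for S3.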
8.11 SQL doesn't support relational comparisons directly.
However, such operations can be simulated, albeit only in a very cumbersome manner. For example, the comparison
   A = B
(where A and B are relvars) can be simulated by the SQL expression
   NOT EXISTS ( SELECT * FROM A
                WHERE NOT EXISTS ( SELECT * FROM B
                                   WHERE A-row = B-row ) )
   AND
   NOT EXISTS ( SELECT * FROM B
                WHERE NOT EXISTS ( SELECT * FROM A
                                   WHERE B-row = A-row ) )
(where A-row and B-row are <row value constructor>s representing
an entire row of A and an entire row of B, respectively).
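As a rough check of the simulation, the following sketch runs the double NOT EXISTS pattern through Python's standard sqlite3 module. It assumes two hypothetical tables A and B, each with columns X and Y, and spells out the row comparison column by column instead of using a <row value constructor>; the helper name tables_equal is likewise an assumption.

```python
import sqlite3

# The comparison A = B, with "A-row = B-row" written out column by column.
EQUAL_SQL = """
SELECT NOT EXISTS ( SELECT * FROM A
                    WHERE NOT EXISTS ( SELECT * FROM B
                                       WHERE A.X = B.X AND A.Y = B.Y ) )
   AND NOT EXISTS ( SELECT * FROM B
                    WHERE NOT EXISTS ( SELECT * FROM A
                                       WHERE B.X = A.X AND B.Y = A.Y ) )
"""

def tables_equal(rows_a, rows_b):
    """Load the two row lists into tables A and B and evaluate EQUAL_SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (X INTEGER, Y INTEGER)")
    conn.execute("CREATE TABLE B (X INTEGER, Y INTEGER)")
    conn.executemany("INSERT INTO A VALUES (?, ?)", rows_a)
    conn.executemany("INSERT INTO B VALUES (?, ?)", rows_b)
    (result,) = conn.execute(EQUAL_SQL).fetchone()
    conn.close()
    return bool(result)

same = tables_equal([(1, 2), (3, 4)], [(3, 4), (1, 2)])
different = tables_equal([(1, 2)], [(1, 2), (3, 4)])
```

The first half of the expression checks that every row of A appears in B and the second half checks the converse, so both halves are needed; dropping either one would test only for inclusion.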
8.12 Here are a few such formulations. Note that the following list isn't even close to being exhaustive [4.19]. Note too that this is a very simple query!
   SELECT DISTINCT S.SNAME
   FROM   S
   WHERE  S.S# IN
        ( SELECT SP.S#
          FROM   SP
          WHERE  SP.P# = P#('P2') ) ;

   SELECT DISTINCT T.SNAME
   FROM ( S NATURAL JOIN SP ) AS T
   WHERE  T.P# = P#('P2') ;

   SELECT DISTINCT T.SNAME
   FROM ( S JOIN SP ON S.S# = SP.S# AND SP.P# = P#('P2') ) AS T ;

   SELECT DISTINCT T.SNAME
   FROM ( S JOIN SP USING ( S# ) ) AS T
   WHERE  T.P# = P#('P2') ;

   SELECT DISTINCT S.SNAME
   FROM   S
   WHERE  EXISTS
        ( SELECT *
          FROM   SP
          WHERE  SP.S# = S.S#
          AND    SP.P# = P#('P2') ) ;

   SELECT DISTINCT S.SNAME
   FROM   S, SP
   WHERE  S.S# = SP.S#
   AND    SP.P# = P#('P2') ;

   SELECT DISTINCT S.SNAME
   FROM   S
   WHERE  0 <
        ( SELECT COUNT(*)
          FROM   SP
          WHERE  SP.S# = S.S#
          AND    SP.P# = P#('P2') ) ;

   SELECT DISTINCT S.SNAME
   FROM   S
   WHERE  P#('P2') IN
        ( SELECT SP.P#
          FROM   SP
          WHERE  SP.S# = S.S# ) ;

   SELECT S.SNAME
   FROM   S, SP
   WHERE  S.S# = SP.S#
   AND    SP.P# = P#('P2')
   GROUP  BY S.SNAME ;
Subsidiary question: What are the implications of the
foregoing? Answer: The language is harder to document, teach,
learn, remember, use, and implement efficiently, than it ought to
be.
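As an informal sanity check that such formulations really do agree, the sketch below runs five of them against a small hypothetical database using Python's standard sqlite3 module. The column names SNO and PNO stand in for S# and P#, and plain string literals replace the P#('P2') selector; both substitutions are assumptions made to suit SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S  (SNO TEXT PRIMARY KEY, SNAME TEXT);
    CREATE TABLE SP (SNO TEXT, PNO TEXT, PRIMARY KEY (SNO, PNO));
    INSERT INTO S  VALUES ('S1','Smith'), ('S2','Jones'), ('S3','Blake');
    INSERT INTO SP VALUES ('S1','P1'), ('S1','P2'), ('S2','P2'), ('S3','P3');
""")

# Five ways of asking for the names of suppliers who supply part P2.
FORMULATIONS = [
    "SELECT DISTINCT S.SNAME FROM S WHERE S.SNO IN "
    "( SELECT SP.SNO FROM SP WHERE SP.PNO = 'P2' )",
    "SELECT DISTINCT SNAME FROM S NATURAL JOIN SP WHERE PNO = 'P2'",
    "SELECT DISTINCT S.SNAME FROM S WHERE EXISTS "
    "( SELECT * FROM SP WHERE SP.SNO = S.SNO AND SP.PNO = 'P2' )",
    "SELECT DISTINCT S.SNAME FROM S, SP "
    "WHERE S.SNO = SP.SNO AND SP.PNO = 'P2'",
    "SELECT S.SNAME FROM S, SP "
    "WHERE S.SNO = SP.SNO AND SP.PNO = 'P2' GROUP BY S.SNAME",
]

# Sort each result so that row order does not affect the comparison.
results = [sorted(conn.execute(q).fetchall()) for q in FORMULATIONS]
all_agree = all(r == results[0] for r in results)
```

Agreement on one sample database is of course only evidence, not proof, that the formulations are equivalent.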
8.13 We've numbered the following solutions as 8.13.n, where 7.n
is the number of the original exercise in Chapter 7. We assume that SX, SY, PX, PY, JX, JY, SPJX, SPJY (etc.) are range variables ranging over suppliers, parts, projects, and shipments,
respectively; definitions of those range variables aren't shown.
8.13.13 JX
8.13.14 JX WHERE JX.CITY = 'London'
8.13.15 SPJX.S# WHERE SPJX.J# = J# ( 'J1' )
8.13.16 SPJX WHERE SPJX.QTY ≥ QTY ( 300 ) AND
SPJX.QTY ≤ QTY ( 750 )
8.13.17 { PX.COLOR, PX.CITY }
8.13.18 { SX.S#, PX.P#, JX.J# } WHERE SX.CITY = PX.CITY
AND PX.CITY = JX.CITY
8.13.19 { SX.S#, PX.P#, JX.J# } WHERE SX.CITY ≠ PX.CITY
OR PX.CITY ≠ JX.CITY
OR JX.CITY ≠ SX.CITY
8.13.20 { SX.S#, PX.P#, JX.J# } WHERE SX.CITY ≠ PX.CITY
AND PX.CITY ≠ JX.CITY AND JX.CITY ≠ SX.CITY
8.13.21 SPJX.P# WHERE EXISTS SX ( SX.S# = SPJX.S# AND
SX.CITY = 'London' )
8.13.22 SPJX.P# WHERE EXISTS SX EXISTS JX
( SX.S# = SPJX.S# AND SX.CITY = 'London' AND JX.J# = SPJX.J# AND JX.CITY = 'London' )
8.13.23 { SX.CITY AS SCITY, JX.CITY AS JCITY }
WHERE EXISTS SPJX ( SPJX.S# = SX.S# AND SPJX.J# = JX.J# )
8.13.24 SPJX.P# WHERE EXISTS SX EXISTS JX
( SX.CITY = JX.CITY AND SPJX.S# = SX.S# AND SPJX.J# = JX.J# )
8.13.25 SPJX.J# WHERE EXISTS SX EXISTS JX
( SX.CITY ≠ JX.CITY AND SPJX.S# = SX.S# AND SPJX.J# = JX.J# )
8.13.26 { SPJX.P# AS XP#, SPJY.P# AS YP# }
WHERE SPJX.S# = SPJY.S# AND SPJX.P# < SPJY.P#
8.13.27 COUNT ( SPJX.J# WHERE SPJX.S# = S# ( 'S1' ) ) AS N
8.13.28 SUM ( SPJX WHERE SPJX.S# = S# ( 'S1' )
AND SPJX.P# = P# ( 'P1' ), QTY ) AS Q
Note: The following "solution" is not correct (why not?):
SUM ( SPJX.QTY WHERE SPJX.S# = S# ( 'S1' )
AND SPJX.P# = P# ( 'P1' ) ) AS Q
Answer: Because duplicate QTY values will now be eliminated
before the sum is computed.
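The effect is easy to reproduce outside the calculus. In the following Python sketch the shipment tuples are hypothetical; the correct sum iterates over whole shipments, whereas the incorrect one first projects onto QTY, which (being a relation, hence a set) keeps each distinct quantity only once.

```python
# Hypothetical shipments of part P1 by supplier S1: (S#, P#, J#, QTY).
shipments = [
    ("S1", "P1", "J1", 200),
    ("S1", "P1", "J4", 200),   # same QTY as the J1 shipment
    ("S1", "P1", "J5", 100),
]

# Correct: sum QTY over the shipment tuples themselves.
correct_total = sum(qty for (_, _, _, qty) in shipments)

# Incorrect: project onto QTY first, which collapses the two 200s into one.
distinct_qtys = {qty for (_, _, _, qty) in shipments}
wrong_total = sum(distinct_qtys)
```

With this data the correct total is 500, while the version that sums the projection gives 300.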
8.13.29 { SPJX.P#, SPJX.J#,
SUM ( SPJY WHERE SPJY.P# = SPJX.P#
AND SPJY.J# = SPJX.J#, QTY ) AS Q }
8.13.30 SPJX.P# WHERE
AVG ( SPJY WHERE SPJY.P# = SPJX.P#
AND SPJY.J# = SPJX.J#, QTY ) > QTY ( 350 )
8.13.31 JX.JNAME WHERE EXISTS SPJX ( SPJX.J# = JX.J# AND
SPJX.S# = S# ( 'S1' ) )
8.13.32 PX.COLOR WHERE EXISTS SPJX ( SPJX.P# = PX.P# AND
SPJX.S# = S# ( 'S1' ) )
8.13.33 SPJX.P# WHERE EXISTS JX ( JX.CITY = 'London' AND
JX.J# = SPJX.J# )
8.13.34 SPJX.J# WHERE EXISTS SPJY ( SPJX.P# = SPJY.P# AND
SPJY.S# = S# ( 'S1' ) )
8.13.35 SPJX.S# WHERE EXISTS SPJY EXISTS SPJZ EXISTS PX
( SPJX.P# = SPJY.P# AND SPJY.S# = SPJZ.S# AND SPJZ.P# = PX.P# AND PX.COLOR = COLOR ( 'Red' ) )
8.13.36 SX.S# WHERE EXISTS SY ( SY.S# = S# ( 'S1' ) AND
SX.STATUS < SY.STATUS )
8.13.37 JX.J# WHERE FORALL JY ( JY.CITY ≥ JX.CITY )
Or: JX.J# WHERE JX.CITY = MIN ( JY.CITY )
8.13.38 SPJX.J# WHERE SPJX.P# = P# ( 'P1' ) AND
AVG ( SPJY WHERE SPJY.P# = P# ( 'P1' )
AND SPJY.J# = SPJX.J#, QTY ) > MAX ( SPJZ.QTY WHERE SPJZ.J# = J# ( 'J1' ) )
8.13.39 SPJX.S# WHERE SPJX.P# = P# ( 'P1' )
AND SPJX.QTY >
AVG ( SPJY
WHERE SPJY.P# = P# ( 'P1' ) AND SPJY.J# = SPJX.J#, QTY )
8.13.40 JX.J# WHERE NOT EXISTS SPJX EXISTS SX EXISTS PX
( SX.CITY = 'London' AND PX.COLOR = COLOR ( 'Red' ) AND SPJX.S# = SX.S# AND
SPJX.P# = PX.P# AND SPJX.J# = JX.J# )
8.13.41 JX.J# WHERE FORALL SPJY ( IF SPJY.J# = JX.J#
THEN SPJY.S# = S# ( 'S1' ) END IF )
8.13.42 PX.P# WHERE FORALL JX
( IF JX.CITY = 'London' THEN EXISTS SPJY ( SPJY.P# = PX.P# AND
SPJY.J# = JX.J# ) END IF )
8.13.43 SX.S# WHERE EXISTS PX FORALL JX EXISTS SPJY
( SPJY.S# = SX.S# AND SPJY.P# = PX.P# AND SPJY.J# = JX.J# )
8.13.44 JX.J# WHERE FORALL SPJY ( IF SPJY.S# = S# ( 'S1' ) THEN
EXISTS SPJZ ( SPJZ.J# = JX.J# AND SPJZ.P# = SPJY.P# ) END IF )
8.13.45 RANGEVAR VX RANGES OVER
( SX.CITY ), ( PX.CITY ), ( JX.CITY ) ;
VX.CITY
8.13.46 SPJX.P# WHERE EXISTS SX ( SX.S# = SPJX.S# AND
SX.CITY = 'London' )
OR EXISTS JX ( JX.J# = SPJX.J# AND
JX.CITY = 'London' )
8.13.47 { SX.S#, PX.P# }
WHERE NOT EXISTS SPJX ( SPJX.S# = SX.S# AND
SPJX.P# = PX.P# )
8.13.48 { SX.S# AS XS#, SY.S# AS YS# }
WHERE FORALL PZ
( ( IF EXISTS SPJX ( SPJX.S# = SX.S# AND
SPJX.P# = PZ.P# ) THEN EXISTS SPJY ( SPJY.S# = SY.S# AND
SPJY.P# = PZ.P# ) END IF )
AND ( IF EXISTS SPJY ( SPJY.S# = SY.S# AND
SPJY.P# = PZ.P# ) THEN EXISTS SPJX ( SPJX.S# = SX.S# AND
SPJX.P# = PZ.P# ) END IF ) )
8.13.49 { SPJX.S#, SPJX.P#, { SPJY.J#, SPJY.QTY WHERE
SPJY.S# = SPJX.S# AND SPJY.P# = SPJX.P# } AS JQ }
8.13.50 Let R denote the result of evaluating the expression shown
in the previous solution. Then:
RANGEVAR RX RANGES OVER R ,
RANGEVAR RY RANGES OVER RX.JQ ;
{ RX.S#, RX.P#, RY.J#, RY.QTY }
We're extending the syntax and semantics of <range var def>
slightly here. The idea is that the definition of RY depends on that of RX (note that the two definitions are separated by a
comma, not a semicolon, and are thereby bundled into a single
operation). See reference [3.3] for further discussion.
8.14 We've numbered the following solutions as 8.14.n, where 7.n
is the number of the original exercise in Chapter 7.
8.14.13 SELECT *
FROM J ;
Or simply:
TABLE J ;
8.14.14 SELECT J.*
FROM J
WHERE J.CITY = 'London' ;
8.14.15 SELECT DISTINCT SPJ.S#
FROM SPJ
WHERE SPJ.J# = J#('J1') ;
8.14.16 SELECT SPJ.*
FROM SPJ
WHERE SPJ.QTY >= QTY(300)
AND SPJ.QTY <= QTY(750) ;
8.14.17 SELECT DISTINCT P.COLOR, P.CITY
FROM P ;
8.14.18 SELECT S.S#, P.P#, J.J#
FROM S, P, J
WHERE S.CITY = P.CITY
AND P.CITY = J.CITY ;
8.14.19 SELECT S.S#, P.P#, J.J#
FROM S, P, J