332 CHAPTER 17: THE SELECT STATEMENT SELECT F1.fruit, F1.score, FROM Foobar AS F1 ORDER BY F1.score DESC; One outcome might have been any of these: Result #1 ('Apples', 5) ('Oranges', 5) ('Apples', 2) ('Banana', 2) ('Apples', 1) Result #2 ('Oranges', 5) ('Apples', 5) ('Apples', 2) ('Banana', 2) ('Apples', 1) Result #3 ('Oranges', 5) ('Apples', 5) ('Banana', 2) ('Apples', 2) ('Apples', 1) Result #4 ('Apples', 5) ('Oranges', 5) ('Banana', 2) ('Apples', 2) ('Apples', 1) If you use: SELECT F1.fruit, F1.score, FROM Foobar AS F1 ORDER BY F1.score DESC, F1.fruit ASC; Result ('Apples', 5) ('Oranges', 5) 17.1 SELECT and JOINs 333 ('Apples', 2) ('Banana', 2) ('Apples', 1) But this is not what we wanted—the order within fruits has been destroyed. Likewise: SELECT F1.fruit, F1.score FROM Foobar AS F1 ORDER BY F1.fruit ASC, F1.score DESC; Results ('Apples', 5) ('Apples', 2) ('Apples', 1) ('Banana', 2) ('Oranges', 5) But this is still not what we wanted—the order within scores has been destroyed. We need a dummy column to preserve the ordering, thus: SELECT F1.fruit, F1.score, (SELECT MAX(score) FROM Foobar AS F2 WHERE F1.fruit = F2.fruit) AS score_preserver FROM Foobar AS F1 ORDER BY score_preserver DESC, F1.fruit ASC, F1.score DESC; Cursors include an <updatability clause>, which tells you if the cursor is FOR READ ONLY or for UPDATE [OF <column name list>], but this clause in optional. If ORDER BY is specified, or if the result table is a read-only table, then the <updatability clause> defaults to FOR READ ONLY. The ORDER BY and CASE Expressions SQL-99 allows you to use a function in an ORDER BY clause. While it is now legal, it is still not a good programming practice. Users should see the fields that are used for the sort, so they can use them to read and locate lines of data in reports. The sorting values are usually on the left side of each line, since we read left to right. The most portable method is to use a CASE expression that takes an external parameter of the form: 334 CHAPTER 17: THE SELECT STATEMENT SELECT first_name, last_name, dept, CASE :flag WHEN 'f' THEN first_name WHEN 'l' THEN last_name WHEN 'd' THEN dept ELSE NULL END AS sort_col FROM Personnel ORDER BY sort_col; Obviously, the expression in the THEN clauses must either be of the same data type or be cast into the same data type. Controlling the direction of the sort is a little trickier and requires two columns, one of which is always set to all NULLs. SELECT last_name, CASE :flag WHEN 'la' THEN last_name ELSE NULL END AS sort_col1, CASE :flag WHEN 'ld' THEN last_name ELSE NULL END AS sort_col2 FROM Personnel ORDER BY sort_col1, sort_col2 DESC; You can get a bit fancy with this basic idea: SELECT CASE :flag_1 WHEN 'a' THEN CAST (a AS CHAR(n)) WHEN 'b' THEN CAST (b AS CHAR(n)) WHEN 'c' THEN CAST (c AS CHAR(n)) ELSE NULL END AS sort_1, CASE :flag_2 WHEN 'x' THEN CAST (x AS CHAR(n)) WHEN 'y' THEN CAST (y AS CHAR(n)) WHEN 'z' THEN CAST (z AS CHAR(n)) ELSE NULL END AS sort_2, CASE :flag_n WHEN 'n1' THEN CAST (n1 AS CHAR(n)) WHEN 'n2' THEN CAST (n2 AS CHAR(n)) WHEN 'n3' THEN CAST (n3 AS CHAR(n)) ELSE NULL END AS sort_2, 17.1 SELECT and JOINs 335 FROM MyTable WHERE ORDER BY sort_1, sort_2, ; If you have more than one sort column and only a limited set of combinations, use concatenation. CASE :flag_1 WHEN 'ab' THEN CAST(a AS CHAR(n)) ||' ' || CAST(b AS CHAR(n)) WHEN 'ba' THEN CAST(b AS CHAR(n)) ||' ' || CAST(a AS CHAR(n)) ELSE NULL END AS sort_1, If you need ASC and DESC options, then use a combination of CASE and ORDER BY: CASE :flag_1 WHEN :flag_1 = 'a' AND :flag_1_ad = 'ASC' THEN CAST (a AS CHAR(n)) WHEN :flag_1 = 'b' AND :flag_1_ad = 'ASC' THEN CAST (b AS CHAR(n)) WHEN :flag_1 = 'c' AND :flag_1_ad = 'ASC' THEN CAST (c AS CHAR(n)) ELSE NULL END AS sort_1_a, CASE :flag_1 WHEN :flag_1 = 'a' AND :flag_1_ad = 'DESC' THEN CAST (a AS CHAR(n)) WHEN :flag_1 = 'b' AND :flag_1_ad = 'DESC' THEN CAST (b AS CHAR(n)) WHEN :flag_1 = 'c' AND :flag_1_ad = 'DESC' THEN CAST (c AS CHAR(n)) ELSE NULL END AS sort_1_d ORDER BY sort_1_a ASC, sort_1_d DESC I have shown explicit CAST(<exp> AS CHAR(n)), but if the data types of the THEN clause expressions were already the same, there would be no reason to force the conversions. You change the ELSE NULL clause to any constant of the appropriate data type, but it should be 336 CHAPTER 17: THE SELECT STATEMENT something useful to the reader. A neater way of doing this is to use one column for each sorting option. SELECT MyTable.* , CASE WHEN :flag = 'a' THEN a ELSE NULL END AS sort1, CASE WHEN :flag = 'b' THEN b ELSE NULL END AS sort2, CASE WHEN :flag = 'c' THEN c ELSE NULL END AS sort3 FROM Personnel WHERE ORDER BY sort1, sort2, sort3; This code is easy to read, and you do not have worry about CAST() operations. The trade-off is a larger result set being sent to the cursor. 17.2 OUTER JOINs OUTER JOINs used to be done with proprietary vendor syntax. Today, the use of the Standard OUTER JOIN is universal. An OUTER JOIN is a JOIN that preserves all the rows in one or both tables, even when they do not have matching rows in the second table. Let’s take a real-world situation: I have a table of orders and a table of suppliers that I wish to JOIN for a report to tell us how much business we did with each supplier. With an inner join, the query would be this: SELECT Suppliers.sup_id, sup_name, order_nbr, order_amt FROM Suppliers, Orders WHERE Suppliers.sup_id = Orders.sup_id; Some suppliers’ totals include credits for returned merchandise, and our total business with them works out to zero dollars. Other suppliers never got an order from us at all, so we did zero dollars’ worth of business with them, too. But the first case will show up in the query result and be passed on to the report, whereas the second case will disappear in the INNER JOIN. If we had used an OUTER JOIN, preserving the Suppliers table, we would have all the suppliers in the results. When a supplier with no orders was found in the Orders table, the order_nbr and order_amt columns would be given a NULL value in the result row. 17.2 OUTER JOINs 337 17.2.1 Syntax for OUTER JOINs In the old SQL-89 standard, there was no OUTER JOIN syntax, so you had to construct it by hand with a messy UNION in products such as earlier versions of DB2 from IBM: SELECT sup_id, sup_name, order_amt regular INNER JOIN FROM Suppliers, Orders WHERE Suppliers.sup_id = Orders.sup_id UNION ALL SELECT sup_id, sup_name, CAST(NULL AS INTEGER) preserved rows of LEFT JOIN FROM Suppliers WHERE NOT EXISTS (SELECT * FROM Orders WHERE Suppliers.sup_id = Orders.sup_id); You have to use a NULL with the correct data type to make the UNION work, hence the CAST() functions. Some products are smart enough that just NULL by itself will be given the correct data type, but this version is portable and safer. The other alternative is to insert a constant of some sort to give a more meaningful result. This is easy in the case of a CHARACTER column, where a message like '{{NONE}}' can be quickly understood. It is much harder in the case of a numeric column, where we could have a balance with a supplier that is positive, zero, or negative because of returns and credits. There really is a difference between a vendor we did not use, and a vendor whose returns canceled out its orders. The most common vendor extensions were for the LEFT OUTER JOIN. These extensions were all different in syntax, or semantics, or both. Today, they are all gone and replaced by the Standard SQL syntax. In the second edition of this book, I described the proprietary extensions in detail; since they are mercifully gone, I am not going to tell you about them in this edition. The name LEFT OUTER JOIN comes from the fact that the preserved table is on the left side of the operator. Likewise, a RIGHT OUTER JOIN would have the preserved table on the right-hand side, and a FULL OUTER JOIN would preserve both tables. Here is how OUTER JOINs work in Standard SQL. Assume you are given: 338 CHAPTER 17: THE SELECT STATEMENT Table1 Table2 a b a c ====== ====== 1 w 1 r 2 x 2 s 3 y 3 t 4 z and the OUTER JOIN expression: Table1 LEFT OUTER JOIN Table2 ON Table1.a = Table2.a <== JOIN condition AND Table2.c = 't'; <== single table condition We call Table1 the “preserved table” and Table2 the “unpreserved table” in the query. What I am going to give you is a little different, but equivalent to the ANSI/ISO standards. 1. Build the CROSS JOIN of the two tables. Scan each row in the result set. 2. If the predicate tests TRUE for that row, keep it. You also remove all rows derived from it from the CROSS JOIN. 3. If the predicate tests FALSE or UNKNOWN for that row, then keep the columns from the preserved table, convert all the columns from the unpreserved table to NULLs, and remove the duplicates. Let us execute this by hand: Let @ = passed the first predicate {{typesetter: pick better tokens, such as bullets}} Let * = passed the second predicate Table1 CROSS JOIN Table2 a b a c ========================= 1 w 1 r @ 1 w 2 s 1 w 3 t * 2 x 1 r 17.2 OUTER JOINs 339 2 x 2 s @ 2 x 3 t * 3 y 1 r 3 y 2 s 3 y 3 t @* <== the TRUE set 4 z 1 r 4 z 2 s 4 z 3 t * Table1 LEFT OUTER JOIN Table2 a b a c ========================= 3 y 3 t <= only TRUE row 1 w NULL NULL Sets of duplicates 1 w NULL NULL 1 w NULL NULL 2 x NULL NULL 2 x NULL NULL 2 x NULL NULL 3 y NULL NULL <== derived from the TRUE set - Remove 3 y NULL NULL 4 z NULL NULL 4 z NULL NULL 4 z NULL NULL The final results are: Table1 LEFT OUTER JOIN Table2 a b a c ========================= 1 w NULL NULL 2 x NULL NULL 3 y 3 t 4 z NULL NULL The basic rule is that every row in the preserved table is represented in the results in at least one result row. 340 CHAPTER 17: THE SELECT STATEMENT There is a myth among ACCESS programmers that the ON clause can contain only a JOIN condition, and the WHERE can contain only search conditions. This is not true, and the differences in the position of the predicates are important. Consider the two famous Chris Date tables. Suppliers SupParts sup_id sup_id part_nbr qty ========= ================= S1 S1 P1 100 S2 S1 P2 250 S3 S2 P1 100 S2 P2 250 If you write the OUTER JOIN with only the join predicate in the ON clause, like this: SELECT Suppliers.sup_id, SupParts.part_nbr, SupParts.qty FROM Suppliers LEFT OUTER JOIN SupParts ON Supplier.sup_id = SupParts.sup_id WHERE qty < 200; You get: sup_id part_nbr qty =================== 'S1' 'P1' 100 'S2' 'P1' 100 But if you put the search predicate in the ON clause: SELECT Suppliers.sup_id, SupParts.part_nbr, SupParts.qty FROM Suppliers LEFT OUTER JOIN SupParts ON Supplier.sup_id = SupParts.sup_id AND qty < 200; 17.2 OUTER JOINs 341 You get: sup_id part_nbr qty =================== 'S1' 'P1' 100 'S2' 'P1' 100 'S3' NULL NULL Another problem is that, in general, the order of execution matters with a chain of OUTER JOINs. That is to say, ((T1 OUTER JOIN T2) OUTER JOIN T3) does not produce the same results as (T1 OUTER JOIN (T2 OUTER JOIN T3)). I can use any of the options in the ON clause of an outer join: SELECT S1.sup_id, S1.sup_name, O1.order_nbr, O1.order_amt FROM Suppliers AS S1 LEFT OUTER JOIN Orders AS O1 ON S1.sup_id = O1.sup_id; or: SELECT S1.sup_id, S1.sup_name, O1.order_nbr, O1.order_amt FROM Suppliers AS S1 LEFT OUTER JOIN Orders AS O1 USING (S1.sup_id); or: SELECT S1.sup_id, S1.sup_name, O1.order_nbr, O1.order_amt FROM Suppliers AS S1 NATURAL LEFT OUTER JOIN Orders AS O1; A SELECT expression that returns a single row with a single value can be used where a scalar expression can be used. If the result of the scalar query is empty, it is converted to a NULL. This will sometimes, but not always, let you write an OUTER JOIN as a query within the SELECT clause; thus, this query will work only if each supplier has one or zero orders: SELECT sup_id, sup_name, order_nbr, (SELECT order_amt FROM Orders WHERE Suppliers.sup_id = Orders.sup_id) AS order_amt FROM Suppliers; . 2) ('Apples', 1) Result #3 ('Oranges', 5) ('Apples', 5) ('Banana', 2) ('Apples', 2) ('Apples', 1) Result #4 ('Apples', 5) ('Oranges',. Orders AS O1 ON S1 .sup_id = O1.sup_id; or: SELECT S1 .sup_id, S1 .sup_name, O1.order_nbr, O1.order_amt FROM Suppliers AS S1 LEFT OUTER JOIN Orders AS O1 USING (S1 .sup_id); or: SELECT S1 .sup_id, S1 .sup_name,. 5) ('Apples', 2) ('Banana', 2) ('Apples', 1) Result #2 ('Oranges', 5) ('Apples', 5) ('Apples', 2) ('Banana', 2) ('Apples',