102 CHAPTER 6: CODING CHOICES
6.1 Pick Standard Constructions over Proprietary Constructions

1. We build the CROSS JOIN of the two tables. Scan each row in the result set.

2. If the predicate tests TRUE for that row, then you keep it. You also remove all rows derived from it from the CROSS JOIN.

3. If the predicate tests FALSE or UNKNOWN for that row, then keep the columns from the preserved table, convert all the columns from the unpreserved table to NULLs, and remove the duplicates.

So let us execute this by hand:

Let @ = passed the first predicate
Let * = passed the second predicate

Table1 CROSS JOIN Table2
a   b   a     c
=========================
1   w   1     r     @
1   w   2     s
1   w   3     t     *
2   x   1     r
2   x   2     s     @
2   x   3     t     *
3   y   1     r
3   y   2     s
3   y   3     t     @* <== the TRUE set
4   z   1     r
4   z   2     s
4   z   3     t     *

Table1 LEFT OUTER JOIN Table2
a   b   a     c
=========================
3   y   3     t     <= only TRUE row

1   w   NULL  NULL  Sets of duplicates
1   w   NULL  NULL
1   w   NULL  NULL

2   x   NULL  NULL
2   x   NULL  NULL
2   x   NULL  NULL

3   y   NULL  NULL  <== derived from the TRUE set - Remove
3   y   NULL  NULL

4   z   NULL  NULL
4   z   NULL  NULL
4   z   NULL  NULL

The final results:

Table1 LEFT OUTER JOIN Table2
a   b   a     c
=========================
1   w   NULL  NULL
2   x   NULL  NULL
3   y   3     t
4   z   NULL  NULL

The basic rule is that every row in the preserved table is represented in the results in at least one result row.

6.1.1.1 Extended Equality and Proprietary Syntax

Before the standard was set, vendors all had a slightly different syntax with slightly different semantics. Most of them involved an extended equality operator based on the original Sybase implementation. There are limitations and serious problems with the extended equality, however.
Consider the two Chris Date tables:

Suppliers    SupParts
supno        supno  partno  qty
=========    ==================
S1           S1     P1      100
S2           S1     P2      250
S3           S2     P1      100
             S2     P2      250

And let's do a Sybase-style extended equality OUTER JOIN like this:

    SELECT *
      FROM Suppliers, SupParts
     WHERE Suppliers.supno *= SupParts.supno
       AND qty < 200;

If I do the OUTER JOIN first, I get:

Suppliers LOJ SupParts
supno  supno  partno  qty
=========================
S1     S1     P1      100
S1     S1     P2      250
S2     S2     P1      100
S2     S2     P2      250
S3     NULL   NULL    NULL

Then I apply the (qty < 200) predicate and get:

Suppliers LOJ SupParts
supno  supno  partno  qty
=========================
S1     S1     P1      100
S2     S2     P1      100

Doing it in the opposite order, applying (qty < 200) to SupParts first and then doing the OUTER JOIN, results in the following:

Suppliers LOJ SupParts
supno  supno  partno  qty
=========================
S1     S1     P1      100
S2     S2     P1      100
S3     NULL   NULL    NULL

Sybase does it one way, Oracle does it another, and Centura (née Gupta) lets you pick which one to use: the worst of both nonstandard worlds! In SQL-92, you have a choice and can force the order of execution. Either do the predicates after the join:

    SELECT *
      FROM Suppliers
           LEFT OUTER JOIN
           SupParts
           ON Suppliers.supno = SupParts.supno
     WHERE qty < 200;

or do it in the join:

    SELECT *
      FROM Suppliers
           LEFT OUTER JOIN
           SupParts
           ON Suppliers.supno = SupParts.supno
              AND qty < 200;

Another problem is that you cannot show the same table as preserved and unpreserved in the extended equality version, but it is easy in SQL-92. For example, to find the students who have taken Math 101 and might have taken Math 102:

    SELECT C1.student, C1.math, C2.math
      FROM (SELECT * FROM Courses WHERE math = 101) AS C1
           LEFT OUTER JOIN
           (SELECT * FROM Courses WHERE math = 102) AS C2
           ON C1.student = C2.student;

Exceptions: None. Almost every vendor, major and minor, has the ANSI infixed OUTER JOIN operator today.
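As a quick check of the two SQL-92 forms above, here is a minimal sketch that loads the sample rows and runs both queries. The table is named Suppliers, as in the table heading. The column data types and the key constraints are my assumptions; the text gives only the rows themselves.

```sql
-- Sample rows copied from the Suppliers and SupParts tables in the text.
-- Data types and key constraints are assumed for illustration.
CREATE TABLE Suppliers
(supno CHAR(2) NOT NULL PRIMARY KEY);

CREATE TABLE SupParts
(supno CHAR(2) NOT NULL REFERENCES Suppliers (supno),
 partno CHAR(2) NOT NULL,
 qty INTEGER NOT NULL,
 PRIMARY KEY (supno, partno));

INSERT INTO Suppliers (supno)
VALUES ('S1'), ('S2'), ('S3');

INSERT INTO SupParts (supno, partno, qty)
VALUES ('S1', 'P1', 100), ('S1', 'P2', 250),
       ('S2', 'P1', 100), ('S2', 'P2', 250);

-- Predicate after the join: the WHERE clause filters the joined rows.
-- The NULL-padded S3 row is discarded because (NULL < 200) is UNKNOWN,
-- which leaves the two-row result shown in the text.
SELECT *
  FROM Suppliers
       LEFT OUTER JOIN
       SupParts
       ON Suppliers.supno = SupParts.supno
 WHERE qty < 200;

-- Predicate in the join: (qty < 200) is part of the matching test,
-- so S3 is still preserved with NULLs, which gives the three-row result.
SELECT *
  FROM Suppliers
       LEFT OUTER JOIN
       SupParts
       ON Suppliers.supno = SupParts.supno
          AND qty < 200;
```

Neither query has an ORDER BY clause, so the rows may come back in any order.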
You will see various proprietary notations in legacy code, and you can convert them by following the discussion given previously.

6.1.2 Infixed INNER JOIN and CROSS JOIN Syntax Is Optional, but Nice

SQL-92 introduced the INNER JOIN and CROSS JOIN operators to match the OUTER JOIN operators and complete the notation; other infixed JOIN operators are not widely implemented but exist for completeness. The functionality of the INNER JOIN and CROSS JOIN existed in the FROM clause before, so these operators did not give the programmer anything new, as the OUTER JOINs did.

Rationale: The CROSS JOIN is a handy piece of documentation that is much harder to miss seeing than a simple comma. Likewise, writing out INNER JOIN instead of the shorthand JOIN helps document the code.

However, many INNER JOIN operators can be visually confusing, and you might consider using the older syntax. The older syntax lets you put all of the predicates in one place and group them in some manner for readability.

A rule of thumb is the "rule of five" in human psychology. This says that we have problems handling more than five things at once, get serious problems with seven, and break down at nine (Miller 1956). So when you have fewer than five tables, the infixed operators are fine, but they are questionable for more than five INNER JOIN-ed tables. Trying to associate ON clauses with INNER JOIN operators is visually difficult.

In particular, a Star Schema has an easily recognized pattern of joins from the fact table to each dimension table, like this in pseudocode:

    SELECT ...
      FROM Facts, Dim1, Dim2, ..., DimN
     WHERE Facts.a1 = Dim1.a
       AND Facts.a2 = Dim2.a
       ...
       AND Facts.an = DimN.a

The reader can look down the right-hand side of the WHERE clause and see the dimensions in a vertical list.

One style that is popular is to put the join conditions in the FROM clause with INNER JOIN syntax, then do the search arguments in the WHERE clause. Some newbies believe that this is required, but it is not.
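A minimal sketch of that popular style, cut down to two dimensions, might look like the following. Only the names Facts, Dim1, Dim2 and the columns a, a1, a2 come from the pseudocode; the columns dim_name and amount, the data types, the constraints, and the search values are hypothetical and are here only to make the statement complete.

```sql
-- Hypothetical two-dimension star schema for illustration only.
CREATE TABLE Dim1
(a INTEGER NOT NULL PRIMARY KEY,
 dim_name VARCHAR(20) NOT NULL);

CREATE TABLE Dim2
(a INTEGER NOT NULL PRIMARY KEY,
 dim_name VARCHAR(20) NOT NULL);

CREATE TABLE Facts
(a1 INTEGER NOT NULL REFERENCES Dim1 (a),
 a2 INTEGER NOT NULL REFERENCES Dim2 (a),
 amount INTEGER NOT NULL);

-- Join conditions sit in the ON clauses of the FROM clause;
-- the search arguments are collected in the WHERE clause.
SELECT Facts.amount,
       Dim1.dim_name AS dim1_name,
       Dim2.dim_name AS dim2_name
  FROM Facts
       INNER JOIN Dim1
       ON Facts.a1 = Dim1.a
       INNER JOIN Dim2
       ON Facts.a2 = Dim2.a
 WHERE Dim1.dim_name = 'East'
   AND Facts.amount > 100;
```

Moving the two search arguments into the ON clauses would return the same rows for these INNER JOINs, which is why the split is a convention rather than a requirement.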
However, if the search arguments change, having them in one place is handy. A quick heuristic when using old-style joins is that the number of tables in the FROM clause should be one more than the number of join conditions in the WHERE clause. This shows that you do not have cycles in the joins. If the difference between the number of tables and the number of join conditions is more than one, then you might have an unwanted CROSS JOIN caused by a missing join condition.

Old style: