Listing 8.45 lists the books that outsold all the books that author A06 wrote (or cowrote). The inner query uses a join to find the sales of each book by author A06. The outer query inspects the highest sales figure in the list and determines whether each book sold more copies. See Figure 8.45 for the result. Again, the IS NOT NULL condition is needed in case sales is null for a book by author A06.

I can replicate Listing 8.45 by using GROUP BY, HAVING, and MAX() (instead of ALL):

    SELECT title_id
      FROM titles
      GROUP BY title_id
      HAVING MAX(sales) >
        (SELECT MAX(sales)
           FROM title_authors ta
           INNER JOIN titles t
             ON t.title_id = ta.title_id
           WHERE ta.au_id = 'A06');

Listing 8.46 uses a correlated subquery in the HAVING clause of the outer query to list the types of books for which the highest sales figure is more than twice the average sales for that type. The inner query is evaluated once for each group defined in the outer query (once for each type of book). See Figure 8.46 for the result.

✔ Tips

■ <> ALL is equivalent to NOT IN; see “Testing Set Membership with IN” earlier in this chapter.

■ MySQL 4.0 and earlier don’t support subqueries; see the DBMS Tip in “Understanding Subqueries” earlier in this chapter.
  In older PostgreSQL versions, convert the floating-point numbers in Listing 8.46 to DECIMAL; see “Converting Data Types with CAST()” in Chapter 5. To run Listing 8.46, change the floating-point literal to:

    CAST(2.0 AS DECIMAL)

Chapter 8 Comparing All Subquery Values with ALL

Listing 8.45 List the books that outsold all the books that author A06 wrote (or cowrote). See Figure 8.45 for the result.

    SELECT title_id, title_name
      FROM titles
      WHERE sales > ALL
        (SELECT sales
           FROM title_authors ta
           INNER JOIN titles t
             ON t.title_id = ta.title_id
           WHERE ta.au_id = 'A06'
             AND sales IS NOT NULL);

    title_id title_name
    -------- -------------------------
    T05      Exchange of Platitudes
    T07      I Blame My Mother
    T12      Spontaneous, Not Annoying

Figure 8.45 Result of Listing 8.45.
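The MAX() rewrite of Listing 8.45 can be tried on a small scale. The following Python sketch is only an illustration, not the book's method: it assumes the sqlite3 module from Python's standard library and a handful of made-up rows (not the sample database), and because SQLite doesn't accept the ALL keyword in comparisons, it runs only the MAX() forms. One of the hypothetical A06 books has null sales; MAX() skips nulls, so the rewritten queries need no IS NOT NULL condition.

```python
import sqlite3

# Hypothetical miniature tables; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE titles (title_id TEXT PRIMARY KEY, sales INTEGER);
    CREATE TABLE title_authors (title_id TEXT, au_id TEXT);
    INSERT INTO titles VALUES
        ('T1', 100), ('T2', 500), ('T3', NULL),
        ('T4', 900), ('T5', 300), ('T6', 700);
    INSERT INTO title_authors VALUES
        ('T1', 'A06'), ('T2', 'A06'), ('T3', 'A06'),
        ('T4', 'A01'), ('T5', 'A02'), ('T6', 'A02');
""")

# Highest sales figure among A06's books (MAX ignores the null row T3).
A06_MAX = """
    SELECT MAX(t.sales)
      FROM title_authors ta
      INNER JOIN titles t ON t.title_id = ta.title_id
      WHERE ta.au_id = 'A06'
"""

# Plain WHERE comparison against the maximum.
where_form = conn.execute(
    f"SELECT title_id FROM titles WHERE sales > ({A06_MAX}) ORDER BY title_id"
).fetchall()

# GROUP BY / HAVING form, shaped like the rewrite shown in the text.
having_form = conn.execute(
    f"SELECT title_id FROM titles GROUP BY title_id "
    f"HAVING MAX(sales) > ({A06_MAX}) ORDER BY title_id"
).fetchall()

print(where_form)   # [('T4',), ('T6',)]
print(having_form)  # [('T4',), ('T6',)]
```

Both forms return the two invented books whose sales exceed 500, the largest non-null sales figure for A06 in this toy data.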
Listing 8.46 List the types of books for which the highest sales figure is more than twice the average sales for that type. See Figure 8.46 for the result.

    SELECT t1.type
      FROM titles t1
      GROUP BY t1.type
      HAVING MAX(t1.sales) >= ALL
        (SELECT 2.0 * AVG(t2.sales)
           FROM titles t2
           WHERE t1.type = t2.type);

    type
    ---------
    biography

Figure 8.46 Result of Listing 8.46.

Comparing Some Subquery Values with ANY

ANY works like ALL (see the preceding section) but instead determines whether a value is equal to, less than, or greater than any (at least one) of the values in a subquery result.

The important characteristics of subquery comparisons that use ANY are:

◆ ANY modifies a comparison operator in a subquery comparison test and follows =, <>, <, <=, >, or >=; see “Comparing a Subquery Value by Using a Comparison Operator” earlier in this chapter.
◆ The combination of a comparison operator and ANY tells the DBMS how to apply the comparison test to the values returned by a subquery. < ANY, for example, means less than at least one value in the subquery result, and > ANY means greater than at least one value in the subquery result.
◆ When ANY is used with <, <=, >, or >=, the comparison is equivalent to evaluating the subquery result’s maximum or minimum value. < ANY means less than at least one subquery value; in other words, less than the maximum value. > ANY means greater than at least one subquery value; that is, greater than the minimum value. Table 8.3 shows equivalent ANY expressions and column functions. Listing 8.49 later in this section shows how to replicate a > ANY query by using MIN().
◆ The comparison = ANY is equivalent to IN; see “Testing Set Membership with IN” earlier in this chapter.
◆ The subquery can be simple or correlated (see “Simple and Correlated Subqueries” earlier in this chapter).
◆ The subquery’s SELECT-clause list can include only one expression or column name.
◆ The compared values must have the same data type or must be implicitly convertible to the same type (see “Converting Data Types with CAST()” in Chapter 5).
◆ String comparisons are case insensitive or case sensitive, depending on your DBMS; see the DBMS Tip in “Filtering Rows with WHERE” in Chapter 4.
◆ The subquery must return exactly one column and zero or more rows. A subquery that returns more than one column will cause an error.
◆ If the subquery returns no rows, the ANY condition is false.

Table 8.3 ANY Equivalencies

    ANY Expression     Column Function
    ------------------ ----------------------
    < ANY(subquery)    < MAX(subquery values)
    > ANY(subquery)    > MIN(subquery values)

To compare some subquery values:

◆ In the WHERE clause of a SELECT statement, type:

    WHERE test_expr op ANY (subquery)

  test_expr is a literal value, a column name, an expression, or a subquery that returns a single value; op is a comparison operator (=, <>, <, <=, >, or >=); and subquery is a subquery that returns one column and zero or more rows.
  If any (at least one) value in subquery satisfies the ANY condition, the condition evaluates to true. The ANY condition is false if no value in subquery satisfies the condition or if subquery is empty (has zero rows) or contains all nulls.
  The same syntax applies to a HAVING clause:

    HAVING test_expr op ANY (subquery)

Listing 8.47 lists the authors who live in a city in which a publisher is located. The inner query finds all the cities in which publishers are located, and the outer query compares each author’s city to all the publishers’ cities. See Figure 8.47 for the result.

You can use IN to replicate Listing 8.47:

    SELECT au_id, au_lname, au_fname, city
      FROM authors
      WHERE city IN
        (SELECT city FROM publishers);

Listing 8.48 lists the nonbiographies that are priced less than at least one biography. The inner query finds all the biography prices.
The outer query inspects the highest price in the list and determines whether each nonbiography is cheaper. See Figure 8.48 for the result. Unlike the ALL comparison in Listing 8.44 in the preceding section, the price IS NOT NULL condition isn’t required here, even though the price of biography T10 is null. The DBMS doesn’t determine whether all the price comparisons are true, just whether at least one is true, so the null comparison is ignored.

Listing 8.47 List the authors who live in a city in which a publisher is located. See Figure 8.47 for the result.

    SELECT au_id, au_lname, au_fname, city
      FROM authors
      WHERE city = ANY
        (SELECT city
           FROM publishers);

    au_id au_lname au_fname  city
    ----- -------- --------- -------------
    A03   Hull     Hallie    San Francisco
    A04   Hull     Klee      San Francisco
    A05   Kells    Christian New York

Figure 8.47 Result of Listing 8.47.

Listing 8.48 List the nonbiographies that are cheaper than at least one biography. See Figure 8.48 for the result.

    SELECT title_id, title_name
      FROM titles
      WHERE type <> 'biography'
        AND price < ANY
          (SELECT price
             FROM titles
             WHERE type = 'biography');

    title_id title_name
    -------- --------------------------------
    T01      1977!
    T02      200 Years of German Humor
    T04      But I Did It Unconsciously
    T05      Exchange of Platitudes
    T08      Just Wait Until After School
    T09      Kiss My Boo-Boo
    T11      Perhaps It's a Glandular Problem

Figure 8.48 Result of Listing 8.48.

Listing 8.49 lists the books that outsold at least one of the books that author A06 wrote (or cowrote). The inner query uses a join to find the sales of each book by author A06. The outer query inspects the lowest sales figure in the list and determines whether each book sold more copies. See Figure 8.49 for the result. Again, unlike the ALL comparison in Listing 8.45 in the preceding section, the IS NOT NULL condition isn’t needed here.
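The different null behavior of ALL and ANY follows from SQL's three-valued logic, which can be modeled outside a DBMS. The following Python sketch is an assumption-laden illustration rather than how any particular DBMS is implemented: the function names gt, sql_all, and sql_any are hypothetical, None stands in for both a null value and the truth value unknown, and the sales figures are invented.

```python
from typing import Optional, Sequence

def gt(a: Optional[int], b: Optional[int]) -> Optional[bool]:
    """Model of SQL's a > b: unknown (None) if either operand is null."""
    if a is None or b is None:
        return None
    return a > b

def sql_all(results: Sequence[Optional[bool]]) -> Optional[bool]:
    """Model of ALL: every comparison must be true (an empty list is true)."""
    if any(r is False for r in results):
        return False
    if any(r is None for r in results):
        return None  # one unknown comparison makes the whole test unknown
    return True

def sql_any(results: Sequence[Optional[bool]]) -> Optional[bool]:
    """Model of ANY: one true comparison is enough (an empty list is false)."""
    if any(r is True for r in results):
        return True
    if any(r is None for r in results):
        return None
    return False

# Invented subquery result that contains one null.
subquery_sales = [500, None, 900]

# > ANY: 600 > 500 is true, so the unknown comparison doesn't matter.
any_result = sql_any([gt(600, s) for s in subquery_sales])

# > ALL: 1000 beats both known values, but 1000 > NULL is unknown.
all_result = sql_all([gt(1000, s) for s in subquery_sales])

# > ALL after filtering out nulls, as an IS NOT NULL condition would.
all_filtered = sql_all([gt(1000, s) for s in subquery_sales if s is not None])

print(any_result, all_result, all_filtered)  # True None True
```

A WHERE clause keeps a row only when its condition is true, so the unknown result of the unfiltered ALL test would drop the row, whereas the ANY test is unaffected by the null.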
I can replicate Listing 8.49 by using GROUP BY, HAVING, and MIN() (instead of ANY):

    SELECT title_id
      FROM titles
      GROUP BY title_id
      HAVING MIN(sales) >
        (SELECT MIN(sales)
           FROM title_authors ta
           INNER JOIN titles t
             ON t.title_id = ta.title_id
           WHERE ta.au_id = 'A06');

✔ Tips

■ = ANY is equivalent to IN, but <> ANY isn’t equivalent to NOT IN. If subquery returns the values x, y, and z,

    test_expr <> ANY (subquery)

  is equivalent to:

    test_expr <> x OR
    test_expr <> y OR
    test_expr <> z

  But

    test_expr NOT IN (subquery)

  is equivalent to:

    test_expr <> x AND
    test_expr <> y AND
    test_expr <> z

  (NOT IN actually is equivalent to <> ALL.)

■ In the SQL standard, the keywords ANY and SOME are synonyms. In many DBMSs, you can use SOME in place of ANY.

■ MySQL 4.0 and earlier don’t support subqueries; see the DBMS Tip in “Understanding Subqueries” earlier in this chapter.

Listing 8.49 List the books that outsold at least one of the books that author A06 wrote (or cowrote). See Figure 8.49 for the result.

    SELECT title_id, title_name
      FROM titles
      WHERE sales > ANY
        (SELECT sales
           FROM title_authors ta
           INNER JOIN titles t
             ON t.title_id = ta.title_id
           WHERE ta.au_id = 'A06');

    title_id title_name
    -------- -----------------------------------
    T02      200 Years of German Humor
    T03      Ask Your System Administrator
    T04      But I Did It Unconsciously
    T05      Exchange of Platitudes
    T06      How About Never?
    T07      I Blame My Mother
    T09      Kiss My Boo-Boo
    T11      Perhaps It's a Glandular Problem
    T12      Spontaneous, Not Annoying
    T13      What Are The Civilian Applications?

Figure 8.49 Result of Listing 8.49.

Testing Existence with EXISTS

So far in this chapter, I’ve been using the comparison operators IN, ALL, and ANY to compare a specific test value to values in a subquery result. EXISTS and NOT EXISTS don’t compare values; rather, they simply look for the existence or nonexistence of rows in a subquery result.
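To see that an existence test depends only on whether rows come back, not on what they contain, here is a minimal Python sketch. It assumes the sqlite3 module from Python's standard library; the table probe, its column val, and the helper truth are hypothetical names invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE probe (val INTEGER);   -- hypothetical one-column table
    INSERT INTO probe VALUES (NULL);    -- a single row that holds only a null
""")

def truth(sql: str) -> bool:
    """Run a SELECT that yields one value and return it as a Python bool."""
    return bool(conn.execute(sql).fetchone()[0])

# The subquery returns one row (of nulls), so EXISTS is true.
exists_null_row = truth("SELECT EXISTS (SELECT * FROM probe)")

# NULL = 1 is never true, so this subquery returns no rows.
exists_no_rows = truth("SELECT EXISTS (SELECT * FROM probe WHERE val = 1)")
not_exists_no_rows = truth("SELECT NOT EXISTS (SELECT * FROM probe WHERE val = 1)")

print(exists_null_row, exists_no_rows, not_exists_no_rows)  # True False True
```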
The important characteristics of an existence test are:

◆ An existence test doesn’t compare values, so it isn’t preceded by a test expression.
◆ The subquery can be simple or correlated but usually is correlated (see “Simple and Correlated Subqueries” earlier in this chapter).
◆ The subquery can return any number of columns and rows.
◆ By convention, the SELECT clause in the subquery is SELECT * to retrieve all columns. Listing specific column names is unnecessary, because EXISTS simply tests for the existence of rows that satisfy the subquery conditions; the actual values in the rows are irrelevant.
◆ All IN, ALL, and ANY queries can be expressed with EXISTS or NOT EXISTS. I’ll give equivalent queries in some of the examples later in this section.
◆ If the subquery returns at least one row, an EXISTS test is true, and a NOT EXISTS test is false.
◆ If the subquery returns no rows, an EXISTS test is false, and a NOT EXISTS test is true.
◆ A subquery row that contains only nulls counts as a row. (An EXISTS test is true, and a NOT EXISTS test is false.)
◆ Because an EXISTS test performs no comparisons, it’s not subject to the same problems with nulls as tests that use IN, ALL, or ANY; see “Nulls in Subqueries” earlier in this chapter.

To test existence:

◆ In the WHERE clause of a SELECT statement, type:

    WHERE [NOT] EXISTS (subquery)

  subquery is a subquery that returns any number of columns and rows.
  If subquery returns one or more rows, the EXISTS test evaluates to true. If subquery returns zero rows, the EXISTS test evaluates to false. Specify NOT to negate the test’s result.
  The same syntax applies to a HAVING clause:

    HAVING [NOT] EXISTS (subquery)

Listing 8.50 lists the names of the publishers that have published biographies. This query considers each publisher’s ID in turn and determines whether it causes the existence test to evaluate to true. Here, the first publisher is P01 (Abatis Publishers).
The DBMS ascertains whether any rows exist in the table titles in which pub_id is P01 and type is biography. If so, Abatis Publishers is included in the final result. The DBMS repeats the same process for each of the other publisher IDs. See Figure 8.50 for the result. If I wanted to list the names of publishers that haven’t published biographies, I’d change EXISTS to NOT EXISTS. See Listing 8.33 earlier in this chapter for an equivalent query that uses IN.

Listing 8.51 lists the authors who haven’t written (or cowritten) a book. See Figure 8.51 for the result. See Listing 8.35 earlier in this chapter for an equivalent query that uses NOT IN.

Listing 8.50 List the names of the publishers that have published biographies. See Figure 8.50 for the result.

    SELECT pub_name
      FROM publishers p
      WHERE EXISTS
        (SELECT *
           FROM titles t
           WHERE t.pub_id = p.pub_id
             AND type = 'biography');

    pub_name
    -------------------
    Abatis Publishers
    Schadenfreude Press

Figure 8.50 Result of Listing 8.50.

Listing 8.51 List the authors who haven’t written (or cowritten) a book. See Figure 8.51 for the result.

    SELECT au_id, au_fname, au_lname
      FROM authors a
      WHERE NOT EXISTS
        (SELECT *
           FROM title_authors ta
           WHERE ta.au_id = a.au_id);

    au_id au_fname au_lname
    ----- -------- -----------
    A07   Paddy    O'Furniture

Figure 8.51 Result of Listing 8.51.

Listing 8.52 lists the authors who live in a city in which a publisher is located. See Figure 8.52 for the result. See Listing 8.47 earlier in this chapter for an equivalent query that uses = ANY.

“Finding Common Rows with INTERSECT” in Chapter 9 describes how to use INTERSECT to retrieve the rows that two tables have in common. You also can use EXISTS to find an intersection. Listing 8.53 lists the cities in which both an author and publisher are located. See Figure 8.53 for the result. See Listing 9.8 in Chapter 9 for an equivalent query that uses INTERSECT.
You also can replicate this query with an inner join:

    SELECT DISTINCT a.city
      FROM authors a
      INNER JOIN publishers p
        ON a.city = p.city;

Listing 8.52 List the authors who live in a city in which a publisher is located. See Figure 8.52 for the result.

    SELECT au_id, au_lname, au_fname, city
      FROM authors a
      WHERE EXISTS
        (SELECT *
           FROM publishers p
           WHERE p.city = a.city);

    au_id au_lname au_fname  city
    ----- -------- --------- -------------
    A03   Hull     Hallie    San Francisco
    A04   Hull     Klee      San Francisco
    A05   Kells    Christian New York

Figure 8.52 Result of Listing 8.52.

Listing 8.53 List the cities in which both an author and publisher are located. See Figure 8.53 for the result.

    SELECT DISTINCT city
      FROM authors a
      WHERE EXISTS
        (SELECT *
           FROM publishers p
           WHERE p.city = a.city);

    city
    -------------
    New York
    San Francisco

Figure 8.53 Result of Listing 8.53.

“Finding Different Rows with EXCEPT” in Chapter 9 describes how to use EXCEPT to retrieve the rows in one table that aren’t also in another table. You also can use NOT EXISTS to find a difference. Listing 8.54 lists the cities in which an author lives but a publisher isn’t located. See Figure 8.54 for the result. See Listing 9.9 in Chapter 9 for an equivalent query that uses EXCEPT. You also can replicate this query with NOT IN:

    SELECT DISTINCT city
      FROM authors
      WHERE city NOT IN
        (SELECT city
           FROM publishers);

Or with an outer join:

    SELECT DISTINCT a.city
      FROM authors a
      LEFT OUTER JOIN publishers p
        ON a.city = p.city
      WHERE p.city IS NULL;

Listing 8.55 lists the authors who wrote (or cowrote) three or more books. See Figure 8.55 for the result.

Listing 8.54 List the cities in which an author lives but a publisher isn’t located. See Figure 8.54 for the result.

    SELECT DISTINCT city
      FROM authors a
      WHERE NOT EXISTS
        (SELECT *
           FROM publishers p
           WHERE p.city = a.city);

    city
    ---------
    Boulder
    Bronx
    Palo Alto
    Sarasota

Figure 8.54 Result of Listing 8.54.
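The three ways of finding a difference (NOT EXISTS, NOT IN, and an outer join) can be compared side by side. The following Python sketch is only an illustration under stated assumptions: it uses the sqlite3 module from Python's standard library, and the rows and ID values are invented rather than taken from the sample database. The final step, which adds a publisher whose city is null, goes beyond the listings; it is included because it shows where NOT IN parts ways with the other two forms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (au_id TEXT, city TEXT);
    CREATE TABLE publishers (pub_id TEXT, city TEXT);
    INSERT INTO authors VALUES
        ('X1', 'Boulder'), ('X2', 'New York'), ('X3', 'San Francisco'),
        ('X4', 'San Francisco'), ('X5', 'Sarasota');
    INSERT INTO publishers VALUES
        ('Y1', 'New York'), ('Y2', 'San Francisco'), ('Y3', 'Hamburg');
""")

# The same difference query written three ways; ORDER BY makes output stable.
QUERIES = {
    "not_exists": """
        SELECT DISTINCT city FROM authors a
          WHERE NOT EXISTS
            (SELECT * FROM publishers p WHERE p.city = a.city)
          ORDER BY city""",
    "not_in": """
        SELECT DISTINCT city FROM authors
          WHERE city NOT IN (SELECT city FROM publishers)
          ORDER BY city""",
    "outer_join": """
        SELECT DISTINCT a.city FROM authors a
          LEFT OUTER JOIN publishers p ON a.city = p.city
          WHERE p.city IS NULL
          ORDER BY a.city""",
}

def run_all() -> dict:
    """Run every query and collect the city names each one returns."""
    return {name: [row[0] for row in conn.execute(sql)]
            for name, sql in QUERIES.items()}

before = run_all()

# Add a publisher with an unknown (null) city and run the queries again.
conn.execute("INSERT INTO publishers VALUES ('Y4', NULL)")
after = run_all()

print(before)
print(after)
```

With the invented rows, all three forms first return Boulder and Sarasota. After the null city is added, NOT EXISTS and the outer join are unchanged, but NOT IN returns nothing, because every city <> NULL comparison is unknown.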
Listing 8.55 List the authors who wrote (or cowrote) three or more books. See Figure 8.55 for the result.

    SELECT au_id, au_fname, au_lname
      FROM authors a
      WHERE EXISTS
        (SELECT *
           FROM title_authors ta
           WHERE ta.au_id = a.au_id
           HAVING COUNT(*) >= 3);

    au_id au_fname au_lname
    ----- -------- ---------
    A01   Sarah    Buchman
    A02   Wendy    Heydemark
    A04   Klee     Hull
    A06            Kellsey

Figure 8.55 Result of Listing 8.55.

Listing 8.56 uses two existence tests to list the authors who wrote (or cowrote) both children’s and psychology books. See Figure 8.56 for the result.

Listing 8.57 performs a uniqueness test to determine whether duplicates occur in the column au_id in the table authors. The query prints Yes if duplicate values exist in the column au_id; otherwise, it returns an empty result. See Figure 8.57 for the result. au_id is the primary key of authors, so of course it contains no duplicates.

Listing 8.56 List the authors who wrote (or cowrote) a children’s book and also wrote (or cowrote) a psychology book. See Figure 8.56 for the result.

    SELECT au_id, au_fname, au_lname
      FROM authors a
      WHERE EXISTS
        (SELECT *
           FROM title_authors ta
           INNER JOIN titles t
             ON t.title_id = ta.title_id
           WHERE ta.au_id = a.au_id
             AND t.type = 'children')
      AND EXISTS
        (SELECT *
           FROM title_authors ta
           INNER JOIN titles t
             ON t.title_id = ta.title_id
           WHERE ta.au_id = a.au_id
             AND t.type = 'psychology');

    au_id au_fname au_lname
    ----- -------- --------
    A06            Kellsey

Figure 8.56 Result of Listing 8.56.

Listing 8.57 Does the column au_id in the table authors contain duplicate values? See Figure 8.57 for the result.

    SELECT DISTINCT 'Yes' AS "Duplicates?"
      WHERE EXISTS
        (SELECT *
           FROM authors
           GROUP BY au_id
           HAVING COUNT(*) > 1);

    Duplicates?
    -----------

Figure 8.57 Result of Listing 8.57.

Listing 8.58 shows the same query for the table title_authors, which does contain duplicate au_id values. See Figure 8.58 for the result.
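The uniqueness test in Listings 8.57 and 8.58 can be wrapped in a small helper and tried on toy data. The following Python sketch is an illustration only: it assumes the sqlite3 module from Python's standard library (SQLite happens to accept a SELECT that has a WHERE clause but no FROM clause; other DBMSs may not), the helper name has_duplicates is hypothetical, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (au_id TEXT PRIMARY KEY);
    CREATE TABLE title_authors (title_id TEXT, au_id TEXT);
    INSERT INTO authors VALUES ('X1'), ('X2'), ('X3');
    INSERT INTO title_authors VALUES
        ('B1', 'X1'), ('B2', 'X1'), ('B3', 'X2');
""")

def has_duplicates(table: str, column: str) -> list:
    """Return [('Yes',)] if column holds duplicate values, else []."""
    # The identifiers are interpolated directly into the statement; that is
    # tolerable here only because they are hard-coded, never user input.
    sql = f"""
        SELECT DISTINCT 'Yes' AS "Duplicates?"
          WHERE EXISTS
            (SELECT *
               FROM {table}
               GROUP BY {column}
               HAVING COUNT(*) > 1)
    """
    return conn.execute(sql).fetchall()

authors_result = has_duplicates("authors", "au_id")
title_authors_result = has_duplicates("title_authors", "au_id")

print(authors_result)        # []
print(title_authors_result)  # [('Yes',)]
```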
You can add grouping columns to the GROUP BY clause to determine whether multiple-column duplicates exist.

Listing 8.58 Does the column au_id in the table title_authors contain duplicate values? See Figure 8.58 for the result.

    SELECT DISTINCT 'Yes' AS "Duplicates?"
      WHERE EXISTS
        (SELECT *
           FROM title_authors
           GROUP BY au_id
           HAVING COUNT(*) > 1);

    Duplicates?
    -----------
    Yes

Figure 8.58 Result of Listing 8.58.

✔ Tips

■ You also can use COUNT(*) to determine whether a subquery returns at least one row, but COUNT(*) (usually) is less efficient than EXISTS. The DBMS quits processing an EXISTS subquery as soon as it determines whether the subquery returns a row, whereas COUNT(*) forces the DBMS to process the entire subquery. This query is equivalent to Listing 8.52 but runs slower:

    SELECT au_id, au_lname, au_fname, city
      FROM authors a
      WHERE (SELECT COUNT(*)
               FROM publishers p
               WHERE p.city = a.city) > 0;

■ Although SELECT * is the most common form of the SELECT clause in an EXISTS subquery, you can use SELECT column or SELECT constant_value to speed queries if your DBMS’s optimizer isn’t bright enough to figure out that it doesn’t need to construct an entire interim table for an EXISTS subquery. For more information, see “Comparing Equivalent Queries” later in this chapter.
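The COUNT(*) and SELECT constant_value variations described in the tips can be checked for equivalent results on toy data. The following Python sketch assumes the sqlite3 module from Python's standard library and invented rows; the helper name ids is hypothetical. It confirms only that the three forms return the same rows and makes no attempt to measure speed, which depends on the DBMS and its optimizer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (au_id TEXT, city TEXT);
    CREATE TABLE publishers (pub_id TEXT, city TEXT);
    INSERT INTO authors VALUES
        ('X1', 'Boulder'), ('X2', 'New York'), ('X3', 'San Francisco');
    INSERT INTO publishers VALUES
        ('Y1', 'New York'), ('Y2', 'San Francisco'), ('Y3', 'San Francisco');
""")

def ids(condition: str) -> list:
    """Return the au_id values of the authors that satisfy condition."""
    sql = f"SELECT au_id FROM authors a WHERE {condition} ORDER BY au_id"
    return [row[0] for row in conn.execute(sql)]

# Conventional form: SELECT * inside EXISTS.
with_exists_star = ids(
    "EXISTS (SELECT * FROM publishers p WHERE p.city = a.city)")

# Constant in the subquery's SELECT clause instead of *.
with_exists_const = ids(
    "EXISTS (SELECT 1 FROM publishers p WHERE p.city = a.city)")

# COUNT(*) form: counts every matching row, then compares with zero.
with_count = ids(
    "(SELECT COUNT(*) FROM publishers p WHERE p.city = a.city) > 0")

print(with_exists_star, with_exists_const, with_count)
```

All three conditions select the same two invented authors, X2 and X3.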